WO2022198538A1 - Active noise reduction audio device, and method for active noise reduction - Google Patents

Active noise reduction audio device, and method for active noise reduction

Info

Publication number
WO2022198538A1
WO2022198538A1 PCT/CN2021/082870 CN2021082870W WO2022198538A1 WO 2022198538 A1 WO2022198538 A1 WO 2022198538A1 CN 2021082870 W CN2021082870 W CN 2021082870W WO 2022198538 A1 WO2022198538 A1 WO 2022198538A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
period
noise reduction
active noise
time
Prior art date
Application number
PCT/CN2021/082870
Other languages
French (fr)
Chinese (zh)
Inventor
张立斌
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202180095625.9A (CN116982106A)
Priority to PCT/CN2021/082870 (WO2022198538A1)
Publication of WO2022198538A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones

Definitions

  • The present disclosure relates to the field of audio processing, and more particularly, to a method for noise reduction of ambient sound in an active noise reduction audio device and to an active noise reduction audio device.
  • Audio equipment generally has two noise filtering methods (noise reduction methods) to filter or reduce ambient sound.
  • One form of noise reduction is passive noise reduction.
  • Passive noise reduction is usually created by enclosing the ear to form a closed space, or using sound-isolating materials such as silicone earplugs to block outside noise.
  • The effect of passive noise reduction is limited: generally only high-frequency noise can be blocked, and low-frequency noise is reduced very little. Additionally, passive noise-cancelling audio devices can cause discomfort to the ears.
  • Active noise cancellation uses the audio equipment to generate sound waves that are equal to the outside noise but opposite in phase, which neutralize the noise and thereby achieve noise reduction.
  • Active noise-cancelling audio devices typically provide an ambient microphone at the outermost surface of the audio device to detect ambient sounds. Before the ambient sound reaches the inside of the ear, the device needs to complete the detection of the ambient sound and the calculation and generation of the inverted signal. Ideally, when the inverted signal is accurately calculated and reaches the inside of the ear at the same time as the direct sound, better noise reduction can be achieved.
  • Conventional active noise reduction audio equipment is still not ideal for reducing ambient sound, and in some cases may also require high-cost circuit components.
  • The embodiments of the present disclosure aim to provide a technical solution for noise reduction of ambient sound.
  • A method for active noise reduction includes acquiring a first signal representing a first ambient sound collected by a first microphone of an active noise reduction device during a first period of time; and acquiring an evaluation signal corresponding to the first period of time.
  • The method also includes determining, based on the evaluation signal, a length of time for a second time period, the second time period following the first time period.
  • The method further includes generating a second signal to be played by the first speaker of the active noise reduction device based on the first signal and the time length of the second period.
  • The second signal represents an inverse estimate of the first ambient sound during the second time period.
  • The evaluation signal represents the effect of active noise reduction and/or the accuracy of the estimate.
  • The evaluation signal can be obtained by a residual microphone described in detail later.
  • Alternatively, the evaluation signal may be determined from the signal estimated by the processor during an initial period and the ambient sound collected in the first period.
  • By estimating the reversed-phase signal of the second period in advance and playing it, the reversed-phase sound can reach the human ear in synchronization with the direct sound at the subsequent time. This shortens the delay of the reversed-phase sound, so that the reversed-phase sound and the direct sound cancel effectively and the noise reduction effect is improved.
  • By using the residual signal to adjust the estimation time length of the subsequent anti-phase sound, the accuracy of the anti-phase signal estimation can be dynamically adjusted and the noise reduction effect can be improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimation time length can be extended. Conversely, when the residual signal indicates that the effect of active noise reduction is not ideal, the estimation time length can be shortened to improve the estimation accuracy.
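  • The adjustment rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the disclosure: the function name `next_period_length`, the use of a residual-to-ambient energy ratio as the quality measure, the two thresholds, the step size, and the length bounds are all hypothetical choices.

```python
def next_period_length(ambient, residual, current_len,
                       good_ratio=0.1, poor_ratio=0.5,
                       step=8, min_len=8, max_len=256):
    """Return the number of samples to estimate for the second period.

    ambient  -- samples of the first signal collected in the first period
    residual -- samples of the evaluation (residual) signal for that period
    All thresholds, the step, and the bounds are illustrative assumptions.
    """
    ambient_energy = sum(x * x for x in ambient)
    residual_energy = sum(x * x for x in residual)
    if ambient_energy == 0.0:
        return current_len  # nothing to cancel, keep the current length
    ratio = residual_energy / ambient_energy
    if ratio <= good_ratio:
        # Noise reduction works well: the estimate can reach further ahead.
        return min(current_len + step, max_len)
    if ratio >= poor_ratio:
        # Noise reduction is poor: shorten the period to improve accuracy.
        return max(current_len - step, min_len)
    return current_len
```

  • With these hypothetical defaults, a residual that carries 1% of the ambient energy lengthens a 64-sample period to 72 samples, while a residual that carries 81% of it shortens the period to 56 samples.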
  • Generating the second signal to be played by the first speaker of the active noise cancellation device based on the first signal and the time length of the second time period includes generating a third signal based on the first signal and the time length of the second time period, the third signal representing an estimate of the first ambient sound during the second time period; and inverting the third signal to generate the second signal.
  • Estimating first and then inverting allows the inverted signal to be obtained more accurately.
  • Generating the second signal to be played by the first speaker of the active noise cancellation device based on the first signal and the time length of the second period includes inverting the first signal to generate a fourth signal, the fourth signal representing an inverted sound of the first ambient sound during the first period; and generating the second signal based on the fourth signal and the time length of the second period.
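  • The two orderings described above (estimate then invert, or invert then estimate) can be compared with a small sketch. The fixed linear predictor, its coefficients, and the helper names `extrapolate` and `invert` are assumptions chosen for illustration; the disclosure does not prescribe this predictor. For a linear predictor the two orderings give the same result, whereas for a nonlinear estimation model they may differ.

```python
def extrapolate(history, coeffs, n_future):
    """Predict n_future samples with a fixed linear predictor.

    coeffs[k] weights the sample k + 1 steps back, so each new sample is
    sum(coeffs[k] * x[n - 1 - k]). The coefficients are assumed, not learned.
    """
    buf = list(history)
    out = []
    for _ in range(n_future):
        nxt = sum(c * buf[-1 - k] for k, c in enumerate(coeffs))
        buf.append(nxt)
        out.append(nxt)
    return out


def invert(signal):
    """Phase inversion: negate every sample."""
    return [-x for x in signal]


# First signal: two cycles of a period-4 wave collected in the first period.
first_signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
coeffs = [0.0, -1.0]  # x[n] = -x[n-2], exact for this particular wave

# Ordering 1: estimate the third signal, then invert it.
estimate_then_invert = invert(extrapolate(first_signal, coeffs, 4))
# Ordering 2: invert to get the fourth signal, then estimate from it.
invert_then_estimate = extrapolate(invert(first_signal), coeffs, 4)
```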
  • Obtaining the evaluation signal corresponding to the first time period includes obtaining, as the evaluation signal, a residual signal collected by a residual microphone of the active noise reduction audio device during the first time period.
  • The residual microphone is different from the first microphone.
  • Obtaining the evaluation signal corresponding to the first time period includes determining the evaluation signal based on the first signal and an inverted estimated signal corresponding to the first time period that was estimated in a time period preceding the first time period, or based on the first signal and an estimated signal corresponding to the first time period that was estimated in a time period preceding the first time period.
  • Acquiring the evaluation signal corresponding to the first time period includes determining the evaluation signal based on an inverted signal of the first signal and an estimated signal corresponding to the first time period that was estimated in a time period preceding the first time period, or based on the inverted signal of the first signal and an inverted estimated signal corresponding to the first time period that was estimated in a time period preceding the first time period.
  • Obtaining the residual signal in various ways allows the current noise reduction effect to be evaluated more comprehensively and accurately.
  • Generating the initial output signal to be played by the first speaker of the active noise cancelling audio device based on the initial input signal and the time length of the second time period includes determining a category of the first ambient sound during the first time period based on the initial input signal; determining an estimation model corresponding to the category of the first ambient sound based on the category of the first ambient sound; and generating a second signal based on the estimation model, the time length of the second period, and the first signal.
  • Generating the second signal based on the estimation model, the time length of the second time period, and the first signal includes: if the determined category of the ambient sound indicates that the ambient sound is speech, using a speech model to estimate the first signal for generating the second signal; if the determined category of the ambient sound indicates that the ambient sound is noise, using a neural network model to estimate the first signal for generating the second signal; if the determined category of the ambient sound indicates that the ambient sound is music, using a reserve pool model to estimate the first signal for generating the second signal; and if the determined category of the ambient sound indicates that the ambient sound is a mixed sound, using a weighted model to estimate the first signal for generating the second signal.
  • For speech, a reserve pool model is used to estimate the first signal for generating the second signal; and if the speech is unvoiced, a linear prediction model is used instead to estimate the first signal for generating the second signal.
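  • The category-to-model selection described above amounts to a dispatch table. The sketch below shows only the dispatch; every estimator is a placeholder that repeats the last sample, standing in for the speech, neural network, reserve pool, and weighted models, whose internals are not reproduced here. The names `ESTIMATORS`, `_stub`, and `generate_second_signal` are hypothetical.

```python
def _stub(name):
    """Build a placeholder estimator that simply repeats the last sample."""
    def estimate(first_signal, n_samples):
        return [first_signal[-1]] * n_samples
    estimate.model_name = name  # label used only to show which model ran
    return estimate


# Category of ambient sound -> estimation model (all models are stand-ins).
ESTIMATORS = {
    "speech": _stub("speech model"),
    "noise": _stub("neural network model"),
    "music": _stub("reserve pool model"),
    "mixed": _stub("weighted model"),
}


def generate_second_signal(first_signal, category, n_samples):
    """Estimate n_samples of the second period, then invert the estimate."""
    estimator = ESTIMATORS[category]
    estimate = estimator(first_signal, n_samples)
    return estimator.model_name, [-x for x in estimate]
```

  • Keeping the mapping in a table makes it straightforward to replace a model or add a category when the evaluation signal suggests that a different model would perform better.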
  • The method further includes: adjusting an estimation model corresponding to the category of ambient sound based on the evaluation signal; and generating a subsequent audio signal to be played by the first speaker based on the audio signal subsequently collected by the first microphone and the adjusted model.
  • Adjusting a single model includes replacing the model and/or adding an estimation model.
  • Adjusting multiple models includes replacing one or more models, adjusting model weights, and/or adding or removing estimation models.
  • The method further includes: determining a filter corresponding to the ambient sound based on the category of the ambient sound; and generating a second signal based on the first signal and the filter.
  • By filtering the collected audio signal, the noise in the sound can be effectively filtered out. Different filtering can be used for different sound classes.
  • Weighted filtering that combines various filters can also be used. Weighted filtering may have different filter weights for different sounds.
  • Filtering can be adjusted based on feedback mechanisms such as the use of residual signals. Adjusting the filtering includes adding or removing filter types, and/or adjusting the weight of each filter in the weighted filtering.
  • The first signal is filtered using a finite impulse response filter for generating the second signal. If the determined category of ambient sound indicates that the ambient sound is speech, the first signal is filtered using an infinite impulse response filter for generating the second signal.
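  • To make the distinction between the two filter families concrete, the sketch below applies a two-tap finite impulse response (FIR) filter and a one-pole infinite impulse response (IIR) filter to a unit impulse. The tap values, the smoothing factor, and the function names are assumptions for illustration; the disclosure does not specify filter coefficients.

```python
def fir_filter(x, taps):
    """FIR filter: y[n] = sum over k of taps[k] * x[n - k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, b in enumerate(taps):
            if n - k >= 0:
                acc += b * x[n - k]
        y.append(acc)
    return y


def iir_one_pole(x, alpha):
    """One-pole IIR filter: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]."""
    y = []
    prev = 0.0
    for sample in x:
        prev = alpha * sample + (1.0 - alpha) * prev
        y.append(prev)
    return y


impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
fir_response = fir_filter(impulse, [0.5, 0.5])  # ends after two samples
iir_response = iir_one_pole(impulse, 0.5)       # decays but never reaches zero
```

  • The FIR response is exactly zero once the impulse has passed through all taps, while the IIR response keeps halving because each output feeds back into the next one.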
  • Determining the estimation model corresponding to the category of the first ambient sound based on the category of the first ambient sound includes determining a weighted model corresponding to the first ambient sound based on the category of the first ambient sound, the weighted model including: a first estimation model, a weight of the first estimation model, a second estimation model, and a weight of the second estimation model.
  • The method further includes adjusting the estimation model based on the residual signal.
  • Generating the second signal to be played by the first speaker based on the first signal and the time length of the second time period includes: generating the second signal to be played by the first speaker based on the first signal, the time length of the second time period, and the adjusted estimation model.
  • Determining the category of ambient sound during the first period based on the first signal includes determining the category of ambient sound based on at least one of a frame rate for a first energy range, a frame rate for a first frequency range, and a zero-crossing rate of the first signal.
  • The first energy range includes the low energy range.
  • The first frequency range includes the low frequency range.
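  • Two of the classification features named above can be computed as in the following sketch. It covers the zero-crossing rate and the proportion of low-energy frames only; the low-frequency frame proportion would additionally need a spectral estimate and is omitted. The frame size, the rule that a frame counts as low-energy when its energy is below half the mean frame energy, and the function names are assumptions.

```python
def frames(signal, size):
    """Split a signal into consecutive non-overlapping frames of equal size."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def low_energy_frame_ratio(signal, size, factor=0.5):
    """Fraction of frames whose energy is below factor * mean frame energy.

    The factor of 0.5 is an illustrative assumption.
    """
    energies = [sum(x * x for x in f) for f in frames(signal, size)]
    mean_energy = sum(energies) / len(energies)
    low = sum(1 for e in energies if e < factor * mean_energy)
    return low / len(energies)
```

  • A classifier would then compare such features against thresholds or feed them to a trained model; that decision stage is outside this sketch.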
  • Generating the second signal to be played by the first speaker of the active noise cancellation device based on the first signal and the time length of the second period includes determining a filter corresponding to the first ambient sound based on the category of the first ambient sound; and generating the second signal based on the first signal, the time length of the second time period, and the filter.
  • it may contain different noises.
  • the method further includes adjusting the filtering based on the evaluation signal.
  • an appropriate estimated filter can be selected based on the noise reduction effect to improve the noise reduction effect.
  • Generating the initial output signal includes: using a weighted estimation model or a reserve pool model to estimate the first signal to generate an estimated signal, the estimated signal representing an estimate of ambient sound during the first time period; and inverting the estimated signal to generate the second signal.
  • The first signal is inverted to generate an inverted signal, and a weighted estimation model or a reserve pool model is used to estimate the inverted signal to generate the second signal.
  • The method further includes acquiring a fifth signal, the fifth signal representing the second ambient sound collected by the second microphone in the active noise cancellation audio device during the first period; and generating, based on the fifth signal and the time length of the second period, a sixth signal to be played by the second speaker in the active noise cancellation audio device, the sixth signal representing an inverse estimate of the second ambient sound during the second period.
  • different microphones have different collection effects for different frequency bands. For example, the first microphone can capture more details for low-frequency sounds, while the second microphone can capture more details for mid-high frequency sounds. Therefore, by using different microphones to adopt different collection schemes for the ambient sound, various details of the ambient sound can be collected more effectively. In addition, active noise cancellation can be ensured even if one microphone fails.
  • different speakers have different performance effects in different frequency bands. For example, the first speaker is more prominent in the low frequency band, while the second speaker is more prominent in the mid-high frequency. Therefore, by using different speakers to play different inverted signals, different parts of the direct sound can be more accurately cancelled for better noise reduction depth and noise reduction width.
  • The method further includes performing first filtering on the first signal to generate a first filtered signal; generating the second signal based on the first filtered signal; performing second filtering on the fifth signal to generate a second filtered signal; and generating the sixth signal based on the second filtered signal.
  • The method further includes: obtaining a first residual signal, the first residual signal representing the sum of the first sound and the direct sound, wherein the first sound is based on the initial output signal; obtaining a second residual signal, the second residual signal representing the sum of the second sound and the direct sound, wherein the second sound is based on the sixth signal; adjusting the first filtering based on the first residual signal; and adjusting the second filtering based on the second residual signal.
  • Generating the second signal based on the first signal includes inverting the first signal to generate the first inverted signal; estimating the first inverted signal using the first estimation model to generate the first component signal; using a second estimation model different from the first estimation model to estimate the first inverted signal to generate a second component signal different from the first component signal; and using a weighting model to weight the first component signal and the second component signal to generate the initial output signal.
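  • The weighted combination of two estimation models described above can be sketched as follows. The two component models here (`hold_last` and `repeat_tail`) are deliberately trivial stand-ins, and the equal weights are arbitrary; all names and values are assumptions for illustration.

```python
def hold_last(signal, n):
    """Stand-in first estimation model: repeat the last sample n times."""
    return [signal[-1]] * n


def repeat_tail(signal, n):
    """Stand-in second estimation model: replay the last n samples."""
    return list(signal[-n:])


def weighted_output(first_signal, model_a, model_b, weight_a, weight_b, n):
    """Invert the first signal, estimate it with two models, blend the results."""
    inverted = [-x for x in first_signal]   # first inverted signal
    component_a = model_a(inverted, n)      # first component signal
    component_b = model_b(inverted, n)      # second component signal
    return [weight_a * a + weight_b * b
            for a, b in zip(component_a, component_b)]


output = weighted_output([1.0, 2.0, 3.0, 4.0], hold_last, repeat_tail,
                         0.5, 0.5, 2)
```

  • Adjusting the weighted model based on a residual signal then reduces to changing `weight_a` and `weight_b`, or swapping one of the component models.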
  • A computer-readable storage medium stores one or more programs.
  • The one or more programs are configured to be executed by one or more processors.
  • The one or more programs comprise instructions for performing the method according to the first aspect.
  • The inverted signal of the second period can be estimated in advance.
  • The reversed-phase sound can then reach the human ear synchronously with the direct sound at the subsequent time, which shortens the time delay of the reversed-phase sound, so that the reversed-phase sound and the direct sound cancel and the noise reduction effect is effectively improved.
  • The accuracy of the anti-phase signal estimation can be dynamically adjusted and the noise reduction effect can be improved.
  • When the active noise reduction effect is good, the estimation time length can be extended.
  • When the active noise reduction effect is not ideal, the estimation time length can be shortened to improve the estimation accuracy.
  • A computer program product includes one or more programs.
  • The one or more programs are configured to be executed by one or more processors.
  • The one or more programs comprise instructions for performing the method according to the first aspect.
  • The inverted signal of the second time period can be estimated in advance.
  • The reversed-phase sound can then reach the human ear synchronously with the direct sound at the subsequent time, which shortens the delay of the reversed-phase sound, so that the reversed-phase sound and the direct sound cancel and the noise reduction effect is effectively improved.
  • The accuracy of the anti-phase signal estimation can be dynamically adjusted and the noise reduction effect can be improved.
  • When the active noise reduction effect is good, the estimation time length can be extended.
  • When the active noise reduction effect is not ideal, the estimation time length can be shortened to improve the estimation accuracy.
  • An active noise reduction audio device includes an acquisition module for acquiring a first signal and an evaluation signal corresponding to a first time period, where the first signal represents the first ambient sound collected by the first microphone of the active noise reduction device during the first time period; and an inverse estimation signal generation module for determining, based on the evaluation signal, a time length of a second time period, the second time period following the first time period, and for generating, based on the first signal and the time length of the second time period, a second signal to be played by the first speaker of the active noise reduction device, the second signal representing an inverted estimate of the first ambient sound during the second time period.
  • The ANC audio device may estimate the inverted signal for the second period in advance.
  • The reversed-phase sound can then reach the human ear synchronously with the direct sound at the subsequent time, which shortens the time delay of the reversed-phase sound, so that the reversed-phase sound and the direct sound cancel and the noise reduction effect is effectively improved.
  • By using the residual signal to adjust the estimation time length of the subsequent anti-phase sound, the accuracy of the anti-phase signal estimation can be dynamically adjusted and the noise reduction effect can be improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimation time length can be extended. Conversely, when the residual signal indicates that the effect of active noise reduction is not ideal, the estimation time length can be shortened to improve the estimation accuracy.
  • The inverse estimation signal generation module is further configured to generate a third signal based on the first signal, the third signal representing an estimate of ambient sound during the second time period, and to invert the third signal to generate the second signal.
  • the inverted signal can be obtained more accurately by estimating and then inverting to generate the inverted signal.
  • The acquisition module is further configured to acquire a first signal corresponding to the first microphone, where the first signal represents ambient sound collected during the first period of time.
  • The inverse estimation signal generation module is further configured to determine the time length of the second period based on the first signal and the inverted estimated signal estimated in the previous period, or based on the first signal and the estimated signal estimated in the previous period, the second period following the first period.
  • The inverse estimation signal generation module is further configured to determine the time length of the second period based on the inverted signal of the first signal and the inverted estimated signal estimated in the previous period, or based on the inverted signal of the first signal and the estimated signal estimated in the previous period, the second period following the first period. Obtaining the residual signal in various ways allows the current noise reduction effect to be evaluated more comprehensively and accurately.
  • The acquisition module is further configured to obtain a residual signal corresponding to a residual microphone of the active noise reduction audio device, where the residual microphone is different from the first microphone. Obtaining the residual signal in various ways allows the current noise reduction effect to be evaluated more comprehensively and accurately.
  • The inverse estimation signal generation module is further configured to determine a category of the first ambient sound during the first time period based on the first signal; determine an estimation model corresponding to the category of the first ambient sound based on the category of the first ambient sound; and generate a second signal based on the estimation model, the first signal, and the time length of the second time period.
  • The inverse estimation signal generation module is further configured to determine a weighting model corresponding to the ambient sound based on the category of the ambient sound, where the weighting model includes: a first estimation model, a weight of the first estimation model, a second estimation model, and a weight of the second estimation model.
  • The inverse estimation signal generation module is also used to adjust the estimation model based on the evaluation signal.
  • an appropriate estimation model can be selected based on the noise reduction effect to improve the noise reduction effect.
  • An active noise reduction audio device includes one or more processors and a memory storing one or more programs, the one or more programs being configured to be executed by the one or more processors and including instructions for performing the method of the first aspect.
  • An appropriate estimation model can be selected based on the noise reduction effect to improve the noise reduction effect.
  • By using the residual signal to adjust the estimation time length of the subsequent anti-phase sound, the accuracy of the anti-phase signal estimation can be dynamically adjusted and the noise reduction effect can be improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimation time length can be extended. Conversely, when the residual signal indicates that the effect of active noise reduction is not ideal, the estimation time length can be shortened to improve the estimation accuracy.
  • An active noise reduction audio device includes a first microphone, one or more processors, and a first speaker.
  • The first microphone is configured to collect the first ambient sound during the first period of time and generate the first signal.
  • The one or more processors are configured to obtain an evaluation signal corresponding to the first time period; determine a time length of the second time period based on the evaluation signal, the second time period following the first time period; and generate a second signal based on the first signal and the time length of the second time period, the second signal representing an inverted estimate of the first ambient sound during the second time period.
  • The first speaker is configured to play the second signal during the second period of time.
  • An appropriate estimation model can be selected based on the noise reduction effect to improve the noise reduction effect.
  • By using the residual signal to adjust the estimation time length, the accuracy of the anti-phase signal estimation can be dynamically adjusted and the noise reduction effect can be improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimation time length can be extended. Conversely, when the residual signal indicates that the effect of active noise reduction is not ideal, the estimation time length can be shortened to improve the estimation accuracy.
  • the active noise reduction audio device further includes a residual microphone.
  • the residual microphone is configured to acquire residual sound to generate a residual signal as an evaluation signal. By using the residual signal to adjust subsequent sound estimates, the estimates can be dynamically adjusted in a feedback manner for better dynamic active noise reduction.
  • the active noise reduction audio device further includes: a second microphone and a second speaker.
  • the second microphone is configured to collect the second ambient sound during the first period of time and to generate a fifth signal that is different from the first signal.
  • the second speaker is configured to play a sixth signal, wherein the sixth signal is generated by the one or more processors based on the fifth signal.
  • the sixth signal represents a second inverted estimate of the second ambient sound during the second period and is different from the second signal.
  • FIG. 1 shows a schematic diagram of an active noise reduction audio device that may implement embodiments of the present disclosure.
  • FIG. 2 shows a schematic block diagram of an active noise reduction audio device according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of a method for filtering or reducing ambient sound according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a method for classifying ambient sounds and performing inverse estimation according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a process of active noise reduction according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of a process of active noise reduction according to another embodiment of the present disclosure.
  • FIG. 7 shows a schematic diagram of another active noise cancellation audio device that may implement embodiments of the present disclosure.
  • FIG. 8 is a schematic diagram of a process of active noise reduction according to yet another embodiment of the present disclosure.
  • FIG. 9 is a schematic block diagram of a computer-readable storage medium according to one embodiment of the present disclosure.
  • FIG. 10 is a schematic block diagram of an apparatus for denoising ambient sound according to an embodiment of the present disclosure.
  • The term “comprising” and the like should be understood as open-ended inclusion, i.e., “including but not limited to”.
  • the term “based on” should be understood as “based at least in part on”.
  • the terms “one embodiment” or “the embodiment” should be understood to mean “at least one embodiment”.
  • the terms “first”, “second”, etc. may refer to different or the same objects.
  • the term “period” refers to a continuous or discrete period of time, the smallest unit of which includes a single point in time. In other words, the term “period” may include at least one point in time. Other explicit and implicit definitions may also be included below.
  • The term “microphone” refers to a device for collecting sound and converting it into a corresponding electrical signal.
  • The term “speaker” refers to a device used to convert electrical signals into sound based on audio data.
  • ambient sound refers to the sound present in the external environment in which the ANC audio device is located and captured by the ANC audio device, which may be a combination of one or more sounds, such as speech, music, noise, etc.
  • ambient sound may be noise to the user, and the user wishes to filter or reduce the ambient sound by using an active noise cancellation audio device.
  • the filtering or reduction of ambient sound is collectively referred to as noise reduction of ambient sound.
  • Existing active noise reduction methods and active noise reduction audio equipment are limited by factors such as the calculation of the inverted signal, and cannot filter or reduce the ambient sound well.
  • The time required for the real-time calculation of the inverted signal is greater than the propagation time of the ambient sound from the external sampling microphone of the ANC audio device to the inside of the ear, which can cause a mismatch between the inverted signal and the direct signal, and may even cause an increase in noise.
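  • The timing constraint above can be illustrated numerically. The sketch assumes a speed of sound of 343 m/s (air at about 20 °C) and a hypothetical 2 cm acoustic path from the outer microphone to the inside of the ear, which gives a propagation time of roughly 58 microseconds. It also shows, for a unit sine wave, that an anti-phase signal played on time cancels the direct sound, while one that arrives half a period late doubles the peak level instead. The function names and the 8-sample period are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius


def acoustic_delay(distance_m):
    """Propagation time in seconds over the given distance."""
    return distance_m / SPEED_OF_SOUND


def residual_peak(period, delay):
    """Peak of (direct + delayed anti-phase) for a unit sine wave.

    period -- samples per cycle; delay -- lateness of the anti-phase signal
    in samples. A delay of 0 models perfectly timed playback.
    """
    peak = 0.0
    for n in range(period):
        direct = math.sin(2.0 * math.pi * n / period)
        anti = -math.sin(2.0 * math.pi * (n - delay) / period)
        peak = max(peak, abs(direct + anti))
    return peak


budget_seconds = acoustic_delay(0.02)  # hypothetical 2 cm path
on_time = residual_peak(8, 0)          # anti-phase arrives in sync
half_period_late = residual_peak(8, 4) # anti-phase arrives half a cycle late
```

  • Any processing chain slower than `budget_seconds` cannot deliver a real-time anti-phase signal in sync with the direct sound, which is the motivation for estimating the signal of a later period in advance.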
  • embodiments of the present disclosure provide a method for noise reduction of ambient sound, an active noise reduction audio device, and a computer-readable storage medium.
  • Ambient sound is collected using an ambient sound microphone of an active noise-cancelling audio device, an inverted signal representing a possible ambient sound for a subsequent period is predicted or estimated based on a signal representing the ambient sound, and the anti-phase signal is played using the built-in speaker of the active noise reduction audio device, so that the played anti-phase sound and the direct ambient sound cancel each other out at a subsequent moment, thereby realizing noise reduction of the ambient sound.
  • Compared with the conventional scheme of generating a real-time inverted signal to cancel the direct sound, a better noise reduction width can be obtained; that is, the frequency width over which noise is effectively reduced can be increased.
  • The solution according to the embodiment of the present disclosure can also obtain a better noise reduction depth (which can be expressed, for example, in dB), thereby increasing the degree to which noise is effectively suppressed or cancelled.
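  • As a worked example of expressing noise reduction depth in dB, the sketch below compares the mean power of a signal before and after cancellation using the standard power-ratio definition, 10·log10(P_before / P_after). The function name and the sample values are assumptions; the disclosure does not state how it measures depth.

```python
import math


def noise_reduction_depth_db(before, after):
    """Depth in dB from the mean power before and after cancellation.

    Assumes `after` is not all zeros, since a zero residual power would
    make the ratio undefined.
    """
    power_before = sum(x * x for x in before) / len(before)
    power_after = sum(x * x for x in after) / len(after)
    return 10.0 * math.log10(power_before / power_after)


direct_sound = [1.0, -1.0, 1.0, -1.0]
residual_sound = [0.1, -0.1, 0.1, -0.1]  # amplitude reduced tenfold
depth = noise_reduction_depth_db(direct_sound, residual_sound)
```

  • A tenfold reduction in amplitude is a hundredfold reduction in power, so `depth` comes out at about 20 dB.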
  • In addition, the current effect of active noise reduction can be known, and the estimation length for the subsequent signal can be adjusted based on the evaluation signal representing the current active noise reduction effect, so as to obtain a more accurate subsequent inverted signal estimate and improve the effect of subsequent active noise reduction.
  • FIG. 1 shows a schematic diagram of an active noise cancellation audio device 10 in which embodiments of the present disclosure may be implemented.
  • the active noise cancelling audio device 10 may be, for example, an ear-contact audio playback device such as a True Wireless Stereo (TWS) headset.
  • the active noise cancellation audio device 10 may include a pair of earphones, and the two earphones 11 and 12 are configured substantially identically to each other. Therefore only one earphone 11 is used for the schematic description.
  • the headset 11 includes an external first microphone 13, a processor 17 located inside the headset 11, a first speaker 15 located inside the headset 11 (relative to the external microphone 13 exposed to the environment), in an in-ear portion or a portion in contact with the ear, and a residual microphone 14.
  • the first microphone 13 is configured to detect or collect the sound of the external environment.
  • the residual microphone 14 may not be present. In this case, dynamic adjustment of the estimates may not be required.
  • although a possible configuration of the active noise cancellation audio device 10 is shown in FIG. 1, this is for illustration only and does not limit the scope of the present disclosure.
  • the two earphones 11 and 12 may have only one processor 17, with signals transmitted wirelessly, for example via Bluetooth, so that the two earphones 11 and 12 share the single processor 17.
  • the two earphones 11 and 12 may also share a single first microphone 13 .
  • the external first microphone 13 of the active noise reduction audio device 10 collects ambient sound, and performs acousto-electrical conversion to generate a continuous electrical signal and transmit it to the processor 17 .
  • the processor 17 predicts or estimates the ambient sound at the subsequent time based on the received signal, and generates and transmits to the first speaker 15 an inverted signal representing the ambient sound at the subsequent time.
  • an "inverted signal" refers to a signal obtained by inverting an audio signal, e.g., by directly inverting the sign of the audio sample points, optionally followed by further processing.
  • a signal that is not inverted may be referred to as a "non-inverting signal".
  • the out-of-phase signal played by the speaker is used to cancel, to a certain extent, the direct sound (normal-phase sound) that directly enters the active noise cancelling audio device 10, so as to reduce the sound perceived inside the ear.
  • the first speaker 15 plays the reversed-phase sound based on the received reversed-phase signal, so as to cancel the direct ambient sound from the environment directly into the active noise reduction audio device 10 at a subsequent time, so as to achieve the effect of noise reduction.
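  • the sign inversion and the resulting cancellation can be illustrated as follows. This is only a sketch under idealized assumptions: the helper name `invert`, the 200 Hz tone, and the 16 kHz sampling rate are hypothetical, and the inverted signal is assumed to match the direct sound exactly, which a real estimate would only approximate.

```python
import math

def invert(samples):
    """Return the inverted (anti-phase) signal by flipping the sign of each sample."""
    return [-s for s in samples]

# Hypothetical direct sound: a 200 Hz tone sampled at 16 kHz.
rate = 16000
direct = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(160)]

anti_phase = invert(direct)                               # what the speaker plays
perceived = [d + a for d, a in zip(direct, anti_phase)]   # what adds up inside the ear

# Largest remaining sample magnitude after cancellation.
print(max(abs(p) for p in perceived))
```

  • with an imperfect estimate, the sum is not zero, and the remaining sound corresponds to the estimation error.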
  • although the schematic configuration of the active noise cancellation audio device 10 is shown in FIG. 1 with TWS headphones, it is to be understood that the scope of the present disclosure is not so limited.
  • another possible configuration of an active noise cancelling audio device, i.e., over-ear headphones, is shown in FIG. 7 below.
  • the active noise reduction audio device 10 may also be, for example, an active noise reduction audio device that transmits audio through bone conduction.
  • FIG. 2 shows a schematic block diagram of an active noise reduction audio device 100 according to one embodiment of the present disclosure.
  • the ANC audio device 100 shown in FIG. 2 is only exemplary, for example, used to illustrate one possible implementation of the ANC audio device 10 of FIG. 1, and should not constitute any limitation on the functionality and scope of the implementations described in the present disclosure.
  • the ANC audio device 100 may include the following components, shown in solid boxes and solid lines: a processor 110, a wireless communication module 160, an antenna 1, an audio module 170, a speaker module 170A, a microphone module 170C, a key 190, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, and a power management module 141.
  • the processor 110 may be, for example, the processor 17 of FIG. 1, and may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU).
  • the different processing units may be separate devices.
  • different processing units may also be integrated in one or more processors.
  • the controller can generate an operation control signal according to the instruction operation code and timing signal, and complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the processor 110 executes various functional applications and data processing of the active noise reduction audio device 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the active noise reduction audio device 100 may implement audio functions, such as music playback and recording, through an audio module 170, a speaker module 170A, a microphone module 170C, an application processor, and the like.
  • the audio module 170 is used for converting digital audio information into analog audio signal output, and also for converting analog audio input into digital audio signal. Audio module 170 may also be used to encode and decode audio signals.
  • the audio module 170 may be provided in the processor 110 , or some functional modules of the audio module 170 may be provided in the processor 110 .
  • the speaker module 170A, also called a "speaker", is used to convert audio electrical signals into sound signals.
  • the active noise cancellation audio device 100 can be used to listen to music, or to a hands-free call, through the speaker module 170A.
  • the microphone module 170C is also called a "microphone". When making a call or sending a voice message, the user can speak with the mouth close to the microphone module 170C, so that the sound signal is input to the microphone module 170C.
  • the ANC audio device 100 may further include an antenna 2 and a mobile communication module 150 .
  • the ANC audio device 100 may further include one or more of the following components, shown in dashed boxes and dashed lines: an external memory interface 120, a battery 142, an earphone 170B, a headphone interface 170D, a sensor module 180, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195.
  • the sensor module 180 may include one or more of a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, and a bone conduction sensor 180M. Sensor module 180 may also include other types of sensors not listed.
  • the structures illustrated in the embodiments of the present invention do not constitute a specific limitation on the active noise reduction audio device 100 .
  • the active noise reduction audio device 100 may include more or fewer components than shown, or combine some components, or separate some components, or arrange the components differently.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • FIG. 3 is a schematic flowchart of a method 300 for noise reduction of ambient sound according to one embodiment of the present disclosure.
  • the method 300 may be performed by the active noise cancellation audio device 10 or the active noise cancellation audio device 100 .
  • the processor 17 of the active noise cancellation audio device 10 may execute computer program code or instructions that implement the method 300 .
  • the processor 110 of the active noise cancellation audio device 100 may execute computer program code or instructions that implement the method 300 .
  • the method 300 for noise reduction of ambient sound is described below by taking the active noise reduction audio device 10 as an example, and it should be understood that this is only exemplary and not limiting.
  • the ANC audio device 100 acquires an initial input signal corresponding to the first microphone 13 in the ANC audio device 10, the initial input signal representing the ambient sound collected by the ANC audio device 10 during the initial period.
  • the initial input signal is acquired by, for example, the first microphone 13 and transmitted to the active noise reduction audio device 100 .
  • the initial input signal may also be processed by an intermediate audio signal processing circuit (such as filtering, gain amplification, etc.) and then transmitted to the active noise reduction audio device 100 for subsequent processing.
  • the ANC audio device 10 usually processes the received signal with a fixed or variable data length, where the fixed or variable data length corresponds to the duration of the sampled ambient sound.
  • the initial input signal received by the active noise cancellation audio device 10 includes a microphone signal for one or more sampling time points.
  • the active noise cancellation audio device 10 acquires and processes the microphone signal one time period at a time.
  • the ANC audio device 10 acquires and processes the microphone signal of the first period.
  • the ANC audio device 10 acquires and processes the microphone signal for a first period after the initial period, and so on.
  • the length of the initial period may or may not be the same as the length of the first period.
  • the first time period may include a first plurality of time points, and the initial time period may include a second, different number of time points.
  • the active noise cancellation audio device 10 generates an initial output signal to be played by the first speaker 15 in the active noise cancellation audio device 10 based on the initial input signal.
  • the initial output signal represents an inverse estimate of the ambient sound during a first period following the initial period.
  • the active noise reduction audio device 10 predicts or estimates the ambient sound signal for a subsequent period (e.g., a first period) based on the sound sample signal of a previous period (e.g., an initial period), and then inverts the estimated ambient sound signal to generate an initial output signal representing the inverse of the sound that the environment is likely to produce during the first period of time.
  • the initial input signal may also be first inverted to generate an inverted signal, and prediction or estimation based on the inverted signal may be performed to generate the initial output signal.
  • the first loudspeaker 15 can generate an initial inverted sound inside the user's ear by playing the initial output signal, and at this time, the ambient sound actually generated in the first period of time directly reaches the inside of the ear.
  • the initial anti-phase sound and the direct sound are added (because the anti-phase sound and the direct sound are in opposite phases, they actually cancel each other out to reduce the net volume), so as to obtain the effect of noise reduction.
  • predicting or estimating the sound of the subsequent period is equivalent to prolonging the time left for sound processing, so that the processing circuit has enough time to generate the inverted signal and the effect of noise reduction is improved.
  • since the processing circuit has sufficient time to generate the inverted signal, a high-speed processing circuit is not required either, thereby reducing the circuit design complexity and manufacturing cost of the active noise reduction audio device 10.
  • the estimation based on the initial input signal may be performed using a reserve pool model or a weighting model, which will be described in detail below, to generate the initial output signal.
  • the ANC audio device 10 may directly invert the initial input signal to generate the seventh signal, estimate the seventh signal using a first estimation model, such as a speech model, to generate a first component signal, and estimate the seventh signal using a second estimation model different from the first estimation model, such as a neural network model, to generate a second component signal that is different from the first component signal.
  • the ANC audio device 10 then weights the first component signal and the second component signal using a weighting model to generate an initial output signal, e.g., by multiplying the first component signal by a first weighting factor to obtain a first product, multiplying the second component signal by a second weighting factor to obtain a second product, and adding the first product and the second product to obtain the initial output signal.
  • the weighting coefficients for the first component signal and the second component signal can be dynamically adjusted. For example, the adjustment is performed by using the residual signal described later.
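  • the weighting step described above can be sketched as follows. The function name `weighted_output`, the sample values, and the weights 0.75 and 0.25 are illustrative assumptions; the component signals would in practice come from the two estimation models.

```python
def weighted_output(first_component, second_component, w1, w2):
    """Combine two component signals sample by sample: w1 * first + w2 * second."""
    if len(first_component) != len(second_component):
        raise ValueError("component signals must have the same length")
    return [w1 * a + w2 * b for a, b in zip(first_component, second_component)]

# Hypothetical component signals produced by two different estimation models.
first = [0.25, -0.5, 1.0]
second = [0.75, 0.5, -1.0]

initial_output = weighted_output(first, second, w1=0.75, w2=0.25)
print(initial_output)
```

  • because the weights are plain arguments, they can be changed from one period to the next, which corresponds to the dynamic adjustment of the weighting coefficients.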
  • the model used may be dynamically replaced during the noise reduction of the ANC audio device 10, e.g., the predictive model used by the ANC audio device 10 may in some cases switch from a first model, such as a reserve pool model, to a second model, such as a linear prediction model.
  • the parameters or configuration of the model used can be adjusted dynamically.
  • the weighting coefficients of the various terms of the weighting model may be dynamically adjusted based on the frequency content or class of the audio signal.
  • a default or fixed model can also be used to generate the initial output signal.
  • the initial output signal may be generated in a targeted manner according to the category of the environment in which the user is located, or according to the category of the ambient sound, so as to obtain a better noise reduction effect.
  • the active noise reduction audio device 10 may automatically determine the category of the ambient sound based on the initial input signal collected by the first microphone 13 .
  • the user may set the category of the environment through a portable electronic device in communication with the active noise cancelling audio device 10, or send a set command via the active noise cancelling audio device 10 to set the category of the environment by voice.
  • the user may set the level of noise reduction effect of the active noise cancellation audio device 10 .
  • the noise reduction effect of the active noise reduction audio device 10 may be set to a low noise reduction level, so that warning sounds in the ambient sound are not suppressed.
  • the processor 17 acquires the first signal.
  • the first signal represents the first ambient sound collected by the first microphone 13 of the active noise reduction device during the first period of time.
  • the processing of the processor 17 here is similar to the processing of the initial input signal, and details are not repeated here. The corresponding descriptions regarding the initial input signal may apply here.
  • the processor 17 obtains an evaluation signal corresponding to the first time period.
  • the evaluation signal represents the effect of active noise reduction and/or the accuracy of the estimate.
  • the evaluation signal can be obtained by a residual microphone described in detail later.
  • the evaluation signal may be determined from the estimation signal generated by the processor 17 in the initial period and the ambient sound collected in the first period. For example, the processor 17 estimates an initial estimation signal in the initial period, and the initial estimation signal represents a possible situation of the estimated ambient sound in the first period.
  • the processor 17 receives the first signal collected by the first microphone 13 during the first period. The processor 17 then subtracts the initial estimation signal from the first signal, and the evaluation signal can be determined.
  • the processor 17 estimates an initial inverted estimated signal during the initial period, and the initial inverted estimated signal represents the inverse of a possible situation of the ambient sound in the first period, that is, the inverted estimate of the first signal.
  • the processor 17 receives the first signal collected by the first microphone 13 during the first period.
  • the processor 17 then adds the initial inverted estimated signal to the first signal, and the evaluation signal can be determined. It can be understood that there may also be other processing methods to obtain the residual signal, such as subtracting the initial inverted estimated signal from the inverted signal of the first signal and inverting the result, or adding the inverted signal of the first signal to the initial estimated signal, which is not limited in the present disclosure.
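  • the two ways of determining the evaluation signal described above give the same result, as the following sketch shows. The function names and the sample values are hypothetical and serve only to illustrate the arithmetic.

```python
def evaluation_by_subtraction(first_signal, initial_estimate):
    """Residual = measured ambient sound minus the estimate made one period earlier."""
    return [x - e for x, e in zip(first_signal, initial_estimate)]

def evaluation_by_addition(first_signal, initial_inverted_estimate):
    """Same residual, computed by adding the inverted estimate to the measurement."""
    return [x + inv for x, inv in zip(first_signal, initial_inverted_estimate)]

first_signal = [0.5, -0.25, 0.75]        # hypothetical microphone samples, first period
initial_estimate = [0.5, -0.5, 0.5]      # hypothetical estimate from the initial period
initial_inverted_estimate = [-e for e in initial_estimate]

r1 = evaluation_by_subtraction(first_signal, initial_estimate)
r2 = evaluation_by_addition(first_signal, initial_inverted_estimate)
print(r1, r2)
```

  • a residual close to zero in every sample indicates that the estimate made in the initial period matched the ambient sound of the first period well.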
  • the processor 17 determines, based on the evaluation signal, a length of time for a second time period, the second time period following the first time period. Since the processor 17 obtains the evaluation signal, the quality of the previous estimate can be determined. Therefore, the processor 17 can adjust subsequent estimates accordingly, such as by adjusting the time length of subsequent time periods, to obtain a more accurate estimate, thereby improving the effect of active noise reduction. For example, when the residual signal indicates that the estimation effect is good, the time length of the estimation period can be maintained or increased. When the residual signal indicates that the estimation effect is not ideal, the time length of the subsequent estimation period can be reduced. In addition, the residual signal can also be used to adjust the estimation model, to add or remove estimation models, to adjust the weight coefficients of each item in the weighted model, and to adjust the filtering, as described later.
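  • one possible way to map the evaluation signal to the time length of the next period is sketched below. This is an assumption-laden illustration, not the disclosed rule: the function name `next_period_length`, the use of a residual-to-ambient energy ratio as the quality measure, the thresholds 0.1 and 0.5, the doubling and halving steps, and the length bounds are all hypothetical.

```python
def next_period_length(current_len, residual, reference,
                       good_ratio=0.1, bad_ratio=0.5,
                       min_len=16, max_len=1024):
    """Choose how many samples to estimate in the next period.

    The ratio of residual energy to reference (ambient) energy stands in for
    the evaluation signal. All thresholds and bounds are illustrative.
    """
    residual_energy = sum(r * r for r in residual)
    reference_energy = sum(x * x for x in reference)
    if reference_energy == 0.0:
        return current_len                      # nothing to cancel: keep the length
    ratio = residual_energy / reference_energy
    if ratio <= good_ratio:
        return min(current_len * 2, max_len)    # good estimate: look further ahead
    if ratio >= bad_ratio:
        return max(current_len // 2, min_len)   # poor estimate: shorten the period
    return current_len                          # in between: keep the length

print(next_period_length(128, [0.0, 0.0], [1.0, 1.0]))   # good estimate
print(next_period_length(128, [1.0, 1.0], [1.0, 1.0]))   # poor estimate
```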
  • the processor 17 generates a second signal to be played by the first speaker 15 based on the first signal and the time length of the second period.
  • the second signal represents an inverse estimate of the first ambient sound during the second time period.
  • the first speaker 15 of the active noise cancellation device 10 may play the second signal.
  • the processing of the processor 17 here is similar to the processing of the initial output signal, which is not repeated here. The corresponding descriptions regarding the initial output signal may apply here.
  • FIG. 4 is a schematic diagram of a method 400 for classifying ambient sounds and performing inverse estimation according to one embodiment of the present disclosure.
  • the ambient sound of the environment where the user is located can be classified into five categories: silence, speech, noise, music, and hybrid, where the hybrid includes, for example, a mixture of speech, music and/or noise.
  • the active noise cancellation audio device 10 may determine a category of ambient sound based on the first signal, and generate a third signal based on the determined category and the first signal, and invert the third signal to generate a second signal.
  • the first signal may be inverted first to generate the fourth signal, and the classification and estimation may be performed based on the inverted fourth signal to generate the second signal.
  • ambient sound can also be divided into three categories: music, noise, and silence.
  • speech is classified as noise.
  • the classification may also be based on the frequency of the ambient sound.
  • the active noise reduction audio device 10 may determine the category of the ambient sound based on at least one of a frame rate of the first energy range, a frame rate of the first frequency range, and a zero-crossing rate of the first signal. Specifically, at 402, the active noise cancellation audio device 10 acquires a first signal. After receiving the first signal, the active noise reduction audio device 10 performs characteristic analysis on the first signal to determine the classification of the first signal. For example, the first signal may be analyzed based on its energy frame rate and zero-crossing rate.
  • the silent class is the audio signal segment that the human ear cannot perceive, which may sometimes contain a small amount of noise. Therefore, in order to detect it as accurately as possible, a short-term energy threshold and a short-term zero-crossing rate threshold can be used to judge the first signal. For example, at 404, the active noise reduction audio device 10 analyzes the first signal to determine whether its energy frame rate and zero-crossing rate are both below respective given thresholds. This approach is similar to endpoint detection. When both the short-term energy and the short-term zero-crossing rate of the first signal are less than the corresponding given thresholds, the ANC audio device 10 may determine that the first signal is a silence signal at 406. This indicates that the environment is in a quiet state, and accordingly, at 408, the active noise cancellation audio device 10 need not generate the third signal, and the second signal need not be generated either. This can save power consumption and prolong the use time of the active noise cancelling audio device 10.
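  • the silence test based on short-term energy and short-term zero-crossing rate can be sketched as follows. The function names, the definition of energy as the mean squared amplitude of a frame, and the threshold values are assumptions made for illustration.

```python
def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_silence(frame, energy_threshold=1e-4, zcr_threshold=0.1):
    """Silence when both features are below their (illustrative) thresholds."""
    return (short_term_energy(frame) < energy_threshold
            and zero_crossing_rate(frame) < zcr_threshold)

quiet = [0.001] * 100          # hypothetical near-silent frame
loud = [0.5, -0.5] * 50        # hypothetical loud, rapidly alternating frame
print(is_silence(quiet), is_silence(loud))
```

  • when `is_silence` returns true for a period, the estimation and inversion steps can simply be skipped for that period.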
  • if the ANC audio device 10 determines that at least one of the energy frame rate and the zero-crossing rate is not lower than the corresponding threshold, the ANC audio device 10 further analyzes at 410 whether the frame rate of the first energy range is higher than a threshold.
  • the first energy range is, for example, a low-energy range, and the first frequency range is, for example, a low-frequency range. It can be understood that the specific ranges of the low energy range and the low frequency range can be set accordingly according to the performance design of the active noise reduction audio device 10.
  • ambient sounds may also be classified with different energy ranges or frequency ranges, as long as the ambient sounds have corresponding characteristics in a particular energy and/or frequency range.
  • Speech signals and noise signals generally have higher low-energy frame rates than music and mixed sounds. If the frame rate of the first energy range is higher than a given threshold, the active noise reduction audio device 10 may determine that the first signal representing the ambient sound is a speech signal or a noise signal. Although speech signals and noise signals both have higher frame rates in the low energy range, they have different characteristics in terms of the frequency of the sound. For example, speech signals usually have higher low-frequency frame rates, while noise signals are more dispersed in frequency due to their randomness, and have lower frame rates at low frequencies. Accordingly, at 412, the active noise reduction audio device 10 determines whether the frame rate of the first frequency range of the first signal is above a threshold.
  • if the frame rate of the first frequency range is above the threshold, the ANC audio device 10 determines at 413 that the first signal is a speech signal.
  • the active noise reduction audio device 10 may then generate a third signal using the speech model for the speech signal.
  • the active noise reduction audio device 10 may generate a third signal based on the first signal using a speech model such as a reserve pool model or a linear prediction (LPC) model.
  • it will be appreciated that speech models other than the reserve pool model or the linear prediction model may also be used.
  • the alternating appearance of unvoiced and voiced sounds in a speech signal is a unique property, while the unvoiced part has a higher zero-crossing rate, and the voiced part has a lower zero-crossing rate, so the zero-crossing rate in the speech signal will change alternately .
  • the reserve pool model has good estimation results, so it can be used to estimate the third signal for the unvoiced signal.
  • the reserve pool model (reservoir model) is also called an echo state network.
  • the calculation of the reserve pool model simplifies the training process of the network, solves the problems that the traditional recurrent neural network structure is difficult to determine and the training algorithm is too complicated, and also overcomes the memory fading problem of the recurrent network.
  • the active noise reduction audio device 10 may use the reserve pool model and the LPC model alternately to estimate the speech signal. Alternatively, the active noise reduction audio device 10 may also use only the reserve pool model or the LPC model to estimate the speech signal. The accuracy of speech estimation can be further improved by using targeted models for unvoiced and voiced sounds.
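  • estimation with a linear prediction model can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the predictor order of 2, the least-squares (covariance-method) fit, and the 200 Hz test tone standing in for a voiced segment are all hypothetical choices made for brevity, and the disclosure does not specify how the LPC coefficients are obtained.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def lpc_fit(x, order):
    """Least-squares coefficients a[k] such that x[n] ~ sum_k a[k] * x[n - 1 - k]."""
    rows = [[x[n - 1 - k] for k in range(order)] for n in range(order, len(x))]
    targets = x[order:]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(order)] for i in range(order)]
    b = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(order)]
    return solve_linear(A, b)

def predict_ahead(history, coeffs, steps):
    """Extrapolate `steps` samples by feeding each prediction back into the predictor."""
    buf = list(history)
    for _ in range(steps):
        buf.append(sum(c * buf[-1 - k] for k, c in enumerate(coeffs)))
    return buf[len(history):]

# Hypothetical voiced-like input: a 200 Hz tone sampled at 16 kHz.
rate, freq = 16000, 200.0
tone = [math.sin(2 * math.pi * freq * n / rate) for n in range(240)]
history, future = tone[:200], tone[200:]        # "first period" and "second period"

coeffs = lpc_fit(history, order=2)
estimate = predict_ahead(history, coeffs, steps=len(future))   # plays the role of the third signal
second_signal = [-s for s in estimate]                         # inverted estimate to be played
worst = max(abs(e - f) for e, f in zip(estimate, future))
print(coeffs, worst)
```

  • a noise-free tone is fully described by an order-2 predictor; real speech would typically need a higher order, and a noise-free tone with a higher order would make this particular least-squares system ill-conditioned.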
  • if the frame rate of the first frequency range is not above the threshold, the active noise cancellation audio device 10 determines at 414 that the first signal is a noise signal.
  • the ANC audio device 10 may process the noise signal using a suitable model to generate a third signal.
  • a neural network model may be used to estimate the first signal to generate the third signal.
  • if the frame rate of the first energy range is not higher than the given threshold, the active noise cancellation audio device 10 may determine that the ambient sound is music or a mixed sound.
  • music is characterized by being harmonious and pleasing to the ear, its timbre and frequency range are wide, and it may be played by various musical instruments. Therefore, the audio components contained in the music are complex, and the energy value tends to be large.
  • music does not have the sudden change of unvoiced and voiced sounds of speech signals, and the change of energy value is not as severe as that of speech signals, so the low-energy frame rate value of music is lower.
  • mixed sound also has a lower low energy frame rate. Therefore, low energy frame rate is a useful feature for judging music and mixed sounds.
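  • the low-energy frame rate can be computed as sketched below. The definition used here, the fraction of frames whose energy is below half of the mean frame energy, is one common convention and is an assumption, as are the function names, the frame length, and the two test signals.

```python
def split_frames(samples, frame_len):
    """Split a signal into consecutive, non-overlapping frames (an incomplete tail is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def low_energy_frame_rate(samples, frame_len=4, fraction=0.5):
    """Fraction of frames whose energy is below `fraction` times the mean frame energy."""
    frames = split_frames(samples, frame_len)
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    mean_energy = sum(energies) / len(energies)
    low = sum(1 for e in energies if e < fraction * mean_energy)
    return low / len(energies)

# Hypothetical signals: a burst followed by pauses (speech-like) versus a steady level (music-like).
bursty = [1.0] * 4 + [0.0] * 12
steady = [1.0] * 16
print(low_energy_frame_rate(bursty), low_energy_frame_rate(steady))
```

  • the bursty signal yields a high low-energy frame rate and the steady signal a low one, which matches the distinction drawn above between speech or noise on the one hand and music or mixed sound on the other.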
  • the ANC audio device 10 further differentiates between music and mixed sounds.
  • the mixed appearance of speech and music in life can often render an atmosphere.
  • This mixed signal usually takes the music part as the background, while the speech part (that is, the speech energy) occupies the main part.
  • the alternating appearance of unvoiced and voiced sounds in speech signals is a unique property, and the unvoiced part has a higher zero-crossing rate, while the voiced part has a lower zero-crossing rate, so the zero-crossing rate in the speech signal will change alternately.
  • There is no alternation of unvoiced and voiced sounds in the music signal, so the change of the zero-crossing rate is relatively stable, and an effective parameter to measure the change of the zero-crossing rate is the zero-crossing rate variance.
  • the music signal and the speech-music mixed signal can be distinguished by the variance of the zero-crossing rate.
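  • the zero-crossing rate variance can be computed frame by frame, as in the following sketch. The function names, the frame length, and the two synthetic signals are hypothetical; the first signal has the same zero-crossing rate in every frame, and the second alternates between frames with many and with no zero crossings.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def zcr_variance(samples, frame_len):
    """Variance of the per-frame zero-crossing rate over non-overlapping frames."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    rates = [zero_crossing_rate(f) for f in frames]
    mean = sum(rates) / len(rates)
    return sum((r - mean) ** 2 for r in rates) / len(rates)

stable = [1.0, -1.0] * 16                            # same rate in every frame
alternating = ([1.0, -1.0] * 4 + [1.0] * 8) * 2      # high-rate and zero-rate frames in turn
print(zcr_variance(stable, 8), zcr_variance(alternating, 8))
```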
  • the active noise reduction audio device 10 may differentiate between music and mixed sounds based on zero-crossing rate variance. For example, at 420, the active noise reduction audio device 10 analyzes the variance of the zero-crossing rate of the first signal to determine whether the variance of the zero-crossing rate of the first signal is above a given threshold.
  • if the variance of the zero-crossing rate is not above the threshold, the active noise reduction audio device 10 determines at 422 that the first signal is a music signal.
  • the active noise cancellation audio device 10 may use the reserve pool model to estimate the first signal at 424 to generate the third signal.
  • although a reserve pool model is used here for estimation based on the music signal, this is for illustration only and does not limit the scope of the present disclosure.
  • a weighted model may be used to estimate the third signal based on the music signal.
  • if the variance of the zero-crossing rate is above the threshold, the ANC audio device 10 determines at 421 that the first signal is a mixed signal. For the mixed signal, the ANC audio device 10 may estimate the first signal at 423 using the weighted model to generate the second signal. In one embodiment, the active noise reduction audio device 10 may estimate the first signal using at least two models among the reserve pool model, the LPC model and the neural network model, and assign weight coefficients to the estimated results of each model. For example, if there are more voiced signals and fewer noise signals, a higher weight coefficient, such as 0.75, can be assigned to the result estimated by the LPC model for the voiced signal, and a lower weight coefficient, such as 0.25, can be assigned to the result estimated by the neural network model for the noise signal.
  • the corresponding estimation results are then multiplied by the weight coefficients, and the two products are added to obtain the final second signal.
  • the reserve pool model, the LPC model and the neural network model can also be used for weighted estimation to obtain the final third signal. It can be understood that the weighting coefficient for each model in the weighting model can be dynamically adjusted. For example, based on the residual signal described later, the active noise reduction audio device 10 can dynamically adjust each coefficient to obtain a better noise reduction effect.
  • noise reduction can be effectively performed even if the ambient sound has a wide sound frequency range, thereby significantly increasing the noise reduction width.
  • loud ambient noises usually have specific sound characteristics and belong to specific categories. Therefore, by applying an estimation model in a targeted manner, the noise level can also be effectively reduced to obtain a significant noise reduction depth.
  • the active noise reduction audio device 10 may analyze various features of the first signal, and directly determine the classification category of the first signal based on the analysis result, without the need for the step-by-step analysis shown in FIG. 4.
  • corresponding model estimation approaches may be added or removed to more accurately generate the third signal and then invert to generate the second signal.
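  • the step-by-step classification of FIG. 4 can be summarized as a small decision tree. In the following sketch, the function name, the feature argument names, the threshold dictionary and its values, and the strings in `MODEL_FOR_CATEGORY` are hypothetical; only the order of the decisions and the pairing of categories with models follow the description above.

```python
def classify_ambient_sound(energy, zcr, low_energy_rate, low_freq_rate, zcr_var, th):
    """Walk the decision steps of FIG. 4 using precomputed features and thresholds `th`."""
    if energy < th["energy"] and zcr < th["zcr"]:
        return "silence"                              # steps 404 and 406
    if low_energy_rate > th["low_energy_rate"]:       # step 410
        if low_freq_rate > th["low_freq_rate"]:       # step 412
            return "speech"                           # step 413
        return "noise"                                # step 414
    if zcr_var > th["zcr_var"]:                       # step 420
        return "mixed"                                # step 421
    return "music"                                    # step 422

# Estimation model suggested for each category in the description (None: no estimate needed).
MODEL_FOR_CATEGORY = {
    "silence": None,
    "speech": "reserve pool / LPC",
    "noise": "neural network",
    "music": "reserve pool",
    "mixed": "weighted",
}

# Illustrative thresholds only; real values would depend on the device design.
THRESHOLDS = {"energy": 1e-4, "zcr": 0.1, "low_energy_rate": 0.4,
              "low_freq_rate": 0.5, "zcr_var": 0.01}

category = classify_ambient_sound(0.1, 0.2, 0.6, 0.8, 0.0, THRESHOLDS)
print(category, MODEL_FOR_CATEGORY[category])
```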
  • FIG. 5 is a schematic diagram of a process 500 of active noise reduction according to one embodiment of the present disclosure.
  • the processor 17 may selectively filter 520 the first signal 502 to generate a filtered first filtered signal 504 .
  • filtering refers to operations performed on a signal other than estimation and inversion operations.
  • Filtering 520 may include, for example, existing or future signal processing such as adjusting gain, band filtering, signal noise reduction, and the like. Alternatively, filtering 520 may also be absent in some embodiments.
  • The processor 17 may then estimate 530 the first filtered signal 504, e.g., using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate the second signal for the second period and output it to the first speaker 15. Note that, in FIG. 5, the estimation 530 includes the step of inversion, so the inversion operation is not repeated here.
  • Although filtering 520 and estimating 530 are described above using processor 17 as an example, filtering 520 and estimating 530 may also be performed by other components. For example, filtering 520 may be performed by a separate filter.
  • Although FIG. 5 shows the filtering 520 performed first and the estimation 530 performed afterwards, it is also possible to perform the estimation 530 first and then perform the filtering 520.
  • the first speaker 15 plays the received second signal to generate the first sound 512 in the ear.
  • The direct sound 514 of the ambient sound during the second period also passes directly into the ear via the active noise cancellation audio device 10. Since the direct sound 514 is in opposite phase to the first sound 512, the two cancel each other to a certain extent, so that the net volume of the sound 516 perceived inside the ear is reduced compared to the direct sound 514 and the first sound 512. In this way, the active noise reduction audio device 10 can realize active noise reduction.
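  • The cancellation effect can be illustrated numerically. The sketch below assumes a 200 Hz tone sampled at 16 kHz and a deliberately imperfect inverted estimate at 90% of the true amplitude; these values and all names are illustrative assumptions rather than figures from the disclosure.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rate = 16000  # assumed sampling rate in Hz
# Direct sound reaching the ear during the second period (two cycles of 200 Hz).
direct = [math.sin(2 * math.pi * 200 * n / rate) for n in range(160)]
# Predicted inverted sound played by the speaker: opposite phase, 90% amplitude.
first_sound = [-0.9 * s for s in direct]
# What remains in the ear is the sample-wise sum of the two sounds.
residual = [d + f for d, f in zip(direct, first_sound)]

reduction = rms(residual) / rms(direct)  # fraction of the level that remains
```

  • With a perfect estimate the residual would vanish; here about one tenth of the original level remains, which is the kind of quantity a residual microphone would observe.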
  • Because the inverted sound used for cancellation is the "predicted" inverted sound for the second period, there is no need to provide the inverted sound of the first period at the instant when the direct sound of the first period reaches the inside of the ear. This is equivalent to shifting the generation of the inverted sound backward by a period of time, which relaxes the strict requirements on the processing speed of the processor 17 and avoids or alleviates the unsatisfactory noise reduction caused by a mismatch between the direct sound and the inverted sound of the same period. In addition, hardware design complexity and hardware cost can be reduced, since there is no need to use ultra-high-speed computing circuits in the active noise cancellation audio device 10.
  • the active noise cancellation audio device 10 also has a built-in residual microphone 14 .
  • the residual microphone 14 is configured to pick up residual sound 516 inside the ear. Residual sound 516 is actually the residual sound after noise reduction.
  • the residual microphone 14 thus also generates a residual signal 508 representing the residual sound 516 based on the acquired residual sound 516 and feeds the residual signal 508 back to the processor 17 as an evaluation signal.
  • Processor 17 may adjust at least one of filtering 520 and estimating 530 based on residual signal 508 .
  • the filtering 520 is adaptive filtering, which can automatically adjust the filtering 520 based on the residual signal 508 .
  • the processor 17 may also adjust the estimate 530 based on the residual signal 508 .
  • The time length of the second period, which follows the first period and for which the processor 17 estimates the inverted audio signal based on the sound collected during the first period, may be adjusted. If the residual signal 508 is small, this indicates a good estimate. Therefore, the processor 17 can accordingly increase the time length of the second period based on the residual signal 508, for example, increase the number of time points included in the second period from 1 to 2, 3 or 4. If the residual signal 508 is large, this indicates that the estimation is less effective. Therefore, the processor 17 can correspondingly reduce the time length of the second period based on the residual signal 508, for example, reduce the number of estimated time points included in the second period from 4 to 3, 2, 1, or even 0 (i.e., no estimation).
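  • One way to picture this adjustment is a simple threshold rule. The sketch below is an assumption for illustration only: the disclosure does not specify thresholds or step sizes, so `low`, `high`, the one-step change and the function name `adjust_horizon` are all hypothetical.

```python
def adjust_horizon(horizon, residual_level, low=0.05, high=0.2, max_horizon=4):
    """Return the new number of time points to estimate in the second period.

    horizon:        current number of estimated time points (0 means no estimation)
    residual_level: magnitude of the residual signal (hypothetical scalar measure)
    low, high:      hypothetical thresholds for "small" and "large" residuals
    """
    if residual_level < low:
        # Small residual: the estimate works well, so look further ahead.
        return min(horizon + 1, max_horizon)
    if residual_level > high:
        # Large residual: the estimate works poorly, so shorten the period.
        return max(horizon - 1, 0)
    # In between: keep the current length.
    return horizon
```

  • For example, `adjust_horizon(1, 0.01)` lengthens the period to 2 time points, while `adjust_horizon(1, 0.5)` shortens it to 0, which corresponds to not estimating at all.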
  • the processor 17 may also alter the estimation model or adjust the model parameters. For example, changing from an LPC model to a reserve pool model, or adjusting the parameters of a weighted model.
  • the processor 17 can also adjust the filtering 520, because the unsatisfactory noise reduction effect may also be caused by the filtering 520. It can be understood that the processor 17 can adjust at least one of the above-mentioned second period length, estimated model, model parameters and filtering 520 to obtain better noise reduction performance.
  • The processor 17 may also adjust at least one of the length of the second period, the estimation model, the model parameters and the filtering 520. For example, it may avoid using the reserve pool model and the LPC model, or reduce their weight coefficients in the weighted model.
  • Although the residual microphone 14 and the adaptive filtering and estimation adjustment based on the residual signal 508 are shown in FIG. 5, this is for illustration only and does not limit the scope of the present disclosure. In some embodiments, there may be no residual microphone 14, the filtering 520 is fixed filtering, and the estimation 530 of the second signal is not adjusted accordingly. Furthermore, although only the first microphone 13, the first speaker 15 and the residual microphone 14 are shown in FIG. 5, this is for illustration only, and the active noise cancellation audio device 10 may have more microphones and/or speakers.
  • the first microphone 13 may collect ambient sound during the first period of time to generate the first signal.
  • The processor 17 may determine whether the previous estimate is accurate based on the previously estimated initial inverted estimation signal for the first period and the collected first signal representing the real ambient sound of the first period, and adjust the time length of subsequent estimation based on the result. For example, the time length of the second period following the first period is adjusted. If the sum of the first signal and the initial inverted estimation signal is small, this indicates that the estimation result is good, and the time length of subsequent estimation can be maintained or increased. Conversely, if the sum of the first signal and the initial inverted estimation signal is large, this indicates that the estimation result is poor, and the time length of subsequent estimation can be correspondingly reduced.
  • the processor 17 may also determine whether the previous estimate was accurate based on the previously estimated estimate signal for the first period and the first signal. If the difference between the first signal and the estimated signal is small, this indicates that the estimation result is better, and the time length of subsequent estimation can be maintained or increased. Conversely, if the difference between the first signal and the estimated signal is large, it indicates that the estimation result is poor, and the time length of subsequent estimation can be correspondingly reduced. Furthermore, similar to the residual signal, in addition to adjusting the time length of the subsequent estimation, the above-described manner can also be used to adjust the estimation model and/or filtering.
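  • The accuracy check without a residual microphone can be sketched as a comparison between the measured first signal and the earlier inverted estimate for the same period. The mean absolute sum used below is one possible measure chosen for illustration; the disclosure only states that a small sum indicates a good estimate, and the function name is hypothetical.

```python
def estimation_error(first_signal, initial_inverted_estimate):
    """Mean absolute value of (measured sample + inverted estimate).

    A perfect inverted estimate is the exact negative of the measured
    signal, so the sum is zero everywhere and the error is 0.0.
    """
    if len(first_signal) != len(initial_inverted_estimate):
        raise ValueError("signals must cover the same time points")
    total = sum(abs(x + y) for x, y in zip(first_signal, initial_inverted_estimate))
    return total / len(first_signal)


perfect = estimation_error([1.0, -1.0], [-1.0, 1.0])    # exact anti-phase
imperfect = estimation_error([1.0, -1.0], [-0.5, 0.5])  # half the amplitude
```

  • The resulting value could then be compared against a threshold to decide whether to keep, lengthen or shorten the subsequent estimation period.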
  • FIG. 6 is a schematic diagram of a process 600 of active noise reduction according to another embodiment of the present disclosure.
  • After the first microphone 13 of the active noise reduction audio device 10 collects the ambient sound, it converts the sound into a first signal 602 and outputs it to the processor 17.
  • The processor 17 may perform a selective first filtering 620 on the first signal 602 to generate a first filtered signal 604.
  • The first filtering 620 may include, for example, processing such as adjusting gain, band filtering, noise reduction, and the like.
  • The processor 17 may then perform a first estimation 630 on the first filtered signal 604, e.g., using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate the second signal 606 for the first speaker 15.
  • the first estimation 630 includes an inversion step, so the inversion operation will not be repeated here.
  • the first speaker 15 plays the received second signal 606 to produce the first sound 612 in the ear.
  • FIG. 6 differs from FIG. 5 in that FIG. 6 also has a second branch to perform second filtering 622 and second estimation 632 .
  • the first microphone 13 outputs the first signal 602 to the processor 17 .
  • Processor 17 may perform selective second filtering 622 on first signal 602 to generate filtered second filtered signal 605 .
  • The second filtering 622 may include, for example, adjusting gain, band filtering, noise reduction, etc., and the second filtering 622 may be the same as or different from the first filtering 620.
  • The processor 17 may then perform a second estimation 632 on the second filtered signal 605, e.g., using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate the sixth signal 607 for the second speaker 18.
  • the second estimation 632 includes an inversion step, so the inversion operation will not be repeated here.
  • the second speaker 18 plays the received sixth signal 607 to produce a second sound 613 in the ear.
  • The direct sound 614 of the ambient sound during the second period also passes directly into the ear via the active noise cancellation audio device 10. Since the direct sound 614 is in opposite phase to the first sound 612 and the second sound 613, the direct sound 614 and the first sound 612 and the second sound 613 cancel each other to a certain extent, so that the net volume of the sound 616 perceived inside the ear is reduced compared to the direct sound 614, the first sound 612 and the second sound 613. In this way, the active noise reduction audio device 10 can realize active noise reduction.
  • the second filter 622 is different from the first filter 620 and the second estimate 632 is different from the first estimate 630 .
  • the first filtering 620 may be for speech signals and the first estimation 630 is also for speech signals, while the second filtering 622 is for music signals and the second estimation 632 is also for music signals.
  • the first filtering 620 is for low frequency audio signals and the first estimation 630 is also for low frequency signals, while the second filtering 622 is for medium and high frequency signals and the second estimation 632 is also for medium and high frequency signals.
  • the optimal settings can be set accordingly for each class of signals.
  • The first speaker 15 and the second speaker 18 can also be selected to suit different types of sounds, so as to obtain better noise reduction depth and noise reduction width, because speakers designed for a specific type of sound often have better sound tuning than general-purpose wide-range speakers.
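  • The two-branch arrangement, with one branch for low frequencies and one for medium and high frequencies, can be pictured as a band split ahead of the two estimations. The sketch below uses a one-pole low-pass filter and its complement purely as an assumed example; the disclosure does not specify the filter type, and `split_bands` and `alpha` are hypothetical names.

```python
def split_bands(samples, alpha=0.1):
    """Split a signal into a low band and a complementary mid/high band.

    alpha is the smoothing factor of a one-pole low-pass filter
    (an assumed filter choice; smaller alpha means a lower cutoff).
    """
    low, high = [], []
    state = 0.0
    for x in samples:
        state += alpha * (x - state)  # one-pole low-pass update
        low.append(state)             # input for the low-frequency branch
        high.append(x - state)        # remainder for the mid/high-frequency branch
    return low, high


# A constant (0 Hz) input ends up almost entirely in the low band.
low_band, high_band = split_bands([1.0] * 100)
```

  • Because the two bands sum back to the input, each branch can be filtered, estimated and played through its own speaker without losing part of the signal.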
  • FIG. 7 shows a schematic diagram of another active noise cancellation audio device 20 in which embodiments of the present disclosure may be implemented.
  • the active noise cancellation audio device 20 may be, for example, a headset.
  • the active noise cancellation audio device 20 may include a pair of ear cup portions, and the two ear cup portions are configured substantially identically to each other. Therefore, only one ear cup part is schematically described.
  • the ear cup part includes an external first microphone 13, a second microphone 19 and a processor 17 inside the ear cup part.
  • The ear cup part also includes a first residual microphone 14, a second residual microphone 16, a first speaker 15 and a second speaker 18 located inside the ear cup part (relative to the first microphone 13 and the second microphone 19, which are exposed to the environment).
  • Both the first microphone 13 and the second microphone 19 are configured to detect or collect sounds of the external environment, and the first microphone 13 and the second microphone 19 may operate simultaneously or alternately and may collect the same or different sounds.
  • the first microphone 13 may have an internal first filter to pick up only sounds of the first frequency
  • the second microphone 19 may have an internal second filter to pick up only sounds of the second frequency.
  • the first frequency is a low frequency
  • the second frequency is a mid-high frequency.
  • the external first microphone 13 and the second microphone 19 of the active noise reduction audio device 20 collect ambient sound, and perform acousto-electrical conversion to generate a continuous electrical signal and transmit it to the processor 17 .
  • the processor 17 predicts or estimates the ambient sound at the subsequent time based on the received signal, and generates and transmits an inverted signal representing the ambient sound at the subsequent time to the built-in first speaker 15 and second speaker 18 .
  • the first speaker 15 and the second speaker 18 play the reversed-phase sound based on the received reversed-phase signal to cancel the direct ambient sound from the environment directly into the active noise reduction audio device 20 at a subsequent moment, so as to achieve the effect of noise reduction.
  • FIG. 8 is a schematic diagram of a process 800 of active noise reduction according to yet another embodiment of the present disclosure.
  • process 800 may be implemented in the active noise cancellation audio device 20 shown in FIG. 7 .
  • the processor 17 may perform a selective first filtering 820 on the first signal 802 to generate a filtered first filtered signal 804 .
  • Filtering 820 may include, for example, adjusting gain, band filtering, noise reduction, and the like.
  • The processor 17 may then perform a first estimation 830 on the first filtered signal 804, e.g., using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate the second signal 806 for the first speaker 15.
  • the estimation 830 includes an inversion step, so the inversion operation is not repeated here.
  • Although the first filtering 820 and the first estimation 830 are described above with the processor 17 as an example, the first filtering 820 and the first estimation 830 may also be performed by other components.
  • the first filtering 820 may be performed by a separate filter.
  • Although the first filtering 820 is described as being performed before the first estimation 830, it is also possible to perform the first estimation 830 first and then perform the first filtering 820.
  • After the second microphone 19 of the active noise cancellation audio device 20 collects ambient sound, it converts the sound into a fifth signal and outputs it to the processor 17.
  • the processor 17 may perform a selective second filtering 822 on the fifth signal to generate a filtered second filtered signal 805 .
  • the second filtering 822 may include, for example, processing such as adjusting gain, band filtering, noise reduction, and the like.
  • the processor 17 may then perform a second estimate 832 on the second filtered signal 805, eg, using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate a sixth signal 807 corresponding to the second time period and output it to The built-in second speaker 18 of the active noise cancellation audio device 20 .
  • the second estimation 832 includes an inversion step, so the inversion operation will not be repeated here.
  • the first speaker 15 plays the received second signal 806 to produce the first sound 823 in the ear
  • the second speaker 18 plays the received sixth signal 807 to produce the second sound 825 in the ear.
  • the first sound 823 and the second sound 825 may be played simultaneously or alternately.
  • The direct sound 824 of the ambient sound during the second period also passes directly into the ear via the active noise cancellation audio device 20. Since the direct sound 824 is in opposite phase to the first sound 823 and the second sound 825, the direct sound 824 and the first sound 823 and the second sound 825 cancel each other to a certain extent, so that the net volume of the sounds 826 and 827 is reduced compared to the direct sound 824, the first sound 823 and the second sound 825. In this way, the active noise reduction audio device 20 can implement active noise reduction.
  • the active noise reduction audio device 20 also has a built-in first residual microphone 14 and a second residual microphone 16 .
  • The first residual microphone 14 is configured to pick up the first residual sound 826 inside the ear.
  • the first residual sound 826 is actually the residual sound after noise reduction.
  • the first residual microphone 14 thus also generates a first residual signal 808 representative of the first residual sound 826 based on the acquired first residual sound 826 and feeds back the first residual signal 808 to the processor 17 .
  • Processor 17 may adjust at least one of first filtering 820 and first estimation 830 based on first residual signal 808 .
  • the first filtering 820 is adaptive filtering, which can automatically adjust the first filtering 820 based on the first residual signal 808 .
  • the processor 17 may also adjust the first estimate 830 based on the first residual signal 808 .
  • The time length of the second period, which follows the first period and for which the processor 17 estimates the inverted audio signal based on the sound signal collected during the first period, may be adjusted. If the first residual signal 808 is small, this indicates a good estimation effect. Therefore, the processor 17 can correspondingly increase the time length of the second period based on the first residual signal 808, for example, increase the number of time points included in the second period from 1 to 2, 3 or 4. If the first residual signal 808 is large, this indicates that the estimation effect is less than ideal. Therefore, the processor 17 can correspondingly reduce the time length of the second period based on the first residual signal 808, for example, reduce the number of estimated time points included in the second period from 4 to 3, 2, 1, or even 0 (i.e., no estimation).
  • the processor 17 may also change the estimation model or adjust the model parameters. For example, changing from an LPC model to a reserve pool model, or adjusting the parameters of a weighted model.
  • the processor 17 may also adjust the first filter 820, because the unsatisfactory noise reduction effect may also be caused by the first filter 820. It can be understood that the processor 17 can adjust at least one of the above-mentioned second period length, estimated model, model parameters and first filtering 820 to obtain better noise reduction performance.
  • The processor 17 may also adjust at least one of the length of the second period, the estimation model, the model parameters and the first filtering 820. For example, it may avoid using the reserve pool model and the LPC model, or reduce their weight coefficients in the weighted model.
  • the processor 17 may adjust at least one of the length of the second period, the estimated model, the model parameters, and the second filtering 822 based on the second residual signal 810 to obtain better noise reduction performance.
  • Although the first residual microphone 14, the second residual microphone 16, and the corresponding adaptive filtering and estimation adjustments based on the first residual signal 808 and the second residual signal 810 are shown in FIG. 8, this is only illustrative and does not limit the scope of the present disclosure.
  • The first residual microphone 14 and/or the second residual microphone 16 may be absent; in that case, for example, the first filtering 820 and/or the second filtering 822 are fixed filtering, and the first estimation 830 for the second signal 806 and/or the second estimation 832 for the sixth signal 807 is not adjusted accordingly.
  • the active noise cancellation audio device 20 may have more microphones and/or speakers.
  • Although the first sound 823 and the second sound 825 are shown as each being cancelled separately against the direct sound 824, this is only a schematic example for the case where the first sound 823 and the second sound 825 are played alternately.
  • The first sound 823, the second sound 825 and the direct sound 824 can also be added together, producing the same residual sound, which is captured by the first residual microphone 14 and the second residual microphone 16.
  • the first residual signal 808 and the second residual signal 810 may be the same.
  • the first residual signal 808 and the second residual signal 810 may also be different in this case, eg due to the performance and location of the first residual microphone 14 and the second residual microphone 16 .
  • the processor 17 may average the first residual signal 808 and the second residual signal 810 for use in the first filtering 820 , the second filtering 822 , the first estimation 830 and the second estimation 832 . This avoids noise-canceling disturbances due to location or microphone performance, providing a more stable noise-canceling effect.
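  • Averaging the two residual signals can be sketched as a sample-wise mean. The sample values below are made up for illustration, and the function name `average_residual` is hypothetical.

```python
def average_residual(first_residual, second_residual):
    """Sample-wise mean of the signals from the two residual microphones."""
    if len(first_residual) != len(second_residual):
        raise ValueError("residual signals must have the same length")
    return [(a + b) / 2 for a, b in zip(first_residual, second_residual)]


# Hypothetical readings that differ slightly because of microphone position.
combined = average_residual([0.2, -0.4], [0.4, 0.0])
```

  • The averaged signal can then serve as the single feedback input for both filtering branches and both estimations.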
  • Because the inverted sound used for cancellation is the "predicted" inverted sound for the second period, there is no need to provide the inverted sound of the first period at the instant when the direct sound of the first period reaches the inside of the ear. This is equivalent to shifting the generation of the inverted sound backward by a period of time, which relaxes the strict requirements on the processing speed of the processor 17 and avoids or alleviates the unsatisfactory noise reduction caused by a mismatch between the direct sound and the inverted sound of the same period.
  • Hardware design complexity and hardware cost can also be reduced, because ultra-high-speed computing devices are not required.
  • first filter 820 and the second filter 822 may be the same or different, and the first estimate 830 and the second estimate 832 may be the same or different.
  • the second filter 822 is different from the first filter 820
  • the second estimate 832 is different from the first estimate 830 .
  • the first filtering 820 may be for speech signals and the first estimation 830 is also for speech signals
  • the second filtering 822 is for music signals and the second estimation 832 is also for music signals
  • the first filtering 820 is for low frequency audio signals and the first estimation 830 is also for low frequency signals
  • the second filtering 822 is for medium and high frequency signals and the second estimation 832 is also for medium and high frequency signals.
  • the first sound 823 is a low frequency sound
  • the second sound 825 is a medium and high frequency sound.
  • the optimal settings can be set accordingly for each class of signals.
  • The first speaker 15 and the second speaker 18 can also be selected to suit different types of sounds, so as to obtain better noise reduction depth and noise reduction width, because speakers designed for a specific type of sound often have better sound tuning than general-purpose wide-range speakers.
  • FIG. 9 is a schematic block diagram of a computer-readable storage medium 900 according to one embodiment of the present disclosure.
  • the computer-readable storage medium 900 is, for example, a cache in the processor 17, the internal memory 121 in FIG. 2, and the like.
  • the computer readable storage medium 900 stores one or more programs 902 . . . 906 configured to be executed by one or more processors of an active noise cancellation audio device.
  • One or more of the programs 902 . . . 906 may individually or collectively include instructions that may be executed by the processor 17 to implement the methods or processes described herein, such as the method 300 shown in FIG. 3.
  • the computer-readable storage medium 900 may also include programs for implementing other methods and steps.
  • FIG. 10 is a schematic block diagram of an apparatus 1000 for denoising ambient sound according to an embodiment of the present disclosure.
  • the apparatus 1000 may be applied to active noise reduction audio equipment.
  • the apparatus 1000 includes: an acquisition module and an inversion estimation signal generation module.
  • the acquisition module is used for acquiring the first signal and the evaluation signal corresponding to the first period.
  • the first signal represents the first ambient sound collected by the first microphone of the active noise reduction device during the first period of time.
  • The inversion estimation signal generation module is used for determining, based on the evaluation signal, a time length of a second period, the second period following the first period; and for generating, based on the first signal and the time length of the second period, a second signal to be played by a first speaker of the active noise reduction device, the second signal representing an inverted estimate of the first ambient sound during the second period.
  • Apparatus 1000 may also include corresponding modules for performing various steps in method 300, process 400, process 500, process 600 and/or process 800 described above.
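  • The two modules of the apparatus can be sketched as a small pair of classes. This is an illustrative assumption, not the disclosed design: the class names, the threshold, the two period lengths and especially the straight-line extrapolation used as the predictor are hypothetical stand-ins for the estimation models described earlier.

```python
from dataclasses import dataclass


@dataclass
class Acquisition:
    """Output of the acquisition module for one first period."""
    first_signal: list  # samples collected by the first microphone
    evaluation: float   # evaluation signal level (e.g. residual magnitude)


class InversionEstimationModule:
    """Chooses the second-period length and generates the second signal."""

    def __init__(self, threshold=0.1, short=1, long=4):
        self.threshold = threshold  # hypothetical boundary for a "good" evaluation
        self.short = short          # time points to estimate when evaluation is poor
        self.long = long            # time points to estimate when evaluation is good

    def horizon(self, evaluation):
        """Time length of the second period, in time points."""
        return self.long if evaluation < self.threshold else self.short

    def second_signal(self, acq):
        """Predict the second period and return the inverted prediction."""
        count = self.horizon(acq.evaluation)
        last, prev = acq.first_signal[-1], acq.first_signal[-2]
        step = last - prev
        # Placeholder predictor: continue the last slope in a straight line.
        predicted = [last + step * (k + 1) for k in range(count)]
        return [-p for p in predicted]


module = InversionEstimationModule()
good = module.second_signal(Acquisition([0.0, 1.0, 2.0], evaluation=0.05))
poor = module.second_signal(Acquisition([0.0, 1.0, 2.0], evaluation=0.5))
```

  • With a good evaluation the module looks four time points ahead; with a poor one it looks only one point ahead, mirroring the adjustment of the second period described above.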

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

A method for active noise reduction, comprising: obtaining a first signal (302), the first signal representing first ambient sound acquired by a first microphone of an active noise reduction device during a first period of time; obtaining an evaluation signal corresponding to the first period of time (304); determining the length of a second period of time on the basis of the evaluation signal (306), the second period of time following the first period of time; and generating, on the basis of the first signal and the length of the second period of time, a second signal to be played by a first loudspeaker of the active noise reduction device (308), the second signal representing an inverse estimate of the first ambient sound during the second period of time. Further disclosed are a corresponding active noise reduction audio device and a computer readable storage medium. The active noise reduction audio device estimates inverse-phased sound at a subsequent moment on the basis of the current sound and plays the inverse-phased sound; and an estimated time length of the subsequent inverse-phased sound is adjusted by using a residual signal, so that the accuracy of inverse-phased signal estimation can be dynamically adjusted and a noise reduction effect can be improved.

Description

Active noise reduction audio device and method for active noise reduction
Technical field
The present disclosure relates to the field of audio processing, and more particularly to a method for reducing ambient sound in an active noise reduction audio device and to an active noise reduction audio device.
Background
As portable devices such as smartphones become more widely used, the audio devices paired with them are also used more and more. When using an audio device in a noisy environment, a user often wants to filter out or reduce the ambient sound, either to hear clearly the audio from the portable device played by the audio device or simply to obtain a quiet environment. In this case, even ambient sound such as music amounts to noise for the user.
Audio devices generally have two noise filtering methods (noise reduction methods) to filter out or reduce ambient sound. One is passive noise reduction, which typically blocks external noise by enclosing the ear to form a closed space or by using sound-insulating materials such as silicone earplugs. The effect of passive noise reduction is limited: it generally blocks only high-frequency noise and has little effect on low-frequency noise. In addition, passive noise reduction audio devices can cause discomfort to the ears.
The other is active noise reduction. In active noise reduction, the audio device generates anti-phase sound waves equal to the external noise and thereby neutralizes the noise, achieving the noise reduction effect. For example, an active noise reduction audio device typically provides an ambient microphone at its outermost surface to detect ambient sound. Before the ambient sound reaches the inside of the ear directly, the device must finish detecting the ambient sound and calculating and generating the inverted signal. Ideally, when the inverted signal is calculated accurately and reaches the inside of the ear at the same time as the direct sound, a good noise reduction effect is obtained. However, limited by factors such as the calculation of the inverted signal, conventional active noise reduction audio devices still do not reduce ambient sound satisfactorily in some situations, and in some cases may also require high-cost circuit components.
Summary of the invention
In view of the above problems, embodiments of the present disclosure aim to provide a technical solution for noise reduction of ambient sound.
According to a first aspect of the present disclosure, a method for active noise reduction is provided. The method includes acquiring a first signal, the first signal representing a first ambient sound collected by a first microphone of an active noise reduction device during a first period; and acquiring an evaluation signal corresponding to the first period. The method also includes determining, based on the evaluation signal, a time length of a second period, the second period following the first period. The method further includes generating, based on the first signal and the time length of the second period, a second signal to be played by a first speaker of the active noise reduction device. The second signal represents an anti-phase estimate of the first ambient sound during the second period. The evaluation signal represents the effect of the active noise reduction and/or the accuracy of the estimation. In one implementation, the evaluation signal may be obtained by a residual microphone described in detail later. Alternatively or additionally, in another implementation, the evaluation signal may be determined from an estimated signal that the processor estimated during an initial period and the ambient sound collected in the first period. By estimating the anti-phase signal for the second period in advance and playing it, the anti-phase sound can reach the human ear in synchronization with the direct sound at the subsequent time. This shortens the delay of the anti-phase sound, and the anti-phase sound cancels the direct sound, effectively improving the noise reduction effect. By using the residual signal to adjust the time length of the subsequent anti-phase sound estimate, the accuracy of the anti-phase signal estimate can be adjusted dynamically and the noise reduction effect improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimated time length can be extended. Conversely, when the residual signal indicates that the active noise reduction effect is not ideal, the estimated time length can be shortened to improve the estimation accuracy.
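The length adjustment described for the first aspect can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the function name `next_horizon`, the decibel thresholds, the doubling and halving rule, and the sample-count bounds are all assumptions introduced here. It compares the residual energy with the energy of the ambient signal and lengthens or shortens the next estimation period accordingly.

```python
import math


def next_horizon(residual, reference, current_len,
                 min_len=8, max_len=256, good_db=-12.0, bad_db=-3.0):
    """Return the length (in samples) of the next estimation period.

    residual  : evaluation signal for the first period (hypothetical samples)
    reference : first signal (ambient sound) for the same period
    All thresholds and bounds are illustrative assumptions.
    """
    eps = 1e-12  # avoids log(0) and division by zero
    e_res = sum(x * x for x in residual) + eps
    e_ref = sum(x * x for x in reference) + eps
    ratio_db = 10.0 * math.log10(e_res / e_ref)
    if ratio_db <= good_db:      # cancellation is good: estimate further ahead
        return min(max_len, current_len * 2)
    if ratio_db >= bad_db:       # cancellation is poor: shorten the period
        return max(min_len, current_len // 2)
    return current_len           # otherwise keep the current length
```

A caller would evaluate `next_horizon` once per period and pass the result to whatever estimator produces the second signal.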
In some implementations, generating, based on the first signal and the time length of the second period, the second signal to be played by the first speaker of the active noise reduction device includes: generating a third signal based on the first signal and the time length of the second period, the third signal representing an estimate of the first ambient sound during the second period; and inverting the third signal to generate the second signal. Estimating first and then inverting allows the anti-phase signal to be obtained more accurately.
In some implementations, generating, based on the first signal and the time length of the second period, the second signal to be played by the first speaker of the active noise reduction device includes: inverting the first signal to generate a fourth signal, the fourth signal representing an anti-phase sound of the first ambient sound during the first period; and generating the second signal based on the fourth signal and the time length of the second period.
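The two orderings above (estimate then invert, or invert then estimate) can be illustrated with a small sketch. The autoregressive predictor and its coefficients are assumptions chosen for illustration; the disclosure does not prescribe them. For a linear predictor such as this one the two orderings give the same output, whereas for a nonlinear estimation model they may differ.

```python
def predict(history, coeffs, n_future):
    """Autoregressive extrapolation: x[t] = sum(coeffs[k] * x[t-1-k])."""
    buf = list(history)
    out = []
    for _ in range(n_future):
        nxt = sum(c * buf[-1 - k] for k, c in enumerate(coeffs))
        buf.append(nxt)
        out.append(nxt)
    return out


def invert(signal):
    """Phase inversion: negate every sample."""
    return [-s for s in signal]


def estimate_then_invert(first_signal, coeffs, n_future):
    third_signal = predict(first_signal, coeffs, n_future)  # estimate
    return invert(third_signal)                             # second signal


def invert_then_estimate(first_signal, coeffs, n_future):
    fourth_signal = invert(first_signal)                    # anti-phase of period 1
    return predict(fourth_signal, coeffs, n_future)         # second signal
```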
In some implementations, acquiring the evaluation signal corresponding to the first period includes acquiring, as the evaluation signal, a residual signal collected during the first period by a residual microphone of the active noise reduction audio device. The residual microphone is different from the first microphone. Using the residual microphone to collect the residual sound as the evaluation signal gives a more accurate and direct measure of the active noise reduction effect, so that the active noise reduction can be adjusted more effectively.
In some implementations, acquiring the evaluation signal corresponding to the first period includes determining the evaluation signal based on the first signal and an anti-phase estimated signal that corresponds to the first period and was estimated in a period preceding the first period, or based on the first signal and an estimated signal that corresponds to the first period and was estimated in a period preceding the first period. Alternatively, acquiring the evaluation signal corresponding to the first period includes determining the evaluation signal based on an inverted version of the first signal and an estimated signal that corresponds to the first period and was estimated in a period preceding the first period, or based on an inverted version of the first signal and an anti-phase estimated signal that corresponds to the first period and was estimated in a period preceding the first period. Obtaining the residual signal in multiple ways allows the current noise reduction effect to be evaluated more comprehensively and accurately.
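As a hedged illustration of the processor-computed evaluation signal, the sketch below forms a residual from the first signal and a previously produced estimate for the same period. The function names and the simple sample-wise difference are assumptions; the disclosure leaves the exact computation open.

```python
def evaluation_from_estimate(first_signal, estimate):
    """Residual = measured ambient sound minus the earlier estimate of it."""
    return [x - e for x, e in zip(first_signal, estimate)]


def evaluation_from_inverted_estimate(first_signal, inverted_estimate):
    """Residual = measured ambient sound plus the earlier anti-phase estimate."""
    return [x + a for x, a in zip(first_signal, inverted_estimate)]
```

Both functions return the same residual when the anti-phase estimate is the exact negation of the estimate; a residual near zero indicates an accurate estimate.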
In some implementations, generating, based on the initial input signal and the time length of the second period, the initial output signal to be played by the first speaker of the active noise reduction audio device includes: determining a category of the first ambient sound during the first period based on the initial input signal; determining, based on the category of the first ambient sound, an estimation model corresponding to that category; and generating the second signal based on the estimation model, the time length of the second period, and the first signal. Classifying the ambient sound allows the anti-phase estimation of the ambient sound to be better targeted. Using an estimation model that matches the ambient sound improves the estimation accuracy and correspondingly increases the noise reduction width and the noise reduction depth.
In some implementations, generating the second signal based on the estimation model, the time length of the second period, and the first signal includes: if the determined category indicates that the ambient sound is speech, estimating the first signal using a speech model for generating the second signal; if the determined category indicates that the ambient sound is noise, estimating the first signal using a neural network model for generating the second signal; if the determined category indicates that the ambient sound is music, estimating the first signal using a reservoir model for generating the second signal; and if the determined category indicates that the ambient sound is a mixed sound, estimating the first signal using a weighted model for generating the second signal. By classifying the ambient sound as speech, noise, music, or mixed sound and estimating it with the corresponding speech model, neural network model, reservoir model, or weighted model, the estimated signal can be obtained more accurately and the active noise reduction effect improved accordingly.
In some implementations, if the speech is voiced, a reservoir model is used to estimate the first signal for generating the second signal; and if the speech is unvoiced, a linear prediction model is used to estimate the first signal for generating the second signal. Further subdividing speech into unvoiced and voiced sounds further improves the estimation accuracy for speech and correspondingly improves the active noise reduction effect for speech.
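The category-to-model mapping described above can be written as a small dispatch function. This is a sketch only: the string labels and the function name `select_model` are hypothetical, and the returned names stand in for the speech, neural network, reservoir ("reserve pool"), weighted, and linear prediction models mentioned in the text.

```python
def select_model(category, voiced=None):
    """Return the name of the estimation model for a sound category.

    category : "speech", "noise", "music", or "mixed" (labels assumed here)
    voiced   : for speech only; True = voiced, False = unvoiced, None = unknown
    """
    if category == "speech":
        if voiced is True:
            return "reservoir"            # voiced speech
        if voiced is False:
            return "linear_prediction"    # unvoiced speech
        return "speech"                   # generic speech model
    table = {
        "noise": "neural_network",
        "music": "reservoir",
        "mixed": "weighted",
    }
    return table[category]
```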
In some implementations, the method further includes: adjusting, based on the evaluation signal, the estimation model corresponding to the category of the ambient sound; and generating, based on an audio signal subsequently collected by the first microphone and the adjusted model, a subsequent audio signal to be played by the first speaker. For a single estimation model, adjusting the model includes replacing that model and/or adding an estimation model. For multiple estimation models, adjusting the models includes replacing one or more models, adjusting model weights, and/or adding or removing estimation models. By using the residual signal to adjust the estimation model for the subsequent anti-phase sound, the accuracy of the anti-phase signal estimate can be adjusted dynamically and the noise reduction effect improved.
In some implementations, the method further includes: determining, based on the category of the ambient sound, a filter corresponding to the ambient sound; and generating the second signal based on the first signal and the filter. Filtering the collected audio signal effectively removes noise from the sound. Different filters may be used for different sound categories. A weighted combination of multiple filters may also be used, and the weighted filtering may have different filter weights for different sounds. Similarly, the filtering may be adjusted based on a feedback mechanism, such as one that uses the residual signal. Adjusting the filtering includes adding or removing filter types and/or adjusting the weight of each filter in the weighted filtering.
In some implementations, if the ambient sound is determined to be music, the first signal is filtered using a finite impulse response filter for generating the second signal. If the determined category indicates that the ambient sound is speech, the first signal is filtered using an infinite impulse response filter for generating the second signal. Applying different filtering to different sounds removes noise more effectively for speech and for music, yielding a better active noise reduction effect.
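The choice between a finite impulse response (FIR) filter for music and an infinite impulse response (IIR) filter for speech can be sketched as below. The two-tap averaging FIR, the one-pole IIR, and their coefficients are placeholder assumptions; the disclosure does not specify the filter designs.

```python
def fir_filter(x, taps):
    """Direct-form FIR: y[n] = sum(taps[k] * x[n-k]), zero initial state."""
    return [
        sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
        for n in range(len(x))
    ]


def iir_one_pole(x, alpha):
    """One-pole IIR low-pass: y[n] = alpha*x[n] + (1-alpha)*y[n-1]."""
    y, prev = [], 0.0
    for s in x:
        prev = alpha * s + (1.0 - alpha) * prev
        y.append(prev)
    return y


def filter_for_category(x, category):
    """Pick a filter by sound category (labels and coefficients assumed)."""
    if category == "music":
        return fir_filter(x, [0.5, 0.5])
    if category == "speech":
        return iir_one_pole(x, 0.5)
    return list(x)  # other categories pass through unfiltered in this sketch
```

Feeding a unit impulse shows the difference: the FIR response ends after its two taps, while the IIR response decays indefinitely.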
In some implementations, determining, based on the category of the first ambient sound, the estimation model corresponding to that category includes determining, based on the category of the first ambient sound, a weighted model corresponding to the first ambient sound, the weighted model including: a first estimation model, a weight of the first estimation model, a second estimation model, and a weight of the second estimation model. Using a weighted estimation model allows the ambient sound to be estimated more effectively and in a more targeted manner, correspondingly improving the noise reduction effect.
In some implementations, the method further includes adjusting the estimation model based on the residual signal. In one implementation, generating, based on the first signal and the time length of the second period, the second signal to be played by the first speaker includes generating the second signal to be played by the first speaker based on the first signal, the time length of the second period, and the adjusted estimation model. By using the residual signal to adjust the subsequent estimation model, a suitable estimation model can be selected based on the noise reduction effect, improving the noise reduction effect.
In some implementations, determining the category of the ambient sound during the first period based on the first signal includes determining the category based on at least one of: a frame rate of a first energy range of the first signal, a frame rate of a first frequency range of the first signal, and a zero-crossing rate of the first signal. The first energy range includes a low-energy range. The first frequency range includes a low-frequency range. By identifying the low-energy frame rate, the low-frequency frame rate, and the zero-crossing rate, the sound signal can be effectively distinguished as speech, music, noise, silence, or mixed sound. This allows a suitable estimation model to be determined more accurately, thereby improving the estimation accuracy and correspondingly increasing the noise reduction width and the noise reduction depth. In addition, speech can be further divided into unvoiced and voiced sounds, further improving the estimation accuracy for speech and correspondingly increasing the noise reduction width and the noise reduction depth.
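Two of the classification features named above can be computed as in the following sketch. The definitions used here (sign-change counting for the zero-crossing rate, and the fraction of frames whose energy falls below half the mean frame energy for the low-energy frame rate) are common conventions adopted as assumptions; the disclosure does not give formulas, and the function names are hypothetical.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / max(1, len(frame) - 1)


def low_energy_frame_ratio(signal, frame_len, factor=0.5):
    """Fraction of frames with energy below factor * mean frame energy.

    Assumes len(signal) >= frame_len; trailing partial frames are dropped.
    """
    frames = [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    energies = [sum(s * s for s in f) / frame_len for f in frames]
    mean_e = sum(energies) / len(energies)
    return sum(1 for e in energies if e < factor * mean_e) / len(energies)
```

A classifier could then threshold these features, for example treating a high low-energy frame ratio as a hint of speech; any such thresholds would need to be tuned and are not given in the text.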
In some implementations, generating, based on the first signal and the time length of the second period, the second signal to be played by the first speaker of the active noise reduction device includes: determining, based on the category of the first ambient sound, a filter corresponding to the first ambient sound; and generating the second signal based on the first signal, the time length of the second period, and the filter. Different sound categories may contain different noise. Applying a corresponding filtering mechanism to each category of sound removes noise more effectively.
In some implementations, the method further includes adjusting the filtering based on the evaluation signal. By using the evaluation signal to adjust the subsequent filtering, a suitable filter can be selected based on the noise reduction effect, improving the noise reduction effect.
In some implementations, generating the initial output signal includes: estimating the first signal using a weighted estimation model or a reservoir model to generate an estimated signal, the estimated signal representing an estimate of the ambient sound during the first period; and inverting the estimated signal to generate the second signal. Alternatively, the first signal is inverted to generate an inverted signal, and the inverted signal is estimated using a weighted estimation model or a reservoir model to generate the second signal. In addition, the evaluation signal may be used to dynamically adjust the coefficient of each term in the weighted model, or to increase or decrease the number of models used. Estimating with a default model reduces the computation required for the estimation and increases the estimation speed.
In some implementations, the method further includes: acquiring a fifth signal, the fifth signal representing a second ambient sound collected by a second microphone of the active noise reduction audio device during the first period; and generating, based on the fifth signal and the time length of the second period, a sixth signal to be played by a second speaker of the active noise reduction audio device, the sixth signal representing an anti-phase estimate of the second ambient sound during the second period. In some implementations, different microphones have different collection performance in different frequency bands. For example, the first microphone may capture more detail in low-frequency sound, while the second microphone may capture more detail in mid- and high-frequency sound. Using different microphones with different collection schemes therefore captures the details of the ambient sound more effectively. It also ensures that active noise reduction can still be achieved even if one microphone fails. In some implementations, different speakers perform differently in different frequency bands. For example, the first speaker may perform better in the low-frequency band, while the second speaker performs better in the mid- and high-frequency bands. Using different speakers to play different anti-phase signals therefore cancels the different parts of the direct sound more accurately, yielding better noise reduction depth and noise reduction width.
In some implementations, the method further includes: performing first filtering on the first signal to generate a first filtered signal; generating the second signal based on the first filtered signal; performing second filtering, different from the first filtering, on the fifth signal to generate a second filtered signal; and generating the sixth signal based on the second filtered signal. By using multiple filters and multiple estimates and playing multiple sounds, a speaker tuned for a given sound category can be used for that category to obtain a better noise reduction effect.
In some implementations, the method further includes: acquiring a first residual signal, the first residual signal representing a sum of a first sound and the direct sound, where the first sound is based on the initial output signal; acquiring a second residual signal, the second residual signal representing a sum of a second sound and the direct sound, where the second sound is based on the sixth signal; adjusting the first filtering based on the first residual signal; and adjusting the second filtering based on the second residual signal. Using multiple residual signals avoids inaccurate residual signal acquisition caused by the position or performance of a residual microphone, so that a more stable noise reduction effect is obtained continuously.
In some implementations, generating the second signal based on the first signal includes: inverting the first signal to generate a first inverted signal; estimating the first inverted signal using a first estimation model to generate a first component signal; estimating the first inverted signal using a second estimation model, different from the first estimation model, to generate a second component signal different from the first component signal; and weighting the first component signal and the second component signal using a weighted model to generate the initial output signal. Estimating with a default weighted model reduces the computation required for the estimation and increases the estimation speed, and the estimation can also be adjusted dynamically to improve its accuracy.
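The invert, estimate twice, then weight sequence can be sketched as follows. The two toy estimators (hold the last sample, and linear extrapolation) and the equal default weights are assumptions standing in for the unspecified first and second estimation models.

```python
def hold_last(history, n):
    """Toy first estimation model: repeat the last sample n times."""
    return [history[-1]] * n


def linear_extrapolate(history, n):
    """Toy second estimation model: continue the last observed slope."""
    step = history[-1] - history[-2]
    return [history[-1] + step * (k + 1) for k in range(n)]


def weighted_inverse_estimate(first_signal, n, w1=0.5, w2=0.5):
    """Invert, estimate with two models, and combine with weights w1, w2."""
    inverted = [-s for s in first_signal]        # first inverted signal
    c1 = hold_last(inverted, n)                  # first component signal
    c2 = linear_extrapolate(inverted, n)         # second component signal
    return [w1 * a + w2 * b for a, b in zip(c1, c2)]
```

In a feedback arrangement, `w1` and `w2` would be the quantities adjusted from the evaluation signal; how they are updated is not specified in the text.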
According to a second aspect of the present disclosure, a computer-readable storage medium storing one or more programs is provided. The one or more programs are configured to be executed by one or more processors. The one or more programs include instructions for performing the method according to the first aspect. By reading and executing the one or more programs in the computer-readable storage medium, the anti-phase signal for the second period can be estimated in advance. By playing the anti-phase signal, the anti-phase sound can reach the human ear in synchronization with the direct sound at the subsequent time. This shortens the delay of the anti-phase sound, and the anti-phase sound cancels the direct sound, effectively improving the noise reduction effect. By using the residual signal to adjust the time length of the subsequent anti-phase sound estimate, the accuracy of the anti-phase signal estimate can be adjusted dynamically and the noise reduction effect improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimated time length can be extended. Conversely, when the residual signal indicates that the active noise reduction effect is not ideal, the estimated time length can be shortened to improve the estimation accuracy.
According to a third aspect of the present disclosure, a computer program product is provided. The computer program product includes one or more programs. The one or more programs are configured to be executed by one or more processors. The one or more programs include instructions for performing the method according to the first aspect. By executing the computer program product, the anti-phase signal for the second period can be estimated in advance. By playing the anti-phase signal, the anti-phase sound can reach the human ear in synchronization with the direct sound at the subsequent time. This shortens the delay of the anti-phase sound, and the anti-phase sound cancels the direct sound, effectively improving the noise reduction effect. By using the residual signal to adjust the time length of the subsequent anti-phase sound estimate, the accuracy of the anti-phase signal estimate can be adjusted dynamically and the noise reduction effect improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimated time length can be extended. Conversely, when the residual signal indicates that the active noise reduction effect is not ideal, the estimated time length can be shortened to improve the estimation accuracy.
According to a fourth aspect of the present disclosure, an active noise reduction audio device is provided. The active noise reduction audio device includes: an acquisition module configured to acquire a first signal and an evaluation signal corresponding to a first period, the first signal representing a first ambient sound collected by a first microphone of the active noise reduction device during the first period; and an anti-phase estimated signal generation module configured to determine, based on the evaluation signal, a time length of a second period, the second period following the first period, and to generate, based on the first signal and the time length of the second period, a second signal to be played by a first speaker of the active noise reduction device, the second signal representing an anti-phase estimate of the first ambient sound during the second period. The active noise reduction audio device can estimate the anti-phase signal for the second period in advance. By playing the anti-phase signal, the anti-phase sound can reach the human ear in synchronization with the direct sound at the subsequent time. This shortens the delay of the anti-phase sound, and the anti-phase sound cancels the direct sound, effectively improving the noise reduction effect. By using the residual signal to adjust the time length of the subsequent anti-phase sound estimate, the accuracy of the anti-phase signal estimate can be adjusted dynamically and the noise reduction effect improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimated time length can be extended. Conversely, when the residual signal indicates that the active noise reduction effect is not ideal, the estimated time length can be shortened to improve the estimation accuracy.
In some implementations, the anti-phase estimated signal generation module is further configured to: generate a third signal based on the first signal, the third signal representing an estimate of the ambient sound during the second period; and invert the third signal to generate the second signal. Estimating first and then inverting allows the anti-phase signal to be obtained more accurately.
In some implementations, the acquisition module is further configured to acquire a first signal corresponding to the first microphone, the first signal representing ambient sound collected during the first period. The anti-phase estimated signal generation module is further configured to determine the time length of the second period, which follows the first period, based on the first signal and an anti-phase estimated signal estimated in a previous period, or based on the first signal and an estimated signal estimated in a previous period. Alternatively, the anti-phase estimated signal generation module is further configured to determine the time length of the second period, which follows the first period, based on an inverted version of the first signal and an anti-phase estimated signal estimated in a previous period, or based on an inverted version of the first signal and an estimated signal estimated in a previous period. Obtaining the residual signal in multiple ways allows the current noise reduction effect to be evaluated more comprehensively and accurately.
In some implementations, the acquisition module is further configured to acquire a residual signal corresponding to a residual microphone of the active noise reduction audio device, the residual microphone being different from the first microphone. Obtaining the residual signal in multiple ways allows the current noise reduction effect to be evaluated more comprehensively and accurately.
In some implementations, the anti-phase estimated signal generation module is further configured to: determine a category of the first ambient sound during the first period based on the first signal; determine, based on the category of the first ambient sound, an estimation model corresponding to that category; and generate the second signal based on the estimation model, the first signal, and the time length of the second period. Classifying the ambient sound allows the anti-phase estimation of the ambient sound to be better targeted. Using an estimation model that matches the ambient sound improves the estimation accuracy and correspondingly increases the noise reduction width and the noise reduction depth.
In some implementations, the anti-phase estimated signal generation module is further configured to determine, based on the category of the ambient sound, a weighted model corresponding to the ambient sound, the weighted model including: a first estimation model, a weight of the first estimation model, a second estimation model, and a weight of the second estimation model. Using a weighted estimation model allows the ambient sound to be estimated more effectively and in a more targeted manner, correspondingly improving the noise reduction effect.
In some implementations, the anti-phase estimated signal generation module is further configured to adjust the estimation model based on the evaluation signal. By using the evaluation signal to adjust the subsequent estimation model, a suitable estimation model can be selected based on the noise reduction effect, improving the noise reduction effect.
According to a fifth aspect of the present disclosure, an active noise reduction audio device is provided. The active noise reduction audio device includes one or more processors and a memory storing one or more programs. The one or more programs are configured to be executed by the one or more processors and include instructions for performing the method according to the first aspect. By using the residual signal to adjust the subsequent estimation model, a suitable estimation model can be selected based on the noise reduction effect, improving the noise reduction effect. By using the residual signal to adjust the time length of the subsequent anti-phase sound estimate, the accuracy of the anti-phase signal estimate can be adjusted dynamically and the noise reduction effect improved. For example, when the residual signal indicates that the active noise reduction effect is good, the estimated time length can be extended. Conversely, when the residual signal indicates that the active noise reduction effect is not ideal, the estimated time length can be shortened to improve the estimation accuracy.
According to a sixth aspect of the present disclosure, an active noise reduction audio device is provided. The device includes a first microphone, one or more processors, and a first speaker. The first microphone is configured to collect first ambient sound during a first period and generate a first signal. The one or more processors are configured to: obtain an evaluation signal corresponding to the first period; determine, based on the evaluation signal, the time length of a second period that follows the first period; and generate a second signal based on the first signal and the time length of the second period, the second signal representing an anti-phase estimate of the first ambient sound during the second period. The first speaker is configured to play the second signal during the second period. Using the residual signal to adjust the subsequent estimation model allows a suitable estimation model to be selected on the basis of the observed noise reduction effect, which improves that effect. Using the residual signal to adjust the time length over which the subsequent anti-phase sound is estimated allows the accuracy of the anti-phase signal estimate to be adjusted dynamically, which likewise improves noise reduction. For example, when the residual signal indicates that active noise reduction is working well, the estimated time length can be extended. Conversely, when the residual signal indicates that active noise reduction is unsatisfactory, the estimated time length can be shortened to improve estimation accuracy.
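As a non-limiting illustration, the adjustment of the estimated time length described above can be sketched in Python as follows. The sketch assumes that the noise reduction effect is judged by the ratio of residual level to ambient level. The function names, the thresholds `good_ratio` and `poor_ratio`, the doubling and halving rule, and the bounds `min_len` and `max_len` are all hypothetical and are not taken from the present disclosure.

```python
def rms(samples):
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return (sum(s * s for s in samples) / len(samples)) ** 0.5


def next_period_length(current_len, residual, ambient,
                       good_ratio=0.1, poor_ratio=0.5,
                       min_len=1, max_len=64):
    """Return the number of samples to estimate in the next period.

    All thresholds and bounds are hypothetical illustration values.
    """
    ambient_level = rms(ambient)
    if ambient_level == 0.0:
        return current_len              # nothing to cancel: keep the length
    ratio = rms(residual) / ambient_level
    if ratio <= good_ratio:             # noise reduction works well: extend
        return min(max_len, current_len * 2)
    if ratio >= poor_ratio:             # noise reduction is poor: shorten
        return max(min_len, current_len // 2)
    return current_len                  # in between: leave unchanged
```

With these assumed values, a residual at 5% of the ambient level doubles the length, while a residual at 80% halves it.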
In some implementations, the active noise reduction audio device further includes a residual microphone. The residual microphone is configured to collect residual sound and generate a residual signal that serves as the evaluation signal. Using the residual signal to adjust subsequent sound estimates allows the estimation to be adjusted dynamically in a feedback manner, yielding a better dynamic active noise reduction effect.
In some implementations, the active noise reduction audio device further includes a second microphone and a second speaker. The second microphone is configured to collect second ambient sound during the first period and generate a fifth signal that differs from the first signal. The second speaker is configured to play a sixth signal, which is generated by the one or more processors based on the fifth signal. The sixth signal represents a second anti-phase estimate of the second ambient sound during the second period and differs from the second signal.
It should be understood that the content described in this Summary is not intended to identify key or essential features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Brief Description of the Drawings
The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements, wherein:
FIG. 1 is a schematic diagram of an active noise reduction audio device in which embodiments of the present disclosure may be implemented;
FIG. 2 is a schematic block diagram of an active noise reduction audio device according to an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a method for filtering out or reducing ambient sound according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a method for classifying ambient sound and performing anti-phase estimation according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an active noise reduction process according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of an active noise reduction process according to another embodiment of the present disclosure;
FIG. 7 is a schematic diagram of another active noise reduction audio device in which embodiments of the present disclosure may be implemented;
FIG. 8 is a schematic diagram of an active noise reduction process according to yet another embodiment of the present disclosure;
FIG. 9 is a schematic block diagram of a computer-readable storage medium according to an embodiment of the present disclosure; and
FIG. 10 is a schematic block diagram of an apparatus for performing noise reduction on ambient sound according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
In the description of the embodiments of the present disclosure, the term "include" and similar terms should be understood as open-ended inclusion, that is, "including but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second", and the like may refer to different objects or to the same object. The term "period" denotes a continuous or discrete span of time whose smallest unit is a single time point; in other words, a "period" includes at least one time point. Other explicit and implicit definitions may also appear below. The term "represent" indicates herein that two items are directly or indirectly related, such that one can characterize an attribute of the other and/or a change in that attribute. Herein, a "microphone" is a device that collects sound and converts it into a corresponding electrical signal, and a "speaker" is a device that converts an electrical signal into sound based on audio data. The term "ambient sound" refers herein to sound that is present in the external environment of the active noise reduction audio device and is collected by that device; it may be a single sound or a combination of several sounds, such as speech, music, and noise.
As described above, in some situations ambient sound may be noise to the user, and the user wishes to filter out or reduce it by using an active noise reduction audio device. Herein, filtering out or reducing ambient sound is collectively referred to as performing noise reduction on ambient sound. Existing active noise reduction audio devices are limited by factors such as the computation of the anti-phase signal and cannot filter out or reduce ambient sound well. For example, the time required to compute the anti-phase signal in real time is greater than the time the ambient sound takes to propagate from the external sampling microphone of the device to the inside of the ear. This causes a mismatch between the anti-phase signal and the direct signal and may even increase the noise.
To address the above problem and other potential problems, embodiments of the present disclosure provide a method for performing noise reduction on ambient sound, an active noise reduction audio device, and a computer-readable storage medium. In one embodiment of the present disclosure, ambient sound is collected by an ambient sound microphone of the active noise reduction audio device; an anti-phase signal representing the likely ambient sound of a subsequent period is predicted or estimated based on the signal representing the collected ambient sound; and the anti-phase signal is played by a built-in speaker of the device, so that at the subsequent time the played anti-phase sound and the direct ambient sound cancel each other, thereby reducing the ambient sound. More specifically, compared with a conventional scheme that generates a real-time anti-phase signal to cancel the direct sound, the scheme according to embodiments of the present disclosure can achieve a better noise reduction width, that is, a wider frequency band over which noise reduction is effective. The scheme can also achieve a better noise reduction depth (which may be expressed, for example, in dB), that is, a greater degree to which noise is effectively suppressed or cancelled. In addition, in one embodiment of the present disclosure, obtaining an evaluation signal makes the current effect of active noise reduction known, and the length of the subsequent signal estimate can be adjusted based on this evaluation signal, so that subsequent anti-phase signal estimates are more accurate and the subsequent active noise reduction effect is improved.
FIG. 1 is a schematic diagram of an active noise reduction audio device 10 in which embodiments of the present disclosure may be implemented. In one embodiment, the active noise reduction audio device 10 may be an audio playback device that contacts the ear, such as true wireless stereo (TWS) earphones. The active noise reduction audio device 10 may include a pair of earphones 11 and 12 that are configured substantially identically, so only the earphone 11 is described by way of example. The earphone 11 includes an external first microphone 13, a processor 17 located inside the earphone 11, and a first speaker 15 and a residual microphone 14 located at an in-ear portion or ear-contacting portion inside the earphone 11 (inside relative to the external microphone 13, which is exposed to the environment). The first microphone 13 is configured to detect or collect sound from the external environment.
In some embodiments, the residual microphone 14 may be absent, in which case the estimation need not be adjusted dynamically. Although FIG. 1 shows one possible configuration of the active noise reduction audio device 10, this is merely illustrative and does not limit the scope of the present disclosure. For example, in some embodiments the two earphones 11 and 12 may have only one processor 17 and share it by exchanging wireless signals over a wireless link such as Bluetooth. In another embodiment, the two earphones 11 and 12 may also share a single first microphone 13.
In one embodiment, the external first microphone 13 of the active noise reduction audio device 10 collects ambient sound, performs acoustic-to-electrical conversion to generate a continuous electrical signal, and transmits the signal to the processor 17. Based on the received signal, the processor 17 predicts or estimates the ambient sound at a subsequent time, generates an anti-phase signal representing that ambient sound, and transmits it to the first speaker 15. Herein, an "anti-phase signal" (inverted signal) is the signal obtained by inverting an audio signal, for example by directly negating the sign of each audio sample, optionally followed by further processing. A signal that has not been inverted may correspondingly be called a "normal-phase signal". The anti-phase signal played by the speaker is intended to cancel, to some extent, the direct (normal-phase) sound that reaches the inside of the active noise reduction audio device 10, so as to reduce the sound perceived inside the ear. The first speaker 15 plays anti-phase sound based on the received anti-phase signal to cancel the ambient sound that arrives directly from the environment inside the device 10 at the subsequent time, thereby achieving noise reduction. Although FIG. 1 shows the active noise reduction audio device 10 schematically as TWS earphones, the scope of the present disclosure is not limited thereto. For example, FIG. 7 below shows another possible configuration of an active noise reduction audio device in the form of headphones, namely over-ear headphones. Alternatively, the active noise reduction audio device 10 may also be, for example, a device that transmits audio by bone conduction.
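As a non-limiting illustration of the sign-inversion definition of an anti-phase signal given above, the following Python sketch negates each sample and superposes the result on the direct sound. The function names `invert` and `superpose` are hypothetical, and the model is idealized: it ignores the acoustic path between speaker and ear, any gain mismatch, and any further processing.

```python
def invert(signal):
    """Anti-phase signal: negate the sign of every audio sample."""
    return [-s for s in signal]


def superpose(direct, played):
    """Sample-wise sum of the direct sound and the sound played by the speaker."""
    return [d + p for d, p in zip(direct, played)]


direct_sound = [0.2, -0.5, 0.1]          # normal-phase samples reaching the ear
perfect = superpose(direct_sound, invert(direct_sound))
print(perfect)                            # all zeros: complete cancellation

# If the estimate of the direct sound is imperfect, a residual remains.
estimate = [0.25, -0.5, 0.0]
residual = superpose(direct_sound, invert(estimate))
print(residual)
```

In this idealized model, the residual is nonzero exactly where the estimate differs from the direct sound, which is why the accuracy of the estimate governs the achievable noise reduction.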
FIG. 2 is a schematic block diagram of an active noise reduction audio device 100 according to an embodiment of the present disclosure. It should be understood that the device 100 shown in FIG. 2 is merely an example, for instance one possible implementation of the active noise reduction audio device 10 of FIG. 1, and should not limit in any way the functionality or scope of the implementations described in the present disclosure. In one embodiment, the active noise reduction audio device 100 may include the following components, shown with solid boxes and solid lines: a processor 110, a wireless communication module 160, an antenna 1, an audio module 170, a speaker module 170A, a microphone module 170C, a button 190, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, and a power management module 141.
The processor 110 may be, for example, the processor 17 of FIG. 1 and may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). In some embodiments, different processing units may be separate components. In other embodiments, different processing units may be integrated into one or more processors. The controller may generate operation control signals according to instruction opcodes and timing signals, thereby controlling instruction fetching and instruction execution.
A memory may also be provided in the processor 110 to store instructions and data. The internal memory 121 may be used to store computer-executable program code, which includes instructions, and may include a program storage area and a data storage area. The processor 110 performs the various functional applications and data processing of the active noise reduction audio device 100 by running instructions stored in the internal memory 121 and/or instructions stored in the memory provided in the processor.
The active noise reduction audio device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker module 170A, the microphone module 170C, the application processor, and the like. The audio module 170 converts digital audio information into an analog audio signal for output and converts analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170, or some of its functional modules, may be provided in the processor 110.
The speaker module 170A, also referred to as a "loudspeaker", converts an electrical audio signal into a sound signal. The user can listen to music or to a hands-free call through the speaker module 170A of the active noise reduction audio device 100. The microphone module 170C is also referred to as a "mic" or "mouthpiece". When making a call or sending a voice message, the user can speak with the mouth close to the microphone module 170C to input a sound signal into the microphone module 170C.
In other embodiments, the active noise reduction audio device 100 may further include an antenna 2 and a mobile communication module 150. In addition to the above components, the device 100 may further include one or more of the following, shown with dashed lines and dashed boxes: an external memory interface 120, a battery 142, a receiver 170B, an earphone jack 170D, a sensor module 180, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195. The sensor module 180 may include one or more of a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, an optical proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, and a bone conduction sensor 180M. The sensor module 180 may also include other types of sensors that are not listed.
It can be understood that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on the active noise reduction audio device 100. In other embodiments of the present application, the device 100 may include more or fewer components than illustrated, may combine or split certain components, or may arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
FIG. 3 is a schematic flowchart of a method 300 for performing noise reduction on ambient sound according to an embodiment of the present disclosure. In one embodiment, the method 300 may be performed by the active noise reduction audio device 10 or the active noise reduction audio device 100. Where the device 10 performs the method 300, the processor 17 of the device 10 may execute computer program code or instructions that implement the method 300. Where the device 100 performs the method 300, the processor 110 of the device 100 may execute computer program code or instructions that implement the method 300. Purely as an example, the method 300 is described below with reference to the active noise reduction audio device 10; it should be understood that this is illustrative and not limiting.
The active noise reduction audio device 100 obtains an initial input signal corresponding to the first microphone 13 in the active noise reduction audio device 10. The initial input signal represents the ambient sound collected by the device 10 during an initial period. The initial input signal is, for example, collected by the first microphone 13 and transmitted to the active noise reduction audio device 100. It can be understood that the initial input signal may also be processed by an intermediate audio signal processing circuit (for example, filtered or amplified) before being transmitted to the device 100 for subsequent processing. The scope of the embodiments of the present disclosure is not limited in this regard.
Ambient sound is produced continuously. From the perspective of the active noise reduction audio device 10, however, the received signal is usually processed in blocks of fixed or variable data length, where the data length corresponds to the duration of the sampled ambient sound. For example, the initial input signal received by the device 10 includes microphone signals for one or more sampling time points. Accordingly, in one embodiment, the device 10 obtains and processes the microphone signal of one period at a time. For example, in one processing cycle the device 10 obtains and processes the microphone signal of the initial period. In the next processing cycle, the device 10 obtains and processes the microphone signal of the first period, which follows the initial period, and so on. The length of the initial period may be the same as or different from the length of the first period. For example, the first period may include a first plurality of time points, while the initial period may include a second plurality of time points that differs in number.
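As a non-limiting illustration of processing one period per cycle, the following Python sketch cuts a stream of microphone samples into consecutive periods whose lengths may differ. The function name `split_into_periods` and the convention of passing the period lengths as a list of sample counts are assumptions made only for this sketch.

```python
def split_into_periods(samples, lengths):
    """Yield consecutive blocks of samples, one block per period.

    lengths[i] is the number of time points in period i; the lengths may
    be fixed or may vary from period to period. An incomplete trailing
    block is not yielded.
    """
    start = 0
    for n in lengths:
        if n < 1:
            raise ValueError("a period includes at least one time point")
        block = samples[start:start + n]
        if len(block) < n:
            break                      # not enough samples collected yet
        yield block
        start += n


stream = list(range(10))               # stand-in for sampled microphone data
periods = list(split_into_periods(stream, [4, 3, 3]))
print(periods)                         # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Here the first block plays the role of the initial period and the second block the role of the first period, with different numbers of time points.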
Based on the initial input signal, the active noise reduction audio device 10 generates an initial output signal to be played by the first speaker 15 in the device 10. The initial output signal represents an anti-phase estimate of the ambient sound during the first period, which follows the initial period. In embodiments of the present disclosure, the device 10 predicts or estimates the ambient sound signal of the following period (for example, the first period) based on the sound samples of the preceding period (for example, the initial period), and then inverts the estimated ambient sound signal to generate the initial output signal, which represents the anti-phase of the sound that the environment is likely to produce during the first period. Alternatively, the initial input signal may first be inverted to generate an anti-phase signal, and prediction or estimation may then be performed on that anti-phase signal to generate the initial output signal.
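As a non-limiting illustration of the predict-then-invert order described above, the following Python sketch extrapolates the next period with a fixed autoregressive predictor and negates the result. The predictor is a stand-in chosen only for this sketch and is not the estimation model of the present disclosure; the function names and the coefficient convention are hypothetical. For a single tone of angular frequency w per sample, the recurrence x[n] = 2cos(w)·x[n-1] - x[n-2] holds exactly, which makes the example easy to check.

```python
import math


def predict_next(history, count, coeffs):
    """Extrapolate `count` samples with a fixed autoregressive predictor.

    coeffs[k] multiplies the sample k + 1 steps in the past, so `history`
    must contain at least len(coeffs) samples.
    """
    buf = list(history)
    for _ in range(count):
        buf.append(sum(c * buf[-1 - k] for k, c in enumerate(coeffs)))
    return buf[len(history):]


def anti_phase_estimate(history, count, coeffs):
    """Predict the next period, then invert it sample by sample."""
    return [-s for s in predict_next(history, count, coeffs)]


w = 2 * math.pi / 16                          # hypothetical tone frequency
tone = [math.sin(w * n) for n in range(24)]
initial_period, first_period = tone[:16], tone[16:]

output = anti_phase_estimate(initial_period, len(first_period),
                             [2 * math.cos(w), -1.0])
residual = [d + o for d, o in zip(first_period, output)]
print(max(abs(r) for r in residual))          # very small for a pure tone
```

Because the output for the whole first period is available before that period begins, the time needed to compute it no longer has to fit within the acoustic propagation delay. Real ambient sound is not a pure tone, so a practical predictor would leave a larger residual.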
By playing the initial output signal, the first speaker 15 produces an initial anti-phase sound inside the user's ear, while the ambient sound actually produced during the first period reaches the inside of the ear directly. The initial anti-phase sound and the direct sound add together; because their phases are opposite, they in effect cancel each other and reduce the net volume, which yields the noise reduction effect. Because this embodiment predicts or estimates the sound of a subsequent period, the time available for sound processing is effectively extended, so the processing circuit has enough time to generate the anti-phase signal and the noise reduction effect is improved. Moreover, because the processing circuit has enough time to generate the anti-phase signal, a high-speed processing circuit is not needed, which reduces the circuit design complexity and manufacturing cost of the active noise reduction audio device 10.
In an exemplary embodiment, a reservoir model or a weighting model, both described in detail below, may be used to perform estimation based on the initial input signal and generate the initial output signal. For example, the active noise reduction audio device 10 may directly invert the initial input signal to generate a seventh signal, estimate the seventh signal using a first estimation model, such as a speech model, to generate a first component signal, and estimate the seventh signal using a second estimation model that differs from the first, such as a neural network model, to generate a second component signal that differs from the first component signal. The device 10 then weights the first and second component signals using the weighting model to generate the initial output signal. For example, it multiplies the first component signal by a first weight coefficient to obtain a first product, multiplies the second component signal by a second weight coefficient to obtain a second product, and adds the two products to obtain the initial output signal. In the weighting model, the weight coefficients for the first and second component signals can be adjusted dynamically, for example by using the residual signal described later.
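As a non-limiting illustration of the weighting step described above, the following Python sketch forms the initial output signal as the sum of two weighted component signals. The function name `weighted_output` and the example weight values are hypothetical; the two estimation models that would produce the component signals are not modelled here.

```python
def weighted_output(first_component, second_component, w1, w2):
    """Sample-wise w1 * first_component + w2 * second_component."""
    if len(first_component) != len(second_component):
        raise ValueError("component signals must cover the same time points")
    return [w1 * a + w2 * b
            for a, b in zip(first_component, second_component)]


# Stand-ins for the outputs of the first and second estimation models.
first_component = [1.0, 2.0]
second_component = [3.0, 4.0]

initial_output = weighted_output(first_component, second_component, 0.25, 0.75)
print(initial_output)                  # [2.5, 3.5]
```

Adjusting `w1` and `w2` between periods, for instance in response to the residual signal, shifts the output toward whichever model is currently estimating better.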
Other models may also be used to perform estimation on the initial input signal, such as other neural network models or linear prediction models, provided that the model can predict a subsequent audio signal from the current audio signal. It can be understood that, in some embodiments, the model in use may be replaced dynamically during noise reduction; for example, the prediction model used by the active noise reduction audio device 10 may in some situations be switched from a first model, such as a reservoir model, to a second model, such as a linear prediction model. In other embodiments, the parameters or configuration of the model in use may be adjusted dynamically. For example, the weight coefficients of the individual terms of the weighting model may be adjusted dynamically based on the frequency content or the category of the audio signal. Alternatively, a default or fixed model may be used to generate the initial output signal.
A user wearing the active noise reduction audio device 10 may be in many different environments and may move between them. In some embodiments, therefore, the initial output signal may be generated in a targeted manner according to the category of the user's environment or the category of the ambient sound, so as to obtain a better noise reduction effect. For example, the device 10 may automatically determine the category of the ambient sound based on the initial input signal collected by the first microphone 13. Alternatively, the user may set the category of the environment through a portable electronic device that communicates with the device 10, or by issuing a spoken setting command via the device 10. In other embodiments, the user may set the level of the noise reduction effect of the device 10. For example, the noise reduction effect may be set to a low level so that warning sounds in the ambient sound are not missed.
At 302, the processor 17 obtains the first signal. The first signal represents the first ambient sound collected by the first microphone 13 of the active noise reduction audio device during the first period. The processing performed here by the processor 17 is similar to that for the initial input signal and is not repeated; the corresponding description of the initial input signal applies here as well.
At 304, the processor 17 acquires an evaluation signal corresponding to the first period. In the present disclosure, the evaluation signal indicates the effect of the active noise reduction and/or the accuracy of the estimation. In one embodiment, the evaluation signal may be obtained by the residual microphone described in detail later. Alternatively or additionally, in another embodiment, the evaluation signal may be determined by the processor 17 from a signal estimated in the initial period and the ambient sound collected in the first period. For example, the processor 17 produces an initial estimated signal in the initial period; the initial estimated signal represents the estimated ambient sound for the first period. The processor 17 receives the first signal collected by the first microphone 13 in the first period and then takes the difference between the initial estimated signal and the first signal to determine the evaluation signal. Alternatively, the processor 17 produces an initial inverted estimated signal in the initial period; this signal represents the inverse of the estimated ambient sound for the first period, that is, an inverted estimate of the first signal. The processor 17 receives the first signal collected by the first microphone 13 in the first period and then adds the initial inverted estimated signal to the first signal to determine the evaluation signal. It will be understood that the residual signal may also be obtained in other ways, for example by subtracting the initial inverted estimated signal from the inverted first signal and negating the result, or by adding the inverted first signal to the initial estimated signal; the present disclosure is not limited in this respect.
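The two ways of forming the evaluation signal described above can be illustrated with a minimal sketch. The function names, the sign convention (collected signal minus estimate), and the mean-absolute-value summary are assumptions made for illustration; the disclosure does not prescribe them.

```python
def evaluation_from_estimate(first_signal, initial_estimate):
    """Residual formed as collected first signal minus the earlier estimate."""
    return [x - e for x, e in zip(first_signal, initial_estimate)]


def evaluation_from_inverted_estimate(first_signal, initial_inverted_estimate):
    """Residual formed by adding the earlier inverted estimate to the first signal."""
    return [x + e for x, e in zip(first_signal, initial_inverted_estimate)]


def mean_abs(signal):
    """Hypothetical scalar summary of how large the residual is."""
    return sum(abs(v) for v in signal) / len(signal)


# Illustrative sample values only.
first_signal = [0.5, -0.25, 0.75, 0.0]
initial_estimate = [0.5, -0.5, 0.5, 0.0]
initial_inverted_estimate = [-e for e in initial_estimate]

residual_a = evaluation_from_estimate(first_signal, initial_estimate)
residual_b = evaluation_from_inverted_estimate(first_signal, initial_inverted_estimate)
```

Under this sign convention both routes give the same residual, which is why either the estimate or its inverse can serve as the stored quantity.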
At 306, the processor 17 determines, based on the evaluation signal, the time length of a second period that follows the first period. Once the processor 17 has acquired the evaluation signal, it can determine how well the previous estimation performed. The processor 17 can therefore adjust subsequent estimation accordingly, for example by adjusting the time length of successive periods, to obtain a more accurate estimate and thereby improve the active noise reduction. For example, when the residual signal indicates that the estimation performed well, the time length of the estimation period may be maintained or increased. When the residual signal indicates that the estimation performed poorly, the time length of subsequent estimation periods may be reduced. In addition, the residual signal may be used to adjust, add, or remove estimation models, to adjust the weight coefficients of the terms in the weighted model, and to adjust the filtering, as described later.
At 308, the processor 17 generates, based on the first signal and the time length of the second period, a second signal to be played by the first speaker 15. The second signal represents an inverted estimate of the first ambient sound during the second period. The first speaker 15 of the active noise reduction audio device 10 may play the second signal. The processing performed by the processor 17 here is similar to the processing of the initial output signal and is not repeated; the corresponding description of the initial output signal applies here.
FIG. 4 is a schematic diagram of a method 400 for classifying ambient sound and performing inverted estimation according to one embodiment of the present disclosure. In general, the ambient sound of the user's environment can be divided into five categories: silence, speech, noise, music, and mixed, where the mixed category includes, for example, a mixture of speech, music, and/or noise. The active noise reduction audio device 10 may determine the category of the ambient sound based on the first signal, generate a third signal based on the determined category and the first signal, and invert the third signal to generate the second signal. Alternatively, the first signal may first be inverted to generate a fourth signal, and classification and estimation may then be performed on the inverted fourth signal to generate the second signal.
Although the ambient sound is divided into the above five categories, this is merely illustrative and does not limit the scope of the present disclosure. For example, the ambient sound may instead be divided into three categories: music, noise, and silence. In that case, speech is assigned to the noise category. Alternatively, in some cases the classification may be based on the frequency of the ambient sound. Classifying ambient sound signals by category allows the ambient sound to be reduced more effectively, giving a better noise reduction effect.
In one embodiment, the active noise reduction audio device 10 may determine the category of the ambient sound based on at least one of the frame rate of a first energy range, the frame rate of a first frequency range, and the zero-crossing rate of the first signal. Specifically, at 402, the active noise reduction audio device 10 acquires the first signal. After receiving the first signal, the active noise reduction audio device 10 performs feature analysis on it to determine its category. For example, the first signal may be analyzed based on its energy frame rate and zero-crossing rate.
The silence category consists of audio segments that the human ear cannot perceive, although they may sometimes contain a small amount of noise. To detect silence as accurately as possible, the first signal may be judged against a short-time energy threshold and a short-time zero-crossing rate threshold. For example, at 404, the active noise reduction audio device 10 analyzes the first signal to determine whether its energy frame rate and zero-crossing rate are both below the respective given thresholds. This approach is similar to endpoint detection. When both the short-time energy and the short-time zero-crossing rate of the first signal are below the respective given thresholds, the active noise reduction audio device 10 may determine at 406 that the first signal is a silence signal. This indicates that the environment is quiet; accordingly, at 408, the active noise reduction audio device 10 need not generate the third signal and therefore need not generate the second signal either. This saves power and extends the operating time of the active noise reduction audio device 10.
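The silence check at 404 to 408 can be sketched with the textbook definitions of short-time energy and short-time zero-crossing rate. The function names and the threshold values below are hypothetical placeholders, not values from the disclosure.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def is_silence(frame, energy_threshold=1e-4, zcr_threshold=0.1):
    """True when both features fall below their (assumed) thresholds."""
    return (
        short_time_energy(frame) < energy_threshold
        and zero_crossing_rate(frame) < zcr_threshold
    )


quiet_frame = [0.001] * 8      # tiny constant offset, no sign changes
loud_frame = [0.5, -0.5] * 4   # large amplitude, sign flips every sample
```

When `is_silence` returns true, a device following this flow would skip estimation for that frame, which is where the power saving comes from.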
Returning to 404, when the active noise reduction audio device 10 determines that at least one of the energy frame rate and the zero-crossing rate is not below its threshold, it further analyzes at 410 whether the frame rate of the first energy range is above a threshold. In this embodiment, the first energy range is, for example, a low-energy range, and the first frequency range is, for example, a low-frequency range. It will be understood that the specific bounds of the low-energy range and the low-frequency range can be set according to the performance design of the active noise reduction audio device 10. Alternatively, different energy or frequency ranges may be used to classify the ambient sound, provided that the ambient sound has corresponding characteristics in the particular energy and/or frequency range.
Compared with music and mixed sound, speech signals and noise signals generally have a higher low-energy frame rate. If the frame rate of the first energy range is above the given threshold, the active noise reduction audio device 10 may determine that the first signal representing the ambient sound is either a speech signal or a noise signal. Although speech and noise signals both have a high frame rate in the low-energy range, they differ in their frequency characteristics. For example, a speech signal typically has a high low-frequency frame rate, whereas a noise signal, because of its randomness, is spread more widely across frequency and has a lower frame rate at low frequencies. Therefore, at 412, the active noise reduction audio device 10 determines whether the frame rate of the first frequency range of the first signal is above a threshold. If it is, the active noise reduction audio device 10 determines at 413 that the first signal is a speech signal. The active noise reduction audio device 10 may then generate the third signal using a speech model suited to speech signals. For example, at 415, it may use a speech model such as a reservoir model or a linear prediction (LPC) model to generate the third signal based on the first signal. It will be understood that speech models other than the reservoir model or the linear prediction model may also be used.
Specifically, the alternation of unvoiced and voiced sounds is a characteristic property of speech signals. Unvoiced segments have a higher zero-crossing rate and voiced segments have a lower one, so the zero-crossing rate of a speech signal alternates. The reservoir model gives good estimation results for unvoiced signals and can therefore be used to estimate the third signal for unvoiced signals. The reservoir model is also known as an echo state network. Reservoir computing simplifies the training of the network, addresses the problems that the structure of a conventional recurrent neural network is hard to determine and its training algorithm is overly complex, and also overcomes the fading-memory problem of recurrent networks. For voiced signals, the inventor found that the LPC model gives good estimation results, so it can be used to estimate the third signal for voiced signals. In one example embodiment, the active noise reduction audio device 10 may alternate between the reservoir model and the LPC model to estimate the speech signal. Alternatively, the active noise reduction audio device 10 may use only the reservoir model or only the LPC model to estimate the speech signal. Using models targeted at unvoiced and voiced sounds can further improve the accuracy of speech estimation.
Returning to 412, if the frame rate of the first frequency range is not above the threshold, the active noise reduction audio device 10 determines at 414 that the first signal is a noise signal. The active noise reduction audio device 10 may then process the noise signal with a suitable model to generate the third signal. In one embodiment, a neural network model may be used to estimate the first signal and generate the third signal. Although the reservoir model, the LPC model, and the neural network model are used above to estimate speech and noise signals and generate the third signal, this is merely illustrative and does not limit the scope of the present disclosure. Other suitable models, such as the weighted model described below, may also be used to generate the third signal.
Returning to 410, if the frame rate of the first energy range is not above the given threshold, the active noise reduction audio device 10 may determine that the ambient sound is music or mixed sound. Music is harmonious and pleasant to the ear and spans a wide range of timbres and frequencies; it may be played by a variety of instruments. Music therefore contains complex audio components, and its energy values tend to be relatively high. In addition, music lacks the abrupt transitions between unvoiced and voiced sounds found in speech, so its energy varies less sharply than that of speech and its low-energy frame rate is lower. Similarly, mixed sound also has a lower low-energy frame rate. The low-energy frame rate is therefore an effective feature for identifying music and mixed sound.
After this, the active noise reduction audio device 10 further distinguishes between music and mixed sound. In everyday life, speech mixed with music is often used to create an atmosphere; such a mixed signal usually has music in the background and speech in the foreground, that is, speech energy accounts for most of the signal. The alternation of unvoiced and voiced sounds is a characteristic property of speech signals: unvoiced segments have a higher zero-crossing rate and voiced segments have a lower one, so the zero-crossing rate of a speech signal alternates. A music signal has no such alternation, so its zero-crossing rate varies smoothly. An effective measure of how quickly the zero-crossing rate changes is the variance of the zero-crossing rate, which can therefore be used to distinguish a music signal from a mixed speech-and-music signal. In one embodiment, the active noise reduction audio device 10 may distinguish music from mixed sound based on the zero-crossing rate variance. For example, at 420, the active noise reduction audio device 10 analyzes the first signal to determine whether its zero-crossing rate variance is above a given threshold.
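The zero-crossing rate variance used at 420 can be sketched as the variance of per-frame zero-crossing rates. The framing scheme (non-overlapping frames of a fixed length) and the function names are assumptions for illustration.

```python
from statistics import pvariance


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def zcr_variance(signal, frame_len):
    """Population variance of the zero-crossing rate over non-overlapping frames."""
    frames = [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    return pvariance([zero_crossing_rate(f) for f in frames])


# Same crossing rate in every frame: stands in for a steady, music-like signal.
steady = [1.0, -1.0] * 8
# Frames alternate between many and no crossings: stands in for
# the unvoiced/voiced alternation of speech in a mixed signal.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, 1.0, 1.0, 1.0] * 2
```

With a frame length of 4, the steady signal yields zero variance while the alternating signal yields a clearly larger value, which is the property the threshold at 420 relies on.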
If the zero-crossing rate variance is not above the given threshold, the active noise reduction audio device 10 determines at 422 that the first signal is a music signal. The inventor found that the reservoir model also gives good estimation results for music signals. The active noise reduction audio device 10 may therefore use the reservoir model at 424 to estimate the first signal and generate the third signal. Although the reservoir model is used here for estimation based on the music signal, this is merely illustrative and does not limit the scope of the present disclosure. In some embodiments, the weighted model may be used to estimate the third signal based on the music signal.
If the zero-crossing rate variance is above the given threshold, the active noise reduction audio device 10 determines at 421 that the first signal is a mixed signal. For a mixed signal, the active noise reduction audio device 10 may use the weighted model at 423 to estimate the first signal and generate the third signal. In one embodiment, the active noise reduction audio device 10 may estimate the first signal using at least two of the reservoir model, the LPC model, and the neural network model, and assign a weight coefficient to the result estimated by each model. For example, if the signal contains more voiced sound and less noise, a higher weight coefficient, such as 0.75, may be assigned to the result estimated by the LPC model for the voiced sound, and a lower weight coefficient, such as 0.25, may be assigned to the result estimated by the neural network model for the noise. Each estimation result is then multiplied by its weight coefficient, and the two products are added to obtain the final third signal. In another embodiment, all three of the reservoir model, the LPC model, and the neural network model may be used for weighted estimation to obtain the final third signal. It will be understood that the weight coefficients of the individual models in the weighted model can be adjusted dynamically. For example, based on the residual signal described later, the active noise reduction audio device 10 can dynamically adjust the coefficients to obtain a better noise reduction effect.
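The weighted combination described above amounts to a per-sample weighted sum of the individual model outputs. The sketch below uses the 0.75 and 0.25 coefficients from the example; the function name and the sample values standing in for the LPC and neural network outputs are hypothetical.

```python
def weighted_estimate(estimates, weights):
    """Per-sample weighted sum of several model outputs of equal length."""
    if len(estimates) != len(weights):
        raise ValueError("one weight is required per model estimate")
    length = len(estimates[0])
    return [
        sum(w * est[i] for est, w in zip(estimates, weights))
        for i in range(length)
    ]


# Placeholder outputs; real values would come from the respective models.
lpc_estimate = [1.0, 2.0]
neural_net_estimate = [3.0, -2.0]

combined = weighted_estimate([lpc_estimate, neural_net_estimate], [0.75, 0.25])
# Inverting the combined estimate gives the signal to be played.
inverted = [-v for v in combined]
```

Because the weights are plain numbers passed in per call, a residual-driven controller could change them from one period to the next without touching the models themselves.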
It can thus be seen that by subdividing the ambient sound into categories and applying an estimation model suited to each category, noise can be reduced effectively even when the ambient sound spans a wide frequency range, which significantly increases the noise reduction width. On the other hand, loud ambient noise usually has specific sound characteristics and belongs to a specific category, so applying a targeted estimation model can also effectively lower the noise level in decibels and achieve a significant noise reduction depth.
Although the classification process and the corresponding estimation process are described above in sequence according to the flow shown in FIG. 4, it will be understood that this is merely illustrative and does not limit the scope of the present disclosure. Other classification and estimation processes are possible. For example, after acquiring the first signal, the active noise reduction audio device 10 may analyze its various features and determine the category of the first signal directly from the analysis results, without stepping through the sequence shown in FIG. 4. In addition, when more or fewer categories are used, corresponding model estimation approaches may be added or removed so that the third signal is generated more accurately and then inverted to generate the second signal.
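The step-by-step flow of FIG. 4 described above can be summarized as a small decision function over precomputed features. This is a sketch only: the `Features` and `Thresholds` classes, their field names, and every threshold value are hypothetical, and the feature extraction itself is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Features:
    energy: float                 # short-time energy
    zcr: float                    # short-time zero-crossing rate
    low_energy_frame_rate: float  # frame rate of the first (low) energy range
    low_freq_frame_rate: float    # frame rate of the first (low) frequency range
    zcr_variance: float           # variance of the zero-crossing rate


@dataclass
class Thresholds:
    energy: float = 1e-4
    zcr: float = 0.1
    low_energy_frame_rate: float = 0.4
    low_freq_frame_rate: float = 0.5
    zcr_variance: float = 0.05


def classify(f, t=None):
    """Return one of the five categories, following steps 404, 410, 412, 420."""
    t = t or Thresholds()
    if f.energy < t.energy and f.zcr < t.zcr:                # step 404
        return "silence"
    if f.low_energy_frame_rate > t.low_energy_frame_rate:    # step 410
        if f.low_freq_frame_rate > t.low_freq_frame_rate:    # step 412
            return "speech"
        return "noise"
    if f.zcr_variance > t.zcr_variance:                      # step 420
        return "mixed"
    return "music"


# Estimation models named in the text for each category.
MODEL_FOR_CATEGORY = {
    "silence": None,
    "speech": "reservoir or LPC",
    "noise": "neural network",
    "music": "reservoir",
    "mixed": "weighted",
}
```

Keeping the thresholds in one object makes it easy to swap in a different category scheme, such as the three-category variant mentioned earlier, without rewriting the feature code.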
FIG. 5 is a schematic diagram of a process 500 of active noise reduction according to one embodiment of the present disclosure. After collecting ambient sound, the first microphone 13 of the active noise reduction audio device 10 converts it into a first signal 502 and outputs it to the processor 17. The processor 17 may optionally apply filtering 520 to the first signal 502 to generate a first filtered signal 504. Herein, filtering denotes operations performed on a signal other than the estimation and inversion operations. The filtering 520 may include, for example, existing or future signal processing such as gain adjustment, band filtering, and signal denoising. Alternatively, some embodiments may omit the filtering 520. The processor 17 may then perform estimation 530 on the first filtered signal 504, for example using the method 300 of FIG. 3 and/or the method 400 of FIG. 4, to generate a second signal 506 and output it to the built-in first speaker 15 of the active noise reduction audio device 10. Note that in FIG. 5 the estimation 530 includes the inversion step, so the inversion operation is not described again here. Although the filtering 520 and the estimation 530 are described above with the processor 17 as an example, they may also be performed by other components. For example, the filtering 520 may be performed by a separate filter. Furthermore, although FIG. 5 shows the filtering 520 performed before the estimation 530, the estimation 530 may instead be performed first, followed by the filtering 520.
The first speaker 15 plays the received second signal to produce a first sound 512 in the ear. At the same time, the direct sound 514 of the ambient sound during the second period also passes through the active noise reduction audio device 10 and reaches the inside of the ear. Because the direct sound 514 and the first sound 512 are in opposite phase, the two effectively cancel each other, so that the net volume of the sound 508 perceived inside the ear is lower than that of the direct sound 514 and the first sound 512. In this way, the active noise reduction audio device 10 achieves active noise reduction. In embodiments of the present disclosure, because the inverted sound used for cancellation is a "predicted" inverted sound for the second period, there is no need to produce the inverted sound for the first period at the instant the direct sound of the first period reaches the inside of the ear. This is equivalent to shifting the generation of the inverted sound later in time by one period, which relaxes the demanding requirement on the processing speed of the processor 17 and avoids or mitigates the poor noise reduction that results when the direct sound and the inverted sound within the same period do not match. It also reduces hardware design complexity and, correspondingly, hardware cost, because no ultra-high-speed computing circuit is needed in the active noise reduction audio device 10.
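The cancellation described above can be checked numerically: adding an anti-phase signal to the direct sound leaves a residual whose level depends on how well the two match. The sketch below is an idealized illustration with hypothetical names; it ignores acoustic paths, delays, and transducer responses.

```python
import math


def net_sound(direct, anti_phase):
    """Sample-wise superposition of the direct sound and the played sound."""
    return [d + a for d, a in zip(direct, anti_phase)]


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


# Two periods of a sine wave stand in for the direct ambient sound.
direct = [math.sin(2 * math.pi * k / 16) for k in range(32)]

perfect_anti_phase = [-d for d in direct]          # ideal prediction
imperfect_anti_phase = [-0.8 * d for d in direct]  # amplitude underestimated by 20%

residual_perfect = rms(net_sound(direct, perfect_anti_phase))
residual_imperfect = rms(net_sound(direct, imperfect_anti_phase))
```

An ideal prediction cancels the direct sound completely, while the 20% amplitude error leaves roughly one fifth of the original level; a mismatch of this kind is what the residual signal measures.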
In one embodiment, the active noise reduction audio device 10 also has a built-in residual microphone 14. The residual microphone 14 is configured to collect the residual sound 516 inside the ear, which is the sound remaining after noise reduction. From the collected residual sound 516, the residual microphone 14 generates a residual signal 508 representing it and feeds the residual signal 508 back to the processor 17 as the evaluation signal. The processor 17 may adjust at least one of the filtering 520 and the estimation 530 based on the residual signal 508. For example, in one embodiment the filtering 520 is adaptive filtering that adjusts itself automatically based on the residual signal 508. The processor 17 may also adjust the estimation 530 based on the residual signal 508. For example, it may adjust the time length of the second period following the first period, the second period being the period for which the processor 17 estimates the inverted audio signal of the sound. If the residual signal 508 is small, the estimation is performing well. The processor 17 may therefore increase the time length of the second period based on the residual signal 508, for example by increasing the number of time points in the second period from 1 to 2, 3, or 4. If the residual signal 508 is large, the estimation is performing poorly. The processor 17 may therefore reduce the time length of the second period based on the residual signal 508, for example by reducing the number of estimated time points in the second period from 4 to 3, 2, 1, or even 0 (that is, no estimation).
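The adjustment of the second period can be sketched as a simple rule that grows or shrinks the number of estimated time points within the 0 to 4 range used in the example above. The function name, the scalar `residual_level` input, the two thresholds, and the one-step-at-a-time policy are assumptions; the disclosure only states the direction of the adjustment.

```python
def adjust_period_points(current_points, residual_level,
                         low=0.1, high=0.5, min_points=0, max_points=4):
    """Return the number of time points to estimate in the next period.

    residual_level is a hypothetical scalar summary of the residual signal
    (for example its mean absolute value); low and high are assumed thresholds.
    """
    if residual_level < low:
        # Estimation is working well: predict further ahead, up to the cap.
        return min(current_points + 1, max_points)
    if residual_level > high:
        # Estimation is working poorly: shorten the horizon, possibly to zero.
        return max(current_points - 1, min_points)
    # In between: keep the current length.
    return current_points
```

For instance, a device currently estimating 4 points that sees a residual level of 0.9 would drop to 3 points on the next period under these assumed thresholds.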
On the other hand, if the residual signal 508 is large, the processor 17 may also change the estimation model or adjust the model parameters, for example by switching from the LPC model to the reservoir model or by adjusting the parameters of the weighted model. The processor 17 may also adjust the filtering 520, because poor noise reduction may also be caused by the filtering 520. It will be understood that the processor 17 may adjust at least one of the second period length, the estimation model, the model parameters, and the filtering 520 to obtain better noise reduction performance. In addition, when the noise reduction performance is deliberately set below the optimum so that important ambient sounds (such as subway station announcements) are not missed, the processor 17 may likewise adjust at least one of the second period length, the estimation model, the model parameters, and the filtering 520 according to that setting, for example by avoiding the reservoir model and the LPC model or by reducing their weight coefficients in the weighted model.
Although FIG. 5 shows the residual microphone 14 together with adaptive filtering and estimation adjustment based on the residual signal 508, this is merely illustrative and does not limit the scope of the present disclosure. In some embodiments, the residual microphone 14 may be absent, in which case the filtering 520 is fixed filtering and the estimation 530 of the second signal is not adjusted. Furthermore, although only the first microphone 13, the first speaker 15, and the residual microphone 14 are shown in FIG. 5, this is merely illustrative, and the active noise reduction audio device 10 may have more microphones and/or speakers.
In some embodiments, the first microphone 13 may collect ambient sound during the first period to generate the first signal. Based on the initial inverted estimated signal previously estimated for the first period and the collected first signal representing the real ambient sound of the first period, the processor 17 can determine whether the previous estimate was accurate and adjust the time length of subsequent estimation accordingly, for example the time length of the second period following the first period. If the sum of the first signal and the initial inverted estimated signal is small, the estimate was good, and the time length of subsequent estimation may be maintained or increased. Conversely, if the sum is large, the estimate was poor, and the time length of subsequent estimation may be reduced. Alternatively, the processor 17 may determine whether the previous estimate was accurate based on the estimated signal previously estimated for the first period and the first signal. If the difference between the first signal and the estimated signal is small, the estimate was good, and the time length of subsequent estimation may be maintained or increased. Conversely, if the difference is large, the estimate was poor, and the time length of subsequent estimation may be reduced. Furthermore, as with the residual signal, this approach can be used to adjust not only the time length of subsequent estimation but also the estimation model and/or the filtering.
FIG. 6 is a schematic diagram of a process 600 of active noise reduction according to another embodiment of the present disclosure. After collecting the ambient sound, the first microphone 13 of the active noise reduction audio device 10 converts it into a first signal 602 and outputs the first signal 602 to the processor 17. The processor 17 may optionally perform first filtering 620 on the first signal 602 to generate a first filtered signal 604. The first filtering 620 may include, for example, gain adjustment, band filtering, and noise reduction. The processor 17 may then perform first estimation 630 on the first filtered signal 604, for example using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate a second signal 606 and output it to the built-in first speaker 15 of the active noise reduction audio device 10. Note that in FIG. 6 the first estimation 630 includes the inversion step, so the inversion operation is not described separately here. The first speaker 15 plays the received second signal 606 to produce a first sound 612 inside the ear.
FIG. 6 differs from FIG. 5 in that FIG. 6 additionally has a second branch that performs second filtering 622 and second estimation 632. Similarly, the first microphone 13 outputs the first signal 602 to the processor 17. The processor 17 may optionally perform the second filtering 622 on the first signal 602 to generate a second filtered signal 605. The second filtering 622 may include, for example, gain adjustment, band filtering, and noise reduction, and may be the same as or different from the first filtering 620. The processor 17 may then perform the second estimation 632 on the second filtered signal 605, for example using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate a sixth signal 607 and output it to the built-in second speaker 18 of the active noise reduction audio device 10. Note that in FIG. 6 the second estimation 632 includes the inversion step, so the inversion operation is not described separately here. The second speaker 18 plays the received sixth signal 607 to produce a second sound 613 inside the ear.
At the same time, the direct sound 614 of the ambient sound during the second period also reaches the inside of the ear through the active noise reduction audio device 10. Because the direct sound 614 is opposite in phase to the first sound 612 and the second sound 613, the direct sound 614 and the first sound 612 and second sound 613 cancel each other to a certain extent, so that the net volume of the sound 616 perceived inside the ear is lower than that of the direct sound 614, the first sound 612, and the second sound 613. In this way, the active noise reduction audio device 10 can achieve active noise reduction.
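The cancellation described above can be illustrated numerically as a sample-wise superposition of the direct sound with two anti-phase sounds. This is a minimal sketch under stated assumptions: the two-tone test signal, the 90 % amplitude accuracy of the anti-phase estimates, and all names are hypothetical and are not taken from the disclosure.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def superpose(*signals):
    """Sample-wise sum of sounds arriving at the same point."""
    return [sum(values) for values in zip(*signals)]


N = 64
low = [math.sin(2 * math.pi * 2 * k / N) for k in range(N)]
high = [0.5 * math.sin(2 * math.pi * 9 * k / N) for k in range(N)]

direct_sound = superpose(low, high)      # stands in for the direct sound 614
first_sound = [-0.9 * s for s in low]    # imperfect anti-phase estimate (612)
second_sound = [-0.9 * s for s in high]  # imperfect anti-phase estimate (613)
perceived = superpose(direct_sound, first_sound, second_sound)  # stands in for 616

# Level of the perceived sound relative to the direct sound, in decibels.
reduction_db = 20 * math.log10(rms(perceived) / rms(direct_sound))
```

With anti-phase estimates that reach 90 % of the true amplitude, one tenth of the direct sound remains, which corresponds to a reduction of roughly 20 dB in this idealized setting.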
In one embodiment, the second filtering 622 is different from the first filtering 620, and the second estimation 632 is different from the first estimation 630. For example, the first filtering 620 and the first estimation 630 may both be directed to speech signals, while the second filtering 622 and the second estimation 632 are both directed to music signals. As another example, the first filtering 620 and the first estimation 630 are both directed to low-frequency audio signals, while the second filtering 622 and the second estimation 632 are both directed to mid-to-high-frequency signals. In this case, the settings can be optimized for each category of signal. In addition, it can be understood that the first speaker 15 and the second speaker 18 may also be selected to suit the respective categories of sound, so as to obtain better noise reduction depth and noise reduction width, because a speaker designed for a specific category of sound is often better tuned than a general-purpose wide-range speaker.
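One simple way to picture a low-frequency branch and a mid-to-high-frequency branch is a one-pole low-pass filter together with its complement, so that the two bands always add back up to the input. The sketch below is only an illustration under that assumption. The disclosure does not specify the filter design, and the name `split_bands` and the smoothing factor `alpha` are hypothetical.

```python
def split_bands(samples, alpha=0.2):
    """Split samples into a low band (one-pole low-pass) and its complement.

    By construction low[n] + high[n] reproduces samples[n], up to rounding.
    `alpha` (0 < alpha <= 1) is an illustrative smoothing factor.
    """
    low, high = [], []
    state = 0.0
    for sample in samples:
        state += alpha * (sample - state)  # exponential smoothing
        low.append(state)
        high.append(sample - state)        # whatever the low band leaves out
    return low, high


# A constant (very low frequency) input ends up almost entirely in the low band.
steady_low, steady_high = split_bands([1.0] * 200)

# A sample-by-sample alternating (very high frequency) input ends up mostly in the high band.
alternating = [(-1.0) ** k for k in range(200)]
alt_low, alt_high = split_bands(alternating)
```

Each band could then be handed to its own estimation and its own speaker, in the spirit of the two branches described above.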
FIG. 7 shows a schematic diagram of another active noise reduction audio device 20 in which embodiments of the present disclosure may be implemented. The active noise reduction audio device 20 may be, for example, a pair of over-ear headphones. The active noise reduction audio device 20 may include a pair of ear cup portions that are configured substantially identically to each other, so only one ear cup portion is described here. The ear cup portion includes an external first microphone 13, an external second microphone 19, and a processor 17 located inside the ear cup portion. The ear cup portion further includes a first residual microphone 14, a second residual microphone 16, a first speaker 15, and a second speaker 18, all located inside the ear cup portion (relative to the first microphone 13 and the second microphone 19, which are exposed to the environment). The first microphone 13 and the second microphone 19 are both configured to detect or collect sound from the external environment; they may operate simultaneously or alternately and may collect the same or different sounds. In one embodiment, the first microphone 13 may have an internal first filter so that it collects only sound at a first frequency, and the second microphone 19 may have an internal second filter so that it collects only sound at a second frequency. For example, the first frequency is a low frequency, and the second frequency is a mid-to-high frequency. Capturing sound separately for different frequencies provides more detail about the ambient sound and therefore a better sound estimation, resulting in better noise reduction width and noise reduction depth.
In one embodiment, the external first microphone 13 and second microphone 19 of the active noise reduction audio device 20 collect the ambient sound, perform acoustic-to-electrical conversion to generate a continuous electrical signal, and transmit that signal to the processor 17. Based on the received signal, the processor 17 predicts or estimates the ambient sound at a subsequent time, generates an inverted signal representing the ambient sound at the subsequent time, and transmits the inverted signal to the built-in first speaker 15 and second speaker 18. The first speaker 15 and the second speaker 18 play an inverted sound based on the received inverted signal, so as to cancel the direct ambient sound that reaches the inside of the active noise reduction audio device 20 from the environment at the subsequent time, thereby achieving noise reduction.
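To make the idea of predicting a subsequent period and then inverting it concrete, the following sketch uses a second-order linear predictor fitted by least squares. This is a deliberately simple stand-in chosen for illustration and is not the estimation model of the disclosure. It is exact only for a single steady tone, and all function names are hypothetical.

```python
import math


def fit_tone_coefficient(x):
    """Least-squares fit of c in the relation x[n] + x[n-2] = c * x[n-1]."""
    num = sum((x[n] + x[n - 2]) * x[n - 1] for n in range(2, len(x)))
    den = sum(x[n - 1] ** 2 for n in range(2, len(x)))
    return num / den


def predict_inverted(x, horizon):
    """Extrapolate `horizon` samples beyond x and return them with inverted sign."""
    c = fit_tone_coefficient(x)
    history = list(x[-2:])
    predicted = []
    for _ in range(horizon):
        nxt = c * history[-1] - history[-2]
        predicted.append(nxt)
        history.append(nxt)
    return [-p for p in predicted]


# First period: 32 samples of a steady tone (hypothetical ambient sound).
first_period = [math.sin(0.3 * k) for k in range(32)]

# Inverted estimate for a second period of 4 samples.
anti_phase = predict_inverted(first_period, horizon=4)

# What actually arrives during the second period, and what remains after cancellation.
second_period = [math.sin(0.3 * k) for k in range(32, 36)]
residual = [d + a for d, a in zip(second_period, anti_phase)]
```

For real ambient sound the residual would not vanish as it does for this idealized tone, which is why the length of the predicted period matters.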
FIG. 8 is a schematic diagram of a process 800 of active noise reduction according to yet another embodiment of the present disclosure. In one embodiment, the process 800 may be implemented in the active noise reduction audio device 20 shown in FIG. 7. After collecting the ambient sound during the first period, the first microphone 13 of the active noise reduction audio device 20 converts it into a first signal 802 and outputs the first signal 802 to the processor 17. The processor 17 may optionally perform first filtering 820 on the first signal 802 to generate a first filtered signal 804. The first filtering 820 may include, for example, gain adjustment, band filtering, and noise reduction. The processor 17 may then perform first estimation 830 on the first filtered signal 804, for example using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate a second signal 806 corresponding to the second period and output it to the built-in first speaker 15 of the active noise reduction audio device 20. Note that in FIG. 8 the first estimation 830 includes the inversion step, so the inversion operation is not described separately here. Although the first filtering 820 and the first estimation 830 are described above as being performed by the processor 17, they may also be performed by other components. For example, the first filtering 820 may be performed by a separate filter. Furthermore, although FIG. 8 shows the first filtering 820 being performed before the first estimation 830, the first estimation 830 may instead be performed first, followed by the first filtering 820.
Similarly, after collecting the ambient sound during the first period, the second microphone 19 of the active noise reduction audio device 20 converts it into a fifth signal and outputs the fifth signal to the processor 17. The processor 17 may optionally perform second filtering 822 on the fifth signal to generate a second filtered signal 805. The second filtering 822 may include, for example, gain adjustment, band filtering, and noise reduction. The processor 17 may then perform second estimation 832 on the second filtered signal 805, for example using the method 300 of FIG. 3 and/or the process 400 of FIG. 4, to generate a sixth signal 807 corresponding to the second period and output it to the built-in second speaker 18 of the active noise reduction audio device 20. Note that in FIG. 8 the second estimation 832 includes the inversion step, so the inversion operation is not described separately here.
The first speaker 15 plays the received second signal 806 to produce a first sound 823 inside the ear, and the second speaker 18 plays the received sixth signal 807 to produce a second sound 825 inside the ear. The first sound 823 and the second sound 825 may be played simultaneously or alternately. The direct sound 824 of the ambient sound during the second period also reaches the inside of the ear through the active noise reduction audio device 20. Because the direct sound 824 is opposite in phase to the first sound 823 and the second sound 825, the direct sound 824 and the first sound 823 and second sound 825 cancel each other to a certain extent, so that the net volume of the sounds 826 and 827 received inside the ear is lower than that of the direct sound 824, the first sound 823, and the second sound 825. In this way, the active noise reduction audio device 20 can achieve active noise reduction.
In one embodiment, the active noise reduction audio device 20 also has a built-in first residual microphone 14 and a built-in second residual microphone 16. The first residual microphone 14 is configured to collect the first residual sound 826 inside the ear. The first residual sound 826 is the sound that remains after noise reduction. Based on the collected first residual sound 826, the first residual microphone 14 generates a first residual signal 808 representing the first residual sound 826 and feeds the first residual signal 808 back to the processor 17. The processor 17 may adjust at least one of the first filtering 820 and the first estimation 830 based on the first residual signal 808. For example, in one embodiment, the first filtering 820 is adaptive filtering that adjusts itself automatically based on the first residual signal 808. The processor 17 may also adjust the first estimation 830 based on the first residual signal 808. For example, the processor 17 may adjust the time length of the second period following the first period, that is, the period for which the processor 17 estimates the inverted audio signal based on the sound signal collected during the first period. If the first residual signal 808 is small, the estimation is working well. The processor 17 may therefore increase the time length of the second period accordingly based on the first residual signal 808, for example by increasing the number of time points contained in the second period from 1 to 2, 3, or 4. If the first residual signal 808 is large, the estimation is working less well. The processor 17 may therefore reduce the time length of the second period accordingly based on the first residual signal 808, for example by reducing the number of estimated time points contained in the second period from 4 to 3, 2, 1, or even 0 (that is, no estimation).
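The adjustment of the number of estimated time points can be sketched as a simple threshold rule on the level of the residual signal. This is a minimal illustration under stated assumptions. The disclosure does not give thresholds or a step size, so `low`, `high`, the one-point step, and the function names are hypothetical. The upper limit of 4 points and the lower limit of 0 points follow the example values mentioned above.

```python
def residual_level(residual_signal):
    """Average absolute amplitude of the residual signal (illustrative measure)."""
    return sum(abs(r) for r in residual_signal) / len(residual_signal)


def next_horizon(current_points, level, low=0.05, high=0.2, max_points=4):
    """Return the number of time points to estimate in the next second period.

    A small residual lengthens the period, a large residual shortens it
    (down to 0, meaning no estimation), and anything in between keeps it.
    """
    if level < low:
        return min(current_points + 1, max_points)
    if level > high:
        return max(current_points - 1, 0)
    return current_points


# Hypothetical residual blocks: a quiet one and a loud one.
quiet = residual_level([0.01, -0.02, 0.01, 0.0])
loud = residual_level([0.5, -0.4, 0.6, -0.5])

longer = next_horizon(2, quiet)   # estimation is working well: 2 -> 3 points
shorter = next_horizon(2, loud)   # estimation is working poorly: 2 -> 1 point
```

Changing the length by one point per update is a design choice of this sketch; a real device could equally jump directly to a length chosen from the residual level.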
On the other hand, if the first residual signal 808 is large, the processor 17 may also change the estimation model or adjust the model parameters, for example by switching from an LPC model to a reservoir model, or by adjusting the parameters of a weighted model. The processor 17 may also adjust the first filtering 820, because unsatisfactory noise reduction may also be caused by the first filtering 820. It can be understood that the processor 17 may adjust at least one of the above-mentioned length of the second period, the estimation model, the model parameters, and the first filtering 820 to obtain better noise reduction performance. In addition, where the noise reduction performance is deliberately set to be non-optimal so that important ambient sounds (for example, subway station announcements) are not missed, the processor 17 may also adjust at least one of the length of the second period, the estimation model, the model parameters, and the first filtering 820 according to that setting, for example by avoiding the reservoir model and the LPC model or by reducing their weight coefficients in the weighted model.
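Adjusting the parameters of a weighted model can be pictured as blending the predictions of several models and shifting weight toward whichever model has recently produced the smaller error. The sketch below is an assumption-laden illustration. The inverse-error weighting rule, the function names, and the sample numbers are hypothetical, and the two dictionary keys merely label two models without implementing them.

```python
def weighted_prediction(predictions, weights):
    """Blend per-model predictions (equal-length sample lists) by their weights."""
    total = sum(weights.values())
    length = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in predictions) / total
        for i in range(length)
    ]


def reweight(errors, floor=1e-6):
    """Weights inversely proportional to each model's recent error, summing to 1.

    `floor` guards against division by zero for a model with no error.
    """
    raw = {name: 1.0 / max(error, floor) for name, error in errors.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical recent errors of two models and their predictions for two time points.
recent_errors = {"lpc": 0.1, "reservoir": 0.3}
predictions = {"lpc": [1.0, 2.0], "reservoir": [3.0, 4.0]}

weights = reweight(recent_errors)            # about 0.75 for "lpc", 0.25 for "reservoir"
blended = weighted_prediction(predictions, weights)
```

Setting a weight to zero in such a scheme has the same effect as not using that model at all.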
Similarly, the processor 17 may adjust at least one of the length of the second period, the estimation model, the model parameters, and the second filtering 822 based on the second residual signal 810 to obtain better noise reduction performance. Although FIG. 8 shows the first residual microphone 14, the second residual microphone 16, and the corresponding adaptive filtering and estimation adjustment based on the first residual signal 808 and the second residual signal 810, this is merely illustrative and does not limit the scope of the present disclosure. In some embodiments, the first residual microphone 14 and/or the second residual microphone 16 may be omitted; for example, the first filtering 820 and/or the second filtering 822 may be fixed filtering, with no corresponding adjustment of the first estimation 830 for the second signal 806 and/or the second estimation 832 for the sixth signal 807. In addition, it can be understood that the active noise reduction audio device 20 may have more microphones and/or speakers.
Although FIG. 8 shows the first sound 823 and the second sound 825 each cancelling against the direct sound 824 separately, this is only an illustrative example for the case where the first sound 823 and the second sound 825 are played alternately. When the first sound 823 and the second sound 825 are played simultaneously, the first sound 823, the second sound 825, and the direct sound 824 may add together and produce a single residual sound that is collected by both the first residual microphone 14 and the second residual microphone 16. In this case, the first residual signal 808 and the second residual signal 810 may be identical. Alternatively, the first residual signal 808 and the second residual signal 810 may differ even in this case, for example because of the performance and position of the first residual microphone 14 and the second residual microphone 16. The processor 17 may then average the first residual signal 808 and the second residual signal 810 for use by the first filtering 820, the second filtering 822, the first estimation 830, and the second estimation 832. This avoids abnormal noise reduction caused by microphone position or performance and thus provides a more stable noise reduction effect.
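Averaging the two residual signals amounts to a sample-wise mean. The sketch below assumes, purely for illustration, that both residual signals are already time-aligned and of equal length; the function name and the sample values are hypothetical.

```python
def average_residuals(first_residual, second_residual):
    """Sample-wise mean of two time-aligned residual signals of equal length."""
    if len(first_residual) != len(second_residual):
        raise ValueError("residual signals must have the same length")
    return [(a + b) / 2 for a, b in zip(first_residual, second_residual)]


# Hypothetical residuals picked up by two microphones at slightly different positions.
shared_residual = average_residuals([0.5, -0.25], [0.25, -0.75])
```

The same averaged signal can then be fed to both filtering steps and both estimation steps.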
In embodiments of the present disclosure, because the inverted sound used for cancellation is a "predicted" inverted sound for the second period, there is no need to provide the inverted sound for the first period at the instant the direct sound of the first period reaches the inside of the ear. This is equivalent to shifting the generation of the inverted sound later in time by one period, which relaxes the stringent requirement on the processing speed of the processor 17 and avoids or mitigates the unsatisfactory noise reduction caused by a mismatch between the direct sound and the inverted sound within the same period. It can also reduce hardware design complexity and, correspondingly, hardware cost, because no ultra-high-speed computing device is required.
It can be understood that the first filtering 820 and the second filtering 822 may be the same or different, and that the first estimation 830 and the second estimation 832 may be the same or different. In one embodiment, the second filtering 822 is different from the first filtering 820, and the second estimation 832 is different from the first estimation 830. For example, the first filtering 820 and the first estimation 830 may both be directed to speech signals, while the second filtering 822 and the second estimation 832 are both directed to music signals. As another example, the first filtering 820 and the first estimation 830 are both directed to low-frequency audio signals, while the second filtering 822 and the second estimation 832 are both directed to mid-to-high-frequency signals. Accordingly, the first sound 823 is a low-frequency sound, and the second sound 825 is a mid-to-high-frequency sound. In this case, the settings can be optimized for each category of signal. In addition, it can be understood that the first speaker 15 and the second speaker 18 may also be selected to suit the respective categories of sound, so as to obtain better noise reduction depth and noise reduction width, because a speaker designed for a specific category of sound is often better tuned than a general-purpose wide-range speaker.
FIG. 9 is a schematic block diagram of a computer-readable storage medium 900 according to an embodiment of the present disclosure. The computer-readable storage medium 900 is, for example, a cache in the processor 17 or the internal memory 121 in FIG. 2. The computer-readable storage medium 900 stores one or more programs 902 ... 906, which are configured to be executed by one or more processors of an active noise reduction audio device. The one or more programs 902 ... 906 may individually or collectively include instructions executable by the processor 17 to implement the methods or processes described herein, for example the method 300 shown in FIG. 3, the method 400 shown in FIG. 4, the process 500 shown in FIG. 5, the process 600 shown in FIG. 6, and/or the process 800 shown in FIG. 8. It can be understood that the computer-readable storage medium 900 may also include programs for implementing other methods and steps.
FIG. 10 is a schematic block diagram of an apparatus 1000 for reducing ambient noise according to an embodiment of the present disclosure. The apparatus 1000 may be applied to an active noise reduction audio device. The apparatus 1000 includes an acquisition module and an inverted estimation signal generation module. The acquisition module is configured to acquire a first signal and an evaluation signal corresponding to a first period. The first signal represents a first ambient sound collected by a first microphone of the active noise reduction device during the first period. The inverted estimation signal generation module is configured to determine a time length of a second period based on the evaluation signal, the second period following the first period, and to generate, based on the first signal and the time length of the second period, a second signal to be played by a first speaker of the active noise reduction device, the second signal representing an inverted estimate of the first ambient sound during the second period. Predicting the sound at a future time from the audio signal at the current time makes it possible to obtain better noise reduction width and noise reduction depth, and also to simplify circuit design and reduce the need for high-speed processing circuits, thereby lowering the cost of the active noise reduction audio device.
Although only two modules are shown in FIG. 10, it can be understood that this is merely illustrative and does not limit the scope of the present disclosure. The apparatus 1000 may also include corresponding modules for performing the steps of the method 300, the method 400, the process 500, the process 600, and/or the process 800 described above.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (21)

  1. A method for active noise reduction, comprising:
    acquiring a first signal, the first signal representing a first ambient sound collected by a first microphone of an active noise reduction device during a first period;
    acquiring an evaluation signal corresponding to the first period;
    determining a time length of a second period based on the evaluation signal, the second period following the first period; and
    generating, based on the first signal and the time length of the second period, a second signal to be played by a first speaker of the active noise reduction device, the second signal representing an inverted estimate of the first ambient sound during the second period.
  2. The method according to claim 1, wherein generating, based on the first signal and the time length of the second period, the second signal to be played by the first speaker of the active noise reduction device comprises:
    generating a third signal based on the first signal and the time length of the second period, the third signal representing an estimate of the first ambient sound during the second period; and
    inverting the third signal to generate the second signal.
  3. The method according to claim 1, wherein generating, based on the first signal and the time length of the second period, the second signal to be played by the first speaker of the active noise reduction device comprises:
    inverting the first signal to generate a fourth signal, the fourth signal representing an inverted sound of the first ambient sound during the first period; and
    generating the second signal based on the fourth signal and the time length of the second period.
  4. The method according to any one of claims 1-3, wherein acquiring the evaluation signal corresponding to the first period comprises:
    acquiring, as the evaluation signal, a residual signal collected during the first period by a residual microphone of the active noise reduction audio device, the residual microphone being different from the first microphone.
  5. The method according to any one of claims 1-3, wherein acquiring the evaluation signal corresponding to the first period comprises:
    determining the evaluation signal based on the first signal and an inverted estimation signal that corresponds to the first period and was estimated in a period preceding the first period, or based on the first signal and an estimation signal that corresponds to the first period and was estimated in a period preceding the first period.
  6. The method according to any one of claims 1-5, wherein generating, based on the first signal and the time length of the second period, the second signal to be played by the first speaker of the active noise reduction device comprises:
    determining a category of the first ambient sound during the first period based on the first signal;
    determining, based on the category of the first ambient sound, an estimation model corresponding to the category of the first ambient sound; and
    generating the second signal based on the estimation model, the first signal, and the time length of the second period.
  7. The method according to claim 6, wherein determining, based on the category of the first ambient sound, the estimation model corresponding to the category of the first ambient sound comprises:
    determining, based on the category of the first ambient sound, a weighted model corresponding to the first ambient sound, the weighted model comprising: a first estimation model, a weight of the first estimation model, a second estimation model, and a weight of the second estimation model.
  8. The method according to claim 6 or 7, further comprising:
    adjusting the estimation model based on the evaluation signal.
  9. 根据权利要求6-8中任一项所述的方法,其中基于所述第一信号和第二时段的时间长 度生成待由所述主动降噪设备的第一扬声器播放的第二信号包括:The method of any of claims 6-8, wherein generating a second signal to be played by a first speaker of the active noise reduction device based on the first signal and a time length of a second time period comprises:
    基于所述第一环境声音的类别确定与所述第一环境声音对应的滤波;determining a filter corresponding to the first ambient sound based on the category of the first ambient sound;
    基于所述第一信号、第二时段的时间长度和所述滤波生成所述第二信号。The second signal is generated based on the first signal, the time length of the second period, and the filtering.
  10. The method according to any one of claims 1-9, further comprising:
    obtaining a fifth signal, the fifth signal representing a second ambient sound collected by a second microphone of the active noise reduction audio device during the first period; and
    generating, based on the fifth signal and the time length of the second period, a sixth signal to be played by a second speaker of the active noise reduction audio device, the sixth signal representing an inverted estimate of the second ambient sound during the second period.
  11. A computer-readable storage medium storing one or more programs, the one or more programs being configured to be executed by one or more processors and comprising instructions for performing the method according to any one of claims 1-10.
  12. A computer program product comprising one or more programs, the one or more programs being configured to be executed by one or more processors and comprising instructions for performing the method according to any one of claims 1-10.
  13. An active noise reduction audio device, comprising:
    an acquisition module, configured to obtain a first signal and an evaluation signal corresponding to a first period, the first signal representing a first ambient sound collected by a first microphone of the active noise reduction device during the first period; and
    an inverted estimation signal generation module, configured to:
    determine a time length of a second period based on the evaluation signal, the second period following the first period; and
    generate, based on the first signal and the time length of the second period, a second signal to be played by a first speaker of the active noise reduction device, the second signal representing an inverted estimate of the first ambient sound during the second period.
  14. The active noise reduction audio device according to claim 13, wherein the acquisition module is further configured to:
    determine the evaluation signal either based on the first signal and an inverted estimation signal that corresponds to the first period and was estimated in a period preceding the first period, or based on the first signal and an estimation signal that corresponds to the first period and was estimated in a period preceding the first period.
  15. The active noise reduction audio device according to claim 13, wherein
    the acquisition module is further configured to obtain, as the evaluation signal, a residual signal collected by a residual microphone of the active noise reduction audio device during the first period, the residual microphone being different from the first microphone.
  16. The active noise reduction audio device according to any one of claims 13-15, wherein the inverted estimation signal generation module is further configured to:
    determine, based on the first signal, a category of the first ambient sound during the first period;
    determine, based on the category of the first ambient sound, an estimation model corresponding to the category of the first ambient sound; and
    generate the second signal based on the estimation model, the first signal, and the time length of the second period.
  17. The active noise reduction audio device according to claim 16, wherein the inverted estimation signal generation module is further configured to:
    determine, based on the category of the first ambient sound, a weighted model corresponding to the first ambient sound, the weighted model comprising: a first estimation model, a weight of the first estimation model, a second estimation model, and a weight of the second estimation model.
  18. The active noise reduction audio device according to claim 16 or 17, wherein the inverted estimation signal generation module is further configured to adjust the estimation model based on the evaluation signal.
  19. An active noise reduction audio device, comprising:
    one or more processors; and
    a memory storing one or more programs, the one or more programs being configured to be executed by the one or more processors and comprising instructions for performing the method according to any one of claims 1-10.
  20. An active noise reduction audio device, comprising:
    a first microphone, configured to:
    collect a first ambient sound during a first period and generate a first signal;
    one or more processors, configured to:
    obtain an evaluation signal corresponding to the first period;
    determine a time length of a second period based on the evaluation signal, the second period following the first period; and
    generate, based on the first signal and the time length of the second period, a second signal, the second signal representing an inverted estimate of the first ambient sound during the second period; and
    a first speaker, configured to play the second signal during the second period.
  21. The active noise reduction audio device according to claim 20, further comprising:
    a residual microphone, configured to collect residual sound to generate the residual signal as the evaluation signal.
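
As an illustration only, the flow recited in claims 6, 7, and 20 (classify the ambient sound, select a weighted estimation model, choose the second-period length from the evaluation signal, and output an inverted estimate) might be sketched as follows. Every name, threshold, category label, and weight below is a hypothetical assumption made for the sketch; the claims do not specify the classifier, the estimation models, or how the evaluation signal maps to a time length.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Estimator = Callable[[Sequence[float], int], List[float]]


def hold_last(history: Sequence[float], n: int) -> List[float]:
    """Hypothetical estimation model: repeat the last observed sample n times."""
    return [history[-1]] * n


def repeat_period(history: Sequence[float], n: int, period: int = 4) -> List[float]:
    """Hypothetical estimation model: assume the last `period` samples repeat."""
    tail = list(history[-period:])
    return [tail[i % len(tail)] for i in range(n)]


# Assumed mapping from sound category to a weighted model, i.e. a list of
# (estimation model, weight) pairs. Categories and weights are invented.
WEIGHTED_MODELS: Dict[str, List[Tuple[Estimator, float]]] = {
    "periodic": [(repeat_period, 0.8), (hold_last, 0.2)],
    "steady": [(repeat_period, 0.2), (hold_last, 0.8)],
}


def classify(first_signal: Sequence[float]) -> str:
    """Toy classifier: 'periodic' if the signal takes both signs, else 'steady'."""
    signs = {s > 0 for s in first_signal if s != 0}
    return "periodic" if len(signs) == 2 else "steady"


def second_period_length(
    evaluation_signal: Sequence[float],
    min_len: int = 2,
    max_len: int = 8,
    threshold: float = 0.1,
) -> int:
    """Assumed rule: predict further ahead only while the residual RMS is small."""
    if not evaluation_signal:
        return min_len
    rms = (sum(e * e for e in evaluation_signal) / len(evaluation_signal)) ** 0.5
    return max_len if rms < threshold else min_len


def generate_second_signal(
    first_signal: Sequence[float], evaluation_signal: Sequence[float]
) -> List[float]:
    """Return the inverted estimate of the ambient sound for the second period."""
    n = second_period_length(evaluation_signal)
    estimate = [0.0] * n
    for model, weight in WEIGHTED_MODELS[classify(first_signal)]:
        prediction = model(first_signal, n)
        estimate = [acc + weight * p for acc, p in zip(estimate, prediction)]
    # Phase inversion: the speaker plays the negated estimate.
    return [-x for x in estimate]
```

In this sketch a large residual shortens the prediction horizon, so the estimate is refreshed more often when cancellation is poor; that policy is one plausible reading, not something the claims state.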
PCT/CN2021/082870 2021-03-25 2021-03-25 Active noise reduction audio device, and method for active noise reduction WO2022198538A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180095625.9A CN116982106A (en) 2021-03-25 2021-03-25 Active noise reduction audio device and method for active noise reduction
PCT/CN2021/082870 WO2022198538A1 (en) 2021-03-25 2021-03-25 Active noise reduction audio device, and method for active noise reduction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/082870 WO2022198538A1 (en) 2021-03-25 2021-03-25 Active noise reduction audio device, and method for active noise reduction

Publications (1)

Publication Number Publication Date
WO2022198538A1 (en) 2022-09-29

Family

ID=83395091

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/082870 WO2022198538A1 (en) 2021-03-25 2021-03-25 Active noise reduction audio device, and method for active noise reduction

Country Status (2)

Country Link
CN (1) CN116982106A (en)
WO (1) WO2022198538A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101589628A (en) * 2007-01-25 2009-11-25 沃福森微电子股份有限公司 Ambient noise reduction
CN101616351A (en) * 2008-06-27 2009-12-30 索尼株式会社 Noise reduction audio reproducing device and noise reduction audio reproducing method
CN105612576A (en) * 2013-10-14 2016-05-25 高通股份有限公司 Limiting active noise cancellation output
CN106797511A (en) * 2015-05-08 2017-05-31 华为技术有限公司 Active noise reduction equipment
CN107564538A (en) * 2017-09-18 2018-01-09 武汉大学 The definition enhancing method and system of a kind of real-time speech communicating
CN111050250A (en) * 2020-01-15 2020-04-21 北京声智科技有限公司 Noise reduction method, device, equipment and storage medium
US10789934B2 (en) * 2017-03-16 2020-09-29 Panasonic Intellectual Property Management Co., Ltd. Active noise reduction device and active noise reduction method
CN112188340A (en) * 2020-09-22 2021-01-05 泰凌微电子(上海)有限公司 Active noise reduction method, active noise reduction device and earphone

Also Published As

Publication number Publication date
CN116982106A (en) 2023-10-31

Similar Documents

Publication Publication Date Title
KR102512311B1 (en) Earbud speech estimation
JP6150988B2 (en) Audio device including means for denoising audio signals by fractional delay filtering, especially for "hands free" telephone systems
RU2595636C2 (en) System and method for audio signal generation
JP6572894B2 (en) Information processing apparatus, information processing method, and program
JP5395895B2 (en) Signal processing method and system
KR101444100B1 (en) Noise cancelling method and apparatus from the mixed sound
RU2605522C2 (en) Device containing plurality of audio sensors and operation method thereof
CN109493877B (en) Voice enhancement method and device of hearing aid device
EP3682651A1 (en) Low latency audio enhancement
US20230352038A1 (en) Voice activation detecting method of earphones, earphones and storage medium
GB2581596A (en) Headset on ear state detection
JP6250147B2 (en) Hearing aid system signal processing method and hearing aid system
CN111683319A (en) Call pickup noise reduction method, earphone and storage medium
WO2022140928A1 (en) Audio signal processing method and system for suppressing echo
CN113949956B (en) Noise reduction processing method and device, electronic equipment, earphone and storage medium
CN112954530A (en) Earphone noise reduction method, device and system and wireless earphone
CN113949955B (en) Noise reduction processing method and device, electronic equipment, earphone and storage medium
CN115348520A (en) Hearing aid comprising a feedback control system
US10972834B1 (en) Voice detection using ear-based devices
WO2022198538A1 (en) Active noise reduction audio device, and method for active noise reduction
CN114023352B (en) Voice enhancement method and device based on energy spectrum depth modulation
CN112382305B (en) Method, apparatus, device and storage medium for adjusting audio signal
WO2023077252A1 (en) Fxlms structure-based active noise reduction system, method, and device
US11948546B2 (en) Feed-forward adaptive noise-canceling with dynamic filter selection based on classifying acoustic environment
CN116419111A (en) Earphone control method, parameter generation method, device, storage medium and earphone

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21932166; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 202180095625.9; Country of ref document: CN)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 21932166; Country of ref document: EP; Kind code of ref document: A1)