WO2022227056A1 - Acoustic device - Google Patents


Info

Publication number
WO2022227056A1
Authority
WO
WIPO (PCT)
Prior art keywords
noise
signal
target
microphone
microphone array
Prior art date
Application number
PCT/CN2021/091652
Other languages
French (fr)
Chinese (zh)
Inventor
肖乐
郑金波
张承乾
廖风云
齐心
Original Assignee
深圳市韶音科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市韶音科技有限公司
Priority to CN202180094203.XA (CN116918350A)
Priority to US17/451,659 (US11328702B1)
Priority to PCT/CN2021/131927 (WO2022227514A1)
Priority to KR1020227044224A (KR20230013070A)
Priority to EP21938133.2A (EP4131997A4)
Priority to BR112022023372A (BR112022023372A2)
Priority to JP2022580472A (JP2023532489A)
Priority to CN202111408328.3A (CN115243137A)
Priority to TW111111172A (TW202243486A)
Priority to US17/657,743 (US11715451B2)
Priority to TW111115388A (TW202242855A)
Priority to US18/047,639 (US20230063283A1)
Publication of WO2022227056A1
Priority to US18/332,746 (US20230317048A1)

Classifications

    • G10L21/0208 Noise filtering (G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation)
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1083 Reduction of ambient noise
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R9/02 Details of transducers of moving-coil, moving-strip, or moving-wire type
    • H04R9/06 Loudspeakers of moving-coil, moving-strip, or moving-wire type
    • H04R2460/13 Hearing devices using bone conduction transducers

Definitions

  • the present application relates to the field of acoustics, and in particular, to an acoustic device.
  • the acoustic device allows users to listen to audio content and make voice calls while ensuring the privacy of user interaction content, and does not disturb surrounding people when listening.
  • Acoustic devices can generally be divided into two categories: in-ear acoustic devices and open acoustic devices.
  • the in-ear acoustic device may block the user's ear during use, and the user is prone to feelings of blockage, foreign-body sensation, swelling, and pain when wearing it for a long time.
  • the open acoustic device leaves the user's ear open, which is conducive to long-term wearing, but when external noise is loud its noise reduction effect is limited, which degrades the user's listening experience.
  • the acoustic device may include a microphone array, a processor and at least one speaker.
  • the microphone array may be configured to pick up ambient noise.
  • the processor may be configured to estimate a sound field at a target spatial location using the microphone array.
  • the target spatial location may be closer to the user's ear canal than any microphone in the microphone array.
  • the processor may be further configured to generate a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location.
  • the at least one speaker may be configured to output a target signal according to the noise reduction signal.
  • the target signal can be used to reduce the ambient noise.
  • the microphone array may be positioned in a target area so as to minimize the interference signals transmitted from the at least one speaker to the microphone array.
  • the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location may include estimating the noise at the target spatial location based on the picked-up ambient noise, and generating the noise reduction signal based on the noise at the target spatial location and the sound field estimate of the target spatial location.
  • the acoustic device may further include one or more sensors for acquiring motion information of the acoustic device.
  • the processor may be further configured to update the noise at the target spatial location and the sound field estimate of the target spatial location based on the motion information, and to generate the noise reduction signal based on the updated noise and the updated sound field estimate of the target spatial location.
  • the estimating noise at the target spatial location based on the picked-up ambient noise may include determining one or more spatial noise sources associated with the picked-up ambient noise, and estimating the noise at the target spatial location based on the spatial noise sources.
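As an illustrative sketch (not the application's actual algorithm), once a spatial noise source has been localized, the noise it produces at the target spatial location can be predicted with a free-field propagation model. The function name and the single-point-source assumption below are hypothetical:

```python
import numpy as np

def estimate_noise_at_target(src_signal, src_pos, target_pos, fs=16000, c=343.0):
    """Predict the noise a localized point source produces at the target
    spatial location, assuming free-field propagation: a pure time delay
    of r/c and spherical-spreading (1/r) amplitude decay."""
    r = np.linalg.norm(np.asarray(target_pos, float) - np.asarray(src_pos, float))
    delay = int(round(r / c * fs))       # propagation delay in samples
    atten = 1.0 / max(r, 1e-6)           # 1/r spreading loss
    out = np.zeros_like(src_signal)
    if delay < len(src_signal):
        out[delay:] = atten * src_signal[:len(src_signal) - delay]
    return out

# Example: a 200 Hz tone from a noise source 1 m from the target position.
fs = 16000
t = np.arange(fs) / fs
src = np.sin(2 * np.pi * 200 * t)
noise_at_ear = estimate_noise_at_target(src, [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], fs)
```

A real device would combine several such source estimates and replace the free-field model with measured propagation paths.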
  • using the microphone array to estimate the sound field of the target spatial location may include constructing a virtual microphone based on the microphone array, the virtual microphone including a mathematical model or a machine learning model that represents the audio data a microphone would collect if it were placed at the target spatial position, and estimating the sound field of the target spatial position based on the virtual microphone.
  • the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimation of the target spatial location may include estimating the noise at the target spatial location based on the virtual microphone, and generating the noise reduction signal based on the estimated noise and the sound field estimate of the target spatial location.
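One hedged way to realize such a "mathematical model" for the virtual microphone is an ordinary least-squares mapping, calibrated with a temporary reference microphone placed at the target spatial position. The variable names and the linear-mixing assumption are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Calibration phase: M array microphones plus a temporary reference
# microphone placed at the target spatial position (removed afterwards).
M, n = 4, 2000
array_sigs = rng.standard_normal((n, M))       # signals at the real array
true_mix = np.array([0.5, -0.2, 0.3, 0.1])     # unknown acoustic mixing (toy)
ref_sig = array_sigs @ true_mix                # signal at the target position

# Fit the virtual microphone: weights mapping the array signals to the
# target-position signal in the least-squares sense.
weights, *_ = np.linalg.lstsq(array_sigs, ref_sig, rcond=None)

# Inference phase: estimate the target-position signal from the array alone.
virtual_sig = array_sigs @ weights
```

In practice the mapping is frequency dependent and would be fit per band (or by a small learned model), but the calibrate-then-infer structure is the same.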
  • the at least one speaker may be a bone conduction speaker.
  • the interference signal may include a sound leakage signal and a vibration signal of the bone conduction speaker.
  • the target area may be an area where the total energy of the sound leakage signal and the vibration signal of the bone conduction speaker transmitted to the microphone array is the smallest.
  • the location of the target area may be related to the orientation of the diaphragms of the microphones in the microphone array.
  • the orientation of the diaphragm of the microphone can reduce the magnitude of the vibration signal of the bone conduction speaker received by the microphone.
  • the diaphragm of the microphone is oriented such that the vibration signal of the bone conduction speaker received by the microphone and the sound leakage signal of the bone conduction speaker received by the microphone at least partially cancel each other.
  • the vibration signal of the bone conduction speaker received by the microphone can reduce the sound leakage signal of the bone conduction speaker received by the microphone by 5-6 dB.
  • the at least one speaker may be an air conduction speaker.
  • the target area may be an area where the sound pressure level of the radiated sound field of the air conduction speaker is at a minimum.
  • the processor may be further configured to process the noise reduction signal based on a transfer function.
  • the transfer function may include a first transfer function and a second transfer function.
  • the first transfer function may represent a change in a parameter of the target signal from the at least one loudspeaker to a location where the target signal and the ambient noise cancel.
  • the second transfer function may represent a change in a parameter of the ambient noise from the target spatial location to a location where the target signal and the ambient noise cancel.
  • the at least one speaker may be further configured to output the target signal based on the processed noise reduction signal.
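The transfer-function processing described above can be sketched in the frequency domain: if H1 is the first transfer function (speaker to cancellation point) and H2 the second (target spatial location to cancellation point), requiring the target signal and the noise to cancel at that point gives a drive spectrum D = -N·H2/H1. This is an assumed formulation for illustration, not taken verbatim from the application:

```python
import numpy as np

def compensate(noise_spec, h1, h2):
    """Compensate the noise reduction signal with the two transfer functions.

    h1: speaker -> cancellation point (first transfer function)
    h2: target spatial location -> cancellation point (second transfer function)
    Requiring D*h1 + N*h2 = 0 at the cancellation point gives the drive
    spectrum D below."""
    return -noise_spec * h2 / h1

# Toy spectra on three frequency bins.
n_spec = np.array([1.0 + 1.0j, 2.0 + 0.0j, -0.5j])
h1 = np.array([0.8 + 0.0j, 0.5 + 0.1j, 0.9 + 0.0j])
h2 = np.array([1.0 + 0.0j, 0.7 + 0.0j, 0.6 - 0.2j])
d_spec = compensate(n_spec, h1, h2)

# At the cancellation point the compensated target signal and the noise sum to zero.
residual = d_spec * h1 + n_spec * h2
```

A deployed system would regularize the division where H1 is small to avoid boosting the drive signal at frequencies the speaker reproduces poorly.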
  • the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimation of the target spatial location may include dividing the picked-up ambient noise into a plurality of frequency bands corresponding to different frequency ranges, and generating, for at least one of the plurality of frequency bands, a noise reduction signal corresponding to that frequency band.
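A minimal sub-band sketch, assuming an FFT-based band split (the application does not specify the filter bank): the picked-up noise is divided into frequency bands and a phase-inverted noise reduction signal is generated per band:

```python
import numpy as np

def subband_noise_reduction(noise, fs, band_edges):
    """Divide the picked-up noise into frequency bands (ideal FFT masks)
    and generate a phase-inverted noise reduction signal for each band."""
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(len(noise), 1.0 / fs)
    band_signals = []
    for lo, hi in band_edges:
        band_spec = np.where((freqs >= lo) & (freqs < hi), spec, 0.0)
        band = np.fft.irfft(band_spec, n=len(noise))
        band_signals.append(-band)       # anti-phase signal for this band
    return band_signals

# Two tones, one per band; summing all per-band signals cancels the noise.
fs = 8000
t = np.arange(fs) / fs
noise = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
bands = subband_noise_reduction(noise, fs, [(50.0, 500.0), (500.0, 2000.0)])
residual = noise + sum(bands)
```

Per-band generation lets each band carry its own gain and phase correction, which is why the application treats the bands independently.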
  • the processor may be further configured to make amplitude and phase adjustments to noise at the target spatial location based on the sound field estimate of the target spatial location to generate the noise reduction signal.
  • the acoustic device may further include a securing structure configured to secure the acoustic device in a position adjacent to the user's ear without blocking the user's ear canal.
  • the acoustic device may further include a housing structure configured to carry or house the microphone array, the processor, and the at least one speaker.
  • the noise reduction method may include picking up ambient noise by a microphone array.
  • the noise reduction method may include estimating, by a processor, a sound field of a target spatial location using the microphone array.
  • the target spatial location may be closer to the user's ear canal than any microphone in the microphone array.
  • the noise reduction method may include generating a noise reduction signal based on the picked-up ambient noise and a sound field estimate of the target spatial location.
  • the noise reduction method may further include outputting, by at least one speaker, a target signal according to the noise reduction signal.
  • the target signal can be used to reduce the ambient noise.
  • the microphone array may be positioned in a target area so as to minimize the interference signals transmitted from the at least one speaker to the microphone array.
  • FIG. 1 is a schematic structural diagram of an exemplary acoustic device according to some embodiments of the present application
  • FIG. 2 is a schematic structural diagram of an exemplary processor according to some embodiments of the present application.
  • FIG. 3 is an exemplary noise reduction flowchart of an acoustic device according to some embodiments of the present application.
  • FIG. 4 is an exemplary noise reduction flowchart of an acoustic device according to some embodiments of the present application.
  • FIGS. 5A-5D are schematic diagrams of exemplary arrangements of microphone arrays according to some embodiments of the present application.
  • FIGS. 6A-6B are schematic diagrams of exemplary arrangements of microphone arrays according to some embodiments of the present application.
  • FIG. 7 is an exemplary flowchart of estimating noise at a spatial location of a target according to some embodiments of the present application.
  • FIG. 8 is a schematic diagram of estimating noise at a spatial position of a target according to some embodiments of the present application.
  • FIG. 9 is an exemplary flowchart of estimating the sound field and noise of a target spatial position according to some embodiments of the present application.
  • FIG. 10 is a schematic diagram of constructing a virtual microphone according to some embodiments of the present application.
  • FIG. 11 is a schematic diagram of the sound leakage signal distribution of the three-dimensional sound field at 1000 Hz of the bone conduction speaker according to some embodiments of the present application.
  • FIG. 12 is a schematic diagram illustrating the distribution of sound leakage signals in a two-dimensional sound field at 1000 Hz of a bone conduction speaker according to some embodiments of the present application.
  • FIG. 13 is a schematic diagram of the frequency response of the total signal of the vibration signal and the sound leakage signal of the bone conduction speaker according to some embodiments of the present application.
  • FIGS. 14A-B are schematic diagrams of sound field distribution of an air conduction speaker according to some embodiments of the present application.
  • FIG. 15 is an exemplary flowchart of outputting a target signal based on a transfer function according to some embodiments of the present application.
  • FIG. 16 is an exemplary flowchart of estimating noise at a spatial location of a target according to some embodiments of the present application.
  • the terms "system", "device", "unit", and/or "module" are used herein as a means to distinguish different components, elements, parts, or assemblies at different levels.
  • An open acoustic device, such as an open acoustic earphone, is an acoustic device that leaves the user's ear open.
  • the open acoustic device can fix the speaker at a position near the user's ear through a fixing structure (eg, an ear hook, a head hook, a glasses temple, etc.) without blocking the user's ear canal.
  • the noise from the external environment enters the user's ear canal directly, so the user hears loud ambient noise, which can interfere with the user's music listening experience.
  • when a user wears an open acoustic device to make a call, the microphone picks up not only the user's own voice but also ambient noise, which degrades the user's call experience.
  • the acoustic device may include a microphone array, a processor, and at least one speaker.
  • the microphone array can be configured to pick up ambient noise.
  • the processor may be configured to estimate the sound field of the target spatial location using the microphone array.
  • the target spatial location may be closer to the user's ear canal than any microphone in the microphone array. It can be understood that each microphone in the microphone array may be distributed in different positions near the user's ear canal, and each microphone in the microphone array is used to estimate the sound field near the user's ear canal position (eg, a target spatial position).
  • the processor may be further configured to generate a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location.
  • the at least one speaker may be configured to output the target signal according to the noise reduction signal.
  • the target signal can be used to reduce ambient noise.
  • the microphone array may be positioned in the target area to minimize the microphone array's exposure to interfering signals from the at least one loudspeaker.
  • the interference signal may include a sound leakage signal and a vibration signal of the bone conduction speaker.
  • the target area may be an area where the total energy of the sound leakage signal and the vibration signal of the bone conduction speaker transmitted to the microphone array is the smallest.
  • the at least one speaker may be an air conduction speaker.
  • the target area may be an area where the sound pressure level of the radiated sound field of the air conduction speaker is at a minimum.
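Under a simplified point-source model of the speaker's interference (an assumption for illustration, with hypothetical function names), choosing the target area amounts to evaluating candidate microphone positions and keeping the one with the least interference energy:

```python
import numpy as np

def pick_target_area(speaker_pos, candidate_positions):
    """Pick the candidate microphone position where the interference energy
    transmitted from the speaker is smallest, modeling the speaker's leakage
    as a point source whose amplitude decays as 1/r (energy as 1/r^2)."""
    energies = []
    for pos in candidate_positions:
        r = np.linalg.norm(np.asarray(pos, float) - np.asarray(speaker_pos, float))
        energies.append(1.0 / max(r, 1e-6) ** 2)
    best = int(np.argmin(energies))
    return best, energies

# Candidate positions at 1 cm, 3 cm, and 5 cm from the speaker.
speaker = [0.0, 0.0, 0.0]
candidates = [[0.01, 0.0, 0.0], [0.03, 0.0, 0.0], [0.05, 0.0, 0.0]]
best_idx, energies = pick_target_area(speaker, candidates)
```

For a bone conduction speaker the energy term would instead sum the measured leakage and vibration contributions at each candidate position, as the bullets above describe.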
  • through the above settings, the target signal output by the at least one speaker reduces the ambient noise at the user's ear canal (for example, at the target spatial position), thereby realizing active noise reduction of the acoustic device and improving the user's listening experience while using it.
  • the microphone array (also referred to as a feedforward microphone) can simultaneously realize the pickup of ambient noise and the estimation of the sound field at the user's ear canal (eg, the target spatial position).
  • the microphone array is arranged in the target area, which reduces or prevents the microphone array from picking up the interference signal (for example, the target signal) emitted by at least one speaker, thereby ensuring the active noise reduction of the open acoustic device.
  • FIG. 1 is a schematic structural diagram of an exemplary acoustic device 100 according to some embodiments of the present application.
  • the acoustic device 100 may be an open acoustic device.
  • the acoustic device 100 may include a microphone array 110 , a processor 120 and a speaker 130 .
  • the microphone array 110 can pick up ambient noise, and convert the picked-up ambient noise into electrical signals and transmit them to the processor 120 for processing.
  • the processor 120 may couple (eg, electrically connect) the microphone array 110 and the speaker 130 .
  • the processor 120 may receive and process the electrical signals communicated by the microphone array 110 to generate a noise reduction signal and communicate the generated noise reduction signal to the speaker 130 .
  • the speaker 130 may output the target signal according to the noise reduction signal.
  • the target signal can be used to reduce or cancel the ambient noise at the user's ear canal position (eg, the target spatial position), so as to realize active noise reduction of the acoustic device 100 and improve the user's listening experience during the use of the acoustic device 100 .
  • ambient noise may refer to a combination of various external sounds in the environment in which the user is located.
  • ambient noise may include one or more of traffic noise, industrial noise, building construction noise, social noise, and the like.
  • Traffic noise may include, but is not limited to, driving noise of motor vehicles, whistle noise, and the like.
  • Industrial noise may include, but is not limited to, factory power machinery operating noise, and the like.
  • Building construction noise may include, but is not limited to, power machinery excavation noise, hole drilling noise, stirring noise, and the like.
  • Social living environment noise may include, but is not limited to, crowd assembly noise, entertainment and publicity noise, crowd noise, household appliance noise, and the like.
  • the microphone array 110 may be disposed near the user's ear canal to pick up the ambient noise transmitted to the user's ear canal, and convert the picked-up ambient noise into electrical signals and transmit them to the processor 120 for processing.
  • the microphone array 110 may be positioned at the user's left and/or right ear.
  • the microphone array 110 may include a first sub-microphone array and a second sub-microphone array. The first sub-microphone array may be located at the user's left ear, and the second sub-microphone array may be located at the user's right ear. The first sub-microphone array and the second sub-microphone array may enter the working state at the same time or one of the two may enter the working state.
  • the ambient noise may include the sound of the user speaking.
  • the microphone array 110 may pick up ambient noise according to the talking state of the acoustic device 100 .
  • the sound produced by the user's own speech can be regarded as environmental noise, and the microphone array 110 can simultaneously pick up the user's own speech and other environmental noises.
  • the microphone array 110 may pick up ambient noise in addition to the user's own speaking sound.
  • the microphone array 110 may pick up noise emanating from a noise source located a certain distance (eg, 0.5 meters, 1 meter) away from the microphone array 110 .
  • the microphone array 110 may include one or more air conduction microphones.
  • the air conduction microphone can simultaneously acquire the noise of the external environment and the voice of the user when speaking, and use the acquired noise of the external environment and the voice of the user as the ambient noise.
  • the microphone array 110 may also include one or more bone conduction microphones. The bone conduction microphone can directly contact the user's skin, and the vibration signal generated by the bones or muscles when the user speaks can be directly transmitted to the bone conduction microphone, and then the bone conduction microphone converts the vibration signal into an electrical signal, and transmits the electrical signal to the processor 120 to be processed.
  • the bone conduction microphone may also not be in direct contact with the human body, and the vibration signal generated by the bones or muscles when the user speaks can be first transmitted to the casing structure of the acoustic device 100, and then transmitted to the bone conduction microphone by the casing structure.
  • the processor 120 may use the sound signal collected by the air conduction microphone as environmental noise and use the environmental noise for noise reduction, and the sound signal collected by the bone conduction microphone may be transmitted to the terminal device as a voice signal, so as to ensure the call quality of the user during the call.
  • the processor 120 may control the switch states of the bone conduction microphone and the air conduction microphone based on the working state of the acoustic device 100 .
  • the working state of the acoustic device 100 may refer to the usage state used when the user wears the acoustic device 100 .
  • the working state of the acoustic device 100 may include, but is not limited to, a talking state, a non-calling state (eg, a music playing state), a voice message sending state, and the like.
  • the on/off state of the bone conduction microphone and the on/off state of the air conduction microphone in the microphone array 110 may be determined according to the working state of the acoustic device 100 .
  • when the acoustic device 100 is in a non-calling state (eg, a music playing state), the switch state of the bone conduction microphone may be the standby state, and the switch state of the air conduction microphone may be the working state.
  • when the acoustic device 100 is in a talking state, the switch state of the bone conduction microphone may be the working state, and the switch state of the air conduction microphone may be the working state.
  • the processor 120 may control the on/off state of the microphones (eg, bone conduction microphones, air conduction microphones) in the microphone array 110 by sending a control signal.
  • the processor 120 may control the bone conduction microphone to be in a standby state and the air conduction microphone to be in a working state.
  • the sound signal of the user speaking by himself may be regarded as environmental noise.
  • the voice signal of the user's own speech included in the ambient noise picked up by the air conduction microphone may not be filtered out, so that the user's own voice, as a part of the ambient noise, is also canceled by the target signal output by the speaker 130.
  • the processor 120 may control the bone conduction microphone to be in the working state and the air conduction microphone to be in the working state.
  • the processor 120 may send a control signal to keep the bone conduction microphone in the working state; the bone conduction microphone picks up the sound signal of the user's speech, and the processor 120 removes that signal from the ambient noise picked up by the air conduction microphone, so that the user's own voice does not cancel the target signal output by the speaker 130, thereby ensuring a normal call state for the user.
  • the processor 120 may control the bone conduction microphone to maintain the working state.
  • the sound pressure of ambient noise can reflect the intensity of ambient noise.
  • the preset threshold here may be a value pre-stored in the acoustic device 100, for example, 50 dB, 60 dB, 70 dB, or any other value. When the sound pressure of the ambient noise is greater than the preset threshold, the ambient noise will affect the call quality of the user.
  • the processor 120 can control the bone conduction microphone to maintain a working state by sending a control signal; the bone conduction microphone acquires the vibration signal of the user's facial muscles when speaking while picking up essentially no external environmental noise, and this vibration signal is used as the voice signal during the call, thereby ensuring a normal call for the user.
  • the processor 120 may control the bone conduction microphone to switch from the working state to the standby state.
  • when the sound pressure of the environmental noise is less than the preset threshold, it is smaller than the sound pressure of the sound signal generated by the user's speech that is transmitted to a certain position of the user's ear through the first sound path.
  • the processor 120 can control the bone conduction microphone to switch from the working state to the standby state by sending a control signal, thereby reducing the complexity of signal processing and the power consumption of the acoustic device.
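The threshold logic above can be sketched as follows; the 60 dB default, the function names, and the RMS-based SPL estimate are illustrative assumptions rather than values from the application:

```python
import numpy as np

P_REF = 20e-6      # reference sound pressure (20 micropascals)

def spl_db(pressure_samples):
    """RMS-based sound pressure level of the ambient noise in dB SPL."""
    rms = np.sqrt(np.mean(np.square(pressure_samples)))
    return 20.0 * np.log10(max(rms, 1e-12) / P_REF)

def bone_mic_state(pressure_samples, threshold_db=60.0):
    """Keep the bone conduction microphone working in loud environments and
    switch it to the standby state when ambient noise falls below the
    preset threshold."""
    return "working" if spl_db(pressure_samples) > threshold_db else "standby"

# Synthetic pressure traces at roughly 40 dB SPL and 75 dB SPL.
quiet = np.full(1000, P_REF * 10 ** (40 / 20))
loud = np.full(1000, P_REF * 10 ** (75 / 20))
```

A real implementation would smooth the SPL estimate over time and add hysteresis so the microphone does not toggle around the threshold.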
  • the microphone array 110 may include dynamic microphones, ribbon microphones, condenser microphones, electret microphones, electromagnetic microphones, carbon microphones, etc., or any combination thereof, according to the working principle of the microphones.
  • the arrangement of the microphone array 110 may include a linear array (eg, linear, curved), a planar array (eg, cross, circle, ring, polygon, mesh, etc., regular and/or irregular shapes), stereoscopic arrays (eg, cylindrical, spherical, hemispherical, polyhedral, etc.), etc., or any combination thereof.
  • the processor 120 may be configured to use the microphone array 110 to estimate the sound field of the target spatial location.
  • the sound field of a target spatial location may refer to the distribution and variation of sound waves at or near the target spatial location (eg, as a function of time, as a function of location).
  • the physical quantities describing the sound field may include sound pressure, sound frequency, sound amplitude, sound phase, sound source vibration velocity, or medium (eg air) density, and the like. In general, these physical quantities can be functions of position and time.
  • the target spatial location may refer to a spatial location close to the user's ear canal by a specific distance. The target spatial location may be closer to the user's ear canal than any microphone in the microphone array 110 .
  • the specific distance here may be a fixed distance, for example, 0.5 cm, 1 cm, 2 cm, 3 cm, and the like.
  • the target spatial position may be related to the number of each microphone in the microphone array 110 and the distribution position relative to the user's ear canal.
  • the target spatial position can be adjusted by adjusting the number of the microphones in the microphone array 110 and/or the distribution position relative to the user's ear canal. For example, by increasing the number of microphones in the microphone array 110, the target spatial location can be brought closer to the user's ear canal.
  • the target spatial position can also be made closer to the user's ear canal by reducing the distance between the microphones in the microphone array 110 .
  • the arrangement of the microphones in the microphone array 110 can also be changed to make the target spatial position closer to the user's ear canal.
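A hedged delay-and-sum sketch of how an array might estimate the waveform at a target position closer to the ear canal than any physical microphone, assuming a single plane wave of known direction (the function name and geometry are illustrative):

```python
import numpy as np

def estimate_at_target(mic_sigs, mic_positions, target_pos, direction, fs, c=343.0):
    """Delay-and-sum estimate of the waveform at the target spatial position.

    Assuming a single plane wave arriving from `direction`, each microphone
    signal is time-shifted by the extra travel time from that microphone to
    the target, then the aligned signals are averaged."""
    d = np.asarray(direction, float)
    d = d / np.linalg.norm(d)
    est = np.zeros(mic_sigs.shape[1])
    for sig, pos in zip(mic_sigs, mic_positions):
        extra = np.dot(np.asarray(target_pos, float) - np.asarray(pos, float), d) / c
        est += np.roll(sig, int(round(extra * fs)))
    return est / len(mic_positions)

# Geometry chosen so one sample corresponds to 1 cm of travel (fs = 100 * c).
fs, c = 34300, 343.0
t = np.arange(fs) / fs
s = np.sin(2 * np.pi * 100.0 * t)                 # 100 Hz plane wave
mic_positions = [[0.00, 0.0, 0.0], [0.02, 0.0, 0.0]]
mic_sigs = np.stack([s, np.roll(s, 2)])           # wave travels in +x, hits mic 0 first
estimate = estimate_at_target(mic_sigs, mic_positions, [0.01, 0.0, 0.0],
                              direction=[1.0, 0.0, 0.0], fs=fs)
```

Averaging more microphones, or placing them closer together, tightens this estimate — which matches the bullets above about adding microphones or reducing their spacing to move the target spatial position closer to the ear canal.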
  • the processor 120 may be further configured to generate a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location. Specifically, the processor 120 may receive and process the ambient noise-converted electrical signal transmitted by the microphone array 110 to obtain parameters (eg, amplitude, phase, etc.) of the ambient noise. The processor 120 may further adjust parameters of the ambient noise (eg, amplitude, phase, etc.) based on the sound field estimate of the target spatial location to generate a noise reduction signal. The parameters of the noise reduction signal (eg, amplitude, phase, etc.) correspond to parameters of the ambient noise.
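In the simplest sketch of this amplitude-and-phase adjustment (assuming the gain has already been derived from the sound field estimate; the function name is hypothetical), negating the estimated noise inverts its phase by 180 degrees at every frequency:

```python
import numpy as np

def noise_reduction_signal(noise_estimate, gain=1.0):
    """Generate the noise reduction signal from the estimated noise at the
    target spatial position: negating the waveform inverts the phase by
    180 degrees at every frequency, and the gain trims the amplitude so the
    played-back target signal matches the noise level at that position."""
    return -gain * np.asarray(noise_estimate, float)

fs = 8000
t = np.arange(fs) / fs
noise = 0.8 * np.sin(2 * np.pi * 300.0 * t)      # estimated noise at the target
anti = noise_reduction_signal(noise)
residual = noise + anti                           # what remains after cancellation
```

In practice the gain and an additional per-frequency phase correction vary with frequency, which is why the application also describes sub-band processing and transfer-function compensation.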
  • the processor 120 may include hardware modules and software modules.
  • the hardware module may include a digital signal processing (Digital Signal Processor, DSP) chip, an advanced reduced instruction set machine (Advanced RISC Machines, ARM), and the software module may include an algorithm module.
  • the speaker 130 may be configured to output the target signal according to the noise reduction signal.
  • the target signal may be used to reduce or cancel ambient noise delivered to a certain location of the user's ear (eg, tympanic membrane, basilar membrane).
  • the speaker 130 may be located near the user's ear.
  • the speaker 130 may include one or more of a dynamic speaker (eg, a moving coil speaker), a magnetic speaker, an ion speaker, an electrostatic speaker (or a condenser speaker), a piezoelectric speaker, and the like.
  • the speaker 130 may include an air conduction speaker and/or a bone conduction speaker depending on how the sound output by the speaker propagates.
  • the number of speakers 130 may be one or more. When the number of speakers 130 is one, the speaker 130 can be used both to output the target signal to eliminate ambient noise and to deliver the sound information that the user needs to hear (eg, device media audio, call far-end audio). For example, when the number of speakers 130 is one and it is an air conduction speaker, the air conduction speaker can be used to output a target signal to cancel ambient noise.
  • the target signal may be a sound wave (ie, vibration of air), which may be transmitted through the air to the target spatial location and cancel each other with ambient noise at the target spatial location.
  • the air conduction speaker can also be used to transmit the sound information that the user needs to hear to the user.
  • the bone conduction speaker can be used to output the target signal to eliminate ambient noise.
  • the target signal may be a vibration signal (eg, vibration of a speaker housing) that may be transmitted through bone or tissue to the user's basilar membrane and cancels out ambient noise at the user's basilar membrane.
  • the bone conduction speaker can also be used to transmit the sound information that the user needs to hear to the user.
  • a part of the multiple speakers 130 can be used to output the target signal to eliminate ambient noise, and the other part can be used to transmit the sound information that the user needs to listen to (for example, device media audio, call far-end audio).
  • the air conduction speakers can be used to output sound waves to reduce or eliminate ambient noise, and the bone conduction speakers can be used to transmit the sound information that the user needs to hear to the user .
  • bone conduction speakers can directly transmit mechanical vibrations through the user's body (eg, bones, skin tissue, etc.) to the user's auditory nerves, and the interference with the air conduction microphones that pick up ambient noise is relatively small during this process.
  • the speaker 130 may be an independent functional device, or may be part of a single device capable of implementing multiple functions.
  • the speaker 130 may be integrated and/or formed in one piece with the processor 120 .
  • the arrangement of the multiple speakers 130 may include linear arrays (eg, straight, curved), planar arrays (eg, cross, mesh, circular, annular, polygonal and other regular and/or irregular shapes), three-dimensional arrays (eg, cylindrical, spherical, hemispherical, polyhedron, etc.), etc., or any combination thereof, which is not limited herein.
  • the speaker 130 may be positioned at the user's left and/or right ear.
  • the speaker 130 may include a first sub-speaker and a second sub-speaker.
  • the first sub-speaker may be located at the user's left ear
  • the second sub-speaker may be located at the user's right ear.
  • the first sub-speaker and the second sub-speaker may enter the working state at the same time or one of the two may enter the working state.
  • the speaker 130 may be a speaker with a directional sound field, the main lobe of which is directed to the user's ear canal.
  • the acoustic device 100 may also include one or more sensors 140 .
  • One or more sensors 140 may be electrically connected to other components of acoustic device 100 (eg, processor 120).
  • One or more sensors 140 may be used to obtain physical location and/or motion information of the acoustic device 100.
  • the one or more sensors 140 may include an Inertial Measurement Unit (IMU), a Global Positioning System (GPS), a radar, and the like.
  • the motion information may include motion trajectory, motion direction, motion speed, motion acceleration, motion angular velocity, motion-related time information (eg, motion start time, end time), etc., or any combination thereof.
  • the IMU may include a Microelectro Mechanical System (MEMS).
  • the microelectromechanical system may include multi-axis accelerometers, gyroscopes, magnetometers, etc., or any combination thereof.
  • the IMU may be used to detect the physical location and/or motion information of the acoustic device 100 to enable control of the acoustic device 100 based on the physical location and/or motion information. More information on the control of the acoustic device 100 based on physical location and/or motion information can be found elsewhere in this application, eg, FIG. 4 and its corresponding description.
  • the acoustic device 100 may include a signal transceiver 150 .
  • the signal transceiver 150 may be electrically connected with other components of the acoustic device 100 (eg, the processor 120).
  • the signal transceiver 150 may include Bluetooth, an antenna, and the like.
  • the acoustic device 100 may communicate with other external devices (eg, mobile phones, tablet computers, smart watches) through the signal transceiver 150 .
  • the acoustic device 100 may wirelessly communicate with other devices via Bluetooth.
  • the acoustic device 100 may include a housing structure 160 .
  • Housing structure 160 may be configured to carry other components of acoustic device 100 (eg, microphone array 110, processor 120, speaker 130, one or more sensors 140, signal transceiver 150).
  • the housing structure 160 may be a closed or semi-closed structure with a hollow interior, and the other components of the acoustic device 100 are located in or on the housing structure.
  • the shape of the housing structure may be a regular or irregular three-dimensional structure such as a rectangular parallelepiped, a cylinder, or a truncated cone. When the user wears the acoustic device 100, the housing structure may be located close to the user's ear.
  • the housing structure may be located on the peripheral side (eg, the front side or the back side) of the user's pinna.
  • the housing structure may be located on the user's ear without blocking or covering the user's ear canal.
  • the acoustic device 100 may be a bone conduction earphone, and at least one side of the housing structure may be in contact with the user's skin.
  • acoustic drivers (eg, vibrating speakers) in bone conduction headphones convert audio signals into mechanical vibrations that can be transmitted to the user's auditory nerves through the housing structure and the user's bones.
  • the acoustic device 100 may be an air conduction earphone, and at least one side of the housing structure may or may not be in contact with the user's skin.
  • the side wall of the housing structure includes at least one sound guide hole, and the speaker in the air conduction earphone converts the audio signal into the air conduction sound, and the air conduction sound can be radiated toward the user's ear through the sound guide hole.
  • the acoustic device 100 may include a fixed structure 170 .
  • the securing structure 170 may be configured to secure the acoustic device 100 in a position near the user's ear and without blocking the user's ear canal.
  • the securing structure 170 may be physically connected (eg, snapped, screwed, etc.) with the housing structure 160 of the acoustic device 100 .
  • the housing structure 160 of the acoustic device 100 may be part of the fixed structure 170 .
  • the fixing structure 170 may include ear hooks, back hooks, elastic bands, temples, etc., so that the acoustic device 100 can be better fixed near the user's ears and prevented from falling off during use.
  • the securing structure 170 may be an earhook, which may be configured to be worn around the ear area.
  • the earhook can be a continuous hook that can be elastically stretched to fit on the user's ear; the earhook can also apply pressure to the user's pinna, so that the acoustic device 100 is firmly fixed to a specific location on the user's ear or head.
  • the earhook may be a discontinuous band.
  • an earhook may include a rigid portion and a flexible portion.
  • the rigid part may be made of rigid material (eg, plastic or metal), and the rigid part may be fixed to the housing structure 160 of the acoustic device 100 by means of a physical connection (eg, snap-fit, screw connection, etc.).
  • the flexible portion may be made of an elastic material (eg, cloth, composite material, and/or neoprene).
  • the securing structure 170 may be a neckband configured to be worn around the neck/shoulder area.
  • the fixing structure 170 may be an eyeglass leg, which, as a part of the eyeglasses, is erected on the user's ear.
  • the acoustic device 100 may further include an interaction module (not shown) for adjusting the sound pressure of the target signal.
  • the interaction module may include buttons, voice assistants, gesture sensors, and the like.
  • the user can adjust the noise reduction mode of the acoustic device 100 by controlling the interaction module. Specifically, the user can adjust (eg, amplify or reduce) the amplitude information of the noise reduction signal by controlling the interaction module, so as to change the sound pressure of the target signal emitted by the speaker 130, thereby achieving different noise reduction effects.
  • the noise reduction mode may include a strong noise reduction mode, a medium noise reduction mode, a weak noise reduction mode, and the like.
  • when the user wears the acoustic device 100 indoors, where external environmental noise is low, the user can turn off noise reduction or switch the acoustic device 100 to the weak noise reduction mode through the interaction module.
  • when the user wears the acoustic device 100 while walking in public places such as the street, the user needs to maintain a certain ability to perceive the surrounding environment while listening to audio signals (eg, music, voice information) in order to cope with emergencies. At this time, the user can select the medium noise reduction mode through the interaction module (for example, a button or a voice assistant) to preserve some surrounding ambient sounds (eg, sirens, impacts, car horns, etc.).
  • the processor 120 may also send prompt information to the acoustic device 100 or to a terminal device (eg, a mobile phone, a smart watch, etc.) communicatively connected to the acoustic device 100 based on the ambient noise intensity range, so as to remind the user to adjust the noise reduction mode.
  • components of the acoustic device 100 (eg, the one or more sensors 140, the signal transceiver 150, the fixing structure 170, the interaction module, etc.) may be replaced by other elements that perform similar functions.
  • the acoustic device 100 may not include the fixing structure 170, and the housing structure 160 or a portion thereof may have a shape compatible with the human ear (eg, circular, oval, regular or irregular polygonal, U-shaped, V-shaped, or semi-circular) so that the housing structure can hang near the user's ear.
  • a component in acoustic device 100 may be split into multiple sub-components, or multiple components may be combined into a single component.
  • FIG. 2 is a schematic structural diagram of an exemplary processor 120 according to some embodiments of the present application.
  • the processor 120 may include an analog-to-digital conversion unit 210 , a noise estimation unit 220 , an amplitude-phase compensation unit 230 and a digital-to-analog conversion unit 240 .
  • the analog-to-digital conversion unit 210 may be configured to convert the signal input by the microphone array 110 into a digital signal. Specifically, the microphone array 110 picks up the environmental noise, and converts the picked-up environmental noise into electrical signals and transmits them to the processor 120 . After receiving the electrical signal of environmental noise sent by the microphone array 110, the analog-to-digital conversion unit 210 may convert the electrical signal into a digital signal. In some embodiments, the analog-to-digital conversion unit 210 may be electrically connected to the microphone array 110 and further to other components of the processor 120 (eg, the noise estimation unit 220). Further, the analog-to-digital conversion unit 210 may transfer the converted digital signal of environmental noise to the noise estimation unit 220 .
  • the noise estimation unit 220 may be configured to estimate the ambient noise from the received digital signal of the ambient noise.
  • the noise estimation unit 220 may estimate the relevant parameters of the environmental noise at the target spatial location according to the received digital signal of the environmental noise.
  • the parameters may include a noise source (eg, location, orientation), transfer direction, magnitude, phase, etc., of the noise at the target spatial location, or any combination thereof.
  • the noise estimation unit 220 may also be configured to use the microphone array 110 to estimate the sound field of the target spatial location. For more information on estimating the sound field of the target spatial location, reference can be made elsewhere in this application, eg, FIG. 4 and its corresponding description.
  • the noise estimation unit 220 may be electrically connected with other components of the processor 120 (eg, the amplitude and phase compensation unit 230). Further, the noise estimation unit 220 may transfer the estimated parameters related to environmental noise and the sound field of the target spatial position to the amplitude and phase compensation unit 230 .
  • the amplitude and phase compensation unit 230 may be configured to compensate the estimated ambient noise related parameters according to the sound field of the target spatial location. For example, the amplitude and phase compensation unit 230 may compensate the amplitude and phase of the ambient noise according to the sound field of the target spatial position to obtain a digital noise reduction signal. In some embodiments, the amplitude and phase compensation unit 230 may adjust the amplitude of the ambient noise and inversely compensate the phase of the ambient noise to obtain a digital noise reduction signal. The amplitude of the digital noise reduction signal may be approximately equal to the amplitude of the digital signal corresponding to the environmental noise, and the phase of the digital noise reduction signal may be approximately opposite to the phase of the digital signal corresponding to the environmental noise.
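As a minimal sketch of the compensation step described above (an illustration, not the patent's implementation), adjusting the amplitude and inversely compensating the phase can be done in the frequency domain by scaling and negating each spectral bin; the `gain` parameter is an assumed tuning factor:

```python
import numpy as np

def compensate(noise_digital, gain=1.0):
    """Illustrative amplitude/phase compensation: scale the noise
    spectrum (amplitude adjustment) and invert the phase of every bin
    so the resulting signal cancels the noise when the two are summed."""
    spectrum = np.fft.rfft(noise_digital)
    # Negating each bin inverts its phase; `gain` tunes the amplitude match.
    anti_spectrum = -gain * spectrum
    return np.fft.irfft(anti_spectrum, n=len(noise_digital))

# The noise plus the compensated signal should be approximately zero.
t = np.linspace(0, 1, 512, endpoint=False)
noise = np.sin(2 * np.pi * 50 * t)
anti = compensate(noise)
residual = float(np.max(np.abs(noise + anti)))
```

With a perfect amplitude match (`gain=1.0`) the residual is at the level of floating-point error, which mirrors the text's statement that the digital noise reduction signal has approximately equal amplitude and approximately opposite phase.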
  • the amplitude and phase compensation unit 230 may be electrically connected with other components of the processor 120 (eg, the digital-to-analog conversion unit 240 ). Further, the amplitude and phase compensation unit 230 may transmit the digital noise reduction signal to the digital-to-analog conversion unit 240 .
  • the digital-to-analog conversion unit 240 may be configured to convert the digital noise reduction signal to an analog signal to obtain a noise reduction signal (eg, an electrical signal).
  • the digital-to-analog conversion unit 240 may perform the conversion using pulse width modulation (Pulse Width Modulation, PWM).
  • the digital-to-analog conversion unit 240 may be electrically connected with other components of the processor 120 (eg, the speaker 130). Further, the digital-to-analog conversion unit 240 may transmit the noise reduction signal to the speaker 130 .
  • the processor 120 may include a signal amplification unit 250 .
  • the signal amplifying unit 250 may be configured to amplify the input signal.
  • the signal amplifying unit 250 may amplify the signal input by the microphone array 110 .
  • the signal amplification unit 250 may be used to amplify the voice of the user input by the microphone array 110 .
  • the signal amplifying unit 250 may amplify the amplitude of the ambient noise according to the sound field of the target spatial position.
  • the signal amplification unit 250 may be electrically connected with other components of the processor 120 (eg, the microphone array 110, the noise estimation unit 220, the amplitude and phase compensation unit 230).
  • processor 120 may be omitted.
  • a component in processor 120 may be split into multiple sub-components, or multiple components may be combined into a single component.
  • the noise estimation unit 220 and the amplitude and phase compensation unit 230 may be integrated into one component for realizing the functions of the noise estimation unit 220 and the amplitude and phase compensation unit 230 .
  • process 300 may be performed by acoustic device 100 . As shown in FIG. 3, the process 300 may include:
  • step 310 ambient noise is picked up. In some embodiments, this step may be performed by microphone array 110 .
  • ambient noise may refer to a combination of various external sounds (eg, traffic noise, industrial noise, building construction noise, social noise) in the environment where the user is located.
  • the microphone array 110 may be located near the user's ear canal for picking up ambient noise delivered to the user's ear canal. Further, the microphone array 110 can convert the picked-up ambient noise signal into an electrical signal and transmit it to the processor 120 for processing.
  • step 320 the noise of the target spatial location is estimated based on the picked-up ambient noise. In some embodiments, this step may be performed by processor 120 .
  • the processor 120 may perform signal separation on the picked-up ambient noise.
  • the ambient noise picked up by the microphone array 110 may include various sounds.
  • the processor 120 may perform signal analysis on the ambient noise picked up by the microphone array 110 to separate the various sounds.
  • the processor 120 can adaptively adjust the parameters of a filter according to the statistical distribution characteristics and structural characteristics of the various sounds in different dimensions such as the spatial domain, the time domain, and the frequency domain, estimate the parameter information of each sound signal in the ambient noise, and complete the signal separation process according to the parameter information of each sound signal.
  • the statistical distribution characteristics of noise may include probability distribution density, power spectral density, autocorrelation function, probability density function, variance, mathematical expectation, and the like.
  • the structured features of noise may include noise distribution, noise intensity, global noise intensity, noise rate, etc., or any combination thereof.
  • the global noise intensity may refer to an average noise intensity or a weighted average noise intensity.
  • the noise rate may refer to the degree of dispersion of the noise distribution.
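As an illustrative stand-in for the adaptive filtering described above (a toy LMS canceller, not the patent's separation algorithm; the parameter names `mu` and `taps`, and the availability of a separate reference signal, are assumptions):

```python
import numpy as np

def lms_separate(mixture, reference, mu=0.02, taps=8):
    """Toy LMS adaptive filter: it learns to predict the component of
    `mixture` correlated with `reference`; the prediction error is the
    remaining (separated) component."""
    w = np.zeros(taps)
    estimate = np.zeros(len(mixture))
    for n in range(taps - 1, len(mixture)):
        x = reference[n - taps + 1:n + 1][::-1]  # most recent sample first
        estimate[n] = w @ x
        error = mixture[n] - estimate[n]
        w += mu * error * x                      # LMS weight update
    return estimate, mixture - estimate

rng = np.random.default_rng(0)
interference = rng.standard_normal(4000)
target = np.sin(2 * np.pi * 0.01 * np.arange(4000))
est_interference, recovered = lms_separate(target + interference, interference)
```

After the filter converges, `recovered` approximates the target component, showing how adaptively adjusted filter parameters can pull one sound signal out of a mixture.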
  • the ambient noise picked up by the microphone array 110 may include a first signal, a second signal, and a third signal.
  • the processor 120 obtains the differences among the first signal, the second signal, and the third signal in space (eg, the locations of the signals), the time domain (eg, delay), and the frequency domain (eg, amplitude, phase), separates the first signal, the second signal, and the third signal according to their differences in these three dimensions, and obtains relatively pure versions of the first signal, the second signal, and the third signal.
  • the processor 120 may update the ambient noise according to the parameter information (eg, frequency information, phase information, amplitude information) of the separated signal.
  • the processor 120 may determine that the first signal is the user's call sound according to the parameter information of the first signal, and remove the first signal from the ambient noise to update the ambient noise.
  • the removed first signal may be transmitted to the far end of the call.
  • the target spatial location is a location at or near the user's ear canal determined based on the microphone array 110.
  • the target spatial position may refer to a spatial position within a specific distance (eg, 0.5 cm, 1 cm, 2 cm, 3 cm) of the user's ear canal. In some embodiments, the target spatial location is closer to the user's ear canal than any of the microphones in the microphone array 110.
  • the target spatial position is related to the number of microphones in the microphone array 110 and the distribution position relative to the user’s ear canal.
  • estimating the noise at the target spatial location based on the picked-up environmental noise may further include determining one or more spatial noise sources related to the picked-up environmental noise and estimating the noise at the target spatial location based on the spatial noise sources.
  • the ambient noise picked up by the microphone array 110 may come from different azimuths and different types of spatial noise sources.
  • the parameter information eg, frequency information, phase information, and amplitude information corresponding to each spatial noise source is different.
  • the processor 120 may perform signal separation and extraction on the noise at the target spatial location according to the statistical distribution and structural features of different types of noise in different dimensions (eg, spatial domain, time domain, frequency domain, etc.), so as to obtain noise of different types (eg, different frequencies, different phases, etc.), and estimate the parameter information (eg, amplitude information, phase information, etc.) corresponding to each type of noise.
  • the processor 120 may further determine the overall parameter information of the noise at the target spatial position according to the parameter information corresponding to different types of noise at the target spatial position. For more information on estimating noise at a target spatial location based on one or more spatial noise sources, reference may be made elsewhere in the specification of this application, eg, FIGS. 7-8 and their corresponding descriptions.
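The combination of per-source estimates into overall parameter information can be sketched as follows, assuming (purely for illustration) that each source contributes a single-frequency component described by an (amplitude, phase) pair; same-frequency components add as complex phasors:

```python
import cmath

def overall_noise(sources):
    """Combine per-source noise estimates at the target spatial position.
    `sources` is an assumed list of (amplitude, phase_radians) pairs for
    one frequency; the components sum as complex phasors."""
    total = sum(a * cmath.exp(1j * p) for a, p in sources)
    return abs(total), cmath.phase(total)

# Two sources in opposite phase partially cancel each other.
amp, phase = overall_noise([(1.0, 0.0), (0.5, cmath.pi)])
```

Here the two sources yield an overall amplitude of 0.5 at phase 0, illustrating why the overall parameter information at the target position is not simply the sum of the individual amplitudes.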
  • estimating the noise at the target spatial location based on the picked-up ambient noise may further include constructing a virtual microphone based on the microphone array 110 and estimating the noise at the target spatial location based on the virtual microphone.
  • For more content about estimating the noise at the target spatial location based on the virtual microphone, reference may be made to other places in the specification of this application, such as FIGS. 9-10 and their corresponding descriptions.
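One simple way to realize a virtual microphone, used here purely as an illustrative sketch rather than the method of FIGS. 9-10, is to extrapolate the sound pressure measured by two real microphones toward the ear canal; the positions and pressure values below are made-up numbers:

```python
def virtual_mic_pressure(p1, p2, x1, x2, x_target):
    """Hypothetical first-order virtual microphone: linearly extrapolate
    the pressures measured at two real microphones (at distances x1, x2
    from the ear canal along one axis) to a target point x_target that is
    closer to the ear canal than either real microphone."""
    slope = (p2 - p1) / (x2 - x1)
    return p1 + slope * (x_target - x1)

# Example: pressures 0.8 and 0.9 (arbitrary units) measured 3 cm and
# 2 cm from the ear canal; estimate the pressure 1 cm from the canal.
p_virtual = virtual_mic_pressure(0.8, 0.9, 0.03, 0.02, 0.01)
```

This also illustrates why more microphones or closer spacing (as described earlier) lets the virtual point sit nearer the ear canal: the extrapolation becomes shorter and better conditioned.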
  • step 330 a noise reduction signal is generated based on the noise at the target spatial location. In some embodiments, this step may be performed by processor 120 .
  • the processor 120 may generate a noise reduction signal based on the parameter information (eg, amplitude information, phase information, etc.) of the noise at the target spatial location obtained in step 320 .
  • the phase difference between the phase of the noise reduction signal and the phase of the noise at the target spatial location may be less than or equal to a preset phase threshold.
  • the preset phase threshold may be in the range of 90-180 degrees.
  • the preset phase threshold can be adjusted within this range according to user needs. For example, when the user does not want to be disturbed by the sound of the surrounding environment, the preset phase threshold may be a larger value, such as 180 degrees, that is, the phase of the noise reduction signal is opposite to the phase of the noise at the target spatial position.
  • when the user wishes to retain some ambient sound, the preset phase threshold may be a smaller value, such as 90 degrees. It should be noted that the more ambient sound the user wishes to receive, the closer the preset phase threshold may be to 90 degrees, and the less ambient sound the user wishes to receive, the closer the preset phase threshold may be to 180 degrees.
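The effect of the preset phase threshold can be quantified for two equal-amplitude tones: their sum has amplitude 2·A·|cos(Δφ/2)| (a standard trigonometric identity, shown here as an illustration), so a 180-degree offset cancels fully while a 90-degree offset leaves much of the ambient sound audible:

```python
import math

def residual_amplitude(noise_amp, phase_offset_deg):
    """Residual amplitude when a cancelling tone of equal amplitude is
    added to the noise at the given phase offset: the two tones sum to
    2 * A * |cos(offset / 2)|."""
    half = math.radians(phase_offset_deg) / 2.0
    return 2.0 * noise_amp * abs(math.cos(half))

full_cancel = residual_amplitude(1.0, 180.0)  # opposite phase
partial = residual_amplitude(1.0, 90.0)       # threshold near 90 degrees
```

At 180 degrees the residual is essentially zero, while at 90 degrees the ambient sound is not attenuated, matching the text's mapping between the threshold and how much of the surroundings the user still hears.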
  • when the phase of the noise reduction signal has a certain relationship with the phase of the noise at the target spatial position (eg, opposite phase), the difference between the amplitude of the noise at the target spatial position and the amplitude of the noise reduction signal may be less than or equal to a preset amplitude threshold.
  • the preset amplitude threshold may be a small value, such as 0 dB, that is, the amplitude of the noise reduction signal is equal to the amplitude of the noise at the target spatial position.
  • the preset amplitude threshold may be a relatively large value, for example, approximately equal to the amplitude of the noise at the target spatial position.
  • the more the user wishes to receive the sound of the surrounding environment, the closer the preset amplitude threshold can be to the amplitude of the noise at the target spatial position, and the less the user wishes to receive the sound of the surrounding environment, the closer the preset amplitude threshold can be to 0 dB.
  • the speaker 130 may output the target signal based on the noise reduction signal generated by the processor 120 .
  • the speaker 130 may convert a noise reduction signal (eg, an electrical signal) into a target signal (ie, a vibration signal) based on a vibration component in the speaker 130, and the target signal may cancel out the ambient noise.
  • the speaker 130 may output target signals corresponding to the plurality of spatial noise sources based on the noise reduction signal.
  • the speaker 130 may output a first target signal with approximately opposite phase and approximately equal amplitude to the noise of the first spatial noise source to cancel the noise of the first spatial noise source, and a second target signal with approximately opposite phase and approximately equal amplitude to the noise of the second spatial noise source to cancel the noise of the second spatial noise source.
  • when the speaker 130 is an air conduction speaker, the position where the target signal and the ambient noise cancel each other may be the target spatial position.
  • since the distance between the target spatial position and the user's ear canal is small, the noise at the target spatial position can be approximately regarded as the noise at the user's ear canal. Therefore, when the target signal and the noise at the target spatial position cancel each other out, the ambient noise transmitted to the user's ear canal is approximately eliminated, realizing the active noise reduction of the acoustic device 100.
  • when the speaker 130 is a bone conduction speaker, the position where the target signal and the ambient noise cancel each other may be the user's basilar membrane.
  • the target signal and ambient noise are canceled at the basilar membrane of the user, thereby realizing active noise reduction of the acoustic device 100 .
  • process 400 may be performed by acoustic device 100 . As shown in FIG. 4, the process 400 may include:
  • step 410 ambient noise is picked up. In some embodiments, this step may be performed by microphone array 110 . In some embodiments, step 410 may be performed in a similar manner to step 310, and the related description is not repeated here.
  • step 420 the noise of the target spatial location is estimated based on the picked-up ambient noise. In some embodiments, this step may be performed by processor 120. In some embodiments, step 420 may be performed in a similar manner to step 320, and the related description is not repeated here.
  • step 430 the sound field of the target spatial location is estimated. In some embodiments, this step may be performed by processor 120 .
  • the processor 120 may utilize the microphone array 110 to estimate the sound field of the target spatial location. Specifically, the processor 120 may construct a virtual microphone based on the microphone array 110 and estimate the sound field of the target spatial position based on the virtual microphone. For more content about estimating the sound field of the target spatial position based on the virtual microphone, reference may be made to other places in the specification of this application, for example, FIGS. 9-10 and their corresponding descriptions.
  • step 440 a noise reduction signal is generated based on the noise at the target spatial location and the sound field estimate at the target spatial location.
  • step 440 may be performed by processor 120 .
  • the processor 120 may obtain, according to the sound field estimated in step 430, physical quantities related to the sound field (for example, sound pressure, sound frequency, sound amplitude, sound phase, sound source vibration velocity, medium (eg, air) density, etc.), and adjust the parameter information (eg, frequency information, amplitude information, phase information) of the noise at the target spatial position to generate a noise reduction signal. For example, the processor 120 may determine whether the physical quantities related to the sound field (eg, sound frequency, sound amplitude, sound phase) are the same as the parameter information of the noise at the target spatial location. If they are the same, the processor 120 may leave the parameter information of the noise at the target spatial position unadjusted.
  • otherwise, the processor 120 may determine the difference between the physical quantities related to the sound field and the parameter information of the noise at the target spatial position, and adjust the parameter information of the noise at the target spatial position based on the difference. As an example, when the difference is greater than a certain range, the processor 120 may take the average of the physical quantity related to the sound field and the parameter information of the noise at the target spatial position as the adjusted parameter information, and generate a noise reduction signal based on the adjusted parameter information of the noise at the target spatial location.
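The averaging rule just described can be sketched as follows (the function and parameter names, and the example `tolerance` value, are assumptions for illustration):

```python
def adjust_parameter(noise_value, field_value, tolerance):
    """Sketch of the adjustment rule: if the sound-field-derived value
    and the noise estimate disagree by more than `tolerance`, take their
    average; otherwise keep the noise estimate unchanged."""
    if abs(field_value - noise_value) > tolerance:
        return (field_value + noise_value) / 2.0
    return noise_value

# Disagreement above tolerance -> averaged; within tolerance -> kept.
adjusted = adjust_parameter(noise_value=0.8, field_value=1.2, tolerance=0.1)
kept = adjust_parameter(noise_value=0.8, field_value=0.85, tolerance=0.1)
```

The same rule would be applied per parameter (amplitude, phase, frequency) before the noise reduction signal is generated.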
  • the processor 120 may estimate the change in the parameter information of the ambient noise at the target spatial position according to the time at which the microphone array picked up the ambient noise, the current time, and the physical quantities related to the sound field of the target spatial position, and adjust the parameter information of the noise at the target spatial position based on the change. In this way, the amplitude information and frequency information of the noise reduction signal can better match the amplitude information and frequency information of the ambient noise currently at the target spatial position, and the phase information of the noise reduction signal can better match the inverse of the phase information of that ambient noise, so that the noise reduction signal can cancel the ambient noise more accurately, improving the noise reduction effect and the user's listening experience.
• the processor 120 may update the noise at the target spatial location and the sound field estimate at the target spatial location based on motion information (eg, motion trajectory, motion direction, motion speed, motion acceleration, motion angular velocity, motion-related time information) of the acoustic device 100 obtained by one or more sensors 140 of the acoustic device 100.
  • the processor 120 may generate a noise reduction signal based on the updated noise at the target spatial location and the sound field estimate at the target spatial location.
• one or more sensors 140 can record the motion information of the acoustic device 100, and the processor 120 can then quickly update the noise reduction signal. This improves the noise tracking performance of the acoustic device 100, so that the noise reduction signal can more accurately cancel the ambient noise, further improving the noise reduction effect and the user's listening experience.
  • the processor 120 may divide the picked-up ambient noise into multiple frequency bands. Multiple frequency bands correspond to different frequency ranges. For example, the processor 120 may divide the picked-up ambient noise into four frequency bands of 100-300 Hz, 300-500 Hz, 500-800 Hz, and 800-1500 Hz. In some embodiments, each frequency band includes parameter information (eg, frequency information, amplitude information, phase information) of environmental noise in a corresponding frequency range. For at least one of the plurality of frequency bands, the processor 120 may perform steps 420-440 thereon to generate a noise reduction signal corresponding to each of the at least one frequency band.
• the processor 120 may perform steps 420-440 on two of the four frequency bands, the 300-500 Hz band and the 500-800 Hz band, to generate noise reduction signals corresponding to the frequency bands 300-500 Hz and 500-800 Hz, respectively.
• the speaker 130 may output a target signal corresponding to each frequency band based on the noise reduction signal corresponding to that frequency band. For example, the speaker 130 may output a target signal approximately opposite in phase and approximately equal in amplitude to the noise in the 300-500 Hz band to cancel the noise in that band, and a target signal approximately opposite in phase and approximately equal in amplitude to the noise in the 500-800 Hz band to cancel the noise in that band.
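The per-band cancellation described above can be sketched as a toy simulation. The sampling rate, tone frequencies, and FFT-mask band splitting below are illustrative assumptions, not the device's actual filter bank:

```python
import numpy as np

fs = 8000
t = np.arange(0, 0.1, 1 / fs)
# Synthetic "ambient noise": one tone in the 300-500 Hz band, one in 500-800 Hz.
noise = 0.5 * np.sin(2 * np.pi * 400 * t) + 0.3 * np.sin(2 * np.pi * 700 * t)

def band_antinoise(signal, fs, lo, hi):
    """Isolate one frequency band and return its phase-inverted copy."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    spec[~((freqs >= lo) & (freqs < hi))] = 0.0
    band = np.fft.irfft(spec, n=len(signal))
    return -band  # approximately equal amplitude, opposite phase

# Each band's anti-noise cancels that band's component at the target position.
residual = (noise
            + band_antinoise(noise, fs, 300, 500)
            + band_antinoise(noise, fs, 500, 800))
print(np.max(np.abs(residual)))  # near zero: both bands cancelled
```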
• the processor 120 may also update the noise reduction signal according to the user's manual input. For example, when the user wears the acoustic device 100 to play music in a relatively noisy external environment and the listening experience is not ideal, the user can manually adjust the parameter information (for example, frequency information, phase information, amplitude information) of the noise reduction signal according to their own hearing.
• the adjustment multiples of the parameter information of the noise reduction signal can be preset, and users can adjust the noise reduction signal according to their own hearing and the preset adjustment multiples, so as to update the noise reduction signal.
  • the user can manually adjust the noise reduction signal through keys on the acoustic device 100 .
  • the user may adjust the noise reduction signal through the terminal device.
• the parameter information of the noise reduction signal suggested to the user can be displayed on the acoustic device 100 or on an external device (eg, mobile phone, tablet computer, computer) in communication with the acoustic device 100, and the user can fine-tune the parameter information according to their own hearing experience.
  • the arrangement of the microphone array may be a regular geometric shape.
  • the microphone array may be a linear array.
  • the arrangement of the microphone array can also be in other shapes.
  • the microphone array may be a cross-shaped array.
  • the microphone array may be a circular array.
  • the arrangement of the microphone array may also be of irregular geometry.
  • the microphone array may be an irregular array.
• the arrangement of the microphone array is not limited to the linear arrays, cross-shaped arrays, circular arrays, and irregular arrays shown in FIGS. 5A-5D; the microphone array may also adopt other arrangements, such as planar arrays, stereoscopic arrays, radial arrays, etc., which are not limited in this application.
  • each of the short solid lines in FIGS. 5A-D may be considered a microphone or a group of microphones.
• the number of microphones in each group may be the same or different
• the types of microphones in each group may be the same or different
• the orientations of the microphones in each group may be the same or different.
  • the type, quantity and orientation of the microphones can be adaptively adjusted according to the actual application, which is not limited in this application.
  • the microphones in the microphone array may be uniformly distributed.
  • the uniform distribution here may refer to the same spacing between any two adjacent microphones in the microphone array.
  • the microphones in the microphone array may also be non-uniformly distributed.
• the non-uniform distribution here may mean that the spacing between adjacent microphones in the microphone array is not all the same.
  • the spacing between the microphones in the microphone array can be adaptively adjusted according to the actual situation, which is not limited in this application.
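The array geometries above (linear, circular, uniform spacing) can be illustrated by computing microphone coordinates. The counts and spacings below are illustrative assumptions, not values from the source:

```python
import numpy as np

def linear_array(n: int, spacing: float) -> np.ndarray:
    """Uniform linear array: n microphones along the x axis (2-D coords)."""
    return np.stack([np.arange(n) * spacing, np.zeros(n)], axis=1)

def circular_array(n: int, radius: float) -> np.ndarray:
    """Uniform circular array: n microphones evenly spaced on a circle."""
    angles = 2 * np.pi * np.arange(n) / n
    return np.stack([radius * np.cos(angles), radius * np.sin(angles)], axis=1)

lin = linear_array(4, 0.02)    # 4 mics, 2 cm spacing
circ = circular_array(6, 0.03) # 6 mics on a 3 cm radius circle
print(lin.shape, circ.shape)   # (4, 2) (6, 2)
# Spacing between adjacent mics in the circular array (chord length):
print(round(float(np.linalg.norm(circ[1] - circ[0])), 4))
```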
  • FIGS. 6A-B are schematic diagrams of exemplary arrangements of microphone arrays (eg, microphone array 110 ) according to some embodiments of the present application.
• as shown in FIG. 6A, the microphone array is arranged at or around the human ear in a semicircular arrangement.
• as shown in FIG. 6B, the microphone array is arranged at the human ear in a linear arrangement.
  • the arrangement of the microphone arrays is not limited to the semicircular and linear shapes shown in FIGS. 6A and 6B , and the arrangement positions of the microphone arrays are not limited to the positions shown in FIGS. 6A and 6B .
• the semicircular and linear shapes and the placement positions of the microphone arrays are for illustrative purposes only.
  • FIG. 7 is an exemplary flowchart of estimating noise at a spatial location of a target according to some embodiments of the present application. As shown in FIG. 7, process 700 may include:
  • step 710 one or more sources of spatial noise related to ambient noise picked up by the microphone array are determined.
  • this step may be performed by processor 120 .
• determining a spatial noise source refers to determining information related to the spatial noise source, such as the location of the spatial noise source (including the orientation of the spatial noise source, the distance between the spatial noise source and the target spatial location, etc.), the phase of the spatial noise source, the amplitude of the spatial noise source, and the like.
  • a spatial noise source related to ambient noise refers to a noise source whose sound waves can be delivered to the user's ear canal (eg, a target spatial location) or near the user's ear canal.
  • the spatial noise sources may be noise sources in different directions (eg, front, rear, etc.) of the user's body. For example, there is crowd noise in front of the user's body and vehicle whistle noise to the left of the user's body.
  • the spatial noise sources include crowd noise sources in front of the user's body and vehicle whistle noise sources to the left of the user's body.
• a microphone array (eg, the microphone array 110) can pick up spatial noise in all directions around the user's body, convert the spatial noise into electrical signals, and transmit them to the processor 120. The processor 120 can process the electrical signals corresponding to the spatial noise to obtain parameter information (eg, frequency information, amplitude information, phase information, etc.) of the spatial noise.
• the processor 120 determines the information of the spatial noise sources in various directions according to the parameter information of the spatial noise in those directions, for example, the azimuth of the spatial noise source, the distance of the spatial noise source, the phase of the spatial noise source, and the amplitude of the spatial noise source.
  • the processor 120 may determine the source of the spatial noise through a noise localization algorithm based on the spatial noise picked up by the microphone array (eg, the microphone array 110).
  • the noise localization algorithm may include one or more of a beamforming algorithm, a super-resolution spatial spectrum estimation algorithm, a time difference of arrival algorithm (also referred to as a delay estimation algorithm), and the like.
  • the beamforming algorithm is a sound source localization method based on controllable beamforming of maximum output power.
• beamforming algorithms may include the Steering Response Power-Phase Transform (SPR-PHAT) algorithm, delay-and-sum beamforming, differential microphone algorithms, the Generalized Sidelobe Canceller (GSC) algorithm, the Minimum Variance Distortionless Response (MVDR) algorithm, etc.
• super-resolution spatial spectrum estimation algorithms may include autoregressive (AR) models, minimum variance (MV) spectral estimation, and eigenvalue decomposition methods (for example, the Multiple Signal Classification (MUSIC) algorithm), etc. These methods calculate the correlation matrix of the spatial spectrum from the sound signal (eg, spatial noise) picked up by the microphone array and efficiently estimate the direction of the spatial noise source.
• the time difference of arrival (TDOA) algorithm can first estimate the time difference of arrival of the sound to obtain the sound delay between the microphones in the microphone array, and then use the obtained time difference, combined with the known spatial positions of the microphones in the array, to further locate the spatial noise source.
  • the time delay estimation algorithm can determine the location of the noise source by calculating the time difference between the ambient noise signal and different microphones in the microphone array.
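The delay estimation underlying the TDOA approach can be sketched with a simple cross-correlation between two microphone signals. The sampling rate, delay, and synthetic broadband source below are illustrative assumptions:

```python
import numpy as np

fs = 16000
delay_samples = 12  # true delay between the two microphones

rng = np.random.default_rng(0)
src = rng.standard_normal(4096)  # broadband noise source
mic1 = src
mic2 = np.concatenate([np.zeros(delay_samples), src[:-delay_samples]])

# Cross-correlate the two microphone signals and find the lag of the peak.
corr = np.correlate(mic2, mic1, mode="full")
lag = int(np.argmax(corr)) - (len(mic1) - 1)
tdoa = lag / fs
print(lag, tdoa)  # 12 samples -> 0.00075 s
```

Given the lag and the known microphone spacing, the noise source direction can then be triangulated, as the text describes.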
  • the SPR-PHAT algorithm can perform beamforming in the direction of each noise source, and the direction with the strongest beam energy can be approximately regarded as the direction of the noise source.
  • the MUSIC algorithm may perform eigenvalue decomposition on the covariance matrix of the environmental noise signal picked up by the microphone array to obtain the subspace of the environmental noise signal, thereby separating the direction of the environmental noise.
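The eigendecomposition step of MUSIC can be sketched for a uniform linear array with one simulated source. The array size, source angle, and noise level are illustrative assumptions, not parameters from the source:

```python
import numpy as np

# Minimal narrowband MUSIC sketch: 6-mic uniform linear array at
# half-wavelength spacing, one source at 30 degrees.
M, snapshots, theta_true = 6, 200, 30.0
d_over_lambda = 0.5
rng = np.random.default_rng(1)

def steering(theta_deg):
    k = 2 * np.pi * d_over_lambda * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * k * np.arange(M))

s = rng.standard_normal(snapshots) + 1j * rng.standard_normal(snapshots)
x = np.outer(steering(theta_true), s)
x += 0.01 * (rng.standard_normal(x.shape) + 1j * rng.standard_normal(x.shape))

R = x @ x.conj().T / snapshots        # covariance matrix of the array signal
eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
En = eigvecs[:, :-1]                  # noise subspace (one source assumed)

# Scan angles: the MUSIC pseudo-spectrum peaks where the steering vector
# is orthogonal to the noise subspace, ie at the source direction.
angles = np.arange(-90, 91)
spectrum = [1 / np.linalg.norm(En.conj().T @ steering(a)) ** 2 for a in angles]
print(angles[int(np.argmax(spectrum))])  # 30
```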
• a spatial super-resolution image of the environmental noise can be formed by methods such as synthetic aperture, sparse restoration, co-prime arrays, etc. The spatial super-resolution image can reflect the signal reflection map of the environmental noise, so as to further improve the positioning accuracy of the spatial noise source.
• the processor 120 may divide the picked-up environmental noise into multiple frequency bands according to a specific frequency bandwidth (for example, every 500 Hz as one frequency band), where each frequency band may correspond to a different frequency range, and determine, for at least one of the frequency bands, the spatial noise source corresponding to that frequency band.
  • the processor 120 may perform signal analysis on the frequency bands divided by the environmental noise, obtain parameter information of the environmental noise corresponding to each frequency band, and determine the spatial noise source corresponding to each frequency band according to the parameter information.
  • the processor 120 may determine a spatial noise source corresponding to each frequency band through a noise localization algorithm.
  • step 720 the noise of the target spatial location is estimated based on the spatial noise sources. In some embodiments, this step may be performed by processor 120 . As described herein, estimating the noise at the target spatial position refers to estimating parameter information of the noise at the target spatial position, such as frequency information, amplitude information, phase information, and the like.
• the processor 120 may estimate, based on the parameter information (eg, frequency information, amplitude information, phase information, etc.) of the spatial noise sources located in various directions around the user's body obtained in step 710, the parameter information of the noise transmitted from each spatial noise source to the target spatial position, so as to estimate the noise at the target spatial position.
• for example, if there are spatial noise sources in a first orientation (eg, in front of) and a second orientation (eg, behind) the user's body, the processor 120 may estimate, according to the position information, frequency information, phase information, or amplitude information of the first-orientation spatial noise source, the frequency information, phase information, or amplitude information of its noise when transmitted to the target spatial position, and may likewise estimate, according to the position information, frequency information, phase information, or amplitude information of the second-orientation spatial noise source, the frequency information, phase information, or amplitude information of its noise when transmitted to the target spatial position. Further, the processor 120 may estimate the noise at the target spatial position based on the frequency information, phase information, or amplitude information of the noise transmitted from the first-orientation and second-orientation spatial noise sources.
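One simple propagation model consistent with this description sums each source's contribution at the target position, applying spherical spreading (1/r amplitude decay) and a propagation phase delay. The function name, dict fields, and numeric values are illustrative assumptions, not the patent's method:

```python
import numpy as np

c = 343.0  # speed of sound in air, m/s

def noise_at_target(sources, t):
    """Sum the contributions of point noise sources at the target position.

    Each source dict holds its amplitude at 1 m, frequency (Hz), initial
    phase (rad), and distance to the target (m).
    """
    total = 0.0
    for s in sources:
        delay = s["distance"] / c             # propagation time to the target
        amp = s["amplitude"] / s["distance"]  # 1/r spherical spreading
        total += amp * np.sin(2 * np.pi * s["freq"] * (t - delay) + s["phase"])
    return total

sources = [
    {"amplitude": 1.0, "freq": 400.0, "phase": 0.0, "distance": 2.0},  # in front
    {"amplitude": 0.5, "freq": 700.0, "phase": 0.0, "distance": 4.0},  # behind
]
print(noise_at_target(sources, t=0.01))
```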
  • the processor 120 may estimate noise information for the target spatial location using virtual microphone techniques or other methods.
  • the processor 120 may extract parameter information of the noise of the spatial noise source from the frequency response curve of the spatial noise source picked up by the microphone array through a feature extraction method.
• the method for extracting the parameter information of the noise of the spatial noise source may include, but is not limited to, Principal Components Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), Singular Value Decomposition (SVD), and so on.
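As a sketch of how features might be extracted from frequency-response measurements with PCA (computed here via an SVD), the snippet below projects synthetic data onto its top principal components. The matrix sizes and random data are illustrative assumptions:

```python
import numpy as np

# 50 snapshots of a 16-bin frequency response (synthetic stand-in data).
rng = np.random.default_rng(2)
responses = rng.standard_normal((50, 16))

# PCA via SVD: center the data, then project onto the top-k right
# singular vectors (the principal components).
centered = responses - responses.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

k = 3
features = centered @ Vt[:k].T  # compact features for each snapshot
print(features.shape)           # (50, 3)

# Fraction of variance captured by the top-k components.
explained = float((S[:k] ** 2).sum() / (S ** 2).sum())
print(round(explained, 2))
```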
  • the above description about the process 700 is only for example and illustration, and does not limit the scope of application of the present application.
  • the process 700 may further include the steps of locating the spatial noise source, extracting noise parameter information of the spatial noise source, and the like.
  • step 710 and step 720 may be combined into one step. Such corrections and changes are still within the scope of this application.
  • FIG. 8 is a schematic diagram of estimating noise at a spatial position of a target according to some embodiments of the present application.
  • the following takes the time difference of arrival algorithm as an example to illustrate how the location of the spatial noise source is realized.
• a processor (eg, processor 120) may locate the noise sources (eg, noise sources 811, 812, and 813) through the time difference of arrival algorithm based on the noise signals picked up by the microphones (eg, microphones 821 and 822) in the microphone array 820.
  • the processor can estimate the phase delay and amplitude change of the noise signal emitted by the noise source from the noise source to the target spatial location 830 according to the location of the noise source.
• the processor can obtain the parameter information (eg, frequency information, amplitude information, phase information, etc.) of the environmental noise when it is transmitted to the target spatial location 830, thereby estimating the noise at the target spatial location.
• the noise sources 811, 812, and 813, the microphone array 820, the microphones 821 and 822 in the microphone array 820, and the target spatial position 830 described in FIG. 8 are only for illustration and do not limit the scope of application of the present application. Various modifications and changes will occur to those skilled in the art under the guidance of this application.
  • the microphones in the microphone array 820 are not limited to the microphone 821 and the microphone 822, and the microphone array 820 may also include more microphones and the like. Such corrections and changes are still within the scope of this application.
  • FIG. 9 is an exemplary flowchart of estimating noise and sound field at a spatial location of a target according to some embodiments of the present application. As shown in FIG. 9, the process 900 may include:
• in step 910, a virtual microphone is constructed based on the microphone array (eg, microphone array 110, microphone array 820). In some embodiments, this step may be performed by processor 120.
  • a virtual microphone may be used to represent or simulate audio data collected by the microphone if the microphone is placed at the target spatial location. That is, the audio data obtained by the virtual microphone can be approximated or equivalent to the audio data collected by the physical microphone if the physical microphone is placed at the target spatial position.
  • the virtual microphone may include a mathematical model.
• the mathematical model can describe the relationship between the noise or sound field estimate at the target spatial location, the parameter information (eg, frequency information, amplitude information, phase information, etc.) of the ambient noise picked up by the microphone array, and the parameters of the microphone array.
  • the parameters of the microphone array may include one or more of the arrangement of the microphone array, the spacing between the microphones, the number and position of the microphones in the microphone array, and the like.
  • the mathematical model can be obtained by calculation based on the initial mathematical model and parameters of the microphone array and parameter information (eg, frequency information, amplitude information, phase information, etc.) of the sound (eg, ambient noise) picked up by the microphone array.
  • the initial mathematical model may include parameters and model parameters corresponding to parameters of the microphone array and parameter information of ambient noise picked up by the microphone array.
• the parameters of the microphone array, the parameter information of the sound picked up by the microphone array, and the initial values of the model parameters are input into the initial mathematical model to obtain a predicted noise or sound field at the target spatial position.
• this predicted noise or sound field is then compared with the data (noise and sound field estimates) obtained by a physical microphone placed at the target spatial location, so as to adjust the model parameters of the mathematical model.
  • the mathematical model is obtained by adjusting multiple times through a large amount of data (for example, parameters of the microphone array and parameter information of ambient noise picked up by the microphone array).
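The fitting process described above can be sketched with the simplest possible "mathematical model": a linear map from the array-microphone signals to the signal at the target position, fitted by least squares against calibration recordings from a physical microphone temporarily placed there. The model form, weights, and data sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
array_sig = rng.standard_normal((1000, 3))  # signals of a 3-mic array
true_w = np.array([0.5, 0.3, 0.2])          # unknown "ground truth" mixing
# Calibration data from a physical mic at the target position (plus noise).
target_sig = array_sig @ true_w + 0.01 * rng.standard_normal(1000)

# Adjust the model parameters to match the physical-microphone data
# (here in one shot via least squares).
w, *_ = np.linalg.lstsq(array_sig, target_sig, rcond=None)
print(np.round(w, 2))  # close to [0.5, 0.3, 0.2]

# At run time the "virtual microphone" predicts the target-position signal:
predicted = array_sig @ w
```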
  • the virtual microphone may include a machine learning model.
  • the machine learning model may be obtained through training based on parameters of the microphone array and parameter information (eg, frequency information, amplitude information, phase information, etc.) of the sound (eg, ambient noise) picked up by the microphone array.
  • the machine learning model is obtained by training an initial machine learning model (eg, a neural network model) using the parameters of the microphone array and the parameter information of the sound picked up by the microphone array as training samples.
  • the parameters of the microphone array and the parameter information of the sound picked up by the microphone array can be input into the initial machine learning model, and prediction results (eg, noise and sound field estimation of the target spatial position) can be obtained.
  • This prediction is then compared with data (noise and sound field estimates) obtained from physical microphones set up at the target spatial location to adjust the parameters of the initial machine learning model.
• the parameters of the initial machine learning model are optimized until the prediction results of the initial machine learning model are the same as, or approximately the same as, the data obtained by the physical microphones set at the target spatial location, at which point the trained machine learning model is obtained.
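The iterative parameter adjustment described above can be sketched with gradient descent on a tiny linear model; a real system might use a neural network. The features, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((500, 3))   # features derived from the array mics
w_true = np.array([0.8, -0.4, 0.1])
y = X @ w_true                      # "physical microphone" recordings

# Repeatedly adjust the model parameters until predictions match the
# physical-microphone data (gradient descent on mean squared error).
w = np.zeros(3)
lr = 0.1
for _ in range(200):
    pred = X @ w
    grad = X.T @ (pred - y) / len(y)
    w -= lr * grad

print(np.round(w, 3))  # converges to w_true
```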
• virtual microphone technology can substitute for a physical microphone at locations where microphone placement is difficult (eg, target spatial locations). For example, in order to keep the user's ears open without blocking the user's ear canal, a physical microphone cannot be set at the position of the user's ear hole (eg, a target spatial position). In this case, through the virtual microphone technology, the microphone array can be set close to the user's ear without blocking the ear canal, for example, at the user's auricle, and a virtual microphone at the position of the user's ear hole can then be constructed through the microphone array.
  • the virtual microphone can predict sound data (eg, amplitude, phase, sound pressure, sound field, etc.) at a second location (eg, a target spatial location) using a physical microphone (ie, a microphone array) at a first location.
• the accuracy of the sound data of the second position (which may also be called a specific position, such as a target spatial position) predicted by the virtual microphone may depend on the distance between the virtual microphone and the physical microphone (ie, the microphone array), the type of virtual microphone (eg, mathematical model virtual microphone, machine learning virtual microphone), etc. For example, the closer the virtual microphone is to the physical microphone (ie, the microphone array), the more accurate the sound data of the second position predicted by the virtual microphone.
• the sound data of the second position predicted by the machine learning virtual microphone may be more accurate than that predicted by the mathematical model virtual microphone.
• the position corresponding to the virtual microphone (ie, the second position, for example, the target spatial position) may be near the microphone array, or may be far away from the microphone array.
  • step 920 the noise and sound field of the target spatial location is estimated based on the virtual microphone. In some embodiments, this step may be performed by processor 120 .
• the processor 120 may input the parameter information (eg, frequency information, amplitude information, phase information, etc.) of the ambient noise picked up by the microphone array in real time, together with the parameters of the microphone array (eg, the arrangement of the microphone array, the spacing between the microphones, the number of microphones in the microphone array), into the mathematical model as parameters of the mathematical model to estimate the noise and sound field at the target spatial location.
• the processor 120 may input the parameter information (eg, frequency information, amplitude information, phase information, etc.) of the ambient noise picked up by the microphone array in real time, together with the parameters of the microphone array (eg, the arrangement of the microphone array, the spacing between the microphones, the number of microphones in the microphone array), into the machine learning model, and estimate the noise and sound field at the target spatial location based on the output of the machine learning model.
  • process 900 is only for example and illustration, and does not limit the scope of application of the present application.
• various modifications and changes to process 900 may be made by those skilled in the art under the guidance of the present application.
  • step 920 may be divided into two steps to separately estimate the noise and sound field of the target spatial location. Such corrections and changes are still within the scope of this application.
  • FIG. 10 is a schematic diagram of constructing a virtual microphone according to some embodiments of the present application.
  • the target spatial location 1010 may be located near the user's ear canal.
  • the target spatial position 1010 cannot be provided with a physical microphone, so that the noise and sound field of the target spatial position 1010 cannot be directly estimated through the physical microphone.
  • a microphone array 1020 may be provided in the vicinity of the target spatial location 1010.
  • the microphone array 1020 may include a first microphone 1021 , a second microphone 1022 , and a third microphone 1023 .
• each microphone in the microphone array 1020 (eg, the first microphone 1021, the second microphone 1022, and the third microphone 1023) can pick up ambient noise in the space where the user is located.
  • the processor 120 can construct a virtual microphone. Further, based on the virtual microphone, the processor 120 can estimate the noise and sound field at the target spatial location 1010 .
• the target spatial position 1010, the microphone array 1020, and the first microphone 1021, the second microphone 1022, and the third microphone 1023 in the microphone array 1020 described in FIG. 10 are only for illustration and do not limit the scope of application of the present application. Various modifications and changes may be made by those skilled in the art under the guidance of the present application.
  • the microphones in the microphone array 1020 are not limited to the first microphone 1021, the second microphone 1022 and the third microphone 1023, and the microphone array 1020 may also include more microphones and the like. Such corrections and changes are still within the scope of this application.
  • the microphone arrays may also pick up interfering signals (eg, target signals and other sound signals) from speakers while picking up ambient noise.
  • the microphone array can be arranged far away from the speaker.
• however, the microphone array may then be too far away from the target spatial location to accurately estimate the sound field and/or noise at the target spatial location.
  • the microphone array can be placed in the target area to minimize the interference signal from the speaker to the microphone array.
  • the target area may be the area of minimum sound pressure level of the loudspeaker.
  • the area of minimum sound pressure level may be the area where the sound radiated by the speaker is low.
  • the loudspeaker may form at least one set of acoustic dipoles.
• the sound signals output from the front and the back of the speaker diaphragm, which have approximately opposite phases and approximately the same amplitude, can be regarded as two point sound sources.
  • the two point sound sources can form an acoustic dipole or similar acoustic dipoles, and the sound radiated outward has obvious directivity.
• in the direction of the straight line connecting the two point sound sources, the sound radiated by the speaker is larger, while the sound radiated in other directions is significantly reduced; in the region near the perpendicular bisector plane of that line (ie, the minimum sound pressure level region), the speaker radiates minimal sound.
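The directivity of such an acoustic dipole can be sketched numerically. In the far field, the pressure magnitude of two equal-amplitude, opposite-phase point sources separated by d is proportional to |sin(k·d/2·cosθ)|, where θ is measured from the line connecting the sources. The frequency and separation values are illustrative assumptions:

```python
import numpy as np

f = 1000.0             # Hz
c = 343.0              # speed of sound in air, m/s
d = 0.01               # 1 cm source separation
k = 2 * np.pi * f / c  # wavenumber

def dipole_magnitude(theta_rad):
    """Far-field pressure magnitude (arbitrary units) of the dipole."""
    return np.abs(np.sin(0.5 * k * d * np.cos(theta_rad)))

on_axis = dipole_magnitude(0.0)          # along the connecting line
broadside = dipole_magnitude(np.pi / 2)  # on the perpendicular bisector plane
print(on_axis > broadside)               # True: radiation is minimal broadside
```

This reproduces the behavior the text describes: strong radiation along the line connecting the two point sources, and near-zero radiation on the perpendicular bisector plane, which is why the microphone array is placed in that minimum-sound-pressure region.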
  • a speaker (eg, speaker 130 ) in an acoustic device may be a bone conduction speaker.
  • the target area may be an area with a minimum sound pressure level of the sound leakage signal of the bone conduction speaker.
  • the region with the minimum sound pressure level of the sound leakage signal may refer to the region where the sound leakage signal radiated by the bone conduction speaker is the minimum.
• arranging the microphone array in the region with the minimum sound pressure level of the sound leakage signal of the bone conduction speaker can reduce the interference signal of the bone conduction speaker picked up by the microphone array, and can also effectively solve the problem that a microphone array placed too far away from the target spatial location cannot accurately estimate the noise and sound field at the target spatial location.
  • FIG. 11 is a schematic diagram illustrating the distribution of sound leakage signals in a three-dimensional sound field of a bone conduction speaker at 1000 Hz according to some embodiments of the present application.
  • FIG. 12 is a schematic diagram of the sound leakage signal distribution of the two-dimensional sound field at 1000 Hz of the bone conduction speaker according to some embodiments of the present application.
  • the acoustic device 1100 may include a contact surface 1110 .
  • the contact surface 1110 may be configured to make contact with the user's body (eg, face, ears) when the user wears the acoustic device 1100 .
• the bone conduction speaker may be disposed inside the acoustic device 1100. As shown in FIG. 11, the color on the acoustic device 1100 can represent the sound leakage signal of the bone conduction speaker, and different color depths can represent different magnitudes of the sound leakage signal.
  • the area 1120 where the dotted line is located may be the area with the minimum sound pressure level of the sound leakage signal of the bone conduction speaker.
  • the microphone array may be placed in the area 1120 where the dotted line is located (eg, position 1) so that less leakage signals are received from the bone conduction speaker.
  • the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 5-30 dB lower than the maximum output sound pressure of the bone conduction speaker. In some embodiments, the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 7-28 dB lower than the maximum output sound pressure of the bone conduction speaker. In some embodiments, the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 9-26 dB lower than the maximum output sound pressure of the bone conduction speaker. In some embodiments, the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 11-24 dB lower than the maximum output sound pressure of the bone conduction speaker.
  • the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 13-22 dB lower than the maximum output sound pressure of the bone conduction speaker. In some embodiments, the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 15-20 dB lower than the maximum output sound pressure of the bone conduction speaker. In some embodiments, the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 17-18 dB lower than the maximum output sound pressure of the bone conduction speaker. In some embodiments, the sound pressure in the region of minimum sound pressure level of the bone conduction speaker may be 15dB lower than the maximum output sound pressure of the bone conduction speaker.
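The level differences quoted above can be converted into pressure ratios: a region whose sound pressure level is N dB below the speaker's maximum output has a pressure of p_max · 10^(−N/20). A small sketch (the specific dB values are taken from the ranges above purely for illustration):

```python
# Convert "N dB below maximum output" into a pressure ratio.
def pressure_ratio(db_below: float) -> float:
    return 10 ** (-db_below / 20.0)

for db in (5, 15, 30):
    print(db, round(pressure_ratio(db), 3))
# 5 dB  -> ~0.562 of the maximum pressure
# 15 dB -> ~0.178
# 30 dB -> ~0.032
```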
  • the two-dimensional sound field distribution shown in FIG. 12 is a two-dimensional cross-sectional view of the three-dimensional sound field leakage signal distribution of FIG. 11 .
  • the color on the cross section can represent the sound leakage signal of the bone conduction speaker, and different color depths can represent different magnitudes of the sound leakage signal. The lighter the color, the greater the sound leakage signal of the bone conduction speaker, and the darker the color, the smaller the sound leakage signal of the bone conduction speaker.
  • the regions 1210 and 1220 where the dotted lines are located have darker colors and smaller sound leakage signals.
  • the regions 1210 and 1220 where the dotted lines are located may be the regions where the sound pressure level of the sound leakage signal of the bone conduction speaker is the minimum.
  • the microphone arrays may be placed in the regions 1210 and 1220 where the dotted lines are located (e.g., position A and position B) so that they receive less sound leakage from the bone conduction speakers.
  • the vibration signal generated by the bone conduction speaker during the vibration process is relatively large. Therefore, not only the sound leakage signal of the bone conduction speaker but also its vibration signal will interfere with the microphone array.
  • the vibration signal of the bone conduction speaker may refer to the vibration of other parts of the acoustic device (eg, the housing, the microphone array) driven by the vibration of the vibration part of the bone conduction speaker.
  • the interference signal of the bone conduction speaker may include sound leakage signal and vibration signal of the bone conduction speaker.
  • the target area where the microphone array is located may be an area where the total energy of the sound leakage signal and the vibration signal of the bone conduction speaker transmitted to the microphone array is the smallest.
  • the sound leakage signal and the vibration signal of the bone conduction speaker are relatively independent signals, so the region with the minimum sound pressure level of the sound leakage signal of the bone conduction speaker does not necessarily coincide with the region where the total energy of the sound leakage signal and the vibration signal is the smallest. Therefore, determining the target area requires analyzing the total signal of the vibration signal and the leakage signal of the bone conduction speaker.
  • FIG. 13 is a schematic diagram of the frequency response of the total signal of the vibration signal and the sound leakage signal of the bone conduction speaker according to some embodiments of the present application.
  • FIG. 13 shows the frequency response curves of the total signal of the vibration signal and the leakage signal of the bone conduction speaker at positions 1, 2, 3 and 4 on the acoustic device 1100 in FIG. 11 .
  • the abscissa may represent the frequency
  • the ordinate may represent the sound pressure of the total signal of the vibration signal and the sound leakage signal of the bone conduction speaker.
  • when only the sound leakage signal of the bone conduction speaker is considered, position 1 is located in the area with the minimum sound pressure level of the speaker 130, which can be used as the target area for setting the microphone array (e.g., the microphone array 110, the microphone array 820, or the microphone array 1020).
  • when the vibration signal of the bone conduction speaker is also considered, the target area of the microphone array (i.e., the area where the total signal of the vibration signal and the leakage signal of the bone conduction speaker has the smallest sound pressure) may change. As shown in FIG. 13, compared with other positions (e.g., position 1), the sound pressure of the total signal of the vibration signal and the sound leakage signal of the bone conduction speaker corresponding to position 2 is relatively small; therefore, position 2 can be used as the target area for setting the microphone array.
  • the location of the target area may be related to the orientation of the diaphragms of the microphones in the microphone array.
  • the orientation of the diaphragm of the microphone can affect the magnitude of the vibration signal of the bone conduction speaker received by the microphone. For example, when the diaphragm of the microphone is perpendicular to the vibration component of the bone conduction speaker, the vibration signal of the bone conduction speaker that can be collected by the microphone is small. For another example, when the diaphragm of the microphone is parallel to the vibration component of the bone conduction speaker, the vibration signal of the bone conduction speaker that can be collected by the microphone is larger.
  • the vibration signal of the bone conduction speaker received by the microphone can be reduced by setting the orientation of the microphone diaphragm.
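As a toy model of this orientation effect (our own simplification, not a characterization from the disclosure), the vibration pickup can be treated as a cosine projection, using the convention of the passage above: a diaphragm parallel to the vibration component gives maximum pickup, and a perpendicular diaphragm gives minimum pickup.

```python
import math

def vibration_pickup(vib_amplitude: float, angle_deg: float) -> float:
    """Simplified model of how much of the speaker's vibration a
    microphone picks up.  Following the convention in the passage above,
    angle_deg is the angle between the microphone diaphragm and the
    vibration component: 0 deg (parallel) gives maximum pickup and
    90 deg (perpendicular) gives minimum pickup."""
    return vib_amplitude * abs(math.cos(math.radians(angle_deg)))

# Orienting the diaphragm perpendicular to the vibration suppresses
# the vibration signal in this model.
parallel_pickup = vibration_pickup(1.0, 0.0)
perpendicular_pickup = vibration_pickup(1.0, 90.0)
```

In practice the coupling also depends on the microphone's mounting and package, so the cosine law is only a first-order sketch.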
  • the vibration signal of the bone conduction speaker can be ignored in the process of determining the target position for setting the microphone array, and only the sound leakage signal of the bone conduction speaker is considered; that is, the target position for setting the microphone array is determined according to the descriptions of FIG. 11 and FIG. 12.
  • both the vibration signal and the sound leakage signal of the bone conduction speaker can be considered in the process of determining the target position for setting the microphone array; that is, the target position of the microphone array is determined according to the description of FIG. 13.
  • the phase of the vibration signal of the bone conduction speaker received by the microphone can be adjusted by adjusting the orientation of the diaphragm of the microphone, so that the vibration signal of the bone conduction speaker received by the microphone and the sound leakage signal of the bone conduction speaker received by the microphone are approximately opposite in phase and approximately equal in magnitude. In this way, the two signals can at least partially cancel each other, reducing the noise received by the microphone array.
  • the vibration signal of the bone conduction speaker received by the microphone can reduce the sound leakage signal of the bone conduction speaker received by the microphone by 5-6 dB.
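To make a 5-6 dB figure concrete, the two interference components can be modeled as phasors that sum at the microphone. This sketch is our own illustration with made-up amplitude ratios, not a measurement from the disclosure: the residual level relative to the leakage alone is 20·log10 of the magnitude of the phasor sum.

```python
import cmath
import math

def residual_level_db(amp_ratio: float, phase_diff_deg: float) -> float:
    """Level of (leakage + vibration) relative to the leakage signal
    alone, in dB.  The leakage is a unit phasor; the vibration signal
    has amplitude `amp_ratio` and phase `phase_diff_deg` relative to
    it.  Perfect cancellation needs amp_ratio = 1 and 180 degrees."""
    total = 1.0 + amp_ratio * cmath.exp(1j * math.radians(phase_diff_deg))
    return 20.0 * math.log10(abs(total))

# A vibration component at exactly opposite phase with about half the
# leakage amplitude already yields a ~6 dB reduction.
reduction_db = residual_level_db(0.5, 180.0)
```

The closer the two components get to equal magnitude and opposite phase, the deeper the cancellation becomes.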
  • a speaker (eg, speaker 130 ) in an acoustic device may be an air conduction speaker.
  • when the loudspeaker is an air conduction loudspeaker, the interference signal is a sound signal (i.e., a radiated sound field) emitted by the air conduction loudspeaker.
  • the target area may be an area with a minimum sound pressure level of the radiated sound field of the air-conducted loudspeaker.
  • the microphone array is arranged in the area with the minimum sound pressure level of the radiated sound field of the air conduction speaker, which can reduce the interference signal of the air conduction speaker picked up by the microphone array, and can also effectively solve the problem that a microphone array placed too far from the target spatial position cannot accurately estimate the sound field at the target spatial position.
  • FIGS. 14A and 14B are schematic diagrams of the sound field distribution of an air conduction speaker according to some embodiments of the present application.
  • the air conduction speakers may be disposed within the open acoustic device 1400 and radiate sound outward from two sound guiding holes (e.g., 1401 and 1402 in FIGS. 14A and 14B) of the open acoustic device 1400, and the emitted sound can form a dipole (represented by the "+" and "-" shown in FIGS. 14A and 14B).
  • the open acoustic device 1400 is positioned so that the line connecting the dipoles is approximately perpendicular to the user's face area.
  • the sound radiated by the dipole can form three stronger sound field regions (1421, 1422, and 1423).
  • the minimum sound pressure level region may also be called the region with low sound pressure.
  • the minimum sound pressure level region may refer to a region where the sound intensity output by the open acoustic device 1400 is relatively small.
  • the microphones 1430 in the microphone array may be positioned in the region of minimum sound pressure level.
  • the microphone 1430 in the microphone array can be set at the position where the dotted line in FIG. 14A intersects the housing of the open acoustic device 1400, so that the microphone 1430 can collect the external ambient noise while receiving as little of the sound signal from the air conduction speaker as possible, reducing the interference of the sound signal emitted by the air conduction speaker with the active noise reduction function of the open acoustic device 1400.
  • the open acoustic device 1400 is positioned so that the lines connecting the dipoles are approximately parallel to the user's face area.
  • the sound radiated by the dipole can form two stronger sound field regions (1424 and 1425).
  • a minimum sound pressure level region of the radiated sound field of the air conduction speaker may be formed, for example, the dashed line in FIG. 14B and its vicinity.
  • the microphones 1440 in the microphone array may be positioned in the region of minimum sound pressure level.
  • the microphone 1440 in the microphone array can be set at the position where the dotted line in FIG. 14B intersects the housing of the open acoustic device 1400, so that the microphone 1440 can collect the external ambient noise while receiving as little of the sound signal from the air conduction speaker as possible, reducing the interference of the sound signal emitted by the air conduction speaker with the active noise reduction function of the open acoustic device 1400.
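The dipole behavior described for FIGS. 14A and 14B can be sketched with the textbook far-field dipole directivity, where the pressure magnitude falls off as |cos θ|/r with θ measured from the dipole axis. This is a generic acoustics illustration, not the patent's simulation: the quiet region sits near the plane perpendicular to the line joining the "+" and "-" sources.

```python
import math

def dipole_pressure(r: float, theta_deg: float) -> float:
    """Far-field pressure magnitude of an ideal acoustic dipole with
    unit source strength: |p| ~ |cos(theta)| / r, where theta is the
    angle from the dipole axis (the line joining "+" and "-")."""
    return abs(math.cos(math.radians(theta_deg))) / r

# Sweeping directions at a fixed radius, the quietest direction is
# perpendicular to the dipole axis (theta = 90 deg), which matches the
# idea of placing microphones in the minimum sound pressure region.
angles = list(range(0, 181, 15))
quietest_angle = min(angles, key=lambda a: dipole_pressure(1.0, a))
```

In the real device the housing and the finite hole spacing distort this ideal pattern, which is why the figures show simulated distributions rather than the pure cos θ law.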
  • FIG. 15 is an exemplary flowchart of outputting a target signal based on a transfer function according to some embodiments of the present application. As shown in Figure 15, process 1500 may include:
  • in step 1510, the noise reduction signal is processed based on the transfer function.
  • this step may be performed by the processor 120 (eg, the amplitude and phase compensation unit 230).
  • the processor 120 (e.g., the amplitude and phase compensation unit 230) may transmit the processed noise reduction signal to the speaker (e.g., the speaker 130), and the speaker may output the target signal based on the noise reduction signal generated by the processor 120.
  • the target signal output by the speaker can be transmitted to a specific position in the user's ear (also referred to as a noise cancellation position) through a first sound path, and the ambient noise can be transmitted to the specific position through a second sound path; at the specific position, the target signal and the ambient noise cancel each other out, so that the user cannot perceive the ambient noise or perceives only relatively weak ambient noise.
  • the specific location where the target signal and the ambient noise cancel each other out may be the user's ear canal or its vicinity, eg, the target spatial location.
  • the first sound path may be the path through which the target signal is transmitted from the air conduction speaker to the target spatial position through the air
  • the second sound path may be the path through which the ambient noise is transmitted from the noise source to the target spatial position.
  • the specific position where the target signal and the ambient noise cancel each other may be at the user's basilar membrane.
  • the first sound path may be the path of the target signal from the bone conduction speaker, through the user's bones or tissue, to the user's basilar membrane
  • the second sound path may be the path of the ambient noise from the noise source, through the user's ear canal and tympanic membrane, to the user's basilar membrane.
  • the speaker (eg, speaker 130) may be positioned near and not obstructing the user's ear canal, so that the speaker is at a distance from the noise cancellation location (eg, target spatial location, basilar membrane). Therefore, when the target signal output by the speaker is delivered to the noise cancellation position, the phase information and amplitude information of the target signal may change. As a result, it may occur that the target signal output by the speaker cannot achieve the effect of reducing the ambient noise signal, or even enhance the ambient noise, thereby causing the active noise reduction function of the acoustic device (eg, the open acoustic output device 100 ) to fail to achieve.
  • the processor 120 can obtain the transfer function of the target signal emitted from the speaker to the noise canceling position.
  • the transfer functions may include a first transfer function and a second transfer function.
  • the first transfer function may represent the change (eg, change in amplitude, change in phase) of a parameter of the target signal with the sound path (ie, the first sound path) from the loudspeaker to the noise cancellation position.
  • the target signal emitted by the bone conduction speaker is the bone conduction signal
  • the position where the target signal emitted by the bone conduction speaker and the ambient noise cancel each other is the user's basilar membrane.
  • the first transfer function may represent the change in the parameters (e.g., phase, amplitude) of the target signal as it travels from the bone conduction speaker to the user's basilar membrane.
  • when the speaker is a bone conduction speaker, the first transfer function can be obtained experimentally.
  • the bone conduction speaker outputs the target signal, and at the same time, an air conduction sound signal with the same frequency as the target signal is played near the ear canal of the user, and the cancellation effect of the target signal and the air conduction sound signal is observed.
  • when the target signal and the air conduction sound signal cancel each other, the first transfer function of the bone conduction speaker can be obtained based on the air conduction sound signal and the target signal output by the bone conduction speaker.
  • when the speaker is an air conduction speaker, the target signal emitted by the air conduction speaker is an air conduction sound signal.
  • the first transfer function can be obtained by simulation and calculation of an acoustic diffusion field.
  • the sound field of the target signal emitted by the air-conducting speaker can be simulated by using the acoustic diffusion field, and the first transfer function of the air-conducting speaker can be calculated based on the sound field.
  • the second transfer function may represent a change in a parameter of the ambient noise (eg, change in amplitude, change in phase) from a target spatial location to a location where the target signal and ambient noise cancel.
  • the second transfer function may represent the change in the parameters of the ambient noise from the target spatial location to the basilar membrane of the user.
  • the second transfer function may be obtained by acoustic diffuse field simulation and calculation.
  • a sound field of ambient noise can be simulated using an acoustic diffuse field, and a second transfer function can be calculated based on the sound field.
  • the transfer function may include a phase transfer function and an amplitude transfer function.
  • both the phase transfer function and the magnitude transfer function can be obtained by the above method.
  • the processor 120 may process the noise reduction signal based on the obtained transfer function. In some embodiments, the processor 120 may adjust the amplitude and phase of the noise reduction signal based on the obtained transfer function. In some embodiments, the processor 120 may adjust the phase of the noise reduction signal based on the obtained phase transfer function and the amplitude of the noise reduction signal based on the magnitude transfer function.
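In the frequency domain, applying a phase transfer function and an amplitude transfer function amounts to a per-bin complex multiplication. The following sketch is our own minimal illustration of that bookkeeping; the gain and phase values are hypothetical, not measured transfer functions.

```python
import cmath
import math

def apply_transfer(spectrum, gains, phases_deg):
    """Compensate a noise-reduction-signal spectrum bin by bin: each
    bin is scaled by the amplitude transfer function (`gains`) and
    rotated by the phase transfer function (`phases_deg`)."""
    return [x * g * cmath.exp(1j * math.radians(p))
            for x, g, p in zip(spectrum, gains, phases_deg)]

# Example with made-up values: halve the amplitude of the first bin
# and invert the phase of the second bin.
spectrum = [1.0 + 0.0j, 0.0 + 1.0j]
adjusted = apply_transfer(spectrum, gains=[0.5, 1.0],
                          phases_deg=[0.0, 180.0])
```

A real implementation would derive the per-bin gains and phases from the measured or simulated first and second transfer functions described above.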
  • in step 1520, the target signal is output according to the processed noise reduction signal. In some embodiments, this step may be performed by the speaker 130.
  • the speaker 130 may output a target signal based on the noise reduction signal processed in step 1510, so that when the target signal output by the speaker 130 based on the processed noise reduction signal is transmitted to the position where it cancels out the ambient noise, the phase and amplitude of the target signal and the ambient noise satisfy certain conditions.
  • the phase difference between the phase of the target signal and the phase of the ambient noise may be less than or equal to a certain phase threshold.
  • the phase threshold may be in the range of 90-180 degrees. The phase threshold can be adjusted within this range according to the user's needs.
  • when the user does not want to be disturbed by the sound of the surrounding environment, the phase threshold may be a larger value, such as 180 degrees; that is, the phase of the target signal is opposite to that of the ambient noise.
  • when the user wants to remain sensitive to the surrounding environment, the phase threshold may be a smaller value, such as 90 degrees. It should be noted that the more ambient sound the user wishes to receive, the closer the phase threshold may be to 90 degrees, and the less ambient sound the user wishes to receive, the closer the phase threshold may be to 180 degrees.
  • the amplitude difference between the amplitude of the ambient noise and the amplitude of the target signal may be less than or equal to a certain amplitude value threshold.
  • when the user does not want to be disturbed by the sound of the surrounding environment, the amplitude threshold may be a small value, such as 0 dB; that is, the amplitude of the target signal is equal to the amplitude of the ambient noise.
  • when the user wants to remain sensitive to the surrounding environment, the amplitude threshold may be a larger value, for example, approximately equal to the amplitude of the ambient noise.
  • the more ambient sound the user wishes to receive, the closer the amplitude threshold can be to the amplitude of the ambient noise, and the less ambient sound the user wishes to receive, the closer the amplitude threshold can be to 0 dB.
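The effect of the phase threshold can be checked with the superposition of two tones at the cancellation point. This is a generic destructive-interference illustration, not a measurement from the disclosure: the residual amplitude of noise plus target signal is sqrt(An² + At² + 2·An·At·cos Δφ).

```python
import math

def residual_amplitude(noise_amp: float, target_amp: float,
                       phase_diff_deg: float) -> float:
    """Amplitude of the superposition of the ambient noise and the
    target signal at the cancellation position; phase_diff_deg is the
    phase difference between them (180 deg = exactly out of phase)."""
    d = math.radians(phase_diff_deg)
    return math.sqrt(noise_amp ** 2 + target_amp ** 2
                     + 2.0 * noise_amp * target_amp * math.cos(d))

# Opposite phase and equal amplitude: complete cancellation.
full_cancel = residual_amplitude(1.0, 1.0, 180.0)
# At the 90-degree end of the threshold range, much of the ambient
# sound remains audible (the residual can even exceed the noise alone).
partial = residual_amplitude(1.0, 1.0, 90.0)
```

This is why a phase threshold near 180 degrees (with matched amplitudes) suppresses the environment, while a threshold near 90 degrees lets more of it through.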
  • in this way, the purpose of reducing the ambient noise is achieved, the active noise reduction function of the acoustic device (e.g., the acoustic output device 100) is realized, and the user's listening experience is improved.
  • process 1500 is only for example and illustration, and does not limit the scope of application of this specification.
  • process 1500 may also include the step of obtaining a transfer function.
  • steps 1510 and 1520 may be combined into one step. Such modifications and changes are still within the scope of this application.
  • FIG. 16 is an exemplary flowchart for estimating noise at a spatial location of a target, provided according to some embodiments of the present specification. As shown in Figure 16, process 1600 may include:
  • in step 1610, components associated with the signal picked up by the bone conduction microphone are removed from the picked-up ambient noise in order to update the ambient noise.
  • this step may be performed by processor 120 .
  • when the user wearing the acoustic device speaks, the microphone array (e.g., the microphone array 110) also picks up the user's own speaking voice; that is, the user's own speaking voice is also regarded as part of the ambient noise.
  • the target signal output by the speaker (e.g., the speaker 130) would therefore also cancel the user's own voice. However, in some scenarios the user's own voice needs to be preserved, for example, when the user makes a voice call or sends a voice message.
  • the acoustic device (e.g., the acoustic device 100) may include a bone conduction microphone. The bone conduction microphone may pick up the vibration signal generated by the facial bones or muscles when the user speaks, so as to pick up the voice signal of the user speaking, and transmit it to the processor 120.
  • the processor 120 obtains parameter information from the sound signal picked up by the bone conduction microphone, and removes sound signal components associated with the sound signal picked up by the bone conduction microphone from the ambient noise picked up by the microphone array (eg, the microphone array 110).
  • the processor 120 updates the ambient noise according to the remaining parameter information of the ambient noise.
  • the updated ambient noise no longer includes the sound signal of the user's own speech; that is, the user's own speech can still be heard when the user makes a voice call.
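One standard way to implement the removal in step 1610 is an adaptive filter that subtracts whatever part of the array signal is linearly predictable from the bone conduction microphone's signal. The sketch below is a generic least-mean-squares (LMS) canceller under our own assumptions — the patent does not specify the algorithm — and the error output plays the role of the updated ambient noise.

```python
import math

def lms_cancel(primary, reference, mu=0.05, n_taps=4):
    """Remove from `primary` (ambient noise picked up by the array) the
    component correlated with `reference` (the bone conduction
    microphone signal) using an LMS adaptive FIR filter.  Returns the
    error signal, i.e., the updated ambient noise."""
    weights = [0.0] * n_taps
    history = [0.0] * n_taps           # most recent reference samples
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate           # residual after voice removal
        weights = [w + 2.0 * mu * error * h
                   for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned

# Toy scenario: ambient noise plus a scaled copy of a "voice" reference.
voice = [math.sin(0.3 * n) for n in range(400)]
ambient = [0.2 * math.sin(0.05 * n) for n in range(400)]
mixed = [a + 0.8 * v for a, v in zip(ambient, voice)]
updated_noise = lms_cancel(mixed, voice)
```

After convergence the error output tracks the ambient component while the voice term is largely removed; a real implementation would additionally use the parameter information extracted by the processor 120 as described above.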
  • step 1620 the noise of the target spatial location is estimated according to the updated ambient noise. In some embodiments, this step may be performed by processor 120 . Step 1620 may be performed in a similar manner to step 320, and the related description is not repeated here.
  • process 1600 is only for example and description, and does not limit the scope of application of the present application.
  • those skilled in the art may make various modifications and changes to process 1600 under the guidance of the present application.
  • components associated with the signal picked up by the bone conduction microphone may also be preprocessed, and the signal picked up by the bone conduction microphone may be transmitted to a terminal device as an audio signal. Such modifications and changes are still within the scope of this application.
  • references such as “one embodiment,” “an embodiment,” and/or “some embodiments” mean a certain feature, structure, or characteristic associated with at least one embodiment of the present application.
  • references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this application are not necessarily referring to the same embodiment.
  • certain features, structures or characteristics of the one or more embodiments of the present application may be combined as appropriate.
  • aspects of this application may be illustrated and described in several patentable categories or situations, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, various aspects of the present application may be performed entirely by hardware, entirely by software (including firmware, resident software, microcode, etc.), or by a combination of hardware and software.
  • the above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system."
  • aspects of the present application may be embodied as a computer product comprising computer readable program code embodied in one or more computer readable media.
  • a computer storage medium may contain a propagated data signal with the computer program code embodied therein, for example, on baseband or as part of a carrier wave.
  • the propagating signal may take a variety of manifestations, including electromagnetic, optical, etc., or a suitable combination.
  • a computer storage medium may be any computer-readable medium, other than a computer-readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code on a computer storage medium may be transmitted over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or a combination of any of the foregoing.

Abstract

The present application discloses an acoustic device. The acoustic device may comprise a microphone array, a processor, and at least one loudspeaker. The microphone array may be configured to pick up ambient noise. The processor may be configured to estimate a sound field of a target spatial position by using the microphone array. The target spatial position may be closer to the auditory meatus of a user than any microphone in the microphone array. The processor may be further configured to generate a noise reduction signal on the basis of the picked-up ambient noise and the sound field estimate of the target spatial position. The at least one loudspeaker may be configured to output a target signal according to the noise reduction signal. The target signal may be used for reducing the ambient noise. The microphone array may be provided in a target area so that interference signals from the at least one loudspeaker to the microphone array are minimized.

Description

声学装置acoustic device
交叉引用cross reference
本申请要求于2021年4月25日提交的申请号为PCT/CN2021/089670的国际申请的优先权,其全部内容通过引用结合于此。This application claims priority to International Application No. PCT/CN2021/089670, filed on April 25, 2021, the entire contents of which are incorporated herein by reference.
技术领域technical field
本申请涉及声学领域,特别涉及一种声学装置。The present application relates to the field of acoustics, and in particular, to an acoustic device.
背景技术Background technique
声学装置允许用户在收听音频内容、进行语音通话的同时保证用户交互内容的私密性,且收听时不打扰到周围人群。声学装置通常可以分为入耳式声学装置和开放式声学装置两类。入耳式声学装置在使用过程中会堵塞用户耳部,且用户在长时间佩戴时容易产生堵塞、异物、胀痛等感受。开放式声学装置可以开放用户耳部,有利于长期佩戴,但当外界噪声较大时,其降噪效果不明显,降低得用户听觉体验。The acoustic device allows users to listen to audio content and make voice calls while ensuring the privacy of user interaction content, and does not disturb surrounding people when listening. Acoustic devices can generally be divided into two categories: in-ear acoustic devices and open acoustic devices. The in-ear acoustic device may block the user's ear during use, and the user is prone to experience blockage, foreign body, swelling and pain when wearing it for a long time. The open acoustic device can open the user's ear, which is conducive to long-term wearing, but when the external noise is large, its noise reduction effect is not obvious, which reduces the user's listening experience.
因此,希望提供一种声学装置,可以开放用户双耳以及提高用户听觉体验。Therefore, it is desirable to provide an acoustic device that can open the user's ears and improve the user's hearing experience.
发明内容SUMMARY OF THE INVENTION
本申请实施例之一提供一种声学装置。该声学装置可以包括麦克风阵列、处理器和至少一个扬声器。所述麦克风阵列可以被配置为拾取环境噪声。所述处理器可以被配置为利用所述麦克风阵列对目标空间位置的声场进行估计。所述目标空间位置可以比所述麦克风阵列中任一麦克风更加靠近用户耳道。所述处理器可以进一步被配置为基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号。所述至少一个扬声器可以被配置为根据所述降噪信号输出目标信号。所述目标信号可以用于降低所述环境噪声。所述麦克风阵列可以设置在目标区域以使所述麦克风阵列受来自所述至少一个扬声器的干扰信号最小。One of the embodiments of the present application provides an acoustic device. The acoustic device may include a microphone array, a processor and at least one speaker. The microphone array may be configured to pick up ambient noise. The processor may be configured to estimate a sound field at a target spatial location using the microphone array. The target spatial location may be closer to the user's ear canal than any microphone in the microphone array. The processor may be further configured to generate a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location. The at least one speaker may be configured to output a target signal according to the noise reduction signal. The target signal can be used to reduce the ambient noise. The microphone array may be positioned in a target area to minimize interference signal exposure to the microphone array from the at least one loudspeaker.
在一些实施例中,所述基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号可以包括基于所述拾取的环境噪声估计所述目标空间位置的噪声以及基于所述目标空间位置的噪声和所述目标空间位置的声场估计生成所述降噪信号。In some embodiments, the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location may include estimating noise at the target spatial location based on the picked-up ambient noise and based on the target The noise at the spatial location and the sound field estimate at the spatial location of the target generate the noise reduction signal.
在一些实施例中,所述声学装置可以进一步包括一个或多个传感器,用于获取所述声学装置的运动信息。所述处理器可以进一步被配置为基于所述运动信息更新所述 目标空间位置的噪声和所述目标空间位置的声场估计以及基于所述更新后的目标空间位置的噪声和所述更新后的目标空间位置的声场估计生成所述降噪信号。In some embodiments, the acoustic device may further include one or more sensors for acquiring motion information of the acoustic device. The processor may be further configured to update the noise of the target spatial position and the sound field estimate of the target spatial position based on the motion information and the noise based on the updated target spatial position and the updated target The sound field estimation of the spatial location generates the noise reduction signal.
在一些实施例中,所述基于所述拾取的环境噪声估计所述目标空间位置的噪声可以包括确定一个或多个与所述拾取的环境噪声有关的空间噪声源以及基于所述空间噪声源,估计所述目标空间位置的噪声。In some embodiments, the estimating noise at the target spatial location based on the picked-up ambient noise may include determining one or more spatial noise sources associated with the picked-up ambient noise and based on the spatial noise sources, Estimate the noise at the spatial location of the object.
在一些实施例中,所述利用所述麦克风阵列对目标空间位置的声场进行估计可以包括基于所述麦克风阵列构建虚拟麦克风,所述虚拟麦克风包括数学模型或机器学习模型,用于表示若所述目标空间位置处包括麦克风后所述麦克风采集的音频数据,以及基于所述虚拟麦克风对所述目标空间位置的声场进行估计。In some embodiments, using the microphone array to estimate the sound field of the target spatial location may include constructing a virtual microphone based on the microphone array, the virtual microphone including a mathematical model or a machine learning model for representing if the The target spatial position includes audio data collected by the microphone after the microphone, and the sound field of the target spatial position is estimated based on the virtual microphone.
在一些实施例中,所述基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号可以包括基于所述虚拟麦克风估计所述目标空间位置的噪声以及基于所述目标空间位置的噪声和所述目标空间位置的声场估计生成所述降噪信号。In some embodiments, the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimation of the target spatial location may include estimating noise at the target spatial location based on the virtual microphone and estimating noise at the target spatial location based on the target spatial location The noise reduction signal is generated by an estimate of the noise and the sound field of the target spatial location.
在一些实施例中,所述至少一个扬声器可以是骨导扬声器。所述干扰信号可以包括所述骨导扬声器的漏音信号和振动信号。所述目标区域可以为传递到所述麦克风阵列的所述骨导扬声器的所述漏音信号和所述振动信号的总能量最小的区域。In some embodiments, the at least one speaker may be a bone conduction speaker. The interference signal may include a sound leakage signal and a vibration signal of the bone conduction speaker. The target area may be an area where the total energy of the leakage signal and the vibration signal transmitted to the bone conduction speaker of the microphone array is the smallest.
在一些实施例中,所述目标区域的位置可以与所述麦克风阵列中的麦克风的振膜的朝向有关。所述麦克风的振膜的朝向可以降低所述麦克风接收到的所述骨导扬声器的所述振动信号的大小。所述麦克风的振膜的朝向可以使得所述麦克风接收到的所述骨导扬声器的所述振动信号与所述麦克风接收到的所述骨导扬声器的所述漏音信号至少部分互相抵消。所述麦克风接收到的所述骨导扬声器的所述振动信号可以降低所述麦克风接收到的所述骨导扬声器的所述漏音信号5-6dB。In some embodiments, the location of the target area may be related to the orientation of the diaphragms of the microphones in the microphone array. The orientation of the diaphragm of the microphone can reduce the magnitude of the vibration signal of the bone conduction speaker received by the microphone. The diaphragm of the microphone is oriented such that the vibration signal of the bone conduction speaker received by the microphone and the sound leakage signal of the bone conduction speaker received by the microphone at least partially cancel each other. The vibration signal of the bone conduction speaker received by the microphone can reduce the sound leakage signal of the bone conduction speaker received by the microphone by 5-6 dB.
在一些实施例中,所述至少一个扬声器可以是气导扬声器。所述目标区域可以为所述气导扬声器的辐射声场的声压级最小区域。In some embodiments, the at least one speaker may be an air conduction speaker. The target area may be a minimum area of the sound pressure level of the radiated sound field of the air conduction speaker.
在一些实施例中,所述处理器可以进一步被配置为基于传递函数处理所述降噪信号。所述传递函数可以包括第一传递函数和第二传递函数。所述第一传递函数可以表示从所述至少一个扬声器发出到所述目标信号和所述环境噪声抵消的位置所述目标信号的参数的变化。所述第二传递函数可以表示从所述目标空间位置到所述目标信号和所述环境噪声抵消的位置所述环境噪声的参数的变化。所述至少一个扬声器可以进一步被配置为根据所述处理后的降噪信号输出所述目标信号。In some embodiments, the processor may be further configured to process the noise reduction signal based on a transfer function. The transfer function may include a first transfer function and a second transfer function. The first transfer function may represent a change in a parameter of the target signal from the at least one loudspeaker to a location where the target signal and the ambient noise cancel. The second transfer function may represent a change in a parameter of the ambient noise from the target spatial location to a location where the target signal and the ambient noise cancel. The at least one speaker may be further configured to output the target signal based on the processed noise reduction signal.
在一些实施例中，所述基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号可以包括将所述拾取的环境噪声划分为多个频带，所述多个频带对应不同的频率范围，以及对于所述多个频带中的至少一个，生成与所述至少一个频带中的每一个对应的降噪信号。In some embodiments, the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimation of the target spatial location may include dividing the picked-up ambient noise into multiple frequency bands corresponding to different frequency ranges, and, for at least one of the multiple frequency bands, generating a noise reduction signal corresponding to each of the at least one frequency band.
在一些实施例中,所述处理器可以进一步被配置为基于所述目标空间位置的声场估计对所述目标空间位置的噪声进行幅度和相位调整以生成所述降噪信号。In some embodiments, the processor may be further configured to make amplitude and phase adjustments to noise at the target spatial location based on the sound field estimate of the target spatial location to generate the noise reduction signal.
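A minimal sketch of the band-wise processing described above, assuming hypothetical per-band gains that stand in for the amplitude adjustment derived from the sound field estimate (the function and parameter names are illustrative, not taken from this application):

```python
import numpy as np

def banded_noise_reduction(noise, sample_rate, band_edges, band_gains):
    """Split the picked-up noise into frequency bands and build an
    anti-noise signal band by band.

    band_edges -- list of (low_hz, high_hz) tuples defining the bands
    band_gains -- hypothetical per-band amplitude adjustments
    """
    spectrum = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(len(noise), d=1.0 / sample_rate)
    shaped = np.zeros_like(spectrum)
    for (low, high), gain in zip(band_edges, band_gains):
        in_band = (freqs >= low) & (freqs < high)
        shaped[in_band] = spectrum[in_band] * gain  # per-band amplitude
    # Phase inversion: the summed band signals oppose the original noise.
    return np.fft.irfft(-shaped, n=len(noise))
```

With a single band covering the full spectrum and unit gain, the output is simply the phase-inverted noise; narrower bands allow each frequency range to be adjusted independently.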
在一些实施例中,所述声学装置可以进一步包括固定结构,被配置为将所述声学装置固定在用户耳朵附近且不堵塞用户耳道的位置。In some embodiments, the acoustic device may further include a securing structure configured to secure the acoustic device in a position adjacent to the user's ear without blocking the user's ear canal.
在一些实施例中,所声学装置可以进一步包括壳体结构,被配置为承载或容纳所述麦克风阵列、所述处理器和所述至少一个扬声器。In some embodiments, the acoustic device may further include a housing structure configured to carry or house the microphone array, the processor, and the at least one speaker.
本申请实施例之一提供一种降噪方法。所述降噪方法可以包括由麦克风阵列拾取环境噪声。所述降噪方法可以包括由处理器利用所述麦克风阵列对目标空间位置的声场进行估计。所述目标空间位置可以比所述麦克风阵列中任一麦克风更加靠近用户耳道。所述降噪方法可以包括基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号。所述降噪方法可以进一步包括由至少一个扬声器，根据所述降噪信号输出目标信号。所述目标信号可以用于降低所述环境噪声。所述麦克风阵列可以设置在目标区域以使所述麦克风阵列受来自所述至少一个扬声器的干扰信号最小。One of the embodiments of the present application provides a noise reduction method. The noise reduction method may include picking up ambient noise by a microphone array. The noise reduction method may include estimating, by a processor, a sound field of a target spatial location using the microphone array. The target spatial location may be closer to the user's ear canal than any microphone in the microphone array. The noise reduction method may include generating a noise reduction signal based on the picked-up ambient noise and a sound field estimate of the target spatial location. The noise reduction method may further include outputting, by at least one speaker, a target signal according to the noise reduction signal. The target signal can be used to reduce the ambient noise. The microphone array may be arranged in a target area so that the interference signal received by the microphone array from the at least one speaker is minimized.
本申请的一部分附加特性可以在下面的描述中进行说明。通过对以下描述和相应附图的研究或者对实施例的生产或操作的了解,本申请的一部分附加特性对于本领域技术人员是明显的。本申请的特征可以通过实践或使用以下详细实例中阐述的方法、工具和组合的各个方面来实现和获得。Some of the additional features of the present application may be illustrated in the following description. Some of the additional features of the present application will become apparent to those skilled in the art from a study of the following description and the corresponding drawings, or from a knowledge of the production or operation of the embodiments. The features of the present application can be implemented and obtained by practicing or using various aspects of the methods, tools and combinations set forth in the following detailed examples.
附图说明Description of drawings
本申请将以示例性实施例的方式进一步说明,这些示例性实施例将通过附图进行详细描述。这些实施例并非限制性的,在这些实施例中,相同的编号表示相同的结构,其中:The present application will be further described by way of exemplary embodiments, which will be described in detail with reference to the accompanying drawings. These examples are not limiting, and in these examples, the same numbers refer to the same structures, wherein:
图1是根据本申请的一些实施例所示的示例性声学装置的结构示意图;FIG. 1 is a schematic structural diagram of an exemplary acoustic device according to some embodiments of the present application;
图2是根据本申请的一些实施例所示的示例性处理器的结构示意图;FIG. 2 is a schematic structural diagram of an exemplary processor according to some embodiments of the present application;
图3是根据本申请的一些实施例所示的声学装置的示例性降噪流程图;3 is an exemplary noise reduction flowchart of an acoustic device according to some embodiments of the present application;
图4是根据本申请的一些实施例所示的声学装置的示例性降噪流程图;FIG. 4 is an exemplary noise reduction flowchart of an acoustic device according to some embodiments of the present application;
图5A-D是根据本申请一些实施例所示的麦克风阵列的示例性排布方式的示意图；FIGS. 5A-D are schematic diagrams of exemplary arrangements of microphone arrays according to some embodiments of the present application;
图6A-B是根据本申请一些实施例所示的麦克风阵列的示例性排布方式的示意图;6A-B are schematic diagrams of exemplary arrangements of microphone arrays according to some embodiments of the present application;
图7是根据本申请一些实施例所示的估计目标空间位置的噪声的示例性流程图；FIG. 7 is an exemplary flowchart of estimating noise at a target spatial position according to some embodiments of the present application;
图8是根据本申请一些实施例所示的估计目标空间位置的噪声的示意图；FIG. 8 is a schematic diagram of estimating noise at a target spatial position according to some embodiments of the present application;
图9是根据本申请一些实施例所示的估计目标空间位置的声场和噪声的示例性流程图;FIG. 9 is an exemplary flowchart of estimating the sound field and noise of a target spatial position according to some embodiments of the present application;
图10是根据本申请一些实施例所示的构建虚拟麦克风的示意图;10 is a schematic diagram of constructing a virtual microphone according to some embodiments of the present application;
图11是根据本申请一些实施例所示的骨导扬声器在1000Hz时的三维声场漏音信号分布示意图;FIG. 11 is a schematic diagram of the sound leakage signal distribution of the three-dimensional sound field at 1000 Hz of the bone conduction speaker according to some embodiments of the present application;
图12是根据本申请一些实施例所示的骨导扬声器在1000Hz时的二维声场漏音信号分布示意图;FIG. 12 is a schematic diagram illustrating the distribution of sound leakage signals in a two-dimensional sound field at 1000 Hz of a bone conduction speaker according to some embodiments of the present application;
图13是根据本申请一些实施例所示的骨导扬声器的振动信号和漏音信号的总信号的频率响应示意图;13 is a schematic diagram of the frequency response of the total signal of the vibration signal and the sound leakage signal of the bone conduction speaker according to some embodiments of the present application;
图14A-B是根据本申请一些实施例所示的气导扬声器的声场分布示意图;14A-B are schematic diagrams of sound field distribution of an air conduction speaker according to some embodiments of the present application;
图15是根据本申请一些实施例所示的基于传递函数输出目标信号的示例性流程图;以及FIG. 15 is an exemplary flowchart of outputting a target signal based on a transfer function according to some embodiments of the present application; and
图16是根据本申请一些实施例所示的估计目标空间位置的噪声的示例性流程图。FIG. 16 is an exemplary flowchart of estimating noise at a target spatial position according to some embodiments of the present application.
具体实施方式Detailed Description
为了更清楚地说明本申请实施例的技术方案，下面将对实施例描述中所需要使用的附图作简单的介绍。显而易见地，下面描述中的附图仅仅是本申请的一些示例或实施例，对于本领域的普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据这些附图将本申请应用于其它类似情景。除非从语言环境中显而易见或另做说明，图中相同标号代表相同结构或操作。In order to illustrate the technical solutions of the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings used in the description of the embodiments. Obviously, the accompanying drawings in the following description are only some examples or embodiments of the present application, and those of ordinary skill in the art can also apply the present application to other similar scenarios according to these drawings without any creative effort. Unless obvious from the context or otherwise specified, the same reference numbers in the figures represent the same structure or operation.
应当理解，本文使用的“系统”、“装置”、“单元”和/或“模组”是用于区分不同级别的不同组件、元件、部件、部分或装配的一种方法。然而，如果其他词语可实现相同的目的，则可通过其他表达来替换所述词语。It should be understood that the terms "system", "device", "unit" and/or "module" as used herein are a way to distinguish different components, elements, parts, sections or assemblies at different levels. However, these words may be replaced by other expressions if they serve the same purpose.
如本申请和权利要求书中所示，除非上下文明确提示例外情形，“一”、“一个”、“一种”和/或“该”等词并非特指单数，也可包括复数。一般说来，术语“包括”与“包含”仅提示包括已明确标识的步骤和元素，而这些步骤和元素不构成一个排它性的罗列，方法或者设备也可能包含其它的步骤或元素。As shown in this application and in the claims, unless the context clearly indicates otherwise, the words "a", "an" and/or "the" do not refer specifically to the singular and may also include the plural. Generally speaking, the terms "include" and "comprise" only imply that the clearly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and the method or apparatus may also include other steps or elements.
本申请中使用了流程图用来说明根据本申请的实施例的系统所执行的操作。应当理解的是，前面或后面操作不一定按照顺序来精确地执行。相反，可以按照倒序或同时处理各个步骤。同时，也可以将其他操作添加到这些过程中，或从这些过程移除某一步或数步操作。Flowcharts are used in this application to illustrate operations performed by a system according to the embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed exactly in order. Instead, the various steps may be processed in reverse order or simultaneously. At the same time, other operations may be added to these processes, or one or more steps may be removed from these processes.
开放式声学装置（例如开放式声学耳机）是一种可以开放用户耳部的声学设备。开放式声学装置可以通过固定结构（例如，耳挂、头挂、眼镜脚等）将扬声器固定于用户耳朵附近且不堵塞用户耳道的位置。当用户使用开放式声学装置时，外界环境噪音也可以被用户听到，这就使得用户的听觉体验较差。例如，在外界环境噪音较大的场所（例如，街道、景区等），用户在使用开放式声学装置进行音乐播放时，外界环境的噪音会直接进入用户耳道，使得用户听到较大的环境噪音，环境噪音会干扰用户的听音乐体验。又例如，当用户佩戴开放式声学装置进行通话时，麦克风不仅会拾取用户自身的说话声音，也会拾取环境噪音，使得用户通话体验较差。An open acoustic device (eg, an open acoustic earphone) is an acoustic device that can leave the user's ear open. The open acoustic device can fix the speaker at a position near the user's ear without blocking the user's ear canal through a fixing structure (eg, ear hook, head hook, glasses temple, etc.). When the user uses the open acoustic device, external environmental noise can also be heard by the user, which makes the user's hearing experience poor. For example, in places with loud ambient noise (eg, streets, scenic spots, etc.), when a user uses an open acoustic device to play music, the noise of the external environment will directly enter the user's ear canal, so that the user hears loud ambient noise, which interferes with the user's music listening experience. For another example, when a user wears an open acoustic device to make a call, the microphone not only picks up the user's own speaking voice, but also picks up ambient noise, which makes the user's call experience poor.
基于上述问题，本申请实施例中提供一种声学装置。该声学装置可以包括麦克风阵列、处理器以及至少一个扬声器。麦克风阵列可以被配置为拾取环境噪声。处理器可以被配置为利用麦克风阵列对目标空间位置的声场进行估计。目标空间位置可以比麦克风阵列中任一麦克风更加靠近用户耳道。可以理解的是，麦克风阵列中的各麦克风可以分布于用户耳道附近的不同位置，利用麦克风阵列中的各麦克风来估计靠近用户耳道位置处（例如，目标空间位置）的声场。处理器可以进一步被配置为基于拾取的环境噪声和目标空间位置的声场估计生成降噪信号。至少一个扬声器可以被配置为根据降噪信号输出目标信号。该目标信号可以用于降低环境噪声。另外，麦克风阵列可以设置在目标区域以使麦克风阵列受来自至少一个扬声器的干扰信号最小。当至少一个扬声器是骨导扬声器时，干扰信号可以包括骨导扬声器的漏音信号和振动信号，目标区域可以为传递到麦克风阵列的骨导扬声器的漏音信号和振动信号的总能量最小的区域。当至少一个扬声器是气导扬声器时，目标区域可以为气导扬声器的辐射声场的声压级最小区域。Based on the above problems, an acoustic device is provided in the embodiments of the present application. The acoustic device may include a microphone array, a processor, and at least one speaker. The microphone array can be configured to pick up ambient noise. The processor may be configured to estimate the sound field of the target spatial location using the microphone array. The target spatial location may be closer to the user's ear canal than any microphone in the microphone array. It can be understood that each microphone in the microphone array may be distributed in different positions near the user's ear canal, and each microphone in the microphone array is used to estimate the sound field near the user's ear canal position (eg, a target spatial position). The processor may be further configured to generate a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location. The at least one speaker may be configured to output the target signal according to the noise reduction signal. The target signal can be used to reduce ambient noise. Additionally, the microphone array may be positioned in the target area to minimize the microphone array's exposure to interfering signals from the at least one loudspeaker. When the at least one speaker is a bone conduction speaker, the interference signal may include a sound leakage signal and a vibration signal of the bone conduction speaker, and the target area may be the area where the total energy of the sound leakage signal and the vibration signal of the bone conduction speaker transmitted to the microphone array is the smallest. When the at least one loudspeaker is an air conduction loudspeaker, the target area may be the area of minimum sound pressure level of the radiated sound field of the air conduction loudspeaker.
在本申请的实施例中，通过上述设置利用至少一个扬声器输出的目标信号降低用户耳道（例如，目标空间位置）处的环境噪声，实现了声学装置的主动降噪，提高了用户在使用该声学装置过程中的听觉体验。In the embodiments of the present application, the target signal output by the at least one speaker is used, through the above arrangement, to reduce the ambient noise at the user's ear canal (for example, at the target spatial position), thereby realizing active noise reduction of the acoustic device and improving the user's listening experience while using the acoustic device.
进一步,在本申请的实施例中,麦克风阵列(也可以称为前馈麦克风)可以同时实现对环境噪声的拾取和对用户耳道(例如,目标空间位置)处声场的估计。Further, in the embodiment of the present application, the microphone array (also referred to as a feedforward microphone) can simultaneously realize the pickup of ambient noise and the estimation of the sound field at the user's ear canal (eg, the target spatial position).
另外，在本申请的实施例中，麦克风阵列设置在目标区域，减少或避免了麦克风阵列拾取至少一个扬声器发出的干扰信号（例如，目标信号），从而保障了开放式声学装置的主动降噪的实现。In addition, in the embodiments of the present application, the microphone array is arranged in the target area, which reduces or prevents the microphone array from picking up the interference signal (for example, the target signal) emitted by the at least one speaker, thereby ensuring that the active noise reduction of the open acoustic device can be achieved.
图1是根据本申请的一些实施例所示的示例性声学装置100的结构示意图。在一些实施例中,声学装置100可以为开放式的声学装置。如图1所示,声学装置100可以包括麦克风阵列110、处理器120和扬声器130。在一些实施例中,麦克风阵列110可以拾取环境噪声,并将拾取到的环境噪声转换为电信号传递至处理器120进行处理。处理器120可以耦接(例如,电连接)麦克风阵列110和扬声器130。处理器120可以接收麦克风阵列110传递的电信号并对其进行处理以生成降噪信号并将生成的降噪信号传递至扬声器130。扬声器130可以根据降噪信号输出目标信号。该目标信号可以用于降低或抵消用户耳道位置处(例如,目标空间位置)的环境噪声,从而实现声学装置100的主动降噪,提高用户在使用声学装置100过程中的听觉体验。FIG. 1 is a schematic structural diagram of an exemplary acoustic device 100 according to some embodiments of the present application. In some embodiments, the acoustic device 100 may be an open acoustic device. As shown in FIG. 1 , the acoustic device 100 may include a microphone array 110 , a processor 120 and a speaker 130 . In some embodiments, the microphone array 110 can pick up ambient noise, and convert the picked-up ambient noise into electrical signals and transmit them to the processor 120 for processing. The processor 120 may couple (eg, electrically connect) the microphone array 110 and the speaker 130 . The processor 120 may receive and process the electrical signals communicated by the microphone array 110 to generate a noise reduction signal and communicate the generated noise reduction signal to the speaker 130 . The speaker 130 may output the target signal according to the noise reduction signal. The target signal can be used to reduce or cancel the ambient noise at the user's ear canal position (eg, the target spatial position), so as to realize active noise reduction of the acoustic device 100 and improve the user's listening experience during the use of the acoustic device 100 .
麦克风阵列110可以被配置为拾取环境噪声。在一些实施例中，环境噪声可以指用户所处环境中的多种外界声音的组合。仅作为示例，环境噪声可以包括交通噪声、工业噪声、建筑施工噪声、社会噪声等中的一种或多种。交通噪声可以包括但不限于机动车辆的行驶噪声、鸣笛噪声等。工业噪声可以包括但不限于工厂动力机械运转噪声等。建筑施工噪声可以包括但不限于动力机械挖掘噪声、打洞噪声、搅拌噪声等。社会生活环境噪声可以包括但不限于群众集会噪声、文娱宣传噪声、人群喧闹噪声、家用电器噪声等。在一些实施例中，麦克风阵列110可以设置于用户耳道附近位置，用于拾取传递至用户耳道处的环境噪声，并将拾取的环境噪声转换为电信号传递至处理器120进行处理。在一些实施例中，麦克风阵列110可以设置于用户的左耳和/或右耳处。例如，麦克风阵列110可以包括第一子麦克风阵列和第二子麦克风阵列。第一子麦克风阵列可以位于用户的左耳处，第二子麦克风阵列可以位于用户的右耳处。第一子麦克风阵列和第二子麦克风阵列可以同时进入工作状态或二者中的一个进入工作状态。The microphone array 110 may be configured to pick up ambient noise. In some embodiments, ambient noise may refer to a combination of various external sounds in the environment in which the user is located. For example only, ambient noise may include one or more of traffic noise, industrial noise, building construction noise, social noise, and the like. Traffic noise may include, but is not limited to, driving noise of motor vehicles, whistle noise, and the like. Industrial noise may include, but is not limited to, factory power machinery operating noise, and the like. Building construction noise may include, but is not limited to, power machinery excavation noise, hole drilling noise, stirring noise, and the like. Social living environment noise may include, but is not limited to, crowd assembly noise, entertainment and publicity noise, crowd noise, household appliance noise, and the like. In some embodiments, the microphone array 110 may be disposed near the user's ear canal to pick up the ambient noise transmitted to the user's ear canal, and convert the picked-up ambient noise into electrical signals and transmit them to the processor 120 for processing. In some embodiments, the microphone array 110 may be positioned at the user's left and/or right ear. For example, the microphone array 110 may include a first sub-microphone array and a second sub-microphone array. The first sub-microphone array may be located at the user's left ear, and the second sub-microphone array may be located at the user's right ear. The first sub-microphone array and the second sub-microphone array may enter the working state at the same time, or one of the two may enter the working state.
在一些实施例中，环境噪声可以包括用户讲话的声音。例如，麦克风阵列110可以根据声学装置100的通话状态拾取环境噪声。当声学装置100处于未通话状态时，用户自身说话产生的声音可以被视为环境噪声，麦克风阵列110可以同时拾取用户自身说话的声音以及其他环境噪声。当声学装置100处于通话状态时，用户自身说话产生的声音可以不被视为环境噪声，麦克风阵列110可以拾取除用户自身说话的声音之外的环境噪声。例如，麦克风阵列110可以拾取距离麦克风阵列110一定距离（例如，0.5米、1米）之外的噪声源发出的噪声。In some embodiments, the ambient noise may include the sound of the user speaking. For example, the microphone array 110 may pick up ambient noise according to the talking state of the acoustic device 100. When the acoustic device 100 is not in a call state, the sound produced by the user's own speech can be regarded as environmental noise, and the microphone array 110 can simultaneously pick up the user's own speech and other environmental noises. When the acoustic device 100 is in a call state, the sound produced by the user's own speech may not be regarded as ambient noise, and the microphone array 110 may pick up ambient noise other than the sound of the user's own speech. For example, the microphone array 110 may pick up noise emitted by a noise source located a certain distance (eg, 0.5 meters, 1 meter) away from the microphone array 110.
在一些实施例中，麦克风阵列110可以包括一个或多个气导麦克风。例如，用户在使用声学装置100听取音乐时，气导麦克风可以同时获取外界环境的噪声和用户说话时的声音并将获取的外界环境的噪声和用户说话时的声音一起作为环境噪声。在一些实施例中，麦克风阵列110还可以包括一个或多个骨导麦克风。骨导麦克风可以直接与用户的皮肤接触，用户说话时骨骼或肌肉产生的振动信号可以直接传递给骨导麦克风，进而骨导麦克风将振动信号转换为电信号，并将电信号传递至处理器120进行处理。骨导麦克风也可以不与人体直接接触，用户说话时骨骼或肌肉产生的振动信号可以先传递至声学装置100的壳体结构，再由壳体结构传递至骨导麦克风。在一些实施例中，用户在通话状态时，处理器120可以将气导麦克风采集的声音信号作为环境噪声并利用该环境噪声进行降噪，骨导麦克风采集的声音信号作为语音信号传输至终端设备，从而保证用户通话时的通话质量。In some embodiments, the microphone array 110 may include one or more air conduction microphones. For example, when a user listens to music using the acoustic device 100, the air conduction microphone can simultaneously acquire the noise of the external environment and the voice of the user when speaking, and use the acquired noise of the external environment together with the voice of the user when speaking as the ambient noise. In some embodiments, the microphone array 110 may also include one or more bone conduction microphones. The bone conduction microphone can directly contact the user's skin, and the vibration signal generated by the bones or muscles when the user speaks can be directly transmitted to the bone conduction microphone; the bone conduction microphone then converts the vibration signal into an electrical signal and transmits the electrical signal to the processor 120 to be processed. The bone conduction microphone may also not be in direct contact with the human body; in that case, the vibration signal generated by the bones or muscles when the user speaks can be first transmitted to the casing structure of the acoustic device 100, and then transmitted to the bone conduction microphone by the casing structure. In some embodiments, when the user is in a call state, the processor 120 may use the sound signal collected by the air conduction microphone as environmental noise and use the environmental noise for noise reduction, and the sound signal collected by the bone conduction microphone may be transmitted to the terminal device as a voice signal, so as to ensure the call quality of the user during the call.
在一些实施例中,处理器120可以基于声学装置100的工作状态控制骨导麦克风和气导麦克风的开关状态。声学装置100的工作状态可以指用户佩戴声学装置100时所使用的用途状态。仅作为示例,声学装置100的工作状态可以包括但不限于通话状态、未通话状态(例如,音乐播放状态)、发送语音消息状态等。在一些实施例中,麦克风阵列110拾取环境噪声时,麦克风阵列110中的骨导麦克风的开关状态和气导麦克风的开关状态可以根据声学装置100的工作状态决定。例如,用户佩戴声学装置100进行音乐播放时,骨导麦克风的开关状态可以为待机状态,气导麦克风的开关状态可以为工作状态。又例如,用户佩戴声学装置100进行发送语音消息时,骨导麦克风的开关状态可以为工作状态,气导麦克风的开关状态可以为工作状态。在一些实施例中,处理器120可以通过发送控制信号控制麦克风阵列110中的麦克风(例如,骨导麦克风、气导麦克风)的开关状态。In some embodiments, the processor 120 may control the switch states of the bone conduction microphone and the air conduction microphone based on the working state of the acoustic device 100 . The working state of the acoustic device 100 may refer to the usage state used when the user wears the acoustic device 100 . Just as an example, the working state of the acoustic device 100 may include, but is not limited to, a talking state, a non-calling state (eg, a music playing state), a voice message sending state, and the like. In some embodiments, when the microphone array 110 picks up ambient noise, the on/off state of the bone conduction microphone and the on/off state of the air conduction microphone in the microphone array 110 may be determined according to the working state of the acoustic device 100 . For example, when the user wears the acoustic device 100 to play music, the switch state of the bone conduction microphone may be the standby state, and the switch state of the air conduction microphone may be the working state. For another example, when the user wears the acoustic device 100 to send a voice message, the switch state of the bone conduction microphone may be the working state, and the switch state of the air conduction microphone may be the working state. In some embodiments, the processor 120 may control the on/off state of the microphones (eg, bone conduction microphones, air conduction microphones) in the microphone array 110 by sending a control signal.
在一些实施例中，当声学装置100的工作状态为未通话状态（例如，音乐播放状态）时，处理器120可以控制骨导麦克风为待机状态，气导麦克风为工作状态。声学装置100在未通话状态下，用户自身说话的声音信号可以视为环境噪声。这种情况下，气导麦克风拾取的环境噪声中包括的用户自身说话的声音信号可以不被滤除，从而使得用户自身说话的声音信号作为环境噪声的一部分也可以与扬声器130输出的目标信号相抵消。当声学装置100的工作状态为通话状态时，处理器120可以控制骨导麦克风为工作状态，气导麦克风为工作状态。声学装置100在通话状态下，用户自身说话的声音信号需要保留。这种情况下，处理器120可以发送控制信号控制骨导麦克风为工作状态，骨导麦克风拾取用户说话的声音信号，处理器120从气导麦克风拾取的环境噪声中去除骨导麦克风拾取的用户说话的声音信号，以使用户自身说话的声音信号不与扬声器130输出的目标信号相抵消，从而保证用户正常的通话状态。In some embodiments, when the working state of the acoustic device 100 is a non-calling state (eg, a music playing state), the processor 120 may control the bone conduction microphone to be in a standby state and the air conduction microphone to be in a working state. When the acoustic device 100 is not in a call state, the sound signal of the user's own speech may be regarded as environmental noise. In this case, the sound signal of the user's own speech included in the ambient noise picked up by the air conduction microphone may not be filtered out, so that the sound signal of the user's own speech, as part of the ambient noise, can also be canceled by the target signal output by the speaker 130. When the working state of the acoustic device 100 is the talking state, the processor 120 may control both the bone conduction microphone and the air conduction microphone to be in the working state. When the acoustic device 100 is in a call state, the sound signal of the user's own speech needs to be retained. In this case, the processor 120 may send a control signal to control the bone conduction microphone to be in the working state, the bone conduction microphone picks up the sound signal of the user's speech, and the processor 120 removes the sound signal of the user's speech picked up by the bone conduction microphone from the ambient noise picked up by the air conduction microphone, so that the sound signal of the user's own speech does not cancel the target signal output by the speaker 130, thereby ensuring the user's normal call.
在一些实施例中，当声学装置100的工作状态为通话状态时，若环境噪声的声压大于预设阈值时，处理器120可以控制骨导麦克风保持工作状态。环境噪声的声压可以反映环境噪声的强度。这里的预设阈值可以是预先存储在声学装置100中的数值，例如，50dB、60dB或70dB等其它任意数值。当环境噪声的声压大于预设阈值时，环境噪声会影响用户的通话质量。处理器120可以通过发送控制信号控制骨导麦克风保持工作状态，骨导麦克风可以获取用户讲话时的面部肌肉的振动信号，而基本不会拾取外部环境噪声，此时将骨导麦克风拾取的振动信号作为通话时的语音信号，从而保证用户的正常通话。In some embodiments, when the working state of the acoustic device 100 is the talking state, if the sound pressure of the ambient noise is greater than the preset threshold, the processor 120 may control the bone conduction microphone to maintain the working state. The sound pressure of the ambient noise can reflect the intensity of the ambient noise. The preset threshold here may be a value pre-stored in the acoustic device 100, for example, 50 dB, 60 dB, 70 dB, or any other value. When the sound pressure of the ambient noise is greater than the preset threshold, the ambient noise will affect the call quality of the user. The processor 120 can control the bone conduction microphone to maintain the working state by sending a control signal; the bone conduction microphone can acquire the vibration signal of the user's facial muscles when speaking while barely picking up external environmental noise, and in this case the vibration signal picked up by the bone conduction microphone is used as the voice signal during the call, thereby ensuring the user's normal conversation.
在一些实施例中,当声学装置100的工作状态为通话状态时,若环境噪声的声压小于预设阈值时,处理器120可以控制骨导麦克风由工作状态切换至待机状态。当环境噪声的声压小于预设阈值时,环境噪声的声压相对于用户说话产生的声音信号的声压较小,通过第一声径传输至用户耳部某个位置的用户说话产生的声音信号被扬声器130输出的通过第二声径传输至用户耳部某个位置的目标信号抵消一部分后,剩余的用户说话产生的声音信号仍可以被用户听觉中枢接收足以保证用户的正常通话。这种情况下,处理器120可以通过发送控制信号控制骨导麦克风由工作状态切换至待机状态,进而降低信号处理复杂度以及声学装置100的功率损耗。In some embodiments, when the working state of the acoustic device 100 is the talking state, if the sound pressure of the ambient noise is lower than the preset threshold, the processor 120 may control the bone conduction microphone to switch from the working state to the standby state. When the sound pressure of the environmental noise is less than the preset threshold, the sound pressure of the environmental noise is smaller than the sound pressure of the sound signal generated by the user's speech, and the sound generated by the user's speech transmitted to a certain position of the user's ear through the first sound path After the signal is partially offset by the target signal output by the speaker 130 and transmitted to a certain position of the user's ear through the second sound path, the remaining sound signal generated by the user's speech can still be received by the user's auditory center, which is sufficient to ensure the user's normal conversation. In this case, the processor 120 can control the bone conduction microphone to switch from the working state to the standby state by sending a control signal, thereby reducing the complexity of signal processing and the power consumption of the acoustic device 100 .
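The switching logic described in the preceding paragraphs can be summarized in a small sketch covering the music-playing and call states discussed above (the state names and the 60 dB default are illustrative assumptions; the application only gives 50 dB, 60 dB, and 70 dB as example thresholds):

```python
def microphone_states(on_call, ambient_spl_db, threshold_db=60.0):
    """Return the hypothetical (bone conduction, air conduction) mic states.

    The air conduction microphone works in every state; the bone conduction
    microphone works during a call when ambient noise exceeds the preset
    threshold, and stands by otherwise (saving signal-processing effort
    and power, as described above).
    """
    air_mic = "working"
    if on_call and ambient_spl_db >= threshold_db:
        bone_mic = "working"
    else:
        bone_mic = "standby"
    return bone_mic, air_mic
```

For example, in a quiet room (say 40 dB) during a call, the bone conduction microphone would drop to standby, whereas on a noisy street (say 70 dB) it would keep working so the vibration signal can serve as the voice signal.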
在一些实施例中,根据麦克风的工作原理,麦克风阵列110可以包括动圈式麦克风、带式麦克风、电容式麦克风、驻极体式麦克风、电磁式麦克风、碳粒式麦克风等,或其任意组合。在一些实施例中,麦克风阵列110的排布方式可以包括线性阵列(例如,直线形、曲线形)、平面阵列(例如,十字形、圆形、环形、多边形、网状形等规则和/或不规则形状)、立体阵列(例如,圆柱状、球状、半球状、多面体等)等,或其任意组合。关于麦克风阵列110的排布方式的更多介绍可以参考本申请其它地方,例如,图5A-D、图6A-B及其相应描述。In some embodiments, the microphone array 110 may include dynamic microphones, ribbon microphones, condenser microphones, electret microphones, electromagnetic microphones, carbon microphones, etc., or any combination thereof, according to the working principle of the microphones. In some embodiments, the arrangement of the microphone array 110 may include a linear array (eg, linear, curved), a planar array (eg, cross, circle, ring, polygon, mesh, etc., regular and/or irregular shapes), stereoscopic arrays (eg, cylindrical, spherical, hemispherical, polyhedral, etc.), etc., or any combination thereof. For more introduction on the arrangement of the microphone array 110, reference may be made to other places in this application, for example, FIGS. 5A-D, 6A-B and their corresponding descriptions.
处理器120可以被配置为利用麦克风阵列110对目标空间位置的声场进行估计。目标空间位置的声场可以指声波在目标空间位置处或目标空间位置附近的分布和变化（例如，随时间的变化，随位置的变化）。描述声场的物理量可以包括声压、声音频率，声音幅值、声音相位、声源振动速度、或媒质（例如空气）密度等。通常，这些物理量可以是位置和时间的函数。目标空间位置可以指靠近用户耳道特定距离的空间位置。该目标空间位置可以比麦克风阵列110中任一麦克风更加靠近用户耳道。这里的特定距离可以是固定的距离，例如，0.5cm、1cm、2cm、3cm等。在一些实施例中，目标空间位置可以与麦克风阵列110中各麦克风的数量、相对于用户耳道的分布位置相关。通过调整麦克风阵列110中各麦克风的数量和/或相对于用户耳道的分布位置可以对目标空间位置进行调整。例如，通过增加麦克风阵列110中麦克风的数量可以使目标空间位置更加靠近用户耳道。又例如，还可以通过减小麦克风阵列110中各麦克风的间距使目标空间位置更加靠近用户耳道。再例如，还可以通过改变麦克风阵列110中各麦克风的排列方式使目标空间位置更加靠近用户耳道。The processor 120 may be configured to use the microphone array 110 to estimate the sound field of the target spatial location. The sound field of a target spatial location may refer to the distribution and variation of sound waves at or near the target spatial location (eg, as a function of time, as a function of location). The physical quantities describing the sound field may include sound pressure, sound frequency, sound amplitude, sound phase, sound source vibration velocity, or medium (eg, air) density, and the like. In general, these physical quantities can be functions of position and time. The target spatial location may refer to a spatial location close to the user's ear canal by a specific distance. The target spatial location may be closer to the user's ear canal than any microphone in the microphone array 110. The specific distance here may be a fixed distance, for example, 0.5 cm, 1 cm, 2 cm, 3 cm, and the like. In some embodiments, the target spatial position may be related to the number of the microphones in the microphone array 110 and their distribution positions relative to the user's ear canal. The target spatial position can be adjusted by adjusting the number of the microphones in the microphone array 110 and/or their distribution positions relative to the user's ear canal. For example, by increasing the number of microphones in the microphone array 110, the target spatial location can be brought closer to the user's ear canal. For another example, the target spatial position can also be made closer to the user's ear canal by reducing the distance between the microphones in the microphone array 110. For another example, the arrangement of the microphones in the microphone array 110 can also be changed to make the target spatial position closer to the user's ear canal.
处理器120可以进一步被配置为基于拾取的环境噪声和目标空间位置的声场估计生成降噪信号。具体地,处理器120可以接收麦克风阵列110传递的环境噪声转换的电信号并对其进行处理以获取环境噪声的参数(例如,幅值、相位等)。处理器120可以进一步基于目标空间位置的声场估计调整环境噪声的参数(例如,幅值、相位等)以生成降噪信号。该降噪信号的参数(例如,幅值、相位等)与环境噪声的参数相对应。仅作为示例,降噪信号的幅值可以与环境噪声的幅值近似相等,降噪信号的相位可以与环境噪声的相位近似相反。在一些实施例中,处理器120可以包括硬件模块和软件模块。仅作为示例,硬件模块可以包括数字信号处理(Digital Signal Processor,DSP)芯片、高级精简指令集机器(Advanced RISC Machines,ARM),软件模块可以包括算法模块。关于处理器120的更多介绍可以参考本申请其它地方,例如,图2及其相应描述。The processor 120 may be further configured to generate a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location. Specifically, the processor 120 may receive and process the ambient noise-converted electrical signal transmitted by the microphone array 110 to obtain parameters (eg, amplitude, phase, etc.) of the ambient noise. The processor 120 may further adjust parameters of the ambient noise (eg, amplitude, phase, etc.) based on the sound field estimate of the target spatial location to generate a noise reduction signal. The parameters of the noise reduction signal (eg, amplitude, phase, etc.) correspond to parameters of the ambient noise. For example only, the amplitude of the noise reduction signal may be approximately equal to the amplitude of the ambient noise, and the phase of the noise reduction signal may be approximately opposite to the phase of the ambient noise. In some embodiments, the processor 120 may include hardware modules and software modules. Just as an example, the hardware module may include a digital signal processing (Digital Signal Processor, DSP) chip, an advanced reduced instruction set machine (Advanced RISC Machines, ARM), and the software module may include an algorithm module. For more information on processor 120, reference may be made elsewhere in this application, eg, FIG. 2 and its corresponding description.
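A highly simplified sketch of the two steps just described: estimating the noise at the target spatial position from the microphone array (here with a naive free-field delay-and-sum approximation), and then generating a noise reduction signal of approximately equal amplitude and opposite phase. The function names, the propagation model, and the integer-sample delays are illustrative assumptions, not the processing claimed in this application.

```python
import numpy as np

def estimate_target_noise(mic_signals, mic_distances, target_distance,
                          sample_rate, speed_of_sound=343.0):
    """Delay-and-sum sketch: time-align each microphone signal to the target
    spatial position (closer to the ear canal than any microphone) and
    average them, approximating the noise that will arrive there.

    mic_distances / target_distance -- distances (m) from the noise source
    to each microphone and to the target spatial position, respectively.
    """
    estimate = np.zeros_like(np.asarray(mic_signals[0], dtype=float))
    for signal, distance in zip(mic_signals, mic_distances):
        # Extra free-field propagation delay from this microphone's wavefront
        # to the target position, rounded to whole samples (coarse model).
        delay = int(round((target_distance - distance)
                          * sample_rate / speed_of_sound))
        estimate += np.roll(signal, delay)
    return estimate / len(mic_signals)

def noise_reduction_signal(estimated_noise):
    # Approximately equal amplitude and opposite phase, as described above.
    return -estimated_noise
```

A real device would use a fractional-delay or adaptive beamforming estimate rather than whole-sample shifts, but the structure — estimate the noise at a position no physical microphone occupies, then invert it — is the same.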
The speaker 130 may be configured to output a target signal according to the noise reduction signal. The target signal may be used to reduce or cancel the ambient noise transmitted to a certain location of the user's ear (e.g., the tympanic membrane or the basilar membrane). In some embodiments, when the user wears the acoustic device 100, the speaker 130 may be located near the user's ear. In some embodiments, depending on its working principle, the speaker 130 may include one or more of an electrodynamic speaker (e.g., a moving-coil speaker), a magnetic speaker, an ion speaker, an electrostatic speaker (or condenser speaker), a piezoelectric speaker, or the like. In some embodiments, depending on how the output sound propagates, the speaker 130 may include an air conduction speaker and/or a bone conduction speaker. In some embodiments, the number of speakers 130 may be one or more. When the number of speakers 130 is one, the speaker 130 can be used both to output the target signal to cancel ambient noise and to deliver the sound information that the user needs to hear (e.g., device media audio, far-end call audio). For example, when there is a single air conduction speaker, it can be used to output the target signal to cancel ambient noise. In this case, the target signal may be a sound wave (i.e., a vibration of air) that is transmitted through the air to the target spatial position, where it and the ambient noise cancel each other. At the same time, the air conduction speaker can also deliver the sound information that the user needs to hear. For another example, when there is a single bone conduction speaker, it can be used to output the target signal to cancel ambient noise. In this case, the target signal may be a vibration signal (e.g., vibration of the speaker housing) that is transmitted through bone or tissue to the user's basilar membrane, where it and the ambient noise cancel each other. At the same time, the bone conduction speaker can also deliver the sound information that the user needs to hear. When there are multiple speakers 130, some of them can be used to output the target signal to cancel ambient noise, while the others deliver the sound information that the user needs to hear (e.g., device media audio, far-end call audio). For example, when the multiple speakers 130 include both bone conduction speakers and air conduction speakers, the air conduction speakers can output sound waves to reduce or cancel ambient noise, and the bone conduction speakers can deliver the sound information that the user needs to hear. Compared with air conduction speakers, bone conduction speakers transmit mechanical vibrations directly through the user's body (e.g., bones, skin tissue) to the user's auditory nerve, causing relatively little interference with the air conduction microphones that pick up ambient noise.
It should be noted that the speaker 130 may be an independent functional device, or may be part of a single device capable of implementing multiple functions. For example only, the speaker 130 may be integrated with and/or formed in one piece with the processor 120. In some embodiments, when there are multiple speakers 130, they may be arranged in a linear array (e.g., straight or curved), a planar array (e.g., regular and/or irregular shapes such as a cross, mesh, circle, ring, or polygon), a three-dimensional array (e.g., cylindrical, spherical, hemispherical, polyhedral), or the like, or any combination thereof, which is not limited herein. In some embodiments, the speaker 130 may be disposed at the user's left ear and/or right ear. For example, the speaker 130 may include a first sub-speaker located at the user's left ear and a second sub-speaker located at the user's right ear. The first sub-speaker and the second sub-speaker may operate simultaneously, or only one of them may operate. In some embodiments, the speaker 130 may be a speaker with a directional sound field whose main lobe is directed toward the user's ear canal.
In some embodiments, the acoustic device 100 may further include one or more sensors 140. The one or more sensors 140 may be electrically connected to other components of the acoustic device 100 (e.g., the processor 120) and may be used to acquire the physical position and/or motion information of the acoustic device 100. For example only, the one or more sensors 140 may include an inertial measurement unit (IMU), a Global Positioning System (GPS) receiver, a radar, or the like. The motion information may include a motion trajectory, a motion direction, a motion speed, a motion acceleration, a motion angular velocity, motion-related time information (e.g., motion start time and end time), or the like, or any combination thereof. Taking an IMU as an example, the IMU may include a microelectromechanical system (MEMS), which may include a multi-axis accelerometer, a gyroscope, a magnetometer, or the like, or any combination thereof. The IMU may be used to detect the physical position and/or motion information of the acoustic device 100 to enable control of the acoustic device 100 based on the physical position and/or motion information. For more information on controlling the acoustic device 100 based on physical position and/or motion information, reference may be made elsewhere in this application, e.g., FIG. 4 and its corresponding description.
In some embodiments, the acoustic device 100 may include a signal transceiver 150. The signal transceiver 150 may be electrically connected to other components of the acoustic device 100 (e.g., the processor 120). In some embodiments, the signal transceiver 150 may include a Bluetooth module, an antenna, or the like. The acoustic device 100 may communicate with other external devices (e.g., a mobile phone, a tablet computer, a smart watch) through the signal transceiver 150. For example, the acoustic device 100 may communicate wirelessly with other devices via Bluetooth.
In some embodiments, the acoustic device 100 may include a housing structure 160. The housing structure 160 may be configured to carry other components of the acoustic device 100 (e.g., the microphone array 110, the processor 120, the speaker 130, the one or more sensors 140, the signal transceiver 150). In some embodiments, the housing structure 160 may be a closed or semi-closed structure with a hollow interior, and the other components of the acoustic device 100 may be located in or on the housing structure. In some embodiments, the shape of the housing structure may be a regular or irregular three-dimensional structure such as a rectangular parallelepiped, a cylinder, or a truncated cone. When the user wears the acoustic device 100, the housing structure may be located close to the user's ear. For example, the housing structure may be located on the peripheral side (e.g., the front side or the back side) of the user's auricle. As another example, the housing structure may be located at the user's ear without blocking or covering the user's ear canal. In some embodiments, the acoustic device 100 may be a bone conduction earphone, and at least one side of the housing structure may be in contact with the user's skin. An acoustic driver (e.g., a vibration speaker) in the bone conduction earphone converts audio signals into mechanical vibrations, which can be transmitted to the user's auditory nerve through the housing structure and the user's bones. In some embodiments, the acoustic device 100 may be an air conduction earphone, and at least one side of the housing structure may or may not be in contact with the user's skin. The side wall of the housing structure includes at least one sound guide hole, and the speaker in the air conduction earphone converts audio signals into air-conducted sound, which can be radiated toward the user's ear through the sound guide hole.
In some embodiments, the acoustic device 100 may include a fixing structure 170. The fixing structure 170 may be configured to fix the acoustic device 100 in a position near the user's ear without blocking the user's ear canal. In some embodiments, the fixing structure 170 may be physically connected (e.g., snap-fitted, screwed, etc.) to the housing structure 160 of the acoustic device 100. In some embodiments, the housing structure 160 of the acoustic device 100 may be part of the fixing structure 170. In some embodiments, the fixing structure 170 may include an ear hook, a back hook, an elastic band, eyeglass temples, or the like, so that the acoustic device 100 can be better fixed near the user's ear and prevented from falling off during use. For example, the fixing structure 170 may be an ear hook configured to be worn around the ear area. In some embodiments, the ear hook may be a continuous hook that can be elastically stretched to be worn on the user's ear; at the same time, the ear hook may apply pressure to the user's auricle so that the acoustic device 100 is firmly fixed at a specific position on the user's ear or head. In some embodiments, the ear hook may be a discontinuous band. For example, the ear hook may include a rigid portion and a flexible portion. The rigid portion may be made of a rigid material (e.g., plastic or metal) and may be fixed to the housing structure 160 of the acoustic device 100 by a physical connection (e.g., snap-fit, screw connection, etc.). The flexible portion may be made of an elastic material (e.g., cloth, composite material, and/or neoprene). As another example, the fixing structure 170 may be a neckband configured to be worn around the neck/shoulder area. As yet another example, the fixing structure 170 may be an eyeglass temple that, as part of a pair of eyeglasses, rests on the user's ear.
In some embodiments, the acoustic device 100 may further include an interaction module (not shown) for adjusting the sound pressure of the target signal. In some embodiments, the interaction module may include buttons, a voice assistant, gesture sensors, or the like. The user can adjust the noise reduction mode of the acoustic device 100 by controlling the interaction module. Specifically, the user can adjust (e.g., amplify or attenuate) the amplitude information of the noise reduction signal by controlling the interaction module, thereby changing the sound pressure of the target signal emitted by the speaker 130 to achieve different noise reduction effects. For example only, the noise reduction modes may include a strong noise reduction mode, a medium noise reduction mode, a weak noise reduction mode, and the like. For example, when the user wears the acoustic device 100 indoors, where the external ambient noise is low, the user may turn off the noise reduction or switch to the weak noise reduction mode through the interaction module. For another example, when the user wears the acoustic device 100 while walking in a public place such as a street, the user needs to maintain a certain awareness of the surrounding environment while listening to audio signals (e.g., music, voice information) in order to cope with emergencies; in this case, the user may select the medium noise reduction mode through the interaction module (e.g., a button or voice assistant) to preserve ambient sounds (e.g., sirens, impact sounds, car horns). For another example, when the user is riding a subway or an airplane, the user may select the strong noise reduction mode through the interaction module to further reduce ambient noise. In some embodiments, the processor 120 may also send prompt information to the acoustic device 100 or to a terminal device (e.g., a mobile phone, a smart watch) communicatively connected to the acoustic device 100 based on the range of the ambient noise intensity, so as to remind the user to adjust the noise reduction mode.
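The mode-dependent adjustment of the noise reduction signal's amplitude described above can be sketched as a simple gain table (the gain values, mode names, and function name are illustrative assumptions only, not part of the disclosure):

```python
# Hypothetical per-mode gains applied to the noise reduction signal;
# actual values would be tuned for the device.
NOISE_REDUCTION_GAINS = {
    "strong": 1.0,  # maximum cancellation, e.g. subway or airplane
    "medium": 0.6,  # partial cancellation, ambient cues preserved
    "weak": 0.3,    # quiet indoor environments
    "off": 0.0,
}

def apply_noise_reduction_mode(noise_reduction_signal, mode):
    """Scale the anti-noise amplitude according to the selected mode."""
    gain = NOISE_REDUCTION_GAINS[mode]
    return [gain * sample for sample in noise_reduction_signal]
```

Scaling the anti-noise amplitude changes the sound pressure of the target signal, and hence how much of the ambient noise is cancelled.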
It should be noted that the above description of FIG. 1 is provided for illustrative purposes only and is not intended to limit the scope of the present application. Various changes and modifications may be made by those of ordinary skill in the art in light of the teachings of this application. In some embodiments, one or more components of the acoustic device 100 (e.g., the one or more sensors 140, the signal transceiver 150, the fixing structure 170, the interaction module) may be omitted. In some embodiments, one or more components of the acoustic device 100 may be replaced by other elements that perform similar functions. For example, the acoustic device 100 may not include the fixing structure 170; instead, the housing structure 160 or a portion thereof may have a shape adapted to the human ear (e.g., circular, oval, regular or irregular polygonal, U-shaped, V-shaped, semicircular) so that the housing structure can rest near the user's ear. In some embodiments, a component of the acoustic device 100 may be split into multiple sub-components, or multiple components may be combined into a single component. These changes and modifications do not depart from the scope of the present application.
FIG. 2 is a schematic structural diagram of an exemplary processor 120 according to some embodiments of the present application. As shown in FIG. 2, the processor 120 may include an analog-to-digital conversion unit 210, a noise estimation unit 220, an amplitude-phase compensation unit 230, and a digital-to-analog conversion unit 240.
In some embodiments, the analog-to-digital conversion unit 210 may be configured to convert the signal input by the microphone array 110 into a digital signal. Specifically, the microphone array 110 picks up ambient noise, converts it into an electrical signal, and transmits the electrical signal to the processor 120. After receiving the electrical signal of the ambient noise sent by the microphone array 110, the analog-to-digital conversion unit 210 may convert the electrical signal into a digital signal. In some embodiments, the analog-to-digital conversion unit 210 may be electrically connected to the microphone array 110 and further to other components of the processor 120 (e.g., the noise estimation unit 220). Further, the analog-to-digital conversion unit 210 may transmit the converted digital signal of the ambient noise to the noise estimation unit 220.
In some embodiments, the noise estimation unit 220 may be configured to estimate the ambient noise from the received digital signal. For example, the noise estimation unit 220 may estimate parameters of the ambient noise at the target spatial position from the received digital signal. For example only, the parameters may include the noise source of the noise at the target spatial position (e.g., its position and orientation), the transmission direction, the amplitude, the phase, or the like, or any combination thereof. In some embodiments, the noise estimation unit 220 may also be configured to estimate the sound field at the target spatial position using the microphone array 110. For more information on estimating the sound field at the target spatial position, reference may be made elsewhere in this application, e.g., FIG. 4 and its corresponding description. In some embodiments, the noise estimation unit 220 may be electrically connected to other components of the processor 120 (e.g., the amplitude-phase compensation unit 230). Further, the noise estimation unit 220 may transmit the estimated noise-related parameters and the sound field of the target spatial position to the amplitude-phase compensation unit 230.
In some embodiments, the amplitude-phase compensation unit 230 may be configured to compensate the estimated noise-related parameters according to the sound field of the target spatial position. For example, the amplitude-phase compensation unit 230 may compensate the amplitude and phase of the ambient noise according to the sound field of the target spatial position to obtain a digital noise reduction signal. In some embodiments, the amplitude-phase compensation unit 230 may adjust the amplitude of the ambient noise and inversely compensate its phase to obtain the digital noise reduction signal. The amplitude of the digital noise reduction signal may be approximately equal to that of the digital signal corresponding to the ambient noise, and the phase of the digital noise reduction signal may be approximately opposite to that of the digital signal corresponding to the ambient noise. In some embodiments, the amplitude-phase compensation unit 230 may be electrically connected to other components of the processor 120 (e.g., the digital-to-analog conversion unit 240). Further, the amplitude-phase compensation unit 230 may transmit the digital noise reduction signal to the digital-to-analog conversion unit 240.
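One way to picture the amplitude-and-phase compensation is as a per-frequency operation on an estimated noise frame: keep (or scale) each spectral component's amplitude and shift its phase by π. The following sketch assumes a frequency-domain implementation, which is an illustration only and not mandated by the disclosure:

```python
import numpy as np

def amplitude_phase_compensate(noise_frame, gain=1.0):
    """Digital noise reduction signal: gain-adjusted amplitude, inverted phase."""
    spectrum = np.fft.rfft(noise_frame)
    # Keep each component's magnitude (scaled by gain), shift its phase by pi.
    compensated = gain * np.abs(spectrum) * np.exp(1j * (np.angle(spectrum) + np.pi))
    return np.fft.irfft(compensated, n=len(noise_frame))

# Hypothetical 500 Hz noise component sampled at 8 kHz (one 256-sample frame).
fs = 8000
noise_frame = np.sin(2 * np.pi * 500 * np.arange(256) / fs)
digital_nr_signal = amplitude_phase_compensate(noise_frame)
residual = noise_frame + digital_nr_signal  # close to zero everywhere
```

With `gain=1.0` the result is the exact inverse of the frame; a different gain would trade off cancellation depth, as in the noise reduction modes described elsewhere.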
In some embodiments, the digital-to-analog conversion unit 240 may be configured to convert the digital noise reduction signal into an analog signal to obtain the noise reduction signal (e.g., an electrical signal). For example only, the digital-to-analog conversion unit 240 may employ pulse width modulation (PWM). In some embodiments, the digital-to-analog conversion unit 240 may be electrically connected to other components of the processor 120 (e.g., the speaker 130). Further, the digital-to-analog conversion unit 240 may transmit the noise reduction signal to the speaker 130.
In some embodiments, the processor 120 may include a signal amplification unit 250. The signal amplification unit 250 may be configured to amplify an input signal. For example, the signal amplification unit 250 may amplify the signal input by the microphone array 110. For example only, when the acoustic device 100 is in a call state, the signal amplification unit 250 may be used to amplify the user's voice picked up by the microphone array 110. For another example, the signal amplification unit 250 may amplify the amplitude of the ambient noise according to the sound field of the target spatial position. In some embodiments, the signal amplification unit 250 may be electrically connected to other components (e.g., the microphone array 110, the noise estimation unit 220, the amplitude-phase compensation unit 230).
It should be noted that the above description of FIG. 2 is provided for illustrative purposes only and is not intended to limit the scope of the present application. Various changes and modifications may be made by those of ordinary skill in the art in light of the teachings of this application. In some embodiments, one or more components of the processor 120 (e.g., the signal amplification unit 250) may be omitted. In some embodiments, a component of the processor 120 may be split into multiple sub-components, or multiple components may be combined into a single component. For example, the noise estimation unit 220 and the amplitude-phase compensation unit 230 may be integrated into one component that implements the functions of both. These changes and modifications do not depart from the scope of the present application.
FIG. 3 is an exemplary noise reduction flowchart of an acoustic device according to some embodiments of the present application. In some embodiments, the process 300 may be performed by the acoustic device 100. As shown in FIG. 3, the process 300 may include:
In step 310, ambient noise is picked up. In some embodiments, this step may be performed by the microphone array 110.
According to the related description of FIG. 1, ambient noise may refer to a combination of various external sounds (e.g., traffic noise, industrial noise, building construction noise, social noise) in the environment where the user is located. In some embodiments, the microphone array 110 may be located near the user's ear canal to pick up the ambient noise transmitted to the user's ear canal. Further, the microphone array 110 may convert the picked-up ambient noise signal into an electrical signal and transmit it to the processor 120 for processing.
In step 320, the noise at the target spatial position is estimated based on the picked-up ambient noise. In some embodiments, this step may be performed by the processor 120.
In some embodiments, the processor 120 may perform signal separation on the picked-up ambient noise. In some embodiments, the ambient noise picked up by the microphone array 110 may include various sounds, and the processor 120 may perform signal analysis on the ambient noise to separate those sounds. Specifically, the processor 120 may adaptively adjust the parameters of a filter according to the statistical distribution characteristics and structural features of the various sounds in different dimensions such as the spatial, time, and frequency domains, estimate the parameter information of each sound signal in the ambient noise, and complete the signal separation process according to the parameter information of each sound signal. In some embodiments, the statistical distribution characteristics of the noise may include the probability distribution density, power spectral density, autocorrelation function, probability density function, variance, mathematical expectation, and the like. In some embodiments, the structural features of the noise may include the noise distribution, noise intensity, global noise intensity, noise rate, or the like, or any combination thereof. The global noise intensity may refer to an average noise intensity or a weighted-average noise intensity. The noise rate may refer to the degree of dispersion of the noise distribution. For example only, the ambient noise picked up by the microphone array 110 may include a first signal, a second signal, and a third signal. The processor 120 obtains the differences between the first, second, and third signals in the spatial domain (e.g., signal location), the time domain (e.g., delay), and the frequency domain (e.g., amplitude, phase), and separates the three signals according to the differences in these three dimensions to obtain relatively pure first, second, and third signals. Further, the processor 120 may update the ambient noise according to the parameter information (e.g., frequency, phase, and amplitude information) of the separated signals. For example, the processor 120 may determine from the parameter information of the first signal that the first signal is the user's speech, and remove the first signal from the ambient noise to update the ambient noise. In some embodiments, the removed first signal may be transmitted to the far end of a call. For example, when the user wears the acoustic device 100 during a voice call, the first signal may be transmitted to the far end of the call.
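As a toy illustration of separating mixed components by their differences in one dimension, the sketch below splits two tones by their frequency-domain difference and removes the "voice-like" component to update the noise. The adaptive, multi-dimension separation described above is far more general; the frequencies and split point here are arbitrary assumptions:

```python
import numpy as np

fs = 8000
t = np.arange(1024) / fs
# Two hypothetical components of the picked-up ambient noise.
first_signal = np.sin(2 * np.pi * 250 * t)          # e.g. a voice-band tone
second_signal = 0.5 * np.sin(2 * np.pi * 2000 * t)  # e.g. a high-frequency noise tone
mixture = first_signal + second_signal

# Separate by frequency difference: split the spectrum at 1 kHz.
spectrum = np.fft.rfft(mixture)
freqs = np.fft.rfftfreq(len(mixture), 1 / fs)
low_part = np.where(freqs < 1000, spectrum, 0)
high_part = spectrum - low_part

est_first = np.fft.irfft(low_part, n=len(mixture))
est_second = np.fft.irfft(high_part, n=len(mixture))

# Updated ambient noise after removing the separated first (voice-like) signal.
updated_noise = mixture - est_first
```

Real separation would additionally exploit spatial differences (signal location) and time-domain differences (delay) across the microphone array, not just spectral content.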
The target spatial position is a position at or near the user's ear canal that is determined based on the microphone array 110. According to the related description in FIG. 1, the target spatial position may refer to a spatial position within a specific distance (e.g., 0.5 cm, 1 cm, 2 cm, or 3 cm) of the user's ear canal (e.g., the ear opening). In some embodiments, the target spatial position is closer to the user's ear canal than any microphone in the microphone array 110. According to the related description in FIG. 1, the target spatial position is related to the number of microphones in the microphone array 110 and their distribution relative to the user's ear canal; the target spatial position can therefore be adjusted by adjusting the number of microphones in the microphone array 110 and/or their distribution relative to the user's ear canal. In some embodiments, estimating the noise at the target spatial position based on the picked-up ambient noise (or the updated ambient noise) may further include determining one or more spatial noise sources related to the picked-up ambient noise, and estimating the noise at the target spatial position based on the spatial noise sources. The ambient noise picked up by the microphone array 110 may come from spatial noise sources of different types and in different directions. The parameter information (e.g., frequency information, phase information, amplitude information) corresponding to each spatial noise source is different. In some embodiments, the processor 120 may perform signal separation and extraction on the noise at the target spatial position according to the statistical distributions and structural features of different types of noise in different dimensions (e.g., the spatial domain, the time domain, the frequency domain), so as to obtain noise of different types (e.g., different frequencies, different phases) and estimate the parameter information (e.g., amplitude information, phase information) corresponding to each type of noise. In some embodiments, the processor 120 may further determine the overall parameter information of the noise at the target spatial position according to the parameter information corresponding to the different types of noise at the target spatial position. More description of estimating the noise at the target spatial position based on one or more spatial noise sources may be found elsewhere in the present disclosure, e.g., FIGS. 7-8 and their corresponding descriptions.
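The disclosure does not state how per-type parameter information is combined into "overall" parameter information. One natural sketch, for components at the same frequency, is to treat each (amplitude, phase) pair as a complex phasor and sum them — an assumption for illustration, not the patented method:

```python
import numpy as np

def overall_parameters(components):
    """Combine same-frequency noise components, each given as
    (amplitude, phase in radians), into one overall amplitude and phase
    by summing them as complex phasors."""
    total = sum(a * np.exp(1j * p) for a, p in components)
    return abs(total), np.angle(total)

# Two equal-amplitude components in opposite phase cancel almost completely.
amp, phase = overall_parameters([(1.0, 0.0), (1.0, np.pi)])
```

Components at different frequencies would instead be kept as separate (frequency, amplitude, phase) entries rather than summed.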
In some embodiments, estimating the noise at the target spatial position based on the picked-up ambient noise (or the updated ambient noise) may further include constructing a virtual microphone based on the microphone array 110 and estimating the noise at the target spatial position based on the virtual microphone. More description of estimating the noise at the target spatial position based on a virtual microphone may be found elsewhere in the present disclosure, e.g., FIGS. 9-10 and their corresponding descriptions.
In step 330, a noise reduction signal is generated based on the noise at the target spatial position. In some embodiments, this step may be performed by the processor 120.
In some embodiments, the processor 120 may generate the noise reduction signal based on the parameter information (e.g., amplitude information, phase information) of the noise at the target spatial position obtained in step 320. In some embodiments, the phase difference between the phase of the noise reduction signal and the phase of the noise at the target spatial position may be less than or equal to a preset phase threshold. The preset phase threshold may be in the range of 90 to 180 degrees and may be adjusted within this range according to the user's needs. For example, when the user does not want to be disturbed by sounds in the surrounding environment, the preset phase threshold may be a relatively large value, e.g., 180 degrees, i.e., the phase of the noise reduction signal is opposite to the phase of the noise at the target spatial position. As another example, when the user wishes to remain aware of the surrounding environment, the preset phase threshold may be a relatively small value, e.g., 90 degrees. It should be noted that the more ambient sound the user wishes to hear, the closer the preset phase threshold may be to 90 degrees; the less ambient sound the user wishes to hear, the closer the preset phase threshold may be to 180 degrees. In some embodiments, when the phase of the noise reduction signal has a certain relationship to the phase of the noise at the target spatial position (e.g., opposite phase), the difference between the amplitude of the noise at the target spatial position and the amplitude of the noise reduction signal may be less than or equal to a preset amplitude threshold. For example, when the user does not want to be disturbed by sounds in the surrounding environment, the preset amplitude threshold may be a relatively small value, e.g., 0 dB, i.e., the amplitude of the noise reduction signal equals the amplitude of the noise at the target spatial position. As another example, when the user wishes to remain aware of the surrounding environment, the preset amplitude threshold may be a relatively large value, e.g., approximately equal to the amplitude of the noise at the target spatial position. It should be noted that the more ambient sound the user wishes to hear, the closer the preset amplitude threshold may be to the amplitude of the noise at the target spatial position; the less ambient sound the user wishes to hear, the closer the preset amplitude threshold may be to 0 dB.
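The effect of the phase threshold can be seen numerically: a noise-reduction signal in exact anti-phase (180 degrees) with matched amplitude cancels a tone almost completely, while a smaller phase offset leaves a deliberate residual. A minimal single-tone sketch (the 440 Hz tone, amplitude, and sample rate are illustrative assumptions):

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
noise = 0.5 * np.sin(2 * np.pi * 440 * t)    # noise at the target spatial position

def residual_rms(phase_offset_deg, gain=1.0):
    """RMS of what remains when the noise-reduction signal differs from the
    noise by `phase_offset_deg` (180 = exact anti-phase) and its amplitude
    is scaled by `gain` (1.0 = amplitudes match)."""
    anti = gain * 0.5 * np.sin(2 * np.pi * 440 * t + np.deg2rad(phase_offset_deg))
    return np.sqrt(np.mean((noise + anti) ** 2))

full = residual_rms(180.0)     # "do not disturb": near-total cancellation
partial = residual_rms(120.0)  # stay aware of surroundings: partial cancellation
```

Moving the offset from 180 toward 90 degrees monotonically increases the residual, which is the behavior the thresholds above describe.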
In some embodiments, the speaker 130 may output a target signal based on the noise reduction signal generated by the processor 120. For example, the speaker 130 may convert the noise reduction signal (e.g., an electrical signal) into the target signal (i.e., a vibration signal) via a vibration component in the speaker 130, and the target signal may cancel the ambient noise. In some embodiments, when the noise at the target spatial position comes from a plurality of spatial noise sources, the speaker 130 may output, based on the noise reduction signal, target signals corresponding to the plurality of spatial noise sources. For example, if the plurality of spatial noise sources include a first spatial noise source and a second spatial noise source, the speaker 130 may output a first target signal of approximately opposite phase and approximately equal amplitude to the noise of the first spatial noise source to cancel that noise, and a second target signal of approximately opposite phase and approximately equal amplitude to the noise of the second spatial noise source to cancel that noise. In some embodiments, when the speaker 130 is an air conduction speaker, the position where the target signal and the ambient noise cancel may be the target spatial position. Since the distance between the target spatial position and the user's ear canal is small, the noise at the target spatial position may be regarded approximately as the noise at the user's ear canal. Therefore, when the noise reduction signal and the noise at the target spatial position cancel each other, the ambient noise transmitted to the user's ear canal is approximately eliminated, realizing active noise reduction of the acoustic device 100. In some embodiments, when the speaker 130 is a bone conduction speaker, the position where the target signal and the ambient noise cancel may be the basilar membrane. The target signal and the ambient noise cancel at the user's basilar membrane, thereby realizing active noise reduction of the acoustic device 100.
It should be noted that the above description of the process 300 is merely provided for the purposes of example and illustration, and does not limit the scope of application of the present disclosure. For those skilled in the art, various modifications and changes may be made to the process 300 under the guidance of the present disclosure. For example, steps in the process 300 may be added, omitted, or combined. As another example, signal processing (e.g., filtering) may also be performed on the ambient noise. Such modifications and changes remain within the scope of the present disclosure.
FIG. 4 is an exemplary flowchart of noise reduction of an acoustic device according to some embodiments of the present disclosure. In some embodiments, the process 400 may be performed by the acoustic device 100. As shown in FIG. 4, the process 400 may include:
In step 410, ambient noise is picked up. In some embodiments, this step may be performed by the microphone array 110. In some embodiments, step 410 may be performed in a manner similar to step 310, and the relevant description is not repeated here.
In step 420, the noise at the target spatial position is estimated based on the picked-up ambient noise. In some embodiments, this step may be performed by the processor 120. In some embodiments, step 420 may be performed in a manner similar to step 320, and the relevant description is not repeated here.
In step 430, the sound field at the target spatial position is estimated. In some embodiments, this step may be performed by the processor 120.
In some embodiments, the processor 120 may estimate the sound field at the target spatial position using the microphone array 110. Specifically, the processor 120 may construct a virtual microphone based on the microphone array 110 and estimate the sound field at the target spatial position based on the virtual microphone. More description of estimating the sound field at the target spatial position based on a virtual microphone may be found elsewhere in the present disclosure, e.g., FIGS. 9-10 and their corresponding descriptions.
In step 440, a noise reduction signal is generated based on the noise at the target spatial position and the estimated sound field at the target spatial position. In some embodiments, step 440 may be performed by the processor 120.
In some embodiments, the processor 120 may adjust the parameter information (e.g., frequency information, amplitude information, phase information) of the noise at the target spatial position according to the physical quantities related to the sound field at the target spatial position obtained in step 430 (e.g., sound pressure, sound frequency, sound amplitude, sound phase, sound source vibration velocity, density of the medium (e.g., air)) to generate the noise reduction signal. For example, the processor 120 may determine whether the physical quantities related to the sound field (e.g., sound frequency, sound amplitude, sound phase) are the same as the parameter information of the noise at the target spatial position. If they are the same, the processor 120 may leave the parameter information of the noise at the target spatial position unadjusted. If they are different, the processor 120 may determine the difference between the physical quantities related to the sound field and the parameter information of the noise at the target spatial position, and adjust the parameter information of the noise at the target spatial position based on the difference. Merely by way of example, when the difference exceeds a certain range, the processor 120 may take the average of the physical quantities related to the sound field and the parameter information of the noise at the target spatial position as the adjusted parameter information of the noise at the target spatial position, and generate the noise reduction signal based on the adjusted parameter information. As another example, since the noise in the environment changes continuously, by the time the processor 120 generates the noise reduction signal, the noise at the target spatial position in the actual environment may already have changed slightly. Therefore, the processor 120 may estimate the change in the parameter information of the ambient noise at the target spatial position according to the time at which the microphone array picked up the ambient noise, the current time, and the physical quantities related to the sound field at the target spatial position (e.g., sound source vibration velocity, density of the medium (e.g., air)), and adjust the parameter information of the noise at the target spatial position based on the estimated change. Through the above adjustment, the amplitude information and frequency information of the noise reduction signal can better match the amplitude information and frequency information of the current ambient noise at the target spatial position, and the phase information of the noise reduction signal can better match the inverse of the phase of the current ambient noise at the target spatial position, so that the noise reduction signal can cancel the ambient noise more accurately, improving the noise reduction effect and the user's listening experience.
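The reconciliation rule given as an example above (keep the estimate when it agrees with the sound-field measurement, fall back to their average when the difference exceeds a range) reduces to a few lines. The function name and threshold semantics are illustrative, not from the disclosure:

```python
def adjust_noise_estimate(field_value, noise_value, max_diff):
    """Reconcile a sound-field measurement with the noise-parameter estimate
    at the target spatial position: keep the estimate when the two agree
    within `max_diff`, otherwise take their average (the example strategy
    described in the text)."""
    if abs(field_value - noise_value) <= max_diff:
        return noise_value
    return (field_value + noise_value) / 2.0
```

The same rule can be applied per parameter (amplitude, frequency, phase) with a threshold chosen for each.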
In some embodiments, when the position of the acoustic device 100 changes, e.g., when the head of the user wearing the acoustic device 100 rotates, the ambient noise (e.g., its direction, amplitude, and phase) changes accordingly, and the speed at which the acoustic device 100 performs noise reduction may fail to keep up with the speed at which the ambient noise changes, causing the active noise reduction to fail or even to increase the noise. To address this, the processor 120 may update the noise at the target spatial position and the sound field estimate at the target spatial position based on motion information of the acoustic device 100 (e.g., motion trajectory, motion direction, motion speed, motion acceleration, motion angular velocity, motion-related time information) acquired by one or more sensors 140 of the acoustic device 100. Further, the processor 120 may generate the noise reduction signal based on the updated noise and sound field estimate at the target spatial position. The one or more sensors 140 can record the motion information of the acoustic device 100, allowing the processor 120 to update the noise reduction signal rapidly. This can improve the noise tracking performance of the acoustic device 100, so that the noise reduction signal cancels the ambient noise more accurately, further improving the noise reduction effect and the user's listening experience.
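The simplest instance of such a motion-based update is directional: when the device rotates with the head, a fixed external noise source appears rotated the opposite way in device coordinates, so its estimated azimuth can be re-aimed from the gyroscope reading alone, without waiting for a fresh acoustic measurement. A sketch under that single-axis assumption (not a method stated in the disclosure):

```python
def update_source_azimuth(azimuth_deg, head_rotation_deg):
    """Re-aim the azimuth estimate of a stationary noise source (degrees,
    device frame) after the device rotates by `head_rotation_deg` about
    the vertical axis; positive rotation shifts the source the other way."""
    return (azimuth_deg - head_rotation_deg) % 360.0
```

Amplitude and phase updates would similarly use the recorded motion and timing information, but require a propagation model rather than a one-line correction.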
In some embodiments, the processor 120 may divide the picked-up ambient noise into a plurality of frequency bands corresponding to different frequency ranges. For example, the processor 120 may divide the picked-up ambient noise into four frequency bands: 100-300 Hz, 300-500 Hz, 500-800 Hz, and 800-1500 Hz. In some embodiments, each frequency band contains the parameter information (e.g., frequency information, amplitude information, phase information) of the ambient noise in the corresponding frequency range. For at least one of the plurality of frequency bands, the processor 120 may perform steps 420-440 on the band to generate a noise reduction signal corresponding to each of the at least one frequency band. For example, the processor 120 may perform steps 420-440 on the 300-500 Hz band and the 500-800 Hz band of the four bands to generate noise reduction signals corresponding to the 300-500 Hz band and the 500-800 Hz band, respectively. Further, in some embodiments, the speaker 130 may output target signals corresponding to the respective frequency bands based on the noise reduction signals of those bands. For example, the speaker 130 may output a target signal of approximately opposite phase and approximately equal amplitude to the noise in the 300-500 Hz band to cancel the noise in that band, and a target signal of approximately opposite phase and approximately equal amplitude to the noise in the 500-800 Hz band to cancel the noise in that band.
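Per-band cancellation as described above can be sketched with FFT band masks: extract the noise in the 300-500 Hz and 500-800 Hz bands, emit their anti-phase sum, and leave other bands (here a 1000 Hz component) untouched. The tone frequencies and sample rate are illustrative assumptions:

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
# Ambient noise with energy in two processed bands plus one untouched band.
noise = (np.sin(2 * np.pi * 400 * t)     # falls in the 300-500 Hz band
         + np.sin(2 * np.pi * 650 * t)   # falls in the 500-800 Hz band
         + np.sin(2 * np.pi * 1000 * t)) # outside the processed bands

def band_noise(signal, lo, hi):
    """Extract the part of `signal` inside the band [lo, hi) Hz."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    return np.fft.irfft(np.where((freqs >= lo) & (freqs < hi), spec, 0),
                        n=len(signal))

# Anti-phase target signals for the 300-500 Hz and 500-800 Hz bands only.
target = -(band_noise(noise, 300, 500) + band_noise(noise, 500, 800))
residual = noise + target                # only the 1000 Hz component remains
```

The residual retains exactly the unprocessed band, matching the example in which only selected bands are cancelled.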
In some embodiments, the processor 120 may also update the noise reduction signal according to the user's manual input. For example, when the user wears the acoustic device 100 to play music in a relatively noisy environment and the user's listening experience is unsatisfactory, the user may manually adjust the parameter information (e.g., frequency information, phase information, amplitude information) of the noise reduction signal according to his or her own hearing. As another example, when a special user (e.g., a hearing-impaired user or an elderly user) uses the acoustic device 100, the hearing ability of the special user differs from that of an ordinary user, and the noise reduction signal generated by the acoustic device 100 itself may fail to meet the special user's needs, resulting in a poor listening experience. In this case, adjustment multiples for the parameter information of the noise reduction signal may be preset, and the special user may adjust the noise reduction signal according to his or her own hearing and the preset adjustment multiples, thereby updating the noise reduction signal to improve the special user's listening experience. In some embodiments, the user may manually adjust the noise reduction signal through keys on the acoustic device 100. In other embodiments, the user may adjust the noise reduction signal through a terminal device. Specifically, suggested parameter information of the noise reduction signal may be displayed to the user on the acoustic device 100 or on an external device (e.g., a mobile phone, a tablet computer, a computer) communicatively connected to the acoustic device 100, and the user may fine-tune the parameter information according to his or her own listening experience.
It should be noted that the above description of the process 400 is merely provided for the purposes of example and illustration, and does not limit the scope of application of the present disclosure. For those skilled in the art, various modifications and changes may be made to the process 400 under the guidance of the present disclosure. For example, steps in the process 400 may be added, omitted, or combined. Such modifications and changes remain within the scope of the present disclosure.
FIGS. 5A-5D are schematic diagrams of exemplary arrangements of a microphone array (e.g., the microphone array 110) according to some embodiments of the present disclosure. In some embodiments, the microphone array may be arranged in a regular geometric shape. As shown in FIG. 5A, the microphone array may be a linear array. In some embodiments, the microphone array may also be arranged in other shapes. For example, as shown in FIG. 5B, the microphone array may be a cross-shaped array. As another example, as shown in FIG. 5C, the microphone array may be a circular array. In some embodiments, the microphone array may also be arranged in an irregular geometric shape. For example, as shown in FIG. 5D, the microphone array may be an irregular array. It should be noted that the arrangement of the microphone array is not limited to the linear, cross-shaped, circular, and irregular arrays shown in FIGS. 5A-5D; arrays of other shapes, e.g., triangular arrays, spiral arrays, planar arrays, three-dimensional arrays, radial arrays, etc., are also possible, which is not limited in the present disclosure.
In some embodiments, each short solid line in FIGS. 5A-5D may be regarded as one microphone or one group of microphones. When each short solid line is regarded as a group of microphones, the number of microphones in each group may be the same or different, the types of microphones in each group may be the same or different, and the orientations of the microphones in each group may be the same or different. The types, numbers, and orientations of the microphones may be adaptively adjusted according to the actual application, which is not limited in the present disclosure.
In some embodiments, the microphones in the microphone array may be uniformly distributed. Uniform distribution here may mean that the spacing between any two adjacent microphones in the microphone array is the same. In some embodiments, the microphones in the microphone array may also be non-uniformly distributed. Non-uniform distribution here may mean that the spacing between adjacent microphones in the microphone array differs. The spacing between the microphones in the microphone array may be adaptively adjusted according to the actual situation, which is not limited in the present disclosure.
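The uniform linear and circular arrangements described above are easy to express as coordinate generators; the element counts and spacings below are illustrative, not values from the disclosure:

```python
import numpy as np

def linear_array(n, spacing):
    """Uniform linear array: n microphones along the x-axis with equal
    spacing between adjacent elements."""
    return np.stack([np.arange(n) * spacing, np.zeros(n)], axis=1)

def circular_array(n, radius):
    """Uniform circular array: n microphones evenly spaced on a circle."""
    angles = 2 * np.pi * np.arange(n) / n
    return np.stack([radius * np.cos(angles), radius * np.sin(angles)], axis=1)

line = linear_array(8, 0.01)     # 8 mics, 1 cm apart
ring = circular_array(6, 0.02)   # 6 mics on a 2 cm-radius circle
```

A non-uniform array would simply supply an explicit list of positions instead of a constant spacing.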
FIGS. 6A-6B are schematic diagrams of exemplary arrangements of a microphone array (e.g., the microphone array 110) according to some embodiments of the present disclosure. As shown in FIG. 6A, when the user wears an acoustic device with a microphone array, the microphone array is arranged at or around the human ear in a semicircular arrangement. As shown in FIG. 6B, the microphone array is arranged at the human ear in a linear arrangement. It should be noted that the arrangement of the microphone array is not limited to the semicircular and linear shapes shown in FIGS. 6A and 6B, and the placement of the microphone array is not limited to the positions shown in FIGS. 6A and 6B; the semicircular and linear shapes and the placements shown here are for illustrative purposes only.
FIG. 7 is an exemplary flowchart of estimating the noise at a target spatial position according to some embodiments of the present disclosure. As shown in FIG. 7, the process 700 may include:
In step 710, one or more spatial noise sources related to the ambient noise picked up by the microphone array are determined. In some embodiments, this step may be performed by the processor 120. As used herein, determining a spatial noise source refers to determining information related to the spatial noise source, e.g., the location of the spatial noise source (including its direction and its distance from the target spatial position), the phase of the spatial noise source, the amplitude of the spatial noise source, etc.
In some embodiments, a spatial noise source related to the ambient noise refers to a noise source whose sound waves can be transmitted to the user's ear canal (e.g., the target spatial position) or near the user's ear canal. In some embodiments, the spatial noise sources may be noise sources in different directions relative to the user's body (e.g., in front of, behind). For example, there may be crowd noise in front of the user's body and vehicle horn noise to the user's left; in this case, the spatial noise sources include the crowd noise source in front of the user's body and the vehicle horn noise source to the user's left. In some embodiments, the microphone array (e.g., the microphone array 110) may pick up spatial noise from all directions around the user's body and convert the spatial noise into electrical signals transmitted to the processor 120. The processor 120 may analyze the electrical signals corresponding to the spatial noise to obtain the parameter information (e.g., frequency information, amplitude information, phase information) of the picked-up spatial noise in each direction. The processor 120 then determines the information of the spatial noise sources in each direction, e.g., the direction of a spatial noise source, its distance, its phase, and its amplitude, according to the parameter information of the spatial noise in each direction. In some embodiments, the processor 120 may determine the spatial noise sources through a noise localization algorithm based on the spatial noise picked up by the microphone array (e.g., the microphone array 110). The noise localization algorithm may include one or more of a beamforming algorithm, a super-resolution spatial spectrum estimation algorithm, a time difference of arrival algorithm (also referred to as a delay estimation algorithm), etc. A beamforming algorithm is a sound source localization method based on steerable beamforming with maximum output power. Merely by way of example, beamforming algorithms may include the steered response power-phase transform (SRP-PHAT) algorithm, delay-and-sum beamforming, differential microphone algorithms, the generalized sidelobe canceller (GSC) algorithm, the minimum variance distortionless response (MVDR) algorithm, etc. Super-resolution spatial spectrum estimation algorithms may include the autoregressive (AR) model, minimum variance (MV) spectrum estimation, and eigenvalue decomposition methods (e.g., the multiple signal classification (MUSIC) algorithm), all of which compute the correlation matrix of the spatial spectrum from the sound signals (e.g., spatial noise) picked up by the microphone array and effectively estimate the directions of the spatial noise sources. A time difference of arrival algorithm may first estimate the differences in arrival time of the sound, obtaining the time differences of arrival (TDOAs) between microphones in the microphone array, and then use the obtained TDOAs together with the known spatial positions of the microphones in the array to further locate the positions of the spatial noise sources.
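The TDOA step can be sketched for a two-microphone case: cross-correlate the two signals, take the lag of the peak as the delay, and convert the delay to a far-field arrival angle. The sample rate, 5 cm spacing, and 2-sample delay are illustrative assumptions, and the plain cross-correlation stands in for the more robust generalized (phase-transform-weighted) variants used in practice:

```python
import numpy as np

fs = 16000                      # assumed sample rate (Hz)
c = 343.0                       # speed of sound (m/s)
d = 0.05                        # assumed mic spacing (m)
rng = np.random.default_rng(0)
src = rng.standard_normal(2048)          # broadband noise from the source

true_delay = 2                           # samples: mic2 hears the wavefront later
mic1 = src
mic2 = np.concatenate([np.zeros(true_delay), src[:-true_delay]])

# Cross-correlate and take the lag at the peak as the TDOA in samples.
corr = np.correlate(mic2, mic1, mode="full")
lag = np.argmax(corr) - (len(mic1) - 1)          # estimated delay in samples
tau = lag / fs                                   # delay in seconds
angle = np.degrees(np.arcsin(np.clip(c * tau / d, -1, 1)))  # far-field DOA
```

With more than two microphones, the pairwise delays are combined with the known array geometry to solve for the source position rather than just a direction.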
For example, a delay estimation algorithm may compute the time differences with which the ambient noise signal arrives at the different microphones in the microphone array, and then determine the position of a noise source from the geometric relationships. As another example, the SRP-PHAT algorithm may perform beamforming toward each candidate source direction, and the direction with the strongest beam energy may be regarded approximately as the direction of a noise source. As yet another example, the MUSIC algorithm may perform eigenvalue decomposition on the covariance matrix of the ambient noise signals picked up by the microphone array to obtain the subspaces of the ambient noise signals, thereby separating out the directions of the ambient noise. More description of determining noise sources may be found elsewhere in the present disclosure, e.g., FIG. 8 and its corresponding description.
In some embodiments, a spatial super-resolution image of the ambient noise may be formed by methods such as synthetic aperture, sparse recovery, co-prime arrays, etc. The spatial super-resolution image may be used to reflect a signal reflection map of the ambient noise, so as to further improve the localization accuracy of the spatial noise sources.
In some embodiments, the processor 120 may divide the picked-up ambient noise into multiple frequency bands of a specific bandwidth (for example, one band every 500 Hz), each frequency band corresponding to a different frequency range, and determine, for at least one frequency band, the spatial noise source corresponding to that band. For example, the processor 120 may perform signal analysis on each frequency band of the ambient noise, obtain the parameter information of the ambient noise corresponding to that band, and determine the spatial noise source corresponding to the band according to the parameter information. As another example, the processor 120 may determine the spatial noise source corresponding to each frequency band through a noise localization algorithm.
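The band-splitting step above can be sketched with an FFT-based split into 500 Hz bands; the function name and parameters below are illustrative assumptions only.

```python
import numpy as np

def split_into_bands(noise, fs, band_hz=500.0):
    """Split a noise signal into frequency bands of width `band_hz` and
    return the signal power in each band."""
    spectrum = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(len(noise), d=1.0 / fs)
    n_bands = int(np.ceil(freqs[-1] / band_hz))
    powers = np.zeros(n_bands)
    for k in range(n_bands):
        mask = (freqs >= k * band_hz) & (freqs < (k + 1) * band_hz)
        powers[k] = np.sum(np.abs(spectrum[mask]) ** 2)
    return powers

# A 750 Hz tone should fall in the second band (500-1000 Hz).
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 750 * t)
powers = split_into_bands(tone, fs)
print(int(np.argmax(powers)))   # -> 1
```

Per-band parameter information (here, power) can then feed the per-band noise localization described above.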
In step 720, the noise at the target spatial position is estimated based on the spatial noise sources. In some embodiments, this step may be performed by the processor 120. As used herein, estimating the noise at the target spatial position refers to estimating the parameter information of the noise at the target spatial position, for example, frequency information, amplitude information, phase information, and the like.
In some embodiments, the processor 120 may estimate, based on the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the spatial noise sources located in various directions relative to the user's body obtained in step 710, the parameter information of the noise that each spatial noise source transmits to the target spatial position, and thereby estimate the noise at the target spatial position. For example, if there is a spatial noise source in each of a first orientation (e.g., in front of) and a second orientation (e.g., behind) the user's body, the processor 120 may estimate, according to the position information, frequency information, phase information, or amplitude information of the first spatial noise source, the frequency information, phase information, or amplitude information of its noise when that noise reaches the target spatial position. Likewise, the processor 120 may estimate, according to the position information, frequency information, phase information, or amplitude information of the second spatial noise source, the frequency information, phase information, or amplitude information of its noise when that noise reaches the target spatial position. Further, the processor 120 may estimate the noise at the target spatial position based on the frequency information, phase information, or amplitude information of the first and second spatial noise sources as transmitted to the target spatial position. By way of example only, the processor 120 may estimate the noise information at the target spatial position using a virtual microphone technique or other methods. In some embodiments, the processor 120 may extract the parameter information of the noise of a spatial noise source from the frequency response curve of the spatial noise source picked up by the microphone array through a feature extraction method. In some embodiments, the method for extracting the parameter information of the noise of the spatial noise source may include, but is not limited to, Principal Components Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), Singular Value Decomposition (SVD), and so on.
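The superposition of the two sources' contributions at the target position can be sketched as follows, assuming free-field point sources with 1/r spreading; the positions, amplitudes, and the helper name `pressure_at` are illustrative assumptions.

```python
import numpy as np

def pressure_at(point, sources, freq, c=343.0):
    """Complex sound pressure at `point` from monopole `sources`, each given
    as (position, complex_amplitude); free-field 1/r spreading is assumed."""
    k = 2 * np.pi * freq / c                # wavenumber
    p = 0j
    for pos, amp in sources:
        r = np.linalg.norm(np.asarray(point) - np.asarray(pos))
        p += amp * np.exp(-1j * k * r) / r  # phase delay and 1/r attenuation
    return p

# Two noise sources: one in front of and one behind the target position.
target = np.array([0.0, 0.0])
sources = [((2.0, 0.0), 1.0 + 0j),   # first orientation (front)
           ((-3.0, 0.0), 0.5 + 0j)]  # second orientation (behind)
p = pressure_at(target, sources, freq=1000.0)
print(abs(p), np.angle(p))
```

The magnitude and angle of `p` correspond to the amplitude information and phase information of the combined noise at the target spatial position.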
It should be noted that the above description of the process 700 is merely for example and illustration, and does not limit the scope of application of the present application. Those skilled in the art may make various modifications and changes to the process 700 under the guidance of the present application. For example, the process 700 may further include steps of locating the spatial noise sources, extracting the parameter information of the noise of the spatial noise sources, and the like. As another example, step 710 and step 720 may be combined into one step. Such modifications and changes remain within the scope of the present application.
FIG. 8 is a schematic diagram of estimating the noise at a target spatial position according to some embodiments of the present application. The time difference of arrival algorithm is taken as an example below to illustrate how the localization of spatial noise sources is realized. As shown in FIG. 8, a processor (e.g., the processor 120) may calculate the time differences with which the noise signals generated by the noise sources (e.g., 811, 812, 813) arrive at different microphones (e.g., the microphone 821, the microphone 822, etc.) in the microphone array 820, and then, combined with the known spatial position of the microphone array 820, determine the positions of the noise sources through the positional relationships (e.g., distance, relative azimuth) between the microphone array 820 and the noise sources.
After obtaining the positions of the noise sources (e.g., 811, 812, 813), the processor may estimate, according to the positions of the noise sources, the phase delay and amplitude change of the noise signal emitted by each noise source as it propagates from the noise source to the target spatial position 830. According to the phase delay, the amplitude change, and the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the noise signals emitted by the spatial noise sources, the processor may obtain the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the ambient noise as transmitted to the target spatial position 830, thereby estimating the noise at the target spatial position.
It should be noted that the noise sources 811, 812, and 813, the microphone array 820, the microphones 821 and 822 in the microphone array 820, and the target spatial position 830 described in FIG. 8 are merely for example and illustration, and do not limit the scope of application of the present application. Those skilled in the art may make various modifications and changes under the guidance of the present application. For example, the microphones in the microphone array 820 are not limited to the microphone 821 and the microphone 822, and the microphone array 820 may further include more microphones, and the like. Such modifications and changes remain within the scope of the present application.
FIG. 9 is an exemplary flowchart of estimating the noise and sound field at a target spatial position according to some embodiments of the present application. As shown in FIG. 9, the process 900 may include:
In step 910, a virtual microphone is constructed based on a microphone array (e.g., the microphone array 110, the microphone array 820). In some embodiments, this step may be performed by the processor 120.
In some embodiments, the virtual microphone may be used to represent or simulate the audio data that a microphone would collect if the microphone were placed at the target spatial position. That is, the audio data obtained through the virtual microphone may be approximate or equivalent to the audio data that a physical microphone would collect if it were placed at the target spatial position.
In some embodiments, the virtual microphone may include a mathematical model. The mathematical model may embody the relationship between the noise or sound field estimate at the target spatial position, the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the ambient noise picked up by the microphone array, and the parameters of the microphone array. The parameters of the microphone array may include one or more of the arrangement of the microphone array, the spacing between the microphones, the number and positions of the microphones in the microphone array, and the like. The mathematical model may be obtained by calculation based on an initial mathematical model, the parameters of the microphone array, and the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the sound (e.g., ambient noise) picked up by the microphone array. For example, the initial mathematical model may include parameters corresponding to the parameters of the microphone array and to the parameter information of the ambient noise picked up by the microphone array, as well as model parameters. The parameters of the microphone array, the parameter information of the sound picked up by the microphone array, and initial values of the model parameters are substituted into the initial mathematical model to obtain a predicted noise or sound field at the target spatial position. The predicted noise or sound field is then compared with data (noise and sound field estimates) obtained by a physical microphone placed at the target spatial position so as to adjust the model parameters of the mathematical model. Based on the above adjustment method, the mathematical model is obtained through multiple adjustments using a large amount of data (for example, the parameters of the microphone array and the parameter information of the ambient noise picked up by the microphone array).
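The fit-then-compare procedure described above can be illustrated with a minimal least-squares sketch, assuming (purely for illustration) that the mathematical model is a fixed linear combination of the array signals; all signals here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Calibration data: signals from a 3-microphone array and, during
# calibration only, a physical microphone placed at the target position.
n_samples = 2000
array_signals = rng.standard_normal((n_samples, 3))
true_weights = np.array([0.6, 0.3, 0.1])           # unknown propagation mix
target_mic = array_signals @ true_weights + 0.01 * rng.standard_normal(n_samples)

# Adjust the model parameters by least squares against the physical microphone.
weights, *_ = np.linalg.lstsq(array_signals, target_mic, rcond=None)

# At run time the physical microphone is removed; the virtual microphone
# predicts the target-position signal from the array alone.
new_array = rng.standard_normal((5, 3))
prediction = new_array @ weights
print(np.round(weights, 2))
```

The fitted `weights` play the role of the adjusted model parameters; once calibrated, only the array signals are needed to predict the target-position sound.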
In some embodiments, the virtual microphone may include a machine learning model. The machine learning model may be obtained through training based on the parameters of the microphone array and the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the sound (e.g., ambient noise) picked up by the microphone array. For example, the machine learning model may be obtained by training an initial machine learning model (e.g., a neural network model) using the parameters of the microphone array and the parameter information of the sound picked up by the microphone array as training samples. Specifically, the parameters of the microphone array and the parameter information of the sound picked up by the microphone array may be input into the initial machine learning model to obtain a prediction result (e.g., estimates of the noise and sound field at the target spatial position). The prediction result is then compared with data (noise and sound field estimates) obtained by a physical microphone placed at the target spatial position so as to adjust the parameters of the initial machine learning model. Based on the above adjustment method, the parameters of the initial machine learning model are optimized over multiple iterations using a large amount of data (for example, the parameters of the microphone array and the parameter information of the ambient noise picked up by the microphone array) until the prediction result of the initial machine learning model is the same as or approximately the same as the data obtained by the physical microphone placed at the target spatial position, at which point the machine learning model is obtained.
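The iterative training loop described above can be sketched with a single-neuron model trained by gradient descent; the network form, learning rate, and synthetic data are illustrative assumptions and do not represent the actual model of this disclosure.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic training pairs: array features -> signal at the target position
# (the "physical microphone" recording used during training).
X = rng.standard_normal((1000, 3))
y = np.tanh(X @ np.array([0.8, -0.4, 0.2]))        # unknown mapping

# One-neuron network trained by gradient descent on the mean squared error
# between its prediction and the physical-microphone data.
w = np.zeros(3)
lr = 0.1
for _ in range(500):
    pred = np.tanh(X @ w)
    err = pred - y
    grad = X.T @ (err * (1 - pred ** 2)) / len(X)  # chain rule through tanh
    w -= lr * grad                                  # adjust model parameters
mse = float(np.mean((np.tanh(X @ w) - y) ** 2))
print(round(mse, 4))
```

Training stops once the prediction is approximately the same as the physical-microphone data, i.e., the mean squared error is close to zero.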
The virtual microphone technique can move the physical microphone away from positions where it is difficult to place a microphone (e.g., the target spatial position). For example, in order to leave the user's ears open without blocking the user's ear canal, a physical microphone cannot be placed at the position of the user's ear hole (e.g., the target spatial position). In this case, the microphone array may be placed, through the virtual microphone technique, at a position close to the user's ear that does not block the ear canal, for example, at the user's auricle, and a virtual microphone at the position of the user's ear hole may then be constructed through the microphone array. The virtual microphone may use physical microphones at a first position (i.e., the microphone array) to predict sound data (e.g., amplitude, phase, sound pressure, sound field, etc.) at a second position (e.g., the target spatial position). In some embodiments, the accuracy of the sound data at the second position (also referred to as a specific position, e.g., the target spatial position) predicted by the virtual microphone may vary with the distance between the virtual microphone and the physical microphones (i.e., the microphone array), the type of the virtual microphone (e.g., a mathematical-model virtual microphone, a machine-learning virtual microphone), etc. For example, the closer the distance between the virtual microphone and the physical microphones (i.e., the microphone array), the more accurate the sound data at the second position predicted by the virtual microphone. As another example, in some specific application scenarios, the sound data at the second position predicted by a machine-learning virtual microphone is more accurate than that predicted by a mathematical-model virtual microphone. In some embodiments, the position corresponding to the virtual microphone (i.e., the second position, e.g., the target spatial position) may be near the microphone array, or may be far away from the microphone array.
In step 920, the noise and sound field at the target spatial position are estimated based on the virtual microphone. In some embodiments, this step may be performed by the processor 120.
In some embodiments, when the virtual microphone is a mathematical model, the processor 120 may, in real time, input the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the ambient noise picked up by the microphone array and the parameters of the microphone array (e.g., the arrangement of the microphone array, the spacing between the microphones, the number of microphones in the microphone array) into the mathematical model as its parameters, so as to estimate the noise and sound field at the target spatial position.
In some embodiments, when the virtual microphone is a machine learning model, the processor 120 may, in real time, input the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the ambient noise picked up by the microphone array and the parameters of the microphone array (e.g., the arrangement of the microphone array, the spacing between the microphones, the number of microphones in the microphone array) into the machine learning model, and estimate the noise and sound field at the target spatial position based on the output of the machine learning model.
It should be noted that the above description of the process 900 is merely for example and illustration, and does not limit the scope of application of the present application. Those skilled in the art may make various modifications and changes to the process 900 under the guidance of the present application. For example, step 920 may be divided into two steps to estimate the noise and the sound field at the target spatial position separately. Such modifications and changes remain within the scope of the present application.
FIG. 10 is a schematic diagram of constructing a virtual microphone according to some embodiments of the present application. As shown in FIG. 10, the target spatial position 1010 may be located near the user's ear canal. In order to leave the user's ears open without blocking the ear canal, a physical microphone cannot be placed at the target spatial position 1010, so the noise and sound field at the target spatial position 1010 cannot be estimated directly through a physical microphone.
In order to estimate the noise and sound field at the target spatial position 1010, a microphone array 1020 may be arranged near the target spatial position 1010. By way of example only, as shown in FIG. 10, the microphone array 1020 may include a first microphone 1021, a second microphone 1022, and a third microphone 1023. Each microphone in the microphone array 1020 (e.g., the first microphone 1021, the second microphone 1022, the third microphone 1023) may pick up ambient noise in the space where the user is located. According to the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the ambient noise picked up by each microphone in the microphone array 1020 and the parameters of the microphone array 1020 (e.g., the arrangement of the microphone array 1020, the spacing between the microphones, the number of microphones in the microphone array 1020), the processor 120 may construct a virtual microphone. Further, based on the virtual microphone, the processor 120 may estimate the noise and sound field at the target spatial position 1010.
It should be noted that the target spatial position 1010, the microphone array 1020, and the first microphone 1021, the second microphone 1022, and the third microphone 1023 in the microphone array 1020 described in FIG. 10 are merely for example and illustration, and do not limit the scope of application of the present application. Those skilled in the art may make various modifications and changes under the guidance of the present application. For example, the microphones in the microphone array 1020 are not limited to the first microphone 1021, the second microphone 1022, and the third microphone 1023, and the microphone array 1020 may further include more microphones, and the like. Such modifications and changes remain within the scope of the present application.
In some embodiments, while picking up ambient noise, a microphone array (e.g., the microphone array 110, the microphone array 820, the microphone array 1020) may also pick up interference signals emitted by the speaker (e.g., the target signal and other sound signals). To prevent the microphone array from picking up the interference signals emitted by the speaker, the microphone array may be arranged at a position far away from the speaker. However, when arranged far away from the speaker, the microphone array may be too far from the target spatial position to accurately estimate the sound field and/or noise at the target spatial position. To solve the above problem, the microphone array may be arranged in a target region where the interference signal received by the microphone array from the speaker is minimized.
In some embodiments, the target region may be the region where the sound pressure level of the speaker is at a minimum, that is, a region where the sound radiated by the speaker is relatively small. In some embodiments, the speaker may form at least one set of acoustic dipoles. For example, a pair of sound signals output from the front and the back of the speaker diaphragm, with approximately opposite phases and approximately equal amplitudes, may be regarded as two point sound sources. The two point sound sources may constitute an acoustic dipole or an approximate acoustic dipole, whose outward radiated sound has obvious directivity. Ideally, the sound radiated by the speaker is larger along the straight line connecting the two point sound sources, the sound radiated in other directions is significantly reduced, and the sound radiated by the speaker is smallest in the region on (or near) the perpendicular bisector of the line connecting the two point sound sources.
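The dipole directivity described above can be checked numerically with a minimal free-field sketch, assuming two anti-phase monopoles of equal amplitude; the source spacing, frequency, and helper name are illustrative assumptions.

```python
import numpy as np

def dipole_pressure(point, d=0.01, freq=1000.0, c=343.0):
    """Complex pressure at `point` from two anti-phase point sources placed
    at (0, +d/2) and (0, -d/2), with free-field 1/r spreading."""
    k = 2 * np.pi * freq / c
    p = 0j
    for y0, amp in ((d / 2, 1.0), (-d / 2, -1.0)):  # opposite phases
        r = np.hypot(point[0], point[1] - y0)
        p += amp * np.exp(-1j * k * r) / r
    return p

# On the axis through the two sources the radiation is strongest ...
on_axis = abs(dipole_pressure((0.0, 0.5)))
# ... while on the perpendicular bisector the two contributions cancel.
on_bisector = abs(dipole_pressure((0.5, 0.0)))
print(on_axis, on_bisector)
```

On the perpendicular bisector both path lengths are equal, so the equal-amplitude, opposite-phase contributions cancel exactly, which is the region of minimum sound pressure level discussed above.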
In some embodiments, the speaker (e.g., the speaker 130) in an acoustic device (e.g., the acoustic device 100) may be a bone conduction speaker. When the speaker is a bone conduction speaker and the interference signal is the sound leakage signal of the bone conduction speaker, the target region may be the region where the sound pressure level of the sound leakage signal of the bone conduction speaker is at a minimum, that is, the region where the sound leakage signal radiated by the bone conduction speaker is smallest. Arranging the microphone array in this region can reduce the interference signal of the bone conduction speaker picked up by the microphone array, and can also effectively solve the problem that a microphone array placed too far from the target spatial position cannot accurately estimate the sound field at the target spatial position.
FIG. 11 is a schematic diagram of the three-dimensional sound field distribution of the sound leakage signal of a bone conduction speaker at 1000 Hz according to some embodiments of the present application. FIG. 12 is a schematic diagram of the two-dimensional sound field distribution of the sound leakage signal of the bone conduction speaker at 1000 Hz according to some embodiments of the present application. As shown in FIGS. 11-12, the acoustic device 1100 may include a contact surface 1110. The contact surface 1110 may be configured to contact the user's body (e.g., face, ears) when the user wears the acoustic device 1100. The bone conduction speaker may be disposed inside the acoustic device 1100. As shown in FIG. 11, the colors on the acoustic device 1100 may represent the sound leakage signal of the bone conduction speaker, and different color depths may represent different magnitudes of the sound leakage signal. The lighter the color, the larger the sound leakage signal of the bone conduction speaker; the darker the color, the smaller the sound leakage signal. As shown in FIG. 11, compared with other regions, the region 1120 where the dotted line is located has a darker color and a smaller sound leakage signal, so the region 1120 may be the region where the sound pressure level of the sound leakage signal of the bone conduction speaker is at a minimum. By way of example only, the microphone array may be arranged in the region 1120 where the dotted line is located (e.g., position 1), so that the sound leakage signal received from the bone conduction speaker is relatively small.
In some embodiments, the sound pressure in the region where the sound pressure level of the bone conduction speaker is at a minimum may be 5-30 dB lower than the maximum output sound pressure of the bone conduction speaker. In some embodiments, it may be 7-28 dB lower. In some embodiments, it may be 9-26 dB lower. In some embodiments, it may be 11-24 dB lower. In some embodiments, it may be 13-22 dB lower. In some embodiments, it may be 15-20 dB lower. In some embodiments, it may be 17-18 dB lower. In some embodiments, the sound pressure in the region where the sound pressure level of the bone conduction speaker is at a minimum may be 15 dB lower than the maximum output sound pressure of the bone conduction speaker.
The two-dimensional sound field distribution shown in FIG. 12 is a two-dimensional cross-sectional view of the three-dimensional sound field distribution of the sound leakage signal in FIG. 11. As shown in FIG. 12, the colors on the cross section may represent the sound leakage signal of the bone conduction speaker, and different color depths may represent different magnitudes of the sound leakage signal. The lighter the color, the larger the sound leakage signal of the bone conduction speaker; the darker the color, the smaller the sound leakage signal. As shown in FIG. 12, compared with other regions, the regions 1210 and 1220 where the dotted lines are located have darker colors and smaller sound leakage signals. Therefore, the regions 1210 and 1220 may be the regions where the sound pressure level of the sound leakage signal of the bone conduction speaker is at a minimum. By way of example only, the microphone array may be arranged in the regions 1210 and 1220 where the dotted lines are located (e.g., position A and position B), so that the sound leakage signal received from the bone conduction speaker is relatively small.
In some embodiments, the vibration signal emitted by the bone conduction speaker during vibration is relatively large. Therefore, not only the sound leakage signal of the bone conduction speaker but also its vibration signal may interfere with the microphone array. Here, the vibration signal of the bone conduction speaker may refer to the vibration of other components of the acoustic device (e.g., the housing, the microphone array) driven by the vibration of the vibrating component of the bone conduction speaker. In this case, the interference signal of the bone conduction speaker may include the sound leakage signal and the vibration signal of the bone conduction speaker. To prevent the microphone array from picking up the interference signals of the bone conduction speaker, the target region where the microphone array is located may be the region where the total energy of the sound leakage signal and the vibration signal of the bone conduction speaker transmitted to the microphone array is at a minimum. The sound leakage signal and the vibration signal of the bone conduction speaker are relatively independent signals, and the region where the sound pressure level of the sound leakage signal is at a minimum does not necessarily represent the region where the total energy of the sound leakage signal and the vibration signal is at a minimum. Therefore, determining the target region requires analyzing the total signal composed of the vibration signal and the sound leakage signal of the bone conduction speaker.
FIG. 13 is a schematic diagram illustrating the frequency response of the total signal formed by the vibration signal and the leakage signal of a bone conduction speaker according to some embodiments of the present application. FIG. 13 shows the frequency response curves of the total signal at position 1, position 2, position 3, and position 4 on the acoustic device 1100 in FIG. 11. As shown in FIG. 13, the abscissa may represent the frequency, and the ordinate may represent the sound pressure of the total signal. According to the description of FIG. 11, when only the leakage signal of the bone conduction speaker is considered, position 1, located in the minimum sound pressure level region of the speaker 130, may serve as the target region for arranging the microphone array (e.g., the microphone array 110, the microphone array 820, or the microphone array 1020). However, when both the vibration signal and the leakage signal are considered, the target region for arranging the microphone array (i.e., the region where the sound pressure of the total signal of the vibration signal and the leakage signal is minimal) is not necessarily position 1. Referring to FIG. 13, compared with the other positions, the sound pressure of the total signal corresponding to position 2 is relatively small. Therefore, position 2 may serve as the target region for arranging the microphone array.
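Because the leakage and vibration signals add coherently at the microphone, the comparison underlying FIG. 13 can be sketched as summing the two complex responses at each candidate position and selecting the smallest total. All response values below are invented for illustration; they merely reproduce the situation where position 1 has the smallest leakage yet position 2 has the smallest total signal:

```python
import numpy as np

def best_position(leak, vib):
    """leak, vib: dicts mapping position name -> complex frequency response
    (a single frequency bin shown for brevity). The total signal is the
    coherent sum, so the position with the smallest leakage response alone
    need not minimise the total."""
    total = {p: abs(leak[p] + vib[p]) for p in leak}
    return min(total, key=total.get), total

# Illustrative values: "pos1" has the smallest leakage magnitude, but the
# vibration pickup at "pos2" partially cancels its leakage, giving the
# smallest total.
leak = {"pos1": 0.10 + 0.00j, "pos2": 0.30 + 0.00j, "pos3": 0.50 + 0.10j}
vib  = {"pos1": 0.40 + 0.20j, "pos2": -0.25 + 0.05j, "pos3": 0.20 + 0.30j}
pos, total = best_position(leak, vib)
```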
In some embodiments, the location of the target region may be related to the orientation of the diaphragms of the microphones in the microphone array. The orientation of a microphone's diaphragm can affect the magnitude of the vibration signal of the bone conduction speaker received by the microphone. For example, when the diaphragm of the microphone is perpendicular to the vibrating component of the bone conduction speaker, the vibration signal collected by the microphone is relatively small. As another example, when the diaphragm of the microphone is parallel to the vibrating component of the bone conduction speaker, the vibration signal collected by the microphone is relatively large. In some embodiments, the vibration signal of the bone conduction speaker received by the microphone can therefore be reduced by setting the orientation of the microphone diaphragm. For example, when the diaphragm of the microphone is perpendicular to the vibrating component of the bone conduction speaker, the vibration signal may be ignored when determining the target position for arranging the microphone array, and only the leakage signal of the bone conduction speaker needs to be considered; that is, the target position may be determined according to the descriptions of FIG. 11 and FIG. 12. As another example, when the diaphragm of the microphone is parallel to the vibrating component of the bone conduction speaker, both the vibration signal and the leakage signal may be considered when determining the target position; that is, the target position may be determined according to the description of FIG. 13.
In some embodiments, the phase of the vibration signal of the bone conduction speaker received by the microphone can be adjusted by adjusting the orientation of the diaphragm of the microphone, so that the vibration signal received by the microphone is approximately opposite in phase and approximately equal in magnitude to the leakage signal received by the microphone. In this way, the vibration signal and the leakage signal can at least partially cancel each other, thereby reducing the interference signal of the bone conduction speaker received by the microphone array. In some embodiments, the vibration signal of the bone conduction speaker received by the microphone can reduce the leakage signal of the bone conduction speaker received by the microphone by 5-6 dB.
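The partial cancellation described above can be checked with a two-phasor sum. The amplitude ratio and phase offset below are illustrative assumptions chosen to land in the 5-6 dB range mentioned; they are not measured values:

```python
import numpy as np

# Leakage signal (reference, unit amplitude) and vibration signal picked up
# by the microphone. The ratio and offset are illustrative: the vibration
# phasor is 30 degrees away from perfect opposition at 80% of the leakage
# amplitude.
amp_ratio = 0.8                      # vibration amplitude / leakage amplitude
phase_diff = np.deg2rad(150.0)       # phase difference between the two signals

# Residual pickup after the two signals partially cancel.
residual = abs(1.0 + amp_ratio * np.exp(1j * phase_diff))
reduction_db = -20.0 * np.log10(residual)   # attenuation of the leakage pickup
```

With these assumed values the residual is about half the original leakage amplitude, i.e. roughly a 6 dB reduction.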
In some embodiments, the speaker (e.g., the speaker 130) in the acoustic device (e.g., the acoustic device 100) may be an air conduction speaker. When the speaker is an air conduction speaker and the interference signal is the sound signal emitted by the air conduction speaker (i.e., its radiated sound field), the target region may be the region where the sound pressure level of the radiated sound field of the air conduction speaker is minimal. Arranging the microphone array in this minimum sound pressure level region can reduce the interference signal of the air conduction speaker picked up by the microphone array, and can also effectively avoid the problem that the sound field at the target spatial position cannot be accurately estimated because the microphone array is too far away from the target spatial position.
FIGS. 14A-B are schematic diagrams illustrating the sound field distribution of an air conduction speaker according to some embodiments of the present application. As shown in FIGS. 14A-B, the air conduction speaker may be arranged in an open acoustic device 1400 and radiate sound outward through two sound guiding holes (e.g., 1401 and 1402 in FIGS. 14A-B) of the open acoustic device 1400, and the emitted sound may form a dipole (represented by the "+" and "-" shown in FIGS. 14A-B).
As shown in FIG. 14A, the open acoustic device 1400 is arranged so that the line connecting the two poles of the dipole is approximately perpendicular to the user's face. In this case, the sound radiated by the dipole may form three relatively strong sound field regions 1421, 1422, and 1423. Between the sound field regions 1421 and 1423, and between the sound field regions 1422 and 1423, regions with a minimum sound pressure level of the radiated sound field of the air conduction speaker (also referred to as regions with relatively low sound pressure) may be formed, for example, the dotted lines in FIG. 14A and their vicinity. The minimum sound pressure level region may refer to a region where the intensity of the sound output by the open acoustic device 1400 is relatively small. In some embodiments, a microphone 1430 of the microphone array may be arranged in the minimum sound pressure level region. For example, the microphone 1430 may be arranged at the position where the dotted line in FIG. 14A intersects the housing of the open acoustic device 1400, so that the microphone 1430 receives as little of the sound signal emitted by the air conduction speaker as possible while collecting external ambient noise, reducing the interference of the sound signal emitted by the air conduction speaker with the active noise reduction function of the open acoustic device 1400.
As shown in FIG. 14B, the open acoustic device 1400 is arranged so that the line connecting the two poles of the dipole is approximately parallel to the user's face. In this case, the sound radiated by the dipole may form two relatively strong sound field regions 1424 and 1425. Between the sound field regions 1424 and 1425, a region with a minimum sound pressure level of the radiated sound field of the air conduction speaker may be formed, for example, the dotted line in FIG. 14B and its vicinity. In some embodiments, a microphone 1440 of the microphone array may be arranged in the minimum sound pressure level region. For example, the microphone 1440 may be arranged at the position where the dotted line in FIG. 14B intersects the housing of the open acoustic device 1400, so that the microphone 1440 receives as little of the sound signal emitted by the air conduction speaker as possible while collecting external ambient noise, reducing the interference of the sound signal emitted by the air conduction speaker with the active noise reduction function of the open acoustic device 1400.
FIG. 15 is an exemplary flowchart of outputting a target signal based on a transfer function according to some embodiments of the present application. As shown in FIG. 15, the process 1500 may include:
In step 1510, a noise reduction signal is processed based on a transfer function. In some embodiments, this step may be performed by the processor 120 (e.g., the amplitude and phase compensation unit 230). More details about the noise reduction signal may be found elsewhere in the present application, for example, in FIG. 3 and its description. In addition, according to the description of FIG. 3, the speaker (e.g., the speaker 130) may output the target signal based on the noise reduction signal generated by the processor 120.
In some embodiments, the target signal output by the speaker may be transmitted along a first sound path to a specific position in the user's ear (also referred to as a noise cancellation position), and the ambient noise may be transmitted along a second sound path to the same position. At that position, the target signal and the ambient noise cancel each other, so that the user cannot perceive the ambient noise, or perceives only relatively weak ambient noise. In some embodiments, when the speaker is an air conduction speaker, the specific position where the target signal and the ambient noise cancel each other may be the user's ear canal or its vicinity, for example, the target spatial position. The first sound path may be the path along which the target signal is transmitted through the air from the air conduction speaker to the target spatial position, and the second sound path may be the path along which the ambient noise is transmitted from its noise source to the target spatial position. In some embodiments, when the speaker is a bone conduction speaker, the specific position where the target signal and the ambient noise cancel each other may be the user's basilar membrane. The first sound path may be the path of the target signal from the bone conduction speaker through the user's bones or tissues to the user's basilar membrane, and the second sound path may be the path of the ambient noise from the noise source through the user's ear canal and eardrum to the user's basilar membrane.
In some embodiments, the speaker (e.g., the speaker 130) may be arranged near the user's ear canal without blocking the ear canal, so that there is a certain distance between the speaker and the noise cancellation position (e.g., the target spatial position or the basilar membrane). Therefore, when the target signal output by the speaker is transmitted to the noise cancellation position, the phase information and amplitude information of the target signal may change. As a result, the target signal output by the speaker may fail to reduce the ambient noise, or may even enhance it, so that the active noise reduction function of the acoustic device (e.g., the open acoustic output device 100) cannot be achieved.
In view of the above, the processor 120 may obtain a transfer function describing the target signal from the speaker to the noise cancellation position. The transfer function may include a first transfer function and a second transfer function. The first transfer function may represent how the parameters of the target signal (e.g., its amplitude and phase) change along the sound path (i.e., the first sound path) from the speaker to the noise cancellation position. In some embodiments, when the speaker is a bone conduction speaker, the target signal emitted by the bone conduction speaker is a bone conduction signal, and the position where the target signal and the ambient noise cancel each other is the user's basilar membrane. In this case, the first transfer function may represent the change in the parameters (e.g., phase and amplitude) of the target signal from the bone conduction speaker to the user's basilar membrane. In some embodiments, when the speaker is a bone conduction speaker, the first transfer function can be obtained experimentally. For example, the bone conduction speaker outputs the target signal while an air conduction sound signal with the same frequency as the target signal is played near the user's ear canal, and the cancellation effect between the target signal and the air conduction sound signal is observed. When the target signal and the air conduction sound signal cancel each other, the first transfer function of the bone conduction speaker can be obtained based on the air conduction sound signal and the target signal output by the bone conduction speaker. In some embodiments, when the speaker is an air conduction speaker, the target signal emitted by the air conduction speaker is an air conduction sound signal, and the first transfer function can be obtained through acoustic diffusion field simulation and calculation. For example, the sound field of the target signal emitted by the air conduction speaker can be simulated using an acoustic diffusion field, and the first transfer function of the air conduction speaker can be calculated based on the sound field. The second transfer function may represent how the parameters of the ambient noise (e.g., its amplitude and phase) change from the target spatial position to the position where the target signal and the ambient noise cancel each other. Merely by way of example, when the speaker is a bone conduction speaker, the second transfer function may represent the change in the parameters of the ambient noise from the target spatial position to the user's basilar membrane. In some embodiments, the second transfer function can also be obtained through acoustic diffusion field simulation and calculation. For example, the sound field of the ambient noise can be simulated using an acoustic diffusion field, and the second transfer function can be calculated based on the sound field.
In some embodiments, during the transmission of the target signal, not only a phase change but also an energy loss of the signal may occur. Therefore, the transfer function may include a phase transfer function and an amplitude transfer function. In some embodiments, both the phase transfer function and the amplitude transfer function can be obtained by the methods described above.
Further, the processor 120 may process the noise reduction signal based on the obtained transfer function. In some embodiments, the processor 120 may adjust the amplitude and phase of the noise reduction signal based on the obtained transfer function. In some embodiments, the processor 120 may adjust the phase of the noise reduction signal based on the obtained phase transfer function, and adjust the amplitude of the noise reduction signal based on the amplitude transfer function.
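A compensation of this kind can be sketched as dividing the spectrum of the noise reduction signal by the complex path response formed from the amplitude and phase transfer functions, so that the path itself restores the intended signal at the cancellation position. The path used in the demo (6 dB attenuation and a 3-sample delay) is an illustrative stand-in for a measured transfer function:

```python
import numpy as np

def apply_path(x, H):
    """Simulate propagation along an acoustic path with frequency response H
    (one complex value per rfft bin)."""
    return np.fft.irfft(np.fft.rfft(x) * H, len(x))

def compensate(noise_reduction, amp_tf, phase_tf):
    """Pre-compensate the noise reduction signal so that, after traversing the
    path described by the amplitude and phase transfer functions, it arrives
    at the cancellation position with the intended amplitude and phase."""
    H = amp_tf * np.exp(1j * phase_tf)            # complex path response
    X = np.fft.rfft(noise_reduction)
    return np.fft.irfft(X / H, len(noise_reduction))

# Illustrative path: 6 dB attenuation and a 3-sample delay (linear phase).
N = 64
k = np.arange(N // 2 + 1)
amp_tf = np.full(N // 2 + 1, 0.5)
phase_tf = -2.0 * np.pi * k * 3 / N
anti_noise = np.sin(2.0 * np.pi * 5 * np.arange(N) / N)

# What actually arrives at the cancellation position after compensation:
arrived = apply_path(compensate(anti_noise, amp_tf, phase_tf),
                     amp_tf * np.exp(1j * phase_tf))
```

Because the pre-compensation is the exact inverse of the assumed path, the arriving signal matches the intended anti-noise; in practice the transfer functions would come from the measurement or simulation procedures described above.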
In step 1520, the target signal is output according to the processed noise reduction signal. In some embodiments, this step may be performed by the speaker 130.
In some embodiments, the speaker 130 may output the target signal based on the noise reduction signal processed in step 1510, so that when the target signal is transmitted to the position where it cancels the ambient noise, the phase and amplitude of the target signal relative to the ambient noise satisfy certain conditions. In some embodiments, the phase difference between the phase of the target signal and the phase of the ambient noise may be less than or equal to a certain phase threshold. The phase threshold may be in the range of 90-180 degrees and may be adjusted within this range according to the user's needs. For example, when the user does not want to be disturbed by surrounding sounds, the phase threshold may be a relatively large value, such as 180 degrees; that is, the phase of the target signal is opposite to that of the ambient noise. As another example, when the user wants to remain aware of the surrounding environment, the phase threshold may be a relatively small value, such as 90 degrees. It should be noted that the more of the surrounding sound the user wishes to hear, the closer the phase threshold may be to 90 degrees; the less of the surrounding sound the user wishes to hear, the closer the phase threshold may be to 180 degrees. In some embodiments, when the phase relationship between the target signal and the ambient noise is fixed (e.g., opposite phases), the amplitude difference between the amplitude of the ambient noise and the amplitude of the target signal may be less than or equal to a certain amplitude threshold. For example, when the user does not want to be disturbed by surrounding sounds, the amplitude threshold may be a relatively small value, such as 0 dB; that is, the amplitude of the target signal equals that of the ambient noise. As another example, when the user wants to remain aware of the surrounding environment, the amplitude threshold may be a relatively large value, for example, approximately equal to the amplitude of the ambient noise. It should be noted that the more of the surrounding sound the user wishes to hear, the closer the amplitude threshold may be to the amplitude of the ambient noise; the less of the surrounding sound the user wishes to hear, the closer the amplitude threshold may be to 0 dB. In this way, the ambient noise is reduced, the active noise reduction function of the acoustic device (e.g., the acoustic output device 100) is achieved, and the user's listening experience is improved.
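The interplay between the phase threshold and the residual noise can be made concrete with the coherent sum of two equal-amplitude signals: at a 180-degree phase difference they cancel fully, at 120 degrees the residual equals the original noise, and at 90 degrees the sum is actually louder than the noise alone. A minimal sketch (the amplitudes and angles are illustrative, not values from the application):

```python
import numpy as np

def residual_amplitude(noise_amp, target_amp, phase_diff_deg):
    """Amplitude of (ambient noise + target signal) at the cancellation
    position, where `phase_diff_deg` is the phase difference between them."""
    phi = np.deg2rad(phase_diff_deg)
    return abs(noise_amp + target_amp * np.exp(1j * phi))

# Equal amplitudes, sweeping the phase difference from full opposition
# towards the 90-degree end of the threshold range discussed above.
levels = {d: residual_amplitude(1.0, 1.0, d) for d in (180, 150, 120, 90)}
```

This is why the 90-180 degree threshold trades noise reduction against awareness of the surroundings: beyond about 120 degrees of mismatch the "anti-noise" stops attenuating the ambient sound at all.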
It should be noted that the above description of the process 1500 is merely for example and illustration, and does not limit the scope of application of the present specification. For those skilled in the art, various modifications and changes may be made to the process 1500 under the guidance of the present specification. For example, the process 1500 may further include a step of obtaining the transfer function. As another example, steps 1510 and 1520 may be combined into one step. Such modifications and changes remain within the scope of the present application.
FIG. 16 is an exemplary flowchart of estimating the noise at a target spatial position according to some embodiments of the present specification. As shown in FIG. 16, the process 1600 may include:
In step 1610, components associated with the signal picked up by a bone conduction microphone are removed from the picked-up ambient noise in order to update the ambient noise.
In some embodiments, this step may be performed by the processor 120. In some embodiments, when the microphone array (e.g., the microphone array 110) picks up ambient noise, the user's own speaking voice is also picked up by the microphone array; that is, the user's own voice is also treated as part of the ambient noise. In this case, the target signal output by the speaker (e.g., the speaker 130) would cancel the user's own voice. In some embodiments, in certain scenarios, the user's own voice needs to be preserved, for example, when the user makes a voice call or sends a voice message. In some embodiments, the acoustic device (e.g., the acoustic device 100) may include a bone conduction microphone. When the user wears the acoustic device to make a voice call or record voice information, the bone conduction microphone may pick up the sound signal of the user's speech through the vibration signals generated by the user's facial bones or muscles when speaking, and transmit it to the processor 120. The processor 120 obtains the parameter information of the sound signal picked up by the bone conduction microphone, and removes the sound signal components associated with that signal from the ambient noise picked up by the microphone array (e.g., the microphone array 110). The processor 120 then updates the ambient noise according to the parameter information of the remaining ambient noise. The updated ambient noise no longer contains the sound signal of the user's own speech, so that the user can hear their own voice when making a voice call.
In step 1620, the noise at the target spatial position is estimated according to the updated ambient noise. In some embodiments, this step may be performed by the processor 120. Step 1620 may be performed in a manner similar to step 320, and the relevant description is not repeated here.
It should be noted that the above description of the process 1600 is merely for example and illustration, and does not limit the scope of application of the present application. For those skilled in the art, various modifications and changes may be made to the process 1600 under the guidance of the present application. For example, the components associated with the signal picked up by the bone conduction microphone may also be preprocessed, and the signal picked up by the bone conduction microphone may be transmitted to a terminal device as an audio signal. Such modifications and changes remain within the scope of the present application.
The basic concepts have been described above. Obviously, for those skilled in the art, the above detailed disclosure is merely an example and does not constitute a limitation of the present application. Although not explicitly stated herein, those skilled in the art may make various modifications, improvements, and corrections to the present application. Such modifications, improvements, and corrections are suggested in the present application, and therefore still fall within the spirit and scope of the exemplary embodiments of the present application.
Meanwhile, the present application uses specific terms to describe the embodiments of the present application. Terms such as "one embodiment," "an embodiment," and/or "some embodiments" refer to a certain feature, structure, or characteristic related to at least one embodiment of the present application. Therefore, it should be emphasized and noted that two or more references to "an embodiment," "one embodiment," or "an alternative embodiment" in different places in the present application do not necessarily refer to the same embodiment. In addition, certain features, structures, or characteristics in one or more embodiments of the present application may be combined as appropriate.
Furthermore, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in several patentable categories or contexts, including any new and useful process, machine, product, or composition of matter, or any new and useful improvement thereof. Accordingly, various aspects of the present application may be implemented entirely by hardware, entirely by software (including firmware, resident software, microcode, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." In addition, aspects of the present application may be embodied as a computer product located in one or more computer-readable media, the product including computer-readable program code.
A computer storage medium may contain a propagated data signal with the computer program code embodied therein, for example, on baseband or as part of a carrier wave. The propagated signal may take a variety of forms, including an electromagnetic form, an optical form, etc., or a suitable combination thereof. The computer storage medium may be any computer-readable medium other than a computer-readable storage medium, and the medium may communicate, propagate, or transmit a program for use by being connected to an instruction execution system, apparatus, or device. The program code located on the computer storage medium may be transmitted through any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the foregoing.
In addition, unless explicitly stated in the claims, the order of the processing elements and sequences described in the present application, the use of numbers and letters, or the use of other names is not intended to limit the order of the processes and methods of the present application. Although the above disclosure discusses, through various examples, some embodiments of the invention that are currently considered useful, it should be understood that such details are for illustrative purposes only, and the appended claims are not limited to the disclosed embodiments; on the contrary, the claims are intended to cover all modifications and equivalent combinations that conform to the spirit and scope of the embodiments of the present application. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that, in order to simplify the presentation of the disclosure of the present application and thereby aid the understanding of one or more embodiments of the invention, the foregoing description of the embodiments sometimes combines multiple features into one embodiment, drawing, or description thereof. However, this method of disclosure does not mean that the subject matter of the present application requires more features than those recited in the claims. In fact, the features of an embodiment may be fewer than all the features of a single embodiment disclosed above.
一些实施例中使用了描述成分、属性数量的数字，应当理解的是，此类用于实施例描述的数字，在一些示例中使用了修饰词“大约”、“近似”或“大体上”来修饰。除非另外说明，“大约”、“近似”或“大体上”表明所述数字允许有±20%的变化。相应地，在一些实施例中，说明书和权利要求中使用的数值参数均为近似值，该近似值根据个别实施例所需特点可以发生改变。在一些实施例中，数值参数应考虑规定的有效数位并采用一般位数保留的方法。尽管本申请一些实施例中用于确认其范围广度的数值域和参数为近似值，在具体实施例中，此类数值的设定在可行范围内尽可能精确。Some embodiments use numbers to describe quantities of components and attributes; it should be understood that such numbers used in the description of the embodiments are, in some examples, qualified by the modifiers "about", "approximately", or "substantially". Unless otherwise stated, "about", "approximately", or "substantially" indicates that a variation of ±20% in the stated number is allowed. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may change depending on the desired characteristics of individual embodiments. In some embodiments, numerical parameters should take into account the specified significant digits and apply a general method of rounding. Although the numerical ranges and parameters used to confirm the breadth of the ranges in some embodiments of the present application are approximations, in specific embodiments such numerical values are set as precisely as practicable.
针对本申请引用的每个专利、专利申请、专利申请公开物和其他材料，如文章、书籍、说明书、出版物、文档等，特此将其全部内容并入本申请作为参考。与本申请内容不一致或产生冲突的申请历史文件除外，对本申请权利要求最广范围有限制的文件（当前或之后附加于本申请中的）也除外。需要说明的是，如果本申请附属材料中的描述、定义、和/或术语的使用与本申请所述内容有不一致或冲突的地方，以本申请的描述、定义和/或术语的使用为准。Each patent, patent application, patent application publication, and other material, such as an article, book, specification, publication, document, etc., cited in this application is hereby incorporated by reference in its entirety. Application history documents that are inconsistent with or conflict with the content of this application are excluded, as are documents (currently or hereafter appended to this application) that limit the broadest scope of the claims of this application. It should be noted that, if there is any inconsistency or conflict between the descriptions, definitions, and/or terms used in the attached materials of this application and the content of this application, the descriptions, definitions, and/or terms used in this application shall prevail.
最后,应当理解的是,本申请中所述实施例仅用以说明本申请实施例的原则。其他的变形也可能属于本申请的范围。因此,作为示例而非限制,本申请实施例的替代配置可视为与本申请的教导一致。相应地,本申请的实施例不仅限于本申请明确介绍和描述的实施例。Finally, it should be understood that the embodiments described in the present application are only used to illustrate the principles of the embodiments of the present application. Other variations are also possible within the scope of this application. Accordingly, by way of example and not limitation, alternative configurations of embodiments of the present application may be considered consistent with the teachings of the present application. Accordingly, the embodiments of the present application are not limited to the embodiments expressly introduced and described in the present application.

Claims (24)

  1. 一种声学装置,包括:An acoustic device comprising:
    麦克风阵列,被配置为拾取环境噪声;an array of microphones, configured to pick up ambient noise;
    处理器，被配置为：a processor configured to:
    利用所述麦克风阵列对目标空间位置的声场进行估计,所述目标空间位置比所述麦克风阵列中任一麦克风更加靠近用户耳道,以及using the microphone array to estimate a sound field at a target spatial location that is closer to the user's ear canal than any microphone in the microphone array, and
    基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号;以及generating a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location; and
    至少一个扬声器，被配置为根据所述降噪信号输出目标信号，所述目标信号用于降低所述环境噪声，其中所述麦克风阵列设置在目标区域以使所述麦克风阵列受来自所述至少一个扬声器的干扰信号最小。at least one speaker configured to output a target signal according to the noise reduction signal, the target signal being used to reduce the ambient noise, wherein the microphone array is disposed in a target area such that an interference signal from the at least one speaker to the microphone array is minimized.
  2. 根据权利要求1所述的声学装置,其中,所述基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号包括:The acoustic device of claim 1, wherein the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location comprises:
    基于所述拾取的环境噪声估计所述目标空间位置的噪声;以及estimating noise at the target spatial location based on the picked-up ambient noise; and
    基于所述目标空间位置的噪声和所述目标空间位置的声场估计生成所述降噪信号。The noise reduction signal is generated based on noise at the target spatial location and a sound field estimate at the target spatial location.
  3. 根据权利要求2所述的声学装置,其中The acoustic device of claim 2, wherein
    所述声学装置进一步包括一个或多个传感器,用于获取所述声学装置的运动信息,以及The acoustic device further includes one or more sensors for acquiring motion information of the acoustic device, and
    所述处理器进一步被配置为:The processor is further configured to:
    基于所述运动信息更新所述目标空间位置的噪声和所述目标空间位置的声场估计;以及updating the noise of the target spatial location and the sound field estimate of the target spatial location based on the motion information; and
    基于所述更新后的目标空间位置的噪声和所述更新后的目标空间位置的声场估计生成所述降噪信号。The noise reduction signal is generated based on the updated noise of the target spatial position and the updated sound field estimate of the target spatial position.
  4. 根据权利要求2所述的声学装置,其中,所述基于所述拾取的环境噪声估计所述目标空间位置的噪声包括:The acoustic device of claim 2, wherein the estimating noise of the target spatial location based on the picked-up ambient noise comprises:
    确定一个或多个与所述拾取的环境噪声有关的空间噪声源;以及determining one or more spatial noise sources associated with the picked-up ambient noise; and
    基于所述空间噪声源,估计所述目标空间位置的噪声。Based on the spatial noise source, the noise of the target spatial location is estimated.
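The estimation recited in claim 4 can be sketched, purely as a non-limiting illustration outside the claim language, with a free-field propagation model; the function name, the 1/r attenuation, and the fixed speed of sound below are assumptions:

```python
import numpy as np

def noise_at_target(sources, target_pos, fs, n_samples, c=343.0):
    """Propagate identified spatial noise sources to the target spatial
    position with a simple free-field model: each source contributes its
    signal delayed by r/c and attenuated by 1/r."""
    out = np.zeros(n_samples)
    for pos, signal in sources:
        r = np.linalg.norm(np.asarray(pos) - np.asarray(target_pos))
        delay = int(round(r / c * fs))  # propagation delay in samples
        out[delay:] += signal[: n_samples - delay] / max(r, 1e-3)
    return out

# One noise source 0.343 m away: its signal arrives 8 samples later at
# fs = 8000 Hz and is attenuated by 1/0.343.
sources = [((0.343, 0.0, 0.0), np.ones(64))]
target_noise = noise_at_target(sources, (0.0, 0.0, 0.0), fs=8000, n_samples=64)
```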
  5. 根据权利要求1所述的声学装置,其中,所述利用所述麦克风阵列对目标空间位置的声场进行估计包括:The acoustic device according to claim 1, wherein the estimating the sound field of the target spatial location using the microphone array comprises:
    基于所述麦克风阵列构建虚拟麦克风，所述虚拟麦克风包括数学模型或机器学习模型，用于表示若所述目标空间位置处包括麦克风后所述麦克风采集的音频数据；以及constructing a virtual microphone based on the microphone array, the virtual microphone including a mathematical model or a machine learning model for representing the audio data that would be collected by a microphone if a microphone were placed at the target spatial location; and
    基于所述虚拟麦克风对所述目标空间位置的声场进行估计。The sound field of the target spatial location is estimated based on the virtual microphone.
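As a non-limiting sketch of the virtual-microphone construction above, the signal at the target spatial location can be approximated as a delay-and-sum over the physical array; the integer delays, weights, and averaging below are assumptions standing in for the mathematical or machine learning model recited in the claim:

```python
import numpy as np

def virtual_mic_estimate(mic_signals, delays_samples, weights):
    """Estimate the signal at a virtual point as a weighted sum of
    delayed array-microphone signals (simple delay-and-sum model).

    mic_signals: (n_mics, n_samples) array of microphone samples.
    delays_samples: integer delay (in samples) from each microphone to
        the virtual point, assumed known from the array geometry.
    weights: per-microphone gain compensating propagation attenuation.
    """
    n_mics, n_samples = mic_signals.shape
    out = np.zeros(n_samples)
    for sig, d, w in zip(mic_signals, delays_samples, weights):
        # Shift each microphone signal by its propagation delay, then
        # accumulate the weighted contribution.
        out[d:] += w * sig[: n_samples - d]
    return out / n_mics  # averaging is an arbitrary normalization here

# Two microphones observing the same 100 Hz tone with different delays:
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 100 * t)
mics = np.stack([tone, np.roll(tone, 2)])
est = virtual_mic_estimate(mics, delays_samples=[2, 0], weights=[1.0, 1.0])
```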
  6. 根据权利要求5所述的声学装置,其中,所述基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号包括:The acoustic device of claim 5, wherein the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location comprises:
    基于所述虚拟麦克风估计所述目标空间位置的噪声;以及estimating noise at the target spatial location based on the virtual microphone; and
    基于所述目标空间位置的噪声和所述目标空间位置的声场估计生成所述降噪信号。The noise reduction signal is generated based on noise at the target spatial location and a sound field estimate at the target spatial location.
  7. 根据权利要求1所述的声学装置,其中The acoustic device of claim 1, wherein
    所述至少一个扬声器是骨导扬声器,the at least one speaker is a bone conduction speaker,
    所述干扰信号包括所述骨导扬声器的漏音信号和振动信号,以及The interference signal includes a sound leakage signal and a vibration signal of the bone conduction speaker, and
    所述目标区域为传递到所述麦克风阵列的所述骨导扬声器的所述漏音信号和所述振动信号的总能量最小的区域。The target area is an area where the total energy of the leakage signal and the vibration signal of the bone conduction speaker transmitted to the microphone array is minimal.
  8. 根据权利要求7所述的声学装置,其中,The acoustic device of claim 7, wherein,
    所述目标区域的位置与所述麦克风阵列中的麦克风的振膜的朝向有关,The position of the target area is related to the orientation of the diaphragms of the microphones in the microphone array,
    所述麦克风的振膜的朝向降低所述麦克风接收到的所述骨导扬声器的所述振动信号的大小,The orientation of the diaphragm of the microphone reduces the magnitude of the vibration signal of the bone conduction speaker received by the microphone,
    所述麦克风的振膜的朝向使得所述麦克风接收到的所述骨导扬声器的所述振动信号与所述麦克风接收到的所述骨导扬声器的所述漏音信号至少部分互相抵消,以及The diaphragm of the microphone is oriented such that the vibration signal of the bone conduction speaker received by the microphone and the leakage signal of the bone conduction speaker received by the microphone at least partially cancel each other, and
    所述麦克风接收到的所述骨导扬声器的所述振动信号降低所述麦克风接收到的所述骨导扬声器的所述漏音信号5-6dB。The vibration signal of the bone conduction speaker received by the microphone reduces the sound leakage signal of the bone conduction speaker received by the microphone by 5-6 dB.
  9. 根据权利要求1所述的声学装置,其中The acoustic device of claim 1, wherein
    所述至少一个扬声器是气导扬声器,以及the at least one speaker is an air conduction speaker, and
    所述目标区域为所述气导扬声器的辐射声场的声压级最小区域。The target area is the area with the minimum sound pressure level of the radiated sound field of the air conduction speaker.
  10. 根据权利要求1所述的声学装置,其中The acoustic device of claim 1, wherein
    所述处理器进一步被配置为基于传递函数处理所述降噪信号，所述传递函数包括第一传递函数和第二传递函数，所述第一传递函数表示从所述至少一个扬声器发出到所述目标信号和所述环境噪声抵消的位置所述目标信号的参数的变化，所述第二传递函数表示从所述目标空间位置到所述目标信号和所述环境噪声抵消的位置所述环境噪声的参数的变化；以及The processor is further configured to process the noise reduction signal based on a transfer function, the transfer function including a first transfer function and a second transfer function, the first transfer function representing a change in a parameter of the target signal from the at least one speaker to a position where the target signal and the ambient noise cancel each other, and the second transfer function representing a change in a parameter of the ambient noise from the target spatial position to the position where the target signal and the ambient noise cancel each other; and
    所述至少一个扬声器进一步被配置为根据所述处理后的降噪信号输出所述目标信号。The at least one speaker is further configured to output the target signal based on the processed noise reduction signal.
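The transfer-function processing recited above can be illustrated under assumptions not found in the claim: frequency-domain division, with impulse responses `h1` and `h2` standing in for the first and second transfer functions:

```python
import numpy as np

def compensate(noise_reduction, h1, h2, n=None):
    """Shape the noise-reduction signal so that, after passing through the
    speaker-to-cancellation-point path (first transfer function, h1), it
    matches noise that traveled the target-point-to-cancellation-point
    path (second transfer function, h2). Frequency-domain division is a
    simplified model; h1 must be non-zero in the band of interest."""
    n = n or len(noise_reduction)
    H1 = np.fft.rfft(h1, n=n)
    H2 = np.fft.rfft(h2, n=n)
    S = np.fft.rfft(noise_reduction, n=n)
    return np.fft.irfft(S * H2 / H1, n=n)

# Illustrative paths: pure gains of 0.5 and 0.25 (impulses at lag 0), so
# the processed signal is the input scaled by 0.25 / 0.5.
x = np.random.default_rng(0).standard_normal(256)
h1 = np.zeros(8); h1[0] = 0.5   # assumed speaker -> cancellation path
h2 = np.zeros(8); h2[0] = 0.25  # assumed target -> cancellation path
y = compensate(x, h1, h2)
```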
  11. 根据权利要求1所述的声学装置,其中,所述基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号包括:The acoustic device of claim 1, wherein the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location comprises:
    将所述拾取的环境噪声划分为多个频带,所述多个频带对应不同的频率范围;以及dividing the picked-up ambient noise into a plurality of frequency bands, the plurality of frequency bands corresponding to different frequency ranges; and
    对于所述多个频带中的至少一个,生成与所述至少一个频带中的每一个对应的降噪信号。For at least one of the plurality of frequency bands, a noise reduction signal corresponding to each of the at least one frequency band is generated.
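A minimal sketch of the band-wise processing above, assuming FFT masking as the band-splitting method and simple per-band phase inversion; both choices are illustrative, not claim language:

```python
import numpy as np

def per_band_noise_reduction(noise, fs, band_edges_hz):
    """Split the picked-up noise into frequency bands via FFT masking and
    generate a noise-reduction (anti-phase) signal for each band.

    band_edges_hz: list of (low, high) tuples -- illustrative choices.
    """
    spectrum = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(len(noise), 1.0 / fs)
    band_signals = []
    for low, high in band_edges_hz:
        mask = (freqs >= low) & (freqs < high)
        band = np.fft.irfft(spectrum * mask, n=len(noise))
        # The reduction signal for this band is the inverted band noise.
        band_signals.append(-band)
    return band_signals

# Noise made of a 100 Hz and a 1000 Hz tone, split into two bands:
fs = 8000
t = np.arange(fs) / fs
noise = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 1000 * t)
bands = per_band_noise_reduction(noise, fs, [(0, 500), (500, 2000)])
# Summing the noise with all band reduction signals cancels it.
residual = noise + sum(bands)
```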
  12. 根据权利要求1所述的声学装置，其中，所述处理器进一步被配置为基于所述目标空间位置的声场估计对所述目标空间位置的噪声进行幅度和相位调整以生成所述降噪信号。The acoustic device of claim 1, wherein the processor is further configured to perform amplitude and phase adjustment on the noise of the target spatial location based on the sound field estimate of the target spatial location to generate the noise reduction signal.
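The amplitude and phase adjustment recited above can be sketched in the frequency domain; the uniform gain and constant phase rotation below are simplifying assumptions, whereas a practical controller would apply frequency-dependent corrections derived from the sound field estimate:

```python
import numpy as np

def adjust_amplitude_phase(noise, gain, phase_shift_rad):
    """Apply an amplitude gain and a constant phase shift to an estimated
    noise signal in the frequency domain to form the reduction signal.
    Each sinusoidal component is scaled by `gain` and shifted by
    `phase_shift_rad`."""
    spectrum = np.fft.rfft(noise)
    spectrum *= gain * np.exp(1j * phase_shift_rad)
    return np.fft.irfft(spectrum, n=len(noise))

# Inverting the phase (a pi shift at unity gain) yields an anti-noise
# signal that cancels the estimated noise.
fs = 8000
t = np.arange(fs) / fs
est_noise = 0.5 * np.sin(2 * np.pi * 200 * t)
anti = adjust_amplitude_phase(est_noise, gain=1.0, phase_shift_rad=np.pi)
```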
  13. 根据权利要求1所述的声学装置，其中所述声学装置进一步包括固定结构，被配置为将所述声学装置固定在用户耳朵附近且不堵塞用户耳道的位置。The acoustic device of claim 1, wherein the acoustic device further comprises a securing structure configured to secure the acoustic device in a position adjacent to the user's ear without occluding the user's ear canal.
  14. 根据权利要求1所述的声学装置，其中所述声学装置进一步包括壳体结构，被配置为承载或容纳所述麦克风阵列、所述处理器和所述至少一个扬声器。The acoustic device of claim 1, wherein the acoustic device further comprises a housing structure configured to carry or house the microphone array, the processor, and the at least one speaker.
  15. 一种降噪方法,包括:A noise reduction method comprising:
    由麦克风阵列拾取环境噪声;Ambient noise is picked up by the microphone array;
    由处理器by the processor
    利用所述麦克风阵列对目标空间位置的声场进行估计，所述目标空间位置比所述麦克风阵列中任一麦克风更加靠近用户耳道；estimating, by using the microphone array, a sound field at a target spatial location, the target spatial location being closer to the user's ear canal than any microphone in the microphone array;
    基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号;以及generating a noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location; and
    由至少一个扬声器，根据所述降噪信号输出目标信号，所述目标信号用于降低所述环境噪声，其中所述麦克风阵列设置在目标区域以使所述麦克风阵列受来自所述至少一个扬声器的干扰信号最小。outputting, by at least one speaker, a target signal according to the noise reduction signal, the target signal being used to reduce the ambient noise, wherein the microphone array is disposed in a target area such that an interference signal from the at least one speaker to the microphone array is minimized.
  16. 根据权利要求15所述的降噪方法，其中，所述由处理器，基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号包括：The noise reduction method of claim 15, wherein the generating, by the processor, the noise reduction signal based on the picked-up ambient noise and the sound field estimate of the target spatial location comprises:
    基于所述拾取的环境噪声估计所述目标空间位置的噪声;以及estimating noise at the target spatial location based on the picked-up ambient noise; and
    基于所述目标空间位置的噪声和所述目标空间位置的声场估计生成所述降噪信号。The noise reduction signal is generated based on noise at the target spatial location and a sound field estimate at the target spatial location.
  17. 根据权利要求16所述的降噪方法,进一步包括:The noise reduction method of claim 16, further comprising:
    由一个或多个传感器,获取所述声学装置的运动信息,acquiring motion information of the acoustic device by one or more sensors,
    由所述处理器by the processor
    基于所述运动信息更新所述目标空间位置的噪声和所述目标空间位置的声场估计;以及updating the noise of the target spatial location and the sound field estimate of the target spatial location based on the motion information; and
    基于所述更新后的目标空间位置的噪声和所述更新后的目标空间位置的声场估计生成所述降噪信号。The noise reduction signal is generated based on the updated noise of the target spatial position and the updated sound field estimate of the target spatial position.
  18. 根据权利要求16所述的降噪方法,其中,所述基于所述拾取的环境噪声估计所述目标空间位置的噪声包括:The noise reduction method according to claim 16, wherein the estimating the noise of the target spatial position based on the picked-up environmental noise comprises:
    确定一个或多个与所述拾取的环境噪声有关的空间噪声源;以及determining one or more spatial noise sources associated with the picked-up ambient noise; and
    基于所述空间噪声源,估计所述目标空间位置的噪声。Based on the spatial noise source, the noise of the target spatial location is estimated.
  19. 根据权利要求15所述的降噪方法，其中，所述由处理器，利用所述麦克风阵列对目标空间位置的声场进行估计包括：The noise reduction method of claim 15, wherein the estimating, by the processor, the sound field of the target spatial location using the microphone array comprises:
    基于所述麦克风阵列构建虚拟麦克风，所述虚拟麦克风包括数学模型或机器学习模型，用于表示若所述目标空间位置处包括麦克风后所述麦克风采集的音频数据；以及constructing a virtual microphone based on the microphone array, the virtual microphone including a mathematical model or a machine learning model for representing the audio data that would be collected by a microphone if a microphone were placed at the target spatial location; and
    基于所述虚拟麦克风对所述目标空间位置的声场进行估计。The sound field of the target spatial location is estimated based on the virtual microphone.
  20. 根据权利要求19所述的降噪方法,其中,所述基于所述拾取的环境噪声和所述目标空间位置的声场估计生成降噪信号包括:The noise reduction method of claim 19, wherein the generating a noise reduction signal based on the picked-up ambient noise and the sound field estimation of the target spatial position comprises:
    基于所述虚拟麦克风估计所述目标空间位置的噪声;以及estimating noise at the target spatial location based on the virtual microphone; and
    基于所述目标空间位置的噪声和所述目标空间位置的声场估计生成所述降噪信号。The noise reduction signal is generated based on noise at the target spatial location and a sound field estimate at the target spatial location.
  21. 根据权利要求15所述的降噪方法,其中The noise reduction method of claim 15, wherein
    所述至少一个扬声器是骨导扬声器,the at least one speaker is a bone conduction speaker,
    所述干扰信号包括所述骨导扬声器的漏音信号和振动信号,以及The interference signal includes a sound leakage signal and a vibration signal of the bone conduction speaker, and
    所述目标区域为传递到所述麦克风阵列的所述骨导扬声器的所述漏音信号和所述振动信号的总能量最小的区域。The target area is an area where the total energy of the leakage signal and the vibration signal of the bone conduction speaker transmitted to the microphone array is minimal.
  22. 根据权利要求21所述的降噪方法,其中,所述目标区域的位置与所述麦克风阵列中的麦克风的振膜的朝向有关。The noise reduction method according to claim 21, wherein the position of the target area is related to the orientation of the diaphragms of the microphones in the microphone array.
  23. 根据权利要求15所述的降噪方法,其中The noise reduction method of claim 15, wherein
    所述至少一个扬声器是气导扬声器,以及the at least one speaker is an air conduction speaker, and
    所述目标区域为所述气导扬声器的辐射声场的声压级最小区域。The target area is the area with the minimum sound pressure level of the radiated sound field of the air conduction speaker.
  24. 根据权利要求15所述的降噪方法,进一步包括The noise reduction method of claim 15, further comprising
    由所述处理器，基于传递函数处理所述降噪信号，所述传递函数包括第一传递函数和第二传递函数，所述第一传递函数表示从所述至少一个扬声器发出到所述目标信号和所述环境噪声抵消的位置所述目标信号的参数的变化，所述第二传递函数表示从所述目标空间位置到所述目标信号和所述环境噪声抵消的位置所述环境噪声的参数的变化；以及processing, by the processor, the noise reduction signal based on a transfer function, the transfer function including a first transfer function and a second transfer function, the first transfer function representing a change in a parameter of the target signal from the at least one speaker to a position where the target signal and the ambient noise cancel each other, and the second transfer function representing a change in a parameter of the ambient noise from the target spatial position to the position where the target signal and the ambient noise cancel each other; and
    由所述至少一个扬声器,根据所述处理后的降噪信号输出所述目标信号。The target signal is output by the at least one speaker according to the processed noise reduction signal.
PCT/CN2021/091652 2021-04-25 2021-04-30 Acoustic device WO2022227056A1 (en)

Priority Applications (13)

Application Number Priority Date Filing Date Title
CN202180094203.XA CN116918350A (en) 2021-04-25 2021-04-30 Acoustic device
US17/451,659 US11328702B1 (en) 2021-04-25 2021-10-21 Acoustic devices
BR112022023372A BR112022023372A2 (en) 2021-04-25 2021-11-19 HEADPHONES
KR1020227044224A KR20230013070A (en) 2021-04-25 2021-11-19 earphone
EP21938133.2A EP4131997A4 (en) 2021-04-25 2021-11-19 Earphone
PCT/CN2021/131927 WO2022227514A1 (en) 2021-04-25 2021-11-19 Earphone
JP2022580472A JP2023532489A (en) 2021-04-25 2021-11-19 earphone
CN202111408328.3A CN115243137A (en) 2021-04-25 2021-11-19 Earphone set
TW111111172A TW202243486A (en) 2021-04-25 2022-03-24 A type of headphone
US17/657,743 US11715451B2 (en) 2021-04-25 2022-04-01 Acoustic devices
TW111115388A TW202242855A (en) 2021-04-25 2022-04-22 Acoustic device
US18/047,639 US20230063283A1 (en) 2021-04-25 2022-10-18 Earphones
US18/332,746 US20230317048A1 (en) 2021-04-25 2023-06-11 Acoustic devices

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/089670 WO2022226696A1 (en) 2021-04-25 2021-04-25 Open earphone
CNPCT/CN2021/089670 2021-04-25

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/451,659 Continuation US11328702B1 (en) 2021-04-25 2021-10-21 Acoustic devices

Publications (1)

Publication Number Publication Date
WO2022227056A1 true WO2022227056A1 (en) 2022-11-03

Family

ID=83665731

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/CN2021/089670 WO2022226696A1 (en) 2021-04-25 2021-04-25 Open earphone
PCT/CN2021/091652 WO2022227056A1 (en) 2021-04-25 2021-04-30 Acoustic device

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/089670 WO2022226696A1 (en) 2021-04-25 2021-04-25 Open earphone

Country Status (3)

Country Link
CN (2) CN117501710A (en)
TW (1) TW202242856A (en)
WO (2) WO2022226696A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115615624B (en) * 2022-12-13 2023-03-31 杭州兆华电子股份有限公司 Equipment leakage detection method and system based on unmanned inspection device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102413403A (en) * 2010-09-03 2012-04-11 Nxp股份有限公司 Noise reduction circuit and method therefor
CN108668188A (en) * 2017-03-30 2018-10-16 天津三星通信技术研究有限公司 The method and its electric terminal of the active noise reduction of the earphone executed in electric terminal
CN111095944A (en) * 2017-09-13 2020-05-01 索尼公司 Ear bud earphone device, ear bud earphone device and method
CN111935589A (en) * 2020-09-28 2020-11-13 深圳市汇顶科技股份有限公司 Active noise reduction method and device, electronic equipment and chip
CN112204998A (en) * 2018-07-17 2021-01-08 三星电子株式会社 Method and apparatus for processing audio signal

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102348151B (en) * 2011-09-10 2015-07-29 歌尔声学股份有限公司 Noise canceling system and method, intelligent control method and device, communication equipment
CN107346664A (en) * 2017-06-22 2017-11-14 河海大学常州校区 A kind of ears speech separating method based on critical band
CN107452375A (en) * 2017-07-17 2017-12-08 湖南海翼电子商务股份有限公司 Bluetooth earphone
US10706868B2 (en) * 2017-09-06 2020-07-07 Realwear, Inc. Multi-mode noise cancellation for voice detection

Also Published As

Publication number Publication date
TW202242856A (en) 2022-11-01
WO2022226696A1 (en) 2022-11-03
CN117501710A (en) 2024-02-02
CN115240697A (en) 2022-10-25


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21938530

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180094203.X

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE