CN117501710A - Open earphone - Google Patents


Info

Publication number: CN117501710A
Application number: CN202180099448.1A
Authority: CN (China)
Prior art keywords: noise, microphone array, user, microphone, picked
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 肖乐, 郑金波, 张承乾, 廖风云, 齐心
Current Assignee: Shenzhen Voxtech Co Ltd
Original Assignee: Shenzhen Voxtech Co Ltd
Application filed by Shenzhen Voxtech Co Ltd


Classifications

    • G10L21/0208: Noise filtering (speech enhancement, e.g. noise reduction or echo cancellation)
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161: Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166: Microphone arrays; Beamforming
    • H04R1/10: Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1083: Reduction of ambient noise
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R9/02: Transducers of moving-coil, moving-strip, or moving-wire type; Details
    • H04R9/06: Transducers of moving-coil type; Loudspeakers
    • H04R2460/13: Hearing devices using bone conduction transducers

Abstract

The application discloses an open earphone comprising: a securing structure configured to secure the earphone in a position near the user's ear without occluding the user's ear canal; and a housing structure configured to carry: a first microphone array configured to pick up ambient noise; at least one speaker array; and a signal processor configured to: estimate noise at a first spatial location based on the picked-up ambient noise, the first spatial location being closer to the user's ear canal than any microphone of the first microphone array; and generate a noise reduction signal based on the noise at the first spatial location, such that the at least one speaker array outputs, in accordance with the noise reduction signal, noise-reducing sound waves for canceling the ambient noise delivered to the user's ear canal.

Description

Open earphone

Technical Field
The present application relates to the field of acoustics, and in particular, to an open earphone.
Background
An earphone device allows a user to listen to audio content and conduct voice calls while preserving the privacy of the interaction and without disturbing people nearby. Earphone devices can generally be divided into two categories: in-ear earphone devices and open earphone devices. An in-ear earphone device blocks the user's ear during use, and long-term wearing easily causes a feeling of blockage, a foreign-body sensation, distending pain, and the like. An open earphone device leaves the user's ears open and is suitable for long-term wearing, but its noise reduction effect is insignificant when external noise is loud, resulting in a poor listening experience.
It is therefore desirable to provide an open earphone that leaves the user's ears open while enhancing the user's listening experience.
Disclosure of Invention
An embodiment of the present application provides an open earphone, comprising: a securing structure configured to secure the earphone in a position near the user's ear without occluding the user's ear canal; and a housing structure configured to carry: a first microphone array configured to pick up ambient noise; at least one speaker array; and a signal processor configured to: estimate noise at a first spatial location based on the picked-up ambient noise, the first spatial location being closer to the user's ear canal than any microphone of the first microphone array; and generate a noise reduction signal based on the noise at the first spatial location, such that the at least one speaker array outputs, in accordance with the noise reduction signal, noise-reducing sound waves for canceling the ambient noise delivered to the user's ear canal.
In some embodiments, the housing structure is configured to house a second microphone array configured to pick up ambient noise and the noise reducing acoustic wave, the second microphone array being at least partially distinct from the first microphone array; and the signal processor is configured to update the noise reduction signal based on the sound signal picked up by the second microphone array.
In some embodiments, the updating the noise reduction signal based on the sound signal picked up by the second microphone array comprises: estimating a sound field at the ear canal of the user based on the sound signals picked up by the second microphone array; and adjusting parameter information of the noise reduction signal according to the sound field at the auditory canal of the user.
In some embodiments, the signal processor is further configured to: acquiring user input; and adjusting parameter information of the noise reduction signal according to user input.
In some embodiments, the second microphone array includes a microphone that is closer to the ear canal of the user than any of the microphones in the first microphone array.
In some embodiments, the signal processor estimating noise at the first spatial location based on the picked-up ambient noise comprises: performing signal separation on the picked-up ambient noise, acquiring parameter information corresponding to the ambient noise, and generating the noise reduction signal based on the parameter information.
In some embodiments, the signal processor estimating noise for the first spatial location based on the picked-up ambient noise comprises: determining one or more spatial noise sources related to the picked-up ambient noise; and estimating noise at the first spatial location based on the spatial noise source.
In some embodiments, the determining one or more spatial noise sources related to the picked-up ambient noise comprises: dividing the picked-up ambient noise into a plurality of sub-bands, each sub-band corresponding to a different frequency range; and determining a spatial noise source corresponding to the at least one subband.
In some embodiments, the first microphone array includes a first sub-microphone array and a second sub-microphone array located at the left ear and the right ear of the user, respectively, and the determining of a spatial noise source corresponding to the at least one sub-band includes: acquiring a user head function, where the user head function reflects how the user's head reflects or absorbs sound; and determining a spatial noise source by combining, on the at least one sub-band, the ambient noise picked up by the first sub-microphone array, the ambient noise picked up by the second sub-microphone array, and the user head function.
In some embodiments, the determining one or more spatial noise sources related to the picked-up ambient noise comprises: the one or more spatial noise sources are located by one or more of beamforming, super-resolution spatial spectrum estimation, or time difference of arrival.
In some embodiments, the first microphone array comprises one noise microphone, the at least one speaker array forms at least one set of acoustic dipoles, and the noise microphone is located at an acoustic null of the dipole radiated sound field.
In some embodiments, the at least one microphone array comprises a bone conduction microphone configured to pick up the user's speech, and the signal processor estimating noise at the first spatial location based on the picked-up ambient noise comprises: removing, from the picked-up ambient noise, components associated with the signal picked up by the bone conduction microphone, so as to update the ambient noise; and estimating the noise at the first spatial location according to the updated ambient noise.
In some embodiments, the at least one microphone array includes a bone conduction microphone and an air conduction microphone, and the signal processor controls the switching states of the bone conduction microphone and the air conduction microphone based on the operating state of the earphone.
In some embodiments, the operating states of the earphone include an in-call state and a not-in-call state. If the operating state of the earphone is the not-in-call state, the signal processor controls the bone conduction microphone to be in a standby state; and if the operating state of the earphone is the in-call state, the signal processor controls the bone conduction microphone to be in a working state.
In some embodiments, when the operating state of the earphone is the in-call state, if the sound pressure level of the ambient noise is greater than a preset threshold, the signal processor controls the bone conduction microphone to remain in the working state; and if the sound pressure level of the ambient noise is less than the preset threshold, the signal processor controls the bone conduction microphone to switch from the working state to the standby state.
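For illustration only, the bone conduction microphone switching logic described in the two preceding embodiments can be sketched in a few lines of Python; the threshold value, names, and return type below are illustrative assumptions, not part of the claims.

```python
from enum import Enum

class MicState(Enum):
    STANDBY = 0
    WORKING = 1

# Illustrative value; the application only refers to "a preset threshold".
SPL_THRESHOLD_DB = 65.0

def bone_mic_state(in_call: bool, ambient_spl_db: float) -> MicState:
    """Sketch of the rule above: the bone conduction microphone works only
    while the earphone is in a call, and only while the ambient noise stays
    above the preset sound pressure level threshold."""
    if not in_call:
        return MicState.STANDBY
    return MicState.WORKING if ambient_spl_db > SPL_THRESHOLD_DB else MicState.STANDBY

print(bone_mic_state(True, 80.0))   # MicState.WORKING (loud street, in a call)
print(bone_mic_state(True, 40.0))   # MicState.STANDBY (quiet room, in a call)
print(bone_mic_state(False, 80.0))  # MicState.STANDBY (not in a call)
```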
Drawings
The present application is further illustrated by way of exemplary embodiments, which are described in detail with reference to the accompanying drawings. These embodiments are non-limiting; in the drawings, like numerals represent like structures, wherein:
FIG. 1 is an exemplary structural block diagram of an open earphone provided in accordance with some embodiments of the present application;
FIG. 2 is an exemplary flow chart of the working principle of an open earphone provided in accordance with some embodiments of the present application;
FIG. 3 is an exemplary flow chart of updating a noise reduction signal provided in accordance with some embodiments of the present application;
FIG. 4A is an exemplary distribution diagram of the arrangement and positional relationship of a first microphone array and a second microphone array provided in accordance with some embodiments of the present application;
FIG. 4B is an exemplary distribution diagram of an arrangement of a first microphone array and a second microphone array provided in accordance with further embodiments of the present application;
FIG. 5 is an exemplary flow chart of estimating noise at a first spatial location provided in accordance with some embodiments of the present application;
FIG. 6 is an exemplary flow chart for determining spatial noise sources provided in accordance with some embodiments of the present application;
FIG. 7 is another exemplary flow chart for determining spatial noise sources provided in accordance with some embodiments of the present application;
FIG. 8A is a schematic diagram of an arrangement of a first sub-microphone array provided in accordance with some embodiments of the present application;
FIG. 8B is a schematic diagram of an arrangement of a first sub-microphone array provided in accordance with other embodiments of the present application;
FIG. 8C is a schematic diagram of an arrangement of a first sub-microphone array provided in accordance with other embodiments of the present application;
FIG. 8D is a schematic diagram of an arrangement of a first sub-microphone array provided in accordance with other embodiments of the present application;
FIG. 9A is a schematic diagram of a positional relationship of a first sub-microphone array and a second sub-microphone array provided in accordance with some embodiments of the present application;
FIG. 9B is a schematic diagram of a positional relationship of a first sub-microphone array and a second sub-microphone array provided in accordance with other embodiments of the present application;
FIG. 10 is an exemplary flow chart of estimating noise at a first spatial location provided in accordance with some embodiments of the present application.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present application, and it is obvious to those skilled in the art that the present application may be applied to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
It will be appreciated that "system," "apparatus," "unit" and/or "module" as used herein is one method for distinguishing between different components, elements, parts, portions or assemblies of different levels. However, if other words can achieve the same purpose, the words can be replaced by other expressions.
As used in this application and in the claims, the terms "a," "an," and/or "the" are not specific to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprise" and "include" merely indicate that explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or apparatus may also include other steps or elements.
Flowcharts are used in this application to describe the operations performed by systems according to embodiments of the present application. It should be appreciated that the preceding or following operations are not necessarily performed precisely in order. Rather, the steps may be processed in reverse order or simultaneously. Also, other operations may be added to or removed from these processes.
An open earphone is an earphone device that leaves the user's ear open. An open earphone may secure its speaker in a position near the user's ear, without occluding the user's ear canal, by means of a securing structure (e.g., an ear hook, a head hook, etc.). When a user uses an open earphone, external ambient noise can also be heard, which degrades the user's listening experience. For example, in places with loud ambient noise (e.g., streets, scenic spots, etc.), when a user plays music through an open earphone, the external ambient noise enters the user's ear canal directly, and the loud noise the user hears interferes with the music listening experience. As another example, when a user wears an open earphone to make a call, the microphone picks up not only the user's speech but also the ambient noise, resulting in a poor call experience.
To address the above problems, embodiments of the present application provide an open earphone. In some embodiments, the open earphone may include a securing structure, a housing structure, a first microphone array, at least one speaker array, and a signal processor. The securing structure is configured to secure the open earphone in a position near the user's ear without occluding the user's ear canal. The housing structure is configured to carry the first microphone array, the at least one speaker array, and the signal processor. The first microphone array is configured to pick up ambient noise. The signal processor is configured to estimate noise at a first spatial location based on the ambient noise picked up by the first microphone array, where the first spatial location is closer to the user's ear canal than any microphone in the first microphone array, and to generate a noise reduction signal based on the noise at the first spatial location. It will be appreciated that the microphones in the first microphone array may be distributed at different locations near the user's ear canal, and the noise at a location near the ear canal (e.g., the first spatial location) is estimated from the ambient noise signals acquired by those microphones. The at least one speaker array is configured to output, based on the noise reduction signal, noise-reducing sound waves that may be used to cancel the ambient noise delivered to the user's ear canal. In some embodiments, the open earphone may further comprise a second microphone array configured to pick up the ambient noise and the noise-reducing sound waves output by the at least one speaker array. In some embodiments, the signal processor may update the noise reduction signal based on the sound signal picked up by the second microphone array. For example, the signal processor may estimate the sound field at the user's ear canal based on the sound signal picked up by the second microphone array and adjust the phase or amplitude of the noise reduction signal according to that sound field, thereby updating the noise reduction signal. In this way, the embodiments of the present application use noise-reducing sound waves to eliminate the ambient noise at the user's ear canal, realizing active noise reduction for an open earphone and improving the user's listening experience while using it.
FIG. 1 is an exemplary structural block diagram of an open earphone provided in accordance with some embodiments of the present application. As shown in FIG. 1, the open earphone 100 may include a housing structure 110, a first microphone array 130, a signal processor 140, and a speaker array 150, where the first microphone array 130, the signal processor 140, and the speaker array 150 are located at the housing structure 110. The open earphone 100 may be secured near the user's ear by the securing structure 120 without occluding the user's ear canal. In some embodiments, the first microphone array 130 located at the housing structure 110 may pick up ambient noise near the user's ear canal, convert the picked-up ambient noise into an electrical signal, and transmit it to the signal processor 140 for processing. The signal processor 140 is coupled to the first microphone array 130 and the speaker array 150; it receives and processes the ambient noise signal picked up by the first microphone array 130 to obtain parameter information of the ambient noise (e.g., amplitude information, phase information, etc.). The signal processor 140 may estimate the noise at a first spatial location based on this parameter information and generate a noise reduction signal based on the noise at the first spatial location. The parameter information of the noise reduction signal corresponds to that of the ambient noise: for example, the amplitude of the noise reduction signal is approximately equal to the amplitude of the ambient noise, and the phase of the noise reduction signal is approximately opposite to the phase of the ambient noise. The signal processor 140 delivers the generated noise reduction signal to the speaker array 150. The speaker array 150 may output noise-reducing sound waves according to the noise reduction signal, and these sound waves cancel the ambient noise at the user's ear canal, thereby realizing active noise reduction of the open earphone 100 and improving the user's listening experience during its use.
The housing structure 110 may be configured to carry a first microphone array 130, a signal processor 140, and a speaker array 150. In some embodiments, the housing structure 110 may be an enclosed or semi-enclosed housing structure that is hollow inside, and the first microphone array 130, the signal processor 140, and the speaker array 150 are located at the housing structure 110. In some embodiments, the shape of the housing structure 110 may be a regular or irregular shaped solid structure such as a cuboid, a cylinder, a truncated cone, or the like. When the user wears the open earphone 100, the housing structure 110 may be located near the user's ear, for example, the housing structure 110 may be located on the circumferential side (e.g., front or back) of the user's pinna, or on the user's ear but not blocking or covering the user's ear canal. In some embodiments, the open earphone 100 may be a bone conduction earphone, and at least one side of the housing structure 110 may be in contact with the skin of the user's head. An acoustic driver (e.g., a vibration speaker) in the bone conduction headset converts the audio signal into mechanical vibrations that can be transmitted through the housing structure 110 and the user's bones to the user's auditory nerve. In some embodiments, the open earphone 100 may be an air-conducting earphone, and at least one side of the housing structure 110 may or may not be in contact with the skin of the user's head. The side wall of the housing structure 110 includes at least one sound guide hole, and a speaker in the air conduction earphone converts the audio signal into air conduction sound, which can radiate toward the user's ear through the sound guide hole.
The first microphone array 130 may be configured to pick up ambient noise. In some embodiments, ambient noise refers to a combination of multiple external sounds in the environment in which the user is located. In some embodiments, the ambient noise may include one or more of traffic noise, industrial noise, construction noise, social noise, and the like. Traffic noise may include, but is not limited to, motor vehicle driving noise, whistling noise, and the like. Industrial noise may include, but is not limited to, the operating noise of factory power machinery, and the like. Construction noise may include, but is not limited to, power-machinery excavation noise, drilling noise, agitation noise, and the like. Social noise may include, but is not limited to, crowd gathering noise, entertainment and promotional noise, crowd chatter, household appliance noise, and the like. In some embodiments, the first microphone array 130 may be disposed near the user's ear canal to pick up the ambient noise transmitted toward it; the first microphone array 130 may convert the picked-up ambient noise into an electrical signal and transmit it to the signal processor 140 for processing. In some embodiments, the ambient noise may also include the user's own speech. For example, when the open earphone 100 is in a not-in-call state, the sound generated by the user's own speaking may be regarded as ambient noise; the first microphone array 130 may pick it up along with other ambient noise and convert the combined sound signal into an electrical signal for the signal processor 140 to process. In some embodiments, the first microphone array 130 may be distributed at the user's left ear or right ear, or at both ears. For example, the first microphone array 130 may include a first sub-microphone array located at the user's left ear and a second sub-microphone array located at the user's right ear, and the two sub-arrays may be put into operation simultaneously, or only one of them may be.
In some embodiments, the first microphone array 130 may include air conduction microphones and/or bone conduction microphones. For example, the first microphone array 130 may include one or more air conduction microphones; when the user listens to music with the open earphone 100, an air conduction microphone may simultaneously acquire external noise and the sound of the user speaking, convert both into an electrical signal as ambient noise, and transmit it to the signal processor 140 for processing. In some embodiments, the first microphone array 130 may also include one or more bone conduction microphones. A bone conduction microphone may be in direct contact with the skin of the user's head, so that vibration signals generated by the bones or muscles of the user's face during speech are transferred directly to it, converted into an electrical signal, and transmitted to the signal processor 140 for processing. A bone conduction microphone may also avoid direct contact with the human body: the vibration generated during speech is first transmitted to the housing structure 110 and then from the housing structure 110 to the bone conduction microphone, which converts the vibration into an electrical signal containing the voice information. For example, when the user is in a call, the signal processor 140 may treat the sound signal collected by the air conduction microphones as ambient noise for noise reduction processing while retaining the sound signal collected by the bone conduction microphones as the voice signal, thereby ensuring call quality. In some embodiments, classified by working principle, the first microphone array 130 may include moving-coil microphones, ribbon microphones, condenser microphones, electret microphones, electromagnetic microphones, carbon microphones, etc., or any combination thereof. In some embodiments, the array arrangement of the first microphone array 130 may be a linear array (e.g., straight or curved), a planar array (e.g., a regular and/or irregular shape such as a cross, circle, ring, polygon, or mesh), or a stereoscopic array (e.g., cylindrical, spherical, hemispherical, or polyhedral); for the arrangement of the first microphone array 130, reference may be made to FIG. 8 of the present application and its related description.
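As a concrete illustration of such arrangements, the sketch below (in Python, used for all illustrations here) generates microphone coordinates for a linear and a semicircular layout; the microphone counts and dimensions are illustrative assumptions, not values from the application.

```python
import numpy as np

def linear_array(n_mics: int, spacing_m: float) -> np.ndarray:
    """(x, y) positions for a linear arrangement, centered on the origin."""
    x = (np.arange(n_mics) - (n_mics - 1) / 2) * spacing_m
    return np.column_stack([x, np.zeros(n_mics)])

def semicircular_array(n_mics: int, radius_m: float) -> np.ndarray:
    """(x, y) positions for a semicircular arrangement (cf. FIG. 4A)."""
    angles = np.linspace(0.0, np.pi, n_mics)
    return np.column_stack([radius_m * np.cos(angles), radius_m * np.sin(angles)])

# Illustrative: an 8-microphone semicircle of 2 cm radius around the ear canal.
print(semicircular_array(8, 0.02).round(4))
```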
The signal processor 140 is configured to estimate noise at the first spatial location based on the ambient noise picked up by the first microphone array 130 and to generate a noise reduction signal based on the noise at the first spatial location. The first spatial location refers to a spatial location that is closer to the user's ear canal than any of the microphones in the first microphone array 130. The distance between the first spatial location and the ear canal may be a fixed distance, for example, 0.5 cm, 1 cm, 2 cm, 3 cm, etc. In some embodiments, the first spatial location is related to the positions and number of the microphones in the first microphone array 130 relative to the user's ear, and it may be adjusted by adjusting those positions, that number, or both. For example, the first spatial location may be brought closer to the user's ear canal by increasing the number of microphones in the first microphone array 130.
The signal processor 140 may perform signal processing on the received ambient noise signal to estimate noise at the first spatial location. In some embodiments, the signal processor 140 may be coupled to the first microphone array 130 and the speaker array 150, and the signal processor 140 may receive the ambient noise picked up by the first microphone array 130 to estimate the noise at the first spatial location. For example, the signal processor 140 may determine one or more spatial noise sources related to the picked-up ambient noise. For another example, the signal processor 140 may perform position estimation, phase information estimation, amplitude information estimation, etc. on spatial noise sources related to the environmental noise. The signal processor 140 may generate the noise reduction signal based on the noise estimate (e.g., phase information, amplitude information) of the first spatial location. The noise reduction signal is a sound signal having an approximately equal magnitude and an approximately opposite phase to the noise at the first spatial position.
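One of the localization options named in this application, time difference of arrival, can be sketched generically as follows; the sample rate, test signals, and 40-sample delay are illustrative assumptions, and a practical implementation would use more robust estimators (e.g., generalized cross-correlation).

```python
import numpy as np

def tdoa_seconds(sig_a: np.ndarray, sig_b: np.ndarray, fs: int) -> float:
    """Estimate how much later sig_a hears the source than sig_b, via the
    peak of their cross-correlation."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = int(np.argmax(corr)) - (sig_b.size - 1)
    return lag / fs

fs = 4000
src = np.random.default_rng(1).standard_normal(fs)
mic_a = src
mic_b = np.roll(src, 40)  # the second microphone hears the source 40 samples later
print(tdoa_seconds(mic_b, mic_a))  # ~0.01 s; with the array geometry this constrains the source position
```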
The speaker array 150 is configured to output, based on the noise reduction signal, noise-reducing sound waves for canceling the ambient noise delivered to the user's ear canal. In some embodiments, the speaker array 150 may be disposed at the housing structure 110 and located near the user's ear when the open earphone 100 is worn. In some embodiments, the speaker array 150 may output noise-reducing sound waves based on the noise reduction signal to cancel the ambient noise at the first spatial location. For example only, the signal processor 140 controls the speaker array 150 to output sound signals having approximately equal amplitude and approximately opposite phase to the noise at the first spatial location, so as to cancel that noise. In some embodiments, because the spacing between the first spatial location and the user's ear canal is small, the noise at the first spatial location may be approximately regarded as the noise delivered to the user's ear; the noise-reducing sound waves output by the speaker array 150 cancel the noise at the first spatial location, which is approximately equivalent to eliminating the ambient noise delivered to the user's ear canal. In some embodiments, classified according to the working principle of the speakers, the speaker array 150 may include one or more of electrodynamic speakers (e.g., moving-coil speakers), magnetic speakers, ion speakers, electrostatic (capacitive) speakers, piezoelectric speakers, and the like. In some embodiments, classified according to the manner in which the output sound propagates, the speaker array 150 may include air conduction speakers and/or bone conduction speakers. For example, where the speaker array 150 includes only air conduction speakers, some of them may be used to output noise-reducing sound waves, while others convey to the user the sound information the user needs to hear (e.g., device media audio, far-end call audio). In some embodiments, the speakers used to convey that sound information may also be used to output noise-reducing sound waves. As another example, where the speaker array 150 includes both bone conduction speakers and air conduction speakers, the air conduction speakers may be used to output noise-reducing sound waves, while the bone conduction speakers transmit to the user the sound information the user needs to hear; because a bone conduction speaker transmits mechanical vibrations directly through the user's body (e.g., bone, skin tissue, etc.) to the auditory nerve, it interferes less with the air conduction microphones picking up ambient noise than an air conduction speaker does. A bone conduction speaker may cause the housing structure 110 to vibrate mechanically while transmitting vibration to the user, and the mechanical vibration of the housing structure 110 acts on the air to generate air-conducted sound; in some embodiments, this air-conducted sound may also serve as a noise-reducing sound wave.
It should be noted that in some embodiments, speaker array 150 may be a stand-alone functional device or may be part of a single device capable of performing multiple functions. In some embodiments, the signal processor 140 may be integrated and/or formed integrally with the speaker array 150. In some embodiments, speaker array 150 may be arranged in a linear array (e.g., linear, curved), a planar array (e.g., regular and/or irregular shapes such as cross, mesh, circular, annular, polygonal, etc.), or a volumetric array (e.g., cylindrical, spherical, hemispherical, polyhedral, etc.), as not limited herein.
In some embodiments, the open earphone 100 may further comprise a securing structure 120 configured to secure the open earphone 100 in a position near the user's ear without occluding the user's ear canal. In some embodiments, the securing structure 120 may include an ear hook, a headband, an elastic band, or the like, so that the open earphone 100 is better secured near the user's ear and does not fall off during use. For example only, the securing structure 120 may be an ear hook configured to be worn around the ear region. As another example, the securing structure 120 may be a neck strap configured to be worn around the neck/shoulder region. In some embodiments, the ear hook may be a continuous hook that can be elastically stretched to be worn over the user's ear, while also applying pressure to the user's pinna so that the open earphone 100 is fastened securely to a particular position on the user's ear or head. In some embodiments, the ear hook may be a discontinuous band. For example, the ear hook may include a rigid portion and a flexible portion, where the rigid portion may be made of a rigid material (e.g., plastic or metal) and fixed to the housing structure of the acoustic output device by a physical connection (e.g., a snap fit, a threaded connection, etc.), and the flexible portion may be made of an elastic material (e.g., cloth, a composite, or/and neoprene).
It should be noted that the above description with respect to FIGS. 1 and 2 is provided for illustrative purposes only and is not intended to limit the scope of the present application. Many variations and modifications will be apparent to those of ordinary skill in the art in light of the teaching of this application; such changes and modifications do not depart from its scope. For example, one or more elements (e.g., the securing structure 120) of the open earphone 100 may be omitted. In some embodiments, one element may be replaced with another element that performs a similar function. For example, in some embodiments, the open earphone 100 may not include the securing structure 120, and the housing structure 110 may instead have a shape that fits the human ear, such as a ring, an oval, a polygon (regular or irregular), a U shape, a V shape, or a semicircle, so that it can hang near the user's ear. In some embodiments, one element may be split into multiple sub-elements, or multiple elements may be combined into a single element.
FIG. 2 is an exemplary flow chart of the working principle of an open earphone provided in accordance with some embodiments of the present application. As shown in FIG. 2, process 200 may include:
In step 210, ambient noise is picked up.
In some embodiments, this step may be performed by the first microphone array 130. In some embodiments, ambient noise refers to a combination of multiple ambient sounds in the environment in which the user is located. In some embodiments, the environmental noise may include one or more of traffic noise, industrial noise, construction noise, social noise, and the like. In some embodiments, traffic noise may include, but is not limited to, motor vehicle travel noise, whistling noise, and the like. Industrial noise may include, but is not limited to, plant power machine operation noise, and the like. The construction noise may include, but is not limited to, power machine excavation noise, hole drilling noise, agitation noise, and the like. The social living environment noise may include, but is not limited to, crowd gathering noise, entertainment promotional noise, crowd noise, household appliance noise, and the like. In some embodiments, the first microphone array 130 may be positioned near the user's ear canal for picking up ambient noise delivered to the user's ear canal, and the first microphone array 130 may convert the picked up ambient noise signal into an electrical signal and deliver to the signal processor 140 for signal processing. In some embodiments, the ambient noise may also include the sound of the user speaking. For example, when the open earphone 100 is in an unvoiced state (e.g., listening to audio or watching video), the sound generated by the user speaking itself may be regarded as ambient noise, and the first microphone array 130 may pick up the sound of the user speaking itself and other ambient noise, and convert the sound signal generated by the user speaking and other ambient noise into an electrical signal to be transmitted to the signal processor 140 for signal processing.
In step 220, noise at the first spatial location is estimated based on the picked-up ambient noise.
In some embodiments, this step may be performed by the signal processor 140. The first spatial location refers to a spatial location close to the user's ear canal at a specific distance. The specific distance may be a fixed distance, for example, 0.5 cm, 1 cm, 2 cm, 3 cm, etc., and may be adjusted adaptively according to the practical situation. The ambient noise picked up by the first microphone array 130 may come from spatial noise sources in different directions and of different kinds, so the parameter information (e.g., phase information, amplitude information) corresponding to each spatial noise source differs. In some embodiments, the signal processor 140 may separate and extract signals from the noise at the first spatial location according to the statistical distributions and structural characteristics of different types of noise in different dimensions (e.g., spatial domain, time domain, frequency domain), so as to distinguish different types of noise (e.g., of different frequencies or phases) and estimate the parameter information (e.g., amplitude information, phase information) corresponding to each type. In some embodiments, the signal processor 140 may also determine the overall parameter information of the noise at the first spatial location based on the parameter information corresponding to the different noise types. In some embodiments, estimating the noise at the first spatial location based on the picked-up ambient noise may further include determining one or more spatial noise sources related to the picked-up ambient noise and estimating the noise at the first spatial location based on those spatial noise sources. For example, the picked-up ambient noise may be divided into a plurality of sub-bands, each corresponding to a different frequency range, and the corresponding spatial noise source may be determined on at least one sub-band. Note that the spatial noise source estimated from a sub-band here is a virtual noise source corresponding to an external real noise source. For details on estimating the noise at the first spatial location based on the picked-up ambient noise, reference may be made elsewhere in the present application, e.g., FIGS. 5-7 and their corresponding descriptions.
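The sub-band division mentioned above can be made concrete with a minimal sketch; the FFT-masking approach and the band edges below are assumptions chosen for illustration, not the application's specific filter design.

```python
import numpy as np

def split_subbands(noise: np.ndarray, fs: int, edges_hz=(0, 500, 2000, 8000)):
    """Split picked-up ambient noise into sub-bands, each covering a different
    frequency range, by zeroing FFT bins outside each band."""
    spectrum = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(noise.size, 1.0 / fs)
    bands = []
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        masked = np.where((freqs >= lo) & (freqs < hi), spectrum, 0.0)
        bands.append(np.fft.irfft(masked, n=noise.size))
    return bands  # one time-domain signal per sub-band

fs = 16000
t = np.arange(fs) / fs
noise = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 3000 * t)
low, mid, high = split_subbands(noise, fs)
# The 200 Hz component lands in the low band, the 3 kHz component in the high band.
```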
Because the open earphone 100 does not occlude the user's ear canal, it cannot acquire ambient noise by placing a microphone at the ear canal; the first spatial location is a spatial region constructed from the first microphone array 130 to simulate the position of the user's ear canal. In order to estimate more accurately the ambient noise delivered to the user's ear canal, in some embodiments the first spatial location is closer to the user's ear canal than any microphone in the first microphone array 130. In some embodiments, the first spatial location is related to the positions and number of the microphones in the first microphone array 130 relative to the user's ear, and it may be adjusted by adjusting those positions or that number. For example, the first spatial location may be brought closer to the user's ear canal by increasing the number of microphones in the first microphone array 130. As another example, the first spatial location may be brought closer to the user's ear canal by decreasing the spacing of the microphones in the first microphone array 130, or by changing the arrangement of the microphones in the first microphone array 130.
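A minimal sketch of estimating noise at such a virtual point follows. It assumes a single, already-localized point source and free-field propagation, shifting and rescaling each microphone signal to the first spatial location and averaging; all names and simplifications (including the circular shift) are illustrative, not the application's algorithm.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def estimate_noise_at_point(mic_signals, mic_positions, source_pos, target_pos, fs):
    """mic_signals: (n_mics, n_samples); mic_positions: (n_mics, 2);
    source_pos / target_pos: (2,); fs: sample rate in Hz.
    Predicts the noise waveform at target_pos (the first spatial location)
    from the array pickups, under a free-field single-source model."""
    mic_dists = np.linalg.norm(mic_positions - source_pos, axis=1)
    tgt_dist = np.linalg.norm(target_pos - source_pos)
    out = np.zeros(mic_signals.shape[1])
    for sig, d in zip(mic_signals, mic_dists):
        # Extra travel time from the source to the target vs. this microphone,
        # applied as a delay (np.roll wraps around; acceptable for a sketch),
        # plus 1/r amplitude rescaling.
        shift = int(round((tgt_dist - d) / SPEED_OF_SOUND * fs))
        out += (d / tgt_dist) * np.roll(sig, shift)
    return out / len(mic_signals)
```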
In step 230, a noise reduction signal is generated based on the noise at the first spatial location.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, the signal processor 140 may generate the noise reduction signal based on the parameter information (e.g., amplitude information, phase information, etc.) of the noise at the first spatial location obtained in step 220. For example, the phase of the noise-reducing sound wave may be approximately opposite to the phase of the noise at the first spatial location, while the amplitude of the noise reduction signal is approximately equal to the amplitude of that noise. In some embodiments, the speaker array 150 may output a noise-reducing sound wave based on the noise reduction signal generated by the signal processor 140, and the noise-reducing sound wave may cancel the noise at the first spatial location. In some embodiments, the noise at the first spatial location may be approximately regarded as the noise at the user's ear canal, so the mutual cancellation of the noise reduction signal and the noise at the first spatial location is approximately equivalent to eliminating the ambient noise delivered to the user's ear canal. In some embodiments, the open earphone 100 may eliminate ambient noise at the user's ear canal by the method steps described in FIG. 2, realizing active noise reduction of the open earphone 100.
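In the ideal discrete-time case, step 230 reduces to a sign inversion of the estimated noise. The sketch below shows this; it deliberately omits the speaker-to-ear transfer-function compensation a real system would also apply.

```python
import numpy as np

def noise_reduction_signal(noise_estimate: np.ndarray) -> np.ndarray:
    """Equal amplitude, opposite phase: in discrete time this is a sign flip."""
    return -noise_estimate

fs = 16000
t = np.arange(fs) / fs
noise = 0.1 * np.sin(2 * np.pi * 200 * t)          # estimated noise at the first spatial location
residual = noise + noise_reduction_signal(noise)   # what would remain at the ear canal
print(np.max(np.abs(residual)))                    # 0.0 in this idealized case
```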
It should be noted that the above description of the process 200 is for illustration and description only, and is not intended to limit the scope of applicability of the present disclosure. Various modifications and changes to flow 200 will be apparent to those skilled in the art in light of the present description. However, such modifications and variations are still within the scope of the present description. For example, steps in flow 200 may also be added, omitted, or combined, such as, for example, signal processing (e.g., filtering, etc.) may also be performed on the ambient noise.
With continued reference to FIG. 1, in some embodiments the open earphone 100 may further comprise a second microphone array 160. The second microphone array 160 may be located inside the housing structure 110. In some embodiments, the second microphone array 160 is at least partially distinct from the first microphone array 130. For example, the second microphone array 160 may differ from the first microphone array 130 in one or more of the number, type, location, or arrangement of its microphones. For instance, the microphones in the first microphone array 130 may be arranged linearly while those in the second microphone array 160 are arranged in a circle; or the second microphone array 160 may include only air conduction microphones while the first microphone array 130 includes both air conduction and bone conduction microphones. In some embodiments, the microphones in the second microphone array 160 may be any one or more of the microphones included in the first microphone array 130, or may be entirely independent of the first microphone array 130. In some embodiments, the second microphone array 160 may be configured to pick up the ambient noise and the noise-reducing sound waves output by the speaker array 150, and the picked-up signals may be transferred to the signal processor 140. In some embodiments, the signal processor 140 may update the noise reduction signal based on the sound signal picked up by the second microphone array 160. For example, the signal processor 140 may adjust the parameter information of the noise reduction signal according to the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the sound signal picked up by the second microphone array 160, so that the amplitude of the adjusted noise reduction signal better matches the amplitude of the noise at the first spatial location and its phase better matches the opposite of that noise's phase, allowing the updated noise-reducing sound wave and the noise at the first spatial location to cancel more completely. For details on updating the noise reduction signal based on the sound signal picked up by the second microphone array 160, reference may be made to FIG. 3 of the present application and the description thereof.
FIG. 3 is an exemplary flow chart of updating a noise reduction signal provided in accordance with some embodiments of the present application. As shown in FIG. 3, process 300 may include:
In step 310, the sound field at the user's ear canal is estimated based on the sound signals picked up by the second microphone array 160.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, the sound signals picked up by the second microphone array 160 include ambient noise and noise-reducing sound waves output by the speaker array 150. In some embodiments, after the ambient noise and the noise-reducing sound wave output by the speaker array 150 cancel, a portion of the acoustic signals that are not canceled by each other may still exist near the ear canal of the user, and these acoustic signals that are not canceled may be residual ambient noise and/or residual noise-reducing sound wave, so that a certain noise still exists at the ear canal of the user after the ambient noise and the noise-reducing sound wave cancel. The signal processor 140 may perform signal processing according to the sound signal (e.g., environmental noise, noise-reduced sound wave) picked up by the second microphone array 160, to obtain parameter information of the sound field at the ear canal of the user, such as frequency information, amplitude information, phase information, etc., so as to implement sound field estimation at the ear canal of the user.
In step 320, parameter information of the noise reduction signal is adjusted according to the sound field at the ear canal of the user.
In some embodiments, step 320 may be performed by signal processor 140. In some embodiments, the signal processor 140 may adjust the parameter information (e.g., the frequency information, the amplitude information, and/or the phase information) of the noise reduction signal according to the parameter information of the sound field at the ear canal of the user obtained in step 310, so that the amplitude information and the frequency information of the updated noise reduction signal are more consistent with the amplitude information and the frequency information of the environmental noise at the ear canal of the user, and the phase information of the updated noise reduction signal is more consistent with the anti-phase information of the environmental noise at the ear canal of the user, so that the updated noise reduction signal may more accurately eliminate the environmental noise.
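One common way to implement such an adjustment is an adaptive filter driven by the residual picked up near the ear canal; the least-mean-squares step below is a generic sketch under that assumption (the application does not name a specific adaptation algorithm, and practical active-noise-control systems typically use filtered-x variants).

```python
import numpy as np

def lms_update(weights: np.ndarray, reference: np.ndarray, error: float,
               mu: float = 0.01) -> np.ndarray:
    """One LMS step: the residual (error) measured by the second microphone
    array nudges the filter that shapes the noise reduction signal.
    weights: (n_taps,) current filter; reference: (n_taps,) most recent
    ambient-noise samples, newest first; error: residual sample at the ear."""
    return weights - mu * error * reference

# Per audio frame: filter the reference noise to get the noise reduction
# signal, measure the residual with the second array, then adapt.
w = np.zeros(32)
ref = np.random.default_rng(2).standard_normal(32)  # stand-in reference samples
residual = float(ref @ w) - 0.3                     # stand-in residual at the ear canal
w = lms_update(w, ref, residual)
```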
It should be noted that the above description of the process 300 is for purposes of example and illustration only and is not intended to limit the scope of applicability of the present application. Various modifications and changes to flow 300 will be apparent to those skilled in the art in light of this description; such modifications and variations remain within its scope. For example, the microphone array for picking up the sound field at the user's ear canal is not limited to the second microphone array and may include other microphone arrays (e.g., a third microphone array, a fourth microphone array, etc.), and the sound-field parameter information picked up by the multiple microphone arrays may be combined by averaging, weighting, or similar algorithms.
In some embodiments, to acquire the sound field at the user's ear canal more accurately, the second microphone array 160 includes a microphone that is closer to the user's ear canal than any of the microphones in the first microphone array 130. In some embodiments, the sound signal picked up by the first microphone array 130 is the ambient noise, while the sound signal picked up by the second microphone array 160 includes both the ambient noise and the noise-reducing sound waves. In some embodiments, the signal processor 140 may estimate the sound field at the user's ear canal from the sound signals picked up by the second microphone array 160 in order to update the noise reduction signal. Because the second microphone array 160 monitors the sound field remaining at the user's ear canal after the noise reduction signal and the ambient noise cancel, placing at least one of its microphones closer to the ear canal than any microphone in the first microphone array 130 lets it represent the sound actually heard by the user more accurately; updating the noise reduction signal from this estimated sound field further improves the noise reduction effect and the user's listening experience.
In some embodiments, the arrangement of the first microphone array 130 and the second microphone array 160 may be the same. The first microphone array 130 and the second microphone array 160 are arranged in the same manner, and the arrangement shapes of the two are understood to be approximately the same. Fig. 4A is an exemplary distribution diagram of an arrangement and positional relationship of a first microphone array and a second microphone array provided in accordance with some embodiments of the present application. As shown in fig. 4A, the first microphone array 130 is disposed at the ear of the user in a semicircular arrangement, and the second microphone array 160 is also disposed at the ear of the user in a semicircular arrangement, where the microphones in the second microphone array 160 are closer to the ear canal of the user than any microphone in the first microphone array 130. In some embodiments, the microphones in the first microphone array 130 may be provided separately from the microphones in the second microphone array 160. For example, the microphones in the first microphone array 130 in fig. 4A are arranged in a semicircular arrangement, the microphones in the second microphone array 160 are arranged in a semicircular arrangement, and the microphones in the first microphone array 130 and the microphones in the second microphone array 160 do not overlap or cross. In some embodiments, the microphones in the first microphone array 130 may partially overlap or intersect the microphones in the second microphone array 160.
In some embodiments, the arrangement of the first microphone array 130 and the second microphone array 160 may be different. Fig. 4B is an exemplary distribution diagram of an arrangement of a first microphone array and a second microphone array provided according to further embodiments of the present application. As shown in fig. 4B, the first microphone array 130 is disposed at the ear of the user in a semicircular arrangement, and the second microphone array 160 is disposed at the ear of the user in a linear arrangement, wherein the microphones in the second microphone array 160 are closer to the ear canal of the user than any microphone in the first microphone array 130. In some embodiments, the first microphone array 130 and the second microphone array 160 may also be arranged in a combined arrangement. For example, in fig. 4B, the second microphone array 160 includes a linear arrangement portion and a semicircular arrangement portion, and the semicircular arrangement portion of the second microphone array 160 is an integral part of the first microphone array 130. It should be noted that, the arrangement of the first microphone array 130 and the second microphone array 160 is not limited to the semicircle shape and the line shape shown in fig. 4A and fig. 4B, and the semicircle shape and the line shape are only for illustrative purposes, and reference may be made to fig. 8 and the related description of the present specification regarding the arrangement of the microphone arrays.
In some embodiments, the noise reduction signal may also be updated based on manual input by the user. For example, in some embodiments, when a user wears the open earphone 100 to play music in a relatively noisy external environment and finds the hearing experience unsatisfactory, the user may manually adjust the parameter information (e.g., frequency information, phase information, or amplitude information) of the noise reduction signal according to his or her own hearing. As another example, the hearing ability of a special user (for example, a hearing-impaired user or an older user) differs from that of a general user, so the noise-reducing sound waves generated by the open earphone 100 may not match such users, giving them a poor hearing experience. In this case, the special user can manually adjust the frequency information, phase information, or amplitude information of the noise reduction signal according to his or her own hearing, thereby updating the noise reduction signal and improving the hearing experience. In some embodiments, the user may adjust the noise reduction signal manually through keys on the open earphone 100. In some embodiments, the user may also adjust it manually through a terminal device. In some embodiments, the open earphone 100, or an electronic product communicatively connected to it (such as a mobile phone, a tablet computer, or a computer), may display the sound field at the user's ear canal and feed back a suggested range for the frequency information, amplitude information, or phase information of the noise reduction signal, so that the user can first input parameter information within the suggested range and then fine-tune it according to his or her hearing experience.
In some embodiments, the signal processor 140 estimating the noise of the first spatial location based on the picked-up ambient noise may include: performing signal separation on the picked-up ambient noise, acquiring the parameter information corresponding to the ambient noise, and generating the noise reduction signal based on that parameter information. In some embodiments, the ambient noise picked up by the microphone arrays (e.g., the first microphone array 130 and the second microphone array 160) may include noise, the user's own voice, audio output by the speaker array 150, and so on. In some embodiments, the audio output by the speaker array 150 may include far-end call audio, device media audio, noise-reducing sound waves, etc. In some embodiments, the signal processor 140 may analyze the ambient noise picked up by the microphone array and separate the various sound signals it contains, obtaining individual signals such as noise, the user's voice, the noise-reducing sound waves, device media audio, and far-end call audio. Specifically, the signal processor 140 may adaptively adjust filter-bank parameters according to the statistical distribution characteristics and structural characteristics of these signals in different dimensions such as the spatial, time, and frequency domains, estimate the parameter information of each sound signal in the ambient noise, and complete the signal separation process based on the differing parameter information. For example, in some embodiments, the microphone array may convert the picked-up noise, the user's voice, and the noise-reducing sound waves into a corresponding first signal, second signal, and third signal, respectively. The signal processor 140 obtains the spatial differences (e.g., source locations), time-domain differences (e.g., delays), and frequency-domain differences (e.g., amplitudes and phases) of the three signals, and separates them along these three dimensions to obtain relatively pure versions of the first, second, and third signals, corresponding to pure noise, the user's voice, and the noise-reducing sound waves, thereby completing the signal separation process. In some embodiments, the signal processor 140 may update the noise-reducing sound waves according to the parameter information of the noise, the noise-reducing sound waves, the device media audio, the far-end call audio, etc. obtained from the separation, and output the updated noise-reducing sound waves through the speaker array 150.
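By way of illustration only, the following minimal sketch (not part of the original disclosure) shows how the frequency-domain side of such a spatial separation could look for two sources picked up by two microphones, assuming the relative inter-microphone delay of each source is already known; the function name and the known-delay assumption are hypothetical, and a real implementation would estimate the mixing adaptively as described above.

```python
import numpy as np

def separate_two_sources(x1, x2, tau, fs):
    """Separate two sources from a two-microphone pickup of equal length,
    assuming each source k reaches microphone 2 with a known extra delay
    tau[k] (in seconds) relative to microphone 1. Per frequency bin, the
    pickup is modeled as X = A @ S and solved for S."""
    X = np.stack([np.fft.rfft(x1), np.fft.rfft(x2)])   # shape (2, bins)
    f = np.fft.rfftfreq(len(x1), 1 / fs)               # bin frequencies, Hz
    S = np.empty_like(X)
    for i, fi in enumerate(f):
        # Mixing matrix: rows = microphones, columns = sources.
        A = np.array([[1.0, 1.0],
                      [np.exp(-2j * np.pi * fi * tau[0]),
                       np.exp(-2j * np.pi * fi * tau[1])]])
        S[:, i] = np.linalg.pinv(A) @ X[:, i]          # pinv handles singular bins (e.g., DC)
    return np.fft.irfft(S[0], len(x1)), np.fft.irfft(S[1], len(x1))
```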
In some embodiments, the structural features of the noise may include the noise distribution, noise intensity, global noise intensity, noise rate, or the like, or any combination thereof. In some embodiments, the noise intensity may refer to the value of a noise pixel, reflecting the magnitude of the noise in that pixel; accordingly, the noise distribution may reflect the probability density of noise of different intensities in the image. The global noise intensity may reflect an average or weighted-average noise intensity in the image. The noise rate may reflect the degree of dispersion of the noise distribution. In some embodiments, the statistical distribution characteristics of the noise may include, but are not limited to, probability distribution density, power spectral density, autocorrelation function, probability density function, variance, mathematical expectation, and the like. In some embodiments, the user's voice, device media audio, far-end call audio, etc. obtained by signal separation may also be transmitted to the far end of a call. For example, when the user wears the open earphone 100 to make a voice call, the user's voice can be transmitted to the far end of the call.
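By way of example only, a short sketch of how a few of the statistical characteristics listed above could be computed for one picked-up noise frame is given below; it assumes NumPy/SciPy and is not part of the original disclosure.

```python
import numpy as np
from scipy import signal

def noise_statistics(x, fs):
    """Compute several of the statistical characteristics named above for a
    picked-up noise frame x sampled at fs Hz."""
    freqs, psd = signal.welch(x, fs=fs, nperseg=min(1024, len(x)))  # power spectral density
    centered = x - x.mean()
    acorr = np.correlate(centered, centered, mode="full")
    acorr = acorr[acorr.size // 2:] / acorr[acorr.size // 2]        # normalized, lags >= 0
    return {"psd": (freqs, psd),
            "autocorrelation": acorr,
            "variance": x.var(),
            "mathematical_expectation": x.mean()}
```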
Fig. 5 is an exemplary flow chart of estimating noise for a first spatial location provided in accordance with some embodiments of the present description. As shown in fig. 5, the process 500 may include:
In step 510, one or more spatial noise sources associated with the picked-up ambient noise are determined.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, a spatial noise source associated with ambient noise refers to a noise source whose sound waves may reach the user's ear canal or its vicinity (e.g., the first spatial location). In some embodiments, spatial noise sources may lie in different directions relative to the user's body (e.g., front, rear, etc.). For example, if people in front of the user's body are talking and a vehicle to its left is honking, the spatial noise sources are the people in front of the user's body and the vehicle to its left. In some embodiments, the first microphone array 130 may pick up spatial noise from various directions around the user's body, convert it into electrical signals, and transmit them to the signal processor 140; the signal processor 140 may analyze these electrical signals to obtain the parameter information (e.g., azimuth information, amplitude information, phase information, etc.) of the spatial noise in each direction, and from it determine the spatial noise sources in each direction, for example, their orientations, phases, and amplitudes. In some embodiments, the signal processor 140 may determine a spatial noise source through a noise localization algorithm. In some embodiments, the noise localization algorithm may include one or more of beamforming, super-resolution spatial spectrum estimation, time difference of arrival, and the like. In some embodiments, the signal processor 140 may divide the picked-up ambient noise into a plurality of sub-bands according to a specific bandwidth (e.g., one band every 500 Hz), each sub-band corresponding to a different frequency range, and determine the spatial noise source corresponding to at least one sub-band. The localization of spatial noise sources is described elsewhere in this specification and is not repeated here. For a detailed description of determining one or more spatial noise sources associated with the picked-up ambient noise, reference may be made to fig. 6 of this specification and its associated description.
In step 520, noise at a first spatial location is estimated based on the spatial noise sources.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, the signal processor 140 may, based on the parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the spatial noise sources in various directions around the user's body obtained in step 510, estimate the parameter information of the noise that each spatial noise source delivers to the first spatial location, and thereby estimate the noise at the first spatial location. For example, in some embodiments where there is one spatial noise source in front of and one behind the user's body, the signal processor 140 may estimate, from the frequency, phase, or amplitude information of the front spatial noise source, the corresponding information of its noise when delivered to the first spatial location, and likewise for the rear spatial noise source; combining the two, the signal processor 140 estimates the noise at the first spatial location. In some embodiments, the parameter information of a spatial noise source may be extracted from the frequency response curve picked up by the microphone array using a feature extraction method. In some embodiments, methods of extracting the parameter information of spatial noise sources may include, but are not limited to, Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), Singular Value Decomposition (SVD), and the like.
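By way of illustration only, the sketch below (not from the original disclosure) estimates the complex noise pressure at the first spatial location by propagating each localized source to that point under a free-field point-source model; the 1/r amplitude decay and the 2*pi*f*r/c phase shift are standard acoustics, while the function name and data layout are hypothetical.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def noise_at_point(sources, point, f):
    """Estimate the complex noise pressure at `point` (the first spatial
    location) at frequency f, given localized spatial noise sources as
    (position, complex_amplitude) pairs. Each source contributes with
    1/r amplitude decay and a propagation phase of 2*pi*f*r/c."""
    total = 0j
    for pos, amp in sources:
        r = np.linalg.norm(np.asarray(point) - np.asarray(pos))
        total += amp / max(r, 1e-6) * np.exp(-2j * np.pi * f * r / C)
    return total
```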
In some embodiments, determining one or more spatial noise sources related to the picked-up ambient noise may include locating them by one or more of beamforming, super-resolution spatial spectrum estimation, or time difference of arrival. Beamforming localization is a steerable-beam sound source localization method based on maximum output power. In some embodiments, the beamforming method may weight and sum the sound signals picked up by the individual elements of the microphone array to form a beam, steer the beam by searching the possible locations of spatial noise sources, and modify the weights so that the output signal power of the microphone array is maximized. It should be noted that beamforming localization may be performed in either the time domain or the frequency domain; a time shift in time-domain beamforming is equivalent to a phase delay in the frequency domain. In some embodiments, super-resolution spatial spectrum estimation methods may include the autoregressive (AR) model, minimum variance (MV) spectral estimation, eigenvalue decomposition methods (e.g., the MUSIC algorithm), and the like, which compute the correlation matrix of the spatial spectrum from the microphone array signals and effectively estimate the direction of a spatial noise source. In some embodiments, the time-difference-of-arrival method may first estimate the time difference of arrival (TDOA) to obtain the acoustic delays between the array elements, and then locate the spatial noise source by combining the measured delays with the known spatial geometry of the microphone array.
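As one concrete instance of the arrival-time-difference step mentioned above, the following sketch estimates the delay between two microphone signals with the widely used GCC-PHAT cross-correlation; it is illustrative only and is not presented as the method of this application.

```python
import numpy as np

def gcc_phat_delay(x, y, fs):
    """Estimate the relative arrival-time difference between two microphone
    signals using GCC-PHAT, a common first step of time-difference-of-
    arrival localization. Returns the delay estimate in seconds."""
    n = len(x) + len(y)
    X, Y = np.fft.rfft(x, n), np.fft.rfft(y, n)
    cross = X * np.conj(Y)
    cross /= np.maximum(np.abs(cross), 1e-12)        # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n)
    # Rearrange so that zero lag sits in the middle, then find the peak.
    cc = np.concatenate((cc[-n // 2:], cc[:n // 2]))
    shift = np.argmax(np.abs(cc))
    return (shift - n // 2) / fs
```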
To illustrate the localization principle more clearly, the beamforming method is taken as an example of how a spatial noise source is located. Take the microphone array to be a linear array and the spatial noise source to be a far-field source, so that the sound waves incident on the array can be considered parallel. In such a parallel sound field, when the sound waves of the spatial noise source are incident perpendicular to the microphone plane of the array (e.g., the first microphone array 130 or the second microphone array 160), the incident waves reach all microphones in the array (e.g., the first microphone array 130 or the second microphone array 160) at the same time. In some embodiments, when the incidence angle is not perpendicular to the microphone plane, the incident waves reach the individual microphones with delays determined by the incidence angle. In some embodiments, the intensity of the superposed noise waveform differs with the incidence angle; for example, the superposed signal may be weak at an incidence angle of 0° and strongest at 45°. Because the superposed waveform intensity varies with the incidence angle, the microphone array exhibits polarity, from which a polar diagram of the array can be obtained. In some embodiments, the microphone array (e.g., the first microphone array 130 or the second microphone array 160) may be a directional array whose directivity is implemented by a time-domain algorithm (e.g., delay and superposition) or a frequency-domain phase-delay algorithm. In some embodiments, pointing in different directions may be achieved by controlling different delays. In some embodiments, the directional array is steerable and is equivalent to a spatial filter: the noise localization area is first divided into a grid; each microphone signal is then delayed in the time domain by the delay corresponding to each grid point; finally the delayed signals are superposed and the sound pressure of each grid point is computed, yielding the relative sound pressure of each grid point and thus the location of the spatial noise source.
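A minimal sketch of the grid-based steering just described is given below, assuming time-domain delay-and-sum with sample-accurate shifts; the names and grid layout are hypothetical, and the wrap-around effect of the circular shift is ignored for brevity.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def delay_and_sum_map(frames, mic_pos, grid, fs):
    """Steer a microphone array over a grid of candidate source points and
    return the beamformer output power at each point; the maximum marks the
    estimated spatial noise source. `frames` is a list of per-microphone
    signals, `mic_pos` and `grid` are lists of coordinate arrays."""
    powers = []
    for g in grid:
        delays = np.array([np.linalg.norm(g - m) / C for m in mic_pos])
        delays -= delays.min()                           # relative delays only
        shifted = [np.roll(f, -int(round(d * fs)))       # time-align each channel
                   for f, d in zip(frames, delays)]
        beam = np.mean(shifted, axis=0)                  # steer and sum
        powers.append(np.sum(beam ** 2))                 # output power at this grid point
    return np.array(powers)
```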
It should be noted that the above description of the process 500 is for purposes of illustration and description only, and is not intended to limit the scope of applicability of the present disclosure. Various modifications and changes to flow 500 will be apparent to those skilled in the art in light of the present description. For example, the process 500 may also include locating spatial noise sources, extracting parameter information for the spatial noise sources, and the like. For another example, step 510 and step 520 may be combined into one step. However, such modifications and variations are still within the scope of the present description.
Fig. 6 is an exemplary flow chart for determining spatial noise sources provided in accordance with some embodiments of the present description. As shown in fig. 6, the flow 600 may include:
In step 610, the picked-up ambient noise is divided into a plurality of sub-bands, each sub-band corresponding to a different frequency range.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, the ambient noise arriving from different directions around the user's body may differ in frequency, and when processing the ambient noise signal the signal processor 140 may divide its frequency band into a plurality of sub-bands, each corresponding to a different frequency range. The frequency range of each sub-band may be predetermined, for example, 80 Hz-100 Hz, 100 Hz-300 Hz, 300 Hz-800 Hz, etc. In some embodiments, each sub-band contains the parameter information of the ambient noise in its frequency band. For example, the signal processor 140 may divide the picked-up ambient noise into four sub-bands of 80 Hz-100 Hz, 100 Hz-300 Hz, 300 Hz-800 Hz, and 800 Hz-1000 Hz, each containing the parameter information of the ambient noise in the corresponding band.
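By way of example only, the sub-band division described above could be sketched with Butterworth band-pass filters using the example band edges from the text; the filter order and zero-phase filtering are illustrative choices, not part of the disclosure.

```python
import numpy as np
from scipy import signal

def split_subbands(x, fs, edges=(80, 100, 300, 800, 1000)):
    """Split picked-up ambient noise x into the example sub-bands above
    (80-100 Hz, 100-300 Hz, 300-800 Hz, 800-1000 Hz) with 4th-order
    Butterworth band-pass filters applied forward and backward."""
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = signal.butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        bands.append(signal.sosfiltfilt(sos, x))   # zero-phase band-pass
    return bands
```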
In step 620, on at least one subband, a spatial noise source corresponding thereto is determined.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, the signal processor 140 may perform signal analysis on the sub-bands divided by the environmental noise, obtain parameter information of the environmental noise corresponding to each sub-band, and determine a spatial noise source corresponding to each sub-band according to the parameter information. For example, on a sub-band of 300Hz-800Hz, the signal processor 140 may acquire parameter information (e.g., frequency information, amplitude information, phase information, etc.) of the corresponding ambient noise contained in the sub-band, and the signal processor 140 determines a spatial noise source corresponding to the sub-band of 300Hz-800Hz based on the acquired parameter information.
It should be noted that the above description of the process 600 is for purposes of example and illustration only and is not intended to limit the scope of applicability of the present disclosure. Various modifications and changes to flow 600 will be apparent to those skilled in the art in light of the present description. For example, step 610 and step 620 are combined. For another example, other steps are added to flow 600. However, such modifications and variations are still within the scope of the present description.
Fig. 7 is another exemplary flow chart for determining spatial noise sources provided in accordance with some embodiments of the present description. As shown in fig. 7, the flow 700 may include:
In step 710, a user head function is obtained, the user head function reflecting the reflection or absorption of sound by the user's head.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, the first microphone array 130 may include a first sub-microphone array and a second sub-microphone array located at the user's left and right ears, respectively. In some embodiments, the first microphone array 130 may be arranged in a bilateral mode, i.e., a mode in which the first sub-microphone array and the second sub-microphone array are enabled simultaneously. In some embodiments, with the first sub-microphone array at the user's left ear and the second sub-microphone array at the right ear, the user's head reflects or absorbs sound signals during their transmission, so the two sub-arrays pick up the same ambient noise differently. In some embodiments, the signal processor 140 may construct the user head function from the difference between the parameter information of the ambient noise picked up by the first sub-microphone array and that of the same ambient noise picked up by the second sub-microphone array; this function reflects the reflection and absorption of sound by the user's head.
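The construction just described could be sketched, in a deliberately simplified per-frequency form, as the complex ratio between the two sub-arrays' pickups of the same ambient noise; the single-frame estimate and regularized division below are illustrative assumptions, not the disclosed implementation.

```python
import numpy as np

def head_function(left, right, n_fft=1024, eps=1e-12):
    """Estimate a per-frequency user head function from one frame of the
    same ambient noise picked up at the left and right ears. Its magnitude
    and phase reflect how the user's head attenuates and delays sound
    travelling from one ear to the other."""
    L = np.fft.rfft(left, n_fft)
    R = np.fft.rfft(right, n_fft)
    return R * np.conj(L) / (np.abs(L) ** 2 + eps)   # regularized complex ratio R / L
```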
In step 720, on at least one sub-band, the corresponding spatial noise source is determined by combining the ambient noise picked up by the first sub-microphone array, the ambient noise picked up by the second sub-microphone array, and the user head function.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, in the bilateral mode there are amplitude and phase differences between the ambient noise signals picked up by the first sub-microphone array and those picked up by the second sub-microphone array. The signal processor 140 may perform frequency-bin synthesis on at least one sub-band of the ambient noise: using the user head function acquired in step 710 as prior information, it synthesizes, on the at least one sub-band, the frequency bins of the ambient noise picked up by the first sub-microphone array with the frequency bins of the ambient noise picked up by the second sub-microphone array on the corresponding sub-band. The parameter information contained in a sub-band after frequency-bin synthesis corresponds to the parameter information of a reconstructed virtual noise source. The signal processor 140 then determines the spatial noise source based on the parameter information of the reconstructed virtual source, thereby completing the spatial noise source localization.
In some embodiments, the first microphone array 130 may also be arranged in a single-sided mode, in which only the first sub-microphone array or only the second sub-microphone array is enabled. For example, in single-sided mode with the first sub-microphone array at the user's left ear enabled, the signal processor 140 may, on at least one sub-band of the ambient noise and using the user head function as prior information, synthesize the frequency bins of the ambient noise picked up by the first sub-microphone array on the corresponding sub-band. The parameter information contained in the synthesized sub-band corresponds to that of a reconstructed virtual noise source, and the signal processor 140 determines the spatial noise source from it, completing the localization.
In some embodiments, the first sub-microphone array may pick up the ambient noise arriving at the user's left ear, and the signal processor 140 may also estimate, from that parameter information and the user head function, the parameter information of the ambient noise arriving at the user's right ear; with the estimated right-ear information, the localization of the spatial noise source can be completed more accurately. In some embodiments, the single-sided mode of the first microphone array 130 may also mean that only one sub-microphone array is physically provided; the localization process in that case is similar to the single-sided mode obtained by enabling only the first sub-microphone array (or the second sub-microphone array), and is not repeated here.
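Continuing the simplified head-function sketch above, the single-sided estimate described here could look as follows: the measured head function is applied to the left-ear spectrum to approximate the noise arriving at the unobserved right ear. This is an illustrative sketch only, not the disclosed implementation.

```python
import numpy as np

def estimate_right_ear(left, H, n_fft=1024):
    """Approximate the ambient noise reaching the right ear from the
    left-ear pickup and a previously estimated head function H (see the
    sketch above), by filtering the left-ear spectrum with H."""
    L = np.fft.rfft(left, n_fft)
    return np.fft.irfft(L * H, n_fft)
```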
It should be noted that the above description of the process 700 is for purposes of illustration and description only, and is not intended to limit the scope of applicability of the present disclosure. Various modifications and changes to flow 700 will be apparent to those skilled in the art in light of the present description. However, such modifications and variations are still within the scope of the present description.
In some embodiments, the arrangement of the first sub-microphone array or the second sub-microphone array may be a regular geometric array. Fig. 8A is a schematic diagram of an arrangement of a first sub-microphone array according to some embodiments of the present application. As shown in fig. 8A, the first sub-microphone array forms a linear array. In some embodiments, the arrangement may take other shapes; for example, fig. 8B is a schematic diagram of the arrangement of a first sub-microphone array according to other embodiments of the present application, in which the first sub-microphone array forms a cross-shaped array. As another example, fig. 8C is a schematic diagram of an arrangement of a first sub-microphone array according to other embodiments of the present application, in which the first sub-microphone array forms a circular array. The arrangement of the first or second sub-microphone array is not limited to the linear, cross-shaped, and circular arrays shown in figs. 8A, 8B, and 8C, and may be an array of any other shape, for example, a triangular array, a spiral array, a planar array, a three-dimensional array, and the like. It should be noted that each short solid line in figs. 8A-8D may be regarded as one microphone or one group of microphones. In some embodiments, when each short solid line represents a group of microphones, the number of microphones in each group may be the same or different, their types may be the same or different, and their orientations may be the same or different; the types, numbers, orientations, and pitches of the microphones may thus be adjusted adaptively according to the actual situation.
In some embodiments, the arrangement of the first sub-microphone array or the second sub-microphone array may also be an array of irregular geometric shape. For example, fig. 8D is a schematic diagram of an arrangement of a first sub-microphone array according to other embodiments of the present application. As shown in fig. 8D, the first sub-microphone array forms an irregular array. It should be noted that the irregular arrangement of the first or second sub-microphone array is not limited to the shape shown in fig. 8D, and may be an array of any other irregular pattern, for example, an irregular polygon, which is not limited by this specification.
In some embodiments, the microphones in the first sub-microphone array (or the second sub-microphone array) may be uniformly distributed, where uniform distribution refers to the same spacing between the microphones in the first sub-microphone array (or the second sub-microphone array). In some embodiments, the microphones in the first sub-microphone array (or the second sub-microphone array) may also be unevenly distributed, where uneven distribution refers to a difference in spacing between the microphones in the first sub-microphone array (or the second sub-microphone array). The distance between the microphone array elements in the sub-microphone array can be adaptively adjusted according to practical situations, which is not limited in this specification.
Fig. 9A is a schematic diagram of a positional relationship of a first sub-microphone array and a second sub-microphone array provided in accordance with some embodiments of the present application. As shown in fig. 9A, the first sub-microphone array 911 is located at the left ear of the user, and the first sub-microphone array 911 is arranged in an approximately triangular shape. The second sub-microphone array 912 is located at the right ear of the user, and the second sub-microphone array 912 is also arranged in an approximately triangular shape, and the second sub-microphone array 912 is arranged in the same manner as the first sub-microphone array 911 and is symmetrically distributed about the head of the user. Referring to fig. 9A, an extension line of the first sub-microphone array 911 in the array direction and an extension line of the second sub-microphone array 912 in the array direction intersect, and may constitute a quadrangular structure.
Fig. 9B is a schematic diagram of a positional relationship of a first sub-microphone array and a second sub-microphone array provided according to other embodiments of the present application. As shown in fig. 9B, the first sub-microphone array 921 is located at the left ear of the user, and the first sub-microphone array 921 is arranged in a line shape. The second sub-microphone array 922 is located at the right ear of the user, the second sub-microphone array 922 is arranged in an approximately triangular shape, and the second sub-microphone array 922 is arranged in a different manner from the first sub-microphone array 921 and is asymmetrically distributed with respect to the head of the user. Referring to fig. 9B, an extension line of the first sub-microphone array 921 along the array direction intersects with an extension line of the second sub-microphone array 922 along the array direction, and may be configured as a triangle structure.
In some embodiments, the first sub-microphone array 921 and the second sub-microphone array 922 may form regular and/or irregular shapes such as a figure-eight, circle, ellipse, ring, or polygon, in addition to the quadrilateral shown in fig. 9A and the triangle shown in fig. 9B. Distributing the first and second sub-microphone arrays in a specific planar or three-dimensional shape allows the ambient noise in all directions around the user to be captured comprehensively, so that the spatial noise sources can be located more accurately from the parameter information obtained by each microphone, and the noise field at the user's ear canal can in turn be simulated more accurately, achieving a better noise reduction effect. Different arrangements of the first and second sub-microphone arrays have different spatial-filtering performance. By way of example only, spatial-filtering performance may be characterized by the main lobe width and the side lobe (also called secondary lobe) height. The main lobe is the beam of maximum acoustic radiation; side lobes are the radiated beams other than the maximum. The narrower the main lobe, the higher the resolution and the better the directivity of the microphone array; the lower the side lobes, the better the array's interference rejection, and the higher the side lobes, the worse it is. For example, with the same number of array elements, the beam pattern of a cross-shaped array has a narrower main lobe than that of a circular, rectangular, or spiral array, i.e., the cross-shaped array has higher spatial resolution and better directivity; in terms of side lobe height, however, the side lobes of the cross-shaped array are higher than those of circular, rectangular, or spiral arrays, i.e., its interference rejection is worse. The arrangement of the first and second sub-microphone arrays can be adjusted adaptively to the practical application and is not further limited here. It should be noted that each short solid line in figs. 9A and 9B may be regarded as one microphone or one group of microphones; when each short solid line represents a group, the number, types, and orientations of the microphones in each group may be the same or different, and the types, numbers, orientations, and pitches of the microphones may be adjusted adaptively according to the actual situation. In some embodiments, a spatial super-resolution image of the ambient noise may be formed by methods such as synthetic aperture, sparse recovery, or co-prime arrays; such an image reflects the signal reflection map of the ambient noise and can further improve the localization accuracy of the spatial noise sources. In some embodiments, the position, spacing, on-off state, etc. of the microphones in the microphone arrays (e.g., the first microphone array 130 and the second microphone array 160) may be adjusted based on feedback of the localization accuracy of the spatial noise sources.
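The main-lobe and side-lobe trade-off discussed above can be made concrete with a beam pattern computation; the sketch below evaluates the far-field array factor of a uniform line array, which is standard array theory rather than material from the disclosure, and the element count, spacing, and frequency are hypothetical inputs. Plotting the returned dB values against angle shows the main lobe at broadside and the side lobes on either side; the main lobe narrows as the element count or spacing grows.

```python
import numpy as np

def line_array_pattern(n_mics, spacing, f, angles_deg, c=343.0):
    """Far-field beam pattern (array factor) of a uniform, unweighted line
    array steered broadside, returned in dB relative to the peak. Main-lobe
    width and side-lobe height are the spatial-filtering measures above."""
    angles = np.deg2rad(np.asarray(angles_deg))
    k = 2 * np.pi * f / c                      # acoustic wavenumber
    m = np.arange(n_mics)
    # Per-element phase for a plane wave from each angle, summed coherently.
    response = np.exp(1j * k * spacing * np.outer(np.sin(angles), m)).sum(axis=1)
    return 20 * np.log10(np.abs(response) / n_mics)
```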
In some embodiments, the first microphone array 130 may include a noise microphone configured to pick up the spatial noise at the user's ear canal. While doing so, the noise microphone would also pick up the noise-reducing sound waves output by the speaker array 150, which is undesirable. Accordingly, the noise microphone may be disposed at the acoustic null of an acoustic dipole formed by the speaker array 150, minimizing the noise-reducing sound waves it picks up. In some embodiments, the at least one speaker array 150 forms at least one set of acoustic dipoles, and the noise microphone is located at an acoustic null of the dipole's radiated sound field. In some embodiments, the sound signals output by any two speakers in the speaker array 150 may be regarded as two point sources radiating outward with equal amplitude and opposite phase. The two speakers form an acoustic dipole, or an approximation of one, whose outward radiation is strongly directional, producing a figure-eight radiation pattern: the radiated sound is greatest along the line connecting the two speakers, clearly reduced in other directions, and smallest on the perpendicular bisector of that line. In some embodiments, the sound signal output by a single speaker in the speaker array 150 may also be regarded as a dipole; for example, the approximately opposite-phase, approximately equal-amplitude sound signals radiated from the front and back of one speaker's diaphragm may be regarded as two point sources.
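The figure-eight pattern and its null can be checked numerically with two opposite-phase point sources; the following sketch is a textbook free-field model, not the disclosed speaker design. For any point equidistant from src1 and src2 (i.e., on the perpendicular bisector), the returned pressure is zero, which is where the noise microphone is placed.

```python
import numpy as np

def dipole_pressure(point, src1, src2, f, c=343.0):
    """Complex pressure radiated by two point sources of equal amplitude and
    opposite phase (an acoustic dipole) evaluated at `point`. On the
    perpendicular bisector of the line joining src1 and src2 the two
    contributions cancel exactly: the acoustic null."""
    p = 0j
    for src, sign in ((src1, 1.0), (src2, -1.0)):       # opposite-phase pair
        r = np.linalg.norm(np.asarray(point) - np.asarray(src))
        p += sign * np.exp(-2j * np.pi * f * r / c) / max(r, 1e-6)
    return p
```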
In some embodiments, the ambient noise signal that a microphone array (e.g., the first microphone array 130) would pick up at the acoustic null position may also be obtained algorithmically. For example, in some embodiments one or more microphones in the first microphone array 130 may be pre-positioned at the acoustic null of the acoustic dipole formed by the speaker array 150 for a particular frequency band. The particular frequency band may be one that plays a key role in speech intelligibility, e.g., 500 Hz-1500 Hz. In some embodiments, the signal processor 140 calculates and pre-stores compensation parameters for the particular frequency band based on the acoustic dipole positions (i.e., the positions of the two speakers constituting the dipole) and the acoustic transfer function. The signal processor 140 may apply amplitude and/or phase compensation, according to the pre-stored compensation parameters, to the ambient noise picked up by the remaining microphones (i.e., those not disposed at the acoustic null); the compensated signal is then equivalent to the ambient noise signal that a noise microphone at the acoustic null would pick up. It should be noted that the microphones in the microphone array (e.g., the first microphone array 130) need not be disposed at the acoustic null of the acoustic dipole formed by the speaker array 150. For example, in some embodiments the signal processor 140 may separate and extract the noise at the first spatial location picked up by the microphone array according to the statistical distributions and structural characteristics of different noise types in different dimensions (e.g., spatial domain, time domain, frequency domain, etc.), obtaining the different noise types (e.g., of different frequencies, different phases, etc.) and cancelling out the noise-reducing sound waves emitted by the speaker array 150 that the microphone array picked up.
Fig. 10 is an exemplary flow chart of estimating noise for a first spatial location provided in accordance with some embodiments of the present description. As shown in fig. 10, the process 1000 may include:
In step 1010, components associated with the signal picked up by the bone conduction microphone are removed from the picked-up ambient noise in order to update the ambient noise.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, when the microphone arrays (e.g., the first microphone array 130 and the second microphone array 160) pick up ambient noise, the sound of the user's own speech is also picked up, i.e., it is treated as part of the ambient noise. In that case the noise-reducing sound waves output by the speaker array 150 would cancel the sound of the user's own speech. In some embodiments, the user's own speech needs to be preserved in certain scenarios, such as voice calls or sending voice messages. In some embodiments, when the user wears the open earphone 100 to make a voice call or record voice information, a bone conduction microphone may pick up the user's speech via the vibrations of the facial bones or muscles produced while speaking, and transmit it to the signal processor 140. The signal processor 140 obtains the parameter information of the sound signal picked up by the bone conduction microphone, finds the sound signal components associated with it in the ambient noise picked up by the microphone arrays (e.g., the first microphone array 130 and the second microphone array 160), removes those components, and updates the ambient noise from the remaining picked-up signal. The updated ambient noise no longer contains the user's own speech; that is, the user's speech is preserved during a voice call.
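One conventional way to realize the removal described above is an adaptive filter driven by the bone-conduction signal as a reference; the LMS sketch below is illustrative only, with a hypothetical tap count and a step size that would need tuning to the signal scale (a normalized variant would be more robust).

```python
import numpy as np

def lms_remove(air, bone, n_taps=64, mu=0.01):
    """Remove the component of the air-microphone pickup that is correlated
    with the bone-conduction signal (the user's own speech) using an LMS
    adaptive filter; the residual is the updated ambient noise."""
    w = np.zeros(n_taps)
    out = np.zeros(len(air))
    for n in range(n_taps, len(air)):
        ref = bone[n - n_taps:n][::-1]          # most recent reference samples
        est = w @ ref                           # estimated speech component
        e = air[n] - est                        # residual = updated ambient noise
        w += mu * e * ref                       # LMS weight update
        out[n] = e
    return out
```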
In step 1020, noise at the first spatial location is estimated from the updated ambient noise.
In some embodiments, this step may be performed by the signal processor 140. In some embodiments, the signal processor 140 may estimate the noise of the first spatial location based on the updated ambient noise. For a detailed description of estimating the noise of the first spatial location according to the environmental noise, reference may be made to fig. 2 of the present specification and the related description thereof, and details thereof will not be repeated herein.
It should be noted that the above description of the process 1000 is for illustration and description only, and is not intended to limit the scope of applicability of the present disclosure. Various modifications and changes to flow 1000 may be made by those skilled in the art under the guidance of this specification. For example, it is also possible to pre-process components associated with the signals picked up by the bone conduction microphone and transmit the signals picked up by the bone conduction microphone as audio signals to the terminal device. However, such modifications and variations are still within the scope of the present description.
In some embodiments, the at least one microphone array may include bone conduction microphones and air conduction microphones, and the signal processor 140 may control their on-off states based on the operating state of the open earphone 100. In some embodiments, the operating state of the open earphone 100 refers to the state in which the user is using it, and may include, but is not limited to, a music playing state, a voice call state, a voice transmission state, and the like. In some embodiments, when the microphone array picks up ambient noise, the on-off states of the bone conduction microphones and the air conduction microphones in the array may be determined according to the operating state of the open earphone 100. For example, when the user wears the open earphone 100 to play music, the bone conduction microphone may be in the standby state and the air conduction microphone in the working state. As another example, when the user wears the open earphone 100 for voice transmission, both the bone conduction microphone and the air conduction microphone may be in the working state. In some embodiments, the signal processor 140 is coupled to the microphone array and can control the on-off states of the microphones (e.g., bone conduction microphones and air conduction microphones) by sending control signals.
In some embodiments, the operating states of the open earphone 100 may include a talking state and a non-talking state. In some embodiments, when the open earphone 100 is in the non-talking state, the signal processor 140 may control the bone conduction microphone to be in the standby state. For example, in the non-talking state the sound of the user's own speech may itself be regarded as ambient noise; in that case the user's speech contained in the ambient noise picked up by the microphone array need not be filtered out, so it too is cancelled by the noise-reducing sound waves output by the speaker array 150.
In some embodiments, when the open earphone 100 is in the talking state, the signal processor 140 may control the bone conduction microphone to be in the working state. For example, with the open earphone 100 in a call, the signal processor 140 may send a control signal putting the bone conduction microphone into the working state; the bone conduction microphone picks up the user's speech, and the signal processor 140 finds and removes the components associated with it from the ambient noise picked up by the microphone array, so that the user's speech is not cancelled by the noise-reducing sound waves output by the speaker array 150, ensuring a normal call.
In some embodiments, when the open earphone 100 is in the talking state and the sound pressure level of the ambient noise is greater than a preset threshold, the signal processor 140 may control the bone conduction microphone to remain in the working state. In some embodiments, the sound pressure level of the ambient noise reflects its intensity. The preset threshold may be a value stored in advance in the open earphone 100, for example, 50 dB, 60 dB, 70 dB, or any other value. In some embodiments, when the sound pressure level of the ambient noise exceeds the preset threshold, the ambient noise may degrade the user's call quality. The signal processor 140 may send a control signal keeping the bone conduction microphone in the working state; the bone conduction microphone acquires the vibration signal of the facial muscles while the user speaks, without picking up external ambient noise, and this vibration signal is used as the user's speech signal, ensuring a normal call.
In some embodiments, when the open earphone 100 is in the talking state and the sound pressure level of the ambient noise is less than the preset threshold, the signal processor 140 may control the bone conduction microphone to switch from the working state to the standby state. In some embodiments, when the ambient noise is below the preset threshold it is weaker than the sound signal produced by the user's speech; even after the noise-reducing sound waves output by the speaker array 150 cancel part of that speech, the remainder still meets the call standard, so a normal call is assured. In this case the signal processor 140 may send a control signal switching the bone conduction microphone from the working state to the standby state, reducing both the signal-processing complexity and the power consumption of the open earphone 100.
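The on-off behavior described in the last few paragraphs reduces to a small decision rule; the sketch below captures it with a hypothetical 60 dB default threshold (one of the example values in the text) and simple string states, purely for illustration.

```python
def control_bone_mic(in_call: bool, ambient_spl_db: float,
                     threshold_db: float = 60.0) -> str:
    """Decide the bone conduction microphone state: standby outside a call;
    during a call, working only when ambient noise exceeds the threshold,
    otherwise standby (the air microphones then carry the call alone)."""
    if not in_call:
        return "standby"
    return "working" if ambient_spl_db > threshold_db else "standby"
```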
In some embodiments, the open earphone 100 may further include an adjustment module for adjusting the sound pressure level of the noise-reducing sound waves. In some embodiments, the adjustment module may include buttons, a voice assistant, gesture sensors, and the like. The user can change the noise reduction mode of the open earphone 100 by operating the adjustment module. Specifically, the user can adjust (e.g., amplify or attenuate) the amplitude information of the noise reduction signal, changing the sound pressure level of the noise-reducing sound waves emitted by the speaker array and thereby achieving different noise reduction effects. By way of example only, in some embodiments the noise reduction modes may include a strong noise reduction mode, a medium noise reduction mode, a weak noise reduction mode, and so on. For example, when the user wears the open earphone 100 indoors where the external ambient noise is low, the user may turn noise reduction off or set it to the weak mode through the adjustment module. As another example, when the user wears the open earphone 100 while walking in public places such as a street, the user needs to retain some awareness of the surroundings while listening to audio signals (e.g., music or voice messages) in order to respond to emergencies, and may select the medium mode through the adjustment module (e.g., a button or the voice assistant) to preserve ambient sounds (e.g., alarms, impacts, car horns, etc.). As another example, when riding a subway or an airplane, the user may select the strong mode through the adjustment module to further reduce the surrounding noise. In some embodiments, the signal processor 140 may also send a prompt, based on the range of the ambient noise intensity, to the open earphone 100 or to a terminal device (e.g., a mobile phone or smart watch) communicatively connected to it, prompting the user to adjust the noise reduction mode.
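By way of example only, the amplitude adjustment behind these modes could be sketched as a per-mode gain applied to the noise reduction signal; the gain values below are illustrative assumptions, not values from the disclosure.

```python
import numpy as np

def apply_noise_reduction_mode(noise_reduction_signal: np.ndarray, mode: str) -> np.ndarray:
    """Scale the amplitude of the noise reduction signal per mode, which in
    turn changes the sound pressure level of the emitted noise-reducing
    sound waves. Gains are hypothetical placeholders."""
    gains = {"strong": 1.0, "medium": 0.6, "weak": 0.3, "off": 0.0}
    return gains[mode] * noise_reduction_signal
```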
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is by way of example only and is not intended to be limiting. Although not explicitly stated herein, various modifications, improvements, and adaptations of the present application may occur to those skilled in the art. Such modifications, improvements, and adaptations are suggested by this application and thus remain within the spirit and scope of its exemplary embodiments.
Meanwhile, the present application uses specific words to describe embodiments of the present application. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present application. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present application may be combined as suitable.
Furthermore, those skilled in the art will appreciate that the various aspects of the invention are illustrated and described in the context of a number of patentable categories or circumstances, including any novel and useful procedures, machines, products, or materials, or any novel and useful modifications thereof. Accordingly, aspects of the present application may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.) or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," module, "" engine, "" unit, "" component, "or" system. Furthermore, aspects of the present application may take the form of a computer product, comprising computer-readable program code, embodied in one or more computer-readable media.
The computer storage medium may contain a propagated data signal with the computer program code embodied therein, for example, on a baseband or as part of a carrier wave. The propagated signal may take on a variety of forms, including electro-magnetic, optical, etc., or any suitable combination thereof. A computer storage medium may be any computer readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated through any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or a combination of any of the foregoing.
Furthermore, the order in which the elements and sequences are presented, the use of numerical letters, or other designations are used in the application and are not intended to limit the order in which the processes and methods of the application are performed unless explicitly recited in the claims. While certain presently useful inventive embodiments have been discussed in the foregoing disclosure, by way of various examples, it is to be understood that such details are merely illustrative and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements included within the spirit and scope of the embodiments of the present application. For example, while the system components described above may be implemented by hardware devices, they may also be implemented solely by software solutions, such as installing the described system on an existing server or mobile device.
Likewise, it should be noted that, to simplify the presentation of this disclosure and thereby aid the understanding of one or more inventive embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof. This method of disclosure, however, does not imply that the claimed subject matter requires more features than are recited in the claims. Indeed, claimed subject matter may lie in less than all features of a single embodiment disclosed above.
In some embodiments, numbers are used to describe quantities of components and attributes; it should be understood that such numbers used in describing embodiments are in some instances qualified by the modifiers "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that a variation of ±20% is allowed. Accordingly, in some embodiments the numerical parameters set forth in the specification and claims are approximations that may vary with the desired properties of individual embodiments. In some embodiments, numerical parameters should take the specified significant digits into account and apply ordinary rounding. Although the numerical ranges and parameters set forth in some embodiments of this application are approximations used to confirm the breadth of their ranges, in particular embodiments such values are set as precisely as practicable.
Each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this application is hereby incorporated by reference in its entirety, except for any application history that is inconsistent with or conflicts with the content of this application, and except for any document that would limit the broadest scope of the claims of this application (whether currently or later appended). If the descriptions, definitions, and/or terms used in the materials accompanying this application are inconsistent with or conflict with the content of this application, the descriptions, definitions, and/or terms of this application shall prevail.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present application. Other variations are also possible within the scope of this application. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present application may be considered in keeping with the teachings of the present application. Accordingly, embodiments of the present application are not limited to only the embodiments explicitly described and depicted herein.

Claims (15)

  1. An open earphone, comprising:
    a securing structure configured to secure the earphone in a position near the user's ear and not occluding the user's ear canal;
    a housing structure configured to carry:
    a first microphone array configured to pick up ambient noise;
    at least one speaker array; and
    a signal processor configured to: estimate noise at a first spatial location based on the picked-up ambient noise, the first spatial location being closer to the user's ear canal than any microphone of the first microphone array; and
    generate a noise reduction signal based on the noise at the first spatial location, such that the at least one speaker array outputs, in accordance with the noise reduction signal, noise-reducing sound waves for canceling the ambient noise delivered to the user's ear canal.
  2. The open earphone of claim 1, wherein:
    the housing structure is further configured to carry a second microphone array configured to pick up the ambient noise and the noise reduction sound waves, the second microphone array being at least partially distinct from the first microphone array; and
    the signal processor is configured to update the noise reduction signal based on sound signals picked up by the second microphone array.
  3. The open earphone of claim 2, wherein the updating the noise reduction signal based on the sound signals picked up by the second microphone array comprises:
    estimating a sound field at the user's ear canal based on the sound signals picked up by the second microphone array; and
    adjusting parameter information of the noise reduction signal according to the sound field at the user's ear canal.
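
For illustration only (not part of the claims): a common way to adjust a noise reduction signal from an error sensed near the ear canal is an adaptive filter; a minimal normalized-LMS step is sketched below. The step size, buffer layout, and sign convention are illustrative assumptions.

    import numpy as np

    def nlms_step(w, x_buf, residual, mu=0.1, eps=1e-8):
        # w: FIR weights shaping the noise reduction signal.
        # x_buf: most recent reference-noise samples (first microphone array).
        # residual: error sensed near the ear canal by the second array.
        y = np.dot(w, x_buf)  # current anti-noise sample
        w = w + mu * residual * x_buf / (np.dot(x_buf, x_buf) + eps)
        return w, y
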
  4. The open earphone of claim 2, wherein the signal processor is further configured to:
    acquire a user input; and
    adjust parameter information of the noise reduction signal according to the user input.
  5. The open earphone of claim 2, wherein the second microphone array comprises a microphone that is closer to the user's ear canal than any microphone in the first microphone array.
  6. The open earphone of claim 1, wherein the signal processor estimating the noise at the first spatial location based on the picked-up ambient noise comprises:
    performing signal separation on the picked-up ambient noise to acquire parameter information corresponding to the ambient noise, and generating the noise reduction signal based on the parameter information.
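
For illustration only (not part of the claims): parameter information such as per-component frequency, amplitude, and phase can be obtained from a spectral peak pick of one noise frame; a minimal version follows. Peak picking as the separation method is an assumption, not stated by the application, and the amplitude formula is exact only for tones aligned to FFT bins.

    import numpy as np

    def noise_parameters(frame, fs, n_peaks=3):
        # Return (frequency, amplitude, phase) for the strongest peaks.
        spec = np.fft.rfft(frame)
        freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
        idx = np.argsort(np.abs(spec))[-n_peaks:][::-1]
        return [(freqs[i], 2.0 * np.abs(spec[i]) / len(frame),
                 np.angle(spec[i])) for i in idx]
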
  7. The open earphone of claim 1, wherein the signal processor estimating the noise at the first spatial location based on the picked-up ambient noise comprises:
    determining one or more spatial noise sources related to the picked-up ambient noise; and
    estimating the noise at the first spatial location based on the one or more spatial noise sources.
  8. The open earphone of claim 7, wherein the determining one or more spatial noise sources related to the picked-up ambient noise comprises:
    dividing the picked-up ambient noise into a plurality of sub-bands, each sub-band corresponding to a different frequency range; and
    determining, on at least one of the sub-bands, a spatial noise source corresponding thereto.
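
For illustration only (not part of the claims): sub-band division can be done by masking FFT bins, as sketched below; the band edges are placeholders, not values from the application.

    import numpy as np

    def split_subbands(frame, fs, edges=(20, 250, 1000, 4000, 8000)):
        # Split one noise frame into sub-bands by zeroing FFT bins
        # outside each band and inverse-transforming.
        spec = np.fft.rfft(frame)
        freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
        bands = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (freqs >= lo) & (freqs < hi)
            bands.append(np.fft.irfft(np.where(mask, spec, 0.0),
                                      n=len(frame)))
        return bands
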
  9. The open earphone of claim 8, wherein the first microphone array comprises a first sub-microphone array and a second sub-microphone array located at the left ear and the right ear of the user, respectively, and the determining the spatial noise source corresponding to the at least one sub-band comprises:
    acquiring a user head function, the user head function reflecting the reflection or absorption of sound by the user's head; and
    determining, on the at least one sub-band, the spatial noise source by combining the ambient noise picked up by the first sub-microphone array, the ambient noise picked up by the second sub-microphone array, and the user head function.
  10. The open earphone of claim 7, wherein the determining one or more spatial noise sources related to the picked-up ambient noise comprises:
    locating the one or more spatial noise sources by one or more of beamforming, super-resolution spatial spectrum estimation, or time difference of arrival estimation.
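
For illustration only (not part of the claims): of the three localization options recited, time difference of arrival is the simplest to show; a GCC-PHAT sketch for a two-microphone pair follows. The 15 cm spacing is an assumed placeholder.

    import numpy as np

    def gcc_phat_tdoa(sig_a, sig_b, fs, mic_dist=0.15, c=343.0):
        # Time difference of arrival via GCC-PHAT, converted to an
        # arrival angle for a two-microphone pair.
        n = 2 * len(sig_a)
        r = np.fft.rfft(sig_a, n) * np.conj(np.fft.rfft(sig_b, n))
        cc = np.fft.irfft(r / (np.abs(r) + 1e-12), n)
        max_shift = max(1, int(fs * mic_dist / c))
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        tdoa = (np.argmax(np.abs(cc)) - max_shift) / fs
        angle = np.degrees(np.arcsin(np.clip(tdoa * c / mic_dist,
                                             -1.0, 1.0)))
        return tdoa, angle
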
  11. The open earphone of claim 1, wherein the first microphone array comprises a noise microphone, the at least one speaker array forms at least one set of acoustic dipoles, and the noise microphone is located at an acoustic null of the sound field radiated by the acoustic dipoles.
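
For illustration only (not part of the claims): modeling the speaker pair as two anti-phase point sources shows why a noise microphone placed where the two path lengths are equal picks up (ideally) none of the speakers' own output. The free-field point-source model is an assumption; `point`, `src_a`, and `src_b` are 3-vector positions.

    import numpy as np

    def dipole_pressure(point, src_a, src_b, freq, c=343.0):
        # Complex pressure from two anti-phase point sources (a minimal
        # acoustic-dipole model); the pressure vanishes wherever the two
        # path lengths are equal, i.e. on the mid-perpendicular plane.
        k = 2.0 * np.pi * freq / c
        ra = np.linalg.norm(point - src_a)
        rb = np.linalg.norm(point - src_b)
        return np.exp(-1j * k * ra) / ra - np.exp(-1j * k * rb) / rb
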
  12. The open earphone of claim 1, wherein the first microphone array comprises a bone conduction microphone configured to pick up the user's speech, and the signal processor estimating the noise at the first spatial location based on the picked-up ambient noise comprises:
    removing, from the picked-up ambient noise, components associated with the signal picked up by the bone conduction microphone to update the ambient noise; and
    estimating the noise at the first spatial location according to the updated ambient noise.
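
For illustration only (not part of the claims): one way to remove components correlated with the bone conduction pickup is a per-frame least-squares projection in the frequency domain, as below. The single-coefficient model is an assumption (a real device would likely use a multi-tap adaptive filter); both frames are assumed to be the same length.

    import numpy as np

    def remove_own_voice(ambient, bone, alpha=1.0):
        # Suppress the part of the ambient pickup correlated with the
        # bone-conduction signal via a least-squares projection.
        a = np.fft.rfft(ambient)
        b = np.fft.rfft(bone)
        h = np.sum(a * np.conj(b)) / (np.sum(np.abs(b) ** 2) + 1e-12)
        return np.fft.irfft(a - alpha * h * b, n=len(ambient))
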
  13. The open earphone of claim 1, wherein the first microphone array comprises a bone conduction microphone and an air conduction microphone, and the signal processor controls on/off states of the bone conduction microphone and the air conduction microphone based on a working state of the earphone.
  14. The open earphone of claim 13, wherein the working states of the earphone include a talking state and a non-talking state,
    if the working state of the earphone is the non-talking state, the signal processor controls the bone conduction microphone to be in a standby state; and
    if the working state of the earphone is the talking state, the signal processor controls the bone conduction microphone to be in a working state.
  15. The open earphone of claim 14, wherein when the working state of the earphone is the talking state, if the sound pressure level of the ambient noise is greater than a preset threshold, the signal processor controls the bone conduction microphone to remain in the working state; and
    if the sound pressure level of the ambient noise is less than the preset threshold, the signal processor controls the bone conduction microphone to switch from the working state to the standby state.
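
For illustration only (not part of the claims): claims 13-15 describe a small decision rule; a direct transcription follows. The 65 dB threshold is a placeholder, not a value from the application, and the frame is assumed to be calibrated in pascals.

    import numpy as np

    def sound_pressure_level(frame, p_ref=20e-6):
        # dB SPL of a calibrated pressure frame.
        rms = np.sqrt(np.mean(frame ** 2))
        return 20.0 * np.log10(max(rms, 1e-12) / p_ref)

    def bone_mic_state(talking, ambient_frame, threshold_db=65.0):
        # Keep the bone conduction microphone working only while talking
        # in ambient noise above the threshold; otherwise standby.
        if not talking:
            return "standby"
        if sound_pressure_level(ambient_frame) > threshold_db:
            return "working"
        return "standby"
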
CN202180099448.1A 2021-04-25 2021-04-25 Open earphone Pending CN117501710A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/089670 WO2022226696A1 (en) 2021-04-25 2021-04-25 Open earphone

Publications (1)

Publication Number Publication Date
CN117501710A true CN117501710A (en) 2024-02-02

Family

ID=83665731

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202180099448.1A Pending CN117501710A (en) 2021-04-25 2021-04-25 Open earphone
CN202110486203.6A Pending CN115240697A (en) 2021-04-25 2021-04-30 Acoustic device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202110486203.6A Pending CN115240697A (en) 2021-04-25 2021-04-30 Acoustic device

Country Status (3)

Country Link
CN (2) CN117501710A (en)
TW (1) TW202242856A (en)
WO (2) WO2022226696A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115615624B (en) * 2022-12-13 2023-03-31 杭州兆华电子股份有限公司 Equipment leakage detection method and system based on unmanned inspection device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8649526B2 (en) * 2010-09-03 2014-02-11 Nxp B.V. Noise reduction circuit and method therefor
CN102348151B (en) * 2011-09-10 2015-07-29 歌尔声学股份有限公司 Noise canceling system and method, intelligent control method and device, communication equipment
CN108668188A (en) * 2017-03-30 2018-10-16 天津三星通信技术研究有限公司 The method and its electric terminal of the active noise reduction of the earphone executed in electric terminal
CN107346664A (en) * 2017-06-22 2017-11-14 河海大学常州校区 A kind of ears speech separating method based on critical band
CN107452375A (en) * 2017-07-17 2017-12-08 湖南海翼电子商务股份有限公司 Bluetooth earphone
US10706868B2 (en) * 2017-09-06 2020-07-07 Realwear, Inc. Multi-mode noise cancellation for voice detection
JP6972814B2 (en) * 2017-09-13 2021-11-24 ソニーグループ株式会社 Earphone device, headphone device and method
KR102406572B1 (en) * 2018-07-17 2022-06-08 삼성전자주식회사 Method and apparatus for processing audio signal
CN111935589B (en) * 2020-09-28 2021-02-12 深圳市汇顶科技股份有限公司 Active noise reduction method and device, electronic equipment and chip

Also Published As

Publication number Publication date
TW202242856A (en) 2022-11-01
WO2022227056A1 (en) 2022-11-03
CN115240697A (en) 2022-10-25
WO2022226696A1 (en) 2022-11-03

Similar Documents

Publication Publication Date Title
CN108600907B (en) Method for positioning sound source, hearing device and hearing system
US10321241B2 (en) Direction of arrival estimation in miniature devices using a sound sensor array
CN107690119B (en) Binaural hearing system configured to localize sound source
EP2928214B1 (en) A binaural hearing assistance system comprising binaural noise reduction
US11328702B1 (en) Acoustic devices
US9980055B2 (en) Hearing device and a hearing system configured to localize a sound source
US9439005B2 (en) Spatial filter bank for hearing system
US20170295436A1 (en) Hearing aid comprising a directional microphone system
EP3883266A1 (en) A hearing device adapted to provide an estimate of a user's own voice
CN117501710A (en) Open earphone
WO2023087565A1 (en) Open acoustic apparatus
CN116156372A (en) Acoustic device and transfer function determining method thereof
US11689845B2 (en) Open acoustic device
RU2800546C1 (en) Open acoustic device
US11743661B2 (en) Hearing aid configured to select a reference microphone
US20230054213A1 (en) Hearing system comprising a database of acoustic transfer functions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination