CN115278493A - Hearing device with omnidirectional sensitivity - Google Patents


Info

Publication number
CN115278493A
CN115278493A
Authority
CN
China
Prior art keywords
input signal, power, gain value, signal, value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210449900.9A
Other languages
Chinese (zh)
Inventor
马长学
Current Assignee
GN Hearing AS
Original Assignee
GN Hearing AS
Priority date
Filing date
Publication date
Priority claimed from US17/244,756 external-priority patent/US11617037B2/en
Application filed by GN Hearing AS filed Critical GN Hearing AS
Publication of CN115278493A

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04R — LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00 — Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
    • H04R25/50 — Customised settings for obtaining desired overall acoustical characteristics
    • H04R25/55 — Deaf-aid sets using an external connection, either wireless or wired
    • H04R25/552 — Binaural
    • H04R2225/00 — Details of deaf aids covered by H04R25/00, not provided for in any of its subgroups
    • H04R2225/43 — Signal processing in hearing aids to enhance the speech intelligibility


Abstract

The invention relates to a hearing device and a method performed by a hearing device. A method performed by a first hearing device (100) comprising a microphone configured to generate a first input signal (l), a communication unit (120) configured to receive a second input signal (r) from a second hearing device, an output unit (140) and a processor, the method comprising the steps of: generating a first intermediate signal (v) comprising or being based on a first weighted combination of the first input signal (l) and the second input signal (r), wherein the first weighted combination is based on the first gain value (α) and/or the second gain value (1-α); and generating an output signal for the output unit based on the first intermediate signal; wherein one or both of the first gain value (α) and the second gain value (1-α) are determined with the aim of having the power of the first input signal (l) and the power of the second input signal (r) differ in the weighted combination by a preset power level difference (d) of more than 2 dB.

Description

Hearing device with omnidirectional sensitivity
Technical Field
The invention relates to a hearing device and a method performed by a hearing device. At least one embodiment described herein relates to a method performed by a first hearing device comprising a first input unit comprising one or more microphones and configured to generate a first input signal, a communication unit configured to receive a second input signal from a second hearing device, an output unit, and a processor coupled to the first input unit, the communication unit, and the output unit.
Background
A person with normal hearing can typically selectively focus on a particular speaker to achieve speech intelligibility and maintain situational awareness in noisy listening conditions such as restaurants, bars, concert venues, and the like. In the field of hearing aids, this is sometimes referred to as a so-called cocktail party scenario.
A person with normal hearing can naturally apply a better-ear listening strategy, where the person focuses his or her attention on the speech signal at the ear with the best signal-to-noise ratio for the target talker or speaker (i.e., the desired sound source). This natural better-ear listening strategy may also allow monitoring of off-axis, unattended talkers through cognitive filtering mechanisms such as selective attention.
In contrast, listening to a specific, desired sound source in such a noisy sound environment while maintaining environmental awareness by monitoring off-axis or unattended talkers remains a challenging task for the hearing impaired individual. It is therefore desirable to provide similar listening capabilities to hearing impaired individuals, for example by exploiting the spatial filtering capabilities of existing, well-known binaural hearing systems. However, the use of binaural hearing systems and related beamforming techniques typically focuses on increasing or improving the signal-to-noise ratio (SNR) of incoming sound in a bilateral or binaural beamformed microphone signal towards a particular target direction (typically in front of the individual, or another target direction), but at the cost of reduced audibility of unattended talkers in the sound environment (typically off-axis). The improvement of the signal-to-noise ratio of the binaural beamformed microphone signal is caused by its high directivity index, which means that off-axis sound sources placed outside a relatively narrow angular range around the selected target direction are severely attenuated or suppressed. This characteristic of the binaural beamformed microphone signal may give the hearing impaired patient/user an unpleasant sensation of so-called "tunnel hearing" and a loss of situational awareness.
There is a need in the art for a binaural hearing system that provides improved speech intelligibility for hearing impaired individuals in cocktail party sound environments or similar adverse listening conditions, but without sacrificing off-axis perception, thereby providing enhanced situational awareness relative to comparable directional hearing systems of the prior art. One problem related to the use of hearing devices with directional sensitivity is that one must either use directional sensitivity, which provides some useful advantages such as spatial noise reduction, or use omnidirectional sensitivity to enable listening from multiple directions. However, omnidirectional sensitivity typically comes at the expense of increased noise levels.
There are a variety of beamforming algorithms available for performing spatial filtering, in which the microphones receive sound waves with different times of arrival. For hearing devices, however, the sound waves are filtered by the head before reaching the microphones, which is commonly referred to as the head shadow effect. Due to the head shadow effect, the relative level between the left signal captured by the left-ear device and the right signal captured by the right-ear device varies significantly depending on the direction to the sound source (e.g., the speaking person).
The higher the frequency of the sound, the stronger the head shadow effect. In general, beamforming algorithms that assume that sound waves propagate in a free field need to be improved to properly compensate for head shadow effects.
Disclosure of Invention
One hearing device (e.g., a right-ear hearing device) of certain binaural hearing systems provides a monitoring signal having at least approximately omnidirectional directivity, and a second hearing device (e.g., a left-ear hearing device) provides a focused signal exhibiting maximum sensitivity in a target direction (e.g., in the direction of the user's line of sight) and reduced sensitivity to the left and right sides. Such a binaural hearing system may at least reduce the above-mentioned unpleasant sensation of "tunnel hearing". However, it is observed that in the case of multiple speakers, at least some users of hearing devices still experience problems. In particular, it is observed that improvements are needed in relation to the quality of the monitoring signal, for example in such binaural hearing systems. Here, the hearing device generating the monitoring signal is denoted the ipsilateral device, and the hearing device generating the focused signal is denoted the contralateral device.
Provided are:
a method performed by a first hearing device; the first hearing device comprises a first input unit comprising one or more microphones and configured to generate a first input signal (l), a communication unit configured to receive a second input signal (r) from a second hearing device, an output unit (140) and a processor; and the processor is coupled to the first input unit, the communication unit and the output unit, the method comprising the steps of:
determining a first gain value (α), a second gain value (1- α), or both the first gain value (α) and the second gain value (1- α);
generating a first intermediate signal (v) comprising or being based on a first weighted combination of the first input signal (l) and the second input signal (r); wherein the first weighted combination is based on the first gain value (α), the second gain value (1- α), or both the first gain value (α) and the second gain value (1- α); and
generating an output signal (z) for the output unit (140) based on the first intermediate signal;
wherein one or both of the first gain value (α) and the second gain value (1-α) are determined with the objective that the power of the first input signal (l) and the power of the second input signal (r) differ in the weighted combination by a preset power level difference (d) of more than 2 dB.
An advantage is that sound fidelity can be significantly improved, at least when compared to methods involving a selection between directional focus sensitivity and omnidirectional sensitivity. In particular, the wearer experiences an improvement in social environments where the user may want to hear, and/or be able to hear, more than one person's voice while enjoying reduced noise from the surrounding environment.
In particular, it is observed that the claimed method achieves a desired balance: directional sensitivity, e.g. focusing on an on-axis target signal source, is achieved while off-axis signal sources remain audible, at least with better intelligibility. Hearing tests have shown that users experience less "tunneling" when provided with a system employing the claimed method.
In addition to suppressing or reducing undesirable "tunneling", off-axis noise suppression is also promoted, as evidenced by the improved directivity index. This is also the case in the presence of an off-axis target signal source.
Furthermore, the measurements show that the directivity index is improved in a certain frequency range, at least in a frequency range above 500Hz, in particular in a frequency range above 1000 Hz.
This approach makes it possible to maintain the directionality of the hearing device despite the presence of off-axis target sound sources.
Rather than entering an omnidirectional mode to capture off-axis target sound sources, or alternatively suppressing off-axis target sound sources due to directivity, the signals from off-axis sound sources are reproduced at the acceptable cost of slightly suppressing the signals from on-axis sound sources, and only in proportion to the signal strength from the off-axis sound sources. Since the signal from an on-axis sound source is only slightly suppressed, in proportion to the signal strength from the off-axis sound source, the signal from the off-axis sound source becomes perceivable.
Thus, in some aspects, the method includes forgoing automatically entering omni-directional mode. In particular, exposure of the user to a reproduced signal with an increased noise level when entering the omni-directional mode is thereby avoided.
In at least some aspects, the method is directed to exploiting a head shadowing effect on a beamforming algorithm by scaling the first signal and the second signal. A scaling or equalization of the first signal relative to the second signal, and vice versa, is estimated from the first signal and the second signal.
An advantage is that the sometimes observed comb filtering effect is reduced or substantially eliminated.
The method can be implemented in different ways. In some aspects, the first gain value and the second gain value are not band limited, i.e., the method is performed in a frequency band that has no explicit band limitation. In other aspects, the first and second gain values are associated with band-limited portions of the first and second signals. In some aspects, a plurality of first gain values and a corresponding plurality of second gain values are associated with respective band limited portions of the first signal and the second signal. In some aspects, the first gain value and the second gain value are comprised of respective arrays of a plurality of gain values for a respective plurality of frequency bands or frequency indices (sometimes denoted as frequency bins). In some aspects, prior to summing, the first gain value scales an amplitude of the first signal to provide a scaled first signal, and the second gain value scales an amplitude of the second signal to provide a scaled second signal. The scaled first signal and the scaled second signal are then combined by addition.
In other aspects, the first gain value scales the amplitude of the first signal to provide a scaled first signal that is combined with the second signal by addition to provide a combined signal. The combined signal is then scaled by a second gain value. The method may include foregoing scaling by the second gain value.
In some aspects, the combining is provided by summing (e.g., using an adder) or by an alternative (e.g., equivalent) method.
In some aspects, the weighted combination is obtained by mixing a first input signal scaled by a first gain value and a second input signal scaled by a second gain value. In some aspects, the intermediate signal is a single channel signal or a mono signal. The single channel signal may be a discrete time domain signal or a discrete frequency domain signal.
In some aspects, the combination of the first directional input signal and the second directional input signal is a linear combination.
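As a minimal illustrative sketch (the names `weighted_combination`, `l`, `r` and `alpha` follow the signal labels used above and are not from the patent itself), such a linear weighted combination could look like:

```python
def weighted_combination(l, r, alpha):
    """Mix the locally generated input signal l with the received input
    signal r into a single intermediate signal v = alpha*l + (1-alpha)*r.
    Because the two gains sum to 1.0, the mix does not raise the overall
    power level of the monitoring signal."""
    return [alpha * li + (1.0 - alpha) * ri for li, ri in zip(l, r)]
```

For example, `weighted_combination([1.0, 1.0], [0.0, 2.0], 0.75)` yields the sample-wise mix `[0.75, 1.25]`.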
As an illustrative example, the ipsilateral and contralateral hearing devices communicate, e.g., wirelessly, with each other such that each of the ipsilateral and contralateral hearing devices is capable of processing a first directional input signal and a second directional input signal, one of which is received from the other device. The signals may be streamed bi-directionally such that the ipsilateral device receives a second signal from the contralateral device and such that the ipsilateral device transmits a first signal to the contralateral device. The transmission and reception may be according to a power saving protocol.
As an illustrative example, the method is performed simultaneously at the ipsilateral and contralateral hearing devices. In this case, the respective output units of the respective devices present the output signal to the user as a mono signal. No spatial cues, such as an intentionally introduced time delay, are added to the mono signal.
In some examples, the output signal is transmitted to an output unit of the ipsilateral hearing device.
As another illustrative example, each of the ipsilateral and contralateral hearing devices includes one or more respective directional microphones or one or more respective omnidirectional microphones, together with a beamforming processor to generate the signals.
As a further illustrative example, each of the first signal and the second signal is associated with a fixed directionality relative to a user wearing the hearing device. In this context, an on-axis direction may refer to a direction directly in front of the user, while an off-axis direction may refer to any other direction, such as a left or right direction. In some aspects, the user may select a fixed directionality, for example, at a user interface of an auxiliary electronic device in communication with one or more hearing devices. In some embodiments, directionality may be automatically selected, for example, based on focusing on the strongest signal.
In some examples, the method includes combining the first and second signals from the mono, fixed beamformer output from the ipsilateral and contralateral devices, respectively, to further enhance the target speaker.
The method may be implemented in hardware or a combination of hardware and software. The method may include one or both of time domain processing and frequency domain processing. The method encompasses embodiments using iterative estimation of the first gain value and/or the second gain value, as well as embodiments using deterministic calculation of the first gain value and/or the second gain value.
In some aspects, one or both of the first input signal and the second input signal is an omnidirectional input signal or a super-cardioid input signal. In some aspects, one or both of the first input signal and the second input signal is a directional input signal. In some aspects, one or both of the first input signal and the second input signal is a directional input signal having a concentrated directivity.
In some aspects, at least one of the microphones is arranged as a microphone-in-ear (MIE) in the ear canal. Despite being arranged in the ear canal, the microphone is still able to capture the sound of the surroundings.
In some aspects, the sum of the first gain value and the second gain value is a value of "1.0". Therefore, the power level of the monitoring signal is not increased by mixing the first and second input signals.
In some aspects, the method is performed by a system comprising a first hearing device and a second hearing device. The second hearing device comprises a first input unit comprising one or more microphones and configured to generate a first input signal, a communication unit configured to receive a second input signal from the first hearing device, an output unit and a processor; and the processor is coupled to the first input unit, the communication unit and the output unit.
In some embodiments, the preset power level difference (d) is greater than or equal to 3dB, 4dB, 5dB, or 6dB in weighted combination.
In some embodiments, the preset power level difference (d) is equal to or less than 6dB, 8dB, 10dB or 12dB in weighted combination.
In some examples, the preset power level difference is in a range of 6 dB to 9 dB. This power level difference effectively reduces comb-filter signal components in the intermediate signal and the output signal.
The preset power level difference d corresponds to a gain difference g through d = 20·log10(1/g²). In one example, 1/g² = 0.45 corresponds to a preset power level difference substantially equal to 7 dB. That is, the omnidirectional signal from one side of the wearer's head is approximately 7 dB stronger than the omnidirectional signal from the other side of the wearer's head.
In some examples, the preset power level difference is hard-programmed or soft-programmed into the first hearing device. In some examples, the preset power level difference has a default value. In some examples, the preset power level difference is received via a user interface of an electronic device, such as a general-purpose computer, a smartphone, a tablet, etc., which is connected to the first hearing device via, for example, a wireless connection.
In some embodiments, one or both of the first gain value (α) and the second gain value (1- α) are determined with the objective of having the power of the first input signal (l) and the power of the second input signal (r) differ by a preset power level difference (d) when the power of the first input signal (l) and the power of the second input signal (r) differ by less than 6dB, or less than 8dB, or less than 10 dB.
An advantage is that the method performed by the first hearing device introduces a lower level of artifacts and distortions in the output signal. The wearer may experience a more stable omnidirectional sound image reproduction. Thus, the input signal (l; r) having the lowest power level (P_min) remains the signal with the lowest power level in the weighted combination.
In some embodiments, the first intermediate signal (v) is generated such that the input signal (l; r) having the highest power level (P_max) maintains the highest power level in the weighted combination.
The advantage is that the fidelity and stability of the reproduction of the sound environment is improved.
In some examples, the method comprises the steps of:
generating a first intermediate signal (v) comprising or based on a weighted combination of the first input signal (l) and the second input signal (r), such that the input signal (l; r) having the highest power level (P_max) maintains the highest power level in the weighted combination, at least while the power (P_l) of the first input signal (l) and the power (P_r) of the second input signal (r) differ by less than 6 dB.
In some aspects, the method includes determining the highest power level (P_max) and the lowest power level (P_min) based on the first input signal (l) and the second input signal (r). In some examples, the method includes determining the power level (P_l) of the first input signal and the power level (P_r) of the second input signal.
In some aspects, the method includes determining which of the first and second signals has the highest power level (P_max) and which of the first and second signals has the lowest power level (P_min).
In an example, the input signal having the highest power level is multiplied by the larger of the first gain value (α) and the second gain value (1-α). Thus, the input signal with the lowest power level is multiplied by the other (smaller) gain value.
In some examples, the power of the first input signal and the power of the second input signal are at substantially the same level, and either of the first gain value and the second gain value may be applied to, e.g., the (slightly) stronger signal.
In some embodiments, the power of the generated first input signal is higher than the power of the received second input signal, and in the weighted combination the power of the first input signal remains higher than the power of the second input signal.
In some embodiments, the power of the received second input signal is higher than the power of the generated first input signal, and in the weighted combination, the power of the second input signal is higher than the power of the first input signal.
In some embodiments, the method comprises the steps of:
generating a second intermediate signal (va) comprising or based on a second weighted combination of the first input signal (l) and the second input signal (r) in dependence on the first gain value (α) and the second gain value (1- α), respectively;
generating a third intermediate signal (vb) comprising or based on a third weighted combination of the first input signal (l) and the second input signal (r) depending on the second gain value (1- α) and the first gain value (α), respectively;
wherein the first intermediate signal (v) is based on the second intermediate signal (va) and the third intermediate signal (vb), weighted by a first output value (gx) and a second output value (1-gx) of the mixing function;
wherein the mixing function is a function of the difference or the ratio between the power (P_l) of the first input signal (l) and the power (P_r) of the second input signal (r), having a smooth transition or a multi-step transition between a first limit value ("0") and a second limit value ("1").
An advantage is that artifacts and distortions can be reduced, in particular in cases where the power levels of the two input signals are substantially the same, so that the identity of the signal with the maximum power level changes frequently. The mixing function may be used to suppress the effect of such frequent changes, thereby reducing artifacts and distortions in the intermediate signal and/or the output signal. The wearer may experience a more stable omnidirectional sound image reproduction. In particular, the mixing function provides a soft decision when determining the highest and lowest power levels.
In some examples, the first limit value is 0 and the second limit value is 1. In some examples, the function is a Sigmoid function or another function. The Sigmoid function may be defined as follows:
S(x) = 1 / (1 + e^(−x))
wherein x = k·ln(R), wherein
R = P_l / P_r
is the ratio of the power of the first input signal to the power of the second input signal.
where k is a number, for example greater than 3, for example 4 to 10. If the power levels are close to equal and alternate between one being greater than the other, the output of the mixing function remains substantially unchanged, thereby suppressing the generation of artifacts. A larger variation of the power level difference, and thus of which signal has the largest power, results in a more significant variation of the intermediate signal v. Thus, only a relatively large difference in power level between the first input signal and the second input signal results in a significant change in the value of the function S(x).
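A sketch of such a soft-decision mixing function, assuming the sigmoid form S(x) = 1/(1 + e^(−x)) with x = k·ln(P_l/P_r) (the function name `mixing_value` and the default k = 6.0 are illustrative choices within the range given above):

```python
import math

def mixing_value(p_l, p_r, k=6.0):
    """Soft decision based on the power ratio of the two input signals:
    gx = S(k * ln(p_l / p_r)) with S(x) = 1 / (1 + exp(-x)).
    Near-equal powers give gx close to 0.5, so small fluctuations in
    which signal is stronger barely change the output; only a large
    power-level difference drives gx toward the limit values 0 or 1."""
    x = k * math.log(p_l / p_r)
    return 1.0 / (1.0 + math.exp(-x))
```

Note that the function is complementary by construction: swapping the two powers yields 1 − gx, matching the (gx, 1 − gx) weighting described above.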
In some embodiments, the method comprises the steps of:
determining the power (P_l) of the first input signal (l) and determining the power (P_r) of the second input signal (r);
determining the highest power level (P_max) based on the power (P_l) of the first input signal (l) and the power (P_r) of the second input signal (r), and based on the output value (gx) of the mixing function;
determining the lowest power level (P_min) based on the power (P_l) of the first input signal (l) and the power (P_r) of the second input signal (r), and based on the complementary output value (1-gx) of the mixing function;
wherein the mixing function is a function of the difference or the ratio between the power (P_l) of the first input signal (l) and the power (P_r) of the second input signal (r), with a smooth or multi-step transition between a first limit value ("0") and a second limit value ("1").
An advantage is that one or both of the first gain value (α) and the second gain value (1-α) may be based on a determination of the highest power level (P_max) and the lowest power level (P_min) that changes smoothly rather than abruptly. In particular in a time-domain implementation, this makes it possible to determine one or both of the first gain value (α) and the second gain value (1-α) while introducing only a limited amount of artifacts in the intermediate signal and/or the output signal.
The values "1-gx" are complementary relative to "gx" in that the sum of these values is a constant value that is at least substantially invariant over time, such as "1" or another value greater or less than "1".
In some embodiments, the power (P_l) of the first input signal (l) is based on smoothed and squared values of the first input signal (l); and the power (P_r) of the second input signal (r) is based on smoothed and squared values of the second input signal (r).
An advantage is that sudden sounds, e.g. from one side of the wearer's head, do not interfere with the wearer's perception of the sound image, which remains balanced despite sudden sounds from a certain direction.
In some examples, the power P_l of the first input signal (l) and the power P_r of the second input signal (r) are calculated by the following expressions:
P_l(n) = γ·P_l(n−1) + (1−γ)·l(n)·l(n)
P_r(n) = γ·P_r(n−1) + (1−γ)·r(n)·r(n)
where γ is a "forgetting factor" reflecting how heavily previous values are weighted relative to the instantaneous value. Thus, the abrupt influence of instantaneous values is reduced. Other methods for providing a smoothed power level estimate may be possible. Here, n designates the time index of each signal sample or of each frame of signal samples.
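The recursive smoothing above can be sketched in a few lines (a hypothetical helper, not the patent's implementation; the default γ = 0.99 is an assumed example):

```python
def smoothed_power(samples, gamma=0.99):
    """First-order recursive power estimate per the expression above:
    P(n) = gamma * P(n-1) + (1 - gamma) * x(n)^2.
    A forgetting factor gamma close to 1 weights the history heavily,
    so a sudden loud sample moves the estimate only a little."""
    p = 0.0
    trace = []
    for x in samples:
        p = gamma * p + (1.0 - gamma) * x * x
        trace.append(p)
    return trace
```

For a constant-amplitude input the estimate converges toward the squared amplitude, while a single loud sample contributes only a (1 − γ) fraction of its instantaneous power.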
In some embodiments, the first gain value (α) is iteratively adjusted with the goal of satisfying the following equation:
α²·p_max / (β²·p_min) = 1/g²
wherein p_max is the power level of the one of the first input signal and the second input signal having the highest power level; p_min is the power level of the one of the first input signal and the second input signal having the lowest power level; β = 1−α is the second gain value; and 1/g² corresponds to the preset power level difference.
An advantage is that the observed comb filtering effect is reduced or substantially eliminated, while the power level in the intermediate signal and/or the output signal can be kept substantially constant.
In some examples, the first gain value (α) is adjusted to converge towards a first gain value α that at least approximately satisfies the above equation.
In some aspects, the weighted combination is weighted based on both the first gain value α and the second gain value β. In some aspects, β is at least approximately equal to 1−α. Thus, the power of the weighted combination of the first directional input signal and the second directional input signal is at least approximately preserved relative to their unweighted sum.
In some embodiments, the first gain value α is determined based on the following expression or an approximation thereof:
α = √P_min / (√P_min + g·√P_max)
wherein P_max is the highest power level based on the power (P_l) of the first input signal (l) and the power (P_r) of the second input signal (r); P_min is the lowest power level based on the power (P_l) of the first input signal (l) and the power (P_r) of the second input signal (r); and g is a gain factor corresponding to the preset power level difference (d).
An advantage is that the first gain value α and the second gain value β can be determined easily and continuously, at least in a time-domain implementation.
The highest power level and the lowest power level are conveniently determined as described above. Alternatively or additionally, the highest power level and the lowest power level are determined in another way, for example by calculating power levels over consecutive and/or time-overlapping frames of concurrent segments of the first input signal and the second input signal.
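As a sketch, the closed-form gain can be computed directly from the highest and lowest power levels. This assumes the expression α = √Pmin / (g·√Pmax + √Pmin), a reconstruction that is consistent with the quadratic relation between the weighted powers and with the convergence example α = 0.8, (1-α) = 0.2 for g = 0.25 at equal input power levels given later in the text:

```python
import math

def first_gain(p_max, p_min, g):
    """Closed-form first gain value alpha (assumed reconstruction):
    alpha = sqrt(p_min) / (g*sqrt(p_max) + sqrt(p_min)),
    which satisfies alpha^2*p_max / ((1-alpha)^2*p_min) = 1/g^2."""
    return math.sqrt(p_min) / (g * math.sqrt(p_max) + math.sqrt(p_min))

# Equal input powers and g = 0.25 give alpha = 1/(1+g) = 0.8, beta = 0.2.
alpha = first_gain(1.0, 1.0, 0.25)
beta = 1.0 - alpha
print(round(alpha, 3), round(beta, 3))  # 0.8 0.2

# The weighted powers then differ by the factor 1/g^2 = 16.
ratio = (alpha ** 2 * 1.0) / (beta ** 2 * 1.0)
print(round(ratio, 6))  # 16.0
```

The helper name `first_gain` and the numeric inputs are illustrative only.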
In some embodiments, the method comprises the steps of:
iteratively determining, at least at a first time and at a second time, a current value (αn) of one or both of the first and second gain values; wherein the current value (αn) of the first gain value is iteratively determined from:
i. an estimate of the first gain value (α) fulfilling the goal of having the power of the first input signal (l) and the power of the second input signal (r) differ, in the weighted combination, by a preset power level difference (d) of more than 2dB, and
ii. the previous value (αn-1) of the first gain value plus an iteration step value based on the estimate of the first gain value (α) and the previous value (αn-1).
An advantage is that the method performed by the first hearing instrument outputs a lower level of artifacts and distortions in the output signal. The wearer may experience a more stable omnidirectional sound image reproduction.
Iteratively determining the current value of one or both of the first and second gain values forces the value of one or both of the first and second gain values to develop smoothly over time.
In some examples, the current value αn of the first gain value is iteratively determined by the expression:

αn = αn-1 + step·(α - αn-1)

where step is a numerical value, e.g. a fixed value. The term (α - αn-1) represents a gradient used for the iterative determination of αn.
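The iterative update can be sketched directly from the expression above (the step size and target value below are illustrative assumptions):

```python
def iterate_gain(alpha_prev, alpha_target, step=0.05):
    """One iteration of alpha_n = alpha_{n-1} + step*(alpha_target - alpha_{n-1});
    step is a fixed number and (alpha_target - alpha_{n-1}) acts as a gradient."""
    return alpha_prev + step * (alpha_target - alpha_prev)

# Starting from 0.5, alpha_n develops smoothly towards the target 0.8,
# avoiding the abrupt gain jumps that cause audible modulation artifacts.
alpha = 0.5
for _ in range(100):
    alpha = iterate_gain(alpha, 0.8, step=0.05)
print(round(alpha, 3))  # 0.798, close to the target
```

With a fixed step, the residual error shrinks geometrically by (1 - step) per iteration.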
In some examples, the preset power level difference (d) is about 6dB, corresponding to at least approximately g = 0.25. Then, in case the power level of the first input signal is equal or substantially equal to the power level of the second input signal, the first gain value will converge to

α = 1/(1 + g) = 0.8

and (1-α) = 0.2. However, this holds for the case where the power level of the first input signal and the power level of the second input signal remain equal or substantially equal.
For completeness, the first gain value (α) may be determined based on a quadratic equation, wherein the first gain value (α) is the unknown, and wherein the known values comprise the first preset power level difference (g), the power (Pl) of the first directional input signal and the power (Pr) of the second directional input signal. However, this approach may be less than ideal because it assumes that the power levels are constant.
In some embodiments, the method comprises the steps of:
delaying one of the first input signal (l) and the second input signal (r), i.e. delaying the first input signal (l) with respect to the second input signal (r) or delaying the second input signal (r) with respect to the first input signal (l).
An advantage is that comb filtering effects are reduced or substantially eliminated.
In some examples, the delay τ introduced between the first directional input signal and the second directional input signal is in the range of 3 to 17 milliseconds, for example 5 to 15 milliseconds. The delay τ effectively reduces the comb filtering effect; in particular, it was observed that spatial regions with strong constructive or destructive interference, and audible echo, can be avoided.
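For illustration, such a delay can be realized as a simple zero-padded sample delay; the sample rate below is an assumption, not taken from the text:

```python
import numpy as np

def delay_samples(tau_ms, fs):
    """Convert a delay in milliseconds to whole samples at rate fs
    (an illustrative helper; the text gives tau in the 3-17 ms range)."""
    return int(round(tau_ms * 1e-3 * fs))

def apply_delay(x, n):
    """Delay signal x by n samples, zero-padded at the start."""
    return np.concatenate((np.zeros(n), x[:len(x) - n]))

fs = 16000                      # assumed sample rate
n = delay_samples(10, fs)       # 10 ms -> 160 samples at 16 kHz
x = np.arange(5, dtype=float)
print(n)                        # 160
print(apply_delay(x, 2))        # [0. 0. 0. 1. 2.]
```

In a real device the delay would be a running ring buffer rather than a whole-array operation.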
In some embodiments, the method comprises the steps of:
the first gain value (α), the second gain value (1-α), or both the first gain value (α) and the second gain value (1-α) are repeatedly determined based on a non-instantaneous level of the first input signal (l) and a non-instantaneous level of the second input signal (r).
This has the advantage that less distortion and fewer audible modulation artifacts are introduced when one or both of the first gain value (α) and the second gain value (1-α) are repeatedly determined.
The non-instantaneous level of the first directional input signal and the non-instantaneous level of the second directional input signal may be obtained by calculating a first time average of the power estimate of the first directional input signal and a second time average of the power estimate of the second directional input signal, respectively. The first time average may be a moving average.
The non-instantaneous level of the first directional input signal and the non-instantaneous level of the second directional input signal may each be proportional to a first norm (1-norm), a second norm (2-norm), or the power of the corresponding signal (e.g. the power of both signals).
The non-instantaneous level of the first directional input signal and the non-instantaneous level of the second directional input signal may be obtained by a recursive smoothing process. The recursive smoothing process may operate over the full bandwidth of the signal or over each of a plurality of frequency band windows. For example, in a frequency domain implementation, the recursive smoothing process may smooth each frequency band across short-time Fourier transform frames, e.g. by a weighted sum of the value in the current frame and an accumulated running average.
Alternatively, the non-instantaneous level of the first directional input signal and the non-instantaneous level of the second directional input signal may be obtained by a time-domain filter, such as an IIR filter.
In some embodiments, the first gain value (α) and the second gain value (1- α) are determined iteratively and subject to a constraint that the sum of the first gain value (α) and the second gain value (1- α) is a predetermined time invariant value.
An advantage is that no undesired modulation or artifacts are introduced as the values of the first gain value (α) and the second gain value (1-α) vary. In some examples, the predetermined time-invariant value is 1, but larger or smaller values may be used.
In some embodiments, the method comprises the steps of:
the intermediate signal (v) is processed to perform hearing loss compensation.
An advantage is that the compensation of hearing loss can be improved based on the method described herein.
The invention also provides:
a hearing instrument, comprising:
a first input unit (110) comprising one or more microphones (112, 113);
a communication unit (120);
an output unit (140) comprising an output transducer (141);
at least one processor (130) coupled to the first input unit (110), the communication unit and the output unit; and
a memory storing at least one program, the at least one program comprising instructions for causing the at least one processor to perform the method.
The invention also provides:
a computer readable storage medium storing at least one program, the at least one program comprising instructions, which when executed by a processor of a hearing device (100) enable the hearing device to perform the method of any one of claims 1-17.
The at least one program may be provided, for example, as a software package or as embedded software. The computer-readable storage medium may be located locally and/or remotely.
The term "processor" may include a combination of one or more hardware elements. In this regard, the processor may be configured to run the software program or software components thereof. One or more hardware elements may be programmable or non-programmable.
Drawings
Described in more detail below with reference to the accompanying drawings, wherein:
fig. 1 shows an ipsilateral hearing device with a communication unit for communicating with a contralateral hearing device;
FIG. 2 shows a first, a second and a third processing unit;
FIG. 3 shows a processing unit for performing blending;
fig. 4 shows a detailed view of a first processing unit for determining a maximum power level and a minimum power level;
FIG. 5 shows a top view of a human user and first and second target speakers; and
fig. 6 shows the amplitude response of the monitoring signal as a function of frequency.
Detailed Description
Various embodiments are described below with reference to the drawings. Like reference numerals refer to like elements throughout. Therefore, similar elements will not be described in detail with respect to the description of each figure. It should also be noted that the figures are only intended to facilitate the description of the embodiments. They are not intended as an exhaustive description of the claimed invention or as a limitation on the scope of the claimed invention. Moreover, the illustrated embodiments need not have all of the aspects or advantages shown. Aspects or advantages described in connection with a particular embodiment are not necessarily limited to that embodiment and may be practiced in any other embodiment, even if not so illustrated or even if not so explicitly described.
Fig. 1 shows an ipsilateral hearing device with a communication unit for communicating with a contralateral hearing device (not shown). The ipsilateral hearing device 100 generates a monitoring signal through the speaker 141. The ipsilateral hearing device 100 includes a communication unit 120 with an antenna 122 and a transceiver 121 for bi-directional communication with the contralateral device. The ipsilateral hearing device 100 further comprises a first input unit 110 having a first microphone 112 and a second microphone 113, each coupled to a beamformer 111 generating a first input signal l. In at least some embodiments, the first input signal l is a time domain signal, which may be designated l(t), where t designates a time or time index. In some examples, the beamformer 111 is a beamformer with a super-cardioid characteristic or with another characteristic. In some examples, the beamformer 111 is a delay-and-sum beamformer. In some examples, microphones 112 and 113, and optionally additional microphones, are arranged in an end-fire or edge-fire configuration as known in the art. In some examples, the beamformer 111 is omitted and one or more microphones with omnidirectional or hyper-cardioid characteristics are used instead. In some examples, the beamformer 111 is capable of selectively operating in a non-beamforming mode in which the first input signal is not beamformed. In some examples, the beamformer 111 is omitted, but at least one of the microphone 112, the microphone 113 or a third microphone is arranged as a microphone-in-ear (MIE) in the ear canal. The third microphone and/or the first and second microphones may have an omnidirectional or hyper-cardioid characteristic. Such a microphone, despite being arranged in the ear canal, is still able to capture the sound of the surroundings.
The communication unit 120 receives a second input signal r from, for example, a contralateral hearing instrument. The second input signal r may also be a time domain signal, which may be designated r (t). At the contralateral device, the second signal r may be captured by an input unit corresponding to the first input unit 110.
For convenience, the first input signal l and the second input signal r are represented as an ipsilateral signal and a contralateral signal, respectively. In some examples, the first device, e.g., ipsilateral device, is positioned and/or configured to be located at or in the left ear of the user. In some examples, the second device, e.g., the contralateral device, is located at or in the right ear of the user. The first device and the second device may have the same or similar processors. In some examples, one of the processors is configured to operate as a master and the other is configured to operate as a slave.
The first input signal l and the second input signal r are input to a processor 130 comprising a mixer unit 131. The mixer unit 131 may be based on a gain unit or a filter, as described in more detail herein, and outputs an intermediate signal v, e.g. designated v(t). The mixer unit 131 is configured to generate the intermediate signal v based on a first weighted combination of the first input signal (l) and the second input signal (r) in dependence on the first gain value α and the second gain value "1-α". The first gain value α and the second gain value "1-α" are determined according to a target of having the power of the first input signal l and the power of the second input signal r differ, when weighted, by a preset power level difference d of more than 2dB. This has been shown to improve the fidelity of the monitoring signal mentioned in the background section; in particular, it has been shown that artifacts in the intermediate signal, such as comb filtering effects, can be reduced. This is illustrated in fig. 6. As described in more detail herein, one or more gain values are determined, including the gain value α.
In some examples, mixer unit 131 outputs a single-channel intermediate signal v. In some examples, the single-channel intermediate signal is a mono signal.
In some embodiments, the mixer unit 131 is filter-based, e.g. a multi-tap FIR filter. Each of the input signals l and r may be filtered by a respective multi-tap FIR filter before combining the respective filtered signals, e.g. by summing.
The intermediate signal v output from the mixer unit 131 is input to a post filter 132, which outputs a filtered intermediate signal y. In some embodiments, the post filter 132 is integrated in the mixer 131. In some embodiments, the post filter 132 is omitted, at least temporarily, or bypassed.
In some embodiments, the intermediate signal v and/or the filtered intermediate signal y are input to a hearing loss compensation unit 133, which includes a prescribed compensation for the user's hearing loss as is known in the art. The hearing loss compensation unit 133 outputs a hearing loss compensation signal z. In some embodiments, the hearing loss compensation unit 133 is omitted or bypassed.
The intermediate signal v and/or the filtered intermediate signal y and/or the hearing loss compensated signal z are input to an output unit 140, which may comprise a so-called "receiver" or loudspeaker 141 of the ipsilateral device for providing an acoustic signal to the user. In some embodiments, one or more of the signals v, y and z are input to a second communication unit for transmission to another device. The other device may be a contralateral device or an auxiliary device.
A time-domain to frequency-domain transform, such as a short-time Fourier transform (STFT), and a corresponding inverse transform, such as an inverse short-time Fourier transform (ISTFT), may be used; these are not described in detail herein.
In some examples, the ipsilateral device 100 includes another beamformer (not shown) configured with a focusing (high-directivity) characteristic, providing an additional beamformed signal based on the microphones 112 and 113 and optionally additional microphones. The further beamformed signal may be transmitted to the contralateral device (not shown).
More details about the processing, in particular the processing performed by the mixing unit, are given below:
Fig. 2 shows a first, a second and a third processing unit. These processing units may be part of the processor 130 or, more specifically, of the mixer 131. The first processing unit 201 receives a first input signal l and a second input signal r, which may be time domain signals. Based on the first input signal l and the second input signal r, the first processing unit 201 first estimates the power level Pl of the first input signal l and the power level Pr of the second input signal r. Next, the first processing unit 201 estimates the maximum power level Pmax and the minimum power level Pmin. The estimation of the maximum power level and the minimum power level corresponds to:
Pmax=max(Pl,Pr)
Pmin=min(Pl,Pr)
where max () and min () are based on inputs (P) to these functionsl、Pr) A function of maximum or minimum power is selected or estimated.
The estimation of the maximum power level and the minimum power level may be based on continuously calculated estimates instead of a (binary) decision. This will be explained in more detail below.
The first processing unit 201 is further configured to output the value gx of a mixing function and the complementary value "1-gx". The mixing function is based on, for example, a Sigmoid function or an arctangent function, sometimes expressed as atan(). Essentially, the mixing function maps a function of the difference or ratio between the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r) to a smooth transition, or a transition in multiple discrete steps, between a first limit value (e.g. "0") and a second limit value (e.g. "1"). An advantage is that the estimation of the maximum and minimum power levels may be based on continuously calculated estimates instead of (binary) decisions. In some examples, the mixing function is a piecewise linear function, such as a function having three or more linear segments.
The second processing unit 202 is configured to determine a first gain value (α) and a second gain value (1-α) based on the maximum power level Pmax and the minimum power level Pmin.
The first gain value α and the second gain value "1- α" may be estimated based on the following expression, where g is the gain difference corresponding to the preset power level difference d:
α = √Pmin / (g·√Pmax + √Pmin)
As required, it at least approximately satisfies the following expression, which is quadratic in α:

α²·Pmax / ((1-α)²·Pmin) = 1/g²

Thus, d = 20·log10(1/g²). In one example, 1/g² = 0.45 corresponds to a preset power level difference d approximately equal to 7dB in magnitude.
It should be noted, for the sake of completeness, that the above quadratic expression can be solved for α in a conventional way, but the solution would require stationary input signals l and r, which is usually not the case for hearing devices.
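The stated relation between the gain factor g and the preset power level difference d can be checked numerically; a minimal sketch of the example value above:

```python
import math

# Preset power level difference d = 20*log10(1/g^2), as stated above.
one_over_g_sq = 0.45
d = 20.0 * math.log10(one_over_g_sq)
print(round(abs(d), 1))  # 6.9, i.e. approximately 7 dB in magnitude
```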
The third processing unit 203 generates a value αn that iteratively converges towards the first gain value α. The subscript "n" designates the time index. The value βn, which correspondingly converges towards the second gain value β, is simply calculated from βn = 1-αn. The third processing unit computes αn and βn cyclically, e.g. at predetermined time intervals, e.g. once or more per frame, where a frame includes a predetermined number of samples, such as 32, 64, 128 or another number of samples.
Fig. 3 shows a fourth processing unit for performing mixing. The fourth processing unit 300 outputs an intermediate signal v based on the first input signal l and the second input signal r. The processing is based on: the first gain value α, or its iteratively determined value αn; the second gain value β, or βn; and the value gx of the mixing function and the complementary value "1-gx", for example as provided by the processing units described in connection with fig. 2.
As shown, the first input signal l and the (delayed) second input signal r are input to two complementary units 310 and 320, which output respective intermediate signals va and vb to a unit 330; the unit 330 mixes the intermediate signals va and vb into the intermediate signal v.
Thus, the fourth processing unit 300 provides a mixture of the first input signal and the second input signal to output an intermediate signal v, also denoted the first intermediate signal v. While being a mixer itself, the fourth processing unit 300 comprises two complementary units 310 and 320, which are also mixers, and a further unit 330, which is also a mixer. The fourth processing unit 300 may thus be denoted a first mixer, the units 310 and 320 a second and a third mixer, and the unit 330 a fourth mixer. The second mixer 310 generates a second intermediate signal (va) comprising or based on a second weighted combination of the first input signal (l) and the second input signal (r), weighted by the first gain value α and the second gain value "1-α", respectively. The third mixer generates a third intermediate signal (vb) comprising or based on a third weighted combination of the first input signal l and the second input signal r, weighted by the second gain value "1-α" and the first gain value α, respectively. The fourth mixer generates the first intermediate signal v comprising or based on a fourth weighted combination of the second intermediate signal va and the third intermediate signal vb, weighted by a first output value gx and a second output value "1-gx" of the mixing function. The mixing function implements a smooth switching based on the maximum power level Pmax and the minimum power level Pmin, rather than a hard switch, which would make it difficult to reduce artifacts. The mixing function maps a function of the difference or ratio between the power Pl of the first input signal l and the power Pr of the second input signal r to a smooth transition, or a transition in multiple steps, between a first limit value and a second limit value. For example, the mixing function is a Sigmoid function with limit values "0" and "1". The Sigmoid function may be defined as follows:
S(x) = 1/(1 + e^(-x))

wherein x = k·ln(R), wherein

R = Pl / Pr
where k is a number, for example greater than 3, for example 4 to 10. The value of gx is gx = S(x). Other implementations may be defined. In some aspects, to conserve computational resources, the computation of S(x) may be cut short when the value of x exceeds or falls below a corresponding threshold, since S(x) then assumes a value near one of the limits. The value gx can then be chosen as the respective limit value or a value close to it.
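A sketch of such a mixing function, including the computational short-cut near the limits (the steepness k and the clip threshold are assumed values):

```python
import math

def mix_value(p_l, p_r, k=6.0, x_clip=20.0):
    """Mixing function gx = S(k * ln(P_l / P_r)) based on a Sigmoid with
    limits 0 and 1. |x| above x_clip short-circuits the exponential,
    as suggested in the text, since S(x) is then close to a limit."""
    x = k * math.log(p_l / p_r)
    if x > x_clip:
        return 1.0
    if x < -x_clip:
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))

print(mix_value(1.0, 1.0))   # 0.5: equal powers, equal mix
print(mix_value(4.0, 1.0))   # near 1: first signal dominates
print(mix_value(1.0, 4.0))   # near 0: second signal dominates
```

Because S() is smooth in the power ratio, small power fluctuations produce only small changes in gx, avoiding audible switching artifacts.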
The fourth processing unit 300 implements the following expression:
v(t) = gx*(α*l(t) + (1-α)*r(t-τ)) + (1-gx)*(α*r(t-τ) + (1-α)*l(t))
wherein the symbol "*" denotes a multiplication in embodiments where α is realized by a gain stage. In embodiments where α is implemented by a finite impulse response (FIR) filter, the symbol "*" may instead designate a convolution operation. For simplicity, the embodiment in fig. 3 is described as one where α is implemented by a gain stage.
As shown, the second signal r is delayed by a time delay τ by the delay unit 301. The delay unit 301 thus delays the second input signal r with respect to the first input signal l. The delay τ is in the range of 3 to 17 milliseconds; for example 5 to 15 milliseconds. The delay is omitted in some embodiments.
The unit 310, i.e. the second mixer, comprises a gain unit 311 and a gain unit 312 to provide the respective signals α*l(t) and (1-α)*r(t-τ), which are input to an adder 313 that outputs the signal va.
In a mirrored fashion, the unit 320, i.e. the third mixer, comprises a gain unit 322 and a gain unit 321 to provide the respective signals α*r(t-τ) and (1-α)*l(t), which are input to an adder 323 that outputs the signal vb.
The signals va and vb are input to the unit 330, i.e. the fourth mixer. The fourth mixer comprises a gain stage 331 and a gain stage 332, the gain stage 331 weighting the signal va according to the value gx and the gain stage 332 weighting the signal vb according to the complementary value "1-gx", before the weighted signals are combined by an adder 333 to provide the intermediate signal v. Thus, a smooth mixing may be achieved in a manner that is particularly suited for time-domain implementations. Although a time-domain implementation is preferred, it should be mentioned that smooth mixing can also be achieved in a frequency-domain or short-time frequency-domain implementation; for such implementations, however, there may be better choices.
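The structure of fig. 3 can be sketched in a few lines; α and gx are held as scalars here for clarity, and the signal values are illustrative:

```python
import numpy as np

def mix_intermediate(l, r, alpha, gx, delay):
    """Time-domain sketch of the fourth processing unit:
    v(t) = gx*(alpha*l(t) + (1-alpha)*r(t-tau))
         + (1-gx)*(alpha*r(t-tau) + (1-alpha)*l(t)).
    In the device, alpha and gx are updated sample-by-sample or
    frame-by-frame rather than being fixed scalars."""
    r_d = np.concatenate((np.zeros(delay), r[:len(r) - delay]))
    va = alpha * l + (1.0 - alpha) * r_d       # second mixer (unit 310)
    vb = alpha * r_d + (1.0 - alpha) * l       # third mixer (unit 320)
    return gx * va + (1.0 - gx) * vb           # fourth mixer (unit 330)

l = np.ones(4)
r = np.full(4, 2.0)
v = mix_intermediate(l, r, alpha=0.8, gx=1.0, delay=0)
print(v)  # 0.8*1 + 0.2*2 = 1.2 for every sample
```

With gx = 1 the output reduces to the second mixer alone; with gx = 0, to the third; intermediate gx values blend the two smoothly.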
Fig. 4 shows a detailed view of a first processing unit for determining a maximum power level and a minimum power level. The first processing unit utilizes a mixing function, e.g. a Sigmoid function, as indicated by reference numeral 440 at the bottom left hand side. From the above it can be seen that x = k · ln (R), where
R = Pl / Pr

gx = S(x) = 1/(1 + e^(-x))
Where k is a number, for example, greater than 3, at least for some embodiments.
The first processing unit receives a first input signal l = l(t) and a second input signal r = r(t) and calculates the corresponding power levels Pl and Pr. The power levels may be calculated recursively to obtain smoothed power estimates. The power levels may be calculated using the following expressions:
pL(n)=γ·pL(n-1)+(1-γ)·l(n)·l(n)
pR(n)=γ·pR(n-1)+(1-γ)·r(n)·r(n)
where γ is a "forgetting factor" reflecting how much the sum of previous values should be weighted over the instantaneous value. Here, n designates the time index of individual samples of the signal or frames of signal samples. The power level may be calculated in other ways.
Based on the calculated power levels Pl and Pr, the value gx of the mixing function S(), which can be based on the Sigmoid function, is calculated by the unit 413. Accordingly, in unit 414, the complementary value "1-gx" is calculated based on the input from unit 413.
The corresponding power levels Pl and Pr are weighted by units 421 and 422 according to the value gx of the mixing function and the complementary value "1-gx"; the units 421 and 422 can be mixers, multipliers, gain stages or combinations thereof.
The weighted sum is generated by an adder 423 receiving the respective power levels Pl and Pr weighted according to the value gx of the mixing function and the complementary value "1-gx". The weighted sum is an estimate of the maximum power level, Pmax = max(Pl, Pr). Pmax is output by unit 420, which receives the values gx and "1-gx" from unit 410.
Again based on the values gx and "1-gx" from unit 410, although in a mirrored fashion, unit 430 outputs an estimate of the minimum power level, Pmin = min(Pl, Pr). This weighted sum is generated by an adder 433 that receives the respective power levels Pl and Pr weighted according to the complementary value "1-gx" and the value gx of the mixing function.
In this way, the maximum and minimum power levels may be estimated sample-by-sample or frame-by-frame, while suppressing abrupt changes that may cause audible artifacts.
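The soft max/min estimation of fig. 4 can be sketched as follows (the steepness k is an assumed value):

```python
import math

def soft_max_min(p_l, p_r, k=6.0):
    """Continuous estimates of the max and min power levels using the
    mixing value gx = S(k*ln(P_l/P_r)) instead of a hard (binary) decision:
    P_max_est = gx*P_l + (1-gx)*P_r, P_min_est = (1-gx)*P_l + gx*P_r."""
    gx = 1.0 / (1.0 + math.exp(-k * math.log(p_l / p_r)))
    p_max = gx * p_l + (1.0 - gx) * p_r
    p_min = (1.0 - gx) * p_l + gx * p_r
    return p_max, p_min

# Clearly different powers: the soft estimates approach hard max/min.
p_max, p_min = soft_max_min(8.0, 1.0)
print(round(p_max, 2), round(p_min, 2))  # 8.0 1.0
# Nearly equal powers: both estimates sit near the common level,
# avoiding the abrupt switching that causes audible artifacts.
print(soft_max_min(1.0, 1.0))  # (1.0, 1.0)
```

As the powers cross, the estimates change continuously rather than swapping in one sample.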
Fig. 5 shows a top view of the wearer of left and right hearing devices in conversation with a first and a second speaker. The wearer 510 of the left and right hearing devices 501, 502 is positioned with the first speaker 511 in front (e.g. at about 0 degrees, on-axis) and the second speaker 512 to the right (e.g. at about 50 degrees, off-axis). In addition, some audible noise sources 513 and 514 are located around the wearer 510. The audible noise sources 513 and 514 may be anything that produces sound, such as a loudspeaker, a person speaking, etc.
With respect to hearing devices 501 and 502, the right hearing device 502 (also denoted as ipsilateral device) may be configured to provide a monitoring signal to the wearer, and the left hearing device 501 (also designated as contralateral device) may be configured to provide a focusing signal to the wearer 510. The hearing devices 501 and 502 communicate via a wireless link 503.
The ipsilateral device 502, here on the right-hand side of the wearer, receives a first input signal l and a second input signal r, as described herein. These signals may have approximately omnidirectional characteristics 520 and 521, though they deviate from a true omnidirectional characteristic due to the head shadow effect caused by the wearer's head.
The contralateral device 501, here on the left-hand side of the wearer, may be configured to provide a focus signal to the wearer. The focus signal may be based on a monaural or binaural signal forming one or more focus characteristics 522 and 523. The focus characteristic may be fixed in front of the wearer, for example at about 0 degrees, or be adaptive or controllable by the wearer, as is known in the art.
The first speaker 511 is on-axis in front of the wearer 510. Thus, the acoustic speech signal from the first speaker 511 arrives at least substantially simultaneously at the ipsilateral device and the contralateral device, whereby the signals are captured simultaneously. For the first speaker 511, the signals l and r are therefore of equal strength. To suppress the comb filtering effect, it has been observed that delaying the signals l and r with respect to each other is effective. The delay is small enough not to be perceived as an echo.
However, the second speaker 512 is off-axis, slightly to the right of the wearer 510. When the second speaker 512 emits sound, the claimed method suppresses the signal from the on-axis first target speaker 511 in proportion to the signal strengths received at the ipsilateral and contralateral devices from the off-axis second speaker 512. Thus, it is possible to forego switching into an omnidirectional mode while still being able to perceive the (speech) signal from the second speaker 512. Furthermore, in the weighted combination, the power of the first input signal l and the power of the second input signal r are reproduced with a preset power level difference d of more than 2dB to reduce the comb filtering effect. The comb filtering effect is described in more detail in connection with fig. 6.
In some prior-art cases, determining that there is a signal from, for example, speaker 512 may cause a hearing device to switch to a so-called omnidirectional mode, whereby the noise sources 513 and 514 suddenly contribute to the sound presented to the user, who may then experience a significantly increased noise level, even though the sound levels of the noise sources 513 and 514 are lower than that of the target speaker 512.
Fig. 6 shows the amplitude response of the monitoring signal as a function of frequency. In this example, the monitoring signals are denoted by reference numerals 604a and 604b and correspond to the intermediate signal v output from the mixer 131, i.e. without post-filtering and hearing loss compensation. The intermediate signal v was recorded with a preset power level difference of 10 dB. The amplitude response is plotted as power [dB] as a function of frequency [Hz]. The amplitude response was recorded for a sound source in front of the wearer (at 0 degrees, in the viewing direction).
For comparison, the magnitude response 603 is plotted for the signal from the front microphone (front microphone) arranged towards the viewing direction. Accordingly, a magnitude response 602 is plotted for signals from a rear microphone (rear microphone) arranged away from the viewing direction.
Furthermore, for comparison, signals labeled 601a and 601b are plotted for the mixer, wherein the preset power level difference is about 0dB, and wherein the first gain value α and the second gain value "1- α" are kept fixed, e.g. the value α =0.5.
It can be seen that in the frequency range of about 1000Hz to about 4000-5000Hz, the signals labeled 601a and 601b exhibit a relatively large comb effect at 601a, spanning a peak-to-peak range of about 10 dB.
In comparison, the intermediate signal v, represented by reference numerals 604a and 604b and output from the mixer 131, exhibits a suppressed, relatively small comb effect, spanning a peak-to-peak range of about 3-5dB, over a frequency range of about 1000Hz to about 4000-5000 Hz.
The comb filtering effect is reduced when one or both of the first gain value α and the second gain value "1-α" are determined according to the target of having the power of the first input signal l and the power of the second input signal r differ, in the weighted combination, by a preset power level difference d of more than 2dB. Thus, artifacts in the intermediate signal are reduced and the fidelity of the signal reproduced for the wearer may be improved.
In some examples, the power of the first input signal (l) may be the power of the original first input signal. In other examples, the power of the first input signal (l) may be the power of the weighted first input signal. Also, in other examples where the weighting is based on the first gain value, the power of the first input signal (l) may be the power of the first input signal to which the gain is applied.
Similarly, in some examples, the power of the second input signal (r) may be the power of the original second input signal. In other examples, the power of the second input signal (r) may be the power of a weighted second input signal. Also, in other examples where the weighting is based on the second gain value, the power of the second input signal (r) may be the power of the second input signal to which the gain is applied.
Also, in some examples, the goal of having the power of the first input signal (l) and the power of the second input signal (r) differ in the weighted combination by a preset power level difference (d) greater than 2 dB may apply when |P1-P2| <= 6 dB, where P1 is the power of the generated first input signal and P2 is the power of the received second input signal. In other examples, the target may apply when |P1-P2| >= 6 dB. In further examples, the target applies regardless of the value of |P1-P2|.
It should be understood that the methods described herein may be implemented in different ways. Some details may be appreciated from the following.
In some examples, generating the monitoring signal is intended to achieve a sensitivity to the surroundings, e.g. to moving sound sources, similar to that of natural binaural hearing, whereas the focused signal uses a beamformed signal.
The left and right signals are mixed in a time-domain implementation to achieve an at least approximately "true" omnidirectional characteristic, wherein the mix is generated as follows:
v(t) = α*l(t) + (1-α)*r(t-τ)
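As an illustration, the time-domain mix above can be sketched as follows (a minimal sketch; the function name, the fixed α, and the integer sample delay τ are illustrative assumptions, not part of any particular embodiment):

```python
import numpy as np

def mix_omni(l: np.ndarray, r: np.ndarray, alpha: float = 0.5, tau: int = 0) -> np.ndarray:
    """Time-domain mix v(t) = alpha*l(t) + (1 - alpha)*r(t - tau).

    tau is a decorrelation delay in samples applied to the right signal.
    """
    # Delay r by tau samples (zero-padded at the start), then truncate to len(l).
    r_delayed = np.concatenate([np.zeros(tau), r])[: len(l)]
    return alpha * l + (1.0 - alpha) * r_delayed

# With alpha = 0.5 and tau = 0 the mix is the plain average of the two signals.
v = mix_omni(np.array([1.0, 2.0, 3.0, 4.0]), np.array([3.0, 2.0, 1.0, 0.0]))
```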
Due to the head-shadow effect, the relative level between the left and right signals varies significantly as a sound source moves around the user. Furthermore, it is desirable to suppress the observed comb effect (also referred to as the comb-filter effect). It is therefore proposed to control the weights of the signals l(t) and r(t) by a parameter α, in order to improve the (true) omnidirectional sensitivity, or situational-awareness index, in cocktail-party situations and to mitigate the comb-filtering effect.
The wearer's head produces only a modest head-shadow effect at low frequencies (below 500-1000 Hz), so there is no need to mix the left and right signals at low frequencies to obtain a true omnidirectional characteristic. Thus, the signals l(t) and r(t) can be divided into a low band and a high band, and the main cause of comb filtering can be avoided by skipping the mixing of the low band. This is because the human auditory system has a higher frequency resolution, i.e. narrower critical bands, at low frequencies; comb filtering there may make some audio sound slightly sharp and harsh, e.g. in monaural listening in an anechoic chamber.
At high frequencies, when the sound comes from the front, the two hearing devices receive essentially the same signal, and combining the two signals may still introduce some comb shape. For off-axis sources, the head-shadow effect causes significant interaural level differences, and the mix of the two signals then exhibits only a shallow comb effect.
In view of the above discussion, the cross-correlation and the levels of the two signals play an important role in achieving a shallow comb-filtering effect and an omnidirectional polar pattern. Introducing a delay is one way to reduce the cross-correlation of the signals. More importantly, it is proposed to dynamically control the level difference between the two signals in the mix to achieve better omnidirectional sensitivity.
The mixing parameter α is adaptively controlled. The mix is:
v(n) = α*l(n) + (1-α)*r(n-τ)
In general, α can be considered an FIR filter, and the symbol * then denotes the convolution operation.
The powers Pl and Pr of the weighted signals are calculated as follows:
sl(n) = Σi α(i)*l(n-i)
sr(n) = Σi (1-α(i))*r(n-i)
Pl = E[sl²(n)]
Pr = E[sr²(n)]
The goal is to obtain an optimal α such that the power difference, with scaling constant g, is minimized, i.e.,
min over α of J(α) = (Pl - g²*Pr)²
α can be solved adaptively with the gradient-descent method as follows:
α(i) ← α(i) - μ*∂J/∂α(i)
wherein,
∂J/∂α(i) = 2*(Pl - g²*Pr)*(∂Pl/∂α(i) - g²*∂Pr/∂α(i))
∂Pl/∂α(i) = 2*E[sl(n)*l(n-i)]
∂Pr/∂α(i) = -2*E[sr(n)*r(n-i)]
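A compact sketch of this gradient-descent adaptation (a minimal sketch under stated assumptions: the cost J(α) = (Pl − g²·Pr)² is the reconstruction above, instantaneous products replace the expectations, and the tap count, step size μ, and forgetting factor are illustrative values):

```python
import numpy as np

def adapt_alpha_fir(l, r, g=0.5, taps=4, mu=1e-4, lam=0.9):
    """Adapt an FIR mixing filter alpha(i) by gradient descent on
    J = (Pl - g^2 * Pr)^2, where Pl and Pr are smoothed powers of the
    weighted signals sl(n) = sum_i alpha(i) l(n-i) and
    sr(n) = sum_i (1 - alpha(i)) r(n-i)."""
    alpha = np.full(taps, 0.5)  # start from an even mix
    Pl = Pr = 0.0
    for n in range(taps - 1, len(l)):
        lw = l[n - taps + 1 : n + 1][::-1]  # l(n), l(n-1), ..., l(n-taps+1)
        rw = r[n - taps + 1 : n + 1][::-1]
        sl = float(alpha @ lw)
        sr = float((1.0 - alpha) @ rw)
        Pl = lam * Pl + (1.0 - lam) * sl * sl  # short-term smoothed powers
        Pr = lam * Pr + (1.0 - lam) * sr * sr
        err = Pl - g * g * Pr
        # dPl/dalpha(i) ~ 2*sl*l(n-i); dPr/dalpha(i) ~ -2*sr*r(n-i)
        grad = 2.0 * err * (2.0 * sl * lw + 2.0 * g * g * sr * rw)
        alpha = alpha - mu * grad  # gradient-descent step
    return alpha
```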
For a single-tap filter (a plain gain), the mixing parameter can also be derived as follows. First, we calculate the short-term smoothed power of each signal as:
Pl = λ*Pl + (1-λ)*l²(n)
Pr = λ*Pr + (1-λ)*r²(n)
where λ is the forgetting factor.
Then we can select the stronger of the left and right signals. Assuming Pl > Pr, the level ratio in the mix is:
R = ((1-α)*√Pr) / (α*√Pl)
Our goal is to keep the level ratio g of the source constant in any direction. Therefore,
α = √Pr / (√Pr + g*√Pl)
and
1-α = g*√Pl / (√Pr + g*√Pl)
In a dynamic acoustic scene, we adaptively update the mixing parameter α as follows:
αn = αn-1 + step*(α - αn-1)
The step size may be chosen as 0.005 and the forgetting factor may be around 0.7. When g is 0.25, the level difference between the mixed signals is about 6 dB. If Pl == Pr, then αn will converge to
α = 1/(1+g) = 0.8
and (1-α) = 0.2. For a fixed-mix default, we set α = 0.5.
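The single-tap derivation can be sketched as follows (a minimal sketch reconstructed from the equations above; the function names are illustrative assumptions):

```python
import math

def single_tap_alpha(Pl: float, Pr: float, g: float = 0.25) -> float:
    """Closed-form single-tap mixing gain, assuming Pl >= Pr so that alpha
    multiplies the stronger (left) signal; alpha is chosen so the level
    ratio of the mixed components equals g:
        (1 - alpha) * sqrt(Pr) / (alpha * sqrt(Pl)) = g
    """
    return math.sqrt(Pr) / (math.sqrt(Pr) + g * math.sqrt(Pl))

def smooth_alpha(alpha_prev: float, alpha_target: float, step: float = 0.005) -> float:
    """One update step: alpha_n = alpha_{n-1} + step * (alpha - alpha_{n-1})."""
    return alpha_prev + step * (alpha_target - alpha_prev)

# With equal powers and g = 0.25 the target is 1 / (1 + g) = 0.8,
# i.e. (1 - alpha) = 0.2, matching the convergence value above.
target = single_tap_alpha(1.0, 1.0, g=0.25)
```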
In the above, we assumed Pl > Pr and that the parameter α multiplies the left signal; for the right signal it is vice versa. To avoid a binary maximum/minimum decision, we introduce the sigmoid function to make the following soft decision:
gx = 1 / (1 + e^(k*(R-1)))
where
R = Pr / Pl
Therefore, for R >> 1, gx → 0; and for R << 1, gx → 1. Here k is a positive constant, e.g. k = 4 to 10; a square root of R can be absorbed into k.
Thus, Pmax = gx*Pl + (1-gx)*Pr and Pmin = gx*Pr + (1-gx)*Pl, and
α = √Pmin / (√Pmin + g*√Pmax)
1-α = g*√Pmax / (√Pmin + g*√Pmax)
In a dynamic acoustic scenario, for each incoming signal block, we adaptively update the mixing parameter α towards the target, as follows:
αn = αn-1 + step*(α - αn-1)
The output mix is then:
v(t) = gx*(α*l(t) + (1-α)*r(t-τ)) + (1-gx)*(α*r(t-τ) + (1-α)*l(t))
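The soft-decision output mix can be sketched per sample as follows (a minimal sketch; R = Pr/Pl and the value of k follow the reconstruction above and are assumptions):

```python
import math

def soft_mix_sample(l_t: float, r_t: float, Pl: float, Pr: float,
                    alpha: float, k: float = 6.0) -> float:
    """One output sample of the soft-decision mix.

    gx = 1 / (1 + exp(k * (R - 1))) with R = Pr / Pl, so gx -> 1 when the
    left signal dominates (alpha weights l) and gx -> 0 when the right
    dominates (alpha weights r). r_t is the already-delayed right sample.
    """
    R = Pr / Pl
    x = min(k * (R - 1.0), 60.0)  # clamp to avoid math.exp overflow for large R
    gx = 1.0 / (1.0 + math.exp(x))
    return gx * (alpha * l_t + (1.0 - alpha) * r_t) + (1.0 - gx) * (
        alpha * r_t + (1.0 - alpha) * l_t)
```

For equal powers, gx = 1/(1 + e^0) = 0.5 and the two branches are simply averaged.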
Thus, in at least some aspects, the present invention relates to a method of performing bilateral processing of the respective microphone signals from the left-ear and right-ear hearing devices of a binaural hearing system, and to a corresponding binaural hearing system. The binaural hearing system uses ear-to-ear wireless exchange or streaming of monaural signals over a wireless communication link. The left-ear or right-ear hearing device is configured to generate a bilateral or monaural beamformed signal with a high directivity index, which may exhibit maximum sensitivity in a target direction (e.g. the user's look direction) and reduced sensitivity on the corresponding ipsilateral side of the left-ear or right-ear hearing device. The head-worn hearing device additionally generates a bilateral omnidirectional microphone signal at the respective ear by mixing a pair of monaural signals, wherein the bilateral omnidirectional microphone signal exhibits an omnidirectional response, or polar pattern, with a low directivity index and thus a sensitivity that is substantially equal for all directions of sound incidence, or azimuths, around the user's head.
Generally, the term "on-axis" herein refers to a direction, or "cone" of directions, relative to one or both of the hearing devices from which signals are primarily captured. That is, "on-axis" refers to the focus region of one or more beamformers or directional microphones. The focus region is typically, but not always, in front of the user's face, i.e. in the user's look direction. In some aspects, one or both of the hearing devices capture a respective signal from an on-axis direction in front of the user. The term "off-axis" refers to all directions other than the "on-axis" direction relative to one or both of the hearing devices. The term "target sound source" or "target source" refers to any sound source that produces a sound signal of interest, such as a human speaker. "Noise source" refers to any undesired sound source that is not a "target source". For example, a noise source may be the combined acoustic signal from many people speaking simultaneously, machine sounds, vehicle traffic sounds, and so forth.
The term "reproduction signal" refers to a signal presented to the user of the hearing device via, for example, a miniature loudspeaker, denoted a "receiver" in the hearing-device field. The "reproduction signal" may or may not include compensation for hearing loss. The term "strength" of a signal refers to a non-instantaneous level of the signal, e.g. proportional to the signal's one-norm (1-norm) or two-norm (2-norm) or its power (e.g. the squared two-norm).
The term "ipsilateral hearing device" or "ipsilateral device" refers to a device worn on one side of the user's head (e.g. the left side), while "contralateral hearing device" or "contralateral device" refers to the device worn on the other side of the user's head (e.g. the right side). An "ipsilateral hearing device" or "ipsilateral device" may operate with a contralateral device configured in the same manner as the ipsilateral device, or in another manner. In some aspects, the "ipsilateral hearing device" or "ipsilateral device" is an electronic listening device configured to compensate for a hearing loss. In other aspects, the electronic listening device is configured without hearing loss compensation. The hearing device may be configured for one or more of: protecting against high sound levels in the surroundings, audio playback, serving as a headset for telecommunication, and compensating for a hearing loss.
Also, as used in this specification, the term "first input signal" may refer to an original first input signal, a weighted version of the first input signal, or the first input signal with a gain applied. Similarly, as used in this specification, the term "second input signal" may refer to the original second input signal, a weighted version of the second input signal, or the second input signal with a gain applied.
Herein, the term "characteristic" in e.g. "omnidirectional characteristic" corresponds to the term "sensitivity" in e.g. "omnidirectional sensitivity".

Claims (19)

1. A method performed by a first hearing device (100); the first hearing instrument comprises a first input unit (110) comprising one or more microphones (112, 113) and configured to generate a first input signal (l), a communication unit (120) configured to receive a second input signal (r) from a second hearing instrument, an output unit (140) and a processor (130); and the processor is coupled to the first input unit (110), the communication unit (120) and the output unit (140), the method comprising the steps of:
determining a first gain value (α), a second gain value (1- α), or both the first gain value (α) and the second gain value (1- α);
generating a first intermediate signal (v) comprising or being based on a first weighted combination of the first input signal (l) and the second input signal (r); wherein the first weighted combination is based on the first gain value (a), the second gain value (1-a), or both the first gain value (a) and the second gain value (1-a); and
generating an output signal (z) for the output unit (140) based on the first intermediate signal;
wherein one or both of the first gain value (a) and the second gain value (1-a) are determined with the purpose of having the power of the first input signal (l) and the power of the second input signal (r) differ in a weighted combination by a preset power level difference (d) of more than 2dB.
2. The method of claim 1, wherein the preset power level difference (d) is greater than or equal to 3dB, 4dB, 5dB, or 6dB in weighted combination.
3. The method according to claim 1 or 2, wherein the preset power level difference (d) is equal to or less than 6dB, 8dB, 10dB or 12dB in weighted combination.
4. A method according to any of claims 1-3, wherein one or both of the first gain value (a) and the second gain value (1-a) is determined according to the purpose of having the power of the first input signal (l) and the power of the second input signal (r) differ by the preset power level difference when the power of the first input signal (l) and the power of the second input signal (r) differ by less than 6dB, or less than 8dB, or less than 10 dB.
5. The method according to any one of claims 1-4, comprising:
wherein the first intermediate signal (v) is generated such that the input signal (l; r) having the maximum power level (Pmax) maintains the highest power level in the weighted combination.
6. The method according to any of claims 1-5, wherein the power of the generated first input signal is higher than the power of the received second input signal, and in weighted combination the power of the first input signal is higher than the power of the second input signal.
7. The method according to any of claims 1-6, wherein the power of the received second input signal is higher than the power of the generated first input signal, and in the weighted combination the power of the second input signal is higher than the power of the first input signal.
8. The method according to any one of claims 1-7, further comprising the steps of:
generating a second intermediate signal (va) comprising or based on a second weighted combination of the first input signal (l) and the second input signal (r) in dependence of the first gain value (a) and the second gain value (1-a), respectively;
generating a third intermediate signal (vb) comprising or based on a third weighted combination of the first input signal (l) and the second input signal (r) in dependence on the second gain value (1-a) and the first gain value (a), respectively;
wherein the first intermediate signal (v) is based on the second intermediate signal (va) and the third intermediate signal (vb) in accordance with a first output value (gx) and a second output value (1-gx) of a mixing function;
wherein the mixing function is a smooth transition or a multi-step transition between a first limit value ("0") and a second limit value ("1") as a function of the difference between the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r), or as a function of the ratio of the power of the first input signal and the power of the second input signal.
9. The method according to any one of claims 1-8, comprising the steps of:
determining the power (Pl) of the first input signal (l) and determining the power (Pr) of the second input signal (r);
determining the highest power level (Pmax) based on the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r) and on the output value (gx) of the mixing function;
determining the lowest power level (Pmin) based on the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r) and on the complementary output value (1-gx) of the mixing function;
wherein the mixing function is a smooth transition or a multi-step transition between a first limit value ("0") and a second limit value ("1") as a function of the difference between the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r), or as a function of the ratio of the power of the first input signal and the power of the second input signal.
10. The method according to any of claims 1-9, wherein the power (Pl) of the first input signal (l) is based on smoothed and squared values of the first input signal (l); and wherein the power (Pr) of the second input signal (r) is based on smoothed and squared values of the second input signal (r).
11. The method according to any of claims 1-10, wherein the first gain value (a) is iteratively adjusted with the goal of satisfying the following equation:
(α²*Pmax) / (β²*Pmin) = 1/g²
wherein Pmax is the highest of the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r); Pmin is the lowest of the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r); β = 1-α is the second gain value; and 1/g² corresponds to the preset power level difference.
12. The method according to any of claims 1-11, wherein the first gain value (α) is determined based on the following equation:
α = √Pmin / (√Pmin + g*√Pmax)
wherein Pmax is the highest of the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r); Pmin is the lowest of the power (Pl) of the first input signal (l) and the power (Pr) of the second input signal (r); and g is a gain factor corresponding to the preset power level difference (d).
13. The method according to any one of claims 1-12, comprising:
repeatedly determining, at least at a first time and at a second time, a current value (αn) of the first gain value, wherein the current value (αn) of the first gain value is iteratively determined by:
estimating a first gain value (α) fulfilling the purpose of having the power of the first input signal (l) and the power of the second input signal (r) differ in the weighted combination by a preset power level difference (d) of more than 2 dB, and
adding, to a previous value (αn-1) of the first gain value, an iteration step value based on the estimated first gain value (α) and the previous value (αn-1).
14. The method according to any one of claims 1-13, comprising:
delaying one of the first input signal (l) and the second input signal (r) so as to delay the first input signal (l) with respect to the second input signal (r) or to delay the second input signal (r) with respect to the first input signal (l).
15. The method according to any one of claims 1-14, further comprising:
iteratively determining the first gain value (α), the second gain value (1-α), or both the first gain value (α) and the second gain value (1-α) based on a non-instantaneous level of the first input signal (l) and a non-instantaneous level of the second input signal (r).
16. The method according to any of claims 1-15, wherein the first gain value (a) and the second gain value (1-a) are determined iteratively subject to the sum of the first gain value (a) and the second gain value (1-a) being a predetermined time-invariant value.
17. The method according to any one of claims 1-16, further comprising:
processing the intermediate signal (v) to perform hearing loss compensation.
18. A hearing instrument (100) comprising:
a first input unit (110) comprising one or more microphones (112, 113);
a communication unit (120);
an output unit (140) comprising an output transducer (141);
at least one processor (130) coupled to the first input unit (110), the communication unit, and the output unit; and
a memory storing at least one program comprising instructions for causing the at least one processor to perform the method of any one of claims 1-17.
19. A computer readable storage medium storing at least one program, the at least one program comprising instructions, which when executed by a processor of a hearing device (100) enables the hearing device to perform the method of any one of claims 1-17.
CN202210449900.9A 2021-04-29 2022-04-27 Hearing device with omnidirectional sensitivity Pending CN115278493A (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US17/244,756 2021-04-29
US17/244,756 US11617037B2 (en) 2021-04-29 2021-04-29 Hearing device with omnidirectional sensitivity
EP21175990.7 2021-05-26
DKPA202170272 2021-05-26
DKPA202170272 2021-05-26
EP21175990.7A EP4084501A1 (en) 2021-04-29 2021-05-26 Hearing device with omnidirectional sensitivity

Publications (1)

Publication Number Publication Date
CN115278493A true CN115278493A (en) 2022-11-01

Family

ID=83758827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210449900.9A Pending CN115278493A (en) 2021-04-29 2022-04-27 Hearing device with omnidirectional sensitivity

Country Status (1)

Country Link
CN (1) CN115278493A (en)

Similar Documents

Publication Publication Date Title
US9282411B2 (en) Beamforming in hearing aids
US8532307B2 (en) Method and system for providing binaural hearing assistance
JP4145323B2 (en) Directivity control method for sound reception characteristics of hearing aid and signal processing apparatus for hearing aid having controllable directivity characteristics
CN111556420A (en) Hearing device comprising a noise reduction system
US8396234B2 (en) Method for reducing noise in an input signal of a hearing device as well as a hearing device
CN109660928B (en) Hearing device comprising a speech intelligibility estimator for influencing a processing algorithm
US10536785B2 (en) Hearing device and method with intelligent steering
CN108694956B (en) Hearing device with adaptive sub-band beamforming and related methods
JP2017063419A (en) Method of determining objective perceptual quantity of noisy speech signal
CN113825076A (en) Method for direction dependent noise suppression for a hearing system comprising a hearing device
US11153695B2 (en) Hearing devices and related methods
WO2020035158A1 (en) Method of operating a hearing aid system and a hearing aid system
Directionality Maximizing the voice-to-noise ratio (VNR) via voice priority processing
US11617037B2 (en) Hearing device with omnidirectional sensitivity
CN115278493A (en) Hearing device with omnidirectional sensitivity
EP3837861B1 (en) Method of operating a hearing aid system and a hearing aid system
Kąkol et al. A study on signal processing methods applied to hearing aids
EP3059979B1 (en) A hearing aid with signal enhancement
WO2020245232A1 (en) Bilateral hearing aid system comprising temporal decorrelation beamformers
EP3886463A1 (en) Method at a hearing device
JP6409378B2 (en) Voice communication apparatus and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination