EP1647972B1 - Intelligibility enhancement of audio signals containing speech - Google Patents

Intelligibility enhancement of audio signals containing speech

Info

Publication number
EP1647972B1
EP1647972B1 (application EP05019316A)
Authority
EP
European Patent Office
Prior art keywords
speech
components
audio signal
signal
circuit configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Not-in-force
Application number
EP05019316A
Other languages
German (de)
French (fr)
Other versions
EP1647972A2 (en)
EP1647972A3 (en)
Inventor
Matthias Vierthaler
Florian Pfister
Dieter Lücking
Stefan Müller
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TDK Micronas GmbH
Original Assignee
TDK Micronas GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TDK Micronas GmbH filed Critical TDK Micronas GmbH
Publication of EP1647972A2 publication Critical patent/EP1647972A2/en
Publication of EP1647972A3 publication Critical patent/EP1647972A3/en
Application granted granted Critical
Publication of EP1647972B1 publication Critical patent/EP1647972B1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Definitions

  • the invention relates to a circuit arrangement for improving the intelligibility of speech-containing audio signals with the preamble features of claim 1 and to a method for processing speech-containing audio signals.
  • a circuit arrangement for improving the intelligibility of speech-containing audio signals in which frequency and / or amplitude components of the audio signal are changed according to predetermined parameters.
  • the audio signal is amplified in a processing path by a predetermined factor and passed through a high-pass filter whose corner frequency is adjustable so that the amplitude of the audio signal after the processing path is equal or proportional to the amplitude of the audio signal before the processing path.
  • the fundamental of the speech signal, which contributes relatively little to the intelligibility of the contained speech components but carries the largest energy, is to be attenuated, while the remaining signal spectrum of the audio signal is raised accordingly.
  • the amplitude of vowels, which have a large amplitude at low frequency, can be lowered in the transition region from a consonant, which has a small amplitude at high frequency, to a vowel, in order to reduce so-called "backward masking". For this, the entire signal is raised by the factor. Ultimately, high-frequency components are raised and the low-frequency fundamental is lowered to the same extent, so that the amplitude or energy of the audio signal remains unchanged.
  • the object of the invention is to improve a circuit arrangement or a method for processing speech-containing audio signals.
  • a circuit arrangement is accordingly advantageous for improving the intelligibility of multicomponent audio signals possibly containing speech with an input for inputting such an audio signal.
  • the circuit arrangement is implemented by a speech detector for detecting speech in the input audio signal and for providing a control signal for controlling a speech processing device and / or a speech processing method for processing the audio signal.
  • the speech detector has a correlator for performing cross or autocorrelation of components of the audio signal. Signal components of the audio signal are separated during signal processing by a matrix.
  • a method for processing multicomponent audio signals possibly containing speech, in which speech or speech components contained in an audio signal are detected and, depending on the result of the detection, a control signal for a speech processing device and/or a speech processing method for speech enhancement is generated and provided.
  • the circuit arrangement or the method are thus to be regarded as a precursor to an actual signal processing for improving the intelligibility of speech-containing audio signals. Accordingly, the received audio signal is first examined to see if any speech is included in the audio signal. Depending on the result of the speech detection, a control signal is then output, which is used by an actual speech processing device or an actual speech processing method as a control signal. This makes it possible that in speech processing to improve the speech components in the audio signal relative to other signal components in the audio signal, processing or alteration of the audio signal is performed only if speech or speech components are actually included.
  • a control signal is provided or output by the circuit arrangement or by the method, which is used, for example, as a trigger signal for the actual speech enhancement.
  • the speech enhancement can thereby be performed by means of detection or analysis of a preceding audio signal or the like, possibly a time-delayed audio signal.
  • the circuit arrangement which generates and provides the control signal can be provided as an independent structural component, but can also be part of a single structural component with the speech processing device or speech enhancement device.
  • the speech detection circuitry and the speech processing means for enhancing the speech components of the audio signal may be part of an integrated circuit arrangement. Accordingly, the method for detecting speech and the speech processing method for enhancing speech components in the audio signal can also be performed separately from each other.
  • a common method which is carried out by means of technical components of a circuit arrangement or by means of a correspondingly running algorithm in a calculation device is particularly preferred.
  • a circuit arrangement is preferred in which the speech detector is designed and / or controlled to detect speech components in the audio signal.
  • a circuit arrangement is preferred in which the speech detector has a threshold value determination device for comparing a range of detected speech components with a threshold value and for outputting the control signal as a function of the comparison result.
  • a circuit arrangement is preferred in which the speech detector has a control input for inputting at least one parameter for variably controlling the detection with regard to a scope of the speech components to be detected and / or with regard to a frequency range of the speech components to be detected.
  • a circuit arrangement is preferred in which the speech detector has a direction determination device for determining a direction of common signal components of the various components.
  • control signal for activating or deactivating the speech enhancement device and / or the speech enhancement method is designed and / or controlled as a function of the speech content of the audio signal.
  • control signal is generated as a function of the extent of detected speech components.
  • a method is preferred in which the extent of the detected speech components is compared with a threshold value and in which a cross- or autocorrelation of the audio signal or of components of the audio signal is performed.
  • Signal components of the audio signal are separated during signal processing by a matrix.
  • a method is preferred in which the detection is carried out in an adjustable manner with respect to a scope of the speech components to be detected and / or with regard to a frequency range of the speech components to be detected by means of variable parameters.
  • a method is preferred in which the audio signal components of a multicomponent audio signal having a plurality of audio signal components are compared with one another or processed with one another to detect the speech.
  • Components are to be understood as meaning signal components from different distances and directions and / or signals of different channels.
  • a method is preferred in which the audio signal components are compared or processed with regard to speech components common to the various audio signal components, in particular in order to determine a direction of the common signal components.
  • based on different arrival times on, for example, the right and left channels of a stereo signal, and on the specific attenuation of particular frequencies, the distance and direction of the speech component can be determined.
  • the speech enhancement can thus be applied, in particular, only to speech components recognized as originating from a person who is close to the microphone. Signal or speech portions from more distant persons can thereby be ignored, so that the speech enhancement is activated only when a nearby person actually speaks.
  • control signal is provided for activating or deactivating the speech enhancement device and / or the speech enhancement method.
  • a circuit arrangement and / or a method is preferred, wherein a frequency response is determined by means of an FIR or an IIR filter (FIR: finite impulse response, IIR: infinite impulse response).
  • FIR finite impulse response
  • IIR infinite impulse response
  • a circuit arrangement and/or a method is preferred in which matrix coefficients for a matrix are determined via a function that depends on the speech component.
  • the function is linear and continuous.
  • the function has a hysteresis.
  • the signal components with speech components of the audio signal can be analyzed and detected with regard to various criteria.
  • in addition to, for example, a minimum duration over which speech must be present to be detected as a speech component, the detection can also be based, for example, on the frequency of detectable speech and/or on the direction of a speech source of detected speech.
  • the terms signal components and speech components are therefore to be interpreted as general and not restrictive.
  • Fig. 1 schematically shows, by way of example, the sequence of a method for detecting speech and / or speech components px in an audio signal i for the optional subsequent or parallel speech enhancement of the speech or of the speech components px, if such are detected, in the audio signal i.
  • An audio signal i is input via an input I of a circuit arrangement for improving the intelligibility of audio signals i that may contain speech or speech components px.
  • the audio signal i may be a single-channel mono signal.
  • multicomponent audio signals i from a stereo audio signal source or the like are preferred, i.e. a stereo audio signal, a 3D stereo audio signal with an additional center component, or a surround audio signal with currently usually five components for audio signal components from the right, left, center and, for example, two rear sources on the right and left.
  • the audio signal i is supplied to a first structural or logical component, which forms a speech detector SD.
  • in the speech detector SD, it is examined whether speech or a speech component px is contained in the audio signal i. According to preferred embodiments, it is checked whether detected speech or speech components px are larger than a correspondingly predetermined threshold value v.
  • detection parameters in particular the threshold value v, can be adapted as needed.
  • the illustrated arrangement has an input IV for inputting the threshold v.
  • if the detection indicates that a sufficient speech component px is contained in the audio signal i, a control signal is set, for example, to the value 0. Otherwise, the control signal is set, for example, to the value 1.
  • the control signal s is output from the voice detector SD for further use by a voice processing means.
  • the audio signal i currently input into the speech processing is improved in accordance with known methods or with an otherwise known circuit arrangement.
  • an audio signal o improved with respect to the speech parts is output.
  • a delay of the audio signal i input into the circuit arrangement or the method can optionally be performed according to the time delay in the speech detection.
  • this enables a circuit arrangement, a method or an algorithm that applies speech enhancement only to parts of the audio signal that actually contain speech or that actually contain a certain speech component. The speech detection thus detects speech or separates it from the rest of the signal.
  • Fig. 2 shows a first embodiment of a speech detector SD.
  • the input consists of two individual inputs for each one audio signal component or an audio signal channel L ', R' of a stereo audio signal.
  • the two audio signal components R ', L' are each fed to a bandpass filter BP for limiting the band.
  • the outputs of the two bandpass filters BP are supplied to a correlator CR for performing a cross-correlation.
  • Each of the two signals output by the bandpass filters BP is respectively multiplied by itself in a multiplier M, ie squared, and then supplied to an adder A. After the addition, a multiplication by the factor 0.5 is optionally carried out in a further multiplier M * in order to reduce the amplitude.
  • the output signal of the optionally scaled sum is fed to a first low-pass filter TP.
  • each of the output signals of the two bandpass filters BP is fed to an actual circuit for performing the correlation using in particular a further multiplier M.
  • the correlation signal L'·R' output therefrom is fed to a second low-pass filter TP.
  • the output signals b, a of the first low-pass filter TP and of the second low-pass filter TP are supplied to a divider DIV, in which the output signal a of the second low-pass filter TP is divided by the output signal b of the first low-pass filter TP.
  • the division result of the division element DIV is provided as a control signal or as a preliminary stage D1 for the control signal s.
  • as audio signal i, a conventional stereo audio signal L', R' is usually composed of several audio signal components R, L, C, S. In the case of a multichannel audio signal, these components can also be provided separately.
  • L' = L + C + S and R' = R + C - S, where
  • L stands for a left signal component,
  • C stands for a signal component coming from the center (front),
  • S stands for a surround signal component, i.e. a rear signal, and
  • R stands for a right signal component.
  • the time constant of the low-pass filter TP can be in the range of approximately 100 ms, if a very fast response to changing signal components is desired. However, the time constant can be extended up to several minutes if a very slow response of the speech detector SD is desired.
  • the time constant of the low-pass filter is therefore an advantageously variable parameter.
  • the illustrated circuit arrangements or methods for providing the control signal s can be followed by a stage in which a threshold value v is set which the output signal D1 of the described arrangements or methods must exceed in order to switch the control signal s into an active state.
  • the actual speech enhancement algorithm or actual speech enhancer may be provided in a manner known per se. For example, a simple frequency response correction as described in DE 101 24 699 C1, to which full reference is made, can be performed. However, any other algorithms and devices for improving speech intelligibility can also be used.
  • the input components or input channels L', R' of the audio signal i are each multiplied by three factors, k1, k3, k5 and k2, k4, k6 respectively, and supplied to adders.
  • the first adder A receives, for addition, the signal of the first channel L' multiplied by the first coefficient k1 and the signal of the second channel R' multiplied by the second coefficient k2.
  • the second adder A receives, for addition, the signal of the first channel L' multiplied by the third coefficient k3 and the signal of the second channel R' multiplied by the fourth coefficient k4.
  • the third adder A receives, for addition, the signal of the first channel L' multiplied by the fifth coefficient k5 and the signal of the second channel R' multiplied by the sixth coefficient k6.
  • the output of the second adder A is supplied to a speech enhancement circuit VS or to a speech enhancement method or algorithm. Its output is added, by means of further adders A, to the output signal of the first adder A to provide a first output channel LE, and to the output signal of the third adder A to provide a second output channel RE (a sketch of this matrix mixing follows after this list).
  • the two finally output signal channels or components LE, RE correspond to the processed signals which are supplied to the output O as the processed audio signal o.
  • with a function F1(D1), the circuit already responds to a small detected speech component.
  • the probability of misdetection is relatively high for small values of D1.
  • however, the effect of the speech algorithm on the audio signal is also relatively low for a small D1, so that an impairment of the audio signal is barely perceived.
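
As a hedged illustration of the matrix mixing described above for Fig. 3, the sketch below forms the three weighted sums, applies a placeholder speech enhancer to the second (center-like) sum and adds its output back to the other two sums to obtain LE and RE. The specific coefficient values and the identity enhancer are assumptions chosen so that the matrix is transparent when no enhancement is applied; the patent determines the coefficients via a function of the detected speech component.

```python
def matrix_mix(lp, rp, enhance, k=(0.5, -0.5, 0.5, 0.5, -0.5, 0.5)):
    """Fig. 3-style matrix: three weighted sums of the input channels L', R';
    the second sum (an estimate of the center component) is passed through the
    speech enhancer VS and added back to the other two sums to form LE and RE.

    The coefficient values are assumptions chosen so that, with an identity
    enhancer, LE equals L' and RE equals R'."""
    k1, k2, k3, k4, k5, k6 = k
    a1 = k1 * lp + k2 * rp          # first adder A
    a2 = k3 * lp + k4 * rp          # second adder A, fed to the enhancer VS
    a3 = k5 * lp + k6 * rp          # third adder A
    vs = enhance(a2)                # placeholder for the speech enhancement
    le = a1 + vs                    # first output channel LE
    re = a3 + vs                    # second output channel RE
    return le, re
```

With enhance set to the identity the matrix passes the stereo signal through unchanged; in practice the coefficients k1 to k6 would be derived from the detected speech measure D1 via a function such as F1(D1) mentioned above.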

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Stereophonic System (AREA)
  • Amplifiers (AREA)
  • Telephone Function (AREA)
  • Indexing, Searching, Synchronizing, And The Amount Of Synchronization Travel Of Record Carriers (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The arrangement has a speech detector (200) detecting speech in an audio signal and providing a control signal (226) to control a speech processing device. The device processes the audio signal to determine whether the audio signal includes components which indicate speech. The detector compares a range of detected speech components to a threshold value, and outputs the control signal based on the comparison result. Independent claims are also included for the following: (A) a method for processing audio signals containing speech (B) an audio processing system comprising a speech detector.

Description

The invention relates to a circuit arrangement for improving the intelligibility of speech-containing audio signals having the preamble features of claim 1, and to a method for processing speech-containing audio signals.

From DE 101 24 699 C1, a circuit arrangement for improving the intelligibility of speech-containing audio signals is known in which frequency and/or amplitude components of the audio signal are changed according to predetermined parameters. The audio signal is amplified in a processing path by a predetermined factor and passed through a high-pass filter whose corner frequency is adjustable so that the amplitude of the audio signal after the processing path is equal or proportional to the amplitude of the audio signal before the processing path. With this circuit arrangement, the fundamental of the speech signal, which contributes relatively little to the intelligibility of the contained speech components but carries the largest energy, is to be attenuated, while the remaining signal spectrum of the audio signal is raised accordingly. In addition, the amplitude of vowels, which have a large amplitude at low frequency, can be lowered in the transition region from a consonant, which has a small amplitude at high frequency, to a vowel, in order to reduce so-called "backward masking". For this, the entire signal is raised by the factor. Ultimately, high-frequency components are raised and the low-frequency fundamental is lowered to the same extent, so that the amplitude or energy of the audio signal remains unchanged.

US 5,553,151 describes "forward masking", in which weak consonants are masked in time by preceding strong vowels. A relatively fast compressor with an attack time of about 10 ms and a release time of about 75 to 150 ms is proposed.

From US 5,479,560 it is known to divide an audio signal into several frequency bands, to amplify the high-energy bands relatively strongly, and to attenuate the others. This is proposed because speech consists of a sequence of phonemes. Phonemes are made up of a variety of frequencies, which are particularly amplified in the range of the resonance frequencies of the mouth and throat. A frequency band with such a spectral peak is called a formant. Formants are particularly important for recognizing phonemes and thus speech. One approach to improving speech intelligibility is to amplify the peaks, i.e. the formants, of the frequency spectrum of an audio signal and to attenuate the spectral regions in between. For an adult male, the fundamental frequency of speech is about 60 to 250 Hz; the first four associated formants lie at about 500 Hz, 1500 Hz, 2500 Hz and 3500 Hz.

As further prior art, US 2003/0044032, WO 2004/071130 A1 and US 2003/0055636 A1 are cited.

Such circuit arrangements and methods make speech contained in an audio signal more intelligible relative to other components of the audio signal. At the same time, however, signal portions that do not contain speech are also altered or distorted. A further disadvantage of these methods and circuit arrangements is that they continuously enhance or process rigidly predefined speech components, frequency components or the like. As a result, signal portions that do not contain speech are altered or distorted even at times when the audio signal contains no speech or speech components.

The object of the invention is to improve a circuit arrangement and a method for processing speech-containing audio signals.

This object is achieved by a circuit arrangement for improving the intelligibility of audio signals possibly containing speech having the features of claim 1, and by a method for processing audio signals possibly containing speech having the features of claim 6.

Accordingly, a circuit arrangement for improving the intelligibility of multicomponent audio signals possibly containing speech is advantageous, with an input for inputting such an audio signal. The circuit arrangement advantageously comprises a speech detector for detecting speech in the input audio signal and for providing a control signal for controlling a speech processing device and/or a speech processing method that processes the audio signal. The speech detector has a correlator for performing a cross- or autocorrelation of components of the audio signal. Signal components of the audio signal are separated during signal processing by a matrix.

Also advantageous is a method for processing multicomponent audio signals possibly containing speech, in which speech or speech components contained in an audio signal are detected and, depending on the result of the detection, a control signal for a speech processing device and/or a speech processing method for speech enhancement is generated and provided.

The circuit arrangement and the method are thus to be regarded as a preliminary stage to the actual signal processing for improving the intelligibility of speech-containing audio signals. The received audio signal is accordingly first examined to determine whether it contains any speech or speech components at all. Depending on the result of the speech detection, a control signal is then output, which is used as a control signal by the actual speech processing device or the actual speech processing method. This makes it possible, in the speech processing that enhances the speech components in the audio signal relative to other signal components, to process or alter the audio signal only when speech or speech components are actually present.

Accordingly, the circuit arrangement or the method provides or outputs a control signal which is used, for example, as a trigger signal for the actual speech enhancement. The speech enhancement can thereby be performed by means of detection or analysis of a preceding audio signal or the like, possibly a time-delayed audio signal.

The circuit arrangement which generates and provides the control signal can be provided as an independent structural component, but can also be part of a single structural component together with the speech processing or speech enhancement device. In particular, the speech detection circuitry and the speech processing means for enhancing the speech components of the audio signal may be part of an integrated circuit arrangement. Accordingly, the method for detecting speech and the speech processing method for enhancing speech components in the audio signal can also be performed separately from each other. Particularly preferred, however, is a combined method carried out by means of technical components of a circuit arrangement or by means of a corresponding algorithm running in a computing device.

Advantageous embodiments are the subject of the dependent claims.

In particular, a circuit arrangement is preferred in which the speech detector is designed and/or controlled to detect speech components in the audio signal.

In particular, a circuit arrangement is preferred in which the speech detector has a threshold determination device for comparing the extent of detected speech components with a threshold value and for outputting the control signal as a function of the comparison result.

In particular, a circuit arrangement is preferred in which the speech detector has a control input for inputting at least one parameter for variably controlling the detection with regard to the extent of the speech components to be detected and/or with regard to a frequency range of the speech components to be detected.

In particular, a circuit arrangement is preferred in which the speech detector is designed to process a multicomponent audio signal having several audio signal components, in particular a stereo audio signal or a multichannel audio signal, and is designed or controlled as a processing device for detecting speech on the basis of a comparison or a processing of the components with one another.

In particular, a circuit arrangement is preferred in which the speech detector has a direction determination device for determining a direction of signal components common to the various components.

In particular, a circuit arrangement is preferred in which the control signal for activating or deactivating the speech enhancement device and/or the speech enhancement method is generated and/or controlled as a function of the speech content of the audio signal.

In particular, a method is preferred in which the control signal is generated as a function of the extent of detected speech components.

In particular, a method is preferred in which the extent of the detected speech components is compared with a threshold value and in which a cross- or autocorrelation of the audio signal or of components of the audio signal is performed. Signal components of the audio signal are separated during signal processing by a matrix.

In particular, a method is preferred in which the detection is adjustable by means of variable parameters with regard to the extent of the speech components to be detected and/or with regard to a frequency range of the speech components to be detected.

In particular, a method is preferred in which the audio signal components of a multicomponent audio signal having several audio signal components are compared with one another or processed together in order to detect speech. Components are to be understood here as signal portions from different distances and directions and/or signals of different channels.

In particular, a method is preferred in which the audio signal components are compared or processed with regard to speech components common to the various audio signal components, in particular in order to determine a direction of the common signal components. Based on different arrival times on, for example, the right and left channels of a stereo signal, and on the specific attenuation of particular frequencies, the distance and direction of the speech component can be determined. The speech enhancement can thus be applied, in particular, only to speech components recognized as originating from a person who is close to the microphone. Signal or speech portions from more distant persons can thereby be ignored, so that the speech enhancement is activated only when a nearby person actually speaks.
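
The patent itself only states that arrival-time differences and frequency-specific attenuation allow distance and direction to be estimated. Purely as an illustration of the arrival-time part, the following sketch estimates the inter-channel delay from the lag at which the cross-correlation of the two channels peaks; the function name, the maximum lag and the use of NumPy are assumptions, not part of the patent.

```python
import numpy as np

def estimate_tdoa(left, right, sample_rate, max_lag_s=0.001):
    """Return the lag (in seconds) at which the cross-correlation of the two
    channels peaks, i.e. the shift of `right` that best aligns it with `left`.
    Illustrative sketch only; not the circuit described in the patent."""
    max_lag = int(max_lag_s * sample_rate)
    n = len(left)
    lags = np.arange(-max_lag, max_lag + 1)
    # Correlate over the central region so every shifted slice stays in range.
    corr = [float(np.dot(left[max_lag:n - max_lag],
                         right[max_lag - lag:n - max_lag - lag]))
            for lag in lags]
    return lags[int(np.argmax(corr))] / sample_rate
```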

In particular, a method is preferred in which the control signal is provided for activating or deactivating the speech enhancement device and/or the speech enhancement method.

In particular, a circuit arrangement and/or a method is preferred in which a frequency response is determined by means of an FIR or an IIR filter (FIR: finite impulse response, IIR: infinite impulse response).

In particular, a circuit arrangement and/or a method is preferred in which matrix coefficients for a matrix are determined via a function that depends on the speech component. The function is linear and continuous. Alternatively or additionally, the function has a hysteresis.
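
The patent leaves open how this coefficient function looks in detail. Purely as an illustrative sketch under that assumption, a mapping F1(D1) could combine a linear, continuous ramp with a simple switching hysteresis so that the matrix coefficients do not chatter when the detected speech measure D1 hovers around a threshold. All names and numeric constants below are assumptions.

```python
class CoefficientFunction:
    """Maps the detected speech measure D1 (range -1..+1) to a weight in [0, 1]
    using a linear, continuous ramp and a small hysteresis band.
    Illustrative assumption; the patent only states that such a function may
    be linear and continuous and/or exhibit hysteresis."""

    def __init__(self, rise_at=0.2, fall_at=0.1, full_at=0.6):
        self.rise_at = rise_at    # D1 above which enhancement starts when inactive
        self.fall_at = fall_at    # D1 below which enhancement stops when active
        self.full_at = full_at    # D1 at which the weight reaches 1.0
        self.active = False

    def __call__(self, d1):
        # Hysteresis: different thresholds for switching on and off.
        if self.active and d1 < self.fall_at:
            self.active = False
        elif not self.active and d1 > self.rise_at:
            self.active = True
        if not self.active:
            return 0.0
        # Linear, continuous ramp between the switch-off point and full_at.
        return min(1.0, (d1 - self.fall_at) / (self.full_at - self.fall_at))
```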

The signal portions containing speech components of the audio signal can be analyzed and detected with regard to various criteria. In addition to, for example, a minimum duration over which speech must be present to be detected as a speech component, the detection can also be based, for example, on the frequency of detectable speech and/or on the direction of the speech source of detected speech. The terms signal components and speech components are therefore to be interpreted broadly and not restrictively.

The invention is explained in more detail below with reference to the drawing, in which:

Fig. 1
schematically shows method steps and components of a method and of a circuit arrangement for processing an audio signal in order to detect speech contained therein;
Fig. 2
shows an exemplary circuit arrangement according to a first embodiment for applying a correlation to speech components of various signal components;
Fig. 3
shows an exemplary circuit arrangement illustrating a matrix calculation carried out before a speech enhancement of the audio signal; and
Fig. 4
shows a diagram illustrating criteria for setting a threshold value.

Fig. 1 schematically shows, by way of example, the sequence of a method for detecting speech and/or speech components px in an audio signal i, for the optional subsequent or parallel enhancement of the speech or speech components px in the audio signal i, provided such components are detected. An audio signal i is input via an input I of a circuit arrangement for improving the intelligibility of audio signals i that may contain speech or speech components px. Depending on the application, the audio signal i may be a single-channel mono signal. However, multicomponent audio signals i from a stereo audio signal source or the like are preferred, i.e. a stereo audio signal, a 3D stereo audio signal with an additional center component, or a surround audio signal with currently usually five components for audio signal components from the right, left, center and, for example, two rear sources on the right and left.

The audio signal i is supplied to a first structural or logical component, which forms a speech detector SD. In the speech detector SD, it is examined whether speech or a speech component px is contained in the audio signal i. According to preferred embodiments, it is checked whether detected speech or speech components px are larger than a correspondingly predetermined threshold value v. Optionally, detection parameters, in particular the threshold value v, can be adapted as needed. For this purpose, the illustrated arrangement has an input IV for inputting the threshold value v.

If the detection indicates that a sufficient speech component px is contained in the audio signal i, a control signal is set, for example, to the value 0. Otherwise, the control signal is set, for example, to the value 1. The control signal s is output by the speech detector SD for further use by a speech processing device or a speech processing method.

If the control signal s indicates a speech component px, i.e. if s = 0 in the present case, the speech processing that enhances the speech or speech components px is activated. The audio signal i currently fed into the speech processing is then enhanced in accordance with methods known per se or with an otherwise known circuit arrangement. At an output O, an audio signal o enhanced with respect to the speech components is output accordingly.

If no sufficient speech component px is detected in the detection step, i.e. if s = 1, the audio signal i fed into the speech processing SV is left unchanged, i.e. output unchanged as audio signal o.
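
A minimal sketch of this gating logic follows. It assumes a function detect_speech_measure() that returns the detected speech measure px and an enhance() routine standing in for any known speech enhancement; neither is specified by the patent.

```python
def process_block(audio_block, threshold_v, detect_speech_measure, enhance):
    """Gate a speech-enhancement stage with a speech detector.

    The detector output px is compared with the threshold v; the control
    signal s follows the convention of the description (s = 0: speech
    detected, enhance; s = 1: no speech, pass the block through unchanged).
    Sketch only; detector and enhancer are placeholders."""
    px = detect_speech_measure(audio_block)
    s = 0 if px > threshold_v else 1
    return enhance(audio_block) if s == 0 else audio_block
```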

If the speech detection introduces a time delay of the control signal s applied to the speech processing relative to the currently applied audio signal i, the audio signal i fed into the circuit arrangement or the method can optionally be delayed in accordance with the time delay of the speech detection.

This enables a circuit arrangement, a method or an algorithm that applies speech enhancement only to those parts of the audio signal that actually contain speech or that actually contain a certain speech component. The speech detection thus detects speech or separates it from the rest of the signal.

In reality, speech cannot be separated mathematically exactly from the other signal components of an audio signal. The aim is therefore to provide the best possible estimate. Even if the algorithms or circuit arrangements of the embodiments described below are misled by corresponding other signal components, first experiments show that an advantageous improvement of the output audio signal is nevertheless achieved. It is advantageous to ensure that the audio signal i is not distorted too much even in the case of a misdetection in the speech detector SD.

Fig. 2 shows a first embodiment of a speech detector SD. The input consists of two individual inputs, one for each audio signal component or audio signal channel L', R' of a stereo audio signal. The two audio signal components R', L' are each fed to a bandpass filter BP for band limiting. The output signals of the two bandpass filters BP are supplied to a correlator CR for performing a cross-correlation. Each of the two signals output by the bandpass filters BP is multiplied by itself in a multiplier M, i.e. squared, and then fed to an adder A. After the addition, a multiplication by the factor 0.5 is optionally carried out in a further multiplier M* in order to reduce the amplitude. The resulting sum signal, optionally scaled in this way, is fed to a first low-pass filter TP.

In addition, the output signals of the two bandpass filters BP are fed to the actual correlation circuit, in particular using a further multiplier M. The correlation signal L'·R' output therefrom is fed to a second low-pass filter TP.

The output signals b and a of the first and second low-pass filters TP are supplied to a divider DIV, which divides the output signal a of the second low-pass filter TP by the output signal b of the first low-pass filter TP. The division result of the divider DIV is provided as the control signal or as a preliminary stage D1 of the control signal s.
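
Assuming a block-wise digital implementation, the following sketch mirrors the structure of Fig. 2: band limiting (BP), squaring and summing with the optional factor 0.5, the cross product L'·R', low-pass smoothing (TP) and the final division to obtain D1. The Butterworth band edges, the filter order and the one-pole low-pass are illustrative assumptions, not values given in the patent.

```python
import numpy as np
from scipy.signal import butter, lfilter

def speech_measure_d1(left, right, fs, band=(300.0, 3400.0), tau_s=0.1):
    """D1 = 2*TP(L'*R') / TP(L'*L' + R'*R'), computed sample by sample.
    left, right: stereo channels L', R'; fs: sampling rate in Hz.
    band and tau_s (low-pass time constant TP) are assumed values."""
    # Band limiting (BP) of both channels.
    bp_b, bp_a = butter(2, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    lp = lfilter(bp_b, bp_a, left)
    rp = lfilter(bp_b, bp_a, right)

    # Simple one-pole low-pass as the smoothing stage TP.
    alpha = np.exp(-1.0 / (tau_s * fs))
    def smooth(x):
        return lfilter([1.0 - alpha], [1.0, -alpha], x)

    power = smooth(0.5 * (lp * lp + rp * rp))   # squaring, adder A, factor 0.5
    cross = smooth(lp * rp)                     # correlation path L'*R'
    return cross / np.maximum(power, 1e-12)     # divider DIV -> D1
```

With the threshold stage described above, D1 (or a smoothed version of it) would then be compared with the threshold v to derive the control signal s.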

With such a circuit arrangement or a corresponding processing method, a cross-correlation is performed. As audio signal i, a conventional stereo audio signal L', R' is usually composed of several audio signal components R, L, C, S. In the case of a multichannel audio signal, these components can also be provided separately.

In the case of a stereo audio signal L', R', the two audio signal channels L', R' can be described by

L' = L + C + S and
R' = R + C - S,

where L stands for a left signal component, C for a signal component coming centrally from the front, S for a surround signal component, i.e. a rear signal, and R for a right signal component.
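For experimentation, such channels can be synthesised from hypothetical test components; the white-noise stand-ins below are an assumption made purely for illustration:

import numpy as np

rng = np.random.default_rng(0)
fs = 48_000
n = fs  # one second of samples

L = rng.standard_normal(n)  # left component
R = rng.standard_normal(n)  # right component
C = rng.standard_normal(n)  # central component (where speech mainly lies)
S = rng.standard_normal(n)  # surround component

Lp = L + C + S  # L' = L + C + S
Rp = R + C - S  # R' = R + C - S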

Speech, or speech components px, is located mainly on the central channel, i.e. in the central component C. This fact can be used to detect the proportion of speech components px relative to the remaining signal content of the audio signal i. The speech component px contained in the audio signal i can be determined, relative to the remaining signal components, according to

px = 2 * RMS(C) / (RMS(L') + RMS(R')),

with RMS denoting the time-averaged amplitude.
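As a minimal sketch, and assuming the proportion is normalised by the sum of the two channel RMS values, this can be written as:

import numpy as np

def rms(x):
    # Time-averaged amplitude (root mean square) over the block.
    return np.sqrt(np.mean(x * x))

def speech_proportion(C, Lp, Rp):
    # px = 2*RMS(C) / (RMS(L') + RMS(R')); equals 1 for a pure centre signal.
    return 2.0 * rms(C) / (rms(Lp) + rms(Rp))

Since the central component C is not directly available in a stereo signal, the cross-correlation described next estimates the centre content from L' and R' alone.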

The proportion of the central component C can be determined by means of a cross-correlation:

L' * R' = L*R + L*C + R*C - L*S + R*S + C*C - S*S.

In the time average, all uncorrelated products become 0 for DC-free signals, i.e. for signal components without a DC component. The following criterion can therefore be used for the signal D1 output by the speech detector SD:

D1 = 2 * TP(L'*R') / TP(L'*L' + R'*R') = 2 * TP(C*C - S*S) / TP(L'*L' + R'*R'),

where TP denotes the low-pass filtering.

For the output signal D1, which can be used as a precursor to the control signal s or directly as the control signal s, this gives the value D1 = 1 if the audio signal i consists exclusively of a central component C. D1 = 0 results if the audio signal i consists exclusively of uncorrelated right and left signal components L, R, and D1 = -1 results if the audio signal i consists exclusively of surround components S. For a mixture of the different components, as is present in a real signal, D1 takes values between -1 and +1. The closer the output value D1 is to +1, the more centre-heavy the audio signal i or L', R' is, so that a correspondingly large speech component px can be inferred.
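These limiting cases can be checked numerically with a short block estimate of D1; the white-noise test components are again an assumption made only for the check:

import numpy as np

def d1(Lp, Rp):
    # Block estimate of D1 = 2*mean(L'*R') / mean(L'*L' + R'*R').
    return 2.0 * np.mean(Lp * Rp) / np.mean(Lp * Lp + Rp * Rp)

rng = np.random.default_rng(1)
C = rng.standard_normal(48_000)
S = rng.standard_normal(48_000)
L = rng.standard_normal(48_000)
R = rng.standard_normal(48_000)

print(d1(C, C))          # pure centre component:     approximately +1
print(d1(S, -S))         # pure surround component:   approximately -1
print(d1(L, R))          # uncorrelated left/right:   approximately 0
print(d1(L + C, R + C))  # mixture: a value between 0 and +1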

The time constant of the low-pass filter TP can be in the range of approximately 100 ms if a very fast response to changing signal components is desired. However, it can be extended to up to several minutes if a very slow response of the speech detector SD is desired. The time constant of the low-pass filter is therefore advantageously a variable parameter. Before a detection algorithm is carried out, DC components are expediently filtered out by means of a suitable filter, in particular a DC notch filter. The further band limitation is optional.
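A possible realisation, assuming a one-pole smoothing filter and a standard first-order DC blocker rather than any specific filter topology from the patent, is:

import numpy as np

def lowpass_coefficient(fs, tau):
    # Smoothing coefficient for a time constant tau, e.g. 0.1 s up to several minutes.
    return np.exp(-1.0 / (fs * tau))

def dc_notch(x, r=0.995):
    # DC blocker: y[n] = x[n] - x[n-1] + r*y[n-1] (zero at 0 Hz, pole just inside).
    y = np.empty_like(x, dtype=float)
    prev_x = 0.0
    prev_y = 0.0
    for n, v in enumerate(x):
        prev_y = v - prev_x + r * prev_y
        prev_x = v
        y[n] = prev_y
    return y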

Optionally, the circuit arrangements or methods described for providing the control signal s can be followed by a further stage in which a threshold value v is defined, which the output signal D1 of the described arrangements or methods must exceed in order to switch the control signal s into an active state.
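A minimal sketch of such a stage, with an arbitrarily chosen threshold value, could look like this:

def control_signal(d1_value, v=0.3):
    # Activate the control signal s only when the detector output exceeds v.
    return d1_value > v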

In parallel or subsequent speech signal processing of the audio signal i, the goal is to route as many of the signal components that contain speech px as possible through a speech enhancement algorithm and to leave the remaining signal components unchanged, as also described with reference to Fig. 1. This is advantageously achieved by a matrix, as outlined with reference to Fig. 3. Matrix coefficients k1, k2, ..., k6 are determined as a function of the detected speech component px, i.e. as a function of the output value or output signal D1, D2 provided by the speech detector SD, for example as px = F(D1, D2).
The actual speech enhancement algorithm, or an actual speech enhancement device, can be provided in a manner known per se. For example, the simple frequency response correction described in DE 101 24 699 C1, which is incorporated by reference in its entirety, can be used. However, any other algorithms and devices for improving speech intelligibility can also be employed.

In the matrix calculation shown in Fig. 3, the input components or input channels L', R' of the audio signal i are each multiplied by three coefficients, k1, k3, k5 and k2, k4, k6 respectively, and fed to adders. The first adder A receives the signal of the first channel L' multiplied by the first coefficient k1 and the signal of the second channel R' multiplied by the second coefficient k2. The second adder A receives the signal of the first channel L' multiplied by the third coefficient k3 and the signal of the second channel R' multiplied by the fourth coefficient k4. The third adder A receives the signal of the first channel L' multiplied by the fifth coefficient k5 and the signal of the second channel R' multiplied by the sixth coefficient k6. The output of the second adder A is fed to a speech enhancement circuit VS or to a speech enhancement method or algorithm. Its output is added, by means of further adders A, to the output of the first adder A to provide a first output channel LE, and to the output of the third adder A to provide a second output channel RE.
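The routing can be sketched as follows; the speech enhancement block VS is only a placeholder here, since the actual algorithm (for example the frequency response correction of DE 101 24 699 C1) is not reproduced:

def enhance_speech(x):
    # Placeholder for the speech enhancement block VS; a simple gain as a stand-in.
    return 1.5 * x

def matrix_process(Lp, Rp, k):
    # k = (k1, k2, k3, k4, k5, k6), applied as in Fig. 3.
    k1, k2, k3, k4, k5, k6 = k
    dry_left = k1 * Lp + k2 * Rp    # first adder A
    mid = k3 * Lp + k4 * Rp         # second adder A, routed through VS
    dry_right = k5 * Lp + k6 * Rp   # third adder A
    wet = enhance_speech(mid)
    LE = dry_left + wet             # first output channel
    RE = dry_right + wet            # second output channel
    return LE, RE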

For the determination of the coefficients it is taken into account, for example, that the speech component px determined by the described methods lies in a range of values of in particular 0 ≤ px ≤ 1 and is given as a function of the detected speech components by px = F(D1, D2, D3). According to a simple variant, the coefficients can be set as

k1 = k6 = 1 - px/2,
k2 = k5 = -px/2 and
k3 = k4 = px/2.
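A sketch of this simple variant; the two printed cases illustrate that px = 0 leaves the signal untouched, while px = 1 routes the mid signal (L' + R')/2 through the enhancement path:

def coefficients(px):
    # Simple variant: k1 = k6 = 1 - px/2, k2 = k5 = -px/2, k3 = k4 = px/2.
    k1 = k6 = 1.0 - px / 2.0
    k2 = k5 = -px / 2.0
    k3 = k4 = px / 2.0
    return k1, k2, k3, k4, k5, k6

print(coefficients(0.0))  # px = 0: k1 = k6 = 1, the rest 0 -> pass-through, nothing enters VS
print(coefficients(1.0))  # px = 1: (0.5, -0.5, 0.5, 0.5, -0.5, 0.5) -> (L'+R')/2 enters VS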

The two finally output signal channels or components LE, RE correspond to the processed signals, which are supplied to the output O for the processed audio signal o.

Fig. 4 shows, by way of example, the function F(D1, D2=0, D3=0). In the case of the first function shown, F = F1(D1), the circuit arrangement already responds to a small detected speech component. The probability of a false detection is relatively high for small values of D1. However, owing to the continuous course of the first function F1(D1), the effect of the speech algorithm on the audio signal is relatively small for small D1, so that any impairment of the audio signal is barely perceptible.

In the case of a second function F2(D1), the audio signal remains completely unaffected up to a threshold value v = Ps2. Above this threshold, changes in the value of D1 have a correspondingly greater effect on the audio signal.

In the case of a third function F = F3(D1), the algorithm is switched on when a certain threshold value v = Ps31 is exceeded and switched off when the value falls below another, lower threshold value v = Ps32. Incorporating such a hysteresis prevents constant switching back and forth in the transition region.
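The three characteristic curves of Fig. 4 can be sketched as follows; the concrete threshold values Ps2, Ps31, Ps32 and the linear slopes are assumptions, since the description does not fix them:

def f1(d1):
    # Continuous curve: responds even to small detected speech components.
    return min(max(d1, 0.0), 1.0)

def f2(d1, ps2=0.4):
    # No effect up to the threshold Ps2, then rising towards 1.
    if d1 <= ps2:
        return 0.0
    return min((d1 - ps2) / (1.0 - ps2), 1.0)

class F3:
    # Hysteresis: switch on above Ps31, switch off only below the lower Ps32.
    def __init__(self, ps31=0.5, ps32=0.3):
        self.ps31 = ps31
        self.ps32 = ps32
        self.active = False

    def __call__(self, d1):
        if not self.active and d1 > self.ps31:
            self.active = True
        elif self.active and d1 < self.ps32:
            self.active = False
        return 1.0 if self.active else 0.0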

Claims (17)

  1. A circuit configuration for enhancing the intelligibility of multi-component audio signals (i) possibly containing speech (px), comprising
    - an input (I) for inputting such an audio signal (i),
    - a speech detector (SD) for detecting speech (px) in the inputted audio signal (i) and for providing a control signal (s) to control a speech processing device (SV) and/or a speech processing method for processing the audio signal (i),
    characterised in that the speech detector (SD) comprises a correlation device (CR) for performing a cross-correlation or an autocorrelation of components of the audio signal, and in that a matrix decoder is provided, which separates signal components of the audio signal by a matrix (MX).
  2. A circuit configuration according to Claim 1,
    in which the speech detector (SD) is designed and/or controlled to detect speech components (px) in the audio signal (i).
  3. A circuit configuration according to a preceding claim,
    in which the speech detector (SD)
    - is designed to process a multi-component audio signal (i), in particular stereo audio signals (L', R'), 3D stereo audio signals (L, R, C) and/or surround audio signals (L, R, C, S), having several audio signal components (L, R, C, S) and
    - comprises a processing device (CR) for detecting speech by comparing or processing the components (L, R, C, S) with each other.
  4. A circuit configuration according to Claim 3,
    in which the speech detector (SD) comprises a direction and/or distance determination device for determining a direction and/or distance of common signal components of the different components (L, R, C, S).
  5. A circuit configuration according to a preceding Claim,
    in which the control signal (s) for activating or deactivating the speech enhancement device (SV) and/or the speech enhancement method is designed and/or controlled in dependence on the speech content of the audio signal (i).
  6. A method for processing multi-component audio signals (i) possibly containing speech,
    in which
    - speech or speech components (px) contained in an audio signal (i) are detected and
    - depending on the result of the detection, a control signal (s) for a speech processing device (SV) and/or a speech processing method for speech enhancement is generated and provided,
    - a cross-correlation or autocorrelation of components (R, L, C, S) of the audio signal (i) is performed and
    - whereby signal components of the audio signal are separated by a matrix.
  7. A method according to Claim 6,
    in which the control signal (s) is generated in dependence upon the range of detected speech components (px).
  8. A method according to Claim 7,
    in which the range of detected speech components (px) is compared to a threshold value (v).
  9. A method according to one of Claims 6 to 8,
    in which the detection is carried out adjustably with regard to a range of speech components to be detected and/or with regard to a frequency range of the speech components to be detected (px) by means of variable parameters (v).
  10. A method according to one of Claims 6 to 9,
    in which the audio signal components of a multi-component audio signal having several audio signal components (R, L, C, S) are compared to each other or processed with each other for the detection of speech.
  11. A method according to Claim 10,
    in which the audio signal components (R, L, C, S) are compared or processed with respect to common speech components in the different audio signal components, especially to determine a direction and/or distance of the common signal components.
  12. A method according to one of Claims 6 to 11,
    in which the control signal (s) is provided to activate or deactivate the speech enhancement device (SV) and/or the speech enhancement method.
  13. A circuit configuration according to one of Claims 1 to 5 and/or a method according to one of Claims 6 to 12,
    wherein a frequency response is determined by means of an FIR or an IIR filter (FIR: Finite Impulse Response, IIR: Infinite Impulse Response).
  14. A circuit configuration according to one of Claims 1 to 5 and/or a method according to one of Claims 6 to 12,
    wherein matrix coefficients for a matrix (MX) are determined via a function (P = F(px)) dependent on the speech component (px).
  15. A circuit configuration and/or method according to Claim 14,
    wherein the function (P = F(px)) is linear and constant.
  16. A circuit configuration and/or a method according to Claim 14, wherein the function (P = F(px)) has a hysteresis.
  17. A speech enhancement circuit configuration or method having a circuit configuration and/or a method according to one of the preceding claims.
EP05019316A 2004-10-08 2005-09-06 Intelligibility enhancement of audio signals containing speech Not-in-force EP1647972B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
DE102004049347A DE102004049347A1 (en) 2004-10-08 2004-10-08 Circuit arrangement or method for speech-containing audio signals

Publications (3)

Publication Number Publication Date
EP1647972A2 EP1647972A2 (en) 2006-04-19
EP1647972A3 EP1647972A3 (en) 2006-07-12
EP1647972B1 true EP1647972B1 (en) 2008-03-26

Family

ID=35812768

Family Applications (1)

Application Number Title Priority Date Filing Date
EP05019316A Not-in-force EP1647972B1 (en) 2004-10-08 2005-09-06 Intelligibility enhancement of audio signals containing speech

Country Status (6)

Country Link
US (1) US8005672B2 (en)
EP (1) EP1647972B1 (en)
JP (1) JP2006323336A (en)
KR (1) KR100804881B1 (en)
AT (1) ATE390684T1 (en)
DE (2) DE102004049347A1 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1691348A1 (en) * 2005-02-14 2006-08-16 Ecole Polytechnique Federale De Lausanne Parametric joint-coding of audio sources
US7970564B2 (en) * 2006-05-02 2011-06-28 Qualcomm Incorporated Enhancement techniques for blind source separation (BSS)
US8954324B2 (en) * 2007-09-28 2015-02-10 Qualcomm Incorporated Multiple microphone voice activity detector
US8175871B2 (en) * 2007-09-28 2012-05-08 Qualcomm Incorporated Apparatus and method of noise and echo reduction in multiple microphone audio systems
KR101349268B1 (en) * 2007-10-16 2014-01-15 삼성전자주식회사 Method and apparatus for mesuring sound source distance using microphone array
US8204235B2 (en) * 2007-11-30 2012-06-19 Pioneer Corporation Center channel positioning apparatus
US8223988B2 (en) * 2008-01-29 2012-07-17 Qualcomm Incorporated Enhanced blind source separation algorithm for highly correlated mixtures
EP2211564B1 (en) 2009-01-23 2014-09-10 Harman Becker Automotive Systems GmbH Passenger compartment communication system
JP5622744B2 (en) * 2009-11-06 2014-11-12 株式会社東芝 Voice recognition device
TWI459828B (en) * 2010-03-08 2014-11-01 Dolby Lab Licensing Corp Method and system for scaling ducking of speech-relevant channels in multi-channel audio
US9569439B2 (en) 2011-10-31 2017-02-14 Elwha Llc Context-sensitive query enrichment
JP2013135325A (en) * 2011-12-26 2013-07-08 Fuji Xerox Co Ltd Voice analysis device
JP5867066B2 (en) * 2011-12-26 2016-02-24 富士ゼロックス株式会社 Speech analyzer
JP6031761B2 (en) * 2011-12-28 2016-11-24 富士ゼロックス株式会社 Speech analysis apparatus and speech analysis system
US10528913B2 (en) 2011-12-30 2020-01-07 Elwha Llc Evidence-based healthcare information management protocols
US10475142B2 (en) 2011-12-30 2019-11-12 Elwha Llc Evidence-based healthcare information management protocols
US10552581B2 (en) 2011-12-30 2020-02-04 Elwha Llc Evidence-based healthcare information management protocols
US10340034B2 (en) 2011-12-30 2019-07-02 Elwha Llc Evidence-based healthcare information management protocols
US10559380B2 (en) 2011-12-30 2020-02-11 Elwha Llc Evidence-based healthcare information management protocols
US20130173295A1 (en) 2011-12-30 2013-07-04 Elwha LLC, a limited liability company of the State of Delaware Evidence-based healthcare information management protocols
US10679309B2 (en) 2011-12-30 2020-06-09 Elwha Llc Evidence-based healthcare information management protocols
WO2014138489A1 (en) * 2013-03-07 2014-09-12 Tiskerling Dynamics Llc Room and program responsive loudspeaker system
KR101808810B1 (en) * 2013-11-27 2017-12-14 한국전자통신연구원 Method and apparatus for detecting speech/non-speech section
US20210201937A1 (en) * 2019-12-31 2021-07-01 Texas Instruments Incorporated Adaptive detection threshold for non-stationary signals in noise
CN111292716A (en) * 2020-02-13 2020-06-16 百度在线网络技术(北京)有限公司 Voice chip and electronic equipment

Family Cites Families (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4410763A (en) * 1981-06-09 1983-10-18 Northern Telecom Limited Speech detector
US4698842A (en) * 1985-07-11 1987-10-06 Electronic Engineering And Manufacturing, Inc. Audio processing system for restoring bass frequencies
US5251263A (en) * 1992-05-22 1993-10-05 Andrea Electronics Corporation Adaptive noise cancellation and speech enhancement system and apparatus therefor
WO1994007341A1 (en) 1992-09-11 1994-03-31 Hyman Goldberg Electroacoustic speech intelligibility enhancement method and apparatus
US5479560A (en) 1992-10-30 1995-12-26 Technology Research Association Of Medical And Welfare Apparatus Formant detecting device and speech processing apparatus
JPH06332492A (en) * 1993-05-19 1994-12-02 Matsushita Electric Ind Co Ltd Method and device for voice detection
BE1007355A3 (en) * 1993-07-26 1995-05-23 Philips Electronics Nv Voice signal circuit discrimination and an audio device with such circuit.
GB2303471B (en) * 1995-07-19 2000-03-22 Olympus Optical Co Voice activated recording apparatus
JPH0990974A (en) * 1995-09-25 1997-04-04 Nippon Telegr & Teleph Corp <Ntt> Signal processor
FI100840B (en) * 1995-12-12 1998-02-27 Nokia Mobile Phones Ltd Noise attenuator and method for attenuating background noise from noisy speech and a mobile station
US5774849A (en) * 1996-01-22 1998-06-30 Rockwell International Corporation Method and apparatus for generating frame voicing decisions of an incoming speech signal
JP3522954B2 (en) * 1996-03-15 2004-04-26 株式会社東芝 Microphone array input type speech recognition apparatus and method
WO1998006091A1 (en) * 1996-08-02 1998-02-12 Matsushita Electric Industrial Co., Ltd. Voice encoder, voice decoder, recording medium on which program for realizing voice encoding/decoding is recorded and mobile communication apparatus
US6130949A (en) * 1996-09-18 2000-10-10 Nippon Telegraph And Telephone Corporation Method and apparatus for separation of source, program recorded medium therefor, method and apparatus for detection of sound source zone, and program recorded medium therefor
US6230122B1 (en) * 1998-09-09 2001-05-08 Sony Corporation Speech detection with noise suppression based on principal components analysis
US6216103B1 (en) * 1997-10-20 2001-04-10 Sony Corporation Method for implementing a speech recognition system to determine speech endpoints during conditions with background noise
US6381569B1 (en) * 1998-02-04 2002-04-30 Qualcomm Incorporated Noise-compensated speech recognition templates
US6415253B1 (en) * 1998-02-20 2002-07-02 Meta-C Corporation Method and apparatus for enhancing noise-corrupted speech
US6618701B2 (en) * 1999-04-19 2003-09-09 Motorola, Inc. Method and system for noise suppression using external voice activity detection
JP4091244B2 (en) * 2000-11-08 2008-05-28 日産自動車株式会社 Audio playback device
US6889187B2 (en) * 2000-12-28 2005-05-03 Nortel Networks Limited Method and apparatus for improved voice activity detection in a packet voice network
US6952672B2 (en) * 2001-04-25 2005-10-04 International Business Machines Corporation Audio source position detection and audio adjustment
US7236929B2 (en) * 2001-05-09 2007-06-26 Plantronics, Inc. Echo suppression and speech detection techniques for telephony applications
US7158933B2 (en) * 2001-05-11 2007-01-02 Siemens Corporate Research, Inc. Multi-channel speech enhancement system and method based on psychoacoustic masking effects
DE10124699C1 (en) * 2001-05-18 2002-12-19 Micronas Gmbh Circuit arrangement for improving the intelligibility of speech-containing audio signals
FR2825826B1 (en) * 2001-06-11 2003-09-12 Cit Alcatel METHOD FOR DETECTING VOICE ACTIVITY IN A SIGNAL, AND ENCODER OF VOICE SIGNAL INCLUDING A DEVICE FOR IMPLEMENTING THIS PROCESS
JP2005502247A (en) * 2001-09-06 2005-01-20 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Audio playback device
JP2003084790A (en) * 2001-09-17 2003-03-19 Matsushita Electric Ind Co Ltd Speech component emphasizing device
US7299173B2 (en) * 2002-01-30 2007-11-20 Motorola Inc. Method and apparatus for speech detection using time-frequency variance
US7167568B2 (en) * 2002-05-02 2007-01-23 Microsoft Corporation Microphone array signal enhancement
US20040078199A1 (en) * 2002-08-20 2004-04-22 Hanoh Kremer Method for auditory based noise reduction and an apparatus for auditory based noise reduction
US7372848B2 (en) * 2002-10-11 2008-05-13 Agilent Technologies, Inc. Dynamically controlled packet filtering with correlation to signaling protocols
US7174022B1 (en) * 2002-11-15 2007-02-06 Fortemedia, Inc. Small array microphone for beam-forming and noise suppression
EP1592282B1 (en) * 2003-02-07 2007-06-13 Nippon Telegraph and Telephone Corporation Teleconferencing method and system
JP4480335B2 (en) * 2003-03-03 2010-06-16 パイオニア株式会社 Multi-channel audio signal processing circuit, processing program, and playback apparatus
US7343284B1 (en) * 2003-07-17 2008-03-11 Nortel Networks Limited Method and system for speech processing for enhancement and detection
CA2454296A1 (en) * 2003-12-29 2005-06-29 Nokia Corporation Method and device for speech enhancement in the presence of background noise
KR200434705Y1 (en) 2006-09-28 2006-12-26 김학무 Folding type drawing board easel

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5430826A (en) * 1992-10-13 1995-07-04 Harris Corporation Voice-activated switch

Also Published As

Publication number Publication date
EP1647972A2 (en) 2006-04-19
JP2006323336A (en) 2006-11-30
DE102004049347A1 (en) 2006-04-20
KR100804881B1 (en) 2008-02-20
DE502005003436D1 (en) 2008-05-08
EP1647972A3 (en) 2006-07-12
KR20060052101A (en) 2006-05-19
ATE390684T1 (en) 2008-04-15
US8005672B2 (en) 2011-08-23
US20060080089A1 (en) 2006-04-13

Similar Documents

Publication Publication Date Title
EP1647972B1 (en) Intelligibility enhancement of audio signals containing speech
DE19703228B4 (en) Method for amplifying input signals of a hearing aid and circuit for carrying out the method
DE69124005T2 (en) Speech signal processing device
DE112012000052B4 (en) Method and device for eliminating wind noise
DE112009000805B4 (en) noise reduction
DE102005032724B4 (en) Method and device for artificially expanding the bandwidth of speech signals
DE69915711T2 (en) METHOD AND SIGNAL PROCESSOR FOR GAINING LANGUAGE SIGNAL COMPONENTS IN A HEARING AID
DE102006051071B4 (en) Level-dependent noise reduction
EP1386307B1 (en) Method and device for determining a quality measure for an audio signal
DE2626793B2 (en) Electrical circuitry for determining the voiced or unvoiced state of a speech signal
EP1247425B1 (en) Method for operating a hearing-aid and a hearing aid
DE102015207706B3 (en) Method for frequency-dependent noise suppression of an input signal
DE69130687T2 (en) Speech signal processing device for cutting out a speech signal from a noisy speech signal
EP1101390B1 (en) Hearing aid having an improved speech intelligibility by means of frequency selective signal processing, and a method for operating such a hearing aid
EP3588498B1 (en) Method for suppressing an acoustic reverberation in an audio signal
EP1052881B1 (en) Hearing aid with oscillation detector and method for detecting oscillations in a hearing aid
EP1453355B1 (en) Signal processing in a hearing aid
WO2001047335A2 (en) Method for the elimination of noise signal components in an input signal for an auditory system, use of said method and a hearing aid
EP1755110A2 (en) Method and device for adaptive reduction of noise signals and background signals in a speech processing system
EP2394271B1 (en) Method for separating signal paths and use for improving speech using electric larynx
DE10025655B4 (en) A method of removing an unwanted component of a signal and system for distinguishing between unwanted and desired signal components
DE102004044565B4 (en) Method for limiting the dynamic range of audio signals and circuitry therefor
EP1130577B1 (en) Method for the reconstruction of low speech frequencies from mid-range frequencies
EP1348315B1 (en) Method for use of a hearing-aid and corresponding hearing aid
DE4445983A1 (en) Noise suppression system using spectral subtraction method

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK YU

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK YU

17P Request for examination filed

Effective date: 20060802

17Q First examination report despatched

Effective date: 20060901

AKX Designation fees paid

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Free format text: LANGUAGE OF EP DOCUMENT: GERMAN

Ref country code: CH

Ref legal event code: EP

REF Corresponds to:

Ref document number: 502005003436

Country of ref document: DE

Date of ref document: 20080508

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

REG Reference to a national code

Ref country code: IE

Ref legal event code: FD4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080626

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080707

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080901

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080726

ET Fr: translation filed
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

Ref country code: IE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20081230

BERE Be: lapsed

Owner name: MICRONAS G.M.B.H.

Effective date: 20080930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080626

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: AT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080906

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080906

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080927

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080326

REG Reference to a national code

Ref country code: NL

Ref legal event code: SD

Effective date: 20101011

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20090930

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080627

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20090930

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20101125 AND 20101201

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 502005003436

Country of ref document: DE

Owner name: ENTROPIC COMMUNICATIONS, INC., US

Free format text: FORMER OWNER: MICRONAS GMBH, 79108 FREIBURG, DE

Effective date: 20110210

REG Reference to a national code

Ref country code: DE

Ref legal event code: R084

Ref document number: 502005003436

Country of ref document: DE

Effective date: 20110426

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 502005003436

Country of ref document: DE

Representative=s name: EPPING HERMANN FISCHER, PATENTANWALTSGESELLSCH, DE

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20120925

Year of fee payment: 8

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 502005003436

Country of ref document: DE

Representative=s name: EPPING HERMANN FISCHER, PATENTANWALTSGESELLSCH, DE

Effective date: 20121023

Ref country code: DE

Ref legal event code: R081

Ref document number: 502005003436

Country of ref document: DE

Owner name: ENTROPIC COMMUNICATIONS, INC., US

Free format text: FORMER OWNER: TRIDENT MICROSYSTEMS (FAR EAST) LTD., GRAND CAYMAN, KY

Effective date: 20121023

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20120927

Year of fee payment: 8

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 20120924

Year of fee payment: 8

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20130919

Year of fee payment: 9

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20131107 AND 20131113

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

Owner name: ENTROPIC COMMUNICATIONS, INC., US

Effective date: 20131119

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20130923

Year of fee payment: 9

REG Reference to a national code

Ref country code: NL

Ref legal event code: V1

Effective date: 20140401

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20130906

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 502005003436

Country of ref document: DE

Effective date: 20140401

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20130906

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140401

Ref country code: NL

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140401

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20150529

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140906

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140930