WO2021100094A1 - Sound source signal estimation device and method, and program - Google Patents

Sound source signal estimation device and method, and program

Info

Publication number
WO2021100094A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound source
signal
source signal
mth
nth
Prior art date
Application number
PCT/JP2019/045120
Other languages
English (en)
Japanese (ja)
Inventor
江村 暁
Original Assignee
日本電信電話株式会社 (Nippon Telegraph and Telephone Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社 (Nippon Telegraph and Telephone Corporation)
Priority to PCT/JP2019/045120 priority Critical patent/WO2021100094A1/fr
Priority to PCT/JP2020/006968 priority patent/WO2021100215A1/fr
Publication of WO2021100094A1 publication Critical patent/WO2021100094A1/fr

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/0308 Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones

Definitions

  • the present invention relates to a technique for estimating a sound source signal.
  • the nth picked-up signal is denoted y_n(k) (where k represents the time).
  • h_{n,m} are mixing coefficients.
  • in the instantaneous mixing model, the mixing coefficients h_{n,m} are scalars.
  • the signal from the mth sound source is separated by multiplying the nth picked-up signal y_n(k) by the separation coefficient w_{m,n} and taking the sum over n, that is, ^s_m(k) = Σ_n w_{m,n} y_n(k).
  • the separation coefficients w_{m,n} are updated so that the separated sound source signals become statistically more independent.
  • the Natural Gradient method and FastICA are known as such update methods.
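As an illustration of the Natural Gradient update mentioned above, the following is a minimal numpy sketch for the instantaneous mixing model. All of it is synthetic: the Laplacian sources, the mixing matrix H, the step size mu, the iteration count, and the tanh score nonlinearity are illustrative choices, not values taken from this publication.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two super-Gaussian (Laplacian) sources, instantaneously mixed:
# y_n(k) = sum_m h_{n,m} s_m(k), with scalar mixing coefficients h_{n,m}.
K = 20000
s = rng.laplace(size=(2, K))
H = np.array([[1.0, 0.6],
              [0.4, 1.0]])
y = H @ s

# Natural Gradient update: W <- W + mu * (I - E[phi(x) x^T]) W,
# where x = W y and phi is a score nonlinearity (tanh for super-Gaussian sources).
W = np.eye(2)
mu = 0.1
for _ in range(500):
    x = W @ y
    phi = np.tanh(x)
    W += mu * (np.eye(2) - (phi @ x.T) / K) @ W

x = W @ y  # separated signals ^s_m(k), up to permutation and scaling

# Correlation of each separated signal with each true source.
corr = np.abs(np.corrcoef(np.vstack([x, s]))[:2, 2:])
```

After convergence, each separated signal correlates strongly with exactly one source, up to the permutation and scaling ambiguities inherent to ICA.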
  • in the convolutive mixing model, the nth picked-up signal y_n(k) is obtained by convolving each sound source signal with the corresponding acoustic path, y_n(k) = Σ_m Σ_p h_{n,m}(p) s_m(k-p).
  • h_{n,m}(p) is the impulse response of the acoustic path from the mth sound source to the nth microphone.
  • P is the length of the impulse response of the acoustic path.
  • Q is the filter length of the FIR separation filter.
  • since P reaches several thousand samples in a reverberant room, the filter length Q of the FIR filter must also be several thousand. Therefore, BSS under the convolutive mixing model is computationally much more difficult than BSS under the instantaneous mixing model.
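The convolutive mixing model just described can be sketched directly with numpy. The sizes below are toy values for illustration (real rooms have P in the thousands, as noted above), and the random impulse responses stand in for measured acoustic paths.

```python
import numpy as np

rng = np.random.default_rng(1)

M, N = 2, 2          # sound sources and microphones
K = 1000             # number of time samples
P = 64               # impulse-response length (several thousand in real rooms)

s = rng.standard_normal((M, K))           # source signals s_m(k)
h = rng.standard_normal((N, M, P)) * 0.1  # impulse responses h_{n,m}(p)

# Convolutive mixing: y_n(k) = sum_m sum_p h_{n,m}(p) s_m(k - p)
y = np.zeros((N, K))
for n in range(N):
    for m in range(M):
        y[n] += np.convolve(s[m], h[n, m])[:K]
```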
  • the frequency domain processing approach is usually applied to BSS in the convolution mixed model.
  • the Short-Time Fourier Transform (STFT) is used to convert the signals into the time-frequency domain.
  • f is the frame number when the signal is framed by the STFT.
  • ω is the frequency.
  • S_m(f, ω) is the mth sound source signal obtained by converting s_m(k) into the frequency domain.
  • H_{n,m}(ω) is the transfer function of the acoustic path from the mth sound source to the nth microphone, obtained by converting h_{n,m}(p) into the frequency domain.
  • Y_n(f, ω) is the nth picked-up signal obtained by converting y_n(k) into the frequency domain.
  • ⋅^T represents transpose.
  • the separation filter W(ω) can be updated by applying the above-mentioned Natural Gradient method or FastICA as-is at each frequency. Such an approach is therefore called Frequency-Domain ICA (FDICA).
  • each frequency is processed individually, so there are two problems.
  • the first problem is called a scaling problem, in which each sound source signal is estimated with a different gain at each frequency.
  • the second problem is called the permutation problem, in which the sound sources are estimated in a different order at each frequency.
  • the scaling problem is solved by recovering the sound source signal component observed at the microphone position, using the transfer characteristics between the estimated sound source signal and the signal picked up by the microphone; the permutation problem is solved by clustering the activity sequences obtained from the estimated sound source signals (see Non-Patent Document 1).
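The scaling fix can be sketched as "projection back": rescale each separated signal by the corresponding entry of the inverse of the separation matrix, so that it represents the component observed at a reference microphone. The toy single-bin data below is synthetic, and microphone 0 as reference is an arbitrary choice; the permutation fix (activity clustering) is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)

# One frequency bin. W is the separation matrix FDICA produced there;
# its rows carry arbitrary complex gains (the scaling problem).
W = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
Y = rng.standard_normal((2, 50)) + 1j * rng.standard_normal((2, 50))
S_hat = W @ Y                        # separated spectra with arbitrary scaling

# Projection back: rescale each separated signal so that it represents the
# source component as observed at a reference microphone (microphone 0 here).
A = np.linalg.inv(W)                 # estimated transfer (mixing) matrix
S_pb = A[0, :][:, None] * S_hat

# Consistency check: the rescaled components sum back to the reference
# microphone's observed spectrum.
recon = S_pb.sum(axis=0)
```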
  • the mth element ^S_m(f, ω) of the estimated sound source signal vector ^s(f, ω) is called the mth separated sound source signal. For simplicity, ω will hereafter be omitted.
  • a crosstalk component of the signals from the other sound sources remains mixed in each separated sound source signal, and its influence becomes large when the reverberation time is not short.
  • here, the crosstalk component from another sound source is the direct signal from that sound source or its reverberation.
  • methods for suppressing this crosstalk component are described in Non-Patent Document 2 and Non-Patent Document 3. These methods use, for example, a model in which a small amount of signal derived from the second sound source is mixed into the first separated sound source signal ^S_1(f), such as ^S_1(f) ≈ S_1(f) + ε_{1,2} S_2(f).
  • ε_{1,2} is a coefficient indicating the degree to which the crosstalk component of the signal from the second sound source is mixed into the first separated sound source signal ^S_1(f).
  • ⋅^* represents the complex conjugate.
  • ε_{1,2} is estimated from the separated signals, where E[⋅] represents the expected value.
  • the first estimated sound source signal ~S_1(f), in which the crosstalk component is suppressed, can then be obtained with a Wiener filter η_1(f), as ~S_1(f) = η_1(f)^S_1(f).
  • λ (0 < λ < 1) is a forgetting constant for smoothing.
  • however, since the methods of Non-Patent Document 2 and Non-Patent Document 3 act only on the amplitude component at each frequency and ignore the phase of the crosstalk component, musical noise is likely to occur and sound quality is prone to deterioration.
  • an object of the present invention is to provide a sound source signal estimation technique capable of suppressing sound quality deterioration by removing a crosstalk component in consideration of both an amplitude component and a phase component.
  • one aspect of the present invention uses the mth separated sound source signal ^S_m(f, ω) (m = 1, ..., M), a signal obtained by separating the mth sound source signal S_m(f, ω), which is the frequency-domain representation of the mth sound source signal s_m(k).
  • β_{m,m'}(ω) (1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m) is a coefficient indicating the degree to which the crosstalk component of the signal from the m'th sound source is mixed into the mth separated sound source signal ^S_m(f, ω); the coefficients are obtained by solving an optimization problem over the pairs (m, m') satisfying 1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m.
  • another aspect likewise uses the mth separated sound source signal ^S_m(f, ω) (m = 1, ..., M) obtained by separating the mth sound source signal S_m(f, ω),
  • and includes a crosstalk component remover that generates the mth estimated sound source signal ~S_m(f, ω) (m = 1, ..., M), where D is an integer greater than or equal to 1 and β_{m,m',d}(ω) (1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m, 0 ≤ d ≤ D) are the corresponding coefficients.
  • according to the present invention, it is possible to suppress deterioration of sound quality by estimating the sound source signals while removing the crosstalk component in consideration of both the amplitude component and the phase component.
  • x^{y_z} means that y_z is a superscript for x.
  • x_{y_z} means that y_z is a subscript for x.
  • Step 1: STFT transform
  • Step 2: Sound source separation
  • Step 3: Removal of crosstalk components
  • ⁇ 1 , 2 ( ⁇ ) is a coefficient indicating the degree to which the crosstalk component of the signal from the second sound source is mixed in the first separated sound source signal ⁇ S 1 (f, ⁇ ).
  • the coefficient is determined so as to improve the estimation accuracy, and the first estimated sound source signal ~S_1(f, ω) is obtained by subtracting the crosstalk estimate, for example as ~S_1(f, ω) = ^S_1(f, ω) - β_{1,2}(ω)^S_2(f-1, ω).
  • the second estimated sound source signal ⁇ S 2 (f, ⁇ ) can also be obtained.
  • for general M, the mth estimated sound source signal ~S_m(f, ω) is calculated, for example, as ~S_m(f, ω) = ^S_m(f, ω) - Σ_{m'≠m} β_{m,m'}(ω)^S_{m'}(f-1, ω).
  • the coefficients can be obtained by solving the optimization problem using, for example, the Alternating Direction Method of Multipliers (ADMM).
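The coefficient optimization with a sparsity criterion can be sketched with ADMM. The sketch below is an assumption-laden reduction for one frequency bin and one interfering source: it poses the problem as least absolute deviations, minimize ||b - A·beta||_1, where b stacks the separated signal over L frames and A stacks the other source's previous-frame spectra, so that the residual (the estimated source) comes out sparse. The patent's exact objective is not reproduced here; all data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)

def soft(x, t):
    """Complex soft-thresholding: shrink the magnitude of x by t."""
    mag = np.abs(x)
    return np.where(mag > t, (1 - t / np.maximum(mag, 1e-12)) * x, 0.0)

# One frequency bin, L frames, J = M - 1 = 1 interfering source.
L, J = 64, 1
A = rng.standard_normal((L, J)) + 1j * rng.standard_normal((L, J))
beta_true = np.array([0.4 - 0.2j])
target = soft(rng.standard_normal(L) + 1j * rng.standard_normal(L), 1.2)  # sparse
b = A @ beta_true + target          # separated signal = sparse source + crosstalk

# ADMM for least absolute deviations, with the splitting z = A beta - b.
rho = 1.0
beta = np.zeros(J, dtype=complex)
z = np.zeros(L, dtype=complex)
u = np.zeros(L, dtype=complex)
AhA_inv = np.linalg.inv(A.conj().T @ A)
for _ in range(200):
    beta = AhA_inv @ (A.conj().T @ (b + z - u))  # quadratic subproblem
    r = A @ beta - b
    z = soft(r + u, 1.0 / rho)                   # prox of the L1 term
    u = u + r - z                                # dual update
```

Because roughly half the residual entries are exactly zero, the L1 objective pins beta near the true crosstalk coefficient, and subtracting A·beta removes the crosstalk in both amplitude and phase.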
  • Step 4: Inverse STFT
  • the mth estimated sound source signal ~S_m(f, ω) is converted, using the inverse STFT, into the mth estimated sound source signal ~s_m(k) (1 ≤ m ≤ M), a signal in the time domain.
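Steps 1 and 4 bracket the processing chain. A round-trip sketch using scipy's STFT pair is shown below; the publication does not prescribe a particular STFT implementation, and the window length, overlap, and test signal here are arbitrary choices.

```python
import numpy as np
from scipy.signal import stft, istft

rng = np.random.default_rng(5)
fs = 16000
k = np.arange(fs)                                   # one second of samples
s = np.sin(2 * np.pi * 440 * k / fs) + 0.1 * rng.standard_normal(fs)

# Step 1: STFT -- yields spectra indexed by frequency and frame f.
freqs, frames, S = stft(s, fs=fs, nperseg=512, noverlap=384)

# (Steps 2 and 3 -- separation and crosstalk removal -- would operate on S.)

# Step 4: inverse STFT back to a time-domain signal.
_, s_rec = istft(S, fs=fs, nperseg=512, noverlap=384)
s_rec = s_rec[:len(s)]
```

With a COLA-satisfying window/hop pair such as this one, the round trip reconstructs the signal essentially exactly, so Steps 1 and 4 introduce no distortion of their own.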
  • the crosstalk component is removed by using only one past frame in step 3, but it may be removed by using two or more past frames.
  • in that case, the mth estimated sound source signal ~S_m(f, ω) is calculated, for example, as ~S_m(f, ω) = ^S_m(f, ω) - Σ_{m'≠m} Σ_{d=0}^{D} β_{m,m',d}(ω)^S_{m'}(f-d, ω).
  • β_{m,m',d}(ω) (1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m, 0 ≤ d ≤ D) is a coefficient indicating the degree to which the crosstalk component of the signal from the m'th sound source d frames before is mixed into the mth separated sound source signal ^S_m(f, ω); the coefficients are determined so that the mth estimated sound source signal ~S_m(f, ω) becomes sparser as a signal.
  • FIG. 1 is a block diagram showing a configuration of a sound source signal estimation device 100.
  • FIG. 2 is a flowchart showing the operation of the sound source signal estimation device 100.
  • the sound source signal estimation device 100 includes a frequency domain conversion unit 110, a sound source separation unit 120, a crosstalk component removal unit 130, a time domain conversion unit 140, and a recording unit 190.
  • the recording unit 190 is a component unit that appropriately records information necessary for processing of the sound source signal estimation device 100.
  • the sound source signal estimation device 100 receives as input the signals picked up by M microphones installed in a sound field having M sound sources (M is an integer of 2 or more), and estimates and outputs the signals from the M sound sources.
  • for the frequency domain conversion, for example, the STFT can be used.
  • the sound source separation unit 120 takes Y_n(f, ω) (n = 1, ..., M) as input and, by a predetermined sound source separation method, generates and outputs the mth separated sound source signal ^S_m(f, ω) (m = 1, ..., M), a signal obtained by separating the mth sound source signal S_m(f, ω), which is the frequency-domain representation of the mth sound source signal s_m(k).
  • the sound source separation method for example, the blind sound source separation method in the frequency domain described in Non-Patent Document 1 can be used.
  • FIG. 3 is a block diagram showing the configuration of the crosstalk component removing unit 130.
  • FIG. 4 is a flowchart showing the operation of the crosstalk component removing unit 130.
  • the crosstalk component removing unit 130 includes a coefficient calculation unit 132 and a crosstalk component removing signal calculation unit 134.
  • the coefficient calculation unit 132 calculates the coefficients β_{m,m'}(ω) by solving an optimization problem for each pair (m, m') that satisfies 1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m (L is a predetermined positive integer representing the number of frames used).
  • β_{m,m'}(ω) (1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m) is a coefficient indicating the degree to which the crosstalk component of the signal from the m'th sound source is mixed into the mth separated sound source signal ^S_m(f, ω).
  • L may be an integer of about several tens.
  • the crosstalk component removing unit 130 may calculate based on a model that considers the crosstalk components of a plurality of frames in the past.
  • description will be given according to FIG.
  • the coefficient calculation unit 132 calculates the coefficients β_{m,m',d}(ω) by solving an optimization problem for each triple (m, m', d) satisfying 1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m, 0 ≤ d ≤ D (D is an integer of 1 or more; L is a predetermined positive integer representing the number of frames used).
  • β_{m,m',d}(ω) (1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m, 0 ≤ d ≤ D) is a coefficient indicating the degree to which the crosstalk component of the signal from the m'th sound source d frames before is mixed into the mth separated sound source signal ^S_m(f, ω).
  • an inverse STFT conversion can be used for the time domain conversion.
  • according to the invention of the present embodiment, it is possible to suppress the deterioration of sound quality by estimating the sound source signals while removing the crosstalk component in consideration of both the amplitude component and the phase component.
  • the degrees of the crosstalk components of signals from other sound sources are estimated using the sparsity of the sound source signal as a criterion, which makes it possible to improve the estimation accuracy of the sound source signal.
  • FIG. 5 is a diagram showing an example of a functional configuration of a computer that realizes each of the above-mentioned devices.
  • the processing in each of the above-mentioned devices can be carried out by causing the recording unit 2020 to read a program for causing the computer to function as each of the above-mentioned devices, and operating the control unit 2010, the input unit 2030, the output unit 2040, and the like.
  • the device of the present invention has, for example, as a single hardware entity, an input unit to which a keyboard or the like can be connected, an output unit to which a liquid crystal display or the like can be connected, a communication unit to which a communication device (for example, a communication cable) capable of communicating with the outside of the hardware entity can be connected, a CPU (Central Processing Unit, which may include cache memory, registers, and the like), RAM and ROM as memory, an external storage device such as a hard disk, and a bus connecting the input unit, output unit, communication unit, CPU, RAM, ROM, and external storage device so that data can be exchanged among them.
  • a device (drive) or the like capable of reading and writing a recording medium such as a CD-ROM may be provided in the hardware entity.
  • a physical entity equipped with such hardware resources includes a general-purpose computer and the like.
  • the external storage device of the hardware entity stores the program required to realize the above-mentioned functions and the data required for processing by this program (the program need not be stored in an external storage device; for example, it may be stored in a ROM, a read-only storage device). Data obtained by the processing of these programs is stored as appropriate in the RAM, the external storage device, or the like.
  • each program stored in the external storage device (or ROM, etc.) and the data necessary for processing by each program are read into memory as needed, and are interpreted, executed, and processed by the CPU as appropriate.
  • as a result, the CPU realizes predetermined functions (the component units represented above as "... unit", "... means", and the like).
  • the present invention is not limited to the above-described embodiment, and can be appropriately modified without departing from the spirit of the present invention. Further, the processes described in the above-described embodiment are not only executed in chronological order according to the order described, but may also be executed in parallel or individually as required by the processing capacity of the device that executes the processes.
  • when the processing function of the hardware entity (the device of the present invention) described in the above embodiment is realized by a computer,
  • the processing content of the function that the hardware entity should have is described by a program.
  • by executing this program on the computer, the processing function of the above hardware entity is realized on the computer.
  • the program that describes this processing content can be recorded on a computer-readable recording medium.
  • the computer-readable recording medium may be, for example, a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.
  • for example, a hard disk device, a flexible disk, a magnetic tape, or the like can be used as the magnetic recording device; a DVD (Digital Versatile Disc), a DVD-RAM (Random Access Memory), a CD-ROM (Compact Disc Read Only Memory), a CD-R (Recordable)/RW (ReWritable), or the like as the optical disc; an MO (Magneto-Optical disc) as the magneto-optical recording medium; and an EEP-ROM (Electrically Erasable and Programmable Read Only Memory) or the like as the semiconductor memory.
  • the distribution of this program is carried out, for example, by selling, transferring, renting, etc., a portable recording medium such as a DVD or CD-ROM on which the program is recorded. Further, the program may be stored in the storage device of the server computer, and the program may be distributed by transferring the program from the server computer to another computer via a network.
  • a computer that executes such a program first stores, for example, the program recorded on a portable recording medium or transferred from the server computer in its own storage device. When executing the processing, the computer reads the program stored in its own storage device and executes processing according to the read program. As another execution form, the computer may read the program directly from the portable recording medium and execute processing according to it, or may sequentially execute processing according to the received program each time the program is transferred from the server computer to this computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service that realizes the processing functions only through execution instructions and result acquisition, without transferring the program from the server computer to this computer.
  • the program in this embodiment includes information to be used for processing by a computer that is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).
  • the hardware entity is configured by executing a predetermined program on the computer, but at least a part of these processing contents may be realized in terms of hardware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

A sound source signal estimation technique that suppresses deterioration of sound quality by removing a crosstalk component in consideration of both an amplitude component and a phase component. β_{m,m'}(ω) (1 ≤ m ≤ M, 1 ≤ m' ≤ M, m' ≠ m) is used as a coefficient indicating the degree to which a crosstalk component of a signal from the m'th sound source is mixed into the mth separated sound source signal ^S_m(f, ω). A crosstalk component removal unit includes: a coefficient calculation unit that calculates the coefficients β_{m,m'}(ω) by solving a prescribed optimization problem over the pairs (m, m') with 1 ≤ m ≤ M, 1 ≤ m' ≤ M, and m' ≠ m; and a crosstalk-removed signal calculation unit that, from the mth separated sound source signal ^S_m(f, ω) (m = 1, ..., M) and using the coefficients β_{m,m'}(ω), calculates the mth estimated sound source signal ~S_m(f, ω) (m = 1, ..., M).
PCT/JP2019/045120 2019-11-18 2019-11-18 Sound source signal estimation device and method, and program WO2021100094A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2019/045120 WO2021100094A1 (fr) 2019-11-18 2019-11-18 Sound source signal estimation device and method, and program
PCT/JP2020/006968 WO2021100215A1 (fr) 2019-11-18 2020-02-21 Sound source signal estimation device, method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/045120 WO2021100094A1 (fr) 2019-11-18 2019-11-18 Sound source signal estimation device and method, and program

Publications (1)

Publication Number Publication Date
WO2021100094A1 true WO2021100094A1 (fr) 2021-05-27

Family

ID=75981519

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/JP2019/045120 WO2021100094A1 (fr) 2019-11-18 2019-11-18 Sound source signal estimation device and method, and program
PCT/JP2020/006968 WO2021100215A1 (fr) 2019-11-18 2020-02-21 Sound source signal estimation device, method, and program

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/006968 WO2021100215A1 (fr) 2019-11-18 2020-02-21 Sound source signal estimation device, method, and program

Country Status (1)

Country Link
WO (2) WO2021100094A1 (fr)

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
AICHNER, R.; ZOURUB, M.; BUCHNER, H.; KELLERMANN, W.: "Post-processing for convolutive blind source separation", Proc. ICASSP, 5 May 2006 (2006-05-05), pages 37-41, XP010931283, Retrieved from the Internet <URL:https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1661206> [retrieved on 20200406] *
MUKAI, RYO et al.: "Removal of residual cross-talk components in blind source separation using time-delayed subtraction", Proc. ICASSP, 2 May 2002 (2002-05-02), pages 1789-1792, Retrieved from the Internet <URL:https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5744970> [retrieved on 20200406] *
SAWADA, HIROSHI et al.: "MLSP 2007 data analysis competition: frequency-domain blind source separation for convolutive mixtures of speech/audio signals", IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2007), August 2007 (2007-08-01), pages 45-50, XP031199060, Retrieved from the Internet <URL:https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4414280> [retrieved on 20200406] *

Also Published As

Publication number Publication date
WO2021100215A1 (fr) 2021-05-27

Similar Documents

Publication Publication Date Title
JP4195267B2 (ja) Speech recognition device, speech recognition method therefor, and program
CN102084667B (zh) Dereverberation device, dereverberation method, dereverberation program, and recording medium
JP2021036297A (ja) Signal processing device, signal processing method, and program
WO2007100137A1 (fr) Reverberation removal device, method, and program, and recording medium
CN113284507B (zh) Training method and device for speech enhancement model, and speech enhancement method and device
JP4787851B2 (ja) Echo suppression gain estimation method, echo cancellation device using the same, device program, and recording medium
CN112951263B (zh) Speech enhancement method, device, apparatus, and storage medium
JP6815956B2 (ja) Filter coefficient calculation device, method therefor, and program
WO2021100094A1 (fr) Sound source signal estimation device and method, and program
JP5889224B2 (ja) Echo suppression gain estimation method, echo cancellation device using the same, and program
JP6827908B2 (ja) Sound source enhancement device, sound source enhancement learning device, sound source enhancement method, and program
JP5769670B2 (ja) Echo suppression gain estimation method, echo cancellation device using the same, and program
WO2021255925A1 (fr) Target sound signal generation device, target sound signal generation method, and program
US11676619B2 (en) Noise spatial covariance matrix estimation apparatus, noise spatial covariance matrix estimation method, and program
JP2010044150A (ja) Dereverberation device, dereverberation method, program therefor, and recording medium
JP6912780B2 (ja) Sound source enhancement device, sound source enhancement learning device, sound source enhancement method, and program
JP7156064B2 (ja) Latent variable optimization device, filter coefficient optimization device, latent variable optimization method, filter coefficient optimization method, and program
JP5562451B1 (ja) Echo suppression gain estimation method, echo cancellation device using the same, and program
WO2021100136A1 (fr) Sound source signal estimation device, sound source signal estimation method, and program
JP7026358B2 (ja) Regression function learning device, regression function learning method, and program
JP2018191255A (ja) Sound pickup device, method therefor, and program
WO2021024474A1 (fr) PSD optimization device, PSD optimization method, and program
JP5498452B2 (ja) Background sound suppression device, background sound suppression method, and program
JP7375905B2 (ja) Filter coefficient optimization device, filter coefficient optimization method, and program
JP7159767B2 (ja) Audio signal processing program, audio signal processing method, and audio signal processing device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19953085

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19953085

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP