JP2009153053A - Voice estimation method, and mobile terminal using the same - Google Patents

Voice estimation method, and mobile terminal using the same

Publication number
JP2009153053A
Authority
JP
Japan
Prior art keywords
signal
noise
microphones
ringtone
collecting
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2007330982A
Other languages
Japanese (ja)
Inventor
Daisuke Sugii
大介 杉井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2007-12-21
Filing date
2007-12-21
Publication date
2009-07-09
Application filed by NEC Corp
Priority to JP2007330982A

Landscapes

  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Telephone Function (AREA)

Abstract

PROBLEM TO BE SOLVED: To identify a noise signal and a target sound signal containing the noise so that an adaptive noise suppression filter operates normally and suppresses the noise in the target sound signal.
SOLUTION: A noise/target sound component calculation unit 4 estimates, as the noise signal, the signal obtained by removing the ringtone component from the audio signals picked up by microphones 2a and 2b while the ringtone sounds, based on those audio signals and the ringtone data. During the subsequent call, the calculation unit 4 applies a predetermined time difference correction to the audio signals picked up by microphones 2a and 2b, which consist of the transmitted speech and background noise, adds the corrected signals, and estimates the sum as the target sound signal containing the noise component. An adaptive filter 5 receives the estimated noise signal and the target sound signal containing the noise component as inputs, and outputs the target sound signal (transmitted speech signal) with the noise component suppressed.
COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a speech estimation method and a mobile terminal using the same, and more particularly to a speech estimation method that uses an adaptive noise suppression filter and a mobile terminal using that method.

A mobile terminal such as a mobile phone needs to suppress ambient noise, such as background sound, as much as possible when a call arrives and during a call. For example, a telephone device is known that, when a call arrives, analyzes the characteristics of the ambient noise picked up by its microphone, selects the ambient noise parameter best suited to those characteristics from among ambient noise parameters that were distributed to it and stored in storage means, and uses the selected parameter to control the sound quality of the ringtone generated by the sound source unit, so that the ringtone can be recognized regardless of the ambient noise (see, for example, Patent Document 1).

Also known is an adaptive noise suppression filter based on microphone array technology, which applies signal processing to acoustic signals obtained with a plurality of spaced-apart microphones, thereby suppressing noise and emphasizing the target acoustic signal component (see, for example, Patent Document 2). This adaptive noise suppression filter is an efficient noise suppression system that combines an additive microphone array for ambient noise with a subtractive array for directional noise. Incorporating it into a mobile terminal therefore makes it possible to suppress ambient noise as much as possible, for example during a call.

Patent Document 1: JP 2001-036604 A
Patent Document 2: JP 2003-223198 A

However, the mobile terminal described in Patent Document 1 analyzes the characteristics of ambient noise only to make the ringtone recognizable; it cannot suppress the noise signal during a call by means of an adaptive noise suppression filter.

An adaptive noise suppression filter, on the other hand, requires two input signals, a target sound signal containing noise and a noise signal, and from these inputs it outputs a target sound signal in which the noise component is suppressed. When such a filter is built into a mobile terminal, however, the microphone array delivers the transmitted speech signal (the target sound signal) mixed together with the noise signal. Because the terminal cannot separately identify, from that mixed input, the target sound signal containing the noise component and the noise signal, the desired noise suppression cannot be achieved.

The present invention has been made in view of the above points. Its object is to provide a speech estimation method, and a mobile terminal using it, that identify the noise signal and the target sound signal containing the noise component so that the adaptive noise suppression filter operates normally and suppresses the noise in the target sound signal.

To achieve the above object, the speech estimation method of the present invention comprises a first step of collecting a ringtone of a telephone terminal with a plurality of microphones provided on the telephone terminal, and a second step of estimating, as a noise signal, the audio signal obtained by removing the ringtone component from the plurality of audio signals that were obtained by collecting the ringtone with the plurality of microphones in the first step.

To achieve the above object, the mobile terminal of the present invention is a mobile terminal provided with a plurality of microphones and an adaptive noise suppression filter, and comprises: ringtone sounding means for sounding a ringtone from a speaker based on ringtone data; and noise signal calculation means for calculating, based on the ringtone data and the plurality of audio signals obtained by collecting the ringtone with the plurality of microphones, the audio signal obtained by removing the ringtone component from those audio signals, estimating it as a noise signal, and outputting it to the adaptive noise suppression filter.

According to the present invention, noise suppression with a high degree of freedom can be performed using an adaptive noise suppression filter that is supplied, as input signals, with the separately estimated noise signal and target sound signal.

Next, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram of one embodiment of a mobile terminal according to the present invention. In the figure, the mobile terminal 1 has: two microphones 2a and 2b for transmission; AD converters 3a and 3b, which separately convert the analog audio signals output from microphones 2a and 2b into digital audio signals; a noise/target sound component calculation unit 4, which performs calculations on the digital audio signals output from AD converters 3a and 3b to identify the target sound component and the noise component; a filter 5 with an adaptive noise suppression function (AMNOR; Adaptive Microphone for NOise Reduction), hereinafter called the adaptive filter, which receives as inputs the noise signal and the target sound signal containing the noise component output from the calculation unit 4 and performs known noise suppression processing to output a target sound signal in which the noise component is suppressed; and a speaker 6 for sounding the ringtone.

There is no particular restriction on where microphones 2a and 2b are placed on the mobile terminal. However, because the method uses the difference between the times at which the transmitted speech arrives at microphones 2a and 2b, it is desirable that they be located away from the receiver (a speaker separate from speaker 6).

FIG. 2 is a block diagram of one embodiment of the noise/target sound component calculation unit 4, which is the main part of this embodiment. As shown in FIG. 2, the calculation unit 4 consists of: random access memories (RAMs) 41a and 41b, which separately and temporarily store the digital audio signals output in parallel from AD converters 3a and 3b; a RAM 43, which stores the ringtone data; and a digital signal processor (DSP) 42, which receives as inputs the digital audio signals read from RAMs 41a and 41b and the ringtone data read from RAM 43, performs correlation calculations and matrix addition and subtraction on them, and outputs the resulting target sound component and noise component. Because the sound collection time of microphones 2a and 2b is not necessarily constant, RAMs 41a and 41b are configured as ring buffers. RAM 43 is likewise configured as a ring buffer.
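
The ring-buffer configuration of RAMs 41a, 41b, and 43 can be pictured with a minimal sketch. The class name `RingBuffer`, its `capacity` parameter, and its methods are hypothetical names chosen for illustration; the description says only that the memories are ring buffers because the sound collection time is not fixed.

```python
class RingBuffer:
    """Fixed-capacity sample store that overwrites the oldest samples when full.

    Illustrative only: the patent does not specify capacity or access methods.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._data = [0.0] * capacity
        self._capacity = capacity
        self._next = 0    # index where the next sample will be written
        self._count = 0   # number of valid samples currently held

    def write(self, samples):
        """Append samples, discarding the oldest ones once the buffer is full."""
        for s in samples:
            self._data[self._next] = s
            self._next = (self._next + 1) % self._capacity
            self._count = min(self._count + 1, self._capacity)

    def read_all(self):
        """Return the held samples in arrival order, oldest first."""
        start = (self._next - self._count) % self._capacity
        return [self._data[(start + i) % self._capacity]
                for i in range(self._count)]

    def is_empty(self):
        return self._count == 0
```

With a buffer of this kind, a long ringing period simply keeps the most recent `capacity` samples, so the memory requirement does not depend on how long the terminal rings before the call is answered.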

The adaptive filter 5 in FIG. 1 is well known to those skilled in the art and is not directly related to the present invention, so a detailed description of its configuration is omitted.

Next, the operation of this embodiment will be described. In this embodiment, the estimated signals for the target sound signal and the noise signal are first prepared between the start of an outgoing or incoming call and the start of the conversation. This preparation is described with reference to the flowchart of FIG. 3 together with FIGS. 1 and 2.

First, when the mobile terminal 1 enters the incoming voice call state (step S1 in FIG. 3), microphones 2a and 2b begin collecting the ambient noise together with the ringtone, which is produced by electro-acoustic conversion in speaker 6, to which the ringtone signal is supplied by a known method (step S2 in FIG. 3). The audio signals containing the ringtone and ambient noise, obtained by acoustic-electric conversion in microphones 2a and 2b, pass through AD converters 3a and 3b and are supplied to RAMs 41a and 41b in the noise/target sound component calculation unit 4, where they are temporarily stored. Meanwhile, the ringtone data supplied to speaker 6 is also supplied to RAM 43 and temporarily stored.

When the talker presses the call start button (not shown in FIG. 1) in this state (step S3 in FIG. 3), the DSP 42 in the noise/target sound component calculation unit 4 refers to RAM 43 and checks whether ringtone data is stored, that is, whether the ringtone was sounding (step S4 in FIG. 3). If no ringtone data is stored in RAM 43, the DSP 42 judges that the ringtone was not sounding and estimates that the data stored in RAMs 41a and 41b is the noise signal (step S6 in FIG. 3).

If, on the other hand, ringtone data is stored in RAM 43, the DSP 42 judges that the ringtone was sounding. It takes the time correlation between the data stored in RAMs 41a and 41b and the ringtone data stored in RAM 43. It then matches the gain of the RAM 43 ringtone data to the gain of the ringtone component that is stored in RAMs 41a and 41b in phase with the RAM 43 ringtone data (the loopback gain from output at speaker 6 to input at microphones 2a and 2b), and subtracts the result from the gain of the data stored in RAMs 41a and 41b, thereby removing the ringtone (step S5 in FIG. 3). The value after ringtone removal is estimated (identified) as the noise signal (step S6 in FIG. 3).
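
Steps S4 to S6 can be sketched for a single microphone channel as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `estimate_noise`, the `max_lag` search window, the restriction to non-negative whole-sample delays, and the least-squares formula for the loopback gain are all assumptions, since the description states only that a time correlation is taken and the gain-matched ringtone is subtracted.

```python
def estimate_noise(mic, ringtone, max_lag):
    """Remove the ringtone component from one microphone channel.

    mic      -- samples captured while ringing (ringtone loopback + noise)
    ringtone -- the ringtone data that was sent to the speaker
    max_lag  -- largest speaker-to-microphone delay, in samples, to search
                (hypothetical parameter, not named in the patent)
    Returns (noise_estimate, lag, gain).
    """
    if not mic or not any(ringtone):
        # Step S4 -> S6: no ringtone data stored, so the capture is the noise.
        return list(mic), 0, 0.0

    def correlate(lag):
        n = min(len(ringtone), len(mic) - lag)
        return sum(mic[lag + i] * ringtone[i] for i in range(n))

    # Time correlation: choose the delay at which the capture best matches
    # the ringtone data (the "same phase" alignment).
    last = max(0, min(max_lag, len(mic) - 1))
    lag = max(range(last + 1), key=correlate)

    # Loopback gain, assumed here to be the least-squares scale factor.
    n = min(len(ringtone), len(mic) - lag)
    energy = sum(r * r for r in ringtone[:n])
    gain = correlate(lag) / energy if energy > 0 else 0.0

    # Step S5: subtract the aligned, gain-matched ringtone from the capture.
    noise = list(mic)
    for i in range(n):
        noise[lag + i] -= gain * ringtone[i]
    return noise, lag, gain
```

In a two-microphone terminal the same routine would be run once per channel (the contents of RAM 41a and RAM 41b), because each microphone sees its own delay and loopback gain from speaker 6.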

After the noise signal has been estimated as described above, the call between the mobile terminal 1 of this embodiment and the other party's terminal begins. The operation after the call starts is described with reference to the flowchart of FIG. 4 together with FIGS. 1 and 2. Because the source of the target sound can be assumed to be in front of the mobile terminal 1 during a call, the noise/target sound component calculation unit 4 is assumed to store time difference information, measured in advance, between the output audio signals of microphones 2a and 2b for the front direction of the mobile terminal 1.

First, when the ringtone stops and the voice call starts (step S11 in FIG. 4), microphones 2a and 2b pick up not only the transmitted speech, which is the target sound, but also the ambient noise (background noise) at the same time. They convert these into electrical audio signals, which AD converters 3a and 3b convert into digital audio signals and supply to the noise/target sound component calculation unit 4.

The noise/target sound component calculation unit 4 applies the pre-calculated time difference information for the front direction to the first digital audio signal, picked up by microphone 2a during the call, and to the second digital audio signal, picked up by microphone 2b during the call, performing a time difference correction that eliminates the time difference between them (step S12 in FIG. 4). It then adds the first and second digital audio signals (step S13 in FIG. 4) and estimates (identifies) the summed digital audio signal as the target sound component (target sound signal) containing ambient noise (step S14 in FIG. 4).
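
Steps S12 to S14 resemble what is commonly called delay-and-sum processing. The sketch below assumes that the stored time difference is a whole number of samples; the function name `delay_and_sum` and the parameter `front_delay` are hypothetical, and a real terminal might need fractional-sample alignment, which is not shown.

```python
def delay_and_sum(first, second, front_delay):
    """Align two microphone channels for a front-facing talker and add them.

    first, second -- call-time samples from microphones 2a and 2b
    front_delay   -- pre-measured number of samples by which speech from the
                     front reaches microphone 2b later than microphone 2a
                     (negative if it reaches 2b earlier); assumed integer
    Returns the summed signal over the overlapping, time-aligned region.
    """
    if front_delay >= 0:
        second = second[front_delay:]   # step S12: advance the later channel
    else:
        first = first[-front_delay:]
    n = min(len(first), len(second))
    # Step S13: sample-wise addition of the aligned channels.
    return [first[i] + second[i] for i in range(n)]
```

After alignment the talker's speech adds coherently in the two channels, whereas sound arriving from other directions generally does not line up; the sum is what step S14 treats as the target sound signal containing the noise component.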

The adaptive filter 5 receives as input signals the noise signal and the target sound signal containing ambient noise, both estimated by the noise/target sound component calculation unit 4 as described above, applies known filtering processing based on these inputs, and outputs a target sound signal in which the ambient noise is suppressed. This makes it possible to suppress ambient noise during a voice call (for example, sound from a TV set playing behind the talker, or background sound not needed for the call, such as a street speech) and to output a clearer audio signal of the transmitted speech, which is the target sound.

In this way, the present embodiment estimates the noise signal and the target sound signal containing the noise component inside the device, thereby identifying both signals, so that the adaptive filter 5 operates normally and noise suppression with a high degree of freedom can be performed.

The present invention is not limited to the above embodiment. For example, three or more microphones may be used, and the microphones can be arranged freely. However, phase difference information for the front direction is required in advance for estimating the target sound. Besides the pressing of the call start button, the start of a call can also be recognized in other ways; for example, when the mobile terminal 1 is a folding mobile terminal, the device can detect that its two housings have been opened. The present invention can also be applied to devices other than mobile terminals, such as fixed-line telephones.

FIG. 1 is a block diagram of one embodiment of the mobile terminal of the present invention.
FIG. 2 is a block diagram of one embodiment of the noise/target sound component calculation unit in FIG. 1.
FIG. 3 is a flowchart (part 1) explaining the operation of FIGS. 1 and 2.
FIG. 4 is a flowchart (part 2) explaining the operation of FIGS. 1 and 2.

Explanation of symbols

1 Mobile terminal
2a, 2b Microphone
3a, 3b AD converter
4 Noise/target sound component calculation unit
5 Adaptive filter (adaptive noise suppression filter)
6 Speaker
41a, 41b, 43 Random access memory (RAM)
42 Digital signal processor (DSP)

Claims (11)

1. A speech estimation method comprising:
a first step of collecting a ringtone of a telephone terminal with a plurality of microphones provided on the telephone terminal; and
a second step of estimating, as a noise signal, an audio signal obtained by removing a ringtone component from the plurality of audio signals obtained by collecting the ringtone with the plurality of microphones in the first step.

2. The speech estimation method according to claim 1, further comprising a third step of supplying the noise signal estimated in the second step to an adaptive noise suppression filter, supplying a transmitted speech signal obtained by sound collection with the plurality of microphones during a subsequent call to the adaptive noise suppression filter, and causing the adaptive noise suppression filter to perform filtering processing.

3. A speech estimation method comprising:
a first step of collecting a ringtone of a telephone terminal with a plurality of microphones provided on the telephone terminal;
a second step of estimating, as a noise signal, an audio signal obtained by removing a ringtone component from the plurality of audio signals obtained by collecting the ringtone with the plurality of microphones in the first step;
a third step of performing time difference correction on a plurality of audio signals obtained by collecting transmitted speech with the plurality of microphones after the start of a call, based on audio signal time difference information for the front direction of each of the plurality of microphones; and
a fourth step of estimating, as a target sound signal containing a noise component, a signal obtained by adding the plurality of audio signals that were time-difference corrected in the third step.

4. The speech estimation method according to claim 1 or 3, wherein the second step takes a time correlation between ringtone data for sounding the ringtone from a speaker and the plurality of audio signals obtained by collecting the ringtone with the plurality of microphones in the first step, matches the gain of the ringtone data to the gain of the ringtone signal contained in the plurality of audio signals in phase with the ringtone data, and subtracts the result from the gain of the plurality of audio signals, thereby calculating an audio signal from which the ringtone component has been removed and estimating it as the noise signal.

5. The speech estimation method according to claim 3, further comprising a fifth step of supplying the noise signal estimated in the second step and the target sound signal containing the noise component estimated in the fourth step to an adaptive noise suppression filter, and causing the filter to output, by filtering processing, a target sound signal in which the noise component has been suppressed from the target sound signal containing the noise component.

6. A mobile terminal provided with a plurality of microphones and an adaptive noise suppression filter, comprising:
ringtone sounding means for sounding a ringtone from a speaker based on ringtone data; and
noise signal calculation means for calculating, based on the ringtone data and a plurality of audio signals obtained by collecting the ringtone with the plurality of microphones, an audio signal obtained by removing a ringtone component from the plurality of audio signals, estimating it as a noise signal, and outputting it to the adaptive noise suppression filter.

7. The mobile terminal according to claim 6, further comprising transmitted speech signal supply means for supplying a transmitted speech signal obtained by sound collection with the plurality of microphones during a call to the adaptive noise suppression filter, and causing the adaptive noise suppression filter to perform filtering processing based on the noise signal and the transmitted speech signal so as to suppress a noise component contained in the transmitted speech signal.

8. A mobile terminal provided with a plurality of microphones and an adaptive noise suppression filter, comprising:
ringtone sounding means for sounding a ringtone from a speaker based on ringtone data;
noise signal calculation means for calculating, based on the ringtone data and a plurality of audio signals obtained by collecting the ringtone with the plurality of microphones, an audio signal obtained by removing a ringtone component from the plurality of audio signals, estimating it as a noise signal, and outputting it to the adaptive noise suppression filter;
correction means for performing time difference correction on a plurality of audio signals obtained by collecting transmitted speech with the plurality of microphones after the start of a call, based on audio signal time difference information for the front direction of each of the plurality of microphones; and
target sound signal calculation means for estimating, as a target sound signal containing a noise component, a signal obtained by adding the plurality of time-difference-corrected audio signals, and outputting it to the adaptive noise suppression filter.

9. The mobile terminal according to claim 6 or 8, wherein the noise signal calculation means comprises:
first storage means for storing ringtone data for sounding the ringtone from a speaker;
second storage means for storing the plurality of audio signals obtained by collecting the ringtone with the plurality of microphones; and
calculating means for taking a time correlation between the ringtone data and the plurality of audio signals obtained by collecting the ringtone, matching the gain of the ringtone data to the gain of the ringtone signal contained in the plurality of audio signals in phase with the ringtone data, and subtracting the result from the gain of the plurality of audio signals, thereby calculating, as the noise signal, an audio signal from which the ringtone component has been removed.

10. The mobile terminal according to claim 8, wherein the adaptive noise suppression filter receives, as input signals, the noise signal estimated by the calculation of the noise signal calculation means and the target sound signal containing the noise component estimated by the calculation of the target sound signal calculation means, and outputs, by filtering processing, a target sound signal in which the noise component has been suppressed from the target sound signal containing the noise component.

11. The mobile terminal according to claim 9, wherein the first storage means and the second storage means each have a ring buffer configuration.
Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007330982A JP2009153053A (en) 2007-12-21 2007-12-21 Voice estimation method, and mobile terminal using the same

Publications (1)

Publication Number Publication Date
JP2009153053A (en) 2009-07-09

Family ID: 40921612

Country Status (1)

Country Link
JP (1) JP2009153053A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170029624A (en) * 2014-07-28 2017-03-15 후아웨이 테크놀러지 컴퍼니 리미티드 Acoustical signal processing method and device of communication device
KR101883421B1 (en) 2014-07-28 2018-07-30 후아웨이 테크놀러지 컴퍼니 리미티드 Acoustical signal processing method and device of communication device
US10204639B2 (en) 2014-07-28 2019-02-12 Huawei Technologies Co., Ltd. Method and device for processing sound signal for communications device
JP2019533192A (en) * 2016-09-30 2019-11-14 ボーズ・コーポレーション (Bose Corporation) Noise estimation for dynamic sound adjustment
CN113077808A (en) * 2021-03-22 2021-07-06 北京搜狗科技发展有限公司 Voice processing method and device for voice processing
CN113077808B (en) * 2021-03-22 2024-04-26 北京搜狗科技发展有限公司 Voice processing method and device for voice processing

Similar Documents

Publication Publication Date Title
US20070237339A1 (en) Environmental noise reduction and cancellation for a voice over internet packets (VOIP) communication device
US20110181452A1 (en) Usage of Speaker Microphone for Sound Enhancement
JP2014112831A (en) System for managing plurality of microphones and speakers
KR20100053502A (en) A device for and a method of processing audio signals
CN101783828A (en) Sound signal adjustment apparatus and method, and telephone device
WO2007018802A2 (en) Method and system for operation of a voice activity detector
JP3267556B2 (en) Echo canceller and transmitter
CN101188876A (en) Method for operating a hearing aid, and hearing aid
JP4533427B2 (en) Echo canceller
JPH066246A (en) Voice communication terminal equipment
JP2010081004A (en) Echo canceler, communication apparatus and echo canceling method
JP2007214976A (en) Echo canceler, video phone terminal and echo cancellation method
JP2009153053A (en) Voice estimation method, and mobile terminal using the same
JPH09233198A (en) Method and device for software basis bridge for full duplex voice conference telephone system
JP2007116585A (en) Noise cancel device and noise cancel method
JP4631581B2 (en) Loudspeaker
JP5707871B2 (en) Voice communication device and mobile phone
CN114125616A (en) Low power consumption method and device of wireless earphone, wireless earphone and readable storage medium
WO2007120734A2 (en) Environmental noise reduction and cancellation for cellular telephone and voice over internet packets (voip) communication devices
WO2019159968A1 (en) Communication transmission device and voice quality determination method for communication transmission device
JP5189515B2 (en) Intercom system
JP2007336132A (en) Echo suppressor
JP4512066B2 (en) Telephone
JP2011075694A (en) Sound processing device and program
JP2009124386A (en) Voice signal processor, and voice signal processing method