JP2020048038A - Sound collection device, program, and method

Info

Publication number: JP2020048038A
Application number: JP2018174097A
Authority: JP
Inventors: 隆矢頭; Takashi Yato
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2018-09-18
Filing date: 2018-09-18
Publication date: 2020-03-26
Anticipated expiration: 2038-09-18
Also published as: JP7176316B2

Abstract

To provide a sound collection device that performs area sound collection efficiently and stably. SOLUTION: A sound collection device includes: means for acquiring area sound collection components of multiple sound collection areas from two or more patterns of microphone-array combinations, based on an input signal from a microphone array part capable of forming multiple microphone arrays with different directivities; partial area component calculation means for obtaining, from the area sound collection components of the respective sound collection areas, the area sound collection component of each partial area into which the whole area covering all of the acquired sound collection areas is divided, namely partial areas where two or more sound collection areas overlap and partial areas where the sound collection areas do not overlap; and means for selecting the area sound collection components of one or more partial areas from those of the respective partial areas and acquiring a sound collection result based on the selected components. SELECTED DRAWING: Figure 1

Description

The present invention relates to a sound collection device, a program, and a method, and can be applied to, for example, a voice communication system used in a noisy environment.

When a voice communication system or a speech recognition application is used in a noisy environment, ambient noise mixed in with the target speech is a nuisance that hinders good communication and lowers the speech recognition rate. A conventional technique for obtaining the target sound in an environment with multiple sound sources, by separating and collecting only sound from a specific direction so that unwanted sound is kept out, is the beamformer (hereinafter also "BF"; see Patent Documents 2 and 3), which uses a microphone array. BF is a technique that forms directivity by exploiting the time differences between the signals arriving at the individual microphones. With BF alone, however, when other sound sources exist around the area targeted for sound collection (hereinafter the "target area"), it is difficult to collect only the sound present in the target area (hereinafter the "target area sound"). For this reason, area sound collection methods that use multiple microphone arrays to collect sound from the target area have been proposed, for example in Patent Document 1.
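
The time difference that a beamformer exploits can be illustrated with a short sketch. It assumes a far-field plane wave and a speed of sound of 343 m/s; the function name, the 3 cm microphone spacing, and the angle convention are illustrative assumptions and do not come from the patent text.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def arrival_time_difference(mic_spacing_m: float, angle_deg: float) -> float:
    """Time difference of arrival (seconds) between two microphones.

    Assumes a far-field plane wave. angle_deg is measured from the
    broadside direction (perpendicular to the line joining the microphones).
    """
    return mic_spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S


# A source straight ahead reaches both microphones at the same time,
# while a source on the array axis gives the largest time difference.
front = arrival_time_difference(0.03, 0.0)   # hypothetical 3 cm spacing
side = arrival_time_difference(0.03, 90.0)
```

Because the time difference depends on the arrival angle, delaying one channel and combining it with the other emphasizes or cancels sound from a chosen direction, which is how directivity is formed.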

FIG. 25 is an explanatory diagram showing the process of collecting the target area sound from a sound source in the target area using two microphone arrays MA100 and MA200.

FIG. 25(a) is an explanatory diagram showing a configuration example of the microphone arrays MA100 and MA200. FIGS. 25(b) and 25(c) are graph-style images showing, in the frequency domain, the BF outputs of the microphone arrays MA100 and MA200 shown in FIG. 25(a), respectively. In FIG. 25, each of the microphone arrays MA100 and MA200 consists of two microphones ch1 and ch2.

In conventional area sound collection, as shown in FIG. 25(a), the directivities of the microphone arrays MA100 and MA200 are aimed from different directions so that they intersect in the area from which sound is to be collected (the target area). In the state of FIG. 25(a), the directivity of each microphone array captures not only the sound present in the target area (target area sound) but also noise arriving from the direction of the target area (non-target area sound). However, as shown in FIGS. 25(b) and 25(c), when the directional outputs of the microphone arrays MA100 and MA200 are compared in the frequency domain, the target area sound component is contained in both outputs, whereas the non-target area sound components differ between the arrays. Conventional area sound collection exploits this property: by suppressing everything other than the components common to the BF outputs of the two microphone arrays MA100 and MA200, only the target area sound can be extracted.
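
The idea of keeping only what both BF outputs share can be sketched as follows. This passage does not specify the suppression method, so the per-bin minimum used here is a deliberately simple stand-in chosen for illustration; the function name and the sample spectra are hypothetical.

```python
from typing import List


def common_component(bf_out_1: List[float], bf_out_2: List[float]) -> List[float]:
    """Keep, per frequency bin, only the amplitude present in both BF outputs.

    Inputs are amplitude spectra (non-negative, same length). Taking the
    per-bin minimum is an illustrative simplification, not the patented method.
    """
    if len(bf_out_1) != len(bf_out_2):
        raise ValueError("spectra must have the same number of bins")
    return [min(a, b) for a, b in zip(bf_out_1, bf_out_2)]


# Bin 1 holds the target area sound (seen by both arrays); bins 0 and 2 hold
# non-target sounds that each fall inside only one array's beam.
ma100 = [0.8, 1.0, 0.0]
ma200 = [0.0, 1.0, 0.6]
target = common_component(ma100, ma200)
```

In this toy case the non-target bins are removed and only the shared bin survives, mirroring the comparison of FIGS. 25(b) and 25(c).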

Patent Document 1: JP 2014-072708 A
Patent Document 2: JP 2005-195955 A
Patent Document 3: JP 2016-127457 A

Emergency vehicles are equipped with a communication handset as a means of making urgent contact with a command center (fire department headquarters) from a fire scene where sirens are blaring or from an emergency medical scene. Because a handset mounted on a conventional emergency vehicle is used in extremely loud surroundings, reports from the scene can be drowned out by ambient noise, so that accurate information does not reach headquarters (for example, the headquarters directing the crew of the emergency vehicle) and wrong information is conveyed instead, which may hinder sound judgment and delay the response. Various noise removal techniques have therefore been considered for handsets, but their introduction has faced many problems, such as securing call quality and increased cost. In such a usage environment, the area sound collection technique described above is expected to be an effective solution. For example, by installing two microphone arrays around the mouthpiece of the handset and making their directivities intersect in front of the mouthpiece so that area sound collection operates, loud noise such as sirens can be excluded and only the voice of the speaker, such as a firefighter, can be conveyed accurately to headquarters and elsewhere.

At least two microphone arrays are required to realize area sound collection. The mouthpiece of a handset, however, is small, with an outer diameter of about 6 cm, so when two microphone arrays are mounted there for area sound collection, they must be placed very close to each other. As a result, the sound collection area of such a handset is limited to a very narrow area immediately in front of the transmitter. When conventional area sound collection is applied to a handset, the way the handset is held and the size of the face differ from user (speaker) to user, so the mouth may fall outside this narrow sound collection area set for the handset. When the user's mouth moves out of the sound collection area, the collected speech is distorted or drops out, and stable sound collection is not possible.

A sound collection device, program, and method capable of performing area sound collection efficiently and stably are therefore desired.

A sound collection device according to a first aspect of the present invention comprises: (1) area sound collection means for acquiring area sound collection components of a plurality of sound collection areas from two or more patterns of combinations of microphone arrays, based on input signals from a microphone array unit capable of forming a plurality of microphone arrays with different directivities; (2) partial area component calculation means for obtaining, based on the area sound collection components of the sound collection areas of the respective patterns acquired by the area sound collection means, the area sound collection component of each of the partial areas into which the whole area covering all of the sound collection areas acquired by the area sound collection means is divided, namely partial areas where two or more of the sound collection areas overlap and partial areas where the sound collection areas do not overlap one another; and (3) partial area selection means for selecting the area sound collection components of one or more partial areas from the area sound collection components of the partial areas calculated by the partial area component calculation means, and acquiring a sound collection result based on the selected area sound collection components.

A sound collection program according to a second aspect of the present invention causes a computer to function as: (1) area sound collection means for acquiring area sound collection components of a plurality of sound collection areas from two or more patterns of combinations of microphone arrays, based on input signals from a microphone array unit capable of forming a plurality of microphone arrays with different directivities; (2) partial area component calculation means for obtaining, based on the area sound collection components of the sound collection areas of the respective patterns acquired by the area sound collection means, the area sound collection component of each of the partial areas into which the whole area covering all of the sound collection areas acquired by the area sound collection means is divided, namely partial areas where two or more of the sound collection areas overlap and partial areas where the sound collection areas do not overlap one another; and (3) partial area selection means for selecting the area sound collection components of one or more partial areas from the area sound collection components of the partial areas calculated by the partial area component calculation means, and acquiring a sound collection result based on the selected area sound collection components.

A third aspect of the present invention is a sound collection method performed by a sound collection device, wherein: (1) the sound collection device comprises area sound collection means, partial area component calculation means, and partial area selection means; (2) the area sound collection means acquires area sound collection components of a plurality of sound collection areas from two or more patterns of combinations of microphone arrays, based on input signals from a microphone array unit capable of forming a plurality of microphone arrays with different directivities; (3) the partial area component calculation means obtains, based on the area sound collection components of the sound collection areas of the respective patterns acquired by the area sound collection means, the area sound collection component of each of the partial areas into which the whole area covering all of the sound collection areas acquired by the area sound collection means is divided, namely partial areas where two or more of the sound collection areas overlap and partial areas where the sound collection areas do not overlap one another; and (4) the partial area selection means selects the area sound collection components of one or more partial areas from the area sound collection components of the partial areas calculated by the partial area component calculation means, and acquires a sound collection result based on the selected area sound collection components.

According to the present invention, a sound collection device that performs area sound collection efficiently and stably can be provided.

FIG. 1 is a block diagram showing the configuration of each device according to the first embodiment (including the functional configuration of the sound collection unit (sound collection device) according to the embodiment).
FIG. 2 is a perspective view showing the handset according to the first embodiment in use.
FIG. 3 is an enlarged view of the mouthpiece portion of the handset according to the first embodiment.
FIG. 4 is an explanatory diagram (image) showing a configuration example of the microphone arrays formed by three microphones.
FIG. 5 is an explanatory diagram (image) showing the area sound collection processing corresponding to each combination (combination pattern) of the microphone arrays formed by three microphones.
FIG. 6 is a diagram showing the sensitivity distribution (calculated) of area sound collection when the directivities of two microphone arrays are made to intersect.
FIG. 7 is a block diagram showing the configuration of a subtraction-type BF with two microphones.
FIG. 8 is a diagram showing the directional characteristics formed by a subtraction-type BF using two microphones.
FIG. 9 is a block diagram showing the configuration of each device related to the second embodiment.
FIG. 10 is a diagram showing the arrangement of the six microphones in the microphone array unit according to the second embodiment and a configuration example of the microphone arrays.
FIG. 11 is an explanatory diagram showing the distribution of the sound collection areas in which the target area sound extraction unit according to the second embodiment performs area sound collection.
FIG. 12 is an explanatory diagram showing, for the sound collection areas according to the second embodiment, the independent areas that do not overlap among multiple sound collection areas and the overlapping areas that do.
FIG. 13 is an explanatory diagram showing, in bar-graph form, the composition (power of each component) of each sound collection area according to the second embodiment.
FIG. 14 is an explanatory diagram showing the procedure of processing by the independent area component calculation unit according to the second embodiment (part 1: acquiring the area sound collection components of the independent areas).
FIG. 15 is an explanatory diagram showing the procedure of processing by the independent area component calculation unit according to the second embodiment (part 2: acquiring the area sound collection components of the overlapping areas).
FIG. 16 is a block diagram showing the configuration of each device related to the third embodiment.
FIG. 17 is a plan view of the communication device (smartphone) according to the third embodiment.
FIG. 18 is an explanatory diagram showing an image of the three sound collection areas according to the third embodiment.
FIG. 19 is an explanatory diagram showing a decomposed image of the combination patterns (first to third combination patterns) in the three sound collection areas according to the third embodiment.
FIG. 20 is an explanatory diagram showing an image of the independent parts arising in areas A and D according to the third embodiment.
FIG. 21 is an explanatory diagram showing an image of the independent parts arising in areas B and E according to the third embodiment.
FIG. 22 is an explanatory diagram showing an image of the independent parts arising in areas C and F according to the third embodiment.
FIG. 23 is an explanatory diagram showing an image of the part where the three areas overlap according to the third embodiment.
FIG. 24 is an explanatory diagram showing a configuration in which the microphone array unit according to the embodiment has four microphones (a modification of the embodiment).
FIG. 25 is an explanatory diagram showing a configuration example of a conventional sound collection device in which the beamformer (BF) directivities of two microphone arrays are aimed at the target area from different directions.

(A) First Embodiment
A first embodiment of the sound collection device, program, and method according to the present invention is described in detail below with reference to the drawings. This embodiment describes an example in which the sound collection device, program, and method of the present invention are applied to a sound collection unit.

First, the basic principle of area sound collection processing using microphone arrays in this embodiment is described with reference to FIGS. 4 to 6.

When microphones are placed at the vertices of a polygon, multiple area sound collections can be constructed toward the center of the polygon.

For example, consider area sound collection with three microphones. As shown in FIG. 4, up to three microphone arrays (three microphone arrays with different directivity directions) can be set up by pairing the microphones. With the three microphones ch1 to ch3, it is possible to set up a microphone array MA301 that pairs microphones ch1 and ch2, a microphone array MA302 that pairs microphones ch2 and ch3, and a microphone array MA303 that pairs microphones ch3 and ch1.

Furthermore, with the three microphones ch1 to ch3, area sound collection is possible for each combination of the three microphone arrays MA301, MA302, and MA303 (three combination patterns), as shown in FIG. 5.

In FIG. 5(a), the directivity of the microphone array MA301 is drawn with a one-dot chain line and that of the microphone array MA302 with a two-dot chain line. In FIG. 5(b), the directivity of MA302 is drawn with a one-dot chain line and that of MA303 with a two-dot chain line. In FIG. 5(c), the directivity of MA301 is drawn with a one-dot chain line and that of MA303 with a two-dot chain line. In addition, the sound collection area of each combination (pattern) is hatched: area A301 for the combination of MA301 and MA302 in FIG. 5(a), area A302 for the combination of MA302 and MA303 in FIG. 5(b), and area A303 for the combination of MA301 and MA303 in FIG. 5(c).

As shown in FIG. 5, with the three microphones ch1 to ch3, any two microphone arrays are at an angle to each other (that is, the line segments joining the two microphones of each array are at an angle), so their directivities can be made to intersect and a different area sound collection (covering a different region) can be realized for each combination.
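
The pairing of microphones into arrays and of arrays into area sound collection patterns can be enumerated with a short sketch. The function names are hypothetical, and pairing only adjacent vertices is an assumption that matches the three-microphone case of FIG. 4. The sketch also does not check that two arrays are non-parallel, which matters for polygons such as a square, where opposite sides cannot cross their directivities in the way described above.

```python
from itertools import combinations
from typing import List, Tuple

MicArray = Tuple[str, str]


def polygon_mic_arrays(n_mics: int) -> List[MicArray]:
    """Pair adjacent microphones on the vertices of an N-sided polygon."""
    if n_mics < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    names = [f"ch{i + 1}" for i in range(n_mics)]
    return [(names[i], names[(i + 1) % n_mics]) for i in range(n_mics)]


def area_pickup_patterns(arrays: List[MicArray]) -> List[Tuple[MicArray, MicArray]]:
    """Every unordered pair of microphone arrays (one pattern per pair)."""
    return list(combinations(arrays, 2))


arrays = polygon_mic_arrays(3)            # corresponds to MA301, MA302, MA303
patterns = area_pickup_patterns(arrays)   # the three combinations of FIG. 5
```

For three microphones this yields three arrays and three combination patterns, one for each panel of FIG. 5.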

Meanwhile, the sound collection area of area sound collection using microphone arrays tends to spread out in front of the microphone arrays (in the direction away from them). This property is described below with reference to FIG. 6.

FIG. 6 shows the sensitivity distribution (calculated) of area sound collection when the directivities of two microphone arrays MA400 and MA500 intersect at right angles. In other words, FIG. 6 illustrates the sensitivity of area sound collection in and around the region where the directivities of the two microphone arrays intersect. In FIG. 6, each of the microphone arrays MA400 and MA500 has two microphones ch1 and ch2. The sensitivity is divided into five levels (0 to -5 dB, -5 to -10 dB, -10 to -15 dB, -15 to -20 dB, and -20 to -25 dB), each drawn with a different fill pattern. As FIG. 6 shows, the high-sensitivity region extends away from the microphone arrays MA400 and MA500 (that is, toward the lower right).

Accordingly, the sound collection areas (sensitivity distributions) of the combination in FIG. 5(a) (MA301 and MA302), the combination in FIG. 5(b) (MA302 and MA303), and the combination in FIG. 5(c) (MA303 and MA301) differ from one combination to another, producing parts that overlap and parts that do not (parts where the sensitivity distributions coincide and parts where they do not).

That is, as shown in FIG. 5, if area sound collection is performed with two or three different microphone-array combinations in the configuration of three microphones ch1 to ch3 and the individual results are added together, sound can be collected over a wider range than the sound collection area obtained with a single combination of microphone arrays.

In this embodiment, therefore, area sound collection is performed with several different combinations (combination patterns) of the microphone arrays formed by microphones placed at the vertices of a polygon (an N-sided polygon, where N is an integer of 3 or more), and the sum or the average of the individual area sound collection results (outputs) is treated as the final sound collection result for the target area. As a result, the area sound collection processing of this embodiment is more robust (more stable) against differences in the position of the speaker's mouth as seen from the transmitter.
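
The final combining step, adding or averaging the outputs of the individual patterns, can be sketched as below. Representing each area sound collection output as a list of per-bin values is an assumption made for illustration, and the function name and sample values are hypothetical.

```python
from typing import List


def combine_area_outputs(outputs: List[List[float]], average: bool = True) -> List[float]:
    """Sum (or average) the per-bin area pickup outputs of several patterns."""
    if not outputs:
        raise ValueError("at least one area pickup output is required")
    n_bins = len(outputs[0])
    if any(len(o) != n_bins for o in outputs):
        raise ValueError("all outputs must have the same number of bins")
    summed = [sum(bins) for bins in zip(*outputs)]
    if average:
        return [s / len(outputs) for s in summed]
    return summed


# Hypothetical outputs of the three patterns (areas A301, A302, A303).
a301 = [0.0, 2.0, 0.0]
a302 = [0.0, 2.0, 4.0]
a303 = [0.0, 2.0, 2.0]
summed = combine_area_outputs([a301, a302, a303], average=False)
averaged = combine_area_outputs([a301, a302, a303])
```

Averaging keeps the output level comparable to that of a single pattern, whereas plain addition raises the level of components that appear in several patterns.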

(A-1) Configuration of the First Embodiment
FIG. 1 is a block diagram showing the configuration of each device related to this embodiment.

FIG. 1 shows a communication device 100, which includes the sound collection unit 120 according to this embodiment, and a communication device 200. In FIG. 1, the communication devices 100 and 200 can communicate with each other over a communication path P.

The communication device 100 collects the voice (sound) uttered by a first user U1 and transmits the voice data of the collected voice to the communication device 200 over the communication path P; it also outputs, as sound, the voice based on the voice data received from the communication device 200 (the voice uttered by a second user U2). Likewise, the communication device 200 collects the voice (sound) uttered by the second user U2 and transmits the voice data of the collected voice to the communication device 100 over the communication path P; it also outputs, as sound, the voice based on the voice data received from the communication device 100 (the voice uttered by the first user U1).

The first user U1 is, for example, a crew member riding in an emergency vehicle such as an ambulance or a fire engine, and the second user U2 is, for example, a dispatcher at a remote location (for example, a command center that directs emergency vehicles).

The communication path P may be wired or wireless, and various connection means and connection configurations (network configurations) can be applied.

Next, an outline of the configuration of the communication device 100 is described with reference to FIG. 1.

The communication device 100 includes a handset 110, a sound collection unit 120, a communication unit 130, and an output unit 140.

The handset 110 includes a microphone array unit 111, which consists of three microphones MC1 to MC3 (a 3-channel microphone), and a speaker 112.

The communication unit 130 is a communication interface for communicating with the communication device 200 over the communication path P.

The sound collection unit 120 collects the voice (sound) uttered by the first user U1 based on the acoustic signals captured by the microphone array unit 111. The communication unit 130 then transmits the voice data of the voice collected by the sound collection unit 120 to the communication device 200.

The output unit 140 acquires voice data (voice data of the voice uttered by the second user U2) from the communication device 200 via the communication unit 130, supplies an acoustic signal based on that voice data to the speaker 112, and causes the speaker 112 to output the acoustic signal as sound.

The hardware configuration of the communication device 100 is not limited, but in this embodiment, as shown in FIG. 1, the communication device 100 is configured in hardware as a telephone with the handset 110. The communication device 100 does not necessarily need the handset 110; as with a smartphone, the entire housing (chassis) may effectively function as a handset (for example, a configuration in which a mouthpiece is provided in part of the smartphone housing).

Next, an outline of the configuration of the communication device 200 is described with reference to FIG. 1.

The communication device 200 includes a speaker 210, a microphone 220, a communication unit 230, an output unit 240, and a sound collection unit 250. The hardware configuration of the communication device 200 is likewise not limited; for example, various telephone devices (such as a speakerphone) can be used.

通信部２３０は、通信路Ｐを介して通信装置１００と通信するための通信インタフェースである。 The communication unit 230 is a communication interface for communicating with the communication device 100 via the communication path P.

収音部２５０は、マイク２２０で捕捉した音響信号に基づいて第２のユーザＵ２の発話した音声（音）を収音する。そして、通信部２３０は、収音部２５０が収音した音声の音声データを通信装置１００側に送信する。 The sound collection unit 250 collects a sound (sound) uttered by the second user U2 based on the acoustic signal captured by the microphone 220. Then, the communication unit 230 transmits the sound data of the sound collected by the sound collection unit 250 to the communication device 100 side.

出力部２４０は、通信部２３０を介して通信装置１００から音声データ（第１のユーザＵ１が発話した音声の音声データ）を取得し、当該音声データに基づく音響信号をスピーカ２１０に供給し、スピーカ２１０に当該音響信号を表音出力させる。 The output unit 240 acquires voice data (voice data of speech uttered by the first user U1) from the communication device 100 via the communication unit 230, supplies an audio signal based on the voice data to the speaker 210, and causes the speaker 210 to output the audio signal as sound.

次に、収音部１２０の詳細構成について図１を用いて説明する。 Next, a detailed configuration of the sound pickup unit 120 will be described with reference to FIG.

収音部１２０は、信号入力部１２１、周波数変換部１２２、指向性形成部１２３、目的エリア音抽出部１２４及びエリア音加算部１２５を有している。 The sound collection unit 120 includes a signal input unit 121, a frequency conversion unit 122, a directivity forming unit 123, a target area sound extraction unit 124, and an area sound addition unit 125.

収音部１２０は、例えば、プロセッサやメモリ等を備えるコンピュータにプログラム（実施形態に係る収音プログラムを含む）を実行させるようにしてもよいが、その場合であっても、機能的には、図１のように示すことができる。収音部１２０の各構成要素の処理の詳細については後述する。 The sound collection unit 120 may be implemented, for example, by causing a computer including a processor, a memory, and the like to execute a program (including the sound collection program according to the embodiment); even in that case, it can be represented functionally as shown in FIG. 1. Details of the processing of each component of the sound collection unit 120 will be described later.

次に、送受話器としてのハンドセット１１０の構成について図２、図３を用いて説明する。 Next, the configuration of the handset 110, which serves as the transmitter-receiver, will be described with reference to FIGS. 2 and 3.

図２は、ハンドセット１１０が第１のユーザＵ１の手Ｕ１ａで把持されている状態について示した斜視図である。 FIG. 2 is a perspective view showing a state where the handset 110 is being held by the hand U1a of the first user U1.

図２に示すようにハンドセット１１０は、第１のユーザＵ１（手Ｕ１ａ）に把持させるための棒形状の把手部１１５と、把手部１１５の一端に設けられた送話口１１３（送話器）と、把手部１１５の他端に設けられた受話口１１４（受話器）とを有している。 As shown in FIG. 2, the handset 110 includes a rod-shaped handle 115 for the first user U1 (hand U1a) to hold, and a mouthpiece 113 (speaker) provided at one end of the handle 115. And an earpiece 114 (receiver) provided at the other end of the handle 115.

図３は、ハンドセット１１０の送話口１１３の部分を拡大して示した図である。 FIG. 3 is an enlarged view of the mouthpiece 113 of the handset 110.

図２に示すように、受話口１１４にはスピーカ１１２が配置されている。また、図２、図３に示すように、円形の面を備える送話口１１３には、マイクアレイ部１１１（マイクロホンＭＣ１〜ＭＣ３）が配置されている。 As shown in FIG. 2, the speaker 112 is arranged in the earpiece 114. As shown in FIGS. 2 and 3, the microphone array unit 111 (microphones MC1 to MC3) is arranged in the mouthpiece 113, which has a circular surface.

次に、マイクアレイ部１１１の構成について、図２、図３を用いて説明する。 Next, the configuration of the microphone array unit 111 will be described with reference to FIGS. 2 and 3.

この実施形態の例では、マイクアレイ部１１１は、３個のマイクロホンＭＣ１〜ＭＣ３を有する構成であるものとする。 In the example of this embodiment, the microphone array unit 111 has a configuration including three microphones MC1 to MC3.

図２に示すように、第１のユーザＵ１が通信装置１００を手Ｕ１ａで把持し、耳にスピーカＳＰを押し付けた場合に、第１のユーザＵ１の口元が位置する送話口１１３の周囲（第１のユーザＵ１の口元と最も近接する部分の周囲）に３個のマイクロホンＭＣ１〜ＭＣ３が配置されている。 As shown in FIG. 2, the three microphones MC1 to MC3 are arranged around the mouthpiece 113, which is where the mouth of the first user U1 is located when the first user U1 holds the communication device 100 with the hand U1a and presses the speaker SP against the ear (that is, around the part closest to the mouth of the first user U1).

図２、図３に示すハンドセット１１０では、上述の図４、図５に示す構成と同様に、マイクアレイ部１１１を構成する３個のマイクロホンＭＣ１〜ＭＣ３の各位置（各マイクロホンの中心位置）が、送話口１１３の周囲上で、正三角形の頂点となるように配置されている。図２、図３では、収音エリアの拡大を等方向とするため、マイクロホンＭＣ１〜ＭＣ３による三角形の各辺を同じ距離（マイクロホンＭＣ１〜ＭＣ３による三角形が正三角形）としているが、各辺の距離や各角の角度は全て同じでなくてもよい。 In the handset 110 shown in FIGS. 2 and 3, as in the configuration shown in FIGS. 4 and 5 described above, the three microphones MC1 to MC3 constituting the microphone array unit 111 are arranged so that their positions (the center position of each microphone) form the vertices of an equilateral triangle on the periphery of the mouthpiece 113. In FIGS. 2 and 3, the sides of the triangle formed by the microphones MC1 to MC3 have the same length (that is, the triangle is equilateral) so that the sound pickup area expands equally in all directions; however, the side lengths and the angles need not all be the same.
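As a quick check of the equilateral layout described above, the following sketch places three microphones 120 degrees apart on a circle and computes the side lengths of the resulting triangle. The radius value and the function names are assumptions made only for illustration; the text gives no exact dimensions.

```python
import math

def equilateral_mic_positions(radius_m=0.015):
    """Return (x, y) positions of MC1..MC3 placed 120 degrees apart on a circle.

    radius_m is a hypothetical value chosen only for illustration.
    """
    return [
        (radius_m * math.cos(2 * math.pi * k / 3),
         radius_m * math.sin(2 * math.pi * k / 3))
        for k in range(3)
    ]

def side_lengths(points):
    """Lengths of the sides MC1-MC2, MC2-MC3, MC3-MC1."""
    return [math.dist(points[i], points[(i + 1) % 3]) for i in range(3)]

mics = equilateral_mic_positions()
sides = side_lengths(mics)
# For a triangle inscribed in a circle this way, every side equals radius * sqrt(3).
```

If the layout is deliberately made non-equilateral, as the text permits, the three side lengths simply differ and the pickup area is no longer expanded equally in all directions.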

なお、図３に示すように、以下では、マイクアレイ部１１１において、マイクロホンＭＣ１、ＭＣ２を対とするマイクアレイをＭＡ１、マイクロホンＭＣ２、ＭＣ３を対とするマイクアレイをＭＡ２、マイクロホンＭＣ３、ＭＣ１を対とするマイクアレイをＭＡ３と呼ぶものとする。 As shown in FIG. 3, in the microphone array unit 111, the microphone array formed by the pair of microphones MC1 and MC2 is hereinafter referred to as MA1, the microphone array formed by the pair of microphones MC2 and MC3 as MA2, and the microphone array formed by the pair of microphones MC3 and MC1 as MA3.

（Ａ−２）第１の実施形態の動作
次に、以上のような構成を有するこの実施形態の動作（実施形態に係る収音方法）を説明する。 (A-2) Operation of First Embodiment Next, the operation of this embodiment having the above-described configuration (sound collection method according to the embodiment) will be described.

通信装置１００では、収音部１２０が、マイクアレイ部１１１のマイクロホンＭＣ１〜ＭＣ３から供給される音響信号を用いて、目的エリアの目的エリア音を収音する目的エリア音収音処理を行う。 In the communication device 100, the sound collection unit 120 performs a target area sound collection process of collecting the target area sound of the target area using the acoustic signals supplied from the microphones MC1 to MC3 of the microphone array unit 111.

以下では、通信装置１００を構成する収音部１２０内部の動作を中心に説明する。 The following description focuses on the operation inside the sound pickup unit 120 included in the communication device 100.

信号入力部１２１は、各マイクロホンＭＣ１〜ＭＣ３で収音した音響信号をアナログ信号からデジタル信号に変換し、周波数変換部１２２に供給する。その後、周波数変換部１２２では、例えば高速フーリエ変換を用いてマイク信号を時間領域から周波数領域へ変換する。指向性形成部１２３はＢＦにより指向性を形成する。 The signal input unit 121 converts an acoustic signal collected by each of the microphones MC 1 to MC 3 from an analog signal to a digital signal, and supplies the digital signal to the frequency conversion unit 122. Then, the frequency conversion unit 122 converts the microphone signal from the time domain to the frequency domain using, for example, fast Fourier transform. The directivity forming unit 123 forms directivity by BF.

ここで、図７、図８を用いてＢＦによる指向性形成について説明する。 Here, the formation of directivity by BF will be described with reference to FIGS. 7 and 8.

ＢＦとは、マイクアレイにおいて各マイクロホンに到達する信号の時間差を利用して収音の指向性を形成する技術である（非特許文献１参照）。ＢＦは加算型と減算型の大きく２つの種類に分けられが、ここでは少ないマイクロホン数で指向性を形成できる減算型ＢＦについて説明する。 BF is a technique for forming a directivity of sound collection using a time difference between signals reaching respective microphones in a microphone array (see Non-Patent Document 1). BFs are roughly classified into two types, an addition type and a subtraction type. Here, a subtraction type BF that can form directivity with a small number of microphones will be described.

図７は、マイクロホン数が２個（ＭＣ１、ＭＣ２）の場合の減算型ＢＦ６００に係る構成を示すブロック図である。 FIG. 7 is a block diagram showing a configuration relating to the subtraction type BF 600 when the number of microphones is two (MC1, MC2).

図８は、２個のマイクロホンＭＣ１、ＭＣ２を用いた減算型ＢＦ６００により形成される指向特性を示す図である。 FIG. 8 is a diagram illustrating a directional characteristic formed by a subtraction type BF 600 using two microphones MC1 and MC2.

減算型ＢＦ６００は、まず遅延器６１０により目的とする方向に存在する音（以下、「目的音」と呼ぶ）が各マイクロホンＭＣ１、ＭＣ２に到来する信号の時間差を算出し、遅延を加えることにより目的音の位相を合わせる。時間差は（１）式により算出される。ここで、ｄはマイクロホンＭＣ１、ＭＣ２間の距離、ｃは音速、τ_ｉは遅延量を示している。またθ_Ｌは、マイクロホンＭＣ１、ＭＣ２の位置を結んだ直線に対する垂直方向から目的方向への角度を示している。 The subtraction-type BF 600 first uses the delay unit 610 to calculate the time difference with which the sound present in the target direction (hereinafter referred to as "target sound") arrives at the microphones MC1 and MC2, and aligns the phase of the target sound by adding a delay. The time difference is calculated by equation (1). Here, d is the distance between the microphones MC1 and MC2, c is the speed of sound, and τ_i is the delay amount. θ_L is the angle from the direction perpendicular to the straight line connecting the positions of the microphones MC1 and MC2 to the target direction.
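Equation (1) itself is not reproduced in this text. A minimal sketch of the delay calculation, assuming the usual far-field relation (delay = d · sin θ_L / c) that follows from the definitions of d, c, and θ_L above, might look like this. The function name and the default speed of sound are assumptions, not values from the source.

```python
import math

def steering_delay(d_m, theta_l_rad, c_m_per_s=343.0):
    """Arrival-time difference between two microphones spaced d_m apart.

    theta_l_rad is measured from the direction perpendicular to the line
    joining the microphones, as in the text. The formula is the standard
    far-field (plane-wave) relation and is an assumed form of equation (1).
    """
    return d_m * math.sin(theta_l_rad) / c_m_per_s

# Example: microphones 3 cm apart (a hypothetical spacing).
tau_broadside = steering_delay(0.03, 0.0)          # no delay at theta_L = 0
tau_endfire = steering_delay(0.03, math.pi / 2)    # maximum delay, d / c
```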

ここで、死角がマイクロホンＭＣ１とマイクロホンＭＣ２の中心に対し、マイクロホンＭＣ１の方向に存在する場合、遅延器６１０は、マイクロホンＭＣ１の入力信号ｘ_１（ｔ）に対し遅延処理を行う。その後、減算器６２０が、（２）式に従い減算処理を行う。減算器６２０では、この減算処理は周波数領域でも同様に行うことができ、その場合（２）式は（３）式のように変更される。

Here, when the null (blind spot) lies in the direction of the microphone MC1 relative to the midpoint between the microphones MC1 and MC2, the delay unit 610 applies the delay to the input signal x₁(t) of the microphone MC1. The subtractor 620 then performs subtraction according to equation (2). This subtraction can equally be performed in the frequency domain, in which case equation (2) is changed to equation (3).
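Equations (2) and (3) are not reproduced in this text. The sketch below assumes the common frequency-domain form of a two-microphone delay-and-subtract beamformer, M(ω) = X₂(ω) − e^{−jωτ} X₁(ω), and evaluates its magnitude response for a unit plane wave. The spacing, frequency, sign conventions, and function name are all assumptions for illustration.

```python
import cmath
import math

def subtractive_bf_gain(phi, theta_l, d=0.03, freq=1000.0, c=343.0):
    """Magnitude response of a two-microphone delay-and-subtract beamformer.

    phi     : arrival angle of a unit plane wave, measured from the direction
              perpendicular to the microphone axis, positive toward MC1.
    theta_l : angle used to compute the delay applied to the MC1 signal.
    In this sketch the response is zero for sound arriving from phi == theta_l.
    """
    omega = 2 * math.pi * freq
    tau = d * math.sin(theta_l) / c          # delay applied to x1 (assumed eq. (1))
    x1 = 1 + 0j                              # wave reaches MC1 first (reference)
    x2 = cmath.exp(-1j * omega * d * math.sin(phi) / c)  # MC2 lags by d*sin(phi)/c
    m = x2 - cmath.exp(-1j * omega * tau) * x1           # assumed form of eq. (3)
    return abs(m)

# theta_l = pi/2: null toward MC1, sensitivity toward the opposite side (cardioid-like).
null_gain = subtractive_bf_gain(math.pi / 2, math.pi / 2)
front_gain = subtractive_bf_gain(-math.pi / 2, math.pi / 2)
# theta_l = 0: null perpendicular to the axis, equal lobes along it (figure-eight).
side_a = subtractive_bf_gain(math.pi / 2, 0.0)
side_b = subtractive_bf_gain(-math.pi / 2, 0.0)
```

With these conventions, the two cases reproduce the unidirectional and bidirectional patterns that the description attributes to θ_L = ±π/2 and θ_L = 0, π.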

ここでθ_Ｌ＝±π／２の場合、形成される指向性は図８（ａ）に示すように、カージオイド型の単一指向性となり、θ_Ｌ＝０，πの場合は、図８（ｂ）のような８の字型の双指向性となる。また、減算器６２０では、スペクトル減算法（ＳｐｅｃｔｒａｌＳｕｂｔｒａｃｔｉｏｎ）の処理（以下、単に「ＳＳ」とも呼ぶ）を用いることで、双指向性の死角に強い指向性を形成することもできる。ＳＳによる指向性は、（４）式に従い全周波数、もしくは指定した周波数帯域で形成される。（４）式では、マイクロホンＭＣ１の入力信号Ｘ_１を用いているが、マイクロホンＭＣ２の入力信号Ｘ_２でも同様の効果を得ることができる。ここで、ｎはフレーム番号、βはＳＳの強度を調節するための係数を示している。減算器６２０では、減算時に値がマイナスなった場合は、０または元の値を小さくした値に置き換えるフロアリング処理を行うようにしてもよい。この方式では、双指向性の特性によって目的方向以外に存在する音（以下、「非目的音」と呼ぶ）を抽出し、抽出した非目的音の振幅スペクトルを入力信号の振幅スペクトルから減算することで、目的音を強調することができる。

Here, when θ_L = ±π/2, the formed directivity is a cardioid-type unidirectional pattern as shown in FIG. 8(a), and when θ_L = 0 or π, it is a figure-eight bidirectional pattern as shown in FIG. 8(b). The subtractor 620 can also form directivity that is strong in the direction of the null of the bidirectional pattern by applying spectral subtraction (hereinafter also simply referred to as "SS"). Directivity by SS is formed over all frequencies or over a designated frequency band according to equation (4). Equation (4) uses the input signal X₁ of the microphone MC1, but the same effect can be obtained with the input signal X₂ of the microphone MC2. Here, n is the frame number and β is a coefficient for adjusting the strength of the SS. When a value becomes negative in the subtraction, the subtractor 620 may perform flooring, replacing the value with 0 or with a reduced version of the original value. In this method, sound present in directions other than the target direction (hereinafter referred to as "non-target sound") is extracted using the bidirectional characteristic, and the amplitude spectrum of the extracted non-target sound is subtracted from the amplitude spectrum of the input signal, thereby emphasizing the target sound.
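Equation (4) is not reproduced in this text. The following sketch illustrates spectral subtraction with flooring on per-bin amplitude spectra, assuming the form Y(n) = X₁(n) − β·M(n). The function name, the `floor_ratio` parameter, and the reading of "a reduced version of the original value" as a scaled copy of the input amplitude are assumptions.

```python
def spectral_subtract(input_amp, non_target_amp, beta=1.0, floor_ratio=0.0):
    """Subtract a non-target amplitude spectrum from an input amplitude spectrum.

    input_amp      : amplitude spectrum of the microphone input, one value per bin.
    non_target_amp : amplitude spectrum of the non-target sound (bidirectional BF output).
    beta           : SS strength coefficient.
    floor_ratio    : when the result is negative, replace it with
                     floor_ratio * input amplitude (0.0 gives plain clipping to 0).
    """
    out = []
    for x, m in zip(input_amp, non_target_amp):
        value = x - beta * m
        out.append(value if value >= 0 else floor_ratio * x)
    return out

x1_amp = [1.0, 0.5]
m_amp = [0.25, 0.75]
clipped = spectral_subtract(x1_amp, m_amp)                    # [0.75, 0.0]
floored = spectral_subtract(x1_amp, m_amp, floor_ratio=0.5)   # [0.75, 0.25]
```

A nonzero floor keeps a small residual in bins where the subtraction would otherwise leave nothing.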

ところで、ある特定の目的エリア内に存在する目的エリア音だけを収音したい場合、減算型ＢＦを用いるだけでは、そのエリアと同一方向の線上に存在する音源（以下、「非目的エリア音」と呼ぶ）も収音してしまう。 When only the target area sound present in a specific target area is to be collected, using the subtraction-type BF alone also picks up sound sources lying on the line in the same direction as that area (hereinafter referred to as "non-target area sound").

そこで、指向性形成部１２３では、特許文献１で提案されているエリア収音処理（複数のマイクアレイを用い、それぞれ別々の方向から目的エリアへ指向性を向け、指向性を目的エリアで交差させることで目的エリア音を収音する処理）を行うものとして説明する。具体的には、指向性形成部１２３は、以下のような処理によりエリア収音処理を行うようにしてもよい。 Therefore, the directivity forming unit 123 is described here as performing the area sound collection processing proposed in Patent Document 1 (processing that uses a plurality of microphone arrays, directs directivity toward the target area from different directions, and collects the target area sound by making the directivities intersect in the target area). Specifically, the directivity forming unit 123 may perform the area sound collection processing as follows.

指向性形成部１２３は、マイクアレイＭＡ１〜ＭＡ３のそれぞれについて、三角形（マイクロホンＭＣ１〜ＭＣ３により形成される三角形）の内側に向かってＢＦによって指向性を形成する。そして、指向性形成部１２３は、マイクアレイＭＡ１、ＭＡ２、ＭＡ３の各ＢＦ出力Ｙ_１（ｎ）、Ｙ_２（ｎ）、Ｙ_３（ｎ）を、目的エリア音抽出部１２４に供給する。 The directivity forming unit 123 forms the directivity of each of the microphone arrays MA1 to MA3 with the BF toward the inside of the triangle (the triangle formed by the microphones MC1 to MC3). Then, the directivity forming unit 123 supplies the BF outputs Y ₁ (n), Y ₂ (n), and Y ₃ (n) of the microphone arrays MA1, MA2, and MA3 to the target area sound extracting unit 124.

目的エリア音抽出部１２４は、指向性形成部１２３で形成したマイクアレイＭＡ１、ＭＡ２、ＭＡ３のＢＦ出力Ｙ_１（ｎ）、Ｙ_２（ｎ）、Ｙ_３（ｎ）を用いてエリア音を抽出する。上述の通り、各ＢＦ出力（Ｙ_１（ｎ）、Ｙ_２（ｎ）、Ｙ_３（ｎ））は、３角形（マイクロホンＭＣ１〜ＭＣ３により形成される三角形）の各辺から中心（三角形の内側方向）に向かう指向性を成したものである。したがって、各ＢＦ出力は、そのいずれの２つの組み合せ（組み合わせのパターン）においても２つの指向性が３角形の中心付近で交差するため、目的エリア音抽出部１２４は、以下に記すエリア収音方法によって、互いの指向性が交差したエリアの音を抽出することが出来る。ここでは、代表として、マイクアレイＭＡ１のＢＦ出力Ｙ_１（ｎ）と、マイクアレイＭＡ２のＢＦ出力Ｙ_２（ｎ）を用いた場合について説明する。目的エリア音抽出部１２４は、Ｙ_１（ｎ）、Ｙ_２（ｎ）を（５）、もしくは（６）式に従いＳＳし、目的エリア方向に存在する非目的エリア音Ｎ_１−１（ｎ）、Ｎ_１−２（ｎ）を抽出する。ここでα_１、α_２は、目的エリアと各マイクアレイの距離の違いによって生じる信号レベルの差を補正する補正係数であり、所定の処理によって逐一計算されるべきものであり、その手法は特許文献１にも記載されているが、ここでは簡単のため、目的エリアと各マイクアレイまでの距離は同一（α_１（ｎ）＝α_２（ｎ）＝１）とし、（５）、（６）式を（７）、（８）式に代える。

The target area sound extraction unit 124 extracts the area sound using the BF outputs Y₁(n), Y₂(n), and Y₃(n) of the microphone arrays MA1, MA2, and MA3 formed by the directivity forming unit 123. As described above, each BF output (Y₁(n), Y₂(n), Y₃(n)) has directivity pointing from a side of the triangle (the triangle formed by the microphones MC1 to MC3) toward its center (the inside of the triangle). Therefore, for any combination of two BF outputs (combination pattern), the two directivities intersect near the center of the triangle, so the target area sound extraction unit 124 can extract the sound of the area where the directivities intersect by the area sound collection method described below. Here, as a representative case, the combination of the BF output Y₁(n) of the microphone array MA1 and the BF output Y₂(n) of the microphone array MA2 is described. The target area sound extraction unit 124 applies SS to Y₁(n) and Y₂(n) according to equation (5) or (6) and extracts the non-target area sounds N_1-1(n) and N_1-2(n) present in the direction of the target area. Here, α₁ and α₂ are correction coefficients that compensate for the difference in signal level caused by the difference in distance between the target area and each microphone array. They should be calculated successively by predetermined processing, and the method is also described in Patent Document 1. Here, for simplicity, the distances from the target area to the microphone arrays are assumed to be the same (α₁(n) = α₂(n) = 1), and equations (5) and (6) are replaced by equations (7) and (8).

その後、目的エリア音抽出部１２４は、（９）、（１０）式に従い、各ＢＦ出力から非目的エリア音をＳＳして目的エリア音を抽出する。ここで、γ_１（ｎ）、γ_２（ｎ）はＳＳ時の強度を変更するための係数である。

Thereafter, the target area sound extraction unit 124 applies SS to remove the non-target area sound from each BF output according to equations (9) and (10), thereby extracting the target area sound. Here, γ₁(n) and γ₂(n) are coefficients for changing the strength of the SS.
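Equations (5) to (10) are not reproduced in this text. The two-step extraction can be sketched as follows, assuming the simplified case α₁ = α₂ = 1 and the forms N_1-1 = Y₁ − Y₂ and Z_1-1 = Y₁ − γ₁·N_1-1 on per-bin amplitude spectra, with negative results clipped to 0. The function names and the toy spectra are assumptions for illustration.

```python
def ss(a, b, strength=1.0):
    """Spectral subtraction a - strength * b per bin, clipping negatives to 0."""
    return [max(x - strength * y, 0.0) for x, y in zip(a, b)]

def extract_area_sound(y1, y2, gamma1=1.0):
    """Two-step area sound extraction for one pair of BF outputs.

    Step 1 (assumed form of eq. (7)): non-target area sound seen by array 1.
    Step 2 (assumed form of eq. (9)): remove it from the BF output of array 1.
    """
    n_11 = ss(y1, y2)
    z_11 = ss(y1, n_11, gamma1)
    return n_11, z_11

# Toy spectra: bin 0 holds sound from the area where both beams intersect,
# bin 1 holds a source only in the beam of array 1, bin 2 only in that of array 2.
y1 = [1.0, 0.5, 0.0]
y2 = [1.0, 0.0, 0.25]
n_11, z_11 = extract_area_sound(y1, y2)
# n_11 == [0.0, 0.5, 0.0] and z_11 == [1.0, 0.0, 0.0]
```

Only the component common to both beams survives, which is the behaviour the description attributes to crossing two directivities over the target area.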

目的エリア音抽出部１２４において、強調音Ｚ_１−１（ｎ）、Ｚ_１−２（ｎ）のうちいずれを出力としても構わないが、ここではＺ_１−１（ｎ）をマイクアレイＭＡ１−マイクアレイＭＡ２の組み合せ（組み合わせのパターン）によるエリア収音出力Ｚ_１（ｎ）として用いることとする。 The target area sound extraction unit 124 may output either of the emphasized sounds Z_1-1(n) and Z_1-2(n); here, Z_1-1(n) is used as the area sound pickup output Z₁(n) for the combination (combination pattern) of the microphone arrays MA1 and MA2.

同様にして目的エリア音抽出部１２４は、マイクアレイＭＡ２−マイクアレイＭＡ３の組み合せによるエリア収音出力Ｚ_２（ｎ）、及びマイクアレイＭＡ３−マイクアレイＭＡ１の組み合せによるエリア収音出力Ｚ_３（ｎ）を抽出し、エリア音加算部１２５へ供給する。 In the same manner, the target area sound extraction unit 124 extracts the area sound pickup output Z₂(n) for the combination of the microphone arrays MA2 and MA3 and the area sound pickup output Z₃(n) for the combination of the microphone arrays MA3 and MA1, and supplies them to the area sound adding unit 125.

図２に示すように、マイクロホンＭＣ１〜ＭＣ３は、いずれもハンドセット１１０の送話口１１３における数センチ径の狭い範囲に装着されている。したがって、各マイクアレイＭＡ１、ＭＡ２、ＭＡ３は、非常に近接（密集）した配置であり、それぞれの収音エリアも送話口１１３前の狭い範囲に限られる。しかし、上述の図６に示すように、エリア収音による収音エリアは、２つのマイクアレイの遠方方向に拡がる特性があることが判っている。したがって、それぞれ異なる３方向に拡がった収音エリア（Ｚ_１（ｎ）、Ｚ_２（ｎ）、Ｚ_３（ｎ）のそれぞれに対応する収音エリア）を重ね合わせれば、単独の収音エリア（Ｚ_１（ｎ）、Ｚ_２（ｎ）、Ｚ_３（ｎ）のうちいずれか１つに対応する収音エリア）に比べ、より広い範囲のエリア収音が可能になる。 As shown in FIG. 2, each of the microphones MC 1 to MC 3 is mounted in a narrow range of several centimeters in the mouthpiece 113 of the handset 110. Therefore, the microphone arrays MA1, MA2, and MA3 are arranged very close (dense), and the sound collection area is also limited to a narrow area in front of the mouthpiece 113. However, as shown in FIG. 6 described above, it has been found that the sound pickup area by the area sound pickup has a characteristic of extending in the far direction of the two microphone arrays. Therefore, if sound pickup areas (sound pickup areas corresponding to Z ₁ (n), Z ₂ (n), and Z ₃ (n)) extending in three different directions are overlapped, a single sound pickup area (single sound pickup area) is obtained. A wider range of area sound collection is possible as compared to a sound collection area corresponding to any one of Z ₁ (n), Z ₂ (n), and Z ₃ (n).

そこで、エリア音加算部１２５では、３個のエリア収音の出力Ｚ_１（ｎ）、Ｚ_２（ｎ）、Ｚ_３（ｎ）を加算又は加算平均して最終出力（収音結果）Ｗ（ｎ）を生成して収音部１２０の収音結果として出力する。エリア音加算部１２５は、当該加算処理においてはエリア同士が重なる部分があることを考慮し、３個のエリア収音の出力の加算値（Ｚ_１（ｎ）＋Ｚ_２（ｎ）、＋Ｚ_３（ｎ））を平均化、あるいは式（１１）に示すようにゲイン調整の係数αを乗じてもよい。なお、エリア音加算部１２５は、３個のエリア収音の出力（Ｚ_１（ｎ）、Ｚ_２（ｎ）、Ｚ_３（ｎ））のうち、２以上の出力だけを加算（又は加算平均）する処理を行うようにしてもよい。例えば、エリア音加算部１２５は、３個のエリア収音の出力のうち、２つの出力だけを加算（又は加算平均）する処理を行うようにしてもよい。

Therefore, the area sound adding unit 125 adds or averages the three area sound pickup outputs Z₁(n), Z₂(n), and Z₃(n) to generate the final output (sound collection result) W(n), and outputs it as the sound collection result of the sound collection unit 120. Considering that the areas partly overlap, the area sound adding unit 125 may average the sum of the three area sound pickup outputs (Z₁(n) + Z₂(n) + Z₃(n)) or multiply it by a gain adjustment coefficient α as shown in equation (11). The area sound adding unit 125 may also add (or average) only two or more of the three area sound pickup outputs (Z₁(n), Z₂(n), Z₃(n)); for example, it may add (or average) only two of the three outputs.
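Equation (11) is not reproduced in this text. A minimal sketch of the addition step, assuming the form W(n) = α·(Z₁(n) + Z₂(n) + Z₃(n)) applied per frequency bin, is shown below. The function name and the convention that an omitted gain means averaging are assumptions.

```python
def add_area_sounds(outputs, gain=None):
    """Combine area sound pickup outputs bin by bin.

    outputs : list of amplitude spectra (e.g. [Z1, Z2, Z3], or any two of them).
    gain    : gain adjustment coefficient; when None, the outputs are averaged.
    """
    if not outputs:
        raise ValueError("at least one area sound pickup output is required")
    if gain is None:
        gain = 1.0 / len(outputs)
    n_bins = len(outputs[0])
    return [gain * sum(z[k] for z in outputs) for k in range(n_bins)]

z1 = [1.0, 0.5, 0.0]
z2 = [1.0, 0.0, 0.25]
z3 = [1.0, 0.0, 0.0]
w_sum = add_area_sounds([z1, z2, z3], gain=1.0)   # plain addition of all three
w_two = add_area_sounds([z1, z2])                 # average of only two outputs
```

In the plain sum, the first bin (present in every output) is counted three times, which is why the description suggests averaging or a gain coefficient.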

以上のように、収音部１２０は、拡大されたエリアから収音された目的音声として最終出力Ｗ（ｎ）を出力する。このとき、収音部１２０は、Ｗ（ｎ）を周波数−時間変換した音声データとして出力するようにしてもよい。 As described above, the sound pickup unit 120 outputs the final output W (n) as the target sound picked up from the enlarged area. At this time, the sound pickup unit 120 may output W (n) as frequency-time converted audio data.

そして、通信部１３０は、最終出力Ｗ（ｎ）に基づく音声データを、通信路Ｐを介して通信装置２００に送信する。 Then, the communication unit 130 transmits the audio data based on the final output W (n) to the communication device 200 via the communication path P.

そして、通信装置２００の通信部２３０は、通信装置１００から受信した音声データ（Ｗ（ｎ）に基づく音声データ）を出力部２４０に供給する。出力部２４０は、受信した音声データに基づく音響信号をスピーカ２１０に供給して表音出力（第２のユーザＵ２に向けて表音出力）させる。 Then, the communication unit 230 of the communication device 200 supplies the audio data received from the communication device 100 (audio data based on W(n)) to the output unit 240. The output unit 240 supplies an audio signal based on the received audio data to the speaker 210 and causes it to be output as sound toward the second user U2.

（Ａ−３）第１の実施形態の効果
この実施形態によれば、以下のような効果を奏することができる。 (A-3) Effects of First Embodiment According to this embodiment, the following effects can be obtained.

この実施形態の収音部１２０では、別々の方向からエリア収音を行い、それらを足し合わせることで、従来の１組（２つ）のマイクアレイを用いたエリア収音よりも広く、等方向性をもった収音エリア（拡大した収音エリア）を形成することができる。これにより、収音部１２０では、ハンドセット１１０の送話口１１３に付けられたマイクロホンＭＣ１〜ＭＣ３を用いたエリア収音を行う際に、話者（第１のユーザＵ１）の口元と送話口１１３との相対的な位置がずれた場合でも安定した音声収音が可能となる。 The sound collection unit 120 of this embodiment performs area sound collection from different directions and adds the results, and can thereby form a sound pickup area (an enlarged sound pickup area) that is wider and more isotropic than that of conventional area sound collection using one set of (two) microphone arrays. As a result, when the sound collection unit 120 performs area sound collection using the microphones MC1 to MC3 attached to the mouthpiece 113 of the handset 110, stable voice pickup is possible even if the relative position of the mouth of the speaker (first user U1) and the mouthpiece 113 shifts.

（Ｂ）第２の実施形態
以下、本発明による収音装置、プログラム及び方法の第２の実施形態を、図面を参照しながら詳述する。この実施形態では、本発明の収音装置、プログラム及び方法を収音部に適用した例について説明する。 (B) Second Embodiment Hereinafter, a second embodiment of a sound collection device, a program, and a method according to the present invention will be described in detail with reference to the drawings. In this embodiment, an example in which a sound collection device, a program, and a method of the present invention are applied to a sound collection unit will be described.

上述の通り、第１の実施形態の収音部１２０では、別々の方向からエリア収音を行い、それらを重ね合わせる（足し合わせる）ことで、従来の１組（２つ）のマイクアレイを用いたエリア収音よりも広く、等方向性をもった収音エリア（拡大した収音エリア）を形成している。 As described above, the sound collection unit 120 of the first embodiment performs area sound collection from different directions and superimposes (adds) the results, thereby forming a sound pickup area (an enlarged sound pickup area) that is wider and more isotropic than that of conventional area sound collection using one set of (two) microphone arrays.

しかしながら、第１の実施形態の収音部１２０のように、収音エリアを拡げる試みは、一方で、特定のエリアのみの音を収音することで周辺の不要音を抑圧し、目的音を強調するというエリア収音本来の効果を減ずる恐れがある。 However, an attempt to expand the sound pickup area, as in the sound collection unit 120 of the first embodiment, may on the other hand weaken the inherent effect of area sound collection, namely suppressing surrounding unwanted sound and emphasizing the target sound by collecting sound only from a specific area.

そこで、第２の実施形態の収音部１２０Ａでは、第１の実施形態における上述のような問題を解決するために収音可能エリアを拡げつつも目的音強調性能の劣化を抑制する構成となっている。 Therefore, in the sound pickup unit 120A of the second embodiment, in order to solve the above-described problem in the first embodiment, the sound pickup area is expanded and the deterioration of the target sound enhancement performance is suppressed. ing.

（Ｂ−１）第２の実施形態の構成
図９は、第２の実施形態に関連する各装置の構成について示したブロック図である。図９では、上述の図１と同一部分又は対応部分には、同一符号又は対応符号を付している。 (B-1) Configuration of Second Embodiment FIG. 9 is a block diagram showing a configuration of each device related to the second embodiment. In FIG. 9, the same or corresponding portions as those in FIG. 1 described above are denoted by the same reference numerals or corresponding reference numerals.

第２の実施形態では、通信装置１００が通信装置１００Ａに置き換わっている。また、第２の実施形態の通信装置１００Ａでは、マイクアレイ部１１１と収音部１２０が、マイクアレイ部１１１Ａと収音部１２０Ａに置き換わっている。 In the second embodiment, the communication device 100 is replaced with a communication device 100A. In the communication device 100A according to the second embodiment, the microphone array unit 111 and the sound pickup unit 120 are replaced with the microphone array unit 111A and the sound pickup unit 120A.

次に、第２の実施形態における収音部１２０Ａの内部構成について説明する。 Next, an internal configuration of the sound pickup unit 120A according to the second embodiment will be described.

次に、収音部１２０Ａの内部構成について図９を用いて説明する。 Next, the internal configuration of the sound pickup unit 120A will be described with reference to FIG.

収音部１２０Ａでは、目的エリア音抽出部１２４が目的エリア音抽出部１２４Ａに置き換わり、エリア音加算部１２５が除外されている点で第１の実施形態と異なっている。また、収音部１２０Ａでは、部分エリア成分算出部１２６と部分エリア選択部１２７が追加されている点で第１の実施形態と異なっている。 The sound collecting unit 120A is different from the first embodiment in that the target area sound extracting unit 124 is replaced with the target area sound extracting unit 124A, and the area sound adding unit 125 is excluded. The sound pickup unit 120A is different from the first embodiment in that a partial area component calculation unit 126 and a partial area selection unit 127 are added.

次に、第２の実施形態のマイクアレイ部１１１Ａの構成について説明する。 Next, the configuration of the microphone array unit 111A according to the second embodiment will be described.

図９に示すように、第２の実施形態において、マイクアレイ部１１１Ａは、６つのマイクロホンＭＣ１〜ＭＣ６を有している。 As shown in FIG. 9, in the second embodiment, the microphone array unit 111A has six microphones MC1 to MC6.

図１０は、マイクアレイ部１１１Ａにおける６つのマイクロホンＭＣ１〜ＭＣ６の配置及びマイクアレイの構成例について示した図である。 FIG. 10 is a diagram illustrating an arrangement of six microphones MC1 to MC6 in the microphone array unit 111A and a configuration example of the microphone array.

図１０に示すように、マイクアレイ部１１１Ａを構成する６つのマイクロホンＭＣ１〜ＭＣ６は、２つずつのマイクロホンを対として３つのマイクアレイＭＡ１（マイクロホンＭＣ１、ＭＣ２を対とするマイクアレイ）、ＭＡ２（マイクロホンＭＣ３、ＭＣ４を対とするマイクアレイ）、ＭＡ３（マイクロホンＭＣ５、ＭＣ６を対とするマイクアレイ）を構成している。 As shown in FIG. 10, the six microphones MC1 to MC6 constituting the microphone array unit 111A are paired two by two to form three microphone arrays: MA1 (the pair of microphones MC1 and MC2), MA2 (the pair of microphones MC3 and MC4), and MA3 (the pair of microphones MC5 and MC6).

分割数を増やしピンポイントのエリアから目的音を抽出する観点からは、３エリア以上の構成を有することが望ましいが、第２の実施形態では、本発明の原理を解り易く説明するため、重なりを持つ２つのエリアのエリア収音を行なう例について説明する。重なりを持つ３つのエリアによる構成については、後述する第３の実施形態で示す。 From the viewpoint of increasing the number of divisions and extracting the target sound from a pinpoint area, a configuration with three or more areas is desirable. In the second embodiment, however, an example of area sound collection with two overlapping areas is described so that the principle of the present invention is easy to follow. A configuration with three overlapping areas is shown in the third embodiment described later.

（Ｂ−２）第２の実施形態の動作
次に、以上のような構成を有する第２の実施形態の動作（実施形態に係る収音方法）を説明する。 (B-2) Operation of Second Embodiment Next, an operation (a sound collection method according to the embodiment) of the second embodiment having the above-described configuration will be described.

信号入力部１２１は、６つのマイクロホンＭＣ１〜ＭＣ６で収音した音響信号を、それぞれアナログ信号からデジタル信号ｘ_１〜ｘ_６に変換し、周波数変換部１２２に供給する。 The signal input unit 121 converts the acoustic signals collected by the six microphones MC1 to MC6 from analog signals into digital signals x₁ to x₆, respectively, and supplies them to the frequency conversion unit 122.

周波数変換部１２２では、例えば高速フーリエ変換を用いてマイクロホン信号ｘ_１〜ｘ_６を、時間領域から周波数領域の信号Ｘ_１〜Ｘ_６へ変換する。 The frequency conversion unit 122 converts the microphone signals x₁ to x₆ from the time domain into frequency-domain signals X₁ to X₆ using, for example, the fast Fourier transform.

指向性形成部１２３は、周波数変換部１２２によって時間−周波数変換された各マイクロホンの入力信号を用いてＢＦにより指向性を形成する。第２の実施形態では、マイクアレイＭＡ１によるＢＦ出力をＹ_１、マイクアレイＭＡ２によるＢＦ出力をＹ_２、マイクアレイＭＡ３によるＢＦ出力をＹ_３とする。ＢＦ出力Ｙ_１、Ｙ_２、Ｙ_３の指向性は図１０に示す通りである。第２の実施形態では図１０に示す通り、マイクアレイＭＡ１〜ＭＡ３が三角形の各頂点の位置に配置されており、ＢＦ出力Ｙ_１、Ｙ_２、Ｙ_３の指向性（マイクアレイＭＡ１〜ＭＡ３の指向性）はそれぞれ三角形の内側を向けられている。 The directivity forming unit 123 forms directivity by BF using the input signal of each microphone that has been time-frequency converted by the frequency conversion unit 122. In the second embodiment, the BF output of the microphone array MA1 is denoted Y₁, the BF output of the microphone array MA2 is denoted Y₂, and the BF output of the microphone array MA3 is denoted Y₃. The directivities of the BF outputs Y₁, Y₂, and Y₃ are as shown in FIG. 10. In the second embodiment, as shown in FIG. 10, the microphone arrays MA1 to MA3 are arranged at the vertices of a triangle, and the directivities of the BF outputs Y₁, Y₂, and Y₃ (the directivities of the microphone arrays MA1 to MA3) each point toward the inside of the triangle.

目的エリア音抽出部１２４Ａでは、指向性形成部１２３で生成されたＢＦ出力を用いてエリア収音処理を行なう。エリア収音は、異なる方向からＢＦの指向性を向け、指向性が交差したエリアの成分（エリア音）を分離・抽出するものである。ＢＦ出力Ｙ_１、Ｙ_２の組み合わせ、およびＢＦ出力Ｙ_１、Ｙ_３の組み合わせのそれぞれからエリア収音が実現できる。 The target area sound extraction unit 124A performs area sound collection processing using the BF outputs generated by the directivity forming unit 123. Area sound collection directs BF directivities from different directions and separates and extracts the component (area sound) of the area where the directivities intersect. Area sound collection can be realized from each of the combination of the BF outputs Y₁ and Y₂ and the combination of the BF outputs Y₁ and Y₃.

図１１は、目的エリア音抽出部１２４Ａがエリア収音をおこなう収音エリアの分布について示した説明図である。 FIG. 11 is an explanatory diagram showing the distribution of sound collection areas where the target area sound extraction unit 124A performs area sound collection.

上述の図６で示したように、エリア収音ではマイクアレイから遠い方向に収音エリアが広がる特性を持つ。そのため、マイクアレイＭＡ１−ＭＡ２によるエリア収音領域（第２の実施形態では、「エリア１」又は「収音エリア１」と呼ぶ）と、マイクアレイＭＡ２−ＭＡ３によるエリア収音領域（第２の実施形態では、「エリア２」又は「収音エリア２」と呼ぶ）は、図１１のようなイメージになる。第２の実施形態では、収音エリア１のエリア収音成分（エリア収音出力）をＺ_１、エリア２のエリア収音成分（エリア収音出力）をＺ_２とする。 As shown in FIG. 6 described above, area sound collection has the characteristic that the sound pickup area spreads in the direction away from the microphone arrays. Therefore, the area sound pickup region of the microphone arrays MA1 and MA2 (referred to as "area 1" or "sound pickup area 1" in the second embodiment) and the area sound pickup region of the microphone arrays MA2 and MA3 (referred to as "area 2" or "sound pickup area 2" in the second embodiment) look as illustrated in FIG. 11. In the second embodiment, the area sound pickup component (area sound pickup output) of sound pickup area 1 is denoted Z₁, and the area sound pickup component (area sound pickup output) of area 2 is denoted Z₂.

それぞれの収音エリアは、図１２のように２つの収音エリアが重複する部分と、重複しない独立した部分に分けられる。 Each sound pickup area is divided into a part where the two sound pickup areas overlap as shown in FIG. 12 and an independent part where the two sound pickup areas do not overlap.

図１２では、エリア１、２で重複する領域を重複エリア１∧２としている。また、図１２では、エリア１内で、重複エリア１∧２を除く独立した領域（他の収音エリアと重複していない領域）を独立エリアＡとしている。さらに、図１２では、エリア２内で、重複エリア１∧２を除く独立した領域を独立エリアＢとしている。なお、１つの収音エリアから派生する独立エリア（独立部分）は、図１２に示すように複数の領域に分割される場合が有りえるが、本明細書では１つの収音エリアから発生した独立エリアについてはまとめて１つの符号で示すものとする。例えば、図１２では、独立エリアＡは重複エリア１∧２により２つの領域に分割（分断）されているが、ここでは、この２つの領域をまとめて独立エリアＡと呼ぶことになる。 In FIG. 12, the region where areas 1 and 2 overlap is referred to as overlapping area 1∧2. Within area 1, the independent region excluding overlapping area 1∧2 (the region that does not overlap another sound pickup area) is referred to as independent area A. Likewise, within area 2, the independent region excluding overlapping area 1∧2 is referred to as independent area B. Note that an independent area (independent portion) derived from one sound pickup area may be split into several regions, as shown in FIG. 12; in this specification, the independent area derived from one sound pickup area is denoted collectively by a single reference sign. For example, in FIG. 12 the independent area A is split (divided) into two regions by overlapping area 1∧2, and these two regions are collectively referred to as independent area A.

以上により、エリア１は重複エリア１∧２と独立エリアＡ（エリア１から重複エリア１∧２を除いた領域）とから成り、エリア２は重複エリア１∧２と独立エリアＢ（エリア２から重複エリア１∧２を除いた領域）とから成る。エリア１のエリア収音出力Ｚ_１と、エリア２のエリア収音出力Ｚ_２を重ね合わせる（足し合わせる）と、広い範囲のエリアから収音できるが、重複エリア１∧２の成分が二重に加わることになり収音エリア全体として均一な収音特性は得られない。したがって、重複エリア１∧２と独立エリアＡ、Ｂの音源を個別に分離・抽出することができれば、それぞれのエリアを重複することなく統合することでエリア１、２の全範囲に亘って均一な収音特性が得られることになる。 Accordingly, area 1 consists of overlapping area 1∧2 and independent area A (area 1 excluding overlapping area 1∧2), and area 2 consists of overlapping area 1∧2 and independent area B (area 2 excluding overlapping area 1∧2). Superimposing (adding) the area sound pickup output Z₁ of area 1 and the area sound pickup output Z₂ of area 2 allows sound to be collected from a wide area, but the component of overlapping area 1∧2 is then added twice, so a uniform sound pickup characteristic cannot be obtained over the sound pickup area as a whole. Therefore, if the sound sources of overlapping area 1∧2 and of independent areas A and B can be separated and extracted individually, integrating the areas without duplication yields a uniform sound pickup characteristic over the entire range of areas 1 and 2.

The partial area component calculation unit 126 separates two area sound pickup components that share overlap area 1∧2 (here, those of areas 1 and 2) into the sound pickup component of overlap area 1∧2 and the area sound pickup components of the independent areas (here, independent areas A and B).

FIG. 13 is an explanatory diagram showing, as bar graphs, the composition (power of each component) of each area shown in FIG. 12.

FIG. 13(a) shows the composition of the area sound pickup component Z_1 of area 1, and FIG. 13(b) shows the composition of the area sound pickup component Z_2 of area 2. FIG. 13(c) shows the composition of Z_1 from FIG. 13(a) with the component of overlap area 1∧2 hatched (diagonal-line pattern). FIG. 13(d) likewise shows the composition of the area sound pickup output Z_2 from FIG. 13(b) with the component of overlap area 1∧2 hatched.

Because overlap area 1∧2 is, literally, common to area 1 and area 2, it is contained in both Z_1 and Z_2 as the same component. The target area sound extraction unit 124A therefore separates the components by spectral subtraction (SS), based on the same principle as area sound pickup.

The partial area component calculation unit 126 spectrally subtracts the area sound pickup output Z_2 from the area sound pickup output Z_1, clipping to 0 any component that becomes negative in the subtraction. As a result, in the target area sound extraction unit 124A, the area sound pickup component of overlap area 1∧2 is removed from Z_1 and the area sound pickup component of independent area A (called "V_A" in the first embodiment) is separated. Similarly, by spectrally subtracting the area sound pickup output Z_1 from the area sound pickup output Z_2, the partial area component calculation unit 126 can separate the area sound pickup component of independent area B (called "V_B" in the first embodiment).
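The subtraction-and-clipping step described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `spectral_subtract`, the six-bin power spectra, and the simplification that each frequency bin is occupied by a single source are all assumptions made for the example.

```python
def spectral_subtract(minuend, subtrahend):
    """Per-bin subtraction of power values; negative results are clipped to 0."""
    return [max(m - s, 0.0) for m, s in zip(minuend, subtrahend)]

# Hypothetical toy power spectra. Bins 0-1 hold a source in independent area A,
# bins 2-3 a source in overlap area 1∧2, bins 4-5 a source in independent area B.
Z1 = [3.0, 1.0, 2.0, 4.0, 0.0, 0.0]   # area 1 = independent area A + overlap
Z2 = [0.0, 0.0, 2.0, 4.0, 5.0, 2.0]   # area 2 = independent area B + overlap

V_A = spectral_subtract(Z1, Z2)   # overlap bins cancel, area-B bins clip to 0
V_B = spectral_subtract(Z2, Z1)   # overlap bins cancel, area-A bins clip to 0

print("V_A:", V_A)
print("V_B:", V_B)
```

Under the single-source-per-bin assumption, the overlap bins cancel exactly and the bins that belong only to the other area go negative and are clipped, which is why only the independent component survives.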

FIG. 14 is an explanatory diagram showing the procedure by which the partial area component calculation unit 126 calculates the sound pickup components (V_A, V_B) of the independent areas (independent areas A and B).

FIGS. 14(a) to 14(c) together show the process in which the partial area component calculation unit 126 spectrally subtracts the area sound pickup output Z_2 of area 2 from the area sound pickup output Z_1 of area 1 to extract the area sound pickup component V_A of independent area A (the process corresponding to equation (21) below). The individual graphs in FIGS. 14(a) to 14(c) show the composition of the area sound pickup component Z_1 of area 1, the area sound pickup component Z_2 of area 2, and the area sound pickup component V_A of independent area A, respectively.

Similarly, FIGS. 14(d) to 14(f) together show the process in which the partial area component calculation unit 126 spectrally subtracts the area sound pickup component Z_1 of area 1 from the area sound pickup component Z_2 of area 2 to extract the area sound pickup component V_B of independent area B (the process corresponding to equation (22) below). The individual graphs in FIGS. 14(d) to 14(f) show the composition of the area sound pickup component Z_2 of area 2, the area sound pickup component Z_1 of area 1, and the area sound pickup component V_B of independent area B, respectively.

In the composition images of FIG. 14, the area sound pickup component of overlap area 1∧2, the area sound pickup component V_A of independent area A, and the area sound pickup component V_B of independent area B are each drawn with a different pattern.

Based on the area sound pickup component V_A of independent area A or the area sound pickup component V_B of independent area B, the partial area component calculation unit 126 can obtain the area sound pickup component of overlap area 1∧2 (hereinafter "V_1∧2"). For example, the partial area component calculation unit 126 spectrally subtracts V_A from the area sound pickup output Z_1, clipping to 0 any component that becomes negative in the subtraction. As a result, in the target area sound extraction unit 124A, the area sound pickup component V_A of independent area A is removed from Z_1 and the area sound pickup component V_1∧2 of overlap area 1∧2 is separated. Similarly, the partial area component calculation unit 126 can separate V_1∧2 by spectrally subtracting the area sound pickup component V_B of independent area B from the area sound pickup output Z_2.

FIG. 15 is an explanatory diagram showing the procedure by which the partial area component calculation unit 126 calculates the area sound pickup component V_1∧2 of overlap area 1∧2.

FIGS. 15(a), 15(b), and 15(e) show the process in which the partial area component calculation unit 126 spectrally subtracts the area sound pickup component V_A of independent area A from the area sound pickup output Z_1 to extract the area sound pickup component V_1∧2 of overlap area 1∧2 (the process corresponding to equation (23) below). FIGS. 15(c), 15(d), and 15(e) show the process in which the partial area component calculation unit 126 spectrally subtracts the area sound pickup component V_B of independent area B from the area sound pickup output Z_2 to extract V_1∧2 (the process corresponding to equation (24) below). In this way, the partial area component calculation unit 126 can separate and extract the area sound pickup component V_A of independent area A, the area sound pickup component V_B of independent area B, and the area sound pickup component V_1∧2 of overlap area 1∧2.
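The two routes to the overlap component can be compared in a small sketch. It is an illustration under assumptions, not the patent's equations: `spectral_subtract` and the toy spectra are hypothetical, and each frequency bin is assumed to contain a single source.

```python
def spectral_subtract(minuend, subtrahend):
    """Per-bin subtraction of power values; negative results are clipped to 0."""
    return [max(m - s, 0.0) for m, s in zip(minuend, subtrahend)]

# Hypothetical toy power spectra; each bin is occupied by a single source.
Z1 = [3.0, 1.0, 2.0, 4.0, 0.0, 0.0]   # area 1: independent area A + overlap
Z2 = [0.0, 0.0, 2.0, 4.0, 5.0, 2.0]   # area 2: independent area B + overlap

V_A = spectral_subtract(Z1, Z2)            # independent area A
V_B = spectral_subtract(Z2, Z1)            # independent area B

V_12_via_area1 = spectral_subtract(Z1, V_A)   # remove A from area 1
V_12_via_area2 = spectral_subtract(Z2, V_B)   # remove B from area 2

print(V_12_via_area1)
print(V_12_via_area2)
```

In this idealized setting both routes return the same overlap spectrum. With real signals the two estimates would generally differ somewhat, which may be one reason the text offers either route.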

The partial area selection unit 127 selects one of the area sound pickup components Z_1 and Z_2 or the area sound pickup components V_1∧2, V_A, and V_B of the divided partial areas, and outputs it as the final sound pickup result W. The selection method used by the partial area selection unit 127 is not limited.

As described above, the sound pickup unit 120A outputs the area sound pickup component selected by the partial area selection unit 127 as the final output W(n).

The communication unit 230 of the communication device 200 then supplies the audio data received from the communication device 100A (audio data based on W(n)) to the output unit 140. The output unit 140 supplies an acoustic signal based on the received audio data to the speaker 210, which emits it as sound toward the second user U2.

(B-3) Effects of the Second Embodiment
According to the second embodiment, the following effects can be obtained in comparison with the first embodiment.

For two area sound pickup outputs that share an overlap area, the sound pickup unit 120A of the second embodiment exploits the overlap between the areas to separate and extract the area components of the overlap area and of the non-overlapping independent areas, thereby dividing the whole area into a plurality of small areas. By then selecting, from the divided small areas, the one best suited as the target sound collection area, the sound pickup unit 120A can extract enhanced speech from the pinpoint area containing the target sound while still covering the whole range spanned by the multiple sound collection areas.

(C) Third Embodiment
A third embodiment of the sound collection device, program, and method according to the present invention is described in detail below with reference to the drawings. This embodiment describes an example in which the sound collection device, program, and method of the present invention are applied to a sound pickup unit.

(C-1) Configuration of the Third Embodiment
FIG. 16 is a block diagram showing the configuration of each device related to the third embodiment.

In FIG. 16, parts identical or corresponding to those of Y1 described above are given identical or corresponding reference signs. The third embodiment is described below with a focus on its differences from the second embodiment.

The third embodiment differs from the second embodiment in that the communication device 100A is replaced by a communication device 100B. In the communication device 100B of the third embodiment, the microphone array unit 111A is replaced by a microphone array unit 111B, and the sound pickup unit 120A is replaced by a sound pickup unit 120B.

Next, the internal configuration of the sound pickup unit 120B in the third embodiment is described.

The sound pickup unit 120B of the third embodiment differs from the second embodiment in that the target area sound extraction unit 124A, the partial area component calculation unit 126, and the partial area selection unit 127 are replaced by a target area sound extraction unit 124B, a partial area component calculation unit 126B, and a partial area selection unit 127B, respectively.

Next, the configuration of the microphone array unit 111B is described with reference to FIG. 17.

In the example of this embodiment, as shown in FIG. 17, the communication device 100B is assumed to be configured, in hardware terms, as a smartphone (a smartphone carried by the speaker U1). In the example of the third embodiment, the microphone array unit 111B is assumed to have three microphones MC1 to MC3.

As FIG. 17 shows, because the communication device 100 in this example is configured as a smartphone, the three microphones MC1 to MC3 are desirably arranged around the part of the smartphone that normally serves as the mouthpiece (the end opposite the part where the loudspeaker SP is located). In other words, in the communication device 100 the three microphones MC1 to MC3 are desirably arranged around the part that faces the mouth of the speaker U1 during use (the part closest to the mouth of the speaker U1). In FIG. 17, the three microphones MC1 to MC3 are arranged around the part where the mouth of the speaker U1 is located when the speaker U1 holds the communication device 100 in the hand and presses the loudspeaker SP against the ear (the lower part as seen in FIG. 17).

In the communication device 100 (microphone array unit 1) shown in FIG. 17, the three microphones MC1 to MC3 are arranged so that their positions (the center position of each microphone) form the vertices of an equilateral triangle. In this embodiment, combinations of the three microphones MC1 to MC3 form three microphone arrays MA1 to MA3. As shown in FIG. 17, the microphone array formed by the pair MC1 and MC2 is called MA1, the one formed by the pair MC2 and MC3 is called MA2, and the one formed by the pair MC3 and MC1 is called MA3.

In this embodiment the microphones MC1 to MC3 are arranged in an equilateral triangle so that the areas expand equally in each direction, but the arrangement is not necessarily limited to an equilateral triangle. That is, the side lengths and the corner angles of the triangle formed by the microphones MC1 to MC3 need not all be equal.

As described above, in the third embodiment, as shown in FIG. 18, three microphone arrays (MA1 to MA3) are formed from three microphones (MC1 to MC3), and area sound pickup is performed for three areas using combinations of the microphone arrays.

(C-2) Operation of the Third Embodiment
Next, the operation of the third embodiment configured as above (the sound collection method according to the embodiment) is described.

The signal input unit 121 converts the acoustic signals picked up by the three microphones MC1 to MC3 from analog signals into digital signals x_1 to x_3 and supplies them to the frequency conversion unit 122.

The frequency conversion unit 122 converts the microphone signals from the time domain into frequency-domain signals X_1 to X_3, using, for example, a fast Fourier transform.
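The time-to-frequency conversion can be illustrated with a short sketch. Note the substitution: this uses a naive O(N²) discrete Fourier transform from the Python standard library in place of a fast Fourier transform, purely to keep the example self-contained. The function name `dft` and the four-sample frame are assumptions for illustration.

```python
import cmath

def dft(frame):
    """Naive discrete Fourier transform of one time-domain frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]

# Hypothetical 4-sample frame x_1 from microphone MC1: one cycle of a cosine.
x1 = [1.0, 0.0, -1.0, 0.0]
X1 = dft(x1)                          # frequency-domain signal X_1
power = [abs(c) ** 2 for c in X1]     # per-bin power
print(power)
```

A practical implementation would use an FFT routine on windowed, overlapping frames; the per-bin quantities it produces are what the later subtraction steps operate on.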

The directivity forming unit 123 forms directivity by BF using the input signals of the microphones after time-frequency conversion by the frequency conversion unit 122. In the third embodiment, the BF output of microphone array MA1 is denoted Y_1, the BF output of microphone array MA2 is denoted Y_2, and the BF output of microphone array MA3 is denoted Y_3.

The target area sound extraction unit 124 uses the BF outputs Y_1, Y_2, and Y_3 formed by the directivity forming unit 123 and performs area sound pickup processing for each of the combinations Y_1-Y_2, Y_2-Y_3, and Y_3-Y_1.

In the third embodiment, the area (sound collection area) given by the combination Y_1-Y_2 is called "1", the area given by the combination Y_2-Y_3 is called "2", and the area given by the combination Y_3-Y_1 is called "3".
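The cyclic pairing used so far (microphones into arrays, then BF outputs into areas) can be written compactly. The dictionary names `arrays` and `areas` are hypothetical and exist only for this sketch.

```python
mics = ["MC1", "MC2", "MC3"]
n = len(mics)

# Microphone arrays: MA1 = (MC1, MC2), MA2 = (MC2, MC3), MA3 = (MC3, MC1).
arrays = {f"MA{i + 1}": (mics[i], mics[(i + 1) % n]) for i in range(n)}

# Sound collection areas: area k pairs BF output Y_k with the next one, cyclically.
areas = {i + 1: (f"Y{i + 1}", f"Y{(i + 1) % n + 1}") for i in range(n)}

for name, pair in arrays.items():
    print(name, pair)
for area, pair in areas.items():
    print("area", area, pair)
```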

As shown in FIG. 6 above, area sound pickup has the property that the sound collection area widens with distance from the microphone array. Consequently, the distribution of area 1 formed by BF outputs Y_1-Y_2 (microphone arrays MA1-MA2), area 2 formed by BF outputs Y_2-Y_3 (microphone arrays MA2-MA3), and area 3 formed by BF outputs Y_3-Y_1 (microphone arrays MA3-MA1) looks like the image in FIG. 18. In the third embodiment, the area sound pickup components (area sound pickup outputs) of areas 1, 2, and 3 are denoted Z_1, Z_2, and Z_3, respectively.

Using the area sound pickup outputs Z_1, Z_2, and Z_3, the partial area component calculation unit 126B calculates the portion where all three areas overlap, the portions where two areas overlap, and the independent portions with no overlap. The second embodiment considered the overlap of two sound collection areas; the third embodiment has three, so the overlap patterns are more complex than in the second embodiment. Each portion of the overlapping sound collection areas can nevertheless be calculated with the same method as in the second embodiment, by decomposing the problem into combinations of two area sound pickup components that are already known.

Specifically, when calculating the area sound pickup component of each portion, the partial area component calculation unit 126B decomposes the problem into patterns of two sound collection areas each (areas 1 and 2, areas 2 and 3, and areas 3 and 1; hereinafter "combination patterns"), which makes the method of the second embodiment applicable. That is, the processing by which the partial area component calculation unit 126B separates the area sound pickup components of two sound collection areas that share an overlap area into the overlap portion and the independent portions is the same as in the second embodiment.

In the following, the combination pattern of areas 1 and 2 is called the "first combination pattern", that of areas 2 and 3 the "second combination pattern", and that of areas 3 and 1 the "third combination pattern".

FIG. 19 is an explanatory diagram (conceptual image) showing how the three areas 1 to 3 are decomposed into combination patterns of two sound collection areas (the first to third combination patterns).

FIG. 19(a) shows the three areas 1 to 3 overlaid. FIGS. 19(b) to 19(d) are explanatory diagrams showing the decomposition into the first to third combination patterns, respectively.

First, of the three combination patterns shown in FIGS. 19(b) to 19(d), the first combination pattern (areas 1 and 2) shown in FIG. 19(b) is described as a representative example.

By spectrally subtracting the area sound pickup output Z_2 from the area sound pickup output Z_1, the partial area component calculation unit 126B obtains the area sound pickup component (called "V_A" in the third embodiment) of the portion of area 1 that is independent of area 2 (called "area A" in this embodiment; see FIG. 19(b)). By spectrally subtracting Z_1 from Z_2, it obtains the area sound pickup component (called "V_B" in the third embodiment) of the portion of area 2 that is independent of area 1 (called "area B" in this embodiment; see FIG. 19(b)). As in the second embodiment, the partial area component calculation unit 126B can obtain the area sound pickup components V_A and V_B with equations (21) and (22) above.

For the second combination pattern (areas 2 and 3), the partial area component calculation unit 126B can likewise obtain the area sound pickup component (called "V_C" in the third embodiment) of the portion of area 2 that is independent of area 3 (called "area C" in this embodiment; see FIG. 19(c)) and the area sound pickup component (called "V_D" in the third embodiment) of the portion of area 3 that is independent of area 2 (called "area D" in the third embodiment; see FIG. 19(c)). For the third combination pattern (areas 3 and 1), it can likewise obtain the area sound pickup component (called "V_E" in the third embodiment) of the portion of area 3 that is independent of area 1 (called "area E" in this embodiment; see FIG. 19(d)) and the area sound pickup component (called "V_F" in the third embodiment) of the portion of area 1 that is independent of area 3 (called "area F" in this embodiment; see FIG. 19(d)).

The partial area component calculation unit 126B can obtain the area sound pickup components V_C, V_D, V_E, and V_F with equations (31) to (34) below.
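The six pairwise independent components can be computed with the same clipped subtraction applied to each combination pattern. The sketch below is an assumption-laden illustration, not a reproduction of equations (31) to (34): `spectral_subtract`, the seven-bin spectra, and the one-source-per-bin layout are hypothetical.

```python
def spectral_subtract(minuend, subtrahend):
    """Per-bin subtraction of power values; negative results are clipped to 0."""
    return [max(m - s, 0.0) for m, s in zip(minuend, subtrahend)]

# Hypothetical toy spectra, one source per bin:
# bin 0: area 1 only   bin 1: area 2 only   bin 2: area 3 only
# bin 3: areas 1,2     bin 4: areas 2,3     bin 5: areas 3,1     bin 6: all three
Z1 = [5.0, 0.0, 0.0, 6.0, 0.0, 7.0, 8.0]
Z2 = [0.0, 4.0, 0.0, 6.0, 2.0, 0.0, 8.0]
Z3 = [0.0, 0.0, 3.0, 0.0, 2.0, 7.0, 8.0]

V_A = spectral_subtract(Z1, Z2)   # area 1 outside area 2
V_B = spectral_subtract(Z2, Z1)   # area 2 outside area 1
V_C = spectral_subtract(Z2, Z3)   # area 2 outside area 3
V_D = spectral_subtract(Z3, Z2)   # area 3 outside area 2
V_E = spectral_subtract(Z3, Z1)   # area 3 outside area 1
V_F = spectral_subtract(Z1, Z3)   # area 1 outside area 3

for name, v in zip("ABCDEF", (V_A, V_B, V_C, V_D, V_E, V_F)):
    print(f"V_{name}:", v)
```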

Once the area sound pickup component V_A of area A or the area sound pickup component V_B of area B is known, the partial area component calculation unit 126B can obtain the area sound pickup component (called "V_1∧2" in this embodiment) of the portion where areas 1 and 2 overlap (called "overlap area 1∧2" in this embodiment; see FIG. 19(b)). Specifically, as shown in equation (35) below, the partial area component calculation unit 126B can obtain the area sound pickup component of area 1∧2 by spectrally subtracting V_A of area A from the area sound pickup output Z_1. As shown in equation (36) below, it can also obtain the area sound pickup component of area 1∧2 by spectrally subtracting V_B of area B from the area sound pickup output Z_2.

Similarly, as shown in equations (37) and (38) below, the partial area component calculation unit 126B can obtain, based on the area sound pickup component V_C of area C or the area sound pickup component V_D of area D, the area sound pickup component (called "V_2∧3" in this embodiment) of the area where areas 2 and 3 overlap (called "overlap area 2∧3" in this embodiment; see FIG. 19(c)). As shown in equations (39) and (40) below, it can also obtain, based on the area sound pickup component V_E of area E or the area sound pickup component V_F of area F, the area sound pickup component (called "V_3∧1" in this embodiment) of the area where areas 3 and 1 overlap (called "overlap area 3∧1" in this embodiment; see FIG. 19(d)).
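The pairwise overlap components can be sketched in the same way, removing from each area its independent part for the pairing in question. As before, this is only a sketch under assumptions: the helper, the toy spectra, and the one-source-per-bin layout are hypothetical, and the code does not claim to match equations (35) to (40) term for term.

```python
def spectral_subtract(minuend, subtrahend):
    """Per-bin subtraction of power values; negative results are clipped to 0."""
    return [max(m - s, 0.0) for m, s in zip(minuend, subtrahend)]

# Hypothetical toy spectra (bins: 1 only, 2 only, 3 only, 1&2, 2&3, 3&1, all three).
Z1 = [5.0, 0.0, 0.0, 6.0, 0.0, 7.0, 8.0]
Z2 = [0.0, 4.0, 0.0, 6.0, 2.0, 0.0, 8.0]
Z3 = [0.0, 0.0, 3.0, 0.0, 2.0, 7.0, 8.0]

def overlap(z_first, z_second):
    """Overlap of two areas: the first area minus its part outside the second."""
    independent = spectral_subtract(z_first, z_second)
    return spectral_subtract(z_first, independent)

V_12 = overlap(Z1, Z2)   # Z_1 minus V_A
V_23 = overlap(Z2, Z3)   # Z_2 minus V_C
V_31 = overlap(Z3, Z1)   # Z_3 minus V_E

print(V_12, V_23, V_31, sep="\n")
```

Swapping the arguments (for example `overlap(Z2, Z1)`, which corresponds to Z_2 minus V_B) gives the same result on this toy data, mirroring the "either component" wording of the text.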

Once the partial area component calculation unit 126B has calculated all the area sound pickup components of the independent portions (V_A, V_B, V_C, V_D, V_E, V_F) and of the overlapping portions (V_1∧2, V_2∧3, V_3∧1) that arise from each pair among areas 1, 2, and 3, it can use them to calculate the area sound pickup component of each partial area when the three areas are overlaid simultaneously.

FIGS. 20 to 23 are explanatory diagrams showing the partial areas when areas 1, 2, and 3 are overlaid simultaneously. For example, because the area sound pickup component V_A of area A and the area sound pickup component V_D of area D are known from the calculations above, the partial area component calculation unit 126B can, as shown in equation (41) below, spectrally subtract V_D of area D from V_A of area A by the same calculation as before to obtain the area sound pickup component (hereinafter "V_Ad") of the independent portion of area A (hereinafter "area Ad"; see FIG. 20). Likewise, as shown in equation (42) below, it can spectrally subtract V_A of area A from V_D of area D by the same calculation to obtain the area sound pickup component (hereinafter "V_Da") of the independent portion of area D (hereinafter "area Da"; see FIG. 20).
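A short sketch of this second-level subtraction follows, again with a hypothetical helper and toy spectra in which each bin holds a single source. In that idealized layout, area Ad comes out as the part belonging to area 1 alone and area Da as the part belonging to area 3 alone.

```python
def spectral_subtract(minuend, subtrahend):
    """Per-bin subtraction of power values; negative results are clipped to 0."""
    return [max(m - s, 0.0) for m, s in zip(minuend, subtrahend)]

# Hypothetical toy spectra (bins: 1 only, 2 only, 3 only, 1&2, 2&3, 3&1, all three).
Z1 = [5.0, 0.0, 0.0, 6.0, 0.0, 7.0, 8.0]
Z2 = [0.0, 4.0, 0.0, 6.0, 2.0, 0.0, 8.0]
Z3 = [0.0, 0.0, 3.0, 0.0, 2.0, 7.0, 8.0]

V_A = spectral_subtract(Z1, Z2)     # area A: area 1 outside area 2
V_D = spectral_subtract(Z3, Z2)     # area D: area 3 outside area 2

V_Ad = spectral_subtract(V_A, V_D)  # area Ad: part of A outside D
V_Da = spectral_subtract(V_D, V_A)  # area Da: part of D outside A

print("V_Ad:", V_Ad)
print("V_Da:", V_Da)
```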

Once the area sound pickup component V_Ad of area Ad or the area sound pickup component V_Da of area Da has been obtained, the partial area component calculation unit 126B can calculate the area sound pickup component (hereinafter "V_A∧D") of the portion where area A and area D overlap (hereinafter "area A∧D"; see FIG. 20).

Specifically, as shown in equation (43) below, the partial area component calculation unit 126B can calculate the area sound pickup component V_A∧D of area A∧D by spectrally subtracting the area sound pickup component V_Ad of area Ad from the area sound pickup output V_D of area D, using the same calculation method as before. As shown in equation (44) below, it can also calculate V_A∧D by spectrally subtracting the area sound pickup component V_Da of area Da from the area sound pickup output V_A of area A.

Similarly, as shown in equations (45) and (47) below, the partial area component calculation unit 126B can obtain the area sound pickup component (hereinafter "V_Be") of the independent portion of area B (hereinafter "area Be"; see FIG. 21) and the area sound pickup component (hereinafter "V_Eb") of the independent portion of area E (hereinafter "area Eb"; see FIG. 21). Then, as shown in equations (46) and (48) below, it can calculate, based on V_Be of area Be or V_Eb of area Eb, the area sound pickup component (hereinafter "V_B∧E") of the portion where area B and area E overlap (hereinafter "area B∧E"; see FIG. 21).

In addition, as shown in equations (49) and (51) below, the partial area component calculation unit 126B can obtain the area sound pickup component (hereinafter "V_Cf") of the independent part of area C (hereinafter "area Cf"; see FIG. 22) and the area sound pickup component (hereinafter "V_Fc") of the independent part of area F (hereinafter "area Fc"; see FIG. 22). Then, as shown in equations (50) and (52) below, the partial area component calculation unit 126B can calculate, based on either V_Cf or V_Fc, the area sound pickup component (hereinafter "V_{C∧F}") of the overlapping part of area C and area F (hereinafter "area C∧F"; see FIG. 22).

Once the area sound pickup components V_{1∧2} and V_{C∧F} are known, the partial area component calculation unit 126B can acquire the area sound pickup component (hereinafter "V_{1∧2∧3}") of the area common to the three areas 1, 2, and 3 (hereinafter "area 1∧2∧3"; see FIG. 23). Specifically, as shown in equation (53) below, the partial area component calculation unit 126B obtains V_{1∧2∧3} by performing SS, with the same calculation method as before, to subtract V_{C∧F} from V_{1∧2}.

Similarly, once the area sound pickup components V_{2∧3} and V_{B∧E} are known, the partial area component calculation unit 126B can acquire V_{1∧2∧3} of area 1∧2∧3 by equation (54) below. Furthermore, once the area sound pickup components V_{3∧1} and V_{A∧D} are known, the partial area component calculation unit 126B can acquire V_{1∧2∧3} of area 1∧2∧3 by equation (55) below.
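The three alternative routes to V_{1∧2∧3} described for equations (53) to (55) can be sketched as follows. As before, this assumes SS is a plain bin-by-bin subtraction floored at zero, which may be simpler than the calculation method actually used. The sample spectra are hypothetical and were chosen so that the three routes agree exactly; with measured signals they would generally differ slightly.

```python
def spectral_subtract(minuend, subtrahend, floor=0.0):
    """Bin-by-bin subtraction of amplitude spectra, clipped at `floor` (assumed form of SS)."""
    return [max(m - s, floor) for m, s in zip(minuend, subtrahend)]


# Hypothetical two-bin amplitude spectra of the pairwise overlaps and of the
# parts of those overlaps that lie outside the third area.
V_12, V_CF = [1.5, 2.5], [0.5, 0.5]
V_23, V_BE = [2.0, 2.0], [1.0, 0.0]
V_31, V_AD = [1.25, 3.0], [0.25, 1.0]

via_53 = spectral_subtract(V_12, V_CF)  # route described for equation (53)
via_54 = spectral_subtract(V_23, V_BE)  # route described for equation (54)
via_55 = spectral_subtract(V_31, V_AD)  # route described for equation (55)
print(via_53, via_54, via_55)
```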

Through the above processing, the partial area component calculation unit 126B can, from the area sound pickup of three overlapping areas, separate and extract the area sound pickup component of the part where all three areas overlap (V_{1∧2∧3}), the area sound pickup components of the parts where two of the three areas overlap (V_{A∧D}, V_{B∧E}, V_{C∧F}), and the area sound pickup components of the independent parts that overlap no other area (V_Ad, V_Da, V_Be, V_Eb, V_Cf, V_Fc).

From the area sound pickup components of the areas decomposed in this way, the partial area selection unit 127B selects (acquires), as the sound pickup result W, the area sound pickup component of the area estimated to contain the most target sound. Hereinafter, each area divided from the entire area (the whole region covered by areas 1, 2, and 3) is called a "partial area". For example, the partial area selection unit 127B may set, as partial areas, some or all of areas 1, 2, 3, A, B, C, D, E, F, 1∧2, 2∧3, 3∧1, 1∧2∧3, Ad, Da, A∧D, Be, Eb, B∧E, Cf, Fc, and C∧F.

The method by which the partial area selection unit 127B selects one of the partial areas (area sound pickup components) is not limited. Hereinafter, the partial area selected by the partial area selection unit 127B is also called the "selected area".

For example, the partial area selection unit 127B may select the partial area with the largest power (for example, the area whose average power spectrum, obtained by averaging the frequency components that make up its area sound pickup component, is the largest). In doing so, the partial area selection unit 127B may select a partial area based on an evaluation value obtained by normalizing the power of each partial area by its area. For example, the partial area selection unit 127B may calculate, for each partial area, an evaluation value that rates a smaller area higher than a larger one at the same power, and select the partial area with the highest evaluation value.
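The power-based selection, with optional normalization by area, can be sketched as follows. The sketch assumes that "power" means the mean of the squared amplitudes over the frequency bins and that normalization is a simple division by area size; the text fixes neither detail. The names `mean_power` and `select_partial_area`, the area sizes, and the spectra are hypothetical, and "&" in the keys stands for ∧.

```python
def mean_power(spectrum):
    """Average power over frequency bins (assumed: mean of squared amplitudes)."""
    return sum(a * a for a in spectrum) / len(spectrum)


def select_partial_area(components, areas=None):
    """Return the name of the partial area with the highest evaluation value.

    components: dict mapping partial area name -> amplitude spectrum
    areas:      optional dict mapping name -> area size; when given, power is
                divided by area so that smaller areas score higher at equal power
    """
    def score(name):
        power = mean_power(components[name])
        return power / areas[name] if areas else power

    return max(components, key=score)


# Hypothetical components and area sizes.
components = {"1&2&3": [2.0, 2.0], "Ad": [3.0, 1.0], "Be": [1.0, 1.0]}
areas = {"1&2&3": 1.0, "Ad": 4.0, "Be": 2.0}

print(select_partial_area(components))         # largest raw power
print(select_partial_area(components, areas))  # largest power per unit area
```

With these sample values the raw powers are 4.0, 5.0, and 1.0, so the unnormalized choice is the large area Ad, while normalization by area favors the small triple-overlap area.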

When selecting a partial area, the partial area selection unit 127B need not treat every partial area as a candidate. For example, the partial area selection unit 127B may exclude from the candidates any partial area whose region (area) is small compared with the other partial areas (for example, a partial area with no more than one third of the area of the largest partial area) and select from the remaining partial areas.
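The exclusion of small partial areas can be sketched as a filter applied before selection. The one-third threshold is the example given in the text; the function name `eligible_areas` and the area sizes are hypothetical.

```python
def eligible_areas(areas, denominator=3):
    """Return the names of partial areas larger than 1/denominator of the largest one.

    Areas whose size is at most largest/denominator are excluded.
    The comparison is written as a multiplication to avoid dividing floats.
    """
    largest = max(areas.values())
    return {name for name, size in areas.items() if size * denominator > largest}


# Hypothetical area sizes; "&" in the keys stands for ∧.
areas = {"1&2&3": 1.0, "A&D": 2.0, "Ad": 6.0, "Be": 2.5}
candidates = eligible_areas(areas)
print(sorted(candidates))
```

Here the largest area is 6.0, so areas of size 2.0 or less are dropped and only Ad and Be remain as candidates.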

Furthermore, the partial area selection unit 127B may combine the area selected by the above method with the areas adjacent to it (areas sharing a boundary with it) and select the combination as a single partial area. For example, the partial area selection unit 127B may select the area with the largest power together with the areas adjacent to that area.
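Selecting the strongest area together with its neighbors can be sketched as follows. The adjacency table is hypothetical, since the actual layout is defined by the figures, and summing the components of the merged areas is an assumption; the text only says that the areas are integrated into one partial area.

```python
def mean_power(spectrum):
    """Average power over frequency bins (assumed: mean of squared amplitudes)."""
    return sum(a * a for a in spectrum) / len(spectrum)


def select_with_neighbors(components, adjacency):
    """Pick the highest-power area, then merge it with its adjacent areas.

    Returns the list of merged area names and the bin-wise sum of their spectra.
    """
    best = max(components, key=lambda name: mean_power(components[name]))
    merged = [best] + sorted(adjacency.get(best, ()))
    total = [sum(bins) for bins in zip(*(components[name] for name in merged))]
    return merged, total


# Hypothetical components and adjacency; "&" in the keys stands for ∧.
components = {
    "1&2&3": [3.0, 1.0],
    "A&D": [1.0, 1.0],
    "B&E": [0.5, 0.0],
    "Ad": [0.0, 2.0],
}
adjacency = {"1&2&3": ["A&D", "B&E"]}

merged, total = select_with_neighbors(components, adjacency)
print(merged, total)
```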

Furthermore, the partial areas that the partial area selection unit 127B treats as candidates need not be independent of one another. For example, the candidates may include partial areas that overlap each other, such as area D and area A shown in FIG. 20. In other words, it suffices that the partial area selected by the partial area selection unit 127B is smaller than the whole (the entire region covered by areas 1, 2, and 3) and does not include areas unnecessary for extracting the target sound.

The partial area selection unit 127B then acquires the area sound pickup component of the partial area selected by the above processing as the final sound pickup result W(n). The method of selecting a partial area (the area sound pickup component of a partial area) described above for the partial area selection unit 127B may also be applied to the partial area selection unit 127A of the second embodiment.

As described above, the sound pickup unit 120B outputs the area sound pickup component selected by the partial area selection unit 127B as the final output W(n).

The communication unit 230 of the communication device 200 then supplies the audio data received from the communication device 100B (audio data based on W(n)) to the output unit 140. The output unit 140 supplies an acoustic signal based on the received audio data to the speaker 210, which outputs it as sound toward the second user U2.

(C-3) Effects of the third embodiment
According to the third embodiment, the following effects can be obtained.

In the communication device 100B (sound pickup unit 120B) of the third embodiment, for three overlapping sound pickup areas, the component of each part is calculated: the independent areas that overlap no other area, the areas where two areas overlap, and the area where all three areas overlap. The communication device 100B (sound pickup unit 120B) then selects, from the divided areas, the area most suitable as the target sound pickup area and outputs the area sound pickup component of the selected area as the final sound pickup result. As a result, while the range of the entire area given by the multiple area sound pickup results is covered, the selected area is far smaller than the entire area, so the emphasized speech can be extracted from the pinpoint area containing the target sound without including unnecessary area components.

(D) Other embodiments
The present invention is not limited to the above embodiments; modified embodiments such as those exemplified below are also possible.

(D-1) In each of the above embodiments, the sound pickup unit was described as forming part of the communication device, but it may be configured as an independent device. Also, in each of the above embodiments, the sound pickup unit was described as not including the microphone array unit, but the sound pickup unit and the microphone array unit may be configured as a single integrated device.

(D-2) In each of the above embodiments, examples were described in which the sound pickup device (sound pickup unit) of the present invention is applied to a device having a handheld transmitter (transmitter/receiver) such as a handset. The sound pickup device of the present invention may also be applied to a headset or a wearable device (for example, a head-mounted display with a microphone or neckband headphones with a microphone). In that case, the region where the mouth of the first user U1 is located when the device is worn is set as the target area, microphones are placed at the vertices of a polygon (N-gon) surrounding it (the mouthpiece), and area sound pickup processing is performed in the same way as in the above embodiments.

(D-3) The first and third embodiments showed examples of area sound pickup using three microphones MC1 to MC3, but the number of microphones installed in the microphone array unit 111 (the number of sides (corners) of the polygon on which the microphones are arranged) is not limited. For example, area sound pickup may be performed from three or four directions. For example, in the first and third embodiments, four microphones may be arranged at the corner vertices of a quadrangle.

FIG. 24 is an explanatory diagram showing the configuration when the microphone array unit 111 has four microphones.

In FIG. 24, four microphones MC1 to MC4 are arranged at the corner vertices of a quadrangle (square). Each of the four microphones MC1 to MC4 is paired with its neighbors, forming four microphone arrays: microphone array MA701 formed by the pair MC1 and MC2, microphone array MA702 formed by the pair MC2 and MC3, microphone array MA703 formed by the pair MC3 and MC4, and microphone array MA704 formed by the pair MC4 and MC1. Combining each of these microphone arrays with an adjacent microphone array (a combination of microphone arrays that share a microphone) then makes four area sound pickups possible. For example, when the four-microphone configuration MC1 to MC4 is applied to the microphone array unit 111, the sound pickup unit 120B can acquire four area sound pickup outputs: one from the combination of microphone arrays MA701 and MA702, one from MA702 and MA703, one from MA703 and MA704, and one from MA704 and MA701. The sound pickup unit 120B can then acquire a sound pickup result based on these four area sound pickup outputs.
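The pairing scheme described for FIG. 24 can be sketched as two rounds of "pair each element with its neighbor around the ring": once over the microphones to form the arrays, and once over the arrays to form the area sound pickup combinations. The helper name `ring_pairs` is hypothetical; the microphone and array labels are those used in the text.

```python
def ring_pairs(items):
    """Pair each item with the next one, wrapping around from the last to the first."""
    n = len(items)
    return [(items[i], items[(i + 1) % n]) for i in range(n)]


mics = ["MC1", "MC2", "MC3", "MC4"]
array_names = ["MA701", "MA702", "MA703", "MA704"]

# Each microphone array is a pair of neighboring microphones.
arrays = dict(zip(array_names, ring_pairs(mics)))

# Each area sound pickup combines two neighboring microphone arrays.
area_pickups = ring_pairs(array_names)

# Neighboring arrays share exactly one microphone.
shared = [set(arrays[a]) & set(arrays[b]) for a, b in area_pickups]

for (a, b), s in zip(area_pickups, shared):
    print(a, b, "share", sorted(s))
```

The same helper applies to any N-gon arrangement: N microphones give N arrays and N area sound pickup combinations.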

(D-4) The sound pickup unit 120B (partial area selection unit 127B) of the third embodiment selects, from a plurality of areas, the one area containing the most target sound, but it may instead select a plurality of areas. In this case, the sound pickup unit 120B (partial area selection unit 127B) may integrate (add) the area sound pickup components of the selected areas and acquire the sum as the sound pickup result W. In this case, however, the candidate areas must be arranged in advance so that the selected areas have no common part (overlapping part) with one another.
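Combining several selected areas while enforcing the no-overlap condition can be sketched as follows. The sketch assumes each candidate area is described by the set of elementary regions it covers, which is a hypothetical representation not given in the text; the particular geometry in `cells` is likewise only illustrative. "&" in the keys stands for ∧.

```python
def combine_selected(selected, components, cells):
    """Add the spectra of the selected areas, rejecting selections that overlap.

    selected:   list of area names to combine
    components: dict mapping area name -> amplitude spectrum
    cells:      dict mapping area name -> set of elementary regions it covers
    """
    covered = set()
    for name in selected:
        if covered & cells[name]:
            raise ValueError(f"area {name!r} overlaps an already selected area")
        covered |= cells[name]
    return [sum(bins) for bins in zip(*(components[name] for name in selected))]


# Hypothetical spectra and geometry.
components = {"Ad": [1.0, 0.0], "A&D": [0.5, 2.0], "D": [1.5, 2.0]}
cells = {"Ad": {"Ad"}, "A&D": {"A&D"}, "D": {"Ad", "A&D"}}

W = combine_selected(["Ad", "A&D"], components, cells)
print(W)
```

Without the check, selecting both D and A∧D would count the overlapping region twice in the sum.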

100, 100A, 100B: communication device; 110: handset; 111: microphone array unit; MC1 to MC6: microphones; 112: speaker; 113: mouthpiece; 114: earpiece; 115: handle; 120, 120A, 120B: sound pickup unit; 121: signal input unit; 122: frequency conversion unit; 123: directivity forming unit; 124, 124A, 124B: target area sound extraction unit; 125: area sound addition unit; 126, 126B: partial area component calculation unit; 127, 127B: partial area selection unit; 130: communication unit; 140: output unit; 200: communication device; 210: speaker; 220: microphone; 230: communication unit; 240: output unit; 250: sound pickup unit; U1: first user; U1a: listener's hand; U2: second user; P: communication path.

Claims

A sound pickup device comprising:
area sound pickup means for acquiring area sound pickup components of a plurality of sound pickup areas, based on two or more patterns of combinations of microphone arrays, from input signals from a microphone array unit capable of forming a plurality of microphone arrays having different directivities;
partial area component calculation means for acquiring, based on the area sound pickup components of the sound pickup areas of the respective patterns acquired by the area sound pickup means, the area sound pickup component of each partial area divided from the entire area covering all of the sound pickup areas acquired by the area sound pickup means, the partial areas being partial areas in which two or more of the sound pickup areas overlap and partial areas in which the sound pickup areas do not overlap; and
partial area selection means for selecting the area sound pickup components of one or more partial areas from the area sound pickup components of the partial areas calculated by the partial area component calculation means, and acquiring a sound pickup result based on the selected area sound pickup components.

The sound pickup device according to claim 1, wherein the partial area selection means selects the area sound pickup components of one or more partial areas based on a result of comparing the power of the area sound pickup components of the respective partial areas.

The sound pickup device according to claim 2, wherein the partial area selection means selects the area sound pickup component having the highest power from the area sound pickup components of the partial areas calculated by the partial area component calculation means.

The sound pickup device according to claim 1, wherein, when the area sound pickup components of a plurality of partial areas are selected, the partial area selection means acquires, as the sound pickup result, the result of adding the selected area sound pickup components.

A sound pickup program causing a computer to function as:
area sound pickup means for acquiring area sound pickup components of a plurality of sound pickup areas, based on two or more patterns of combinations of microphone arrays, from input signals from a microphone array unit capable of forming a plurality of microphone arrays having different directivities;
partial area component calculation means for acquiring, based on the area sound pickup components of the sound pickup areas of the respective patterns acquired by the area sound pickup means, the area sound pickup component of each partial area divided from the entire area covering all of the sound pickup areas acquired by the area sound pickup means, the partial areas being partial areas in which two or more of the sound pickup areas overlap and partial areas in which the sound pickup areas do not overlap; and
partial area selection means for selecting the area sound pickup components of one or more partial areas from the area sound pickup components of the partial areas calculated by the partial area component calculation means, and acquiring a sound pickup result based on the selected area sound pickup components.

A sound pickup method performed by a sound pickup device, wherein
the sound pickup device comprises area sound pickup means, partial area component calculation means, and partial area selection means;
the area sound pickup means acquires area sound pickup components of a plurality of sound pickup areas, based on two or more patterns of combinations of microphone arrays, from input signals from a microphone array unit capable of forming a plurality of microphone arrays having different directivities;
the partial area component calculation means acquires, based on the area sound pickup components of the sound pickup areas of the respective patterns acquired by the area sound pickup means, the area sound pickup component of each partial area divided from the entire area covering all of the sound pickup areas acquired by the area sound pickup means, the partial areas being partial areas in which two or more of the sound pickup areas overlap and partial areas in which the sound pickup areas do not overlap; and
the partial area selection means selects the area sound pickup components of one or more partial areas from the area sound pickup components of the partial areas calculated by the partial area component calculation means, and acquires a sound pickup result based on the selected area sound pickup components.