JP5052241B2 - On-vehicle voice processing apparatus, voice processing system, and voice processing method

Info

Publication number: JP5052241B2
Application number: JP2007188207A
Authority: JP
Inventors: 久高橋; 明雄天野; 真人戸上; 寿一高橋
Original assignee: Clarion Co Ltd
Current assignee: Faurecia Clarion Electronics Co Ltd
Priority date: 2007-07-19
Filing date: 2007-07-19
Publication date: 2012-10-17
Anticipated expiration: 2027-07-19
Also published as: JP2009023486A

Description

The present invention relates to voice processing technology for an on-vehicle voice processing apparatus.

Systems exist that allow occupants of multiple vehicles to talk to one another over a telephone line or the like. With a hands-free telephone, for example, conversation between vehicles is easy.

JP 2007-13838 A

However, merely using a conventional telephone function to play the voice from another vehicle through a speaker lacks a sense of presence. In particular, because each vehicle interior forms its own private space, simply outputting the voice from another vehicle makes it hard to feel that the other party is really there.

An object of the present invention is to enable conversation with a greater sense of presence in communication between vehicles.

To solve the above problem, the present invention determines the virtual sound source position of the speaker's voice in the listener's sound field according to the position of the vehicle and the seat position of the speaker.

For example, a first aspect of the present invention is an on-vehicle voice processing apparatus that is mounted on a first vehicle and can communicate with an on-vehicle device of a second vehicle, comprising: vehicle position acquisition means for acquiring the position of the second vehicle; voice acquisition means for acquiring voice captured in the second vehicle; voice output means for processing the voice captured in the second vehicle so that, in a sound field formed by a speaker array comprising a plurality of speakers, a virtual sound source of that voice is formed in the direction of the position of the second vehicle, and for outputting the processed voice through the speaker array; voice separation means for separating the voice captured in the second vehicle by sound source; and voice extraction means for extracting, from the voices separated by the voice separation means, the voice whose sound source position corresponds to a predetermined seat position in the vehicle, wherein the voice output means outputs the voice extracted by the voice extraction means to the speaker array.

A second aspect of the present invention is an on-vehicle voice processing apparatus that is mounted on a first vehicle and can communicate with an on-vehicle device of a second vehicle, comprising: voice acquisition means for acquiring voice that was captured in the second vehicle, separated by sound source position, and associated with seat positions; and voice output means for processing the voice for each seat position so that, in a sound field formed by a speaker array comprising a plurality of speakers, a virtual sound source of the voice associated with a seat position is formed at the seat position of the first vehicle that corresponds to that seat position, and for outputting the processed voice through the speaker array.

A third aspect of the present invention is a voice processing system capable of transmitting and receiving voice between a first vehicle and a second vehicle, comprising: voice separation means for separating voice captured in the first vehicle by sound source; means for obtaining, from the voice for each sound source position produced by the voice separation means, voice associated with seat positions; and voice output means for processing the voice for each seat position so that, in a sound field formed by a speaker array comprising a plurality of speakers, a virtual sound source of the voice associated with a seat position is formed at the seat position of the second vehicle that corresponds to that seat position, and for outputting the processed voice through the speaker array.

An embodiment of the present invention is described below with reference to the drawings.

<First Embodiment>
FIG. 1 is a schematic configuration diagram of an on-vehicle voice processing apparatus 100 to which an embodiment of the present invention is applied. In this embodiment, conversation between vehicles takes place among a plurality of voice processing apparatuses 100, one mounted on each vehicle. The number of apparatuses is not limited to two; three or more may transmit and receive voice at the same time. In that case, several people can talk simultaneously, as in a conference.

The voice processing apparatus 100 of this embodiment is integrated with a navigation device. As illustrated, the voice processing apparatus 100 includes an arithmetic processing unit 1, a display 2, a storage device 3, a microphone array 4, a speaker array 5, a seating sensor 6, an input device 7, a wheel speed sensor 8, a gyro sensor 9, a GPS (Global Positioning System) receiver 10, and a communication device 15.

The arithmetic processing unit 1 is the central unit that performs various kinds of processing. For example, it detects the current position based on information output from the sensors 8 and 9 and the GPS receiver 10. It also transmits voice input from the microphone array 4 to the voice processing apparatus 100 of the other party, and outputs voice received from the other party's voice processing apparatus 100 through the speaker array 5. In addition, it recognizes speech and identifies the user's input from the recognized words.

The display 2 is a unit, such as a liquid crystal display, that displays graphics information generated by the arithmetic processing unit 1.

The storage device 3 consists of a storage medium such as a CD-ROM, DVD-ROM, HDD, or IC card. Map data and the like are stored on this medium.

The microphone array 4 is a voice input device that captures the voice uttered by a user. The microphone array 4 consists of a plurality of microphones installed at different positions. This makes it possible to separate voices by exploiting the differences in the captured sound (delay time, volume, and so on) between capture positions, and thereby to identify the sound source position of each voice.
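As a rough illustration of how a difference in arrival time between two microphones can indicate the direction of a sound source, the following sketch uses the far-field relation sin(theta) = c * tau / d. The function name, the microphone spacing, and the speed-of-sound constant are assumptions made for this example; the description itself leaves the separation method to known techniques.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def arrival_angle_deg(delay_s, mic_spacing_m):
    """Estimate the far-field arrival angle from the delay between two microphones.

    0 degrees means the source is broadside (equidistant from both microphones);
    +/-90 degrees means it lies on the line through the two microphones.
    """
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding and noisy delays
    return math.degrees(math.asin(ratio))
```

With more than two microphones, several such pairwise estimates could in principle be combined to narrow down a position rather than only a direction.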

The speaker array 5 is a voice output device that outputs the voice generated by the arithmetic processing unit 1. The speaker array 5 consists of a plurality of speakers installed at different positions. By outputting voice with a different delay time and volume from each speaker, it can localize a virtual sound source within the sound field for the listener. This gives the listener the sensation that the voice is coming from a position where no sound source actually exists. This embodiment uses that effect to improve the sense of presence in conversation.
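The idea of varying delay and volume per speaker can be sketched as follows. This is a minimal illustration under simple assumptions (point-source geometry, a delay proportional to distance, and a gain inversely proportional to distance), not the method the apparatus actually uses; the function name and the constant are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def speaker_delays_and_gains(speakers, source):
    """Return one (delay_seconds, gain) pair per speaker for a virtual source.

    speakers: list of (x, y) speaker positions in metres.
    source:   (x, y) position of the desired virtual sound source.
    The speaker nearest the virtual source plays first at full level;
    farther speakers play later and more quietly.
    """
    dists = [math.dist(s, source) for s in speakers]
    nearest = min(dists)
    result = []
    for d in dists:
        delay = (d - nearest) / SPEED_OF_SOUND
        gain = nearest / d if d > 0 else 1.0
        result.append((delay, gain))
    return result
```

For two headrest speakers at (-0.1, 0) and (0.1, 0) and a virtual source at (1.0, 0), the right speaker gets zero delay and unit gain, while the left speaker is delayed slightly and attenuated.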

The seating sensor 6 consists of a pressure sensor or the like and detects whether an occupant is seated in each seat of the vehicle.

FIG. 2 is a top view illustrating the arrangement of the microphone array 4, the speaker array 5, and the seating sensors 6 in a vehicle 300. As illustrated, the microphone array 4 consists of a plurality of microphones (at least two, preferably three or four) installed near, for example, the dashboard, the rearview mirror, or the pillars.

A speaker array 5 is provided for each seat so that the occupant of each seat can hear the voice directed to that occupant. For example, the speaker array 5 is placed in the headrest of each seat, aligned with the positions of the occupant's left and right ears.

A seating sensor 6 is placed in the seat cushion of each seat 301 and detects from the occupant's weight whether the seat is occupied.

Returning to FIG. 1, the input device 7 is a unit that accepts instructions from the user. The input device 7 consists of hard switches such as a scroll key and a scale change key, a joystick, a touch panel mounted on the display, and the like.

The sensors 8 and 9 and the GPS receiver 10 are used to detect the current location (the position of the own vehicle).

The communication device 15 is a device for communicating with a voice processing apparatus 100 mounted on another vehicle. The communication device 15 may use a telephone line or the Internet, or it may communicate through a connected mobile phone.

FIG. 3 is a functional block diagram of the arithmetic processing unit 1.

As illustrated, the arithmetic processing unit 1 includes a user operation analysis unit 41, a current position calculation unit 42, a navigation processing unit 43, a voice acquisition unit 44, a voice separation unit 45, a sound field setting unit 46, a voice processing unit 47, a voice output unit 48, and a communication unit 49.

The user operation analysis unit 41 receives a user request entered through the input device 7, analyzes the request, and controls each unit of the arithmetic processing unit 1 so that the processing corresponding to the request is executed.

The current position calculation unit 42 obtains the current position from the outputs of the sensors 8 and 9 and the GPS receiver 10. For example, it uses the map data to determine the position of the own vehicle on the map by map matching.

The navigation processing unit 43 searches for a recommended route connecting two designated points (the current location and the destination), displays the recommended route on the display 2, and provides route guidance.

The voice acquisition unit 44 acquires the voice uttered by a speaker through the microphone array 4.

The voice separation unit 45 separates the voice captured by the microphone array 4, which is a mixture of sounds emitted from multiple sound sources, into the voice for each sound source position. The voice separation unit 45 stores in advance the position of each microphone in the microphone array 4 and identifies sound source positions relative to those microphone positions. Known voice separation methods can be used, so the method is not described in detail.

The sound field setting unit 46 determines the configuration of the sound field (the position of each virtual sound source, the voice it emits, its volume, and so on) that the output of the speaker array 5 creates for the listener.

The voice processing unit 47 processes the voice to be output to each speaker in the speaker array 5 so that the speaker array 5 reproduces the sound field determined by the sound field setting unit 46. The voice processing unit 47 stores in advance the position of each speaker in the speaker array 5 and processes the voice so that the sound field is formed relative to those speaker positions. Known methods can be used to form a virtual sound source in a sound field, so the method is not described in detail.

The voice output unit 48 sends the voice processed by the voice processing unit 47 for each speaker to the speaker array 5 and performs voice output.

The communication unit 49 mediates the transmission and reception of information to and from other voice processing apparatuses 100 through the communication device 15.

As shown in FIG. 1, the arithmetic processing unit 1 is implemented as a computer system comprising a CPU (Central Processing Unit) 11 that performs numerical computation and controls each device, a RAM (Random Access Memory) 12 that stores map data read from the storage device 3, computation data, and the like, a ROM (Read Only Memory) 13 that stores programs and data, and an interface 14 that exchanges information with external devices. Each functional unit of the arithmetic processing unit 1 described above is realized by the CPU executing a program loaded into memory. The voice processing may instead be performed by a dedicated DSP (Digital Signal Processor).

[Description of Operation] Next, the operation from speaking to listening, carried out among a plurality of voice processing apparatuses 100 configured as above, is described.

FIG. 4 is a flowchart of this processing.

Here, the operation is described separately for the voice processing apparatus 100 on the speaking side and the voice processing apparatus 100 on the listening side. Of course, a given voice processing apparatus 100 acts as the speaking side or the listening side depending on which vehicle's occupant is speaking.

This flow starts when the plurality of voice processing apparatuses 100 are in conversation mode.

First, the voice acquisition unit 44 of the speaking-side voice processing apparatus 100 acquires the speaker's voice through the microphone array 4 (S11). Next, the communication unit 49 of the speaking-side voice processing apparatus 100 transmits the acquired voice, together with the current position calculated by the current position calculation unit 42, to the listening-side voice processing apparatus 100 (S12).

In response, the communication unit 49 of the listening-side voice processing apparatus 100 acquires the speaker's voice and the current position of the speaker's vehicle, both transmitted from the speaking-side voice processing apparatus 100 (S13).

Next, the sound field setting unit 46 of the listening-side voice processing apparatus 100 sets the configuration of the sound field (S14). FIG. 5 is a diagram for explaining the configuration of the sound field. The sound field setting unit 46 sets the configuration of the sound field 510 based on the relationship 500 between the current position of the own vehicle (the listener's vehicle) and the current position of the other vehicle (the speaker's vehicle). Specifically, it takes the position of the own vehicle 501 as the center 511 of the sound field 510 and places the virtual sound source 512 (or 513) of the speaker's voice at the position corresponding to the current position of the speaker's vehicle 502 (or 503), thereby configuring the sound field 510. The volume of the speaker's voice is set lower as the distance L1 (or L2) between the vehicles increases.

That is, as in the example of FIG. 5, if the speaker's vehicle 502 is diagonally ahead and to the right of the own vehicle 501, the virtual sound source 512 of the voice of the occupant of vehicle 502 is set diagonally ahead and to the right of the center 511 of the sound field (the listener's own position). Likewise, if the speaker's vehicle 503 is diagonally behind and to the left, the virtual sound source 513 of the voice of the occupant of vehicle 503 is set diagonally behind and to the left of the center 511. Furthermore, because the volume is set lower for voices from vehicles farther from the own vehicle, in the example of FIG. 5 the voice from vehicle 502, diagonally ahead and to the right, is set quieter than the voice from the vehicle diagonally behind and to the left.
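The placement rule described above can be sketched as a small geometric calculation. The sketch assumes that both vehicle positions have already been converted to a local planar frame in metres (x east, y north) and that the own vehicle's heading is known; the function names, the 10 m reference distance, and the inverse-distance gain law are illustrative assumptions, since the description only states that volume decreases with distance.

```python
import math


def relative_bearing_and_distance(own_xy, own_heading_deg, other_xy):
    """Return (bearing, distance) of the other vehicle as seen from the own vehicle.

    bearing is in degrees relative to the own vehicle's heading:
    positive to the right, negative to the left, within [-180, 180).
    """
    dx = other_xy[0] - own_xy[0]  # east offset in metres
    dy = other_xy[1] - own_xy[1]  # north offset in metres
    distance = math.hypot(dx, dy)
    absolute = math.degrees(math.atan2(dx, dy))  # clockwise from north
    relative = (absolute - own_heading_deg + 180.0) % 360.0 - 180.0
    return relative, distance


def distance_gain(distance_m, reference_m=10.0):
    """Gain of 1.0 up to the reference distance, then falling off as 1/distance."""
    return reference_m / max(distance_m, reference_m)
```

With the own vehicle at the origin heading north, a vehicle at (30, 30) is placed at +45 degrees (ahead right) and a vehicle at (-10, -10) at -135 degrees (behind left); the farther vehicle receives the smaller gain.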

The actual loudness of a speaker's voice is not necessarily constant. If it varies, adjusting the volume according to the distance between vehicles becomes less meaningful. The sound field setting unit 46 may therefore first normalize the voice received from the speaking side to a predetermined volume and then adjust it according to the distance between the vehicles.
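A minimal sketch of this normalize-then-attenuate step follows. It assumes that "volume" is measured as the RMS level of a block of samples and that the distance-dependent factor is supplied by the caller; both choices, and the function name, are assumptions for illustration.

```python
import math


def normalize_then_attenuate(samples, target_rms, distance_gain):
    """Scale samples to target_rms, then apply a distance-dependent gain.

    samples:       list of float audio samples for one utterance block.
    target_rms:    the predetermined level every voice is normalized to.
    distance_gain: factor in (0, 1] derived from the inter-vehicle distance.
    """
    if not samples:
        return []
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return list(samples)  # silence stays silence
    scale = target_rms / rms * distance_gain
    return [s * scale for s in samples]
```

After this step, a loud speaker and a quiet speaker at the same distance come out at the same level, so the remaining level difference reflects distance only.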

Next, the voice processing unit 47 generates the voice to be output to each speaker in the speaker array 5 by processing the original voice so that the sound field set by the sound field setting unit 46 is reproduced (S15).

The voice output unit 48 then outputs the voice processed by the voice processing unit 47 for each speaker to the speaker array 5, thereby performing voice output (S16). The listener can thus hear what the speaker said.

The operation from speaking to listening has been described above using the flow of FIG. 4. This flow is repeated each time any occupant speaks.

According to this embodiment, the sound field formed by the speaker array 5 localizes the sound source of the speaker's voice, for the listener, at the position of the speaker's vehicle. The listener therefore feels as if the speaker were talking from that direction, and conversation with a sense of presence is achieved.

The above embodiment can be modified in various ways.

For example, conversation can be restricted to occupants of specific seat positions, such as conversation between drivers only or between rear-seat occupants only.

FIG. 6 is a flowchart of the conversation processing in that case. The flow of FIG. 6 is basically the same as the processing shown in FIG. 4. However, on the assumption that the "speaker's voice" captured by the microphone array 4 contains the mixed voices of several occupants, a step S11' is inserted to extract only the voice of the occupant of a specific seat by voice separation. In outline, in S11' the voice separation unit 45 of the speaking-side voice processing apparatus 100 performs voice separation on the voice acquired in S11 and extracts only the voice from a specific seat position.

FIG. 7 is a conceptual diagram for explaining this processing.

First, using a voice separation method (that is, a method of separating mixed sound into the voice for each sound source position), the voice separation unit 45 separates the speaker's voice (mixed sound) 700 acquired in S11 into a voice 702 for each sound source position 701.

Meanwhile, the voice separation unit 45 holds a correspondence table 710 between seat positions 711 and spatial regions 712 in the vehicle. Referring to FIG. 2, the interior space of the vehicle 300 is divided into seat spaces A1, A2, A3, and A4 corresponding to the seat positions (driver position (front right), passenger seat position (front left), rear right position, and rear left position). That is, any position in the three-dimensional coordinates of the vehicle interior belongs to one of the seat spaces A1 to A4.

Using the correspondence table 710 between seat positions 711 and spatial regions 712, the voice separation unit 45 associates each voice 702, which is tied to a sound source position 701, with a seat position 711. It thereby obtains the voice 702 associated with each seat position 711.

The voice separation unit 45 then extracts only the voice 702 of a predetermined seat position (for example, the driver position). Which seat position's voice is extracted can be made selectable by the user.
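The table lookup and extraction described above can be sketched as follows. For brevity the sketch uses two-dimensional regions (x from left to right, y from rear to front, in arbitrary units) rather than the three-dimensional seat spaces of the description, and the region bounds, seat names, and function names are all hypothetical.

```python
# Hypothetical stand-in for correspondence table 710:
# seat position -> ((x_min, x_max), (y_min, y_max)) of its seat space.
SEAT_SPACES = {
    "driver": ((0.0, 1.0), (0.0, 1.0)),       # A1: front right
    "passenger": ((-1.0, 0.0), (0.0, 1.0)),   # A2: front left
    "rear_right": ((0.0, 1.0), (-1.0, 0.0)),  # A3: rear right
    "rear_left": ((-1.0, 0.0), (-1.0, 0.0)),  # A4: rear left
}


def seat_for_position(x, y):
    """Return the seat whose seat space contains (x, y), or None if outside the cabin."""
    for seat, ((x0, x1), (y0, y1)) in SEAT_SPACES.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return seat
    return None


def extract_seat_voice(separated, wanted_seat):
    """Keep only the voices whose source position falls in wanted_seat's space.

    separated: list of ((x, y), voice) pairs produced by source separation.
    """
    return [voice for (x, y), voice in separated
            if seat_for_position(x, y) == wanted_seat]
```

For example, given separated sources at (0.5, 0.5) and (-0.5, -0.5), `extract_seat_voice(..., "driver")` keeps only the first one.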

Then, in S12, the communication unit 49 transmits the voice extracted by the voice separation unit 45, together with the current position of the vehicle, to the listening-side voice processing apparatus 100. The subsequent processing is as described above.

In this way, only the voice of the occupant of a specific seat position is used for the conversation. For example, when conversation between drivers takes priority, such as when several vehicles are heading for the same destination, what a driver says is reliably conveyed to the drivers of the other vehicles even while other occupants are talking.

The voice separation step S11' may instead be performed by the listening-side voice processing apparatus 100. For example, as in the flow of FIG. 4, the speaking-side communication unit 49 transmits the speaker's voice (mixed sound) to the listening-side voice processing apparatus 100. After receiving it, the listening-side voice processing apparatus 100 performs voice separation, association with seat positions, and extraction of the voice of a specific seat position, as in S11' above. Only the extracted voice is then output.

In S12, when transmitting the speaker's voice, the communication unit 49 may also transmit the seat position associated with that voice. The listening-side voice processing apparatus 100 may then show the received seat position of the speaker on its display so that the listener can tell who is speaking. The voice output unit 48 of the listening-side voice processing apparatus 100 may also output the voice only from the speaker array 5 at the seat position corresponding to the received seat position of the speaker. In this way, only the occupants of the designated seat position take part in the conversation, and its content is not heard by occupants of other seat positions.

The output volume on the listening side can also be varied according to the speaker's seat position. For example, the communication unit 49 of the speaking-side voice processing apparatus 100 transmits a plurality of voices associated with a plurality of seat positions, together with the seat position information, to the listening-side voice processing apparatus 100. The voice output unit 48 of the listening-side voice processing apparatus 100 varies the volume of each voice to be output according to its associated seat position. For example, a priority order is defined, and the volume is set highest for the driver position, followed by the passenger seat position, the rear right position, and the rear left position. In this way, the occupants on the listening side preferentially hear the occupants who are most likely to say something important.
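The seat-priority rule can be sketched as a lookup of one gain per seat position. The specific gain values below are made up for illustration; the description only gives the ordering driver > passenger seat > rear right > rear left.

```python
# Hypothetical per-seat gains that follow the stated priority order.
SEAT_GAIN = {
    "driver": 1.0,
    "passenger": 0.8,
    "rear_right": 0.6,
    "rear_left": 0.4,
}


def apply_seat_gain(voices_by_seat):
    """Scale each seat's samples by the gain assigned to that seat.

    voices_by_seat: dict mapping a seat name to a list of float samples.
    """
    return {
        seat: [s * SEAT_GAIN[seat] for s in samples]
        for seat, samples in voices_by_seat.items()
    }
```

A user-selectable table could replace the fixed dictionary if the priority order needs to be configurable.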

<Second Embodiment>
The hardware configuration and functional unit configuration of the second embodiment are similar to those of the first embodiment. Description of the overlapping parts is therefore omitted.

In this embodiment, the virtual sound source position of a voice in the sound field provided to each listener is set according to the seat position of the speaker and the seat position of the listener. For example, the voice of a speaker sitting in the passenger seat of another vehicle can be made to sound, to the driver on the listening side, as if it came from the passenger seat of the driver's own vehicle.

FIG. 8 is a flowchart showing the processing from speaking to listening according to this embodiment.

First, the voice acquisition unit 44 of the speaking-side voice processing apparatus 100 acquires the speaker's voice through the microphone array 4 (S21). Next, the voice separation unit 45 performs voice separation on the voice acquired in S21 (S22) and then extracts the voice for each seat position in the same manner as shown in FIG. 7 (S23). The communication unit 49 of the speaking-side voice processing apparatus 100 then transmits the acquired voice, together with the current position calculated by the current position calculation unit 42, to the listening-side voice processing apparatus 100 (S24).

In response, the communication unit 49 of the listening-side voice processing apparatus 100 acquires the speaker's voice associated with a seat position and the speaking-side seat position, both transmitted from the speaking-side voice processing apparatus 100 (S25).

Next, the sound field setting unit 46 of the voice processing apparatus 100 on the receiving side detects, via the seating sensor 6, whether or not each seat in the own vehicle is occupied (S26).

Further, the sound field setting unit 46 sets the configuration of the sound field (S27). FIG. 9 is a diagram for explaining the configuration of the sound field. The sound field setting unit 46 sets a different sound field for the speaker array 5 of each seat. The configuration of the sound field is set based on the positional relationship between the seat position of the speaker in the speaker-side vehicle and the seat position of the listener in the listener-side vehicle. Specifically, the sound field setting unit 46 sets the configuration of the sound field so that the virtual sound source of the voice associated with a seat position received in S25 is formed at the seat position of the receiving-side vehicle corresponding to that seat position.

For example, as shown in FIG. 9(A), suppose there is a voice whose speaker is the occupant of the passenger seat in the speaker-side vehicle. When setting the sound field for the driver's seat of the listener-side vehicle, the sound field setting unit 46 first sets a sound field centered on the driver position. Then, as shown in FIG. 9(B), the virtual sound source 800 of the voice of the speaker (the occupant of the passenger seat of the speaker-side vehicle) is set in the direction of the passenger seat position as viewed from the driver position.
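The direction of the virtual sound source as viewed from the listener's seat can be illustrated with a small calculation. The following is only a sketch: the seat names, the coordinates in `SEATS`, the right-hand-drive layout, and the function `source_direction` are assumptions introduced for illustration and do not appear in the specification.

```python
import math

# Hypothetical seat coordinates in metres (x: to the right, y: forward).
# A right-hand-drive layout is assumed for illustration only.
SEATS = {
    "driver": (0.4, 0.0),
    "passenger": (-0.4, 0.0),
    "rear_right": (0.4, -0.9),
    "rear_left": (-0.4, -0.9),
}


def source_direction(listener_seat, source_seat):
    """Return the azimuth in degrees of source_seat as seen from listener_seat.

    0 means straight ahead, positive values are to the right,
    negative values are to the left.
    """
    lx, ly = SEATS[listener_seat]
    sx, sy = SEATS[source_seat]
    # atan2(dx, dy) measures the angle from the forward (y) axis.
    return math.degrees(math.atan2(sx - lx, sy - ly))
```

With these assumed coordinates, `source_direction("driver", "passenger")` evaluates to -90 degrees, that is, directly to the left of the driver, which corresponds to the situation of FIG. 9(B).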

Incidentally, an occupant may already be present at the listener-side seat position corresponding to the seat position of the speaker. In such a case, if a virtual sound source is set at that seat position, the voice of the speaker mixes with the voice of the actual occupant, which feels unnatural to the listener. Therefore, the sound field setting unit 46 operates as follows.
(1) Among the voices associated with seat positions acquired in S25, for a voice corresponding to an unoccupied seat position in the own vehicle, that seat position is set as the sound source position.
(2) On the other hand, among the voices associated with seat positions acquired in S25, for a voice corresponding to an occupied seat position in the own vehicle, another unoccupied seat position (or a predetermined position, for example, the ceiling position at the center of the vehicle) is set as the sound source position.
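Rules (1) and (2) can be sketched as a small assignment routine. This is an illustrative sketch only: the function `assign_source_positions`, the seat names, and the constant `FALLBACK` are hypothetical, and the choice to try a spare unoccupied seat before the predetermined position is one possible reading of rule (2).

```python
# Hypothetical name for the predetermined position (e.g. ceiling centre).
FALLBACK = "ceiling_center"


def assign_source_positions(incoming_seats, occupied, all_seats):
    """Map each speaker-side seat to a virtual source position in the own vehicle.

    incoming_seats: seat positions attached to the received voices (S25).
    occupied: set of own-vehicle seats detected as occupied (S26).
    all_seats: ordered list of seats in the own vehicle.
    """
    free = [s for s in all_seats if s not in occupied]
    assignment = {}
    # Rule (1): an unoccupied corresponding seat is used as it is.
    for seat in incoming_seats:
        if seat not in occupied:
            assignment[seat] = seat
    taken = set(assignment.values())
    # Rule (2): otherwise use another unoccupied seat, or the fallback position.
    for seat in incoming_seats:
        if seat in assignment:
            continue
        spare = next((s for s in free if s not in taken), None)
        assignment[seat] = spare if spare is not None else FALLBACK
        taken.add(assignment[seat])
    return assignment
```

For example, if only the driver's seat of the own vehicle is occupied and voices arrive for the driver and passenger seats, the passenger voice stays at the passenger seat and the driver voice is moved to a spare seat. If every seat is occupied, the voice is placed at the fallback position.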

When the sound field has been set in this way, the sound processing unit 47 generates, for the speaker array 5 of each seat, the voice to be output to each speaker constituting that speaker array 5 by processing the original voice so that the sound field set by the sound field setting unit 46 is reproduced (S28).
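The specification does not state how the per-speaker signals of S28 are computed. As one conceivable illustration, a simple point-source model can derive a delay and a gain for each speaker of the array from its distance to the virtual sound source. The function `speaker_delays_and_gains`, the coordinates, and the 1/distance gain law below are assumptions for illustration and are not the processing defined in this document.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def speaker_delays_and_gains(virtual_source, speakers):
    """Return a (delay_seconds, gain) pair for each speaker of an array.

    virtual_source: (x, y) position of the virtual sound source in metres.
    speakers: list of (x, y) speaker positions in metres.
    Delays and gains are relative to the speaker nearest the virtual source.
    """
    dists = [math.dist(virtual_source, s) for s in speakers]
    nearest = min(dists)
    result = []
    for d in dists:
        # Farther speakers play later and more quietly (1/distance law).
        delay = (d - nearest) / SPEED_OF_SOUND
        gain = nearest / d if d > 0 else 1.0
        result.append((delay, gain))
    return result
```

Applying each delay and gain to a copy of the original voice yields one signal per speaker. A practical system would likely use a more elaborate rendering method.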

Then, the voice output unit 48 outputs the voice processed by the sound processing unit 47 to each speaker array 5, thereby realizing voice output (S29). The listener can thus hear what the speaker has said.

The operation from utterance to reception has been described above using the flow of FIG. 8. This flow is repeated every time any occupant speaks.

According to the second embodiment described above, the sound source position of the speaker is determined based on the positional relationship between the seats of the speaker and the listener. Therefore, the listener can be given the sensation that the speaker is present in the corresponding seat of the listener's own vehicle, and a more realistic dialogue can be realized.

The present embodiment can be modified in various ways.

For example, as described in the modification of the first embodiment, the dialogue can be restricted to occupants of specific seat positions, such as a dialogue only between drivers or a dialogue only between rear-seat occupants.

FIG. 10 is a flowchart of the dialogue processing in such a case, and FIG. 11 is a diagram for explaining the configuration of the sound field in such a case. The flow of FIG. 10 is basically the same as the processing shown in FIG. 8. However, on the assumption that the voices of a plurality of occupants are mixed in the "voice of the speaker" acquired by the microphone array 4, a process S23' is inserted to extract only the voice of the occupant of a specific seat by voice separation. That is, the voice separation unit 45 of the voice processing apparatus 100 on the speaker side extracts only the voice of a specific seat position (for example, the driver position) from the voices associated with seat positions acquired in S23. The example of FIG. 11 shows a state in which only the voice of the driver who is the speaker is set to be output to the speaker array 5 on the listener side. Furthermore, since a driver is already seated at the driver position on the receiving side, the sound field setting unit 46 sets the position of the virtual sound source 801 of the speaker's voice to a predetermined position (the center of the ceiling). Which seat position's voice is extracted and output can be made selectable by the user.
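The extraction of S23' amounts to filtering the seat-tagged voices by a user-selectable set of seat positions. A minimal sketch follows, in which the function `extract_seat_voices`, the dictionary layout, and the seat names are hypothetical.

```python
def extract_seat_voices(voices_by_seat, allowed_seats):
    """Keep only the voices whose seat position is in allowed_seats.

    voices_by_seat: dict mapping a seat position to its separated voice
        (any object, e.g. a list of samples), as obtained in S23.
    allowed_seats: seat positions selected for the dialogue.
    """
    allowed = set(allowed_seats)
    return {seat: voice for seat, voice in voices_by_seat.items() if seat in allowed}


# Example: restrict the dialogue to drivers only.
separated = {"driver": [0.1, 0.2], "passenger": [0.3], "rear_left": [0.0]}
drivers_only = extract_seat_voices(separated, ["driver"])
```

Only the entries that remain after this filter would be transmitted, or assigned to the sound field if the filtering is done on the receiving side.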

In this way, only the voice of the occupant at a specific seat position can be used for the dialogue. For example, when dialogue between drivers takes priority, such as when a plurality of vehicles are heading for the same destination, what a driver says can be reliably conveyed to the drivers of the other vehicles even while other occupants are speaking.

The voice separation processes S22 and S23 (and S23' when the participants are restricted) may instead be performed by the voice processing apparatus 100 on the receiving side. For example, the communication unit 49 on the speaker side transmits the voice of the speaker (a mixed sound) to the voice processing apparatus 100 on the listener side. After receiving it, the voice processing apparatus 100 on the listener side performs voice separation and association with seat positions (and, when the participants are restricted, extraction of the voice of the specific seat position) in the same manner as S22 and S23 (and S23'). The sound field setting unit 46 then assigns only the extracted voice to the sound field. As a result, the voice output unit 48 outputs only the voice of the speaker at the specific seat position.

In the flows of FIG. 8 and FIG. 10, whether or not each seat is occupied is determined, but the present invention is not limited to this. The occupancy determination may be skipped according to the user's selection, and the position of the virtual sound source may be assigned uniformly to the corresponding seat position regardless of whether or not the seat is occupied.

In the sound field setting S27, the sound field setting unit 46 may also assign the sound source positions of the speakers' voices around the listener's own position (the center of the sound field), as if a round-table conference were being held. How the sound source positions of the speakers' voices are arranged can be selected by the user.
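The round-table arrangement can be sketched by spacing the sound sources evenly on a circle around the listener. The function `round_table_positions`, the default radius, and the choice to place the first speaker straight ahead are assumptions for illustration.

```python
import math


def round_table_positions(speaker_ids, radius=1.0):
    """Place one virtual sound source per speaker evenly on a circle.

    The listener is at the origin (the centre of the sound field).
    Returns a dict mapping each speaker id to an (x, y) position in metres,
    with x to the right and y forward; the first speaker is straight ahead.
    """
    n = len(speaker_ids)
    positions = {}
    for i, sid in enumerate(speaker_ids):
        angle = 2.0 * math.pi * i / n  # clockwise from straight ahead
        positions[sid] = (radius * math.sin(angle), radius * math.cos(angle))
    return positions
```

With four speakers, the sources are placed ahead of, to the right of, behind, and to the left of the listener, all at the same distance.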

When a dialogue is held among three or more vehicles, occupants at the same seat position may speak at the same time. In such a case, the sound field setting unit 46 may assign a priority to each vehicle in advance and set the volume of each speaker higher according to the priority. Alternatively, only the voice of the speaker with the highest priority among the speakers who spoke at the same time may be output.
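The two alternatives for handling simultaneous speakers can be sketched as follows. The function `resolve_simultaneous`, the rank convention (a lower number means a higher priority), and the 6 dB step between ranks are hypothetical choices, not values from the specification.

```python
def resolve_simultaneous(vehicles, priority, exclusive=False, step_db=6.0):
    """Decide output levels for speakers talking at the same time.

    vehicles: ids of the vehicles whose occupants are speaking simultaneously.
    priority: dict mapping vehicle id to a rank (lower rank = higher priority).
    exclusive: if True, output only the highest-priority speaker.
    Returns a dict mapping vehicle id to a relative level in dB.
    """
    ranked = sorted(vehicles, key=lambda v: priority[v])
    if exclusive:
        return {ranked[0]: 0.0} if ranked else {}
    # Each step down in priority is attenuated by a further step_db.
    return {v: -step_db * i for i, v in enumerate(ranked)}
```

With `exclusive=False`, all voices are output but higher-priority vehicles are louder. With `exclusive=True`, only the highest-priority voice is output.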

The first embodiment and the second embodiment can also be combined as appropriate. That is, the sound field setting unit 46 of the voice processing apparatus 100 on the listener side can set the sound source position of the speaker roughly based on the positional relationship between the vehicles and then refine it according to the seat position within the vehicle. Alternatively, when the seat position associated with the voice to be output is a predetermined seat position (for example, the driver position), the voice can be set to be output at a higher volume than the voices of speakers in other seats even if the distance between the vehicles is large.
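One way to picture the combination of distance-dependent volume and a seat-based boost is the following sketch. The function `output_gain`, the reference distance, the inverse-distance attenuation, and the boost factor are all assumptions for illustration; the specification does not give concrete values or formulas.

```python
def output_gain(distance_m, seat, priority_seats=("driver",),
                ref_distance_m=10.0, priority_boost=2.0):
    """Return a linear gain for a voice from a vehicle distance_m away.

    The base gain falls off with distance beyond ref_distance_m.
    Voices from a priority seat are boosted relative to other seats
    at the same distance.
    """
    base = min(1.0, ref_distance_m / max(distance_m, 1e-9))
    if seat in priority_seats:
        return base * priority_boost
    return base
```

Under these assumptions, a driver's voice from a vehicle 100 m away is still output louder than a passenger's voice from the same vehicle, while both become quieter as the distance grows.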

FIG. 1 is a schematic configuration diagram of a voice processing apparatus.
FIG. 2 is a diagram for explaining the arrangement of the microphone array, the speaker array, and the seating sensor.
FIG. 3 is a diagram illustrating the functional configuration of the arithmetic processing unit 1.
FIG. 4 is a flowchart of the dialogue processing.
FIG. 5 is a diagram for explaining a method of setting the configuration of the sound field.
FIG. 6 is a flowchart of the dialogue processing.
FIG. 7 is a diagram for explaining processing for obtaining the voice for each seat position.
FIG. 8 is a flowchart of the dialogue processing.
FIG. 9 is a diagram for explaining a method of setting the configuration of the sound field.
FIG. 10 is a flowchart of the dialogue processing.
FIG. 11 is a diagram for explaining a method of setting the configuration of the sound field.

Explanation of symbols

100: voice processing apparatus,
1: arithmetic processing unit, 2: display, 3: storage device, 4: microphone array, 5: speaker array, 6: seating sensor, 7: input device, 8: wheel speed sensor, 9: gyro, 10: GPS receiver, 15: communication device,
41: user operation analysis unit, 42: current position calculation unit, 43: navigation processing unit, 44: voice acquisition unit, 45: voice separation unit, 46: sound field setting unit, 47: sound processing unit, 48: voice output unit, 49: communication unit

Claims

An in-vehicle audio processing device mounted on a first vehicle and capable of communicating with an in-vehicle device of a second vehicle, comprising:
vehicle position acquisition means for acquiring the position of the second vehicle;
voice acquisition means for acquiring voice acquired in the second vehicle;
voice output means for processing the voice acquired in the second vehicle and outputting it by a speaker array including a plurality of speakers so that, in a sound field formed by the speaker array, a virtual sound source of the voice acquired in the second vehicle is formed in the direction of the position of the second vehicle;
voice separation means for separating the voice acquired in the second vehicle for each sound source; and
voice extraction means for extracting, from the voice separated by the voice separation means, the voice at a sound source position corresponding to a predetermined seat position in the vehicle,
wherein the voice output means outputs the voice extracted by the voice extraction means to the speaker array.

The in-vehicle audio processing device according to claim 1,
wherein the voice output means reduces the volume of the voice acquired in the second vehicle as the distance between the first vehicle and the second vehicle increases, and outputs the voice.

The in-vehicle audio processing device according to claim 1,
wherein the voice output means adjusts the voice acquired in the second vehicle to a predetermined constant volume and then adjusts the volume according to the distance between the first vehicle and the second vehicle.

An in-vehicle audio processing device mounted on a first vehicle and capable of communicating with an in-vehicle device of a second vehicle, comprising:
sound acquisition means for acquiring sound that was acquired in the second vehicle, separated for each sound source position, and associated with a seat position; and
audio output means for processing the sound for each seat position and outputting the sound by a speaker array composed of a plurality of speakers so that, in a sound field formed by the speaker array, a virtual sound source of the sound associated with the seat position is formed at the seat position of the first vehicle corresponding to that seat position.

The in-vehicle audio processing device according to claim 4,
wherein the sound acquisition means separates the sound acquired in the second vehicle into sound for each sound source position and then obtains the sound associated with the seat position by associating each separated sound with a seat position.

The in-vehicle audio processing device according to claim 4, further comprising:
seating determination means for determining whether or not each seat in the first vehicle is occupied,
wherein, for the sound associated with a seat position of the second vehicle that corresponds to a seat determined to be occupied by the seating determination means, the audio output means processes the sound for each seat position so that the virtual sound source is formed at an unoccupied seat position in the first vehicle or at a different predetermined position, and outputs the sound by the speaker array.

An in-vehicle audio processing device mounted on a first vehicle and capable of communicating with an in-vehicle device of a second vehicle, comprising:
sound acquisition means for acquiring sound that was acquired in the second vehicle, separated for each sound source position, and associated with a seat position; and
audio output means for processing the sound for each seat position and outputting the sound by a speaker array composed of a plurality of speakers so that, in a sound field formed by the speaker array, a virtual sound source of the sound associated with the seat position is formed at a position surrounding the center position of the sound field.

An audio processing system capable of transmitting and receiving audio between a first vehicle and a second vehicle, comprising:
audio separation means for separating the audio acquired in the first vehicle for each sound source;
means for obtaining the sound associated with a seat position from the sound for each sound source position obtained by the audio separation means; and
audio output means for processing the sound for each seat position and outputting the processed sound by a speaker array including a plurality of speakers so that, in a sound field formed by the speaker array, a virtual sound source of the sound associated with the seat position is formed at the seat position of the second vehicle corresponding to that seat position.

An audio processing method for an in-vehicle audio processing device mounted on a first vehicle and capable of communicating with an in-vehicle device of a second vehicle, the method comprising:
a sound acquisition step of acquiring sound that was acquired in the second vehicle, separated for each sound source position, and associated with a seat position; and
a sound output step of processing the sound for each seat position and outputting the sound by a speaker array composed of a plurality of speakers so that, in a sound field formed by the speaker array, a virtual sound source of the sound associated with the seat position is formed at the seat position of the second vehicle corresponding to that seat position.

An audio processing method for an in-vehicle audio processing device mounted on a first vehicle and capable of communicating with an in-vehicle device of a second vehicle, the method comprising:
separating the sound acquired in the first vehicle for each sound source;
obtaining the sound associated with a seat position from the sound for each sound source position obtained by the separation; and
processing the sound for each seat position and outputting it by a speaker array including a plurality of speakers so that, in a sound field formed by the speaker array, a virtual sound source of the sound associated with the seat position is formed at the seat position of the second vehicle corresponding to that seat position.