JP2015158573A

JP2015158573A - Vehicle voice response system and voice response program

Info

Publication number: JP2015158573A
Application number: JP2014032881A
Authority: JP
Inventors: 真人大林; Masato Obayashi; 顕吉澤; Akira Yoshizawa
Original assignee: Denso IT Laboratory Inc
Current assignee: Denso IT Laboratory Inc
Priority date: 2014-02-24
Filing date: 2014-02-24
Publication date: 2015-09-03

Abstract

PROBLEM TO BE SOLVED: To provide a voice response system that responds by voice, in an appropriate tone, to a driver driving a vehicle. SOLUTION: A vehicle voice response system 100 includes: a voice input section 11 that receives voice and generates an input voice signal from it; a voice analysis section 21 that analyzes the voice signal generated by the voice input section; an information processing section 22 that generates an operation command for operating an external device based on the analysis result from the voice analysis section; a mode determination section 24 that determines a mode of the response voice; a response voice generation section 23 that generates the response voice based on the operation command and the response voice mode; and a response voice output section 30 that outputs the response voice.

Description

The present invention relates to a vehicle voice response system and a voice response program that are mounted on a vehicle and output a response voice to a user.

In a voice response system that recognizes a user's voice, performs predetermined information processing, and responds to the user by voice, one effective way to make the system feel friendlier to the user is to add words such as interjections and unnecessary words to the response voice.

Here, an interjection is an independent word that does not inflect, serves as neither a subject nor a modifier, and is used relatively independently of other clauses. Examples include "aa" and "oo", which express the speaker's emotion; "oi" and "moshi-moshi", which call out to someone; and "hai" (yes) and "iie" (no), which express a response. An unnecessary word is a word that is not needed when analyzing a text, for example hesitation fillers such as "etto" and "ano".

Patent Document 1 discloses that, in a voice operation interface system operated by voice, interjections and unnecessary words are used when the system outputs a response voice to the user. This realizes an interface that the user finds friendly. Non-Patent Document 1 states that, for questions posed in natural language, users tend to feel an affinity for answers that use the same sentence endings as their own. Based on this finding, a friendly response voice can be created using features of the user's question.

The following prior art documents are related to the present invention.

JP 11-237971 A
JP 2001-125584 A

Yoko Nishihara, Masahiro Matsumura, Masahiko Yachida, "Determining Answers Suited to Questions on QA Sites", Proceedings of the 2nd Symposium of the NLP Young Researchers Association (NLP若手の会), September 28, 2007

However, when such a voice response system is applied as a vehicle voice response system that responds by voice to a driver driving an automobile, adding redundancy such as interjections and unnecessary words to the response voice while the driver is in a hurry can irritate the driver. On the other hand, if the system uses only terse, immediate response voices, the driver cannot feel any friendliness toward it.

The present invention has been made in view of the above problems, and its object is to provide a voice response system that gives an appropriate response voice to a driver driving an automobile.

One aspect of the present invention is a vehicle voice response system that outputs a response voice in a vehicle, comprising: a voice input unit for inputting a user's voice; an information processing unit that operates an external device based on the voice input to the voice input unit; a mode determination unit that determines a mode of the response voice; a response voice generation unit that generates the response voice based on the voice input to the voice input unit and the response voice mode determined by the mode determination unit; and a response voice output unit that outputs the response voice.

With this configuration, the response voice generation unit generates the response voice based on the input voice, and because the response voice is also generated according to the mode determined by the mode determination unit, the monotony of the response voice can be reduced.

In the vehicle voice response system, the mode determination unit may determine at least one of a tone mode, a dialect mode, and a speech rate mode as the response voice mode.

With this configuration, the mode determination unit can determine at least one of a tone mode (immediate / friendly / polite), a dialect mode (standard Japanese / Kansai dialect / Tohoku dialect), and a speech rate mode (slow / normal / fast) as the response voice mode, so a variety of response voice modes suited to the user's needs can be determined.

In the vehicle voice response system, the mode determination unit may determine the response voice mode based on at least one of the characteristics of the voice input to the voice input unit, the traveling state of the vehicle, and the presence or absence of a passenger.

With this configuration, the response voice mode can be determined by any one or more of these methods, so a variety of response voice modes suited to the user's needs can be determined. When multiple methods are used to determine the response voice mode, the methods may be prioritized or weighted.
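The weighting of multiple determination methods could be sketched as a weighted vote, as below. This is only an illustration under assumptions: the method names, the weight values, and the function `combine_votes` are hypothetical and do not appear in the specification, which leaves the combination scheme open.

```python
from collections import defaultdict

# Hypothetical weights for each determination method (not from the specification).
WEIGHTS = {"voice_features": 1.0, "driving_state": 2.0, "passenger": 1.5}


def combine_votes(votes, weights=WEIGHTS):
    """Return the tone mode whose proposing methods have the largest total weight.

    votes maps a method name to the tone mode that method proposes.
    Methods without a configured weight count as 1.0.
    """
    totals = defaultdict(float)
    for method, mode in votes.items():
        totals[mode] += weights.get(method, 1.0)
    # On a tie, max() keeps the mode that was proposed first.
    return max(totals, key=totals.get)


votes = {
    "voice_features": "friendly",
    "driving_state": "immediate",
    "passenger": "friendly",
}
print(combine_votes(votes))  # friendly (1.0 + 1.5 outweighs 2.0)
```

A strict priority order is the special case in which each weight is larger than the sum of all lower-ranked weights.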

In the vehicle voice response system, the characteristics of the user's voice may be at least one of the proportion of interjections, the proportion of unnecessary words, sentence endings, tone, speech rate, and frequency.

With this configuration, the mode determination unit can determine the characteristics of the user's voice based on at least one of several voice features of the input voice signal, such as the proportion of interjections, the proportion of unnecessary words, sentence endings, tone, speech rate, and frequency, and can therefore grasp those characteristics more accurately.

The vehicle voice response system may further include a voice storage unit that stores the voice input to the voice input unit, and the mode determination unit may determine the response voice mode based on a comparison between the characteristics of the user's voice and the voice stored in the voice storage unit.

With this configuration, the mode determination unit can determine the response voice mode based on the difference between the past input voices stored in the voice storage unit and the user's current voice, so the response voice mode can reflect the past tendencies of the voice characteristics of the user who is speaking.

The vehicle voice response system may further include a voice storage unit that stores the voice input to the voice input unit, and the mode determination unit may determine the response voice mode based on the result of statistically processing the input voice signals stored in the voice storage unit.

With this configuration, the response voice mode is determined based on the result of statistically processing the past input voices accumulated in the voice storage unit, so the system can accurately grasp the tendencies of the user's voice characteristics and the user's situation before determining the response voice mode. Any method may be used to statistically process the voice features of the user's past input voices in the voice storage unit, such as taking an average over a predetermined period.
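Averaging over a predetermined period, the one statistical method the text names, could look like the following sketch. The class `FeatureHistory`, its method names, and the use of seconds as the time unit are assumptions made for illustration; the specification does not prescribe a data structure.

```python
from collections import deque
from statistics import mean


class FeatureHistory:
    """Stores (timestamp, value) samples of one voice feature, e.g. speech rate."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.samples = deque()

    def add(self, timestamp, value):
        """Record one observation of the feature."""
        self.samples.append((timestamp, value))

    def average(self, now):
        """Mean of the samples no older than window_seconds, or None if there are none."""
        recent = [v for t, v in self.samples if now - t <= self.window_seconds]
        return mean(recent) if recent else None


history = FeatureHistory(window_seconds=600)
history.add(0, 5.0)
history.add(500, 7.0)
history.add(900, 9.0)
print(history.average(now=1000))  # 8.0 (the sample at t=0 is outside the window)
```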

In the vehicle voice response system, the mode determination unit may determine the response voice mode so that the response voice resembles the voice input to the voice input unit.

With this configuration, the mode determination unit determines a response voice mode that resembles the user's input voice, so a response voice that the user finds friendly can be generated.

The vehicle voice response system may further include a map storage unit in which map information is stored, a position information acquisition unit that acquires position information of the vehicle, and a vehicle information acquisition unit that acquires information about the vehicle, and the mode determination unit may determine the response voice mode based on the map information stored in the map storage unit, the position information acquired by the position information acquisition unit, and the vehicle information acquired by the vehicle information acquisition unit.

With this configuration, the mode determination unit can determine a response voice mode with an appropriate tone, dialect, and speech rate by, for example, comparing map information around the vehicle's current position with vehicle information such as the traveling state.

The vehicle voice response system may further include a position information acquisition unit that acquires position information of the vehicle and a traffic jam information acquisition unit that acquires traffic jam information, and the mode determination unit may determine the response voice mode based on the position information acquired by the position information acquisition unit and the traffic jam information acquired by the traffic jam information acquisition unit.

With this configuration, the mode determination unit can determine an appropriate response voice mode; for example, when it determines from the vehicle's current position that the vehicle is caught in a traffic jam, it can select the "immediate" tone so as not to aggravate the driver's irritation.

The vehicle voice response system may further include a position information acquisition unit that acquires position information of the vehicle and a movement history storage unit that stores the movement history of the vehicle, and the mode determination unit may determine the response voice mode based on the position information acquired by the position information acquisition unit and the movement history stored in the movement history storage unit.

With this configuration, the mode determination unit can determine an appropriate response voice mode by inferring from the movement history at the vehicle's current position, for example, whether the road on which the vehicle is currently traveling is one that is regularly used.

The vehicle voice response system may further include a vehicle information acquisition unit that acquires information about the vehicle, and the mode determination unit may determine the response voice mode based additionally on the vehicle information acquired by the vehicle information acquisition unit.

With this configuration, the mode determination unit can compare the movement history at the vehicle's current position with the vehicle's current traveling state, so that an appropriate response voice can be generated with the driving load and similar factors taken into account.

The vehicle voice response system may further include a passenger sensor that detects a passenger, and the mode determination unit may determine the response voice mode based on a detection signal from the passenger sensor.

With this configuration, the mode determination unit switches the response voice mode according to whether a passenger other than the driver is present, and can thus determine an appropriate response voice mode that takes the situation inside the vehicle into account.

The vehicle voice response system may further include a vehicle information acquisition unit that acquires information about the vehicle, and the response voice generation unit may determine the timing at which the response voice output unit outputs the response voice based on the vehicle information acquired by the vehicle information acquisition unit.

With this configuration, the response voice output unit can output the response voice at a timing suited to the vehicle information, which avoids a situation in which a response voice is output while the driving load is high and increases the load further. An interjection or the like may be output while waiting for the appropriate timing to output the response voice.

The vehicle voice response system may further include a voice analysis unit that analyzes the voice signal generated by the voice input unit, and the information processing unit may operate the external device based on the analysis result from the voice analysis unit.

With this configuration, the external device can be operated by the user's voice, and a voice relating to the operation of the external device can be output as the response voice.

Yet another aspect of the present invention is a voice response program that causes a computer for outputting a vehicle response voice to execute: a voice input step of inputting a user's voice; an information processing step of operating an external device based on the voice input in the voice input step; a mode determination step of determining a mode of the response voice; a response voice generation step of generating the response voice based on the voice input in the voice input step and the response voice mode determined in the mode determination step; and a response voice output step of outputting the response voice.

With this configuration as well, the response voice generation step generates the response voice based on the input voice, and because the response voice is generated according to the mode determined in the mode determination step, the monotony of the response voice can be reduced.

According to the present invention, the mode of the response voice can be switched, so the monotony of the response voice can be reduced.

A block diagram showing the configuration of a voice operation system that includes the voice response system according to an embodiment of the present invention
A flowchart showing an example of response voice output by the vehicle voice response system according to the embodiment of the present invention

A voice response system according to an embodiment of the present invention will now be described with reference to the drawings. The embodiment described below is one example of how the present invention may be carried out and does not limit the present invention to the specific configuration described. In carrying out the present invention, a specific configuration suited to the embodiment may be adopted as appropriate.

FIG. 1 is a block diagram showing the configuration of a voice operation system that includes the voice response system according to the embodiment of the present invention. The voice operation system is mounted on a vehicle and allows a vehicle occupant to operate devices mounted on the vehicle by voice. The voice operation system of this embodiment assumes in particular that the driver is the occupant who performs the voice operations.

The voice operation system consists of an external device 200 to be operated and a voice response system 100 that receives the user's voice, outputs an operation command to the external device 200, and outputs a response voice to the user. The voice response system 100 includes a voice input unit 11, a voice storage unit 12, a vehicle information acquisition unit 13, a traffic jam information acquisition unit 14, a position information acquisition unit 15, a movement history storage unit 16, a map storage unit 17, a passenger sensor 18, a control unit 20, and a response voice output unit 30. The control unit 20 includes a voice analysis unit 21, an information processing unit 22, a response voice generation unit 23, and a mode determination unit 24.

The voice input unit 11 is a microphone installed in the vehicle so that it can pick up the driver's speech. The voice input unit 11 generates a voice signal from the input voice and outputs the voice signal to the voice storage unit 12 and the voice analysis unit 21. The voice storage unit 12 is a storage medium that stores the voice signal generated by the voice input unit 11 and the analysis results obtained by the voice analysis unit 21.

The vehicle information acquisition unit 13 acquires control information for various devices of the vehicle over CAN (Controller Area Network). The vehicle information acquisition unit 13 acquires vehicle information such as the accelerator opening, the steering angle, and the traveling speed, and outputs the acquired vehicle information to the mode determination unit 24.

The traffic jam information acquisition unit 14 acquires traffic jam information via VICS (registered trademark) (Vehicle Information and Communication System) and outputs the acquired traffic jam information to the mode determination unit 24. The position information acquisition unit 15 acquires position information on the vehicle's current position via GPS (Global Positioning System) and outputs the acquired position information to the movement history storage unit 16 and the mode determination unit 24.

The movement history storage unit 16 stores the movement history of the vehicle using the position information input from the position information acquisition unit 15. The map storage unit 17 stores the map information used by the navigation device. This map information includes information on the links and nodes that correspond to roads and intersections, information on buildings and other features on the map, and traffic rule information such as speed limits and one-way restrictions.

The passenger sensor 18 is a sensor provided in the seats other than the driver's seat and detects the presence of an occupant other than the driver. The passenger sensor 18 outputs a detection signal to the mode determination unit 24.

The control unit 20 is realized by a microcomputer executing a predetermined voice response program. Its functions are voice analysis, information processing, response voice generation, and mode determination, which are shown in FIG. 1 as the voice analysis unit 21, the information processing unit 22, the response voice generation unit 23, and the mode determination unit 24, respectively.

The voice analysis unit 21 performs voice analysis on the voice signal input from the voice input unit 11 (the input voice signal) and decodes it. The voice analysis unit 21 outputs the decoding result to the voice storage unit 12, the information processing unit 22, and the mode determination unit 24. For example, when the input voice is "raise the air conditioner temperature", the voice analysis unit 21 outputs a decoding result indicating that the air conditioner temperature is to be raised.

The information processing unit 22 generates an operation command for operating the external device 200 according to the decoding result input from the voice analysis unit 21, and outputs the command to the external device 200 and the response voice generation unit 23. For example, when the voice analysis unit 21 supplies a decoding result indicating that the air conditioner temperature is to be raised, the information processing unit 22 generates and outputs, to the air conditioner serving as the external device 200, an operation command that raises the set temperature by a predetermined step.

The mode determination unit 24 determines the response voice mode based on various kinds of information. The response voice modes are a tone mode (immediate / friendly / polite), a dialect mode (standard Japanese / Kansai dialect / Tohoku dialect), and a speech rate mode (slow / normal / fast). The mode determination unit 24 determines each of these modes and outputs them to the response voice generation unit 23. The method of determining the modes is described later.

The response voice generation unit 23 receives the operation command from the information processing unit 22 and the determined modes from the mode determination unit 24, generates a response voice based on the operation command and the modes, and outputs it to the response voice output unit 30. The content of the response voice is determined by the operation command, while its tone, dialect, and speech rate follow the respective modes.

The response voice generation unit 23 changes the tone of the response voice according to the tone mode. Specifically, when the tone mode is "immediate", the response voice generation unit 23 reduces or omits interjections and unnecessary words and generates a terse response voice made up of simple sentences or phrases with assertive sentence endings. When the tone mode is "friendly", it generates an approachable response voice with more interjections and unnecessary words and also includes additional responses such as greetings, which reduces the monotony of the voice response and makes the user feel at ease. When the tone mode is "polite", it generates a response voice in polite language, for example using the desu/masu style. The response voice generation unit 23 also changes the accent of the response voice according to the dialect mode and changes the speech rate of the response voice according to the speech rate mode.

For example, suppose the operation command given to the information processing unit 22 raises the set temperature of the air conditioner by 2 degrees. When the tone mode is "friendly", the dialect mode is "Kansai dialect", and the speech rate mode is "normal", the response voice generation unit 23 generates, at a standard speed and with Kansai intonation, a response voice such as 「エアコン、２度だけ上げといたで〜。暑なったらまた言いや〜。」 (roughly, "I've turned the air conditioner up just 2 degrees. Tell me again if it gets hot."). For the same operation command, when the tone mode is "polite", the dialect mode is "standard Japanese", and the speech rate mode is "fast", the response voice generation unit 23 generates, at high speed and with standard intonation, a response voice such as 「設定温度を２度上げました」 ("The set temperature has been raised by 2 degrees.").
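The way the three modes shape one response can be sketched with a small template lookup. This is a minimal illustration, assuming a hypothetical `ResponseMode` record and `TEMPLATES` table; only the two mode combinations quoted in the text are filled in, and the specification does not state that templates are how the response voice generation unit 23 works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseMode:
    """The three response voice modes named in the text."""
    tone: str = "immediate"      # "immediate" / "friendly" / "polite"
    dialect: str = "standard"    # "standard" / "kansai" / "tohoku"
    speech_rate: str = "normal"  # "slow" / "normal" / "fast"


# Hypothetical templates keyed by (tone, dialect); the wording is taken from the text.
TEMPLATES = {
    ("friendly", "kansai"): "エアコン、{delta}度だけ上げといたで〜。暑なったらまた言いや〜。",
    ("polite", "standard"): "設定温度を{delta}度上げました",
}


def generate_response(delta, mode):
    """Return (text, speech_rate) for a 'raise the set temperature by delta degrees' command."""
    template = TEMPLATES.get((mode.tone, mode.dialect))
    if template is None:
        raise KeyError(f"no template for {mode.tone}/{mode.dialect}")
    return template.format(delta=delta), mode.speech_rate


text, rate = generate_response(2, ResponseMode("polite", "standard", "fast"))
print(text, rate)
```

The speech rate is returned alongside the text because it affects how the sentence is synthesized rather than which words are chosen.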

The response voice output unit 30 is a speaker that outputs the voice generated by the response voice generation unit 23.

The external device 200 is a device mounted on the vehicle and is the target operated through the vehicle voice response system 100. The external device 200 may be a device for comfortable driving and riding, such as an air conditioner, a car navigation system, or AV equipment, or it may be a device involved in driving the vehicle, such as the headlights, side mirrors, turn signals, accelerator, brakes, or steering.

Next, the mode determination method used by the mode determination unit 24 will be described.
(1) Method of determining the mode based on the characteristics of the user's voice
The mode determination unit 24 determines the voice features of the input voice signal based on the decoding result of the input voice signal received from the voice analysis unit 21 and on the input voice signal received from the voice input unit 11, and determines the mode based on those voice features. From the decoding result supplied by the voice analysis unit 21, the mode determination unit 24 determines features such as interjections, unnecessary words, sentence endings, intonation, and speech rate; from the input voice signal supplied by the voice input unit 11, it determines the frequency feature. Specific examples follow.

(Example 1-1)
When the proportion of interjections and unnecessary words in the input voice signal is greater than a predetermined threshold, the mode determination unit 24 sets the tone mode to “friendly”; when that proportion is less than the predetermined threshold, it sets the tone mode to “immediate”; and when the sentence endings in the input voice signal are in the “desu/masu” (polite) style, it sets the tone mode to “polite”. Further, when the speech rate or frequency of the input voice signal is higher than a predetermined upper threshold, the mode determination unit 24 sets the speech rate mode to “fast”; when it is lower than a predetermined lower threshold, it sets the speech rate mode to “slow”; and when it lies between the upper and lower thresholds, it sets the speech rate mode to “normal”. In addition, the mode determination unit 24 determines the dialect mode based on the sentence endings and intonation of the input voice signal. In this way, the mode determination unit 24 determines the tone mode, the speech rate mode, and the dialect mode so that they resemble the input voice.
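By way of a non-limiting illustration, the threshold rules of Example 1-1 may be sketched as follows. The threshold values, the unit of the speech rate, the `VoiceFeatures` container, and the order in which the tone rules are checked are all assumptions made for this sketch; the description specifies only that the thresholds are "predetermined". Dialect determination is omitted.

```python
from dataclasses import dataclass

# Hypothetical values; the description only says "predetermined threshold".
FILLER_RATIO_THRESHOLD = 0.15  # proportion of interjections and unnecessary words
RATE_UPPER_THRESHOLD = 8.0     # speech rate (assumed unit: morae per second)
RATE_LOWER_THRESHOLD = 5.0


@dataclass
class VoiceFeatures:
    filler_ratio: float   # interjections and unnecessary words / all words
    polite_ending: bool   # True when the utterance ends in desu/masu style
    speech_rate: float


def decide_tone_mode(features: VoiceFeatures) -> str:
    # Assumption: the polite-ending rule is checked first, because the
    # description does not say which rule wins when several apply.
    if features.polite_ending:
        return "polite"
    if features.filler_ratio > FILLER_RATIO_THRESHOLD:
        return "friendly"
    # A ratio exactly at the threshold falls here (also an assumption).
    return "immediate"


def decide_speech_rate_mode(features: VoiceFeatures) -> str:
    if features.speech_rate > RATE_UPPER_THRESHOLD:
        return "fast"
    if features.speech_rate < RATE_LOWER_THRESHOLD:
        return "slow"
    return "normal"
```

Under these assumed thresholds, an utterance with many interjections and a plain ending yields the “friendly” tone mode, so the response mirrors the speaker.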

(Example 1-2)
In addition to Example 1-1, the mode determination unit 24 reads the user's past voice features from the voice storage unit 12, compares them with the voice features of the input voice signal, and determines the mode based on the comparison result. Specifically, for a given voice feature, the mode determination unit 24 determines the mode according to whether the difference between the past value stored in the voice storage unit 12 and the value supplied from the voice input unit 11 or the voice analysis unit 21 is greater than a predetermined threshold.

For example, when the speech rate of the input voice signal is faster than the past speech rate stored in the voice storage unit 12 (the average over a certain amount of voice signal) and the difference exceeds a predetermined threshold, the mode determination unit 24 sets the speech rate mode to “fast”.
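The comparison of Example 1-2 may be sketched, purely as an illustration, as follows. The function name, the threshold value, and the fallback mode returned when the condition is not met are assumptions; the description gives only the case in which the current speech rate is faster than the stored average.

```python
from statistics import mean

# Hypothetical threshold on the speech rate difference.
RATE_DIFF_THRESHOLD = 1.0


def speech_rate_mode_vs_history(current_rate, past_rates, fallback="normal"):
    """Return "fast" when the current rate exceeds the stored average
    by more than the threshold; otherwise return the fallback mode."""
    if not past_rates:
        return fallback  # nothing stored yet (assumed behaviour)
    if current_rate - mean(past_rates) > RATE_DIFF_THRESHOLD:
        return "fast"
    return fallback
```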

(Example 1-3)
In Example 1-1, each mode was determined so as to resemble the tone, dialect, and speech rate of the input voice signal. In this example, the mode determination unit 24 instead refers to the voice storage unit 12, statistically processes the voice features of the user's past input voice (for example, by averaging over a certain period), and determines the mode based on the result.

That is, when the proportion of interjections and unnecessary words in the past input voice signals is greater than a predetermined threshold, the mode determination unit 24 sets the tone mode to “friendly”; when that proportion is less than the predetermined threshold, it sets the tone mode to “immediate”; and when the sentence endings in the past input voice signals are in the “desu/masu” (polite) style, it sets the tone mode to “polite”. Further, when the speech rate or frequency of the past input voice signals is higher than a predetermined upper threshold, the mode determination unit 24 sets the speech rate mode to “fast”; when it is lower than a predetermined lower threshold, it sets the speech rate mode to “slow”; and when it lies between the upper and lower thresholds, it sets the speech rate mode to “normal”. In addition, the mode determination unit 24 determines the dialect mode based on the sentence endings and intonation of the past input voice signals. In this way, the mode determination unit 24 determines the tone mode, the speech rate mode, and the dialect mode so that they resemble the user's past input voice.

According to this method, for example, when a particular passenger frequently uses the vehicle voice response system 100 to operate the external device 200 by voice, the voice storage unit 12 comes to store many of that passenger's input voice signals and their analysis results. Then, when the driver rides alone and uses the vehicle voice response system 100, the mode determination unit 24 determines the mode so as to resemble the accumulated tone and other features of that passenger. The driver riding alone can therefore hear response voices with the same tone, dialect, and speech rate as the frequent passenger, and can feel a sense of familiarity with the response voice.
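As a non-limiting sketch of Example 1-3, the stored features may be averaged before the same kind of threshold rule is applied. The data layout (a list of pairs), the threshold value, and the use of a simple majority to decide whether past endings count as “desu/masu” style are assumptions of this sketch.

```python
def tone_mode_from_history(past_utterances, filler_ratio_threshold=0.15):
    """past_utterances: list of (filler_ratio, polite_ending) pairs
    taken from stored past input voice (hypothetical layout)."""
    if not past_utterances:
        return None  # nothing stored yet; the caller keeps its current mode
    n = len(past_utterances)
    mean_filler_ratio = sum(ratio for ratio, _ in past_utterances) / n
    polite_share = sum(1 for _, polite in past_utterances if polite) / n
    # Assumption: a majority of polite endings selects the "polite" tone.
    if polite_share > 0.5:
        return "polite"
    if mean_filler_ratio > filler_ratio_threshold:
        return "friendly"
    return "immediate"
```

Because the decision depends only on what has been stored, a driver riding alone receives responses shaped by whoever has spoken to the system most often, which is the effect described above.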

(2) Method for determining the mode based on the running state of the vehicle
The mode determination unit 24 determines the mode based on the vehicle information supplied from the vehicle information acquisition unit 13, the traffic jam information supplied from the traffic jam information acquisition unit 14, the position information supplied from the position information acquisition unit 15, the movement history read from the movement history storage unit 16, and the map information read from the map storage unit 17.

(Example 2-1)
Based on the current position of the vehicle supplied from the position information acquisition unit 15 and the map information read from the map storage unit 17, the mode determination unit 24 identifies the speed limit of the road on which the vehicle is traveling, compares the vehicle speed supplied from the vehicle information acquisition unit 13 with that speed limit, and determines the mode based on the comparison result. Specifically, the mode determination unit 24 determines the mode according to whether the difference between the vehicle speed and the speed limit is greater than a predetermined threshold.

For example, when the vehicle speed is faster than the speed limit and the difference exceeds a predetermined threshold, the driver is regarded as being in a hurry and as requiring a fast response, so the speech rate mode is set to “fast”. Conversely, when the vehicle speed is slower than the speed limit and the difference exceeds a predetermined threshold, the road is regarded as congested and the driver as irritated and requiring a quick response, so the tone mode is set to “immediate”.
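The two cases of Example 2-1 may be illustrated by the following minimal sketch. The function name, the km/h unit, the default threshold, and the dictionary used to carry the mode changes are assumptions introduced here.

```python
def mode_changes_from_speed(vehicle_speed_kmh, speed_limit_kmh, threshold_kmh=10.0):
    """Return the mode changes suggested by comparing the vehicle speed
    with the speed limit; an empty dict means no change."""
    changes = {}
    diff = vehicle_speed_kmh - speed_limit_kmh
    if diff > threshold_kmh:
        # Well above the limit: driver regarded as being in a hurry.
        changes["speech_rate"] = "fast"
    elif -diff > threshold_kmh:
        # Well below the limit: road regarded as congested.
        changes["tone"] = "immediate"
    return changes
```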

(Example 2-2)
Based on the current position of the vehicle supplied from the position information acquisition unit 15 and the traffic jam information supplied from the traffic jam information acquisition unit 14, the mode determination unit 24 determines whether the vehicle is caught in a traffic jam and determines the mode based on the result. Specifically, when the vehicle is caught in a traffic jam, the driver is regarded as irritated, so the mode determination unit 24 sets the tone mode to “immediate”.

(Example 2-3)
Based on the current position of the vehicle supplied from the position information acquisition unit 15 and the movement history read from the movement history storage unit 16, the mode determination unit 24 determines whether the road currently being traveled is one that has been traveled many times in the past (a regular road), and determines the mode based on the result. Specifically, when the road currently being traveled is a regular road, the mode determination unit 24 assumes, for example, that it is a commuting route and sets the tone mode to “polite”; when the road currently being traveled is not a regular road, it assumes, for example, a weekend drive and sets the tone mode to “friendly” to create a more out-of-the-ordinary feel.

(Example 2-4)
Further, when the road currently being traveled is a regular road, the mode determination unit 24 obtains the past average speed from the movement history read from the movement history storage unit 16, compares it with the vehicle speed acquired from the vehicle information acquisition unit 13, and determines the mode based on the comparison result. Specifically, the mode determination unit 24 determines the mode according to whether the difference between the past average speed and the current vehicle speed is greater than a predetermined threshold.

For example, when the current vehicle speed is faster than the past average speed and the difference exceeds a predetermined threshold, the mode determination unit 24 judges that the driver is busy and sets the tone mode to “immediate”.
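Examples 2-3 and 2-4 may be combined in a short, non-limiting sketch. How a "regular road" is recognised is not specified in the description; counting past trips on the same road identifier against a hypothetical minimum is an assumption, as are the history layout and the threshold values.

```python
from statistics import mean

# Hypothetical values; the description does not quantify either of them.
REGULAR_ROAD_MIN_TRIPS = 20
SPEED_DIFF_THRESHOLD_KMH = 10.0


def tone_mode_for_road(road_id, current_speed_kmh, history):
    """history: list of (road_id, speed_kmh) records from past trips."""
    past_speeds = [speed for rid, speed in history if rid == road_id]
    if len(past_speeds) < REGULAR_ROAD_MIN_TRIPS:
        return "friendly"   # not a regular road, e.g. a weekend drive
    if current_speed_kmh - mean(past_speeds) > SPEED_DIFF_THRESHOLD_KMH:
        return "immediate"  # faster than usual: driver judged to be busy
    return "polite"         # regular road, e.g. a commuting route
```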

(3) Method for determining the mode based on the presence or absence of a passenger
The mode determination unit 24 determines the mode based on the detection signal supplied from the passenger sensor 18. Specifically, when there is a passenger on a holiday, the mode determination unit 24 assumes that the user is riding with family and sets the tone mode to “friendly”; when there is a passenger on a weekday, it assumes that the user is carrying a work-related person and sets the tone mode to “polite”, which avoids giving the passenger an odd impression.
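Method (3) may be sketched, for illustration only, as below. Treating Saturdays and Sundays as holidays, passing additional holidays in as a set, and returning `None` when no passenger is detected are assumptions of this sketch.

```python
from datetime import date


def tone_mode_from_passenger(passenger_detected, today, holidays=frozenset()):
    """Return the tone mode suggested by the passenger sensor signal,
    or None when no passenger is detected (no suggestion)."""
    if not passenger_detected:
        return None
    # Assumption: weekends count as holidays; others are supplied by the caller.
    is_holiday = today.weekday() >= 5 or today in holidays
    return "friendly" if is_holiday else "polite"
```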

The mode determination unit 24 determines the tone mode, the dialect mode, and the speech rate mode using any one of the examples of methods (1) to (3) above, or an appropriate combination of them. When the methods are combined, each method may be given a priority or a weight. A specific example of mode determination by the mode determination unit 24 is described below.
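One possible way of weighting the methods, offered only as an assumption-laden sketch, is to let each method cast a weighted vote for a mode value and to pick the value with the largest total. The description does not say how priorities or weights are applied; the vote format and the tie-breaking rule here are hypothetical.

```python
from collections import defaultdict


def combine_by_weight(votes, default=None):
    """votes: list of (mode_value, weight) pairs, one per method.
    Returns the mode value with the largest total weight."""
    totals = defaultdict(float)
    for value, weight in votes:
        totals[value] += weight
    if not totals:
        return default
    # On a tie, the value that was voted for first wins
    # (dicts preserve insertion order and max keeps the first maximum).
    return max(totals, key=totals.get)
```

A strict priority scheme can be expressed in the same form by giving the highest-priority method a weight larger than the sum of all the others.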

FIG. 2 is a flowchart showing an example of response voice output by the vehicle voice response system. The vehicle voice response system 100 determines whether there has been voice input from the driver (step S21). When there is no voice input (NO in step S21), it continues to wait for voice input. When there is voice input (YES in step S21), the input voice signal is supplied to the voice analysis unit 21, and the voice analysis unit 21 performs voice analysis (step S22).

The information processing unit 22 performs information processing according to the result of the voice analysis, thereby generating an operation command for operating the external device 200, and outputs the command to the external device 200 (step S23). The external device 200 is operated as a result. The response voice generation unit 23 generates a response voice based on the operation command generated by the information processing unit 22 (step S24). At this point, the response voice is generated with the tone mode set to “polite”, the dialect mode set to “standard language”, and the speech rate mode set to “fast”.

Next, it is determined whether the current vehicle speed exceeds the speed limit of the road currently being traveled by a predetermined threshold or more (step S25). When the vehicle speed is greater than the speed limit plus the predetermined threshold (YES in step S25), the response voice already generated is output (step S26). This case corresponds to selecting “fast” as the speech rate mode in accordance with Example 2-1, the vehicle speed being faster than the speed limit by the predetermined threshold or more.

When the vehicle speed is less than the speed limit plus the predetermined threshold (NO in step S25), the mode determination unit 24, in order to change the mode, reads the driver's past voice from the voice storage unit 12 (step S27), analyzes the voice features of that past voice, and, in accordance with Example 1-3, determines the mode based on statistics of the voice features of the driver's past input voice (step S28). The response voice generation unit 23 then modifies the response voice according to the mode changed by the mode determination unit 24 (step S29) and outputs the response message (step S26).
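The branch at step S25 of FIG. 2 may be sketched as follows. Only the mode selection is modelled; voice analysis, command generation, and speech synthesis are outside the sketch. The function name, the dictionary of modes, and the `modes_from_past_voice` callback (standing in for steps S27 and S28) are hypothetical.

```python
DEFAULT_MODES = {
    "tone": "polite",
    "dialect": "standard language",
    "speech_rate": "fast",
}


def select_modes(vehicle_speed, speed_limit, threshold, modes_from_past_voice):
    """Return the modes used for the response voice that is finally output."""
    modes = dict(DEFAULT_MODES)                  # step S24: initial modes
    if vehicle_speed > speed_limit + threshold:  # step S25: YES
        return modes                             # step S26: output as generated
    modes.update(modes_from_past_voice())        # steps S27-S28: re-determine
    return modes                                 # steps S29 and S26
```

Generating the response with the “fast” defaults first means that the case in which the driver is in a hurry needs no further processing before output.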

As described above, according to the vehicle voice response system 100 of the embodiment of the present invention, the mode of the response voice can be switched, so the response voice can be output with a tone, dialect, and speech rate appropriate to various factors such as voice features, the running state, and the presence or absence of a passenger.

In the above embodiment, the voice storage unit 12 accumulates the user's past voice by storing the voice input through the voice input unit 11 and the results of analyzing it; however, the user's past voice may instead be obtained from another device that accepts voice input. For example, when a mobile terminal has a voice input function, the mobile terminal may store the user's input voice, and the voice storage unit 12 may acquire from the mobile terminal the user's voice that was input and accumulated there in the past.

The vehicle information acquired by the vehicle information acquisition unit 13 may also be output to the response voice generation unit 23. In this case, based on the vehicle information supplied from the vehicle information acquisition unit 13, the response voice generation unit 23 can determine the timing at which it outputs the generated response voice to the response voice output unit 30, and hence the timing at which the response voice is output from the speaker or the like.

For example, when a response voice is about to be output and both the amount of change in the steering angle and the vehicle speed are at or above certain levels, the driving load is temporarily high. The response voice generation unit 23 therefore waits until that state has passed and conditions have settled before outputting the generated response voice to the response voice output unit 30. In this case, while output of the response voice is on hold, the response voice generation unit 23 may first have the response voice output unit 30 output a response voice consisting of a predetermined interjection (for example, 「えーっと」, “um”). This prevents the user from mistaking the absence of any response to a voice operation for a delay caused by a system failure or the like.
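The deferral described above may be sketched, without limitation, as a function that walks through successive vehicle-state readings and returns what would be spoken. The limit values, the reading format, and the choice to emit the interjection only once are assumptions of this sketch.

```python
# Hypothetical limits; the description says only "a certain level".
STEERING_CHANGE_LIMIT_DEG = 15.0
SPEED_LIMIT_KMH = 60.0
FILLER = "えーっと"  # the interjection given as an example in the description


def output_sequence(response, readings):
    """readings: iterable of (steering_change_deg, speed_kmh) samples taken
    while the response is pending. Returns the utterances in output order."""
    spoken = []
    for steering_change, speed in readings:
        high_load = (steering_change >= STEERING_CHANGE_LIMIT_DEG
                     and speed >= SPEED_LIMIT_KMH)
        if high_load:
            if FILLER not in spoken:
                spoken.append(FILLER)  # signal that the system is still alive
            continue                   # keep waiting for the load to ease
        spoken.append(response)
        break
    return spoken
```

If the readings end while the load is still high, only the interjection has been output and the response remains pending.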

Because the present invention can switch the mode of the response voice, it has the effect that the response voice can be output with a tone, dialect, and speech rate appropriate to various factors such as voice features, the running state, and the presence or absence of a passenger, and it is useful as a voice response system or the like that is mounted on a vehicle and outputs a response voice to a user.

DESCRIPTION OF SYMBOLS
100 Vehicle voice response system
11 Voice input unit
12 Voice storage unit
13 Vehicle information acquisition unit
14 Traffic jam information acquisition unit
15 Position information acquisition unit
16 Movement history storage unit
17 Map storage unit
18 Passenger sensor
20 Control unit
21 Voice analysis unit
22 Information processing unit
23 Response voice generation unit
24 Mode determination unit
30 Response voice output unit
200 External device

Claims

1. A vehicle voice response system that outputs a response voice for a vehicle, comprising:
a voice input unit to which a user's voice is input;
an information processing unit that operates an external device based on the voice input to the voice input unit;
a mode determination unit that determines a mode of the response voice;
a response voice generation unit that generates the response voice based on the voice input to the voice input unit and the mode of the response voice determined by the mode determination unit; and
a response voice output unit that outputs the response voice.

2. The vehicle voice response system according to claim 1, wherein the mode determination unit determines at least one of a tone mode, a dialect mode, and a speech rate mode as a response voice mode.

3. The vehicle voice response system according to claim 1 or 2, wherein the mode determination unit determines the mode of the response voice based on at least one of a feature of the voice input to the voice input unit, a running state of the vehicle, and the presence or absence of a passenger.

4. The vehicle voice response system according to claim 3, wherein the feature of the user's voice is at least one of a proportion of interjections, a proportion of unnecessary words, a sentence ending, a tone, a speech rate, and a frequency.

5. The vehicle voice response system according to claim 1, further comprising a voice storage unit that stores the voice input to the voice input unit,
wherein the mode determination unit determines the mode of the response voice based on a result of comparing a feature of the user's voice with the voice stored in the voice storage unit.

6. The vehicle voice response system according to claim 1, further comprising a voice storage unit that stores the voice input to the voice input unit,
wherein the mode determination unit determines the mode of the response voice based on a result of statistically processing the input voice signals stored in the voice storage unit.

7. The vehicle voice response system according to claim 1, wherein the mode determination unit determines the mode of the response voice so that the response voice resembles the voice input to the voice input unit.

8. The vehicle voice response system according to any one of claims 1 to 7, further comprising:
a map storage unit that stores map information;
a position information acquisition unit that acquires position information of the vehicle; and
a vehicle information acquisition unit that acquires vehicle information,
wherein the mode determination unit determines the mode of the response voice based on the map information stored in the map storage unit, the position information acquired by the position information acquisition unit, and the vehicle information acquired by the vehicle information acquisition unit.

9. The vehicle voice response system according to any one of the preceding claims, further comprising:
a position information acquisition unit that acquires position information of the vehicle; and
a traffic jam information acquisition unit that acquires traffic jam information,
wherein the mode determination unit determines the mode of the response voice based on the position information acquired by the position information acquisition unit and the traffic jam information acquired by the traffic jam information acquisition unit.

10. The vehicle voice response system according to any one of the preceding claims, further comprising:
a position information acquisition unit that acquires position information of the vehicle; and
a movement history storage unit that stores a movement history of the vehicle,
wherein the mode determination unit determines the mode of the response voice based on the position information acquired by the position information acquisition unit and the movement history stored in the movement history storage unit.

11. The vehicle voice response system according to claim 10, further comprising a vehicle information acquisition unit that acquires vehicle information,
wherein the mode determination unit determines the mode of the response voice further based on the vehicle information acquired by the vehicle information acquisition unit.

12. The vehicle voice response system according to any one of claims 1 to 11, further comprising a passenger sensor that detects a passenger,
wherein the mode determination unit determines the mode of the response voice based on a detection signal of the passenger sensor.

A vehicle information acquisition unit for acquiring vehicle information;
The response voice generation unit determines a timing at which the response voice output unit outputs a response voice based on the vehicle information acquired by the vehicle information acquisition unit. The voice response system for a vehicle according to item.

14. The vehicle voice response system according to any one of claims 1 to 13, further comprising a voice analysis unit that analyzes the voice signal generated by the voice input unit,
wherein the information processing unit operates the external device based on the analysis result of the voice analysis unit.

15. A voice response program for outputting a response voice for a vehicle, the program causing a computer to execute:
a voice input step of inputting a user's voice;
an information processing step of operating an external device based on the voice input in the voice input step;
a mode determination step of determining a mode of the response voice;
a response voice generation step of generating the response voice based on the voice input in the voice input step and the mode of the response voice determined in the mode determination step; and
a response voice output step of outputting the response voice.