JP2019049742A

JP2019049742A - Voice response device

Info

Publication number: JP2019049742A
Application number: JP2018206748A
Authority: JP
Inventors: Takeyoshi Kondo; Takeo Nozawa; Kenji Takenaka; Kenji Mizuno; Hiroshi Maekawa; Takeshi Kawanishi; Shigeru Hayashi; Tatsumi Kuroda
Original assignee: ADC Technology Inc
Current assignee: ADC Technology Inc
Priority date: 2012-08-10
Filing date: 2018-11-01
Publication date: 2019-03-28
Also published as: JP2018036653A; JPWO2014024751A1; JP2020194184A; WO2014024751A1

Abstract

To make a voice response device that responds by voice to input voice easier to use. SOLUTION: A voice response system 100, in which a server 90 generates an appropriate response to voice input at a terminal device 1 and the terminal device 1 outputs the response as voice, includes: a voice feature recording unit that records the features of the input voice; a voice match determination unit that determines whether the features of the input voice match the features of a voice previously recorded by the voice feature recording unit; and a voice output unit that, when the voice match determination unit determines that the voice features do not match, outputs a response different from the one output when the features are determined to match. SELECTED DRAWING: Figure 1

Description

Cross-reference to related applications

This international application claims priority based on Japanese Patent Application No. 2012-178454, filed with the Japan Patent Office on August 10, 2012, the entire contents of which are incorporated into this international application by reference.

The present invention relates to a voice response device that responds by voice to input voice.

As a voice response device of this kind, one is known that searches a dictionary for an answer to an input question and outputs the retrieved answer by voice (see, for example, Patent Document 1). A technique for generating an answer to a question based on the content of a dialogue with the user is also known (see, for example, Patent Document 2).

Patent Document 1: Japanese Patent No. 4832097
Patent Document 2: Japanese Patent No. 4924950

The above techniques are merely configured to give, for each question, the single answer specified by the dictionary.
One aspect of the present invention is to make a voice response device that responds by voice to input voice more convenient for the user.

The invention of the first aspect is a voice response device that responds by voice to input voice, comprising:
a voice feature recording unit that records features of the input voice;
a voice match determination unit that determines whether the features of the input voice match the features of a voice previously recorded by the voice feature recording unit; and
a voice output unit that, when the voice match determination unit determines that the voice features do not match, outputs a response different from the one output when the voice features are determined to match.

With such a voice response device, when the person who input the voice differs from the previous speaker, the device can return a response different from the one returned when the speaker is the same as before. This makes the device more convenient for the user than one that gives the same answer regardless of whether the speaker has changed.
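The first aspect can be illustrated with a minimal Python sketch. It assumes that voice features are represented as a short numeric vector and that two voices "match" when their Euclidean distance is within a tolerance. The class name, function names, tolerance, and response strings are all hypothetical; the source does not specify how features are compared.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VoiceFeatureRecorder:
    """Stands in for the voice feature recording unit (hypothetical representation)."""
    recorded: list = field(default_factory=list)

    def record(self, features):
        self.recorded.append(tuple(features))


def features_match(features, recorder, tolerance=0.1):
    """Stands in for the voice match determination unit.

    Assumption: a match means the Euclidean distance to any previously
    recorded feature vector is within `tolerance`.
    """
    return any(math.dist(features, known) <= tolerance for known in recorder.recorded)


def respond(features, recorder):
    """Stands in for the voice output unit: the response differs on a mismatch."""
    matched = features_match(features, recorder)
    recorder.record(features)
    if matched:
        return "Welcome back."
    return "Nice to meet you."
```

Calling `respond` twice with nearly identical feature vectors yields the "matched" response the second time, while a clearly different vector yields the "not matched" response.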

In the above voice response device, as in the invention of the second aspect, the device may further include:
a person identification unit that identifies the person who input the voice based on the features of the input voice; and
a control unit that controls a controlled unit in accordance with the input voice,
wherein, when the control unit receives contradictory instructions from different persons, it gives precedence to the instruction from the person ranked higher in a priority order set in advance for each person.

With such a voice response device, the controlled unit can be controlled according to the priority order even when contradictory instructions are received from different persons.
When contradictory instructions are received, the device may point out the contradiction in its voice response or propose an alternative. When proposing an alternative, it may output a response that takes the weather and similar factors into account.
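The priority rule of the second aspect can be sketched as follows. This is only an illustration: the tuple layout, the person names, and the convention that a smaller rank number means higher priority are assumptions made for the example, not details given in the source.

```python
def resolve_instructions(instructions, priority):
    """Pick one instruction per controlled target.

    instructions: list of (person, target, value) tuples in arrival order.
    priority: dict mapping person -> rank, where 1 is the highest priority
              (hypothetical convention).
    Returns a dict mapping target -> (person, value) for the winning instruction.
    """
    chosen = {}
    for person, target, value in instructions:
        current = chosen.get(target)
        # Replace the current choice only if the new speaker ranks strictly higher.
        if current is None or priority[person] < priority[current[0]]:
            chosen[target] = (person, value)
    return chosen
```

For example, if a child asks to turn the air conditioner on and a higher-ranked parent asks to turn it off, the parent's instruction is the one applied. With equal ranks, this sketch keeps the earlier instruction.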

Furthermore, as in the invention of the third aspect, the above voice response device may include:
a person identification unit that identifies the person who input the voice based on the features of the input voice; and
a schedule recording unit that records, for each person, a schedule based on the input voice.

With such a voice response device, schedules can be managed for each person.
In this invention, the schedule recording unit, without the person identification unit, may be made dependent on the invention of the second aspect. The priority of an appointment may also be changed according to its attributes. Appointment attributes are classified, for example, by whether the appointment can be changed (whether changing it affects another party).

Depending on the schedule attributes, a previously registered schedule may be changed, or a schedule registered later may be placed in a free time slot. When registering a schedule, the device may also take into account the locations of the preceding and following schedules, look up the time needed to travel between them, and register the later schedule with that travel time taken into account.
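The travel-time consideration can be sketched in Python as below. It assumes the travel time between the two locations has already been looked up and is passed in as minutes; the function names are hypothetical, and how the device obtains the travel time is not specified in the source.

```python
from datetime import datetime, timedelta


def earliest_start(previous_end: datetime, travel_minutes: int) -> datetime:
    """Earliest time the later schedule can begin after travelling from the previous location."""
    return previous_end + timedelta(minutes=travel_minutes)


def place_schedule(previous_end: datetime, travel_minutes: int, requested_start: datetime) -> datetime:
    """Keep the requested start if it is reachable; otherwise push it back to the earliest reachable time."""
    return max(requested_start, earliest_start(previous_end, travel_minutes))
```

With a previous schedule ending at 12:00 and a 45-minute trip, a request for 12:30 would be moved to 12:45, while a request for 14:00 would be kept as is.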

When several persons share the same schedule (the time slot in which an appointment takes place), as when several persons managed by the device hold a meeting, the device may search for a time slot in which all of their schedules are free and set the meeting in that slot. If no free slot exists, already registered schedules may be changed according to their schedule attributes.
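The search for a shared free slot can be sketched as follows. The sketch assumes hourly slots, a working day from 9:00 to 18:00, and busy intervals given as (start hour, end hour) pairs; these are all simplifications chosen for illustration rather than details from the source.

```python
def common_free_hours(schedules, day_start=9, day_end=18):
    """Return the hours at which every person is free.

    schedules: dict mapping person -> list of (start_hour, end_hour) busy
               intervals, where end_hour is exclusive (hypothetical layout).
    """
    free = []
    for hour in range(day_start, day_end):
        busy = any(
            start <= hour < end
            for intervals in schedules.values()
            for start, end in intervals
        )
        if not busy:
            free.append(hour)
    return free
```

An empty result corresponds to the case in which no free slot exists, where the text suggests changing an already registered schedule according to its attributes.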

When a schedule is changed in this way, it is preferable to output a voice response to that effect.
In the above voice response device, as in the invention of the fourth aspect, when the input voice cannot be understood (that is, when the text obtained by converting it can be presumed to contain errors as a sentence), the device may make an inquiry to a predetermined contact. In doing so, position information may be used to identify the source or the destination of the inquiry.

With such a voice response device, for example, when what a child says cannot be understood, the device can ask the mother, and when what an elderly person says cannot be understood, it can ask that person's family. Because the correct content can then be entered by another person, the accuracy of the input voice can be ensured.

Furthermore, as in the invention of the fifth aspect, the above voice response device may be provided with a plurality of databases prepared in advance according to age information indicating the age or age group of the user (the person who input the voice), select the database to use according to the user's age information, and recognize the voice using the selected database.

With such a voice response device, the database referred to during voice recognition changes with the user's age. If frequently used words and turns of phrase are registered for each age, the accuracy of voice recognition can be improved.

In the above voice response device, as in the invention of the sixth aspect, the age of the user (the person who input the voice) may be estimated, and the estimated age may be used as the age information.
The user's age may be estimated, for example, from the features of the input voice (voice waveform, pitch, and so on), or by capturing an image of the user's face with an imaging unit such as a camera while the user inputs the voice.

When the user's face is imaged, user identification or age authentication may also be performed.
Furthermore, the present invention may be applied to a face-to-face device such as an automated teller machine. In that case, the invention can be used for identity verification such as age authentication.

The present invention may also be applied to a vehicle. In that case, the configuration for identifying a person can be used in place of the vehicle key.
Although the above invention has been described as a voice response device, it may also be configured as a voice recognition device having a configuration for recognizing input voice. The invention of each aspect need not presuppose the other inventions and can, as far as possible, stand as an independent invention.

FIG. 1 is a block diagram showing the schematic configuration of a voice response system to which the present invention is applied.
FIG. 2 is a block diagram showing the schematic configuration of a terminal device.
FIG. 3 is a flowchart showing the voice response terminal process executed by the MPU of the terminal device.
FIG. 4 is a flowchart showing the voice response server process (part 1) executed by the arithmetic unit of the server.
FIG. 5 is a flowchart showing the voice response server process (part 2).
FIG. 6A is an explanatory diagram showing the voice recognition DB. FIG. 6B is an explanatory diagram showing the priority DB.
FIG. 7 is a flowchart showing the schedule input process within the voice response server process.
FIG. 8 is an explanatory diagram showing an example of a schedule recorded in the schedule DB.
FIG. 9 is an explanatory diagram showing an example of schedule attributes.
FIG. 10 is a flowchart showing the operation input process within the voice response server process.
FIG. 11 is a flowchart showing the change confirmation process within the voice response server process.
FIG. 12 is a flowchart showing the period designation process within the voice response server process.
FIG. 13 is a flowchart showing the operation input process of a modification.

Embodiments of the present invention are described below with reference to the drawings.
[Configuration of this embodiment]
The voice response system 100 to which the present invention is applied is configured so that the server 90 generates an appropriate response to voice input at the terminal device 1 and the terminal device 1 outputs that response as voice. When the input voice contains a command, the system outputs a control command to the target device (controlled unit 95). It also has a function for managing the user's schedule.

Specifically, as shown in FIG. 1, the voice response system 100 is configured so that a plurality of terminal devices 1, various devices such as air conditioners mounted in vehicles (controlled units 95), and the server 90 can communicate with one another via communication base stations 80 and 81 and the Internet 85. A terminal device 1 may also be configured to communicate directly with other terminal devices 1 or with a controlled unit 95.

The server 90 has the functions of an ordinary server device. In particular, the server 90 includes an arithmetic unit 101 and various databases (DBs). The arithmetic unit 101 is configured as a well-known arithmetic device having a CPU and memories such as ROM and RAM. Based on programs in the memory, it performs various processes such as communicating with the terminal devices 1 and other devices via the Internet 85, reading and writing data in the various DBs, and performing voice recognition and response generation for conversing with the user of a terminal device 1.

As shown in FIG. 1, the various DBs include a voice recognition DB 102, a predictive conversion DB 103, a voice DB 104, a response candidate DB 105, a personality DB 106, a learning DB 107, a preference DB 108, a news DB 109, a weather DB 110, a priority DB 111, a schedule DB 112, a terminal information DB 113, an emotion determination DB 114, a health determination DB 115, a report destination DB 117, and others. The details of these DBs are described where each process is explained.

Next, as shown in FIG. 2, the terminal device 1 is configured with a behavior sensor unit 10, a communication unit 50, a notification unit 60, and an operation unit 70 provided in a predetermined housing.
The behavior sensor unit 10 includes a well-known MPU 31 (microprocessor unit), a memory 39 such as ROM and RAM, and various sensors. So that the sensor elements constituting the various sensors can properly detect their measurement targets (humidity, wind speed, and so on), the MPU 31 performs processing such as driving a heater to bring a sensor element to its optimum temperature.

As its various sensors, the behavior sensor unit 10 includes a three-dimensional acceleration sensor 11 (3DG sensor), a three-axis gyro sensor 13, a temperature sensor 15 disposed on the back of the housing, a humidity sensor 17 disposed on the back of the housing, a temperature sensor 19 disposed on the front of the housing, a humidity sensor 21 disposed on the front of the housing, an illuminance sensor 23 disposed on the front of the housing, a wetness sensor 25 disposed on the back of the housing, a GPS receiver 27 that detects the current location of the terminal device 1, and a wind speed sensor 29.

The behavior sensor unit 10 also includes an electrocardiographic sensor 33, a heart sound sensor 35, a microphone 37, and a camera 41 as sensors. The temperature sensors 15 and 19 and the humidity sensors 17 and 21 measure the temperature or humidity of the air outside the housing.

The three-dimensional acceleration sensor 11 detects the acceleration applied to the terminal device 1 in three mutually orthogonal directions (the vertical direction (Z direction), the width direction of the housing (Y direction), and the thickness direction of the housing (X direction)) and outputs the detection result.

The three-axis gyro sensor 13 detects the angular velocity applied to the terminal device 1 about the vertical direction (Z direction) and about two arbitrary directions orthogonal to it (the width direction of the housing (Y direction) and the thickness direction of the housing (X direction)), taking counterclockwise rotation in each direction as positive, and outputs the detection result.

The temperature sensors 15 and 19 each include, for example, a thermistor element whose electrical resistance changes with temperature. In this embodiment, the temperature sensors 15 and 19 detect temperature in degrees Celsius, and all temperatures in the following description are given in degrees Celsius.

The humidity sensors 17 and 21 are configured, for example, as well-known polymer film humidity sensors. Such a sensor is configured as a capacitor in which the amount of moisture contained in the polymer film, and hence the dielectric constant, changes with the relative humidity.

The illuminance sensor 23 is configured, for example, as a well-known illuminance sensor having a phototransistor.
The wind speed sensor 29 is, for example, a well-known wind speed sensor that calculates the wind speed from the electric power (amount of heat dissipated) required to keep a heater at a predetermined temperature.

The heart sound sensor 35 is configured as a vibration sensor that captures the vibration caused by the beating of the user's heart. The MPU 31 considers both the detection result of the heart sound sensor 35 and the heart sound input from the microphone 37 to distinguish vibration and noise caused by the heartbeat from other vibration and noise.

The wetness sensor 25 detects water droplets on the surface of the housing, and the electrocardiographic sensor 33 detects the user's heartbeat.
The camera 41 is disposed inside the housing of the terminal device 1 so that its imaging range covers the outside of the terminal device 1. In particular, in this embodiment, the camera 41 is disposed at a position from which it can image the user of the terminal device 1.

The communication unit 50 includes a well-known MPU 51, a wireless telephone unit 53, and a contact memory 55, and is configured to be able to acquire detection signals from the various sensors of the behavior sensor unit 10 via an input/output interface (not shown). The MPU 51 of the communication unit 50 executes processing according to the detection results of the behavior sensor unit 10, input signals received via the operation unit 70, and programs stored in a ROM (not shown).

Specifically, the MPU 51 of the communication unit 50 functions as a motion detection device that detects specific motions performed by the user, as a positional relationship detection device that detects the positional relationship with the user, and as an exercise load detection device that detects the load of exercise performed by the user, and it also performs the function of transmitting the processing results of the MPU 51.

The wireless telephone unit 53 is configured to be able to communicate with, for example, a mobile phone base station. The MPU 51 of the communication unit 50 outputs its processing results to the notification unit 60 or transmits them via the wireless telephone unit 53 to a preset destination (a contact recorded in the contact memory 55).

The contact memory 55 functions as a storage area for storing position information on the places the user visits. The contact memory 55 also holds information on the contacts (telephone numbers and the like) to be notified when something abnormal happens to the user.

The notification unit 60 includes a display 61 configured, for example, as an LCD or organic EL display, an illumination 63 made up of LEDs capable of emitting, for example, seven colors of light, and a speaker 65. Each part of the notification unit 60 is driven and controlled by the MPU 51 of the communication unit 50.

Next, the operation unit 70 includes a touch pad 71, a confirmation button 73, a fingerprint sensor 75, and a rescue request lever 77.
The touch pad 71 outputs a signal corresponding to the position and pressure of a touch by the user (the user, the user's guardian, or the like).

The confirmation button 73 is configured so that the contacts of a built-in switch close when the user presses it, allowing the communication unit 50 to detect that the confirmation button 73 has been pressed.

The fingerprint sensor 75 is a well-known fingerprint sensor configured to read a fingerprint using, for example, an optical sensor. In place of the fingerprint sensor 75, any means capable of recognizing a physical characteristic of a person (a means capable of biometric authentication, that is, of identifying an individual), such as a sensor that recognizes the shape of the veins in the palm, may be adopted.

The operation unit 70 also includes the rescue request lever 77, which connects to a predetermined contact when operated.
[Processing of this embodiment]
The processing performed in the voice response system 100 is described below.

The voice response terminal process performed by the terminal device 1 accepts voice input from the user, sends the voice to the server 90, and, on receiving from the server 90 the response to be output, plays that response back as voice. The process starts when the user indicates via the operation unit 70 that voice input is to be performed.

Specifically, as shown in FIG. 3, the terminal device first enables input from the microphone 37 (ON state) (S2) and starts imaging (recording) with the camera 41 (S4). It then determines whether there has been voice input (S6).

If there is no voice input (S6: NO), it determines whether a timeout has occurred (S8). Here, a timeout means that the permitted waiting time has been exceeded; in this case the permitted time is set to, for example, about 5 seconds.

If a timeout has occurred (S8: YES), the process moves to S30, described later. If no timeout has occurred (S8: NO), the process returns to S6.
If there is voice input (S6: YES), the voice is recorded in memory (S10), and it is determined whether the voice input has ended (S12). The voice input is judged to have ended when the voice has been interrupted for a certain time or longer, or when the user indicates via the operation unit 70 that voice input is to end.
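The wait-with-timeout pattern of S6 and S8 can be sketched in Python as below. The 5-second default follows the example value in the text; the polling interval, the function name, and the injectable `clock` and `sleep` parameters are assumptions added so the sketch is self-contained and testable.

```python
import time

TIMEOUT_SECONDS = 5.0  # the text gives "about 5 seconds" as an example


def wait_for(condition, timeout=TIMEOUT_SECONDS, poll_interval=0.01,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout` seconds have elapsed.

    Returns True if the condition became true (like S6: YES) and
    False if the wait timed out (like S8: YES).
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if condition():
            return True
        sleep(poll_interval)
    return False
```

The same helper shape would also fit the later waits for server data, which the text describes with the same timeout check.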

If the voice input has not ended (S12: NO), the process returns to S10. If the voice input has ended (S12: YES), the terminal device transmits data such as an ID identifying itself, the voice, and the captured image to the server 90 as packets (S14). The data transmission may instead be performed between S10 and S12.

Next, it is determined whether the data transmission has been completed (S16). If transmission has not been completed (S16: NO), the process returns to S14.
If transmission has been completed (S16: YES), it is determined whether the data (packets) sent by the voice response server process, described later, have been received (S18). If no data has been received (S18: NO), it is determined whether a timeout has occurred (S20).

If a timeout has occurred (S20: YES), the process moves to S30, described later. If no timeout has occurred (S20: NO), the process returns to S18.
If data has been received (S18: YES), the packets are received (S22). In this step, the response to the character information is obtained.

It is then determined whether reception has been completed (S24). If reception has not been completed (S24: NO), it is determined whether a timeout has occurred (S26).
If a timeout has occurred (S26: YES), the fact that an error has occurred is output via the notification unit 60, and the voice response terminal process ends. If no timeout has occurred (S26: NO), the process returns to S22.

If reception has been completed (S24: YES), the response based on the received packets is output as voice from the speaker 65 (S28). When this processing finishes, the voice response terminal process ends.
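The overall flow of the voice response terminal process (S2 to S28) can be summarized in a short Python sketch. Each step is passed in as a callable so the control flow can be read on its own; all function and enum names are hypothetical. Because S30 is not detailed in this portion of the text, the sketch only reports that the flow would continue there.

```python
from enum import Enum


class Outcome(Enum):
    SPOKEN = "response spoken (S28)"
    GO_TO_S30 = "timed out while waiting (S8 or S20); continue at S30"
    ERROR_REPORTED = "reception timed out (S26); error reported"


def run_terminal_process(capture_voice, send, wait_for_reply, receive_all,
                         speak, report_error):
    """Walk through the terminal-side flow with injected step functions."""
    voice = capture_voice()            # S2 to S12; None stands for the S8 timeout
    if voice is None:
        return Outcome.GO_TO_S30
    send(voice)                        # S14 and S16: transmit until complete
    if not wait_for_reply():           # S18 and S20: wait for data from the server
        return Outcome.GO_TO_S30
    reply = receive_all()              # S22 to S26; None stands for the S26 timeout
    if reply is None:
        report_error("reception timed out")
        return Outcome.ERROR_REPORTED
    speak(reply)                       # S28: output the response as voice
    return Outcome.SPOKEN
```

Injecting the steps keeps the sketch independent of any particular microphone, network, or speaker API, none of which the source specifies.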

Next, the voice response server process performed by the server 90 (external device) is described with reference to FIG. 4. The voice response server process receives voice from a terminal device 1, performs voice recognition to convert the voice into character information, generates a response to the voice, and returns the response to the terminal device 1.

In detail, as shown in FIG. 4 (and FIG. 5), the voice response server process first determines whether a packet has been received from any terminal device 1 (S42). If no packet has been received (S42: NO), S42 is repeated.

If a packet has been received (S42: YES), the terminal device 1 with which the server is communicating is identified (S44). In this step, the terminal device 1 is identified by the terminal device ID contained in the packet.
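As a rough illustration of S14 and S44, the sketch below builds a packet carrying a terminal ID, voice data, and image data, and then looks the terminal up by that ID on the server side. The JSON layout, the base64 encoding, the field names, and the lookup table are all assumptions made for this example; the source does not describe the packet format.

```python
import base64
import json


def build_packet(terminal_id: str, voice: bytes, image: bytes) -> bytes:
    """Terminal side (S14): bundle the ID, voice, and captured image (hypothetical format)."""
    payload = {
        "terminal_id": terminal_id,
        "voice": base64.b64encode(voice).decode("ascii"),
        "image": base64.b64encode(image).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")


def identify_terminal(packet: bytes, known_terminals: dict):
    """Server side (S44): identify the sender from the ID in the packet.

    known_terminals is a hypothetical mapping of terminal ID -> terminal record.
    Returns None when the ID is unknown.
    """
    payload = json.loads(packet.decode("utf-8"))
    return known_terminals.get(payload["terminal_id"])
```

Returning `None` for an unknown ID is a choice made for this sketch; the source does not say how an unrecognized terminal is handled.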

Next, the image captured by the camera 41 and contained in the packet is acquired (S70), and the features of the voice contained in the packet are detected (S72). In this step, features such as the characteristics of the voice waveform (voiceprint) and the pitch of the voice are detected.

Next, the age group of the person who input the voice is identified from the captured image of the user and from the voice features (S74). In this process, tendencies relating voice features to age groups are stored in advance in the voice recognition DB 102, and the age group is identified by referring to the voice recognition DB 102. A known technique for estimating the user's age from a captured image is used in combination.

Next, the person is identified from these voice features (S76). The voice recognition DB 102 stores in advance the voice features of each person in association with that person's name, and in this process the person is identified by referring to the voice recognition DB 102.

When the voice features of each person are recorded, for example, the user may be asked to input only a name by voice or text, and the voice features may be captured and recorded when the name is input or at the time of a subsequent voice input. A technique for personal authentication based on images may also be used in combination in this embodiment.
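The text does not say how stored voice features are compared in S76. As a purely illustrative sketch, the lookup could be modeled as a nearest-neighbor search over numeric feature vectors; the feature layout, the Euclidean distance, the `max_distance` threshold, and the function name `identify_speaker` are all assumptions introduced here and are not taken from the source.

```python
import math

def identify_speaker(features, enrolled, max_distance=1.0):
    """Return the enrolled name whose stored features are closest, or None.

    `enrolled` maps a person's name to a stored feature vector (a stand-in
    for the per-person entries of the voice recognition DB 102).
    """
    best_name, best_dist = None, math.inf
    for name, reference in enrolled.items():
        distance = math.dist(features, reference)  # Euclidean distance (assumed metric)
        if distance < best_dist:
            best_name, best_dist = name, distance
    # No sufficiently close match: treat the speaker as not previously recorded.
    return best_name if best_dist <= max_distance else None
```

A `None` result corresponds to the case in which the input voice features do not match any previously recorded features.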

Next, the voice and the detected voice features are recorded in the voice recognition DB 102 (S78), and the database to be used for voice recognition is selected (S80). As shown in FIG. 6A, the voice recognition DB 102 includes an infant DB for users up to 4 years old, a child DB for ages 5 to 10, a teen DB for teenagers (10 to 19), a young adult DB for those in their twenties (20 to 29), a middle-aged DB for those in their thirties to fifties (30 to 59), and a senior DB for those in their sixties and over.

Each DB includes a dictionary database that associates voice waveforms with characters (sounds or words) so that voice can be recognized as characters. In each DB, the users' manner of speaking (tendencies in voice features), the words that tend to be used by the age group, and the like are recorded as information that differs for each age group.

In particular, the age range covered by each DB is set narrower for younger ages. This is because younger people change their manner of speaking and coin new words more readily, and the narrower ranges allow the system to respond promptly to these changes.

In the process of S80, the one database that matches the age group (one of those shown in FIG. 6A) is selected and set according to the estimated age of the user. Next, the voice included in the packet is recognized (S46).
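The database selection of S80 can be sketched as a lookup over the age bands listed for FIG. 6A. This is a minimal sketch: the database names and the function `select_recognition_db` are hypothetical, and because the text lists both "5 to 10" and "10 to 19", assigning age 10 to the child DB is an assumption made here.

```python
# Upper age bound (inclusive) and a hypothetical name for each database.
# Bands are checked in order, so age 10 falls into "child" (an assumption).
AGE_BANDS = [
    (4, "infant"),        # up to 4 years old
    (10, "child"),        # 5 to 10
    (19, "teen"),         # teenagers
    (29, "young_adult"),  # twenties
    (59, "middle_aged"),  # thirties to fifties
]

def select_recognition_db(estimated_age: int) -> str:
    """Return the name of the age-specific database to use for recognition."""
    if estimated_age < 0:
        raise ValueError("age must be non-negative")
    for upper_bound, name in AGE_BANDS:
        if estimated_age <= upper_bound:
            return name
    return "senior"  # sixties and over
```

Note how the bands widen with age (5 years, 6 years, 10 years, 10 years, 30 years), which mirrors the statement that younger age groups are given narrower ranges.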

Here, the predictive conversion DB 103 associates each word with words that tend to be used after it. In this process, a known voice recognition process is performed by referring to the selected database in the voice recognition DB 102 and to the predictive conversion DB 103, and the voice is converted into character information.

Next, objects in the captured image are identified by performing image processing on the captured image (S48). Then, the user's emotion is determined based on the voice waveform, the endings of words, and the like (S50).

In this process, the emotion determination DB 114, in which voice waveforms (tone of voice), word endings, and the like are associated with emotion categories such as normal, anger, joy, confusion, sadness, and elation, is referred to in order to determine which category the user's emotion falls into, and the determination result is recorded in memory. Next, by referring to the learning DB 107, words that this user often speaks are searched for, and portions of the character information generated by voice recognition that were ambiguous are corrected.

The learning DB 107 records, for each user, characteristics of the user such as frequently spoken words and pronunciation habits. Data is added to and corrected in the learning DB 107 through conversations with the user. The predictive conversion DB 103, the emotion determination DB 104, and the like may also hold data divided by age group, as the voice recognition DB 102 does.

Next, the corrected character information is specified as the input character information (S54). Then, it is determined whether, as a result of these processes, the voice could be recognized as character information (S82).

In this process, if the sentence is defective (for example, grammatically incorrect), the voice is regarded as not having been recognized even if the sentence is complete. If the voice could not be recognized as character information (S82: NO), an inquiry is made (S84) by transmitting a predetermined voice message (for example, "The following words could not be recognized. The recorded voice will now be played back, so please speak the correct sentence.") together with the voice input by the user to a predetermined contact registered in advance in the report destination DB 117 (a contact set for each terminal device 1).

For example, when the user is a child with poor articulation and the voice cannot be recognized as character information, an inquiry is made to the mother's terminal device 1 registered as the predetermined contact; when the user is an elderly person, an inquiry is made to that person's family.

Next, it is determined whether a packet containing voice input by the inquiry destination has been received (S86). If no such packet has been received (S86: NO), this process is repeated. If the packet has been received (S86: YES), the process returns to S54.

If the voice was recognized as character information in S82 (S82: YES), it is determined whether the character information is a schedule input (S88). If it is not a schedule input (S88: NO), the process proceeds to S92 described later.

If it is a schedule input (S88: YES), a schedule input process for managing schedules is performed (S90). In this process, as shown in FIG. 7, the schedule of the specific person for whom the schedule is to be input is first extracted (S102).

In this process, as shown in FIG. 8, schedule data in which specific persons and times are arranged in a matrix is extracted from the schedule DB 112 (S102), and the input appointment (including the time slot, the content of the appointment, and the location) is provisionally registered (S104).

Next, it is determined whether the appointment conflicts with another appointment (S106). For example, as shown in FIG. 8, Mr. A already has a meeting registered from 10:00 on September 1; if an instruction is given to enter another appointment at this same time, it is determined that there is a conflict.

If there is a conflict (S106: YES), the process proceeds to S128 described later. If there is no conflict (S106: NO), the locations of the preceding and following appointments are extracted (S108).

Next, taking into account the times and locations of the preceding and following appointments, the travel time needed to reach the location of the provisionally registered appointment is calculated (S110). In this process, the time required for travel is computed using, for example, a known transit route guidance program. For example, traveling from Marunouchi in Tokyo to Nagoya requires a travel time of about two hours.

Next, it is determined whether travel between the location of the provisionally registered appointment and the locations of the preceding and following appointments is possible (S122). In this process, the time required for travel is compared with the length of the free time, and travel is deemed possible if the free time is longer.

If travel is possible (S122: YES), the appointment is formally registered in the schedule DB 112 (S124), the completion of registration is recorded (S126), and the schedule input process ends.
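The conflict check of S106 and the travel feasibility check of S108 to S122 can be sketched as follows. This is a minimal sketch under stated assumptions: the `Event` type and the function names are hypothetical, `travel_time` is a caller-supplied stand-in for the transit route guidance program, and treating free time exactly equal to the travel time as sufficient is an assumption, since the text only says that travel is possible when the free time is longer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Event:
    start: datetime
    end: datetime
    place: str

def conflicts(new: Event, existing: list) -> bool:
    """S106: True if the new event overlaps any registered event."""
    return any(new.start < e.end and e.start < new.end for e in existing)

def can_travel(new: Event, existing: list, travel_time) -> bool:
    """S108-S122: compare travel time with the free time on each side.

    Assumes `new` does not overlap anything in `existing`.
    """
    before = [e for e in existing if e.end <= new.start]
    after = [e for e in existing if e.start >= new.end]
    if before:
        prev = max(before, key=lambda e: e.end)  # immediately preceding event
        if new.start - prev.end < travel_time(prev.place, new.place):
            return False
    if after:
        nxt = min(after, key=lambda e: e.start)  # immediately following event
        if nxt.start - new.end < travel_time(new.place, nxt.place):
            return False
    return True
```

With a `travel_time` that returns two hours between Marunouchi and Nagoya, an appointment in Nagoya starting one hour after a Marunouchi meeting would be rejected, while one starting three hours later would be accepted.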

If travel is not possible (S122: NO), it is determined whether the preceding or following appointment, or the provisionally registered appointment, can be changed (S128). Each appointment recorded in the schedule DB 112 has an appointment attribute, and, as shown in FIG. 9, the appointment attribute is set to a level according to importance. For example, level A corresponds to appointments with customers (promised meetings), and such appointments cannot be changed.

Level B corresponds to appointments other than those with customers, for example in-house appointments, and such appointments also cannot be changed. Level C corresponds to private matters, and such appointments can be changed.

In this process, when an appointment is registered (in the process of S124), its appointment attribute is recognized from its content and is registered as well. For a provisionally registered appointment, the appointment attribute is recognized in this process.

If the preceding or following appointment, or the provisionally registered appointment, can be changed (S128: YES), a proposed change is presented (S130). The proposed change moves a changeable appointment (that is, one belonging to level C) so that there is no conflict and the user (the person concerned) can travel between the locations of the appointments.
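The changeability test of S128 reduces to a lookup of the attribute level described for FIG. 9. The sketch below is illustrative only; the mapping name `CHANGEABLE_BY_LEVEL` and the function `changeable_titles` are hypothetical, and appointments are represented simply as a title-to-level mapping.

```python
# Levels as described in the text: A (customer) and B (in-house) are fixed,
# C (private) may be moved.
CHANGEABLE_BY_LEVEL = {"A": False, "B": False, "C": True}

def changeable_titles(candidates: dict) -> list:
    """Return the titles of the appointments whose level allows a change.

    `candidates` maps an appointment title to its attribute level and holds the
    preceding, following, and provisionally registered appointments.
    """
    return [title for title, level in candidates.items()
            if CHANGEABLE_BY_LEVEL[level]]
```

S128 would then be answered YES when the returned list is non-empty, and the listed appointments are the ones a proposed change may move.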

Then, the change flag is set to ON (S132), and the schedule input process ends.

If neither the preceding or following appointment nor the provisionally registered appointment can be changed (S128: NO), the fact that the appointments overlap is recorded (S134), and the schedule input process ends.

When the schedule input process ends, the process returns to FIG. 5, and it is determined whether the character information is an operation input, that is, a command for operating the controlled unit 95 (S92). If it is not an operation input (S92: NO), the process proceeds to S96 described later.

If it is an operation input (S92: YES), an operation input process is performed (S94). This process controls the operation of the controlled unit 95 according to the input voice. Specifically, as shown in FIG. 10, the content of the command is first recognized (S202). Examples of command content include changing the reception channel or volume of a television receiver serving as the controlled unit 95, and raising the set temperature of a vehicle air conditioner serving as the controlled unit 95 by 1 °C.

Next, it is determined whether there has been a past command to the same controlled unit 95 (for example, one issued within a predetermined past period, such as the last 10 minutes) (S204). If there is no past command to the same controlled unit 95 (S204: NO), the process proceeds to S216 described later.

If there is a past command to the same controlled unit 95 (S204: YES), this past command is extracted (S206), and it is determined whether the new command contradicts it (S208). A contradiction arises, for example, when the controlled unit 95 is a vehicle air conditioner and a command to raise the set temperature by 1 °C is input after a past command to lower the set temperature by 1 °C.

Likewise, when the controlled unit 95 is a television receiver, a contradiction arises when a command to change to another reception channel is received immediately after the reception channel was changed, or when a command to change the volume again is input immediately after the volume was changed.

If there is no contradiction (S208: NO), the process proceeds to S216. If there is a contradiction (S208: YES), it is determined whether the contradictory commands were input by the same person (S210). If they were input by different persons (S210: NO), the priorities of the persons who input the contradictory commands are acquired (S212).

As shown in FIG. 6B, the priority DB 111 records persons in association with priorities. For example, when Mr. A and Mr. B each input contradictory commands, Mr. A's rank of first and Mr. B's rank of fourth are acquired from the priority DB 111.

Next, the command from the person with the highest priority is set. For example, if Mr. A, whose priority is first, commands that the set temperature of the vehicle air conditioner be raised by 1 °C, and Mr. B, whose priority is fourth, commands that it be lowered by 1 °C, Mr. A's command is applied and Mr. B's command is invalidated.

Then, the set command is transmitted to the controlled unit 95 (S218), and the operation input process ends. If it is determined in S210 that the contradictory commands were input by the same person (S210: YES), the most recently input command is set (S216), the process of S218 described above is performed, and the operation input process ends.
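The arbitration of S204 to S216 can be sketched as a single function that decides which command is to be sent in S218. This is a minimal sketch under assumptions: the `Command` type, the function name `select_command`, and the `contradicts` predicate are hypothetical, a smaller rank number means a higher priority (as in the first-place and fourth-place example), and the 600-second window corresponds to the "within 10 minutes" example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass(frozen=True)
class Command:
    speaker: str
    device: str
    action: str
    time_s: float  # time of input, in seconds

def select_command(new: Command,
                   past: Optional[Command],
                   rank: Dict[str, int],
                   contradicts: Callable[[Command, Command], bool],
                   window_s: float = 600.0) -> Command:
    """Return the command to transmit to the controlled unit."""
    # S204: no recent command to the same device, so use the new command.
    if past is None or past.device != new.device or new.time_s - past.time_s > window_s:
        return new
    # S208: no contradiction, so use the new command.
    if not contradicts(past, new):
        return new
    # S210: same speaker, so the most recent command wins.
    if past.speaker == new.speaker:
        return new
    # S212 onward: different speakers, so the higher-priority speaker wins.
    return min((past, new), key=lambda c: rank[c.speaker])
```

When the higher-priority speaker issued the earlier command, the function returns that earlier command, which matches the example in which Mr. A's command is applied and Mr. B's is invalidated.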

When the operation input process ends, the process returns to FIG. 5, and it is determined whether the change flag is set to ON (S96). If the change flag is OFF (S96: NO), the process proceeds to S56 described later.

If the change flag is ON (S96: YES), a change confirmation process is performed (S98). The change confirmation process confirms, after a proposed schedule change has been presented, whether the user agrees to change the schedule as proposed.

In the change confirmation process, as shown in FIG. 11, it is first determined whether there has been a reply to the proposed change (S402). If there has been no reply (S402: NO), it is determined whether registration of the appointment in question has been completed (S404), that is, whether registration has been completed by, for example, the user re-entering the appointment.

If registration of the appointment has been completed (S404: YES), the change flag is set to OFF (S412), and the change confirmation process ends. If registration of the appointment has not been completed (S404: NO), the change confirmation process ends.

If there has been a reply to the proposed change in S402 (S402: YES), it is determined whether the reply accepts the proposed change, for example "That's fine" (S406). If the reply accepts the proposed change (S406: YES), the proposed change is registered as the schedule (S408), and the completion of registration is recorded (S410). Then, the process of S412 described above is performed, and the change confirmation process ends.

If the reply rejects the proposed change, for example "That won't do" (S406: NO), another proposed change is presented (S414), and the change confirmation process ends.

When the change confirmation process ends, the process returns to FIG. 5, and a response is acquired from the response candidate DB 105 by searching it for a sentence similar to the character information, used as the input (S56). In the response candidate DB 105, each input (character information) is uniquely associated with an output (response).

For example, when a schedule has been input and registration has been completed, a response such as "Registration is complete." is output; when an overlap has been recorded, a response such as "The appointments overlap." is output. When the change flag has been set to ON, a response concerning the generated proposed change is output, such as "Given the preceding and following appointments, you cannot travel in time for this appointment. How about ... instead?"

When a command has been input and control is performed as instructed, a response such as "Understood." is output; when control cannot be performed as instructed, a response such as "The instructions contradict each other." is output.

When the character information "Today's weather in ※" is input, the voice "Today's weather in ※ is ※" is output. The "※" portions are obtained by accessing the weather DB 110, in which area names are associated with weather forecasts for the next several days in each area.
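The placeholder substitution in the weather example can be sketched as a pattern match followed by a database lookup. This is only an illustration: the dictionary `WEATHER_DB` is a hypothetical in-memory stand-in for the weather DB 110, the English sentence pattern is an assumption, and the function name `respond_weather` is not from the source.

```python
import re
from typing import Optional

# Hypothetical stand-in for the weather DB 110: area name -> today's forecast.
WEATHER_DB = {"Nagoya": "sunny", "Tokyo": "cloudy"}

# Input pattern with one placeholder (the area name).
PATTERN = re.compile(r"^today's weather in (?P<area>.+)$", re.IGNORECASE)

def respond_weather(text: str) -> Optional[str]:
    """Fill the response template, or return None if the input does not apply."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    area = match.group("area")
    forecast = WEATHER_DB.get(area)
    if forecast is None:
        return None  # unknown area: no response from this template
    return f"Today's weather in {area} is {forecast}."
```

In this sketch the first placeholder is copied from the input and only the second is filled from the database.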

Next, the response content is converted into voice (S62). In this process, the response content (character information) is output as voice based on the database stored in the voice DB 104.

Then, the generated response (voice) is transmitted as packets to the terminal device 1 of the communication partner (S64). The packets may be transmitted while the voice of the response content is still being generated.

Next, the conversation content is recorded (S68). In this process, the input character information and the output response content are recorded in the learning DB 107 as conversation content. At this time, keywords included in the conversation content (words recorded in the voice recognition DB 102), pronunciation characteristics, and the like are recorded in the learning DB 107.

When this processing ends, the voice response server process ends.

[Effects of this embodiment]

In the voice response system 100 described in detail above, the server 90 (arithmetic unit 101) records the features of the input voice and determines whether the features of the input voice match the features of previously recorded voice. If the server 90 determines that the voice features do not match, it causes a response to be output that differs from the response output when the voice features are determined to match.

According to this voice response system 100, when the person who input the voice differs from the previous person, a response can be returned that differs from the response returned when the person is the same as before. Usability for the user is therefore improved compared with a system that gives the same answer regardless of whether the person who input the voice is the same as before.

In the voice response system 100, the server 90 identifies the person who input the voice based on the features of the input voice, and controls the controlled unit 95 according to the input voice. When the server 90 receives contradictory instructions from different persons, it performs control giving precedence to the instruction from the person with the higher priority, according to priorities set in advance for each person.

According to this voice response system 100, even when contradictory instructions are received from different persons, the controlled unit 95 can be controlled according to the priorities.

Furthermore, in the voice response system 100, the server 90 records a schedule based on the input voice for each person.

According to this voice response system 100, schedules can be managed for each person.

In the voice response system 100, the server 90 changes the priority of an appointment according to its attribute. Appointment attributes are classified, for example, by whether the appointment can be changed (whether a change would affect the other party).

Depending on the appointment attributes, a previously registered appointment is changed, or an appointment registered later is placed in a free time slot. When an appointment is registered, the locations of the preceding and following appointments are considered, the time needed to travel between them is looked up, and the later appointment is registered with this travel time taken into account.

In the voice response system 100, when several persons share the same appointment, as when several persons managed by the system 100 hold a meeting, the server 90 searches for a time slot in which all of these persons are free and sets the meeting in that slot. If no slot is free, an already registered appointment is changed according to its appointment attribute.

According to this voice response system, usability can be further improved.

Furthermore, in the voice response system 100, when changing a schedule in this way, the server 90 outputs a voice response to that effect. According to this voice response system, the user's confirmation can be obtained when a schedule is changed.

In the voice response system 100, when the input voice cannot be understood (that is, when the sentence obtained by conversion into characters contains an error), the server 90 inquires of a predetermined contact about the content of the utterance. The voice that could not be understood is recorded and transmitted to the predetermined contact, and the person at this contact is asked to input the voice again.

According to this voice response system 100, the accuracy of the input voice can be ensured by, for example, inquiring of the mother when what a child says cannot be understood, or inquiring of an elderly person's family when what the elderly person says cannot be understood. In this case, position information may be used to identify the inquiry source and the inquiry destination.

Furthermore, in the voice response system 100, the server 90 is provided in advance with a plurality of databases prepared according to age information indicating the age or age group of the user (the person who inputs voice); the server 90 selects the database to be used according to the user's age information and recognizes the voice according to the selected database.

According to this voice response system 100, the database referred to during voice recognition is changed according to age, so the accuracy of voice recognition can be improved by registering words, turns of phrase, and the like that are frequently used at each age.

In the voice response system 100, the age of the user (the person who input the voice) is estimated, and the estimated age is used as the age information.

The user's age may be estimated, for example, from the features of the input voice (voice waveform, voice pitch, and the like), or by capturing an image of the user's face with an imaging unit such as a camera when the user inputs the voice.

When the user's face is imaged, identification of the user and age verification are also performed.

According to this voice response system 100, voice can be recognized more accurately.

[Other embodiments]

Embodiments of the present invention are not limited in any way to the embodiment described above, and may take various forms as long as they fall within the technical scope of the present invention.

For example, in the schedule input process shown in FIG. 7, a period (dates, time slots, and the like) may be designated, as in "between September 1 and September 3", and the setting of the schedule may be left to the voice response system 100. In this case, between the process of S102 and the process of S104, it is determined whether there has been a schedule setting request that designates a period, such as "A one-hour meeting with Mr. B and Mr. C on September 1." (S103).

If there is no such schedule setting request (S103: NO), the processing from S104 onward described above is performed. If there is such a schedule setting request (S103: YES), a period designation process is performed (S136), and when that process ends, the schedule input process ends.

In the period designation process, as shown in FIG. 12, it is first determined whether the input request requires the schedules of a plurality of persons to be coordinated. For example, if Mr. A inputs "A one-hour meeting with Mr. B and Mr. C on September 1.", the schedules of Mr. B and Mr. C must be consulted in addition to that of Mr. A, so it is determined that coordination of the schedules of a plurality of persons is required.

If the schedules of a plurality of persons need to be coordinated (S302: YES), the schedules of the target persons other than the user (Mr. B and Mr. C, who are involved in the schedule, as opposed to Mr. A, who input the voice) are extracted (S304), and the process proceeds to S306.

If the schedules of a plurality of persons do not need to be coordinated (S302: NO), it is determined whether, within the designated period, there is a time at which all the target persons are free (S306).

If there is a time at which everyone is free (S306: YES), the appointment is registered in the schedule DB 112 (S308), the fact that registration is complete is recorded (S310), and the schedule input process ends. If there is no time at which everyone is free (S306: NO), information (time, place) about the preceding and following appointments is extracted (S312).

Subsequently, it is determined whether the preceding and following appointments can be changed (S314). If they can be changed (S314: YES), a proposed change is presented (S316).

Then, the change flag is set to ON (S318), and the schedule input process ends. If the preceding and following appointments cannot be changed (S314: NO), the fact that the appointments overlap is recorded (S320), and the schedule input process ends.
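The branching of the period designation process (S306 to S320) can be illustrated with a short Python sketch. This is only one possible reading of the flowchart, not the implementation in the specification: the function names, the representation of each person's appointments as `(start, end)` tuples, and the boolean `can_reschedule` (standing in for the S314 decision, whose criteria the text does not specify) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def find_common_slot(busy_by_person, period_start, period_end, duration):
    """Return the earliest start time within the period at which every
    person is free for `duration`, or None if no such time exists (S306)."""
    # Pool everyone's busy intervals and sort them by start time.
    busy = sorted(iv for ivs in busy_by_person.values() for iv in ivs)
    candidate = period_start
    for start, end in busy:
        if end <= candidate:
            continue  # this appointment ends before the candidate start
        if start - candidate >= duration:
            break  # the gap before this appointment is long enough
        candidate = max(candidate, end)  # skip past the appointment
    if candidate + duration <= period_end:
        return candidate
    return None


def designate_period(busy_by_person, period_start, period_end, duration,
                     can_reschedule):
    """Hypothetical mapping of the S306-S320 branches to return values."""
    slot = find_common_slot(busy_by_person, period_start, period_end, duration)
    if slot is not None:
        return ("registered", slot)       # S308, S310
    if can_reschedule:                    # S314: YES (assumed input)
        return ("change_proposed", None)  # S316, S318
    return ("conflict", None)             # S314: NO, S320


day = datetime(2012, 9, 1)
busy = {
    "A": [(day.replace(hour=9), day.replace(hour=10))],
    "B": [(day.replace(hour=10), day.replace(hour=11))],
    "C": [(day.replace(hour=13), day.replace(hour=14))],
}
result = designate_period(busy, day.replace(hour=9), day.replace(hour=17),
                          timedelta(hours=1), can_reschedule=False)
print(result)
```

With the sample data, the earliest hour at which A, B, and C are all free is 11:00, so the request falls into the "registered" branch.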

Although the processing of S110 and S112 is omitted in the period designation process, these steps may be performed.

Furthermore, in the operation input process, as shown in FIG. 13, a weather forecast may be acquired (S232) and an alternative plan may be set according to the weather forecast (S234), in place of the processing of S212 and S214. For example, if the acquired weather forecast indicates that the temperature is likely to rise, lowering the set temperature of the air conditioner is proposed, and if the temperature is likely to fall, raising the set temperature of the air conditioner is proposed. If rain is likely, closing the windows is proposed.
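The forecast-based alternative setting (S234) amounts to a small rule table, which might be sketched in Python as follows. The function name, the `temp_trend` values, and the `rain_expected` flag are hypothetical; the specification does not describe the format in which the weather forecast is obtained.

```python
def propose_alternatives(temp_trend, rain_expected):
    """Return proposals derived from a forecast (sketch of S234).

    temp_trend: "rising", "falling", or "steady" (assumed encoding).
    rain_expected: True if rain is likely (assumed encoding).
    """
    proposals = []
    if temp_trend == "rising":
        proposals.append("lower the set temperature of the air conditioner")
    elif temp_trend == "falling":
        proposals.append("raise the set temperature of the air conditioner")
    if rain_expected:
        proposals.append("close the windows")
    return proposals


print(propose_alternatives("rising", rain_expected=True))
```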

This arrangement also improves usability.

In the above embodiment, voice recognition is used as the means of inputting character information. However, input is not limited to voice recognition and may be made using input means (operation unit 70) such as a keyboard or a touch panel. Further, although the operation of converting the input voice into character information is performed by the server 90, it may be performed by the terminal device 1.

Furthermore, in the voice response system 100, the computing unit 101 may learn (record and analyze) the user's behavior (conversations, places visited, things captured by the camera) and supplement what the user leaves unsaid in conversation.

For example, suppose that the user answers "I'd rather have curry." to the question "Is hamburger steak all right for today?". If the device adds "Because it was hamburger steak yesterday.", the reason the user said that curry would be better is conveyed.

Such a configuration can also be used during a telephone call, and the device may be configured to join the user's conversation on its own initiative.

Furthermore, in the voice response system 100, the server 90 may acquire response candidates from a predetermined server or from the Internet.

According to such a voice response system 100, response candidates can be acquired not only from the server 90 but also from any device connected via the Internet, a dedicated line, or the like.

Furthermore, the present invention may be applied to a face-to-face device such as an automated teller machine. In this case, the present invention can be used for identity verification, such as age verification.

The present invention may also be applied to a vehicle. In this case, the configuration for identifying a person can be used in place of the vehicle key. Although the invention has been described above as the voice response system 100, it may also be configured as a voice recognition device that recognizes input voice.

In the above embodiment, the system is configured as a so-called cloud system in which the server 90 performs the main processing while the terminal device 1 and the server 90 communicate with each other. However, some or all of the processing (the processing shown in the flowcharts) may be performed by the terminal device 1. In this case, the processing related to communication between the terminal device 1 and the server 90 can be omitted.

The controlled unit 95 may be any device that performs control in accordance with an external command.

Furthermore, in the voice response system 100, the voice that is output may include an identification sound, that is, a sound indicating that the voice is machine generated. This makes it possible to distinguish machine-generated sound from human speech. In this case, the identification sound preferably includes an identifier indicating which device produced the voice, so that the sources of a plurality of types of machine-generated sound can be identified.

Such an identification sound may be audible or inaudible. When the identification sound is inaudible, the identifier may be embedded in the voice using digital watermarking technology.

In the above embodiment, a response corresponding to the input voice is output as voice. However, the input is not limited to voice, and a response corresponding to another form of input may be output as voice. For example, a camera that detects changes in the shape of the user's mouth may be provided, together with means for estimating from the shape of the mouth what words the user is speaking.

In this case, the correspondence between mouth shapes and sounds may be prepared as a database, so that sounds are estimated from the mouth shapes and words are estimated from those sounds. With such a configuration, the user can make a voice input without actually producing any sound.

The shape of the mouth may also be used as an aid when input is made by voice. In this way, voice recognition can be performed more reliably even if the user's articulation is poor.

Furthermore, in case the user cannot input voice, the system may be configured so that the user can make an input in place of voice by selecting an entry from the user's input history on the display. In this case, the history may simply be displayed from newest to oldest, or the entries may be displayed in descending order of estimated likelihood of use, taking into account the frequency with which each entry in the history has been used, the time of day at which it was input, and the like.
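The ordering of history entries by estimated likelihood of use can be sketched in Python as follows. The scoring rule here (usage count plus the number of past uses within a few hours of the current time of day, with ties broken by recency) is purely an illustrative assumption; the specification only states that usage frequency and input time zone may be taken into account. The function name `rank_history` and the `(text, hour)` record format are likewise hypothetical.

```python
from collections import Counter


def rank_history(history, current_hour, hour_window=2):
    """Order distinct past inputs from most to least likely to be reused.

    history: list of (text, hour_of_day) tuples, oldest first.
    """
    freq = Counter(text for text, _ in history)
    near = Counter()
    for text, hour in history:
        diff = abs(hour - current_hour)
        # Compare hours on a 24-hour circle so 23:00 is close to 01:00.
        if min(diff, 24 - diff) <= hour_window:
            near[text] += 1
    # Index of the most recent occurrence, used as a tie-breaker.
    last_index = {text: i for i, (text, _) in enumerate(history)}
    return sorted(freq, key=lambda t: (-(freq[t] + near[t]), -last_index[t]))


history = [
    ("turn on the light", 7),
    ("play music", 20),
    ("turn on the light", 8),
    ("open the window", 7),
]
print(rank_history(history, current_hour=7))
```

Setting `hour_window` to a negative value disables the time-of-day term, leaving a ranking by frequency and then recency.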

When the terminal device 1 is mounted on a vehicle, the vehicle may be configured to respond only to a call from its owner (the user) among the calls addressed to it, and to perform a specific operation such as unlocking. In this way, the voice can be used as a key, and even if the owner loses sight of the vehicle in a large parking lot or the like, the owner can find it by calling out to it.

[Relationship between the configuration of the present invention and the configuration of the embodiment]

The voice response system 100 in the present embodiment corresponds to an example of the voice response device of the present invention. Among the processes executed by the server 90, the process of S74 corresponds to an example of the person specifying unit of the present invention, and the process of S78 corresponds to an example of the voice feature recording unit of the present invention.

Further, the process of S210 corresponds to an example of the voice match determination unit of the present invention, and the processes of S214 and S216 correspond to an example of the voice output unit of the present invention. The processes of S208 and S218 correspond to an example of the control unit of the present invention, and the process of S90 corresponds to an example of the schedule recording unit of the present invention.

Description of symbols

1: terminal device, 10: behavior sensor unit, 11: three-dimensional acceleration sensor, 13: three-axis gyro sensor, 15: temperature sensor, 17: humidity sensor, 19: temperature sensor, 21: humidity sensor, 23: illuminance sensor, 25: wetness sensor, 27: GPS receiver, 29: wind speed sensor, 33: electrocardiogram sensor, 35: heart sound sensor, 37: microphone, 39: memory, 41: camera, 50: communication unit, 53: wireless telephone unit, 55: contact memory, 60: notification unit, 61: display, 63: illumination, 65: speaker, 70: operation unit, 71: touch pad, 73: confirmation button, 75: fingerprint sensor, 77: rescue request lever, 80: communication base station, 85: Internet, 90: server, 95: controlled unit, 100: voice response system, 101: computing unit.

Claims

A voice response device for making a voice response to input voice, comprising:
a voice feature recording unit that records features of the input voice;
a voice match determination unit that determines whether the features of the input voice match the features of voice previously recorded by the voice feature recording unit; and
a voice output unit that, when the voice match determination unit determines that the voice features do not match, outputs a response different from the response output when the voice features are determined to match.

The voice response device according to claim 1, further comprising:
a person specifying unit that specifies the person who input a voice based on the features of the input voice; and
a control unit that controls a controlled unit in accordance with the input voice,
wherein, when the control unit receives contradictory instructions from different persons, the control unit performs control giving priority to the instruction from the person ranked higher in a priority order set in advance for each person.

The voice response device according to claim 1, further comprising:
a person specifying unit that specifies the person who input a voice based on the features of the input voice; and
a schedule recording unit that records, for each person, a schedule based on the input voice.