JP6969811B2

JP6969811B2 - Voice response device

Info

Publication number: JP6969811B2
Application number: JP2019208039A
Authority: JP
Inventors: 勉足立; 丈誠横井; 茂林; 健純近藤; 辰美黒田; 大介毛利; 豪生野澤; 謙史竹中; 毅川西; 健司水野; 博司前川; 誠岩田
Original assignee: Ｃａｓｅ特許株式会社
Priority date: 2012-06-18
Filing date: 2019-11-18
Publication date: 2021-11-24
Anticipated expiration: 2033-05-29
Also published as: JP2018049285A; JP2017215602A; JP6669951B2; JP2018092179A; JP2023079225A; JP2022062200A; JP6267636B2; JP2018136541A; JP2018136540A; JP7531241B2; JP2018136545A; WO2013190963A1; JP6751865B2; JP7231289B2; JP2021184111A; JP2018136546A; JP2019179243A; JP6552123B2; JP2020038387A; JPWO2013190963A1

Description

Cross-reference of related applications

This international application claims priority based on Japanese Patent Application No. 2012-137065, Japanese Patent Application No. 2012-137066, and Japanese Patent Application No. 2012-137067, filed with the Japan Patent Office on June 18, 2012, and the entire contents of those applications are incorporated into this international application by reference.

The present invention relates to a voice response device that responds by voice to input character information.

As such a voice response device, one is known that searches a dictionary for an answer to an input question and outputs the retrieved answer by voice (see, for example, Patent Document 1). A technique for generating an answer to a question based on the content of a dialogue with the user is also known (see, for example, Patent Document 2).

Patent Document 1: Japanese Patent No. 4832097
Patent Document 2: Japanese Patent No. 4924950

In the above techniques, each question is simply given the single answer specified by the dictionary.
One aspect of the present invention is to improve usability for the user in a voice response device that responds by voice to input voice.

One aspect of the present disclosure is a voice response device comprising:
a voice response unit configured to respond by voice to input voice;
a voice setting unit configured to set, by voice, that automatic conversation is to be performed; and
a transmission unit configured to transmit to a server, if automatic conversation is set to ON, notice that the automatic conversation mode has been set, together with an ID identifying the device itself.
Another aspect of the present disclosure, the invention of the first aspect, is a voice response device that responds by voice to input character information, comprising:
response acquisition means for acquiring a plurality of different responses to the character information; and
voice output means for outputting the plurality of different responses in respectively different voice tones.

According to such a voice response device, a plurality of responses can be output in different voice tones, so even when the answer to one piece of character information cannot be narrowed down to a single one, the different answers can be output in different voice tones in a way that is easy for the user to follow. Usability for the user can therefore be improved.

The voice response device of the present invention may be configured, for example, as a terminal device carried by the user, or as a server that communicates with that terminal device. The character information may be input using input means such as a keyboard, or may be input by converting voice into character information.

In the above voice response device, as in the invention of the second aspect, the device may comprise:
voice input means for the user to input voice; and
voice transmission means for transmitting the input voice to an external device, the input voice being converted into character information and the external device generating a plurality of different responses to the character information and transmitting them to the voice response device,
wherein the response acquisition means acquires the responses from the external device.

According to such a voice response device, voice can be input at the voice response device, so the character information can be input by voice. In addition, because the responses can be generated in the external device, the processing load on the voice response device can be reduced.

In the voice transmission means, the operation of "converting the input voice into character information" may be performed by the voice response device or by the external device.
Further, in the above voice response device, as in the invention of the third aspect,
the voice response device or the external device may comprise response recording means in which, for each of a plurality of pieces of character information, a plurality of different responses including a positive response and a negative response to that character information are recorded,
the response acquisition means may acquire the positive response and the negative response as the plurality of different responses, and
the voice output means may reproduce the positive response and the negative response in different voice tones.

According to such a voice response device, responses from different standpoints, such as a positive response and a negative response, can be reproduced in different voice tones, so the voice can be reproduced as if different people were speaking. This makes it less likely that the user listening to the voice will find it unnatural.

The voice tone may also be changed according to the type of response or the wording used in the response. For example, a response in a gentle tone may be reproduced in a calm female voice, and a response in a fierce tone may be reproduced in a brave male voice. In other words, response content and personality may be associated with each other, and the voice tone set according to the personality.
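
Purely as an illustration, the association between personality and voice tone could be sketched in Python as follows. The `Voice` class, the personality labels, and the table contents are hypothetical names chosen for this sketch and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    gender: str
    style: str


# Hypothetical table: personality label -> voice used for playback.
VOICE_BY_PERSONALITY = {
    "gentle": Voice("female", "calm"),
    "fierce": Voice("male", "brave"),
}
# Assumed fallback for personalities that have no entry in the table.
DEFAULT_VOICE = Voice("neutral", "plain")


def assign_voices(responses):
    """Pair each (personality, text) response with the voice for that personality."""
    return [
        (VOICE_BY_PERSONALITY.get(personality, DEFAULT_VOICE), text)
        for personality, text in responses
    ]
```

With such a table, a positive response tagged "gentle" and a negative response tagged "fierce" would be handed to the speech output stage with different voices, which is the effect the passage describes.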

Further, as in the invention of the fourth aspect, the above voice response device may be configured for use at the reception desk of a workplace or company, or may be configured to convey, on the user's behalf, something that the user finds difficult to say to someone directly.

When the voice response device is used at a reception desk, the names and company names of people who come for sales purposes are recorded in advance in the voice response device or the external device, and a response is generated so that, when a visitor at reception gives one of these names or company names, a voice message declining the visit is reproduced.

When the device is configured to convey something difficult to say on the user's behalf, the user may, for example, tell the device before a date what he or she wants to say that day, and the voice response device then says it instead (reproduces the voice) at an appropriate timing (for example, at a preset time, or when a certain time has elapsed since the conversation broke off).

Alternatively, the device may be configured to say something that prompts the difficult topic, for example, "Come to think of it, didn't you say you had something to tell her?" In other words, instead of outputting the response immediately, the response may be output when a reproduction condition is satisfied, such as after a certain time has elapsed.
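
The deferred output described here could be sketched as below. This is a minimal illustration, assuming only the two example conditions mentioned in the text (a preset time, or a period of silence after the conversation breaks off); the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeferredMessage:
    text: str
    play_at: Optional[float] = None         # absolute time in seconds, or None
    silence_needed: Optional[float] = None  # required seconds of silence, or None

    def ready(self, now: float, last_speech_at: float) -> bool:
        """Return True once either configured reproduction condition holds."""
        if self.play_at is not None and now >= self.play_at:
            return True
        if (
            self.silence_needed is not None
            and now - last_speech_at >= self.silence_needed
        ):
            return True
        return False
```

A device loop would call `ready` periodically with the current time and the time speech was last detected, and reproduce `text` the first time it returns True.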

Further, in the above voice response device, as in the invention of the fifth aspect, the external device or the voice response device may acquire information for generating a response to the character information from another voice response device. Also, as in the invention of the sixth aspect, when information for generating a response to character information is requested by another voice response device, the voice response device may return information corresponding to the request.

In this case, the voice response device may be provided with sensors for detecting position information, temperature, humidity, illuminance, noise level, and the like, and with databases such as dictionary information, and may extract the necessary information in response to a request.

According to such a voice response device (external device), information for generating a response can be acquired from another voice response device. In this case, information specific to the other voice response device, such as its position, can be acquired.

The voice response device can also transmit information specific to itself to other voice response devices.
Further, in the above voice response device, as in the invention of the seventh aspect, a response output by the device itself or by another voice response device (for example, a positive response or a negative response) may be input as character information, and a response rebutting that response may be generated. From the user's standpoint, this makes it possible to listen to a debate between opinions for and against, and the user can make a final decision after hearing that debate.

This configuration can be realized using one or more voice response devices. When a plurality of voice response devices exchange voice, the voice may be input and output directly, or wireless or other communication may be used.

The invention of the eighth aspect is a voice response device that responds by voice to input character information, comprising:
personality information acquisition means for acquiring personality information in which the personality of the user, or of a related person (a person who has a relationship with the user), is associated according to preset categories;
response acquisition means for acquiring response candidates representing a plurality of different responses to the character information; and
voice output means for selecting, according to the personality information, a response to be output from the response candidates and outputting the selected response.

According to such a voice response device, different responses can be given depending on the personality of the user or of a person related to the user (a related person). Usability for the user can therefore be improved.

In the above voice response device, as in the invention of the ninth aspect,
the device may comprise first personality information generation means for generating personality information of the user or the related person based on answers to a plurality of preset questions, and
the personality information acquisition means may acquire the personality information generated by the personality information generation means.

According to such a voice response device, the personality information can be generated in the voice response device. Well-known personality analysis techniques (the Rorschach test, the Szondi test, and the like) may be used to generate the personality information, as may the aptitude test techniques that companies and other organizations use in recruitment examinations.

Further, in the above voice response device, as in the invention of the tenth aspect,
the device may comprise second personality information generation means for generating personality information of the user or the related person based on a character string included in the input character information, and
the personality information acquisition means may acquire the personality information generated by the personality information generation means.

According to such a voice response device, the personality information can be generated in the course of the user's use of the voice response device.
In the above voice response device, as in the invention of the eleventh aspect,
the device may comprise preference information generation means for generating, based on a character string included in the character information, preference information indicating the tendency of the preferences of the user or the related person, and
the voice output means may select a response to be output from the response candidates based on the preference information and output the selected response.
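
One very simple way to picture preference information built from character strings is to count how often known keywords appear in the user's utterances and prefer candidates on the most-mentioned topic. The sketch below assumes exactly that; keyword counting is a stand-in chosen for illustration, not the method of the specification, and all function names are hypothetical.

```python
from collections import Counter


def update_preferences(prefs: Counter, utterance: str, vocabulary) -> None:
    """Count occurrences of known preference keywords in one utterance."""
    for word in utterance.lower().split():
        token = word.strip(".,!?")
        if token in vocabulary:
            prefs[token] += 1


def select_by_preference(candidates, prefs: Counter) -> str:
    """Return the text of the (topic, text) candidate mentioned most often.

    On a tie, the earliest candidate in the list is returned.
    """
    return max(candidates, key=lambda candidate: prefs[candidate[0]])[1]
```

For a user who has mentioned curry more often than hamburger steak, `select_by_preference` returns the curry-related candidate.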

According to such a voice response device, responses can be given according to the preferences of the user or the related person.
Further, in the above voice response device, as in the invention of the twelfth aspect, the user's behavior (conversations, places visited, things captured by a camera) may be learned (recorded and analyzed) so as to fill in what the user leaves unsaid in conversation.

For example, in a conversation where the user answers "I'd prefer curry" to the question "Is hamburger steak all right for today?", the device can add "Because we had hamburger steak yesterday," which conveys the reason the user said curry would be better.

Such a configuration can also be used during a telephone call, and the device may be configured to join the user's conversation of its own accord.
Further, in the above voice response device, as in the invention of the thirteenth aspect, the device may comprise
response candidate acquisition means for acquiring response candidates from a predetermined server or from the Internet.

According to such a voice response device, response candidates can be acquired not only from the device itself or the external device but also from any device connected via the Internet, a dedicated line, or the like.
In the above voice response device, as in the invention of the fourteenth aspect, the device may comprise
character information generation means for converting an action by the user into character information.

Here, the actions referred to in the present invention are those arising from muscle movement, such as conversation, handwriting of characters, and gestures (for example, sign language).
According to such a voice response device, the user's actions can be converted into character information.

Further, in the above voice response device, as in the invention of the fifteenth aspect,
the character information generation means may convert the voice uttered by the user into character information and accumulate the user's speaking habits (such as pronunciation habits) as learning information (that is, capture the characteristics and record them).

According to such a voice response device, the character information can be generated based on the learning information, so the accuracy of character information generation can be improved.
In the above voice response device, as in the invention of the sixteenth aspect, the device may comprise
transfer means for transferring the learning information to another voice response device.

According to such a voice response device, the learning information recorded by this voice response device can be used even when the user uses another voice response device. The accuracy of character information generation can therefore be improved on the other voice response device as well.

Further, in the above voice response device, as in the invention of the seventeenth aspect, either the user's behavior or the user's operations may be detected, and learning information or personality information may be generated based on what is detected.

According to such a voice response device, for example, when it is detected that the user has dashed onto a train at the last moment several days in a row, the device can urge the user to leave home a few minutes earlier from the next day; and when it is detected from conversation that the user tends to anger easily, the device can output voice or music that calms the user's mood.

In the above voice response device, as in the invention of the eighteenth aspect, the device may comprise
other-device information acquisition means for acquiring, from another voice response device, information recorded in that other voice response device.

According to such a voice response device, a response can be generated based on information recorded in another voice response device.
Further, in the above voice response device, as in the invention of the nineteenth aspect, the device may comprise:
reproduction condition determination means for determining, when no character information is input, whether the state of the voice response device matches a reproduction condition set in advance as a condition for outputting voice; and
message reproduction means for outputting a preset message when the reproduction condition is met.

According to such a voice response device, voice can be output even when no character information is input (that is, even when the user does not speak to the device). For example, by compelling the user to speak, the device can be used as a countermeasure against drowsiness while driving. It can also be used to check on the safety of a person living alone by determining whether that person responds.

In the above voice response device, as in the invention of the twentieth aspect,
the message reproduction means may acquire news information and output a message about the news in the form of a question that asks for the user's answer.

According to such a voice response device, conversations about the news are possible, which keeps the conversation from always being the same. As an example of conversation content, when information on a company's stock price has been acquired, the device can say, "The stock price of XX Company rose by XX yen today. Did you know?"

Further, in the above voice response device, as in the invention of the 21st aspect,
the voice output means or the message reproduction means may add separately acquired external information (news, or environmental information such as temperature, weather, and position information) to a preset message and output the result.

According to such a voice response device, a response that combines a predetermined message with the acquired information can be output.
In the above voice response device, as in the invention of the 22nd aspect,
a plurality of messages may be acquired, and the message to be reproduced may be selected and output according to how frequently each message has been reproduced.

According to such a voice response device, frequently reproduced messages can be made less likely to be reproduced, giving message reproduction a degree of randomness, or frequently reproduced messages can deliberately be repeated to call attention to them or help fix them in memory.
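
The two selection policies described here (suppressing frequent messages, or deliberately repeating them) could be sketched with weighted random choice. The weighting formulas below are assumptions made for illustration; the specification states only that selection depends on reproduction frequency.

```python
import random


def selection_weights(play_counts, favor_frequent=False):
    """Map each message to a weight derived from how often it has been played.

    By default a message played c times gets weight 1 / (c + 1), so frequent
    messages become less likely. With favor_frequent=True the weight is c + 1,
    so frequent messages are repeated more often.
    """
    if favor_frequent:
        return {message: count + 1 for message, count in play_counts.items()}
    return {message: 1.0 / (count + 1) for message, count in play_counts.items()}


def pick_message(play_counts, favor_frequent=False, rng=random):
    """Choose one message at random by weight and record that it was played."""
    weights = selection_weights(play_counts, favor_frequent)
    messages = list(weights)
    chosen = rng.choices(messages, weights=[weights[m] for m in messages], k=1)[0]
    play_counts[chosen] += 1
    return chosen
```

Passing a seeded `random.Random` instance as `rng` makes the choice repeatable, which is convenient when checking the behavior.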

Further, in the above voice response device, as in the invention of the 23rd aspect, the device may comprise
unanswered-case transmission means for transmitting, when no answer to a response or message is obtained, information identifying the user and notice that no answer was obtained to a preset contact.

According to such a voice response device, the contact can be notified when no answer is obtained. This allows, for example, an abnormality affecting an elderly person living alone to be reported at an early stage.
In the above voice response device, as in the invention of the 24th aspect,
the message reproduction means may store the content of a conversation and ask a question intended to elicit the same content that was heard (memory confirmation processing).

According to such a voice response device, the user's memory can be checked and retention can be promoted.
Further, in the above voice response device, as in the invention of the 25th aspect, the device may comprise:
utterance accuracy detection means for detecting the degree of accuracy of the pronunciation and accent of the voice input by the user; and
accuracy output means for outputting the detected degree of accuracy.

According to such a voice response device, the accuracy of pronunciation and accent can be checked. This is useful, for example, when practicing a foreign language.
In the above voice response device, as in the invention of the 26th aspect,
the accuracy output means may output voice containing the closest word when the degree of accuracy is at or below a certain value.
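
As a rough illustration of the threshold behavior, the sketch below scores a recognized word against the intended word and, when the score is at or below a threshold, looks up the closest word in a vocabulary. It works on recognized text using Python's standard `difflib` module, which is an assumption made to keep the example self-contained; an actual device would evaluate pronunciation and accent from the audio itself. The function names and the threshold value are hypothetical.

```python
import difflib


def accuracy(spoken: str, target: str) -> float:
    """Similarity ratio in [0, 1] between the recognized and the intended word."""
    return difflib.SequenceMatcher(None, spoken, target).ratio()


def feedback(spoken, target, vocabulary, threshold=0.8):
    """Return (score, nearest_word); nearest_word is None above the threshold."""
    score = accuracy(spoken, target)
    if score > threshold:
        return score, None
    # cutoff=0.0 keeps every vocabulary entry in play, so the best match is
    # returned even when it is a poor one.
    nearest = difflib.get_close_matches(spoken, vocabulary, n=1, cutoff=0.0)
    return score, (nearest[0] if nearest else None)
```

The returned nearest word is what the device would then speak back to the user as the word it believes was closest to the utterance.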

According to such a voice response device, the user can check the accuracy of his or her pronunciation and accent.
Further, in the above voice response device, as in the invention of the 27th aspect,
the message reproduction means may output the same question again when the degree of accuracy is at or below a certain value.

According to such a voice response device, an accurate answer can be sought by outputting the same question again.
In the above voice response device, as in the invention of the 28th aspect, the device may comprise
connection control means for identifying a communication partner from the input character information and connecting the communication partner to a communication destination preset for each communication partner.

According to such a voice response device, reception work and telephone handling can be assisted.
In particular, in the above voice response device, as in the invention of the 29th aspect,
the connection control means may distinguish sales activity from ordinary visitors and, in the case of sales activity, reproduce a message declining it.

According to such a voice response device, people who might interfere with the user's work can be turned away without the user having to deal with them personally.
Further, in the above voice response device, as in the invention of the 30th aspect, a keyword included in the input character information (in particular, voice) may be extracted, and a connection may be made to the connection destination corresponding to the keyword. Keywords, such as the name of the person being sought, and their connection destinations may be associated with each other in advance.
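
A minimal sketch of keyword-based routing, combined with the sales screening of the 29th aspect, is given below. The directory entries, the registered sales caller, the extension identifiers, and the wording of the decline message are all hypothetical values invented for the example.

```python
# Hypothetical directory: keyword found in the visitor's speech -> destination.
DESTINATIONS = {"sales department": "ext-101", "tanaka": "ext-202"}

# Hypothetical list of pre-registered names that identify sales visitors.
SALES_CALLERS = {"acme sales"}


def route(utterance: str):
    """Return ("decline", message), ("connect", destination) or ("unknown", None)."""
    text = utterance.lower()
    # Screening comes first, so a registered sales visitor is declined even
    # when the utterance also contains a routable keyword.
    if any(name in text for name in SALES_CALLERS):
        return ("decline", "We are not accepting sales visits, thank you.")
    for keyword, destination in DESTINATIONS.items():
        if keyword in text:
            return ("connect", destination)
    return ("unknown", None)
```

Simple substring matching is used here only to keep the sketch short; the text to be matched would come from the speech-to-character-information conversion described earlier.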

According to such a voice response device, tasks such as transferring telephone calls and paging staff from reception can be assisted.
In the above voice response device, as in the invention of the 31st aspect, the matter the other party wishes to discuss may be recognized based on keywords, and a summary of what the other party said may be conveyed to the user.

According to such a voice response device, the work of acting as intermediary with customers can be assisted.
Further, in the above voice response device, as in the invention of the 32nd aspect, the device may comprise
emotion determination means for reading emotion from the tone of the voice input by the user and outputting which emotion it corresponds to among emotions including at least one of normal, anger, joy, confusion, sadness, and elation.

According to such a voice response device, a response can be output according to the user's emotion.
Next, the invention of the 33rd aspect comprises:
response generation means for generating, when the character information is input, a response according to a captured image of the surroundings of the voice response device; and
voice output means for outputting the response by voice.

According to such a voice response device, a response according to the captured image can be output by voice. Usability can therefore be improved compared with a configuration that generates responses from character information alone.

A specific configuration of the present invention is, for example, one in which character information asking the device to say what it recognizes is input, and the device outputs by voice what (or who) it has recognized from the captured image.

In the above voice response device, as in the invention of the 34th aspect, the device may comprise:
position specifying means for searching the captured image, by image processing, for an object included in the character information and specifying the position of the object found; and
guidance means for guiding the user to the position of the object.

According to such a voice response device, the user can be guided to an object in the captured image.
Further, the above-mentioned voice response device may include, as in the invention of the 35th aspect:
a voice input video acquisition means that acquires a moving image of the shape of the user's mouth while character information is being input by voice; and
a character information conversion means that converts the voice into character information and, based on the moving image, estimates unclear parts of the voice to correct the character information.

According to such a voice response device, the content of the utterance can be estimated from the shape of the mouth, so unclear parts of the voice can be estimated well.
Further, in the above-mentioned voice response device, as in the invention of the 36th aspect,
the message reproduction means may detect irritation or agitation of the user by detecting a voice uttered involuntarily, and may generate a message for calming the irritation or agitation.

According to such a voice response device, irritation or agitation of the user can be calmed when it occurs. This makes trouble between the user and the people around the user less likely.
Further, the above-mentioned voice response device may include, as in the invention of the 37th aspect,
a route information acquisition means that, when guidance to a destination is provided, acquires route information such as the weather, temperature, humidity, traffic information, and road surface condition on the way to the destination,
and the message reproduction means may output the route information by voice.

According to such a voice response device, the user can be notified by voice of the conditions on the way to the destination (route information).
Further, the above-mentioned voice response device may include, as in the invention of the 38th aspect:
a line-of-sight detection means that detects the user's line of sight; and
a line-of-sight movement request transmitting means that, when the user's line of sight does not move to a predetermined position in response to a call by the message reproduction means, outputs a voice requesting the user to move the line of sight to the predetermined position.

According to such a voice response device, the user can be made to look at a specific position. Safety checks while driving a vehicle, for example, can thus be carried out reliably.
The above-mentioned voice response device may also include, as in the invention of the 39th aspect,
a change request transmitting means that observes the position of body parts and the facial expression and, when there is little change in response to the call, outputs a voice requesting the user to change the position of the body parts or the facial expression.

According to such a voice response device, the user can be led to move a part of the body to a specific position or to make a specific facial expression. The present invention can be used when driving a vehicle, during a physical examination, and the like.

Further, the above-mentioned voice response device may include, as in the invention of the 40th aspect:
a broadcast program acquisition means that acquires the same broadcast program as the one the user is viewing; and
a broadcast program complementing means that, when the broadcast program is interrupted, complements the interrupted program by outputting the broadcast program acquired by the device itself.

According to such a voice response device, the broadcast program the user is viewing can be supplemented so that it is not interrupted.
Further, the above-mentioned voice response device may include, as in the invention of the 41st aspect,
a lyrics addition means that, when the user sings lyrics to a piece of music that has no lyrics, compares the music with lyrics against the lyrics sung by the user and outputs the lyrics by voice in the parts where only the user's lyrics are missing.

According to such a voice response device, the parts the user cannot sing in so-called karaoke (parts where the lyrics break off) can be filled in.
Further, the above-mentioned voice response device may include, as in the invention of the 42nd aspect,
a reading output means that, when the captured image contains characters and the user asks how to read them, acquires information on the characters from the outside and outputs by voice the reading of the characters contained in that information.

According to such a voice response device, the user can be taught how to read characters.
Further, the above-mentioned voice response device may include, as in the invention of the 43rd aspect,
a behavioral environment detection means that detects the user's behavior and the user's surrounding environment,
and the message generation means may generate a message according to the detected behavior and surrounding environment.

According to such a voice response device, dangerous places, off-limits areas, and the like can be announced. Abnormal behavior by the user can also be detected.
Further, the above-mentioned voice response device may include, as in the invention of the 44th aspect:
a health condition determination means that determines the user's health condition based on a captured image of the user; and
a health message generation means that generates a message according to the health condition.

According to such a voice response device, the health condition of the user can be managed.
Further, the above-mentioned voice response device may include, as in the invention of the 45th aspect,
a reporting means that reports to a predetermined contact when the health condition falls below a reference value.

According to such a voice response device, a report can be made when the user's health condition is at or below the reference value. An abnormality can therefore be made known to others sooner.
Further, in the above-mentioned voice response device, as in the invention of the 46th aspect, information about the user may be output in response to an inquiry from a person other than the user.

According to such a voice response device, if, for example, the user's meals, walking distance, and the like have been detected, questions asked at a hospital or the like can be answered on the user's behalf. The device may also learn the user's health condition, self-introduction, and the like.

The invention of each aspect need not presuppose the other inventions and can, as far as possible, be an independent invention.

A block diagram showing the schematic configuration of the voice response system to which the present invention is applied.
A block diagram showing the schematic configuration of the terminal device.
A flowchart showing the voice response terminal processing executed by the MPU of the terminal device.
A flowchart showing the voice response server processing executed by the calculation unit of the server.
An explanatory diagram showing an example of the response candidate DB.
A flowchart showing the automatic conversation terminal processing executed by the MPU of the terminal device.
A flowchart showing the automatic conversation server processing executed by the calculation unit of the server.
A flowchart showing the message terminal processing executed by the MPU of the terminal device.
A flowchart showing the message server processing executed by the calculation unit of the server.
A flowchart showing the guidance terminal processing executed by the MPU of the terminal device.
A flowchart showing the guidance server processing executed by the calculation unit of the server.
A flowchart showing the reception processing executed by the calculation unit of the server.
A flowchart showing the information providing terminal processing executed by the MPU of the terminal device.
An explanatory diagram showing an example of the personality DB.
A flowchart showing the personality information generation processing executed by the MPU of the terminal device.
An explanatory diagram showing an example of the preference DB.
A flowchart showing the preference information generation processing executed by the calculation unit of the server.
An explanatory diagram showing example combinations of personality classifications and preferences.
A flowchart showing the motion character input processing executed by the calculation unit of the server.
A flowchart showing the other-terminal use processing executed by the calculation unit of the server.
A flowchart showing the memory confirmation processing executed by the calculation unit of the server.
A flowchart showing the pronunciation determination processing 1 executed by the calculation unit of the server.
A flowchart showing the pronunciation determination processing 2 executed by the calculation unit of the server.
A flowchart showing the pronunciation determination processing 3 executed by the calculation unit of the server.
A flowchart showing the emotion determination processing executed by the calculation unit of the server.
A flowchart showing the emotion response generation processing executed by the calculation unit of the server.
A flowchart showing the guidance processing executed by the calculation unit of the server.
A flowchart showing the movement request processing 1 executed by the calculation unit of the server.
A flowchart showing the movement request processing 2 executed by the calculation unit of the server.
A flowchart showing the broadcast music complement processing executed by the calculation unit of the server.
A flowchart showing the character explanation processing executed by the calculation unit of the server.
A flowchart showing the action response terminal processing executed by the calculation unit of the server.
A flowchart showing the action response server processing executed by the calculation unit of the server.

Embodiments of the present invention are described below with reference to the drawings.
[First Embodiment]
[Configuration of this embodiment]
The voice response system 100 to which the present invention is applied is configured so that the server 90 generates an appropriate response to voice input at the terminal device 1 and the terminal device 1 outputs the response by voice. Specifically, as shown in FIG. 1, the voice response system 100 is configured so that a plurality of terminal devices 1 and the server 90 can communicate with one another via a communication base station 80 and an Internet network 85.

The server 90 functions as an ordinary server device. In particular, the server 90 includes a calculation unit 101 and various databases (DBs). The calculation unit 101 is configured as a well-known arithmetic unit including a CPU and memory such as ROM and RAM. Based on a program in the memory, it carries out various processes, such as communicating with the terminal devices 1 and the like via the Internet network 85, reading and writing data in the various DBs, and performing the voice recognition and response generation needed to converse with a user of the terminal device 1.

As shown in FIG. 1, the various DBs include a voice recognition DB 102, a predictive conversion DB 103, a voice DB 104, a response candidate DB 105, a personality DB 106, a learning DB 107, a preference DB 108, a news DB 109, a weather DB 110, a reproduction condition DB 111, a handwritten character and sign language DB 112, a terminal information DB 113, an emotion determination DB 114, a health determination DB 115, a karaoke DB 116, a report destination DB 117, a sales DB 118, a client DB 119, and the like. The details of these DBs are given as each process is described.

Next, as shown in FIG. 2, the terminal device 1 is configured with a behavior sensor unit 10, a communication unit 50, a notification unit 60, and an operation unit 70 provided in a predetermined housing.
The behavior sensor unit 10 includes a well-known MPU 31 (microprocessor unit), a memory 39 such as ROM and RAM, and various sensors. The MPU 31 performs processing, such as driving a heater to bring a sensor element to its optimum temperature, so that the sensor elements constituting the various sensors can properly detect their measurement targets (humidity, wind speed, etc.).

As its various sensors, the behavior sensor unit 10 includes a three-dimensional acceleration sensor 11 (3DG sensor), a three-axis gyro sensor 13, a temperature sensor 15 arranged on the back of the housing, a humidity sensor 17 arranged on the back of the housing, a temperature sensor 19 arranged on the front of the housing, a humidity sensor 21 arranged on the front of the housing, an illuminance sensor 23 arranged on the front of the housing, a wetness sensor 25 arranged on the back of the housing, a GPS receiver 27 that detects the current location of the terminal device 1, and a wind speed sensor 29.

The behavior sensor unit 10 also includes, as sensors, an electrocardiographic sensor 33, a heart sound sensor 35, a microphone 37, and a camera 41. The temperature sensors 15 and 19 and the humidity sensors 17 and 21 each measure the temperature or humidity of the air outside the housing.

The three-dimensional acceleration sensor 11 detects the acceleration applied to the terminal device 1 in three mutually orthogonal directions (the vertical direction (Z direction), the width direction of the housing (Y direction), and the thickness direction of the housing (X direction)) and outputs the detection result.

The three-axis gyro sensor 13 detects, as the angular velocity applied to the terminal device 1, the angular acceleration in the vertical direction (Z direction) and in two arbitrary directions orthogonal to the vertical direction (the width direction of the housing (Y direction) and the thickness direction of the housing (X direction)), with the counterclockwise angular velocity in each direction taken as positive, and outputs the detection result.

The temperature sensors 15 and 19 each include, for example, a thermistor element whose electrical resistance changes with temperature. In this embodiment, the temperature sensors 15 and 19 detect the temperature in degrees Celsius, and all temperatures in the following description are given in degrees Celsius.

The humidity sensors 17 and 21 are each configured as, for example, a well-known polymer film humidity sensor. This polymer film humidity sensor is configured as a capacitor whose dielectric constant changes as the amount of water contained in the polymer film changes with the relative humidity.

The illuminance sensor 23 is configured as a well-known illuminance sensor including, for example, a phototransistor.
The wind speed sensor 29 is, for example, a well-known wind speed sensor that calculates the wind speed from the electric power (amount of heat dissipated) required to maintain the heater at a predetermined temperature.
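The text does not say how the wind speed is derived from the heater power. One common model for a heater held at constant temperature is King's law, which is swapped in here purely as an assumption. The function name, the calibration constants `a`, `b`, and `n`, and the choice of units are all hypothetical and do not come from the source.

```python
def wind_speed_from_heater_power(power_w, heater_temp_c, air_temp_c,
                                 a=0.002, b=0.001, n=0.5):
    """Invert King's law, P = (a + b * v**n) * (T_heater - T_air), for v.

    a, b and n are hypothetical calibration constants, not values from the
    patent; a real sensor would be calibrated against a reference.
    """
    delta_t = heater_temp_c - air_temp_c
    if delta_t <= 0:
        raise ValueError("heater must be hotter than the surrounding air")
    # Part of the heat loss per degree that grows with the wind speed.
    convective = power_w / delta_t - a
    if convective <= 0:
        return 0.0  # power at or below the still-air baseline
    return (convective / b) ** (1.0 / n)
```

With these made-up constants, more power needed to hold the same heater temperature maps to a higher estimated wind speed, which is the qualitative behavior the paragraph describes.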

The heart sound sensor 35 is configured as a vibration sensor that captures the vibration caused by the beating of the user's heart. The MPU 31 takes into account both the detection result of the heart sound sensor 35 and the heart sound input from the microphone 37 to distinguish vibration and noise caused by the heartbeat from other vibration and noise.

The wetness sensor 25 detects water droplets on the surface of the housing, and the electrocardiographic sensor 33 detects the heartbeat of the user.
The camera 41 is arranged inside the housing of the terminal device 1 so that its imaging range covers the outside of the terminal device 1.

The communication unit 50 includes a well-known MPU 51, a radiotelephone unit 53, and a contact memory 55, and is configured to be able to acquire detection signals from the various sensors of the behavior sensor unit 10 via an input/output interface (not shown). The MPU 51 of the communication unit 50 executes processing according to the detection results of the behavior sensor unit 10, input signals entered via the operation unit 70, and a program stored in a ROM (not shown).

Specifically, the MPU 51 of the communication unit 50 functions as a motion detection device that detects a specific motion performed by the user, as a positional relationship detection device that detects the positional relationship with the user, and as an exercise load detection device that detects the load of exercise performed by the user, and it also transmits the processing results of the MPU 51.

The radiotelephone unit 53 is configured to be able to communicate with, for example, a mobile phone base station. The MPU 51 of the communication unit 50 outputs its processing results to the notification unit 60 or transmits them to a preset destination via the radiotelephone unit 53.

The contact memory 55 functions as a storage area for storing the location information of places the user visits. The contact memory 55 also records information on the contacts (telephone numbers, etc.) to be notified when something is wrong with the user.

The notification unit 60 includes a display 61 configured as, for example, an LCD or an organic EL display, an illumination 63 made up of LEDs capable of emitting light in, for example, seven colors, and a speaker 65. Each part of the notification unit 60 is driven and controlled by the MPU 51 of the communication unit 50.

Next, the operation unit 70 includes a touch pad 71, a confirmation button 73, a fingerprint sensor 75, and a rescue request lever 77.
The touch pad 71 outputs a signal corresponding to the position and pressure of a touch by the user (the user, the user's guardian, etc.).

The confirmation button 73 is configured so that the contacts of a built-in switch close when the user presses it, which allows the communication unit 50 to detect that the confirmation button 73 has been pressed.

The fingerprint sensor 75 is a well-known fingerprint sensor configured to read a fingerprint using, for example, an optical sensor. In place of the fingerprint sensor 75, any means capable of recognizing a person's physical characteristics (that is, means capable of biometric authentication and thus of identifying an individual), such as a sensor that recognizes the vein pattern of the palm, may be adopted.

The operation unit 70 also includes the rescue request lever 77, which connects to a predetermined contact when operated.
[Processing of this embodiment]
The processing performed in the voice response system 100 is described below.

The voice response terminal processing performed by the terminal device 1 accepts voice input from the user, sends the voice to the server 90, and, on receiving from the server 90 the response to be output, reproduces that response by voice. This processing starts when the user indicates via the operation unit 70 that voice input is to be performed.

Specifically, as shown in FIG. 3, the terminal device 1 first sets the microphone 37 to a state in which it accepts input (ON state) (S2) and starts imaging (recording) with the camera 41 (S4). It then determines whether there has been voice input (S6).

If there is no voice input (S6: NO), it determines whether a timeout has occurred (S8). Here, a timeout means that the permissible time for waiting has been exceeded; in this step the permissible time is set to, for example, about 5 seconds.

If a timeout has occurred (S8: YES), the process proceeds to S30, described later. If no timeout has occurred (S8: NO), the process returns to S6.
If there is voice input (S6: YES), the voice is recorded in the memory (S10), and it is determined whether the voice input has ended (S12). The voice input is judged to have ended when the voice has been interrupted for a certain period of time or longer, or when an instruction to end voice input has been entered via the operation unit 70.
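The check-then-timeout loop of S6 and S8 can be sketched as a small polling helper. This is a minimal sketch under stated assumptions: the helper name `wait_for`, the polling interval, and the injectable `clock` and `sleep` parameters are hypothetical; only the roughly 5-second limit comes from the description above.

```python
import time


def wait_for(condition, timeout_s=5.0, poll_s=0.05,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it is true or `timeout_s` seconds have passed.

    Returns True if the condition became true (the S6: YES branch) and
    False on timeout (the S8: YES branch).
    """
    deadline = clock() + timeout_s
    while True:
        if condition():          # S6: has voice input arrived?
            return True
        if clock() >= deadline:  # S8: has the permissible time been exceeded?
            return False
        sleep(poll_s)            # S8: NO, so go back and check again
```

A monotonic clock is used so that changes to the system time do not shorten or lengthen the wait. The same helper would fit the later waits at S20 and S26.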

If the voice input has not ended (S12: NO), the process returns to S10. If the voice input has ended (S12: YES), data such as the ID identifying the terminal device itself, the voice, and the captured image are transmitted to the server 90 as packets (S14). The data transmission may instead be performed between S10 and S12.

Next, it is determined whether the data transmission has been completed (S16). If the transmission has not been completed (S16: NO), the process returns to S14.
If the transmission has been completed (S16: YES), it is determined whether data (packets) transmitted by the voice response server processing, described later, have been received (S18). If no data have been received (S18: NO), it is determined whether a timeout has occurred (S20).

If a timeout has occurred (S20: YES), the process proceeds to S30, described later. If no timeout has occurred (S20: NO), the process returns to S18.
If data have been received (S18: YES), the packets are received (S22). In this step, the terminal device acquires one or more different responses to the character information, each associated with a different tone of voice.

It is then determined whether the reception has been completed (S24). If the reception has not been completed (S24: NO), it is determined whether a timeout has occurred (S26).
If a timeout has occurred (S26: YES), a message that an error has occurred is output via the notification unit 60, and the voice response terminal processing ends. If no timeout has occurred (S26: NO), the process returns to S22.

If the reception has been completed (S24: YES), a response based on the received packets is output by voice from the speaker 65 (S28). When a plurality of responses are reproduced in this step, each response is reproduced in a different tone of voice. When this step finishes, the voice response terminal processing ends.
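The terminal-side flow of S2 to S28 can be summarized in code. The following is a simplified sketch, not the device's actual implementation: every function and parameter name is hypothetical, the hardware and network are replaced by injected callables, and the S20 and S26 timeouts are merged into one failure path even though the description routes S20 to S30 and reports an error at S26. Since S30 is not described in this passage, the sketch simply returns at that point.

```python
from typing import Callable, List, Optional, Tuple


def run_voice_response_terminal(
    terminal_id: str,
    record_voice: Callable[[], Optional[bytes]],
    capture_image: Callable[[], bytes],
    send: Callable[[dict], None],
    receive: Callable[[], Optional[List[Tuple[str, str]]]],
    speak: Callable[[str, str], None],
    report_error: Callable[[str], None],
) -> bool:
    """Run one request/response cycle; return True if a response was spoken."""
    voice = record_voice()           # S2-S12; None models the timeout at S8
    if voice is None:
        return False                 # S30 is not described here, so just stop
    # S14-S16: send the terminal ID, the voice and the captured image.
    send({"id": terminal_id, "voice": voice, "image": capture_image()})
    responses = receive()            # S18-S26; None models a timeout
    if responses is None:
        report_error("no response from server")  # via the notification unit
        return False
    for text, tone in responses:     # S28: each response in its own tone
        speak(text, tone)
    return True
```

Passing the I/O operations in as callables keeps the control flow testable without a microphone, camera, or network connection.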

Next, the voice response server processing performed by the server 90 (external device) is described with reference to FIG. 4. The voice response server processing receives voice from the terminal device 1, performs voice recognition to convert the voice into character information, generates a response to the voice, and returns the response to the terminal device 1. In this embodiment in particular, a plurality of responses may be transmitted, each associated with a voice of a different tone.

In detail, as shown in FIG. 4, the voice response server processing first determines whether a packet has been received from any of the terminal devices 1 (S42). If no packet has been received (S42: NO), S42 is repeated.

If a packet has been received (S42: YES), the terminal device 1 on the other end of the communication is identified (S44). In this step, the terminal device 1 is identified by the ID of the terminal device 1 included in the packet.

Next, the voice included in the packet is recognized (S46). In the voice recognition DB 102, a large number of voice waveforms are associated with a large number of characters. In the predictive conversion DB 103, each word is associated with the words that tend to follow it.
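The source only states that both databases are consulted during well-known voice recognition processing. One way the two might be combined, shown here purely as an illustrative assumption, is to let the predictive conversion data break ties between acoustically similar candidates. The function name, the score scale, and the `bonus` value are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def pick_word(prev_word: str,
              candidates: List[Tuple[str, float]],
              predictive_db: Dict[str, Set[str]],
              bonus: float = 0.2) -> str:
    """Choose one word from acoustic candidates given the preceding word.

    `candidates` holds (word, acoustic_score) pairs, standing in for matches
    against the voice recognition DB. `predictive_db` maps a word to the
    words that tend to follow it, standing in for the predictive conversion
    DB. A candidate that commonly follows `prev_word` gets a fixed bonus.
    """
    likely_next = predictive_db.get(prev_word, set())

    def combined_score(candidate: Tuple[str, float]) -> float:
        word, acoustic_score = candidate
        return acoustic_score + (bonus if word in likely_next else 0.0)

    return max(candidates, key=combined_score)[0]
```

For example, if "whether" scores slightly higher acoustically than "weather" but the preceding word is one that "weather" usually follows, the bonus tips the choice toward "weather".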

そこで、この処理では、音声認識ＤＢ１０２および予測変換ＤＢ１０３を参照することで、周知の音声認識処理を実施し、音声を文字情報に変換する。 Therefore, in this process, by referring to the voice recognition DB 102 and the predictive conversion DB 103, a well-known voice recognition process is performed and the voice is converted into character information.

続いて、撮像画像を画像処理することによって、撮像画像中の物体を特定する（Ｓ４８）。そして、音声の波形や言葉の語尾などに基づいて、使用者の感情を判定する（Ｓ５０）。 Subsequently, the captured image is image-processed to identify the object in the captured image (S48). Then, the emotion of the user is determined based on the waveform of the voice, the ending of the word, and the like (S50).

この処理では、音声の波形（声色）や言葉の語尾などと、通常、怒り、喜び、困惑、悲しみ、高揚などの感情の区分とが対応付けられた感情判定ＤＢ１１４を参照することによって、使用者の感情が何れかの区分に該当するかを判定し、この判定結果をメモリに記録する。続いて、学習ＤＢ１０７を参照することによって、この使用者がよく話す単語を検索し、音声認識にて生成した文字情報が曖昧であった部位を補正する。 In this process, the emotion determination DB 114, in which voice waveforms (voice color), word endings, and the like are associated with emotion categories such as normal, anger, joy, confusion, sadness, and elation, is referred to in order to determine which category the user's emotion falls into, and the determination result is recorded in the memory. Subsequently, by referring to the learning DB 107, words frequently spoken by this user are searched for, and the parts of the character information generated by voice recognition that were ambiguous are corrected.
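
The category lookup of S50 can be sketched as a small rule table. This is only an illustrative sketch: the patent states that waveforms and word endings are associated with emotion categories in the emotion determination DB 114, but it does not give the features or rules. The feature names (`mean_pitch_hz`, `loudness_db`), the thresholds, and the rule list below are all hypothetical, and only some of the categories are covered.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Utterance:
    text: str             # recognized character information
    mean_pitch_hz: float  # hypothetical waveform feature
    loudness_db: float    # hypothetical waveform feature


# Hypothetical stand-in for the emotion determination DB 114:
# each entry pairs a predicate on the utterance with an emotion category.
EMOTION_RULES: List[Tuple[Callable[[Utterance], bool], str]] = [
    (lambda u: u.loudness_db > 75 and u.text.endswith("!"), "anger"),
    (lambda u: u.mean_pitch_hz > 260 and u.text.endswith("!"), "joy"),
    (lambda u: u.text.endswith("?"), "confusion"),
    (lambda u: u.mean_pitch_hz < 120 and u.text.endswith("..."), "sadness"),
]


def determine_emotion(u: Utterance) -> str:
    """Return the first matching category, or "normal" when no rule matches."""
    for predicate, category in EMOTION_RULES:
        if predicate(u):
            return category
    return "normal"
```

Because the rules are evaluated in order, the position of an entry in the table decides which category wins when several predicates match.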

なお、学習ＤＢ１０７には、使用者がよく話す単語や発音時の癖など、使用者の特徴が使用者ごとに記録されている。また、使用者との会話において学習ＤＢ１０７へのデータの追加・修正がなされる。 The learning DB 107 records the characteristics of each user, such as words often spoken by the user and habits during pronunciation. In addition, data is added / corrected to the learning DB 107 in a conversation with the user.
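
One way to picture the correction of ambiguous parts is to let the user's own word history break ties between recognition candidates. The sketch below assumes that the recognizer marks an ambiguous position as a list of candidate words; that representation, the `LEARNING_DB` dictionary, and both function names are hypothetical and are not taken from the patent.

```python
from collections import Counter
from typing import Dict, List, Union

# Hypothetical stand-in for the learning DB 107: per-user word frequencies.
LEARNING_DB: Dict[str, Counter] = {
    "user-1": Counter({"curry": 12, "hamburger": 5, "hurry": 1}),
}

Token = Union[str, List[str]]  # a list means "ambiguous: several candidates"


def correct_ambiguous(user_id: str, tokens: List[Token]) -> List[str]:
    """Resolve ambiguous tokens using the words this user speaks most often."""
    history = LEARNING_DB.get(user_id, Counter())
    resolved = []
    for token in tokens:
        if isinstance(token, list):
            # max() keeps the first candidate on ties, i.e. the recognizer's top guess.
            resolved.append(max(token, key=lambda w: history[w]))
        else:
            resolved.append(token)
    return resolved


def record_conversation(user_id: str, words: List[str]) -> None:
    """Add the words of a finished conversation to the user's history."""
    LEARNING_DB.setdefault(user_id, Counter()).update(words)
```

For a user without any history, every candidate has a count of zero, so the recognizer's first candidate is kept unchanged.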

続いて、補正後の文字情報を入力された文字情報として特定し（Ｓ５４）、文字情報に類似する文章を入力として応答候補ＤＢ１０５から検索することによって、応答候補ＤＢ１０５から応答を取得する（Ｓ５６）。ここで、応答候補ＤＢ１０５には、図５に示すように、入力となる文字情報、第１出力、第１出力の声色、第２出力、第２出力の声色が一義に対応付けられている。 Subsequently, the corrected character information is specified as the input character information (S54), and a response is acquired from the response candidate DB 105 by searching it for an input sentence similar to the character information (S56). Here, as shown in FIG. 5, the response candidate DB 105 uniquely associates the input character information, a first output, the voice color of the first output, a second output, and the voice color of the second output.

例えば、図５の第１段目に示すように、「今日の※の天気」という文字情報が入力されると、「今日の※の天気は※です」という第１出力が女１の声色に対応付けて出力される。ただし、「※」の部分は、地域名とその地域での数日間の天気予報とが対応付けられた天気ＤＢ１１０にアクセスすることで取得される。 For example, as shown in the first row of FIG. 5, when the character information "Today's weather in *" is input, the first output "Today's weather in * is *" is output in association with the voice color of Female 1. The "*" part is acquired by accessing the weather DB 110, in which each region name is associated with the weather forecast for that region over several days.

また、「今日の※の天気」という文字情報が入力された場合には、今日の天気が変化するタイミングの天気も天気ＤＢ１１０から取得し、「ただし※は※です。」という第２出力が男１の声色に対応付けて出力される。今日の東京の天気が晴れで明日の天気が雨の場合において「今日の東京の天気」と入力された場合、女１の声色で、「今日の東京の天気は晴れです。」と出力され、男１の声色で、「ただし明日は雨です。」と出力されることになる。 In addition, when the character information "Today's weather in *" is input, the weather at the time when today's weather changes is also acquired from the weather DB 110, and the second output "However, * will be *." is output in association with the voice color of Male 1. If today's weather in Tokyo is sunny and tomorrow's is rainy, entering "Today's weather in Tokyo" produces "Today's weather in Tokyo is sunny." in the voice of Female 1, followed by "However, it will rain tomorrow." in the voice of Male 1.
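
The wildcard filling described for the first row of FIG. 5 can be sketched as follows. This is a minimal sketch under several assumptions: the English input pattern, the `WEATHER_DB` dictionary, the function name, and the simplification that "the time when the weather changes" is always tomorrow are all hypothetical. The voice color names `female1` and `male1` stand for Female 1 and Male 1.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical stand-in for the weather DB 110: region -> forecast by day.
WEATHER_DB: Dict[str, Dict[str, str]] = {
    "Tokyo": {"today": "sunny", "tomorrow": "rainy"},
    "Osaka": {"today": "cloudy", "tomorrow": "cloudy"},
}

# Hypothetical input pattern of one row of the response candidate DB 105.
INPUT_PATTERN = re.compile(r"^today's weather in (?P<region>\w+)$", re.IGNORECASE)


def weather_responses(text: str) -> List[Tuple[str, str]]:
    """Return (voice color, sentence) pairs for a weather query, or [] if no match."""
    match = INPUT_PATTERN.match(text.strip())
    if match is None:
        return []
    region = match.group("region")
    forecast = WEATHER_DB.get(region)
    if forecast is None:
        return []
    # First output: today's weather, bound to the first voice color.
    responses = [("female1", f"Today's weather in {region} is {forecast['today']}.")]
    # Second output: only produced when the weather changes, bound to a second voice color.
    if forecast["tomorrow"] != forecast["today"]:
        responses.append(("male1", f"However, tomorrow it will be {forecast['tomorrow']}."))
    return responses
```

With the sample data, a query for Tokyo yields two responses in two voice colors, while a query for Osaka yields a single response because the forecast does not change.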

なお、本実施形態では、複数の応答を出力する場合を説明したが、入力に対する回答が１つだけの場合には応答は１つだけになる。このため、応答は１つであるか否かを判定する（Ｓ５８）。応答が１つだけであれば（Ｓ５８：ＹＥＳ）、後述するＳ６２の処理に移行する。 In the present embodiment, the case where a plurality of responses are output has been described, but when there is only one answer to the input, only one response is given. Therefore, it is determined whether or not there is only one response (S58). If there is only one response (S58: YES), the process proceeds to S62, which will be described later.

また、応答が複数であれば（Ｓ５８：ＮＯ）、応答内容と声色とを対応付ける（Ｓ６０）。ここで、音声ＤＢ１０４には、人工音声のデータベースが声色毎に格納されており、この処理では、各応答に対して設定された声色を、データベース中の声色と対応付ける。 If there are a plurality of responses (S58: NO), the response content and the voice color are associated with each other (S60). Here, the voice DB 104 stores an artificial voice database for each voice color, and in this process, the voice color set for each response is associated with the voice color in the database.

続いて、応答内容を音声に変換する（Ｓ６２）。この処理では、音声ＤＢ１０４に格納されたデータベースに基づいて、応答内容（文字情報）を音声として出力する処理を行う。 Subsequently, the response content is converted into voice (S62). In this process, the response content (character information) is output as voice based on the database stored in the voice DB 104.
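
The branch at S58 and the steps S60 and S62 can be sketched as below. The `VOICE_DB` contents, the `synthesize` placeholder (which returns tagged bytes rather than real audio), and the choice of a default voice for a single response are assumptions made for illustration; the patent only says that S60 is skipped when there is one response.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the voice DB 104: voice color -> synthesizer settings.
VOICE_DB: Dict[str, Dict[str, float]] = {
    "female1": {"pitch": 1.2, "rate": 1.0},
    "male1": {"pitch": 0.8, "rate": 0.95},
}
DEFAULT_VOICE = "female1"  # assumed voice for the single-response case


def synthesize(text: str, voice: str) -> bytes:
    """Placeholder for speech synthesis (S62); returns tagged bytes instead of audio."""
    settings = VOICE_DB[voice]
    return f"[{voice} pitch={settings['pitch']}] {text}".encode("utf-8")


def build_audio(responses: List[Tuple[str, str]]) -> List[bytes]:
    """Convert (voice color, text) responses to audio, following S58 to S62."""
    if len(responses) == 1:  # S58: YES, skip the association step
        _, text = responses[0]
        return [synthesize(text, DEFAULT_VOICE)]
    # S58: NO, so S60 binds each response to a voice color that exists in the voice DB.
    return [
        synthesize(text, voice if voice in VOICE_DB else DEFAULT_VOICE)
        for voice, text in responses
    ]
```

Falling back to the default voice for an unknown voice color is a design choice of this sketch; a stricter variant could raise an error instead.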

そして、生成した応答（音声）を通信相手の端末装置１にパケット送信する（Ｓ６４）。なお、応答内容の音声を生成しつつパケット送信してもよい。 Then, the generated response (voice) is packet-transmitted to the terminal device 1 of the communication partner (S64). Note that the packets may be transmitted while the voice of the response content is being generated.

続いて、会話内容を記録する（Ｓ６８）。この処理では、入力された文字情報と出力された応答内容を会話内容として学習ＤＢ１０７に記録する。この際、会話内容に含まれるキーワード（音声認識ＤＢ１０２に記録された単語）や発音時の特徴などを学習ＤＢ１０７に記録する。 Then, the conversation content is recorded (S68). In this process, the input character information and the output response content are recorded in the learning DB 107 as conversation content. At this time, the keywords (words recorded in the voice recognition DB 102) included in the conversation content, the characteristics at the time of pronunciation, and the like are recorded in the learning DB 107.

このような処理が終了すると、音声応答サーバ処理を終了する。 When such processing is completed, the voice response server processing is terminated.

［本実施形態による効果］ [Effect of this embodiment]

以上のように詳述した音声応答システム１００は、入力された文字情報に対する応答を音声で行わせるシステムであって、端末装置１（ＭＰＵ３１）は、文字情報に対する複数の異なる応答を取得し、複数の異なる応答をそれぞれ異なる声色で出力させる。 The voice response system 100 described in detail above is a system that responds by voice to input character information; the terminal device 1 (MPU 31) acquires a plurality of different responses to the character information and outputs each of them in a different voice color.

このような音声応答システム１００によれば、複数の応答を異なる声色で出力させることができるので、１の文字情報に対する解が１つに特定できない場合であっても、異なる解を異なる声色で使用者に分かりやすく出力することができる。よって、使用者にとってより使い勝手をよくすることができる。 According to such a voice response system 100, a plurality of responses can be output in different voice colors, so that even when a single answer to one piece of character information cannot be determined, the different answers can be output in different voice colors in a way that is easy for the user to follow. This improves usability for the user.

また、上記音声応答システム１００において端末装置１は、マイク３７を介して使用者による音声を入力し、サーバ９０（演算部１０１）は、入力された音声を文字情報に変換し、該文字情報に対する複数の異なる応答を生成して端末装置１に対して送信する。そして、端末装置１は、サーバ９０から応答を取得する。 Further, in the voice response system 100, the terminal device 1 receives the user's voice via the microphone 37, and the server 90 (calculation unit 101) converts the input voice into character information, generates a plurality of different responses to that character information, and transmits them to the terminal device 1. The terminal device 1 then acquires the responses from the server 90.

このような音声応答システム１００によれば、端末装置１では音声を入力することができるので、文字情報を音声で入力する構成とすることができる。また、サーバ９０において応答を生成する構成とすることができるので、音声応答システム１００での処理負荷を軽減することができる。 According to such a voice response system 100, since voice can be input in the terminal device 1, the character information can be input by voice. Further, since the server 90 can be configured to generate a response, the processing load on the voice response system 100 can be reduced.

さらに、上記音声応答システム１００においてサーバ９０は、使用者の発話による音声を文字情報に変換し、発声時の癖（発音上の癖など）を学習情報として蓄積する（特徴を捉えてこの特徴を記録しておく）。 Further, in the voice response system 100, the server 90 converts the voice uttered by the user into character information and accumulates the user's speaking habits (such as pronunciation habits) as learning information (that is, it captures these characteristics and records them).

このような音声応答システム１００によれば、学習情報に基づいて文字情報を生成することができるので、文字情報の生成精度を向上させることができる。 According to such a voice response system 100, character information can be generated based on learning information, so that the accuracy of character information generation can be improved.

さらに、上記音声応答システム１００においてサーバ９０は、使用者によって入力された音声について、声色から感情を読み取り、通常、怒り、喜び、困惑、悲しみ、高揚のうちの少なくとも１つを含む感情のうちの、何れの感情に該当するかを出力する。 Further, in the voice response system 100, the server 90 reads the emotion from the voice color of the voice input by the user and outputs which emotion it corresponds to, from among emotions including at least one of normal, anger, joy, confusion, sadness, and elation.

このような音声応答システム１００によれば、使用者の感情に応じて応答を出力することができる。 According to such a voice response system 100, it is possible to output a response according to the emotion of the user.

［第１実施形態の変形例］ [Modified example of the first embodiment]

本実施形態においては、文字情報を入力する構成として音声認識を利用したが、音声認識に限らず、キーボードやタッチパネル等の入力手段（操作部７０）を利用して入力されてもよい。また、「入力された音声を文字情報に変換」する作動についてはサーバ９０で行ったが、端末装置１で行ってもよい。 In the present embodiment, voice recognition is used as a configuration for inputting character information, but the input is not limited to voice recognition and may be made using an input means (operation unit 70) such as a keyboard or a touch panel. Further, although the operation of "converting the input voice into character information" is performed by the server 90, it may be performed by the terminal device 1.

さらに、上記音声応答システム１００においてサーバ９０には、複数の文字情報のそれぞれに対して、各文字情報に対する肯定的応答と否定的応答とを含む複数の異なる応答が記録された応答候補ＤＢ１０５、を備え、端末装置１は、複数の異なる応答として肯定的応答と否定的応答とを取得し、肯定的応答と否定的応答とで異なる声色で再生するようにしてもよい。 Further, in the voice response system 100, the server 90 may include a response candidate DB 105 in which, for each of a plurality of pieces of character information, a plurality of different responses including a positive response and a negative response to that character information are recorded; the terminal device 1 may then acquire the positive response and the negative response as the plurality of different responses and play them back in different voice colors.

例えば図５に示す第２段目に示すように、何らかの物を「買ってもよいか」との音声を入力すると、この物について、よい評判などの肯定的情報を女の声を対応付けて出力する。また、その一方で、悪い評判などの否定的情報を肯定的情報が対応付けられた女の声とは異なる声色（ここでは男の声）で出力する。 For example, as shown in the second row of FIG. 5, when the user asks by voice whether it is all right to buy a certain item, positive information about the item, such as a good reputation, is output in association with a female voice. Negative information, such as a bad reputation, is in turn output in a voice color different from the female voice associated with the positive information (here, a male voice).

このような音声応答システム１００によれば、肯定的応答と否定的応答というように、立場の異なる応答を異なる声色で再生することができるので、別人物が話しているように音声を再生することができる。よって、音声を聞く使用者に違和感を覚えさせにくくすることができる。 According to such a voice response system 100, responses from different positions such as a positive response and a negative response can be reproduced with different voice colors, so that the voice can be reproduced as if another person is speaking. Can be done. Therefore, it is possible to make it difficult for the user who listens to the voice to feel a sense of discomfort.

さらに、上記音声応答システム１００においては、自身の端末装置１または他の端末装置１が出力した応答（例えば、肯定的応答や否定的応答）を文字情報として入力し、この応答に対する反論を行うための応答を生成するようにしてもよい。つまり、使用者の立場からすると、賛成の立場と反対の立場との両方の意見による議論を聞くことができる。そして、この議論を聞いたうえで、使用者は最終判断を行うことができる。 Further, in the voice response system 100, a response output by the terminal device 1 itself or by another terminal device 1 (for example, a positive or negative response) may be input as character information, and a response rebutting it may be generated. From the user's standpoint, this makes it possible to listen to a debate between the supporting and opposing positions, and the user can make a final decision after listening to it.

この構成は、１台または複数の端末装置１を用いて実現できる。複数の端末装置１が音声を互いにやり取りするには、音声を直接入出力してもよいし、無線等による通信を利用してもよい。複数の端末装置１とサーバ９０とが通信する場合には、Ｓ６６の処理にて、他の端末装置１にデータを送信すればよい。 This configuration can be realized by using one or a plurality of terminal devices 1. In order for the plurality of terminal devices 1 to exchange voice with each other, the voice may be directly input / output, or wireless communication or the like may be used. When the plurality of terminal devices 1 and the server 90 communicate with each other, the data may be transmitted to the other terminal device 1 in the process of S66.

さらに、上記音声応答システム１００において演算部１０１は、使用者の行動（会話、移動した場所、カメラに映ったもの）を学習（記録および解析）しておき、使用者の会話における言葉足らずを補うようにしてもよい。 Further, in the voice response system 100, the calculation unit 101 learns (records and analyzes) the user's behavior (conversation, moved place, what is reflected by the camera), and supplements the lack of words in the user's conversation. You may do so.

例えば、「今日はハンバーグでいい？」との質問に対して「カレーがいいな。」と使用者が回答する会話に対して、本装置が「昨日ハンバーグだったからね」と補うと、使用者が、カレーがいいと発言した理由が伝わる。 For example, in a conversation where the user answers "I'd rather have curry." to the question "Is hamburger steak okay today?", if the device adds "Because we had hamburger steak yesterday.", the reason why the user said curry would be better is conveyed.

また、このような構成は、電話中に実施することもでき、また、使用者の会話に勝手に参加するよう構成してもよい。 Further, such a configuration can also be used during a telephone call, and the device may be configured to join the user's conversation on its own initiative.

さらに、上記音声応答システム１００においてサーバ９０は、応答候補を所定のサーバ、またはインターネット上から取得するようにしてもよい。 Further, in the voice response system 100, the server 90 may acquire response candidates from a predetermined server or the Internet.

このような音声応答システム１００によれば、応答候補をサーバ９０だけでなく、インターネットや専用線等で接続された任意の装置から取得することができる。 According to such a voice response system 100, response candidates can be acquired not only from the server 90 but also from any device connected by the Internet, a dedicated line, or the like.

［第２実施形態］ [Second Embodiment]

［第２実施形態の処理］ [Processing of the second embodiment]

次に、別形態の音声応答システムについて説明する。本実施形態（第２実施形態）以下の実施形態では、第１実施形態の音声応答システム１００と異なる箇所のみを詳述し、第１実施形態の音声応答システム１００と同様の箇所については、同一の符号を付して説明を省略する。 Next, another form of the voice response system will be described. In the present embodiment (second embodiment) and the subsequent embodiments, only the parts that differ from the voice response system 100 of the first embodiment are described in detail; parts that are the same are given the same reference numerals and their description is omitted.

第２実施形態の音声応答システムでは、使用者が文字情報を入力しない場合においても、音声を出力する。詳細には、端末装置１では図６に示す自動会話端末処理を実施する。自動会話端末処理は、例えば端末装置１の電源が投入されると開始される処理であって、その後、繰り返し実行される処理である。 In the voice response system of the second embodiment, voice is output even when the user does not input the character information. Specifically, the terminal device 1 carries out the automatic conversation terminal processing shown in FIG. The automatic conversation terminal process is, for example, a process that is started when the power of the terminal device 1 is turned on, and is a process that is repeatedly executed thereafter.

自動会話端末処理では、まず、自動会話をする旨の設定がＯＮ（オン）にされているか否かを判定する（Ｓ８２）。なお、自動会話を行うか否かについては操作部７０を介して、或いは音声を入力することによって使用者が設定可能に構成されている。 In the automatic conversation terminal processing, first, it is determined whether or not the setting for automatic conversation is turned ON (S82). It should be noted that whether or not to perform automatic conversation can be set by the user via the operation unit 70 or by inputting voice.

自動会話する旨がＯＦＦ（オフ）であれば（Ｓ８２：ＮＯ）、自動会話端末処理を終了する。また、自動会話する旨がＯＮであれば（Ｓ８２：ＹＥＳ）、自動会話モードに設定された旨を、自身を特定するためのＩＤとともにサーバ９０に対して送信する（Ｓ８４）。 If the automatic conversation setting is OFF (S82: NO), the automatic conversation terminal processing is terminated. If the setting is ON (S82: YES), a notice that the automatic conversation mode has been set is transmitted to the server 90 together with the ID identifying the terminal itself (S84).

続いて、サーバ９０からのパケットを受信したか否かを判定する（Ｓ８６）。パケットを受信していなければ（Ｓ８６：ＮＯ）、Ｓ８６の処理を繰り返す。また、パケットを受信していれば（Ｓ８６：ＹＥＳ）、前述のＳ２２〜Ｓ３０と同様の処理を実施し、これらの処理が終了すると自動会話端末処理を終了する。 Subsequently, it is determined whether or not a packet from the server 90 has been received (S86). If no packet has been received (S86: NO), the process of S86 is repeated. Further, if the packet is received (S86: YES), the same processing as in S22 to S30 described above is performed, and when these processing are completed, the automatic conversation terminal processing is terminated.

また、サーバ９０では、図７に示す自動会話サーバ処理を実行する。自動会話サーバ処理は、例えばサーバ９０の電源が投入されると開始され、その後、繰り返し実行される処理である。 Further, the server 90 executes the automatic conversation server process shown in FIG. 7. The automatic conversation server process is, for example, a process that is started when the power of the server 90 is turned on and then repeatedly executed.

自動会話サーバ処理では、まず、自動会話モードに設定された旨を端末装置１から受信したか否かを判定する（Ｓ９２）。自動会話モードに設定された旨を受信していなければ（Ｓ９２：ＮＯ）、Ｓ９８の処理に移行する。 In the automatic conversation server processing, first, it is determined whether or not the terminal device 1 has received the fact that the automatic conversation mode has been set (S92). If it has not been received that the automatic conversation mode has been set (S92: NO), the process proceeds to S98.

自動会話モードに設定された旨を受信していれば（Ｓ９２：ＹＥＳ）、受信したパケットに含まれるＩＤに基づいて通信相手となる端末装置１を特定し（Ｓ９４）、この通信相手に対して自動会話する旨を設定する（Ｓ９６）。続いて、自動会話する旨を設定した端末装置１のそれぞれについて、再生条件を満たすか否かを判定する（Ｓ９８）。 If it has been received that the automatic conversation mode has been set (S92: YES), the terminal device 1 to be the communication partner is specified based on the ID included in the received packet (S94), and the communication partner is addressed. Set to have an automatic conversation (S96). Subsequently, it is determined whether or not the reproduction condition is satisfied for each of the terminal devices 1 for which automatic conversation is set (S98).

ここで、再生条件とは、例えば、前回の会話（音声入力）から一定時間が経過していることや、１日のあるきまった時刻、特定の天気のとき、何れかのセンサ値が異常を示す値であるときなどを示す。 Here, the reproduction condition is, for example, that a certain time has elapsed since the previous conversation (voice input), that a fixed time of day has been reached, that the weather is of a specific kind, or that any of the sensor values indicates an abnormality.
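
The check at S98 could be sketched as a function that tests the listed conditions one by one. Every threshold and name here (`SILENCE_LIMIT`, `DAILY_TIME`, `ALERT_WEATHER`, `SENSOR_NORMAL_RANGE`, `TerminalState`) is a hypothetical example value chosen for illustration; the patent names only the kinds of conditions.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass
class TerminalState:
    last_voice_input: datetime  # time of the previous conversation
    weather: str                # current weather at the terminal's location
    sensor_value: float         # latest reading of some sensor


# Hypothetical example thresholds.
SILENCE_LIMIT = timedelta(hours=3)
DAILY_TIME = time(7, 0)
ALERT_WEATHER = {"rainy", "snowy"}
SENSOR_NORMAL_RANGE = (0.0, 40.0)


def playback_reason(state: TerminalState, now: datetime) -> Optional[str]:
    """Return the name of the first satisfied reproduction condition, or None."""
    if now - state.last_voice_input >= SILENCE_LIMIT:
        return "silence"
    if (now.hour, now.minute) == (DAILY_TIME.hour, DAILY_TIME.minute):
        return "fixed_time"
    if state.weather in ALERT_WEATHER:
        return "weather"
    low, high = SENSOR_NORMAL_RANGE
    if not (low <= state.sensor_value <= high):
        return "sensor_abnormal"
    return None
```

Returning the name of the condition, rather than a plain boolean, lets the message generation step pick a message that suits the condition that fired.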

再生条件を満たしていなければ（Ｓ９８：ＮＯ）、自動会話サーバ処理を終了する。また、再生条件を満たしていれば（Ｓ９８：ＹＥＳ）、再生条件に応じたメッセージを生成する（Ｓ１００）。 If the reproduction condition is not satisfied (S98: NO), the automatic conversation server processing is terminated. Further, if the reproduction condition is satisfied (S98: YES), a message corresponding to the reproduction condition is generated (S100).

ここで、再生条件に応じたメッセージとは、例えば、「おはようございます。」や「こんにちは。」等の定型文であってもよいし、最新のニュースが自動更新されるニュースＤＢ１０９から得られる最新のニュースに関するものであってもよい。最新のニュースに関するものをメッセージとする場合には、例えば、ある会社の株価に関する情報を取得できた場合には、「今日の○○会社の株価が○○円上がりましたね。ご存じでしたか？」などとすることができる。 Here, the message corresponding to the reproduction condition may be, for example, a fixed phrase such as "Good morning." or "Hello.", or it may relate to the latest news obtained from the news DB 109, in which the latest news is automatically updated. When the message relates to the latest news and, for example, information on the stock price of a certain company has been acquired, the message can be something like "The stock price of XX company went up by XX yen today. Did you know?"

この処理が終了すると、前述のＳ４２〜Ｓ５４の処理を実施する。そして、Ｓ５４の処理が終了すると、通信相手となる端末装置１から所定の回答が得られたか否かを判定する（Ｓ１１２）。ここで、所定の回答とは、例えば、何らかの音声であってもよいし、特定の解答であってもよい。特定の解答とは、例えば、「知っていますか？」との質問に対しては、「知っている」または「知らない」という回答が該当し、「今の天気はどうですか？」という質問に対しては、「雨です」や「晴れています」など、天気を示す単語を含むものが該当する。 When this process is completed, the above-mentioned processes S42 to S54 are performed. Then, when the processing of S54 is completed, it is determined whether or not a predetermined answer has been obtained from the terminal device 1 which is the communication partner (S112). Here, the predetermined answer may be, for example, some kind of voice or a specific answer. The specific answer is, for example, to the question "Do you know?", The answer "I know" or "I don't know", and to the question "How is the weather now?" On the other hand, those that include words that indicate the weather, such as "it's raining" and "it's sunny", are applicable.

所定の回答があれば（Ｓ１１２：ＹＥＳ）、自動会話サーバ処理を終了する。また、所定の回答がなければ（Ｓ１１２：ＮＯ）、Ｓ１００にて送信したメッセージを再送する（Ｓ１１４）。このようにメッセージを再送する際には、声色を変化させ、語気を強く、かつ厳しい口調の音声を生成する。 If there is a predetermined answer (S112: YES), the automatic conversation server processing is terminated. If there is no predetermined answer (S112: NO), the message transmitted in S100 is retransmitted (S114). When the message is retransmitted in this way, the voice color is changed and a voice with a more forceful, stern tone is generated.

続いて、予め端末装置１と通報先とが対応付けられた通報先ＤＢ１１７を参照し、所定の通報先に回答がなかった旨を送信する（Ｓ１１６）。このような処理が終了すると、自動会話サーバ処理を終了する。 Subsequently, the report destination DB 117, in which each terminal device 1 is associated in advance with a report destination, is referred to, and a notice that no answer was received is transmitted to the predetermined report destination (S116). When such processing is completed, the automatic conversation server processing is terminated.
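
Steps S112 to S116 can be sketched as one function that checks the reply, resends the message in a sterner tone, and notifies the registered destination. The `REPORT_DB` contents, the `"stern"` tone label, and the `send` and `notify` callbacks are hypothetical; they stand in for whatever transport the server actually uses.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for the report destination DB 117.
REPORT_DB: Dict[str, str] = {"T-001": "family@example.com"}


def check_reply(
    terminal_id: str,
    message: str,
    reply: Optional[str],
    accepted_words: List[str],
    send: Callable[[str, str, str], None],
    notify: Callable[[str, str], None],
) -> bool:
    """Return True if a predetermined answer was given; otherwise resend and report."""
    # An empty accepted_words list means that any voice counts as an answer.
    if reply is not None and (not accepted_words or any(w in reply for w in accepted_words)):
        return True  # S112: YES
    send(terminal_id, message, "stern")  # S114: resend with a harsher voice color
    destination = REPORT_DB.get(terminal_id)
    if destination is not None:
        notify(destination, f"No answer from terminal {terminal_id}")  # S116
    return False
```

Passing the transport functions in as parameters keeps the sketch testable without a network; in a test they can simply append their arguments to a list.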

［第２実施形態による効果］ [Effect of the second embodiment]

上記の音声応答システム１００においてサーバ９０は、文字情報が入力されない場合において、当該音声応答システム１００の状況が予め音声を出力させる条件として設定された再生条件に合致するか否かを判定する。そして、再生条件に合致する場合に、予め設定されたメッセージを出力させる。 In the above voice response system 100, when no character information is input, the server 90 determines whether or not the situation of the voice response system 100 meets the reproduction condition set in advance as the condition for outputting voice. Then, when the reproduction condition is met, a preset message is output.

このような音声応答システム１００によれば、文字情報が入力されない場合（つまり、使用者が話しかけない場合）であっても、音声を出力させることができる。例えば、強制的に使用者に発話させることで、自動車運転中の眠気抑制対策に利用することができる。また、一人暮らしの者が応答するか否かを判定することで、安否確認を行うことができる。 According to such a voice response system 100, voice can be output even when character information is not input (that is, when the user does not speak). For example, by forcibly making the user speak, it can be used as a measure for suppressing drowsiness while driving a car. In addition, safety can be confirmed by determining whether or not a person living alone responds.

また、上記音声応答システム１００においてサーバ９０は、ニュースの情報を取得し、該ニュースに関するメッセージを使用者の回答を求める質問形式で出力させる。 Further, in the voice response system 100, the server 90 acquires news information and outputs a message related to the news in the form of a question asking for the user's answer.

このような音声応答システム１００によれば、ニュースに関する会話をすることができるので、いつも同じ会話ばかりになることを抑制することができる。 According to such a voice response system 100, conversations about the news are possible, which keeps the conversation from always being the same.

さらに、上記音声応答システム１００においてサーバ９０は、予め設定されたメッセージに別途取得した（ニュースや環境（気温、天気、位置情報等の）外部取得情報を付加して出力させる。 Further, in the voice response system 100, the server 90 adds and outputs separately acquired externally acquired information (news, environment (temperature, weather, location information, etc.)) to a preset message.

このような音声応答システム１００によれば、所定のメッセージと取得した情報とを組み合わせた応答を出力することができる。 According to such a voice response system 100, it is possible to output a response in which a predetermined message and the acquired information are combined.

さらに、上記音声応答システム１００においてサーバ９０は、応答やメッセージに対する回答が得られない場合に、予め設定された連絡先に対して、使用者を特定する情報、および回答が得られなかった旨を送信する。 Further, in the voice response system 100, when no answer to a response or message is obtained, the server 90 transmits information identifying the user, together with a notice that no answer was obtained, to a preset contact.

このような音声応答システム１００によれば、回答が得られない場合に連絡先に通報することができる。よって、例えば、一人暮らしの老人等の異常を早期に通報することができる。 According to such a voice response system 100, it is possible to notify a contact when an answer cannot be obtained. Therefore, for example, it is possible to report an abnormality of an elderly person living alone at an early stage.

［第２実施形態の変形例］ [Modified example of the second embodiment]

また、上記音声応答システム１００においてサーバ９０は、複数のメッセージを取得し、メッセージの再生頻度に応じて再生するメッセージを選択して出力させるようにしてもよい。 Further, in the voice response system 100, the server 90 may acquire a plurality of messages and select and output the message to be reproduced according to the reproduction frequency of the messages.

このような音声応答システム１００によれば、再生頻度が高いメッセージを再生しにくくすることで、メッセージ再生時のランダム性を奏したり、敢えて再生頻度が高いメッセージを繰り返し再生することで注意喚起や記憶の定着を促したりすることができる。 According to such a voice response system 100, making frequently played messages less likely to be played gives message playback a degree of randomness, while deliberately replaying frequently played messages can call the user's attention to them or help fix them in memory.
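
Both selection policies can be sketched with weighted random choice over the play counts. The weighting formulas below (inverse count to favor rarely played messages, count plus one to favor frequently played ones) and the function name are assumptions of this sketch; the patent does not specify how the frequency is used.

```python
import random
from typing import Dict, Optional


def pick_message(
    play_counts: Dict[str, int],
    prefer_frequent: bool = False,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a message with weights derived from how often each one has been played."""
    rng = rng or random.Random()
    messages = list(play_counts)
    if prefer_frequent:
        # Reinforcement: often-played messages become even more likely.
        weights = [play_counts[m] + 1 for m in messages]
    else:
        # Variety: often-played messages become less likely.
        weights = [1.0 / (play_counts[m] + 1) for m in messages]
    choice = rng.choices(messages, weights=weights, k=1)[0]
    play_counts[choice] += 1  # remember the playback for the next selection
    return choice
```

Passing a seeded `random.Random` instance makes the selection reproducible, which is convenient when testing.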

［第３実施形態］ [Third Embodiment]

［第３実施形態の処理］ [Processing of the third embodiment]

次に第３実施形態の音声応答システムでは、使用者が誰かに直接は言いにくいことを端末装置１が代わりに伝える構成としている。例えば、デート前に、今日はこのようなことを言いたいと本装置に話しかけておくと、適当なタイミング（例えば予め設定した時刻や、会話が途切れてから一定時間が経過した場合など）で、音声応答システム１００が代わりに話してくれる（音声を再生する）ようにする。 Next, the voice response system of the third embodiment is configured so that the terminal device 1 conveys, on the user's behalf, things that the user finds difficult to say to someone directly. For example, if the user tells the device before a date what he or she wants to say that day, the voice response system 100 speaks in the user's place (plays back the voice) at an appropriate time (for example, at a preset time, or when a certain period has elapsed since the conversation broke off).

詳細には、端末装置１は図８に示す伝言端末処理を実施し、サーバ９０は図９に示す伝言サーバ処理を実施する。伝言端末処理は例えば端末装置１の電源が投入されると開始され、その後、繰り返し実行される処理である。 Specifically, the terminal device 1 carries out the message terminal processing shown in FIG. 8, and the server 90 carries out the message server processing shown in FIG. The message terminal processing is, for example, a processing that is started when the power of the terminal device 1 is turned on and then repeatedly executed.

伝言端末処理では、図８に示すように、まず、使用者によって伝言モードが設定されているか否かを判定する（Ｓ１３２）。伝言モードが設定されていなければ（Ｓ１３２：ＮＯ）、Ｓ１３２の処理を繰り返す。 In the message terminal processing, as shown in FIG. 8, first, it is determined whether or not the message mode is set by the user (S132). If the message mode is not set (S132: NO), the process of S132 is repeated.

また、伝言モードが設定されていれば（Ｓ１３２：ＹＥＳ）、Ｓ２〜Ｓ８の処理を実施し、Ｓ６にて肯定判定された場合には、端末装置１のメモリ内において、伝言モードフラグをＯＮ状態に設定する（Ｓ１３４）。そして、Ｓ１０〜Ｓ１６の処理を実施する。 If the message mode is set (S132: YES), the processes of S2 to S8 are executed, and if an affirmative determination is made in S6, the message mode flag is turned on in the memory of the terminal device 1. (S134). Then, the processing of S10 to S16 is carried out.

Ｓ１６にて肯定判定された場合には、サーバ９０からのパケットを受信したか否かを判定する（Ｓ１３６）。パケットを受信していなければ（Ｓ１３６：ＮＯ）、Ｓ１３６の処理を繰り返す。また、パケットを受信していれば（Ｓ１３６：ＹＥＳ）、Ｓ２４〜Ｓ３０の処理を実施し、伝言端末処理を終了する。 If an affirmative determination is made in S16, it is determined whether or not a packet from the server 90 has been received (S136). If no packet has been received (S136: NO), the process of S136 is repeated. Further, if the packet is received (S136: YES), the processes of S24 to S30 are executed, and the message terminal process is terminated.

次に、伝言サーバ処理は、例えばサーバ９０の電源が投入されると開始される処理であり、その後、繰り返し実行される。詳細には、まず、何れかの端末装置１からパケットを受信したか否かを判定する（Ｓ１４２）。パケットを受信していなければ（Ｓ１４２：ＮＯ）、後述するＳ１５６の処理に移行する。 Next, the message server process is, for example, a process that is started when the power of the server 90 is turned on, and is then repeatedly executed. Specifically, first, it is determined whether or not a packet has been received from any of the terminal devices 1 (S142). If no packet has been received (S142: NO), the process proceeds to S156, which will be described later.

また、パケットを受信していれば（Ｓ１４２：ＹＥＳ）、通信相手の端末装置１を特定し（Ｓ４４）、パケットに伝言モードフラグ等のモードフラグが含まれているか否かを判定する（Ｓ１４４）。モードフラグがなければ（Ｓ１４４：ＮＯ）、Ｓ１４８の処理に移行する。 Further, if the packet is received (S142: YES), the terminal device 1 of the communication partner is specified (S44), and it is determined whether or not the packet contains a mode flag such as a message mode flag (S144). .. If there is no mode flag (S144: NO), the process proceeds to S148.

また、モードフラグがあれば（Ｓ１４４：ＹＥＳ）、サーバ９０においても通信相手の端末装置１に対応するフラグをＯＮ状態に設定することでモード設定をする（Ｓ１４６）。例えば、伝言モードフラグが対応する伝言モードであれば、後述するＳ４６〜Ｓ１５２の処理が実施され、後述する誘導モードフラグが対応する誘導モードであれば、Ｓ４６〜Ｓ１７６（図１１参照）が実施されることになる。 If there is a mode flag (S144: YES), the server 90 also sets the mode by setting the flag corresponding to the terminal device 1 of the communication partner to the ON state (S146). For example, if the message mode flag corresponds to the message mode, the processes of S46 to S152 described later are executed, and if the guidance mode flag described later corresponds to the guidance mode, S46 to S176 (see FIG. 11) are executed. Will be.

続いて、伝言フラグがＯＮ状態であるか否かを判定する（Ｓ１４８）。伝言フラグがＯＮ状態であれば（Ｓ１４８：ＹＥＳ）、Ｓ４６〜Ｓ５４の処理を実施し、続いて、伝言再生条件を抽出する（Ｓ１５０）。 Subsequently, it is determined whether or not the message flag is in the ON state (S148). If the message flag is in the ON state (S148: YES), the processes S46 to S54 are performed, and then the message reproduction conditions are extracted (S150).

ここで、伝言再生条件は、予め使用者が端末装置１の操作部７０を介して設定可能であって、例えば、時刻や位置が該当する。なお、伝言再生条件は、伝言端末処理のパケット送信の際にサーバ９０に送信される。 Here, the message reproduction condition can be set in advance by the user via the operation unit 70 of the terminal device 1, and the time and position correspond, for example. The message reproduction condition is transmitted to the server 90 at the time of packet transmission of the message terminal processing.

続いて、伝言と音声（声色）とを対応付けて、メモリに記録し（Ｓ１５２）、Ｓ１５６の処理に移行する。また、伝言フラグがＯＦＦ状態であれば（Ｓ１４８：ＮＯ）、他のモードに関する処理を行い（Ｓ１５４）、再生タイミングになったか否かを判定する（Ｓ１５６）。ここで、再生タイミングとは、伝言再生条件で設定された内容を示す。 Subsequently, the message and the voice (voice color) are associated with each other and recorded in the memory (S152), and the process proceeds to S156. If the message flag is in the OFF state (S148: NO), processing related to another mode is performed (S154), and it is determined whether or not the reproduction timing has been reached (S156). Here, the reproduction timing indicates the content set in the message reproduction condition.

再生タイミングでなければ（Ｓ１５６：ＮＯ）、直ちに伝言サーバ処理を終了する。また、再生タイミングであれば（Ｓ１５６：ＹＥＳ）、Ｓ６２〜Ｓ６４の処理を実施し、伝言サーバ処理を終了する。 If it is not the reproduction timing (S156: NO), the message server processing is immediately terminated. If it is the reproduction timing (S156: YES), the processes of S62 to S64 are executed, and the message server process is terminated.
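
The deferred playback of the message server processing can be sketched as a store of pending messages that is polled for due entries. The `StoredMessage` fields and the function name are hypothetical; time and position are used as conditions because the text names them as examples of message reproduction conditions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class StoredMessage:
    terminal_id: str
    text: str
    voice: str                            # voice color recorded with the message
    play_at: Optional[datetime] = None    # time-based reproduction condition
    play_location: Optional[str] = None   # position-based reproduction condition
    played: bool = False


def due_messages(store: List[StoredMessage], now: datetime, location: str) -> List[StoredMessage]:
    """Collect messages whose reproduction condition holds and mark them as played."""
    due = []
    for msg in store:
        if msg.played:
            continue
        time_ok = msg.play_at is not None and now >= msg.play_at
        place_ok = msg.play_location is not None and location == msg.play_location
        if time_ok or place_ok:
            msg.played = True
            due.append(msg)
    return due
```

Marking a message as played prevents it from being sent again on the next pass of the repeatedly executed server process.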

［第３実施形態による効果］ [Effect of the third embodiment]

このような第３実施形態の音声応答システムによれば、使用者が入力した音声を直ちに再生するのではなく、一定時間後において伝言再生条件が成立したときに再生することができる。 According to the voice response system of the third embodiment, the voice input by the user is not reproduced immediately, but can be reproduced when the message reproduction condition is satisfied after a certain period of time.

例えば、図５の第３段目に示すように、「○○さんに○○と伝えてね」と入力すると、○○さんの声が認識されてから（聞こえてから）、伝えたい文章が再生されることになる。 For example, as shown in the third row of FIG. 5, if the user enters "Tell Mr. XX that XX", the sentence to be conveyed is played back once Mr. XX's voice has been recognized (that is, heard).

［第３実施形態の変形例］ [Modified example of the third embodiment]

上記第３実施形態においては、使用者が話した内容を再生するよう構成したが、言いにくいことのきっかけになる言葉、例えば「そういえば何か彼女に話すって言ってなかったっけ？」のような言葉、を話す構成としてもよい。詳細には、端末装置１は図１０に示す誘導端末処理を実施し、サーバ９０は図１１に示す誘導サーバ処理を実施する。 The third embodiment is configured to play back what the user said, but the system may instead speak words that prompt the user to bring up something difficult to say, such as "Come to think of it, didn't you say you had something to tell her?" Specifically, the terminal device 1 carries out the guidance terminal processing shown in FIG. 10, and the server 90 carries out the guidance server processing shown in FIG. 11.

誘導端末処理は、例えば端末装置１の電源が投入されると開始され、その後、繰り返し実行される処理である。 The guidance terminal process is, for example, a process that is started when the power of the terminal device 1 is turned on and is then repeatedly executed.

誘導端末処理では、図１０に示すように、まず、使用者によって誘導モードが設定されているか否かを判定する（Ｓ１６２）。誘導モードが設定されていなければ（Ｓ１６２：ＮＯ）、Ｓ１６２の処理を繰り返す。 In the guidance terminal process, as shown in FIG. 10, first, it is determined whether or not the guidance mode is set by the user (S162). If the induction mode is not set (S162: NO), the process of S162 is repeated.

If the guidance mode is set (S162: YES), the processes of S2 to S8 are executed, and if an affirmative determination is made in S6, the guidance mode flag is set to the ON state in the memory of the terminal device 1 (S164). Then, the processes of S10 to S16 are executed.

If an affirmative determination is made in S16, it is determined whether a packet from the server 90 has been received (S166). If no packet has been received (S166: NO), the process of S166 is repeated. If a packet has been received (S166: YES), the processes of S24 to S30 are executed, and the guidance terminal process ends.

Next, the guidance server process is, for example, a process that is started when the server 90 is powered on and is repeatedly executed thereafter. Specifically, the above-described processes of S142 to S146 are executed. Then, it is determined whether the guidance flag is in the ON state (S172).

If the guidance flag is in the ON state (S172: YES), the processes of S46 to S54 are executed, and then the guidance reproduction condition is extracted (S174).
Like the message reproduction condition, the guidance reproduction condition can be set in advance by the user via the operation unit 70 of the terminal device 1; a time or a position, for example, can serve as the condition. The guidance reproduction condition is transmitted to the server 90 when packets are transmitted in the message terminal process.

Subsequently, guidance content is generated, associated with a voice (tone of voice), and recorded in the memory (S176). To generate the guidance content, for example, the input character information is searched for words expressing a desire, such as "want" or "hope"; the keywords preceding these words are extracted; and the phrases registered as phrases that lead toward these keywords are output as the guidance content. The keywords and the phrases representing the guidance content are associated with each other in advance and recorded in the response candidate DB 105.
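The keyword lookup in S176 can be sketched roughly as follows. This is only an illustrative sketch: the table contents, the function name `generate_guidance`, and the assumption that the input has already been split into tokens in which the keyword directly precedes the desire word are all hypothetical and are not taken from the specification.

```python
# Hypothetical stand-in for the keyword-to-phrase table held in the
# response candidate DB 105 (the entries are invented for illustration).
GUIDANCE_PHRASES = {
    "apologize": "Come to think of it, weren't you going to tell her something?",
    "travel": "By the way, wasn't there somewhere you were hoping to go?",
}

# Words expressing a desire; "want" and "hope" are the examples in the text.
DESIRE_WORDS = {"want", "hope"}


def generate_guidance(tokens, phrases=GUIDANCE_PHRASES, desire_words=DESIRE_WORDS):
    """Return the guidance phrases for keywords that directly precede a desire word."""
    guidance = []
    for i, token in enumerate(tokens):
        if token in desire_words and i > 0:
            keyword = tokens[i - 1]          # the word just before the desire word
            if keyword in phrases:           # only registered keywords yield guidance
                guidance.append(phrases[keyword])
    return guidance
```

A real system would need morphological analysis to find the keyword; the token list here simply stands in for that step.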

Subsequently, the above-described processes from S156 onward are executed, and the server process ends. If the guidance flag is in the OFF state (S172: NO), processing for other modes is performed (S154), the above-described processes from S156 onward are executed, and the server process ends.

According to the configuration of this modified example of the third embodiment, instead of directly outputting what the user wants to say, the system can guide the conversation so that the user is able to say it.

[Fourth Embodiment]
[Processing of the fourth embodiment]
Next, an example in which the terminal device 1 is used for reception work will be described. In the present embodiment, the terminal device 1 is installed at, for example, the reception desk of a company. It can also be adopted for telephone reception, such as a company's main telephone line or telephone banking. The present embodiment is realized by replacing the process of S56 in the first embodiment with the reception process shown in FIG. 12.

In the reception process, as shown in FIG. 12, it is first determined whether the character information includes a company name (S192). In this process, it is determined whether a common personal name or a company name (recorded in the voice recognition DB 102) is included.

If the character information does not include a company name or a personal name (S192: NO), a response asking for the company name and personal name is generated (S194), and the reception process ends. In this process, for example, a response such as "Please tell us your name and your business." is generated.

If the character information includes a company name or a personal name (S192: YES), the company name or personal name is looked up in the sales DB 118 and the client DB 119 (S196). The sales DB 118 records companies and representatives who have made sales calls in the past, the names of habitual complainers, and the like. The client DB 119 records, in association with one another, company names, the contact persons at those companies, the persons in charge on the side of the user of the terminal device 1 (the user's own company), schedules such as scheduled visit times, and contact information for each person in charge.

Subsequently, it is determined whether the company name or personal name was found in the sales DB 118, that is, whether the company name or personal name included in the character information is recorded in the sales DB 118 (S198). If it was found in the sales DB 118 (S198: YES), a sales refusal response (a response declining to put the visitor through) is generated (S200), and the reception process ends.

If the company name or personal name was not found in the sales DB 118 (S198: NO), it is determined whether the visitor at the reception is someone who, according to the schedule in the client DB 119, is due to visit at a time close to the present (for example, within one hour before or after the current time) (S202). If so (S202: YES), the contact information of the person in charge of this visitor is extracted from the client DB 119, and a connection to that person is made so that the person in charge and the visitor can talk (S204). In this process, the connection may be made to the extension telephone, mobile phone, or the like of the person in charge.

Subsequently, a reception response for clients is generated (S206). As the reception response for clients, for example, a response such as "Mr. XX, thank you for your continued patronage. We are connecting you to the person in charge, so please wait a moment." is generated. When this process is completed, the reception process ends.

If the visitor is not someone due to visit at a time close to the present (S202: NO), a connection is made to a preset reception contact so that the receptionist and the visitor can talk (S208). Then, a normal reception response is generated (S210).

As the normal reception response, for example, a response such as "We are connecting you to the reception desk, so please wait a moment." is generated. When this process is completed, the reception process ends.
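The branching of the reception process (S192 to S210) can be summarized in a short sketch. The function name, the return labels, the dictionary layout of the client DB, and the "reception-desk" contact are assumptions made for illustration; only the order of the checks and the one-hour window come from the description above.

```python
from datetime import datetime, timedelta


def reception_response(name, sales_db, client_db, now, window=timedelta(hours=1)):
    """Return (response_kind, connect_to) for one visitor.

    name      -- recognized company or personal name, or None if none was found
    sales_db  -- set of names recorded as sales callers or complainers
    client_db -- dict mapping name -> {"visit_time": datetime, "contact": str}
    """
    if name is None:
        return ("ask_name", None)                        # S194
    if name in sales_db:                                 # S198
        return ("refuse_sales", None)                    # S200
    entry = client_db.get(name)
    if entry is not None and abs(entry["visit_time"] - now) <= window:  # S202
        return ("client_reception", entry["contact"])    # S204, S206
    return ("normal_reception", "reception-desk")        # S208, S210
```

For instance, a visitor scheduled for 10:30 who arrives at 10:00 would be connected to the contact recorded for the person in charge, while the same visitor arriving at 15:00 would be routed to the general reception contact.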

[Effect of the fourth embodiment]
The voice response system 100 is configured for use at a workplace or a company reception desk. In this configuration, the names and company names of people who come to make sales calls are recorded in advance in the sales DB 118 of the server 90, and when a visitor gives one of these names or company names, a response is generated so that a refusal message is reproduced by voice.

Further, in the voice response system 100, the server 90 identifies the communication partner from the input character information and connects the communication partner to a communication destination preset for each communication partner.
According to such a voice response system 100, reception work and telephone answering can be assisted. In addition, a person who may interfere with the user's work can be turned away without the user having to deal with that person personally.

Further, in the voice response system 100, the server 90 extracts a keyword included in the input character information (in particular, voice) and connects to the connection destination corresponding to the keyword. A keyword, such as the name of the other party, and its connection destination are associated with each other in advance.

According to such a voice response system 100, tasks such as transferring telephone calls and paging the reception desk can be assisted.

[Modified example of the fourth embodiment]
In the above embodiment, the connection destination is set according to the other party. Applying this technique, in telephone reception such as telephone banking or telephone shopping, the caller's business (a keyword included in the character information) may be recognized and the connection destination changed according to that business.

Further, in the voice response system 100, the server 90 may recognize the business the other party is speaking about based on keywords and convey a summary of what the other party said to the user.
According to such a voice response system 100, the work of relaying communications from customers can be assisted.

[Fifth Embodiment]
[Processing of the fifth embodiment]
Next, the terminal device 1 may be configured to receive a request from another terminal device 1 and provide the information that the other terminal device 1 requests.

In this configuration, in the process of S56, the server 90 requests the necessary information from another terminal device 1 and generates a response after acquiring that information from the other terminal device 1. The terminal device 1 that provides the necessary information performs the information providing terminal process shown in FIG. 13. The information providing terminal process is, for example, a process that is started when a request is received from the server 90.

In the information providing terminal process, as shown in FIG. 13, the information providing destination is first extracted (S222). The information providing destination is the other terminal device 1 that is requesting the information, and an ID identifying this other terminal device 1 is included in the request from the server 90.

Subsequently, it is determined whether the requester is a party to whom provision of information is permitted (S224). The terminal information DB 113 records in advance the IDs of parties, such as family members and friends, to whom provision of information is permitted. In this process, the determination is made by referring to the terminal information DB 113.

If the requester is a permitted party (S224: YES), the requested information is acquired from the terminal's own memory 39, various sensors, and the like (S226), and this data is transmitted to the server 90 (S228). If the requester is not a permitted party (S224: NO), a notice that provision of information is refused is transmitted to the server 90 (S230).
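As a rough illustration of S222 to S230, the permission check could look like the following. The request format, the field names, and the function name `provide_information` are hypothetical; the specification only states that the request carries the requester's ID and that permitted IDs are recorded in the terminal information DB 113.

```python
def provide_information(request, permitted_ids, readings):
    """Handle one information request forwarded by the server.

    request       -- dict with "requester_id" and "item" (assumed format)
    permitted_ids -- IDs allowed to receive information (terminal information DB 113)
    readings      -- dict of values available from memory or sensors
    """
    requester_id = request["requester_id"]             # S222: who is asking
    if requester_id not in permitted_ids:              # S224: permitted party?
        return {"status": "refused"}                   # S230: report the refusal
    item = request["item"]
    value = readings.get(item)                         # S226: fetch the requested data
    return {"status": "ok", "item": item, "value": value}  # S228: reply to the server
```

Returning an explicit refusal rather than silently ignoring the request lets the server generate an appropriate response for the requesting user.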

When this process is completed, the information providing terminal process ends.
In this configuration, for example, as shown in the fourth row of FIG. 5, in response to the question "What is Mr. XX doing?", the server 90 requests position information from Mr. XX's terminal device 1, and this terminal device 1 returns the position information.

The server 90 then recognizes Mr. XX's activity based on the position information. For example, if the terminal is moving along a railway line at a speed faster than a person can run, the server determines that Mr. XX is traveling by train and generates a response such as "Mr. XX is on the train. Mr. XX seems to be on the way home."
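The train example can be sketched as below. The specification does not say how speed or "on a railway line" is obtained, so this sketch assumes two timestamped position fixes, uses the standard haversine great-circle formula to estimate speed, and takes the railway test as a given boolean; the 20 km/h running threshold is likewise an assumed value.

```python
import math

EARTH_RADIUS_KM = 6371.0
RUNNING_SPEED_KMH = 20.0  # assumed upper bound for a person running


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def infer_activity(fix_a, fix_b, on_railway, running_speed_kmh=RUNNING_SPEED_KMH):
    """Infer activity from two fixes given as (lat, lon, time_in_hours)."""
    distance = haversine_km(fix_a[0], fix_a[1], fix_b[0], fix_b[1])
    elapsed = fix_b[2] - fix_a[2]
    if elapsed <= 0:
        return "unknown"
    speed = distance / elapsed
    if on_railway and speed > running_speed_kmh:
        return "on a train"
    return "unknown"
```

Deciding whether a position lies on a railway line would require map data, which is outside the scope of this sketch.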

[Effect of the fifth embodiment]
In the voice response system 100, the server 90 acquires information recorded in another terminal device 1, different from the requesting terminal device 1, from that other terminal device 1 and provides it to another terminal device 1. In other words, in the voice response system 100, the server 90 acquires the information for generating a response to the character information from another terminal device 1.

According to such a voice response system 100, a response can be generated based on information recorded in another terminal device 1.
Further, in the voice response system 100, when the terminal device 1 is requested by another terminal device 1 to provide information for generating a response to character information, it returns the information corresponding to the request.

In this configuration, the terminal device 1 is equipped with sensors for detecting position information, temperature, humidity, illuminance, noise level, and the like, as well as databases such as dictionary information, and extracts the necessary information in response to a request.

According to such a voice response system 100, information unique to another terminal device 1, such as its position, can be acquired. A terminal device 1 can also transmit information unique to itself to another terminal device 1.

[Sixth Embodiment]
[Processing of the sixth embodiment]
Next, the voice response system of the sixth embodiment provides a personality DB 106 in which personality information is recorded; the personality information associates the personality of the user, or of a related person (a person connected with the user), with preset categories. As shown in FIG. 14, for example, the personality DB 106 records the names of the user and related persons in association with their personality categories.

Further, the personality DB 106 shown in FIG. 14 also records the results of personality tests given to the user and related persons. Personality information may be generated using well-known personality analysis techniques (the Rorschach test, the Szondi test, etc.). Aptitude test techniques that companies use in recruitment examinations may also be used.

To generate personality information, the personality information generation process shown in FIG. 15 is performed, for example. The personality information generation process is started when, for example, an instruction to generate personality information is input to the terminal device 1 using the operation unit 70 or the like.

In the personality information generation process, as shown in FIG. 15, the microphone 37 is first turned on (S242), and one of a predetermined set of four-choice questions is output by voice (S244). The four-choice questions may be acquired from the server 90, or questions recorded in advance in the memory 39 may be used.

Subsequently, it is determined whether the subject (the user or a related person) has answered by voice (S246). If there is no answer (S246: NO), the process of S246 is repeated.
If there is an answer (S246: YES), conversation parameters such as sentence endings and speaking speed are extracted (S248), and it is determined whether the current question is the final question (S250). If it is not the final question (S250: NO), the next question is selected (S252), and the process returns to S242.

If it is the final question (S250: YES), a personality analysis based on the answers to the four-choice questions is performed (S254), followed by a personality analysis based on the conversation parameters (S256). The analysis based on conversation parameters can capture tendencies such as these: confident people end their sentences firmly while people lacking confidence end them weakly, and impatient people speak quickly while easygoing people speak slowly.

Subsequently, these personality analysis results are analyzed comprehensively, for example by taking a weighted average (S258), and the subject is assigned to a personality category (S260). Specifically, the subject's personality obtained from the tests is converted into a score, and the subject is assigned to a personality category according to the score.
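The weighted averaging and categorization in S258 and S260 might be sketched as follows. The weight, the 0 to 100 score scale, the category boundaries, and the labels "A" to "D" are illustrative assumptions; the specification gives no concrete values.

```python
def classify_personality(choice_score, conversation_score, weight_choice=0.7,
                         boundaries=((25, "A"), (50, "B"), (75, "C")),
                         top_label="D"):
    """Combine two analysis scores (assumed 0-100) and map the result to a category.

    choice_score       -- score from the four-choice questions (S254)
    conversation_score -- score from the conversation parameters (S256)
    """
    # S258: weighted average of the two analysis results
    total = weight_choice * choice_score + (1.0 - weight_choice) * conversation_score
    # S260: the first boundary that the score falls below decides the category
    for upper, label in boundaries:
        if total < upper:
            return total, label
    return total, top_label
```

With these assumed values, scores of 80 and 40 combine to about 68 and fall into category "C".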

Subsequently, the subject and the personality category are associated with each other (S262) and recorded in the personality DB 106 (S264). That is, the relationship between the subject and the personality category is transmitted to the server 90. At this time, the test results are also transmitted to the server 90, and the server 90 builds the personality DB 106 as shown in FIG. 14. When this process is completed, the personality information generation process ends.

To use the personality DB 106 generated in this way, entries associating personality categories with different responses are prepared in the response candidate DB 105. In the process of S56, the server 90 acquires response candidates representing a plurality of different responses to the character information and selects the response to be output from the candidates according to the personality information; in the processes of S60 and S64, it causes the selected response to be output.
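Selecting a response by personality category can be pictured as a simple table lookup. The category names, the candidate texts, and the fallback to a "default" entry are hypothetical; the specification only says that different responses are associated with personality categories in the response candidate DB 105.

```python
def select_response(candidates, category, default_category="default"):
    """Pick the response registered for the given personality category.

    candidates -- dict mapping personality category -> response text
    category   -- category of the user or related person (personality DB 106)
    """
    if category in candidates:
        return candidates[category]
    # Fall back to a generic response when no category-specific one is registered.
    return candidates[default_category]


# Example candidates for a single question (contents invented for illustration).
EXAMPLE_CANDIDATES = {
    "impatient": "The train leaves in five minutes.",
    "easygoing": "There is a train in five minutes, and another one twenty minutes later.",
    "default": "The next train leaves in five minutes.",
}
```

The same lookup shape would apply when the key is a preference classification instead of a personality category.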

[Effect of the sixth embodiment]
In the voice response system 100, the terminal device 1 generates personality information of the user or a related person based on answers to a plurality of preset questions, and acquires the generated personality information.

According to such a voice response system 100, personality information can be generated in the server 90 or the terminal device 1.
Further, in the voice response system 100, the calculation unit 101 generates personality information of the user or a related person based on character strings included in the input character information.

According to such a voice response system 100, personality information can be generated while the user is using the voice response system 100.
Further, according to such a voice response system 100, different responses can be given depending on the personality of the user or of a person related to the user (a related person). This makes the system easier for the user to use.

[Modified example of the sixth embodiment]
In the sixth embodiment, the responses may be narrowed down to one according to personality before output, or a plurality of responses may be output, each in a different tone of voice.

Further, of the personality information generation process, the processes of S248 and S254 to S264 may be performed in the server 90. In this case, as in the first embodiment and others, voice and questions may be exchanged between the terminal device 1 and the server 90 while the server 90 identifies the terminal device 1.

Further, in the voice response system 100, the server 90 may detect either the user's behavior or the user's operations and generate learning information or personality information based on what is detected.

According to such a voice response system 100, for example, when it is detected that the user has dashed onto a train at the last moment several days in a row, the system can urge the user to leave home a few minutes earlier from the next day; and when it is detected from conversation that the user tends to get angry easily, the system can output voice or music that calms the user's mood.

[Seventh Embodiment]
[Processing of the seventh embodiment]
Next, the voice response system of the seventh embodiment provides a preference DB 108 in which preference information is recorded; the preference information associates the preferences of the user and related persons with preset categories. As shown in FIG. 16, for example, the preference DB 108 records the names of the user and related persons in association with their preferences for each preference type, such as food preference (food), color preference (color), and hobbies.

Specifically, food preferences are classified as a liking for sweet food (sweet), a liking for spicy food (spicy), or in between (average); color preferences as warm colors (warm), cool colors (cool), or in between (average); and hobbies as indoor hobbies (indoor), outdoor hobbies (outdoor), or both indoor and outdoor hobbies (both).

To build such a preference DB 108, the preference information generation process shown in FIG. 17 is executed, for example. The preference information generation process is performed, for example, between S48 and S54.
Specifically, as shown in FIG. 17, keywords related to preferences are extracted from the character information (S282), and objects related to preferences are extracted from among the objects identified by image processing (S284). In the preference DB 108, preference-related keywords are associated with a preference type and a classification within that type (for food preference, sweet, average, spicy, etc.); in these processes, an extracted keyword or object is treated as preference-related if it is included in the preference DB 108.

Subsequently, a counter is incremented for each group of preference-related keywords (S288). For example, when an item such as kimchi, whose preference type is "food preference" and whose classification is "spicy", is extracted, the counter corresponding to "food preference" and "spicy" is incremented.

Then, the preference information (preference DB 108) is updated based on the counter values (S290). That is, for each preference type, the classification with the largest counter value is regarded as best matching the person's preference and is recorded in the preference DB 108 as a characteristic of the preferences of the user or related person. When this process is completed, the preference information generation process ends.
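The counting scheme of S288 and S290 can be sketched with per-type counters. The keyword table below and the function name `update_preferences` are invented for illustration; only the kimchi example and the rule of taking the largest counter per preference type come from the text.

```python
from collections import Counter, defaultdict

# Hypothetical keyword table: item -> (preference type, classification).
KEYWORD_GROUPS = {
    "kimchi": ("food", "spicy"),
    "cake": ("food", "sweet"),
    "camping": ("hobby", "outdoor"),
}


def update_preferences(extracted_items, counters, keyword_groups=KEYWORD_GROUPS):
    """Increment counters for extracted items and return the leading classification per type.

    counters -- defaultdict(Counter) that persists between calls
    """
    for item in extracted_items:
        group = keyword_groups.get(item)
        if group is None:
            continue                                    # not preference-related
        pref_type, classification = group
        counters[pref_type][classification] += 1        # S288
    # S290: for each preference type, keep the classification with the largest count
    return {t: c.most_common(1)[0][0] for t, c in counters.items() if c}


counters = defaultdict(Counter)
profile = update_preferences(["kimchi", "kimchi", "cake", "camping", "weather"], counters)
```

Because the counters persist across calls, the recorded preference can shift over time as new keywords and objects are observed.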

To use the preference DB 108 generated in this way, entries associating a different response with each preference are prepared in the response candidate DB 105. In the process of S56, the server 90 acquires response candidates representing a plurality of different responses to the character information and selects the response to be output from the candidates according to the preference information; in the processes of S60 and S64, it causes the selected response to be output.

[Effect of the seventh embodiment]
In the voice response system 100, the server 90 generates preference information indicating the preference tendencies of the user or a related person based on character strings included in the character information. It then selects the response to be output from the response candidates based on the preference information and causes the selected response to be output.

According to such a voice response system 100, a response can be given according to the preferences of the user or a related person. For example, when the user is buying a present for a related person and asks the terminal device 1 "What would Mr. XX like?", a response based on the preference information can be obtained.

[Modified example of the seventh embodiment]
As shown in FIG. 18, the response candidate DB 105 may hold a table in which personality categories and preference information are associated with each other.

For example, in the example shown in FIG. 18, personality categories are associated with color preferences, and products that a woman can be presumed to be pleased to receive as a gift are arranged in a matrix.
In the process of S56, a response can thus be generated taking both personality and preferences into account.

[Eighth Embodiment]
[Processing of the eighth embodiment]
In the above embodiments, voice is converted into character information, but movements made by the user may also be converted into character information.

Specifically, the terminal device 1 captures the user's movements as captured images and transmits them to the server 90, and the server 90 performs, for example, the motion character input process shown in FIG. 19. The motion character input process is started when a part of the user's body appears in the captured image in the process of S48.

In the motion character input process, as shown in FIG. 19, a captured image is first acquired (S302). Then, it is determined whether the user is trying to input characters by handwriting or by sign language (S304, S308).

これらの処理では、例えば、撮像画像に使用者の上半身が顔とともに映っている場合には、手話で文字を入力しようとしていると判定し、撮像画像に使用者の顔が映ることなく使用者の手が映っている場合には、手書きで文字を入力しようとしていると判定する。 In these processes, for example, when the captured image shows the user's upper body together with the face, it is determined that the user is trying to input characters in sign language, and when the captured image shows the user's hand without the user's face, it is determined that the user is trying to input characters by handwriting.

手書きで文字を入力しようとしてれば（Ｓ３０４：ＹＥＳ）、指先またはペン先の挙動を記録し（Ｓ３０６）、この挙動に基づいて挙動を文字情報に変換する（Ｓ３１２）。ここで、手書き文字・手話ＤＢ１１２には、文字を手書きする際の挙動と文字とが対応付けられており、また、手の動きと手話により表現される文字とが対応付けられている。Ｓ３１２の処理では、手書き文字・手話ＤＢ１１２を参照することによって、文字情報を生成する。 If a character is to be input by handwriting (S304: YES), the behavior of the fingertip or the pen tip is recorded (S306), and the behavior is converted into character information based on this behavior (S312). Here, in the handwritten character / sign language DB 112, the behavior when handwriting the character and the character are associated with each other, and the movement of the hand and the character expressed by the sign language are associated with each other. In the process of S312, character information is generated by referring to the handwritten character / sign language DB 112.

また、手話で文字を入力しようとしてれば（Ｓ３０４：ＮＯ、Ｓ３０８：ＹＥＳ）、手書き文字・手話ＤＢ１１２を参照して手話内容を認識し、前述のＳ３１２の処理を実施する。また、手書きや手話で文字を入力しようとしてなければ（Ｓ３０８：ＮＯ）、他の方式による入力の処理を行う（Ｓ３１４）。 Further, if an attempt is made to input a character in sign language (S304: NO, S308: YES), the sign language content is recognized with reference to the handwritten character / sign language DB 112, and the above-mentioned processing of S312 is performed. Further, if no attempt is made to input characters by handwriting or sign language (S308: NO), input processing by another method is performed (S314).
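
The branching in S304 and S308 can be sketched as below. This is a minimal sketch that assumes an upstream image-recognition step has already produced three boolean flags; the function name and flag names are hypothetical and do not appear in the original description.

```python
def classify_input_mode(face_visible, upper_body_visible, hand_visible):
    """Decide which input method the user is attempting (S304, S308).

    The three flags are assumed to come from a separate image-recognition
    step that is not shown here.
    """
    # S304: a hand without the face suggests handwriting.
    if hand_visible and not face_visible:
        return "handwriting"      # continue with S306, then S312
    # S308: upper body together with the face suggests sign language.
    if face_visible and upper_body_visible:
        return "sign_language"    # continue with S312
    # S308: NO, fall through to another input method (S314).
    return "other"
```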

続いて、動作によって入力された文字と音声によって入力された文字とを対応付け、類似性がある音声があるか否か（文字に基づく基準波形と発音波形との一致度が基準値以上か否か）を判定する（Ｓ３１６）。このような音声入力があれば（Ｓ３１６：ＹＥＳ）、この使用者がこの文字を入力するときのアクセントや発音の特徴を、文字と対応付けて学習ＤＢ１０７に記録し（Ｓ３１８）、動作文字入力処理を終了する。 Then, the characters input by the action are associated with the characters input by voice, and it is determined whether or not there is a similar voice (whether or not the degree of matching between the reference waveform based on the characters and the pronunciation waveform is equal to or higher than a reference value) (S316). If there is such a voice input (S316: YES), the accent and pronunciation characteristics of this user when inputting these characters are recorded in the learning DB 107 in association with the characters (S318), and the operation character input process is terminated.

また、このような音声入力がなければ（Ｓ３１６：ＮＯ）、動作文字入力処理を終了する。
［第８実施形態による効果］
上記音声応答システム１００においては、使用者による動作を文字情報に変換するので、使用者が声を出すことなく文字情報を入力することができる。 If there is no such voice input (S316: NO), the operation character input process is terminated.
[Effect of the eighth embodiment]
In the voice response system 100, since the operation by the user is converted into character information, the user can input the character information without speaking out.

［第８実施形態の変形例］
本実施形態の動作としては、文字の手書き、或いは身振り手振り（例えば手話）だけでなく筋肉の動作に起因するものであればよい。 [Variation example of the eighth embodiment]
The motion used in the present embodiment is not limited to handwriting of characters or gestures (for example, sign language); any motion resulting from muscle movement may be used.

［第９実施形態］
［第９実施形態の処理］
学習ＤＢ１０７の内容は、使用者が普段利用する端末装置１とは別の他の端末装置１を利用する場合に、この他の端末装置１において利用できるようにしてもよい。この場合、他の端末装置１から、利用要求とともに、普段利用する端末装置１のＩＤとパスワードをサーバ９０に対して送信する。 [9th Embodiment]
[Processing of the ninth embodiment]
The contents of the learning DB 107 may be made available in the other terminal device 1 when the other terminal device 1 different from the terminal device 1 normally used by the user is used. In this case, the other terminal device 1 transmits the ID and password of the terminal device 1 that is normally used to the server 90 together with the usage request.

そして、サーバ９０では、図２０に示す他端末利用処理を実行する。他端末利用処理は、利用要求を受けると開始される処理である。
他端末利用処理では、図２０に示すように、まず、ＩＤとパスワードが入力されたか否かを判定する（Ｓ３３２）。ＩＤとパスワードが入力されていなければ（Ｓ３３２：ＮＯ）、Ｓ３３２の処理を繰り返す。 Then, the server 90 executes the other terminal use process shown in FIG. 20. The other terminal use process is started when a usage request is received.
In the process of using another terminal, as shown in FIG. 20, first, it is determined whether or not the ID and the password have been input (S332). If the ID and password have not been entered (S332: NO), the process of S332 is repeated.

また、ＩＤとパスワードが入力されていれば（Ｓ３３２：ＹＥＳ）、ＩＤとパスワードによる認証が完了したか否かを判定する（Ｓ３３４）。認証が完了していれば（Ｓ３３４：ＹＥＳ）、認証が完了した旨を他の端末装置１に送信し（Ｓ３３６）、他の端末装置１がＩＤとパスワードが対応する端末装置１の学習ＤＢ１０７を利用するよう設定する（Ｓ３３８）。 Further, if the ID and the password have been input (S332: YES), it is determined whether or not the authentication by the ID and the password is completed (S334). If the authentication is completed (S334: YES), the fact that the authentication is completed is transmitted to the other terminal device 1 (S336), and the other terminal device 1 is set to use the learning DB 107 of the terminal device 1 corresponding to the ID and the password (S338).
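
The other terminal use process (S332 to S340) can be sketched roughly as follows. The credential store, the session dictionary, and the return strings are all hypothetical; the description does not specify how credentials are stored or compared, so the plain in-memory store below is an assumption made only to keep the sketch short.

```python
import hmac

# Hypothetical credential store keyed by the ID of the usual terminal device.
# A real system would store password hashes rather than plain text.
CREDENTIALS = {"terminal-A": "secret"}


def handle_use_request(terminal_id, password, session):
    """One pass of the other terminal use process.

    `session` is a hypothetical dict holding settings for the other terminal.
    """
    # S332: wait until both the ID and the password have been input.
    if terminal_id is None or password is None:
        return "waiting"
    # S334: authenticate with the ID and the password.
    stored = CREDENTIALS.get(terminal_id)
    if stored is None or not hmac.compare_digest(stored.encode(), password.encode()):
        return "error"                       # S340: report the error
    # S338: let the other terminal use the learning DB of the usual terminal.
    session["learning_db_owner"] = terminal_id
    return "authenticated"                   # S336: report success
```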

認証が完了しなければ（Ｓ３３４：ＮＯ）、エラーである旨を他の端末装置１に送信し（Ｓ３４０）、他端末利用処理を終了する。
［第９実施形態による効果］
また、上記音声応答システム１００においてサーバ９０は、ある端末装置１の学習情報を他の端末装置１に転送する。 If the authentication is not completed (S334: NO), an error is transmitted to the other terminal device 1 (S340), and the processing for using the other terminal is terminated.
[Effect of the ninth embodiment]
Further, in the voice response system 100, the server 90 transfers the learning information of one terminal device 1 to another terminal device 1.

このような音声応答システム１００によれば、ある端末装置１を利用する使用者が他の端末装置１を利用する場合においても、ある端末装置１で記録された学習情報（サーバ９０に記録された学習情報）を利用することができる。よって、他の端末装置１を利用する場合においても文字情報の生成精度を向上させることができる。特に、使用者が端末装置１を複数所持する場合に有効である。 According to such a voice response system 100, even when a user of one terminal device 1 uses another terminal device 1, the learning information recorded for the one terminal device 1 (the learning information recorded in the server 90) can be used. Therefore, the accuracy of generating character information can be improved even when the other terminal device 1 is used. This is particularly effective when the user owns a plurality of terminal devices 1.

さらに、上記音声応答システム１００においてサーバ９０は、使用者以外の者から問い合わせに対して使用者についての情報を出力する。
このような音声応答システム１００によれば、例えば、使用者の食事内容な散歩の距離などを検出しておけば、病院等での質問に使用者に代わって回答することができる。また、健康状態や自己紹介など学習しておくようにしてもよい。 Further, in the voice response system 100, the server 90 outputs information about the user in response to an inquiry from a person other than the user.
According to such a voice response system 100, for example, if the user's meal contents, walking distance, and the like are detected in advance, questions asked at a hospital or the like can be answered on the user's behalf. The system may also learn the user's health condition, self-introduction, and the like in advance.

［第９実施形態の変形例］
第９実施形態の構成と同様に、利用を終了する要求と、ＩＤおよびパスワードを受けると、ＩＤおよびパスワードが対応する端末装置１に対する学習ＤＢ１０７の利用を終了（禁止）するようにしてもよい。 [Variation example of the ninth embodiment]
Similar to the configuration of the ninth embodiment, when the request for terminating the use and the ID and the password are received, the use of the learning DB 107 for the terminal device 1 corresponding to the ID and the password may be terminated (prohibited).

［第１０実施形態］
［第１０実施形態の処理］
第１０実施形態の音声応答システムでは、サーバ９０が、会話内容を記憶し、聞いた内容について同じ内容を得るための質問をする。詳細には、図７に示す自動会話サーバ処理のＳ１００において、図２１に示す記憶確認処理を実行する。 [10th Embodiment]
[Processing of the tenth embodiment]
In the voice response system of the tenth embodiment, the server 90 stores conversation contents and asks questions intended to elicit the same contents it previously heard. Specifically, in S100 of the automatic conversation server process shown in FIG. 7, the memory confirmation process shown in FIG. 21 is executed.

記憶確認処理では、図２１に示すように、過去の会話内容を学習ＤＢ１０７から抽出し（Ｓ３５２）、このうちの何れかの会話内容に含まれるキーワードを解答とする質問を生成する（Ｓ３５３）。このような処理が終了すると記憶確認処理を終了する。 In the memory confirmation process, as shown in FIG. 21, past conversation contents are extracted from the learning DB 107 (S352), and a question having a keyword included in any of the conversation contents as an answer is generated (S353). When such a process is completed, the memory confirmation process is terminated.

記憶確認処理では、例えば、「昨日の夕食のメニューは何でしたか」や、「３日前にどこに出かけましたか」などと質問すればよい。
［第１０実施形態による効果］
このような音声応答システム１００によれば、使用者の記憶力の確認をするとともに、記憶の定着を図ることができる。高齢者の認知症の進行を抑制するためにも有効であると考えられる。 In the memory confirmation process, you can ask, for example, "What was the menu for dinner yesterday?" Or "Where did you go three days ago?"
[Effect of the tenth embodiment]
According to such a voice response system 100, the user's memory can be checked and memory retention can be promoted. This is also considered effective for slowing the progression of dementia in elderly people.

［第１１実施形態］
［第１１実施形態の処理］
次に、第１１実施形態の音声応答システムでは、端末装置１およびサーバ９０を利用して使用者が外国語の練習を行えるよう構成している。 [11th Embodiment]
[Processing of the eleventh embodiment]
Next, the voice response system of the eleventh embodiment is configured so that the user can practice a foreign language by using the terminal device 1 and the server 90.

詳細には、図２２に示す発音判定処理１と図２３に示す発音判定処理２と図２４に示す発音判定処理３とを順に実行する。ただし、サーバ９０は、音声応答サーバ処理（図２）の実施毎に、発音判定処理１〜３の各処理のうちの１つを実行する。また、発音判定処理１〜３の各処理は、前述のＳ５６の処理として実行される。 Specifically, the pronunciation determination process 1 shown in FIG. 22, the pronunciation determination process 2 shown in FIG. 23, and the pronunciation determination process 3 shown in FIG. 24 are executed in order. However, the server 90 executes one of the pronunciation determination processes 1 to 3 each time the voice response server process (FIG. 2) is executed. Further, each of the pronunciation determination processes 1 to 3 is executed as the process of S56 described above.

まず、発音判定処理１では、図２２に示すように、所定の文章を音声で入力するよう指示する旨の応答を生成する（Ｓ３６２）。この処理では、例えば、外国語の手本となる文章を生成し、この文章を手本に続いて真似て話すよう促す。この処理が終了すると、発音判定処理１を終了する。 First, in the pronunciation determination process 1, as shown in FIG. 22, a response instructing the user to input a predetermined sentence by voice is generated (S362). In this process, for example, a model sentence in a foreign language is generated, and the user is prompted to repeat it after the model. When this process is completed, the pronunciation determination process 1 is terminated.

次に、発音判定処理１に伴って音声が入力されると、発音判定処理２を実施する。発音判定処理２では、図２３に示すように、発音およびアクセントの正確性をスコア（点数）化する（Ｓ３７２）。この処理では、音声は波形として捉え、このとき手本となる文章を波形としたときとの波形の一致度合をスコア化する。 Next, when a voice is input following the pronunciation determination process 1, the pronunciation determination process 2 is performed. In the pronunciation determination process 2, as shown in FIG. 23, the accuracy of pronunciation and accent is converted into a score (S372). In this process, the voice is treated as a waveform, and the degree to which it matches the waveform of the model sentence is scored.

そして、このスコアをメモリに記録し（Ｓ３７４）、発音判定処理２を終了する。続いて、発音判定処理３を実施する。発音判定処理３では、図２４に示すように、まず、スコアが閾値未満であるか否かを判定する（Ｓ３８２）。 Then, this score is recorded in the memory (S374), and the pronunciation determination process 2 is terminated. Subsequently, the pronunciation determination process 3 is performed. In the pronunciation determination process 3, as shown in FIG. 24, first, it is determined whether or not the score is less than the threshold value (S382).

スコアが閾値未満であれば（Ｓ３８２：ＹＥＳ）、再度、同様の文章を入力するよう指示する旨の応答を生成する（Ｓ３８４）。この処理では、例えば、再度、手本に続いて真似て話すよう促すための応答を生成する。 If the score is less than the threshold value (S382: YES), a response to instruct to input the same sentence again is generated (S384). In this process, for example, a response is generated to encourage the user to imitate and speak again following the example.

また、スコアが閾値以上であれば（Ｓ３８２：ＮＯ）、発音がよかった旨および次の文章を入力するよう促す応答を生成する（Ｓ３８６）。例えば、「よい発音です。次に進みましょう。」などと応答を生成する。 Further, if the score is equal to or higher than the threshold value (S382: NO), a response indicating that the pronunciation is good and prompting to input the next sentence is generated (S386). For example, generate a response such as "Good pronunciation. Let's move on."
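
The scoring in S372 and the threshold branch in S382 to S386 can be sketched as below. The description only says that the degree of waveform matching is scored; the normalized correlation at zero lag used here, the 0 to 100 scale, and the threshold of 70 are all assumptions chosen for illustration, and a practical system would at least align the two waveforms in time first.

```python
import math


def match_score(reference, spoken):
    """Score how well two sampled waveforms match, on a 0..100 scale.

    Uses normalized correlation at zero lag (an assumed metric); the
    longer waveform is truncated to the length of the shorter one.
    """
    n = min(len(reference), len(spoken))
    ref, spk = reference[:n], spoken[:n]
    dot = sum(r * s for r, s in zip(ref, spk))
    norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(s * s for s in spk))
    if norm == 0:
        return 0.0
    return max(0.0, dot / norm) * 100.0


def feedback(score, threshold=70.0):
    """Generate the response of pronunciation determination process 3."""
    if score < threshold:                                   # S382: YES
        return "Please repeat the sentence after the model once more."  # S384
    return "Good pronunciation. Let's move on."             # S386
```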

このような処理が終了すると、発音判定処理３を終了する。
［第１１実施形態による効果］
上記音声応答システム１００においてサーバ９０は、使用者が入力する音声の発音やアクセントの正確度合を検出し、検出した正確度合を出力する。 When such a process is completed, the pronunciation determination process 3 is terminated.
[Effect of the 11th embodiment]
In the voice response system 100, the server 90 detects the accuracy of the pronunciation and accent of the voice input by the user, and outputs the detected accuracy.

このような音声応答システム１００によれば、発音やアクセントの正確性を確認することができる。例えば外国語の練習を行う際に有効である。
さらに、上記音声応答システム１００においてサーバ９０は、正確度合が一定値以下の場合に、再度、同じ質問を出力させる。 According to such a voice response system 100, the accuracy of pronunciation and accent can be confirmed. For example, it is effective when practicing a foreign language.
Further, in the voice response system 100, the server 90 outputs the same question again when the accuracy is equal to or less than a certain value.

このような音声応答システム１００によれば同じ質問を出力することによって正確な回答を求めることができる。
［第１１実施形態の変形例］
上記音声応答システム１００においてサーバ９０は、正確度合が一定値以下の場合に、確認のために、使用者が行った発音に最も近い単語を含む音声を出力するようにしてもよい。 According to such a voice response system 100, an accurate answer can be obtained by outputting the same question.
[Modified example of the eleventh embodiment]
In the voice response system 100, when the accuracy is equal to or less than a certain value, the server 90 may output a voice including the word closest to the pronunciation made by the user for confirmation.

このような音声応答システム１００によれば、使用者が発音やアクセントの正確性を確認することができる。
［第１２実施形態］
［第１２実施形態の処理］
次に第１２実施形態の音声応答システムについて説明する。第１２実施形態の音声応答システムでは、使用者が入力した音声から使用者の感情を検出し、感情に応じて使用者を癒す応答を生成する。 According to such a voice response system 100, the user can confirm the accuracy of pronunciation and accent.
[12th Embodiment]
[Processing of the twelfth embodiment]
Next, the voice response system of the twelfth embodiment will be described. In the voice response system of the twelfth embodiment, the user's emotion is detected from the voice input by the user, and a response that heals the user is generated according to the emotion.

詳細には、図２５に示す感情判定処理と、図２６に示す感情応答生成処理とを実行する。感情判定処理は、前述のＳ５０の処理の詳細として実施され、図２５に示すように、まず、声色、文章の語尾の強弱、一文の長さ、会話スピード、思いがけず出る言葉等から感情をスコア化する（Ｓ３９２）、続いて、スコアによって感情を分類し、メモリに記録する（Ｓ３９４）。 Specifically, the emotion determination process shown in FIG. 25 and the emotion response generation process shown in FIG. 26 are executed. The emotion determination process is performed as the details of the above-mentioned process of S50. As shown in FIG. 25, first, the emotion is scored from the tone of voice, the emphasis at the ends of sentences, the length of each sentence, the conversation speed, words uttered involuntarily, and the like (S392); then the emotion is classified according to the score and recorded in the memory (S394).

このような処理が終了すると、感情判定処理を終了する。続いて、前述のＳ５６の処理において、感情応答生成処理を実行する。
詳細には、図２６に示すように、まず、感情判定処理にて設定された感情区分を判定する（Ｓ４１２）。感情区分が通常であれば（Ｓ４１２：通常）、「こんにちは」等の普通の挨拶文を応答（メッセージ）として生成する（Ｓ４１４）。 When such a process is completed, the emotion determination process is terminated. Subsequently, in the above-mentioned process of S56, the emotion response generation process is executed.
Specifically, as shown in FIG. 26, first, the emotion classification set in the emotion determination process is determined (S412). If the emotion classification is normal (S412: normal), a normal greeting such as "hello" is generated as a response (message) (S414).

また、感情区分が怒りであれば（Ｓ４１２：怒り）、「お気に障りましたか」等、相手の感情を落ち着かせる際の文章を応答として生成する（Ｓ４１６）。さらに、感情区分が喜びであれば（Ｓ４１２：喜び）、「今日は楽しいですね」等、普通の挨拶文と比較して明るいニュアンスの挨拶文を応答として生成する（Ｓ４１８）。 If the emotion classification is anger (S412: anger), a sentence for calming the other party's emotions, such as "Did I offend you?", is generated as a response (S416). Further, if the emotion classification is joy (S412: joy), a greeting with a brighter nuance than a normal greeting, such as "It's a fun day, isn't it?", is generated as a response (S418).

また、感情区分が困惑であれば（Ｓ４１２：困惑）、「どうかしましたか」等、相手を気遣う際の挨拶文を応答として生成する（Ｓ４２０）。このような処理が終了すると、感情応答生成処理を終了する。 If the emotion classification is confusion (S412: confusion), a greeting that shows concern for the other party, such as "Is something wrong?", is generated as a response (S420). When such a process is completed, the emotion response generation process is terminated.
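
The branch on the emotion classification (S412 to S420) amounts to a lookup from classification to response text, which can be sketched as follows. The category keys and the fallback to the normal greeting for an unknown category are assumptions; the response strings are English renderings of the examples given above.

```python
# Mapping from emotion classification to response (S414 to S420).
# The key names are hypothetical labels for the four classifications.
EMOTION_RESPONSES = {
    "normal": "Hello.",                      # S414
    "anger": "Did I offend you?",            # S416
    "joy": "It's a fun day, isn't it?",      # S418
    "confusion": "Is something wrong?",      # S420
}


def emotion_response(category):
    """Return the response for the classification recorded in S394.

    Unknown categories fall back to the normal greeting (an assumption).
    """
    return EMOTION_RESPONSES.get(category, EMOTION_RESPONSES["normal"])
```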

［第１２実施形態による効果］
上記音声応答システム１００においてサーバ９０は、思いがけずに発する音声を検出することによって使用者の苛立ちや動揺を検出し、苛立ちや動揺を抑制するためのメッセージを生成する。 [Effects of the 12th Embodiment]
In the voice response system 100, the server 90 detects annoyance or agitation of the user by detecting an unexpectedly emitted voice, and generates a message for suppressing the annoyance or the agitation.

このような音声応答システム１００によれば、使用者に苛立ちや動揺がある場合に、これらを抑制することができる。よって、使用者と周囲とのトラブルの発声を抑制することができる。 According to such a voice response system 100, when the user is irritated or upset, these feelings can be suppressed. Therefore, the occurrence of trouble between the user and those around the user can be suppressed.

［第１３実施形態］
［第１３実施形態の処理］
次に第１３実施形態の音声応答システムについて説明する。第１３実施形態の音声応答システムでは、撮像画像中の物体まで使用者を案内する処理を行う。この処理はサーバ９０において前述のＳ５６の処理の詳細として実施される。 [13th Embodiment]
[Processing of the thirteenth embodiment]
Next, the voice response system of the thirteenth embodiment will be described. In the voice response system of the thirteenth embodiment, a process of guiding the user to an object in the captured image is performed. This process is performed on the server 90 as the details of the process of S56 described above.

端末装置１において「見えているタワーまで道案内してください」などと音声で入力すると、Ｓ５６の処理では案内処理が実施される。案内処理では、図２７に示すように、まず、端末位置情報を端末装置１のＧＰＳ受信機２７等から取得する（Ｓ４３２）。 When a voice input such as "Please guide me to the tower I can see" is made on the terminal device 1, the guidance process is performed in the process of S56. In the guidance process, as shown in FIG. 27, first, the terminal position information is acquired from the GPS receiver 27 or the like of the terminal device 1 (S432).

そして、音声（文字情報）と画像処理とに基づいて、撮像画像中の物体のうちから対象となる物体を特定し、この位置を特定する（Ｓ４３４）。この処理では、物体の形状、相対的な位置等に基づいて地図情報（外部から取得してもよいし、サーバ９０が保持していてもよい）において物体の位置を特定する。例えば、撮像画像中にタワーが映っていた場合、端末装置１の位置とタワーの形状とから、そのタワーを地図上において特定する。 Then, based on the voice (character information) and the image processing, the target object is specified from the objects in the captured image, and this position is specified (S434). In this process, the position of the object is specified in the map information (which may be acquired from the outside or may be held by the server 90) based on the shape of the object, the relative position, and the like. For example, when a tower is shown in the captured image, the tower is specified on the map from the position of the terminal device 1 and the shape of the tower.

続いて、この物体までの経路を検索し（Ｓ４３６）、経路情報を取得する（Ｓ４３８）。この処理は周知のクラウド方式のナビゲーション装置における処理と同様の処理を用いて実現できる。 Subsequently, the route to this object is searched (S436), and the route information is acquired (S438). This process can be realized by using the same process as that in a well-known cloud-type navigation device.

そして、経路を案内するための応答を生成する（Ｓ４４０）。この処理においても、ナビゲーション装置による案内と同様の応答を生成すればよい。
このような処理が終了すると、案内処理を終了する。なお、使用者が移動しながら案内処理を実施する際には、自動会話サーバ処理を利用して、使用者が案内すべきポイントに到達することを再生条件としてメッセージを再生すればよい。 Then, a response for guiding the route is generated (S440). Also in this process, it is sufficient to generate a response similar to the guidance provided by the navigation device.
When such a process is completed, the guidance process is terminated. When the guidance process is performed while the user is moving, the automatic conversation server process may be used to reproduce the message on the condition that the user reaches the point to be guided.
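
The playback condition mentioned above, that the user has reached a point where guidance should be given, can be sketched as a distance check on GPS coordinates. The haversine great-circle formula and the 30 m radius are assumptions; the description does not say how arrival at a guidance point is judged.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def should_play(user_pos, guide_point, radius_m=30.0):
    """True when the user is within `radius_m` (assumed value) of the point."""
    return distance_m(*user_pos, *guide_point) <= radius_m
```

Each guidance message would then be registered with `should_play` as its playback condition and evaluated whenever a new terminal position arrives.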

［第１３実施形態による効果］
上記音声応答システム１００においてサーバ９０は、文字情報が入力された際に、当該音声応答システム１００の周囲を撮像した撮像画像に応じた応答を生成し、この応答を音声で出力させる。 [Effect of the thirteenth embodiment]
In the voice response system 100, when the character information is input, the server 90 generates a response corresponding to the captured image captured around the voice response system 100, and outputs this response by voice.

このような音声応答システム１００によれば、撮像画像に応じて応答を音声で出力することができる。したがって、文字情報のみから応答を生成する構成と比較して使い勝手を向上させることができる。 According to such a voice response system 100, a response can be output by voice according to the captured image. Therefore, usability can be improved as compared with a configuration in which a response is generated only from character information.

また、上記音声応答システム１００においてサーバ９０は、文字情報に含まれる物体を撮像画像中から画像処理によって検索し、該検索された物体の位置を特定し、この物体の位置まで案内する。 Further, in the voice response system 100, the server 90 searches the captured image by image processing for an object included in the character information, identifies the position of the found object, and guides the user to the position of this object.

このような音声応答システム１００によれば、撮像画像中の物体まで使用者を案内することができる。
さらに、上記音声応答システム１００においてサーバ９０は、目的地までの案内を行う場合において、目的地までの天気、温度、湿度、交通情報、路面状態等の経路情報を取得し、経路情報を音声で出力させる。 According to such a voice response system 100, the user can be guided to an object in the captured image.
Further, in the voice response system 100, when providing guidance to a destination, the server 90 acquires route information such as the weather, temperature, humidity, traffic information, and road surface conditions on the way to the destination, and causes the route information to be output by voice.

このような音声応答システム１００によれば、目的地までの状況（経路情報）を使用者に音声で通知することができる。
［第１３実施形態の変形例］
上記構成に加えて、認識したものが何かを応答するよう文字情報を入力し、撮像画像から認識したものが何か（誰か）を音声で出力するようにしてもよい。 According to such a voice response system 100, the situation (route information) to the destination can be notified to the user by voice.
[Modified example of the thirteenth embodiment]
In addition to the above configuration, character information asking what a recognized object is may be input, and what (or who) has been recognized from the captured image may be output by voice.

さらに、上記音声応答システム１００においてサーバ９０は、Ｓ４８の処理に換えて、文字情報を音声で入力する際において使用者の口の形状を撮像した動画像を取得してもよい。この場合、Ｓ５２の処理に換えて、音声を文字情報に変換し、かつ、該動画像に基づいて、音声の不明確な部分を推定して文字情報を補正してもよい。 Further, in the voice response system 100, the server 90 may acquire a moving image of the shape of the user's mouth when inputting character information by voice instead of the processing of S48. In this case, instead of the processing of S52, the voice may be converted into character information, and the unclear part of the voice may be estimated and the character information may be corrected based on the moving image.

このような音声応答システム１００によれば、口の形状から発声内容を推定することできるので、音声の不明確な部分を良好に推定することができる。
［第１４実施形態］
［第１４実施形態の処理］
次に第１４実施形態の音声応答システムについて説明する。第１４実施形態の音声応答システムでは、使用者に所定の動作を要求し、この要求通りに使用者が動作を行ったかどうかを判定する。この構成では、図６に示す自動会話端末処理、および図７に示す自動会話サーバ処理において、前述のＳ５６の処理の詳細として図２８に示す移動要求処理１および図２９に示す移動要求処理２が順に実施される。 According to such a voice response system 100, since the utterance content can be estimated from the shape of the mouth, an unclear part of the voice can be estimated satisfactorily.
[14th Embodiment]
[Processing of the 14th embodiment]
Next, the voice response system of the 14th embodiment will be described. In the voice response system of the 14th embodiment, a predetermined motion is requested from the user, and it is determined whether or not the user has performed the motion as requested. In this configuration, in the automatic conversation terminal process shown in FIG. 6 and the automatic conversation server process shown in FIG. 7, the move request process 1 shown in FIG. 28 and the move request process 2 shown in FIG. 29 are carried out in order as the details of the process of S56 described above.

初めにＳ５４の処理が終了すると移動要求処理１が開始され、移動要求処理１では、図２８に示すように、所定の位置に視線や頭を移動させるよう指示する旨の応答（メッセージ）を出力する（Ｓ４５２）。この処理が終了すると、移動要求処理１を終了する。 First, when the processing of S54 is completed, the movement request processing 1 is started, and in the movement request processing 1, as shown in FIG. 28, a response (message) instructing to move the line of sight or the head to a predetermined position is output. (S452). When this process is completed, the move request process 1 is terminated.

続いて、次にＳ５４の処理が終了すると移動要求処理２が開始され、移動要求処理２では、図２９に示すように、指示通りに視線や頭の位置が移動したか否かを判定する（Ｓ４６２）。この処理では、カメラによる撮像画像を画像処理することや、端末装置１の各種センサによる検出結果を用いて使用者の動作を検出する。なお、画像処理によって視線を検出する場合には、周知の視線認識の技術を採用すればよい。 Subsequently, when the processing of S54 is next completed, the move request process 2 is started. In the move request process 2, as shown in FIG. 29, it is determined whether or not the line of sight or the position of the head has moved as instructed (S462). In this process, the user's motion is detected by image processing of the image captured by the camera and by using the detection results of the various sensors of the terminal device 1. When the line of sight is detected by image processing, a well-known line-of-sight recognition technique may be adopted.

指示通りに視線や頭が移動していなければ（Ｓ４６２：ＮＯ）、再度、Ｓ４５２にて生成した応答を出力する（Ｓ４６４）。また、指示通りに視線や頭が移動していれば（Ｓ４６２：ＹＥＳ）、別の任意の応答を生成する（Ｓ４６６）。 If the line of sight or head does not move as instructed (S462: NO), the response generated in S452 is output again (S464). Further, if the line of sight or the head is moving as instructed (S462: YES), another arbitrary response is generated (S466).
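
The check in S462 and the two outcomes S464 and S466 can be sketched as follows. The sketch assumes that a separate gaze-recognition step already yields a gaze point in normalized image coordinates; the coordinate convention, the tolerance of 0.1, and the function names are hypothetical.

```python
import math


def gaze_on_target(gaze_xy, target_xy, tolerance=0.1):
    """True when the detected gaze point lies within `tolerance` of the target.

    Coordinates are assumed to be normalized to the range 0..1.
    """
    return math.dist(gaze_xy, target_xy) <= tolerance


def move_request_2(instruction, gaze_xy, target_xy, next_response):
    """Move request process 2: repeat the instruction or move on."""
    if not gaze_on_target(gaze_xy, target_xy):   # S462: NO
        return instruction                       # S464: output the same request again
    return next_response                         # S466: generate another response
```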

このような処理が終了すると、移動要求処理２を終了する。
［第１４実施形態による効果］
上記音声応答システム１００においては、使用者の視線を検出し、呼びかけに対して所定の位置に使用者の視線が移動しない場合、視線を所定の位置に移動させるよう要求する音声を出力する。 When such a process is completed, the move request process 2 is terminated.
[Effect of the 14th embodiment]
The voice response system 100 detects the line of sight of the user, and if the line of sight of the user does not move to a predetermined position in response to the call, outputs a voice requesting the line of sight to move to a predetermined position.

このような音声応答システム１００によれば、使用者に特定の位置を見させることができる。よって、車両運転時の安全確認などを確実に行うことができる。
なお、上記音声応答システム１００においてサーバ９０は、体の部位の位置や顔の表情を観察し、呼びかけ対する変化が少ない場合、体の部位の位置や顔の表情を変化させるよう要求する音声を出力する。 According to such a voice response system 100, the user can be made to look at a specific position. Therefore, it is possible to surely confirm the safety when driving the vehicle.
In the voice response system 100, the server 90 observes the positions of body parts and the facial expression, and when there is little change in response to a call, outputs a voice requesting that the position of the body part or the facial expression be changed.

このような音声応答システム１００によれば、使用者の体の部位の位置を特定の位置に移動させたり、特定の表情をするよう誘導したりすることができる。本発明は、車両の運転時や身体検査等の際に利用することができる。 According to such a voice response system 100, it is possible to move the position of a part of the user's body to a specific position or to induce the user to make a specific facial expression. The present invention can be used when driving a vehicle, performing a physical examination, or the like.

［第１５実施形態］
［第１５実施形態の処理］
次に第１５実施形態の音声応答システムについて説明する。第１５実施形態の音声応答システムでは、使用者が音声として放送番組や楽曲を入力した場合において、放送番組や楽曲が途切れた場合に補完する処理を実施する。 [15th Embodiment]
[Processing of the fifteenth embodiment]
Next, the voice response system of the fifteenth embodiment will be described. In the voice response system of the fifteenth embodiment, when the user inputs a broadcast program or music as voice, a process for complementing the interruption of the broadcast program or music is performed.

この構成では、前述のＳ５６の詳細として、図３０に示す放送楽曲補完処理を実施する。放送楽曲補完処理では、図３０に示すように、まず、放送番組や楽曲（使用者が歌う場合にはその歌）が途切れたか否かを判定する（Ｓ４８２）。 In this configuration, as the details of the above-mentioned S56, the broadcast music complement processing shown in FIG. 30 is performed. In the broadcast music complement processing, as shown in FIG. 30, first, it is determined whether or not the broadcast program or music (if the user sings, the song) is interrupted (S482).

途切れていれば（Ｓ４８２：ＹＥＳ）、後述するＳ４９２の処理にて同期した放送番組や楽曲を応答内容として設定し（Ｓ４８４）、放送楽曲補完処理を終了する。また、途切れていなければ（Ｓ４８２：ＮＯ）、放送番組の視聴中であれば放送番組を取得し（Ｓ４８６）、楽曲の演奏中であれば該当する楽曲を取得する（Ｓ４８８）。 If it is interrupted (S482: YES), the broadcast program or music synchronized in the process of S492 described later is set as the response content (S484), and the broadcast music complement processing is terminated. If it is not interrupted (S482: NO), the broadcast program is acquired if the broadcast program is being watched (S486), and the corresponding music is acquired if the music is being played (S488).

ここで、カラオケＤＢ１１６には、楽曲と歌詞とが対応付けて記録されており、この処理において楽曲を取得する場合には、歌詞が付いた楽曲を取得する。
続いて、使用者が視聴する放送番組または楽曲を特定する（Ｓ４９０）。そして、この放送番組または楽曲を取得して、使用者が視聴する放送番組または楽曲に同期して再生できるよう準備し（Ｓ４９２）、放送楽曲補完処理を終了する。 Here, in the karaoke DB 116, the music and the lyrics are recorded in association with each other, and when the music is acquired in this process, the music with the lyrics is acquired.
Subsequently, the broadcast program or music to be viewed by the user is specified (S490). Then, the broadcast program or music is acquired and prepared so that it can be played back in synchronization with the broadcast program or music to be viewed by the user (S492), and the broadcast music complement processing is completed.

［第１５実施形態による効果］
上記音声応答システム１００においてサーバ９０は、使用者が視聴する放送番組と同様の放送番組を取得し、放送番組が途切れた場合に、自身が取得した放送番組を出力することで途切れた放送番組を補完する。 [Effect of the 15th embodiment]
In the voice response system 100, the server 90 acquires the same broadcast program as the one the user is watching, and when the broadcast program is interrupted, complements the interrupted broadcast program by outputting the broadcast program that the server itself acquired.

このような音声応答システム１００によれば、使用者が視聴する放送番組が途切れないように補うことができる。
また、上記音声応答システム１００においてサーバ９０は、歌詞無しの楽曲に使用者が歌詞を付して歌う場合において、歌詞ありの楽曲と使用者が付した歌詞とを比較し、使用者の歌詞のみがない部分において歌詞を音声で出力させる。 According to such a voice response system 100, it is possible to supplement the broadcast program that the user watches without interruption.
Further, in the voice response system 100, when the user sings lyrics to a piece of music without lyrics, the server 90 compares the music with lyrics against the lyrics sung by the user, and causes the lyrics to be output by voice only for the parts where the user's lyrics are missing.

このような音声応答システム１００によれば、いわゆるカラオケ装置を利用する使用者が歌えない部分（歌詞が途切れた部分）を補うことができる。
［第１６実施形態］
［第１６実施形態の処理］
次に第１６実施形態の音声応答システムについて説明する。第１６実施形態の音声応答システムでは、撮像画像中に文字が含まれる場合において、端末装置１において使用者からこの文字の読み方についての質問を受けると、この文字の情報を外部から取得し、この情報に含まれる文字の読み方を音声で出力させる。 According to such a voice response system 100, it is possible to supplement a part (a part where lyrics are interrupted) that a user who uses a so-called karaoke device cannot sing.
[16th Embodiment]
[Processing of the 16th embodiment]
Next, the voice response system of the 16th embodiment will be described. In the voice response system of the 16th embodiment, when characters are included in the captured image and the terminal device 1 receives a question from the user about how to read those characters, information on the characters is acquired from the outside, and the reading of the characters contained in this information is output by voice.

この構成では、前述のＳ５６の詳細として、図３１に示す文字解説処理を実施する。文字解説処理では、図３１に示すように、まず、例えば「読み方」のように読みの質問を受けたか否かを判定する（Ｓ５０２）。読みの質問を受けていれば（Ｓ５０２：ＹＥＳ）、画像認識した文字について、読みをインターネット網８５を介して接続された他のサーバ等から検索し（Ｓ５０４）、得られた読みを応答に設定し（Ｓ５０６）、文字解説処理を終了する。 In this configuration, as the details of S56 described above, the character commentary processing shown in FIG. 31 is performed. In the character commentary processing, as shown in FIG. 31, first, it is determined whether or not a reading question has been received, for example, "how to read" (S502). If the reading question is received (S502: YES), the reading is searched from another server or the like connected via the Internet network 85 for the image-recognized character (S504), and the obtained reading is set as the response. (S506), and the character commentary processing is terminated.

読みの質問でなければ（Ｓ５０２：ＮＯ）、国語辞典に記載された内容のような「言葉の意味」の質問を受けたか否かを判定する（Ｓ５０８）。意味の質問を受けていれば、画像認識した文字（言葉）について、意味をインターネット網８５を介して接続された他のサーバ等から検索し（Ｓ５１０）、得られた意味を応答に設定し（Ｓ５１２）、文字解説処理を終了する。 If it is not a reading question (S502: NO), it is determined whether or not a question about the "meaning of a word," such as the content found in a Japanese dictionary, has been received (S508). If a meaning question has been received, the meaning of the image-recognized characters (word) is searched for from another server or the like connected via the Internet network 85 (S510), the obtained meaning is set as the response (S512), and the character commentary processing is terminated.
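
The dispatch between reading questions and meaning questions (S502 to S512) can be sketched as below. The keyword lists and the two lookup callables are hypothetical: the description only says that the reading or meaning is searched for on another server, so the lookups are passed in as parameters rather than tied to any particular service.

```python
# Hypothetical trigger phrases for the two kinds of question.
READING_KEYWORDS = ("読み方", "how to read")
MEANING_KEYWORDS = ("意味", "meaning")


def character_commentary(question, recognized_text, lookup_reading, lookup_meaning):
    """Return the response for a question about image-recognized text.

    `lookup_reading` and `lookup_meaning` stand in for searches on an
    external server (S504, S510); they take the text and return a string.
    """
    q = question.lower()
    if any(k in q for k in READING_KEYWORDS):      # S502: YES
        return lookup_reading(recognized_text)     # S504, S506
    if any(k in q for k in MEANING_KEYWORDS):      # S508: YES
        return lookup_meaning(recognized_text)     # S510, S512
    return None                                    # neither kind of question
```

For a quick trial, the two lookups can be backed by small dictionaries, for example `{"薔薇": "bara"}.get` for readings.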

［第１６実施形態による効果］
このような音声応答システム１００によれば、画像認識した文字について、読みを他のサーバ等から検索し、得られた読みを応答に設定するので、文字の読み方や言葉の意味等を使用者に教えることができる。 [Effect of the 16th embodiment]
According to such a voice response system 100, the reading of an image-recognized character is searched for on another server or the like, and the obtained reading is set as the response. The user can therefore be taught how to read characters, the meanings of words, and so on.

［第１７実施形態］
［第１７実施形態の処理］
次に第１７実施形態の音声応答システムについて説明する。第１７実施形態の音声応答システムでは、端末装置１によって検出されたセンサ値に基づいて、サーバ９０が端末装置１の使用者の異常行動や結構状態を検出し、異常がある場合に通報を行う処理を実施する。 [17th Embodiment]
[Processing of the 17th embodiment]
Next, the voice response system of the 17th embodiment will be described. In the voice response system of the 17th embodiment, the server 90 detects abnormal behavior or the health state of the user of the terminal device 1 based on the sensor values detected by the terminal device 1, and performs processing to issue a report when there is an abnormality.

詳細には、端末装置１においては図３２に示す行動応答端末処理を実施し、サーバ９０においては行動応答サーバ処理を実施する。行動応答端末処理においては、図３２に示すように、まず、端末装置１に搭載された各種センサによる出力を取得するとともに（Ｓ５２２）、カメラ４１による撮像画像を取得する（Ｓ５２４）。そして、取得した各種センサによる出力および撮像画像をサーバ９０に対してパケット送信し（Ｓ５２６）、行動応答端末処理を終了する。 Specifically, the terminal device 1 implements the action response terminal process shown in FIG. 32, and the server 90 implements the action response server process. In the action response terminal processing, as shown in FIG. 32, first, the output from various sensors mounted on the terminal device 1 is acquired (S522), and the image captured by the camera 41 is acquired (S524). Then, the output and the captured image by the acquired various sensors are packet-transmitted to the server 90 (S526), and the action response terminal processing is terminated.

次に、行動応答サーバ処理では、図３３に示すように、まず、前述のＳ４２〜Ｓ４４の処理を実施する。続いて、端末装置１の位置情報（ＧＰＳ受信機２７による検出結果）に基づいて、徘徊等の行動を特定し（Ｓ５３２）、温度センサ１５，１９等による検出結果に基づいて使用者の環境を検出する（Ｓ５３４）。そして、異常を検出する（Ｓ５３６）。 Next, in the action response server processing, as shown in FIG. 33, the above-mentioned processes S42 to S44 are performed first. Subsequently, behavior such as wandering is identified based on the position information of the terminal device 1 (the detection result of the GPS receiver 27) (S532), and the user's environment is detected based on the detection results of the temperature sensors 15, 19 and the like (S534). Then, an abnormality is detected (S536).

この処理では、位置情報の変化と環境とに基づいて異常を検出する。例えば、使用者が高温や低温の場所で動かない場合や、使用者が普段行かない場所に存在する場合に、異常である旨を検出する（Ｓ５３６）。或いは、位置情報や環境を点数化し、この点数が基準値を下回る場合（基準範囲外である場合）に異常であると判断する。 In this process, anomalies are detected based on changes in location information and the environment. For example, when the user does not move in a place of high temperature or low temperature, or when the user exists in a place where the user does not usually go, the abnormality is detected (S536). Alternatively, the location information and the environment are scored, and if this score is below the reference value (outside the reference range), it is judged to be abnormal.
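
The two detection approaches described for S536 (rule-based and score-based) can be sketched as follows. This is a minimal sketch under stated assumptions: the `Observation` fields, the temperature and distance thresholds, the penalty weights, and the reference value are all hypothetical, because the specification does not give concrete values.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    moved_m: float        # distance moved during the observation window (from position information)
    temperature_c: float  # ambient temperature from the temperature sensors
    in_usual_area: bool   # whether the user is in a place they usually go


# Hypothetical thresholds; the specification does not state concrete values.
HOT_C, COLD_C, STILL_M = 35.0, 5.0, 10.0


def is_abnormal_by_rule(obs: Observation) -> bool:
    """Rule-based variant: not moving in a hot or cold place, or in an unusual place."""
    stationary = obs.moved_m < STILL_M
    extreme = obs.temperature_c >= HOT_C or obs.temperature_c <= COLD_C
    return (stationary and extreme) or not obs.in_usual_area


def abnormality_score(obs: Observation) -> int:
    """Score-based variant: start from 100 and subtract hypothetical penalties."""
    score = 100
    if obs.temperature_c >= HOT_C or obs.temperature_c <= COLD_C:
        score -= 30
    if obs.moved_m < STILL_M:
        score -= 30
    if not obs.in_usual_area:
        score -= 60
    return score


def is_abnormal_by_score(obs: Observation, reference: int = 50) -> bool:
    """Abnormal when the score falls below the reference value."""
    return abnormality_score(obs) < reference


print(is_abnormal_by_rule(Observation(moved_m=0.0, temperature_c=40.0, in_usual_area=True)))
```

The score-based variant makes it easy to add further sensor inputs later, at the cost of having to tune weights and the reference value.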

続いて、異常が検出されたか否かを判定する（Ｓ５３８）。異常が検出されていなければ（Ｓ５３８：ＮＯ）、行動応答サーバ処理を終了する。また、異常が検出されていれば（Ｓ５３８：ＹＥＳ）、異常がある旨のメッセージを生成し（Ｓ５４０）、所定の連絡先に通報する（Ｓ５４２）。そして、前述のＳ６２〜Ｓ６８（Ｓ６６を除く）の処理を実施し、行動応答サーバ処理を終了する。 Subsequently, it is determined whether or not an abnormality has been detected (S538). If no abnormality is detected (S538: NO), the action response server processing is terminated. If an abnormality is detected (S538: YES), a message indicating that there is an abnormality is generated (S540), and a predetermined contact is notified (S542). Then, the above-mentioned processes S62 to S68 (excluding S66) are executed, and the action response server process is terminated.

［第１７実施形態による効果］
上記音声応答システム１００においてサーバ９０は、使用者の行動や使用者の周囲環境を検出し、検出された行動や周囲環境に応じてメッセージを生成する。 [Effect of the 17th embodiment]
In the voice response system 100, the server 90 detects the user's behavior and the user's surrounding environment, and generates a message according to the detected behavior and the surrounding environment.

このような音声応答システム１００によれば、危険な場所や立ち入り禁止の領域などを報知することができる。また、使用者に異常な行動があることなどを検出することができる。 According to such a voice response system 100, it is possible to notify a dangerous place, an exclusion zone, and the like. In addition, it is possible to detect that the user has abnormal behavior.

さらに、上記音声応答システム１００においてサーバ９０は、使用者を撮像した撮像画像に基づいて、健康状態を判定し、この健康状態に応じてメッセージを生成する。
このような音声応答システム１００によれば、使用者の健康状態を管理することができる。 Further, in the voice response system 100, the server 90 determines the user's health state based on a captured image of the user, and generates a message according to that health state.
According to such a voice response system 100, the health condition of the user can be managed.

また、上記音声応答システム１００においてサーバ９０は、健康状態が基準値を下回る場合に、所定の連絡先に通報を行う。
このような音声応答システム１００によれば、使用者の健康状態が基準値以下の場合に、通報を行うことができる。よってより早期に異常を他者に報知することができる。 Further, in the voice response system 100, the server 90 notifies a predetermined contact when the health condition is lower than the reference value.
According to such a voice response system 100, when the health condition of the user is equal to or less than the reference value, the notification can be made. Therefore, the abnormality can be notified to others at an earlier stage.

［その他の実施形態］
本発明の実施の形態は、上記の実施形態に何ら限定されることはなく、本発明の技術的範囲に属する限り種々の形態を採りうる。 [Other embodiments]
The embodiments of the present invention are not limited to the above embodiments, and various embodiments may be adopted as long as they belong to the technical scope of the present invention.

例えば、音声応答システム１００が二者間および多者間でのやり取りを仲介するようにしてもよい。詳細には、交差点等で道を譲り合う必要がある場合、どちらの車両が先に交差点に進入するかを端末装置１同士が交渉するようにしてもよい。この場合、各端末装置１はサーバ９０に対して交差点へ接近する際の移動方位や交差点への接近速度の情報を送信し、サーバ９０は、移動方位や接近速度に応じて各端末装置１に優先順位を設定し、優先順位に応じて「とまれ」や「進入可」などの音声を生成して出力すればよい。 For example, the voice response system 100 may mediate exchanges between two parties or among multiple parties. Specifically, when vehicles need to give way to each other at an intersection or the like, the terminal devices 1 may negotiate which vehicle enters the intersection first. In this case, each terminal device 1 transmits to the server 90 information on its direction of travel when approaching the intersection and its approach speed, and the server 90 sets a priority for each terminal device 1 according to the direction of travel and the approach speed, and generates and outputs a voice message such as "Stop" or "You may enter" according to the priority.
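
The server-side priority assignment can be sketched as follows. The specification says only that priority depends on the direction of travel and the approach speed; the concrete rule used here (the faster-approaching vehicle goes first, with the heading and terminal ID as deterministic tie-breakers) is purely an assumption for illustration, as are the `Approach` type and the function name. The two messages correspond to 「とまれ」 and 「進入可」 in the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Approach:
    terminal_id: str    # hypothetical identifier of a terminal device 1
    heading_deg: float  # direction of travel when approaching the intersection
    speed_kmh: float    # approach speed toward the intersection


def assign_messages(approaches: List[Approach]) -> Dict[str, str]:
    """Rank the terminals and return the voice message for each one."""
    # Assumed rule: higher approach speed first; heading and ID break ties.
    ranked = sorted(
        approaches,
        key=lambda a: (-a.speed_kmh, a.heading_deg, a.terminal_id),
    )
    return {
        a.terminal_id: ("You may enter" if rank == 0 else "Stop")
        for rank, a in enumerate(ranked)
    }


print(assign_messages([Approach("A", 0.0, 20.0), Approach("B", 90.0, 35.0)]))
```

A real system would need a rule that reflects traffic regulations and road geometry; the sort key is the only place that would change.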

また、音声通信等、リアルタイムに応答を行う必要がある通信の着呼（着信）を端末装置１が受け付ける際には、使用者の都合のよいときだけ着呼を受け付けるようにしてもよい。具体的には、カメラ４１にて使用者の顔を撮像できたときに使用者の都合のよいときとみなして着呼を受け付けるようにしてもよい。 Further, when the terminal device 1 accepts an incoming call (incoming call) for a communication that needs to be answered in real time, such as voice communication, the incoming call may be accepted only when it is convenient for the user. Specifically, when the user's face can be imaged by the camera 41, it may be regarded as a time convenient for the user and the incoming call may be accepted.

さらに、音声通信等の際に、相手を呼び出しても応答しないと、不愉快になる人がいる。このような感情を抑制するために、相手からの応答を待っている利用者に対し、相手の状況を伝えるようにしてもよい。例えば、端末装置１において使用者のスケジュールを管理しておき、使用者が着呼に対して応答しない場合、使用者が何をしているか、或いは、使用者のスケジュールの空き時間を検索し、使用者がいつ応答できるか伝えるようにすることが考えられる。 Furthermore, some people become annoyed when the party they call does not answer during voice communication or the like. To ease such feelings, the situation of the called party may be conveyed to the user who is waiting for a response. For example, the user's schedule may be managed in the terminal device 1, and when the user does not answer an incoming call, what the user is doing or the free time in the user's schedule may be looked up so that the caller can be told when the user will be able to respond.

また、使用者が着呼に対して応答しない場合、使用者の場所を呼出元に伝えてもよい。例えば、使用者がスマートフォンやパソコンを介してインターネット等に繋いでいれば、どの端末が操作されているかがわかる。この情報から使用者の場所を特定して呼出元に伝えることが考えられる。 Further, if the user does not answer the incoming call, the location of the user may be notified to the caller. For example, if the user is connected to the Internet or the like via a smartphone or a personal computer, it is possible to know which terminal is being operated. It is conceivable to identify the location of the user from this information and inform the caller.

さらに、使用者が着呼に対して応答できるか否かを、ＧＰＳ等を用いた位置情報を利用して判断するようにしてもよい。位置情報に基づけば、車に乗っているか否か、自宅に居るか否か等を判断でき、例えば使用者が移動中である場合やベッド上にいる場合であれば、公共性が高い或いは睡眠中と判断して着呼に応答できないと判断すればよい。このように着呼に応答できない場合には、前述のように使用者が何をしているか等を呼出元に伝えることが考えられる。 Further, whether or not the user can answer an incoming call may be determined using position information obtained by GPS or the like. Based on the position information, it is possible to determine whether or not the user is in a car, whether or not the user is at home, and so on. For example, if the user is on the move or in bed, it may be judged that the user is in a highly public setting or asleep and therefore cannot answer the incoming call. When the incoming call cannot be answered in this way, the caller may be told what the user is doing, as described above.
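
The availability judgment described above can be sketched as follows. This is a hedged sketch: the `Place` categories, the function names, and the wording of the message to the caller are hypothetical, and the step that derives the place and movement state from position information is assumed to exist elsewhere.

```python
from enum import Enum
from typing import Optional


class Place(Enum):
    HOME = "home"
    BED = "bed"
    VEHICLE = "vehicle"
    OTHER = "other"


def can_answer(place: Place, is_moving: bool) -> bool:
    """The user is judged unable to answer while moving or while in bed."""
    return not is_moving and place is not Place.BED


def notice_for_caller(place: Place, is_moving: bool, activity: str) -> Optional[str]:
    """Return a message for the caller, or None when the call can be accepted."""
    if can_answer(place, is_moving):
        return None
    # Tell the caller what the user is doing when the call cannot be answered.
    return f"The user cannot answer right now: {activity}."


print(notice_for_caller(Place.VEHICLE, True, "driving"))
```

Keeping `can_answer` separate from the message generation lets other signals mentioned in the text, such as a fixed-line telephone being in use, be added to the judgment without touching the caller-facing wording.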

また、位置情報を取得するためには、防犯カメラを利用する構成も考えられる。近年では、様々な場所に防犯カメラが取り付けられているので、これらの防犯カメラを利用して、顔認証等の本人を特定するための構成を利用して、使用者の位置を認識することができる。また、防犯カメラを利用して使用者が何をしているか（電話に出られる状況か否か）といった状況判断を行ってもよい。また、着呼に応答できるか否かについては、別の固定電話を使っているかといった条件（固定電話の使用中には着呼に応答できない）でも判断できる。 Further, a configuration that uses security cameras to acquire the position information is also conceivable. In recent years, security cameras have been installed in many places, so the position of the user can be recognized by using these cameras together with a mechanism for identifying the person, such as face recognition. The security cameras may also be used to judge the user's situation, such as what the user is doing (whether or not the user is able to answer the phone). Whether or not an incoming call can be answered can also be judged from conditions such as whether the user is using another fixed-line telephone (an incoming call cannot be answered while the fixed-line telephone is in use).

さらに、端末装置１の使用者が誰かと会話をしたい場合、使用者の性格学習結果を利用し、不特定多数の内、利用者同士の相性が良いと推定される端末装置を呼び出すようにしてもよい。また、このような場合、盛り上がりそうな話題（双方の使用者が興味のある話題（学習結果を利用して抽出されるもの））を使用者に対して話しかけるようにしてもよい。 Furthermore, when the user of the terminal device 1 wants to have a conversation with someone, the user's personality learning result may be used to call, from among an unspecified number of terminal devices, one whose user is presumed to be compatible. In such a case, the device may also bring up with the user a topic that is likely to liven up the conversation (a topic of interest to both users, extracted using the learning results).

また、音声応答装置の利用が長時間ない場合（基準時間以上、使用者が発話していないとき）に、音声応答装置が使用者に何らかの言葉を掛けるようにしてもよい。
この際に、ＧＰＳ等の位置情報を利用して話しかける言葉を選択してもよい。 Further, when the voice response device is not used for a long time (when the user has not spoken for a reference time or more), the voice response device may give some words to the user.
At this time, the words to be spoken may be selected by using position information such as GPS.

［特許請求の範囲または課題を解決するための手段に記載（本発明）の各手段と、実施形態における構成との関係］
上記実施形態における端末装置１、サーバ９０は本発明の音声応答装置の一例に相当する。また、上記実施形態におけるＳ２２、Ｓ５６の処理は本発明の応答取得手段の一例に相当する。 [Relationship between each means described in the claims or in the means for solving the problem (the present invention) and the configurations in the embodiments]
The terminal device 1 and the server 90 in the above embodiment correspond to an example of the voice response device of the present invention. Further, the processing of S22 and S56 in the above embodiment corresponds to an example of the response acquisition means of the present invention.

さらに、上記実施形態におけるＳ２８、Ｓ６０、Ｓ６４の処理は本発明の音声出力手段の一例に相当する。また、上記実施形態におけるＳ２、Ｓ６の処理は本発明の音声入力手段の一例に相当する。 Further, the processing of S28, S60, and S64 in the above embodiment corresponds to an example of the audio output means of the present invention. Further, the processing of S2 and S6 in the above embodiment corresponds to an example of the voice input means of the present invention.

さらに、上記実施形態におけるＳ１４の処理は本発明の音声送信手段の一例に相当する。また、上記実施形態における応答候補ＤＢ１０５は本発明の応答記録手段の一例に相当する。 Further, the processing of S14 in the above embodiment corresponds to an example of the voice transmission means of the present invention. Further, the response candidate DB 105 in the above embodiment corresponds to an example of the response recording means of the present invention.

さらに、上記実施形態におけるＳ５６の処理は本発明の性格情報取得手段の一例に相当する。また、上記実施形態におけるＳ２２、Ｓ５６の処理は本発明の応答取得手段の一例に相当する。 Further, the processing of S56 in the above embodiment corresponds to an example of the personality information acquisition means of the present invention. Further, the processing of S22 and S56 in the above embodiment corresponds to an example of the response acquisition means of the present invention.

さらに、上記実施形態におけるＳ２８、Ｓ６０、Ｓ６４の処理は本発明の音声出力手段の一例に相当する。また、上記実施形態におけるＳ２５４、Ｓ２５８、Ｓ２６０の処理は本発明の第１性格情報生成手段、第２性格情報生成手段の一例に相当する。また、上記実施形態におけるＳ５６の処理は本発明の性格情報取得手段の一例に相当する。 Further, the processing of S28, S60, and S64 in the above embodiment corresponds to an example of the audio output means of the present invention. Further, the processing of S254, S258, and S260 in the above embodiment corresponds to an example of the first personality information generating means and the second personality information generating means of the present invention. Further, the processing of S56 in the above embodiment corresponds to an example of the personality information acquisition means of the present invention.

さらに、上記実施形態におけるＳ２２、Ｓ５６の処理は本発明の応答取得手段に相当する。また、上記実施形態におけるＳ２８、Ｓ６０、Ｓ６４の処理は本発明の音声出力手段の一例に相当する。 Further, the processing of S22 and S56 in the above embodiment corresponds to the response acquisition means of the present invention. Further, the processing of S28, S60, and S64 in the above embodiment corresponds to an example of the audio output means of the present invention.

さらに、上記実施形態におけるＳ２５４、Ｓ２５８、Ｓ２６０の処理は本発明の第１性格情報生成手段、第２性格情報生成手段の一例に相当する。
さらに、上記実施形態におけるＳ４８、Ｓ５６の処理は本発明の応答生成手段の一例に相当する。また、上記実施形態におけるＳ２８、Ｓ６０、Ｓ６４の処理は本発明の音声出力手段の一例に相当する。 Further, the processing of S254, S258, and S260 in the above embodiment corresponds to an example of the first personality information generating means and the second personality information generating means of the present invention.
Further, the processing of S48 and S56 in the above embodiment corresponds to an example of the response generating means of the present invention. Further, the processing of S28, S60, and S64 in the above embodiment corresponds to an example of the audio output means of the present invention.

さらに、上記実施形態における変形例：Ｓ４８の処理は本発明の音声入力動画取得手段の一例に相当する。また、上記実施形態におけるＳ５２の処理は本発明の文字情報変換手段の一例に相当する。 Further, the processing of S48 in the modification of the above embodiment corresponds to an example of the voice input moving image acquisition means of the present invention. Further, the processing of S52 in the above embodiment corresponds to an example of the character information conversion means of the present invention.

さらに、上記実施形態における嗜好情報生成処理は本発明の嗜好情報生成手段の一例に相当する。また、上記実施形態におけるＳ５６の処理は本発明の応答候補取得手段の一例に相当する。 Further, the preference information generation process in the above embodiment corresponds to an example of the preference information generation means of the present invention. Further, the processing of S56 in the above embodiment corresponds to an example of the response candidate acquisition means of the present invention.

さらに、上記実施形態における動作文字入力処理は本発明の文字情報生成手段の一例に相当する。また、他装置情報取得手段上記実施形態における他端末利用処理は本発明の転送手段の一例に相当する。 Further, the operation character input process in the above embodiment corresponds to an example of the character information generation means of the present invention. Further, the other device information acquisition means The processing using another terminal in the above embodiment corresponds to an example of the transfer means of the present invention.

さらに、上記実施形態におけるＳ９８の処理は本発明の再生条件判定手段の一例に相当する。また、上記実施形態におけるＳ１００の処理は本発明のメッセージ再生手段の一例に相当する。 Further, the processing of S98 in the above embodiment corresponds to an example of the reproduction condition determining means of the present invention. Further, the processing of S100 in the above embodiment corresponds to an example of the message reproducing means of the present invention.

さらに、上記実施形態におけるＳ１１６の処理は本発明の未回答時送信手段の一例に相当する。また、上記実施形態におけるＳ３７２の処理は本発明の発話正確度検出手段の一例に相当する。 Further, the processing of S116 in the above embodiment corresponds to an example of the unanswered transmission means of the present invention. Further, the processing of S372 in the above embodiment corresponds to an example of the utterance accuracy detecting means of the present invention.

さらに、上記実施形態におけるＳ３７４の処理は本発明の正確度合出力手段の一例に相当する。また、上記実施形態におけるＳ２０４の処理は本発明の接続制御手段の一例に相当する。 Further, the processing of S374 in the above embodiment corresponds to an example of the accuracy degree output means of the present invention. Further, the processing of S204 in the above embodiment corresponds to an example of the connection control means of the present invention.

さらに、上記実施形態におけるＳ５０の処理は本発明の感情判定手段の一例に相当する。また、上記実施形態におけるＳ４３８の処理は本発明の経路情報取得手段の一例に相当する。 Further, the processing of S50 in the above embodiment corresponds to an example of the emotion determination means of the present invention. Further, the processing of S438 in the above embodiment corresponds to an example of the route information acquisition means of the present invention.

さらに、上記実施形態におけるＳ４６２の処理は本発明の視線検出手段の一例に相当する。また、上記実施形態におけるＳ４６４の処理は本発明の視線移動要求送信手段の一例に相当する。 Further, the processing of S462 in the above embodiment corresponds to an example of the line-of-sight detecting means of the present invention. Further, the processing of S464 in the above embodiment corresponds to an example of the line-of-sight movement request transmitting means of the present invention.

さらに、上記実施形態におけるＳ４６４の処理は本発明の変化要求送信手段の一例に相当する。また、上記実施形態におけるＳ４８６の処理は本発明の放送番組取得手段の一例に相当する。 Further, the processing of S464 in the above embodiment corresponds to an example of the change request transmitting means of the present invention. Further, the processing of S486 in the above embodiment corresponds to an example of the broadcast program acquisition means of the present invention.

さらに、上記実施形態におけるＳ４８４の処理は本発明の放送番組補完手段、歌詞付加手段の一例に相当する。また、上記実施形態におけるＳ５０４、Ｓ５０６の処理は本発明の読み方出力手段の一例に相当する。また、上記実施形態におけるＳ５２２、Ｓ５２４の処理は本発明の行動環境検出手段の一例に相当する。 Further, the processing of S484 in the above embodiment corresponds to an example of the broadcast program complementing means and the lyrics adding means of the present invention. Further, the processing of S504 and S506 in the above embodiment corresponds to an example of the reading output means of the present invention. Further, the processing of S522 and S524 in the above embodiment corresponds to an example of the behavioral environment detecting means of the present invention.

さらに、上記実施形態におけるＳ５３８の処理は本発明の健康状態判定手段の一例に相当する。また、上記実施形態におけるＳ５４０の処理は本発明の健康メッセージ生成手段の一例に相当する。 Further, the processing of S538 in the above embodiment corresponds to an example of the health condition determining means of the present invention. Further, the processing of S540 in the above embodiment corresponds to an example of the health message generation means of the present invention.

さらに、上記実施形態におけるＳ５４２の処理は本発明の通報手段の一例に相当する。 Further, the processing of S542 in the above embodiment corresponds to an example of the reporting means of the present invention.

１…端末装置、１０…行動センサユニット、１１…次元加速度センサ、１３…軸ジャイロセンサ、１５…温度センサ、１７…湿度センサ、１９…温度センサ、２１…湿度センサ、２３…照度センサ、２５…濡れセンサ、２７…ＧＰＳ受信機、２９…風速センサ、３３…心電センサ、３５…心音センサ、３７…マイク、３９…メモリ、４１…カメラ、５０…通信部、５３…無線電話ユニット、５５…連絡先メモリ、６０…報知部、６１…ディスプレイ、６３…電飾、６５…スピーカ、７０…操作部、７１…タッチパッド、７３…確認ボタン、７５…指紋センサ、７７…救援依頼レバー、８０…通信基地局、８５…インターネット網、９０…サーバ、１００…音声応答システム、１０１…演算部、１０２…音声認識ＤＢ、１０３…予測変換ＤＢ、１０４…音声ＤＢ、１０５…応答候補ＤＢ、１０６…性格ＤＢ、１０７…学習ＤＢ、１０８…嗜好ＤＢ、１０９…ニュースＤＢ、１１０…天気ＤＢ、１１１…再生条件ＤＢ、１１２…手書き文字・手話ＤＢ、１１３…端末情報ＤＢ、１１４…感情判定ＤＢ、１１５…健康判定ＤＢ、１１６…カラオケＤＢ、１１７…通報先ＤＢ、１１８…セールスＤＢ、１１９…クライアントＤＢ。 1 ... Terminal device, 10 ... Behavior sensor unit, 11 ... Dimensional acceleration sensor, 13 ... Axis gyro sensor, 15 ... Temperature sensor, 17 ... Humidity sensor, 19 ... Temperature sensor, 21 ... Humidity sensor, 23 ... Illuminance sensor, 25 ... Wetness sensor, 27 ... GPS receiver, 29 ... Wind speed sensor, 33 ... Electrocardiographic sensor, 35 ... Heart sound sensor, 37 ... Microphone, 39 ... Memory, 41 ... Camera, 50 ... Communication unit, 53 ... Wireless telephone unit, 55 ... Contact memory, 60 ... Notification unit, 61 ... Display, 63 ... Illumination, 65 ... Speaker, 70 ... Operation unit, 71 ... Touch pad, 73 ... Confirmation button, 75 ... Fingerprint sensor, 77 ... Rescue request lever, 80 ... Communication base station, 85 ... Internet network, 90 ... Server, 100 ... Voice response system, 101 ... Calculation unit, 102 ... Voice recognition DB, 103 ... Predictive conversion DB, 104 ... Voice DB, 105 ... Response candidate DB, 106 ... Personality DB, 107 ... Learning DB, 108 ... Preference DB, 109 ... News DB, 110 ... Weather DB, 111 ... Playback condition DB, 112 ... Handwritten character / sign language DB, 113 ... Terminal information DB, 114 ... Emotion judgment DB, 115 ... Health judgment DB, 116 ... Karaoke DB, 117 ... Report destination DB, 118 ... Sales DB, 119 ... Client DB.

Claims

A voice response device comprising:
a voice response unit configured to respond by voice to input voice;
a voice setting unit configured to set automatic conversation by voice;
a determination unit configured to determine whether or not the setting for automatic conversation is turned on;
a transmission unit configured to transmit to a server, when the determination unit determines that the automatic conversation mode is ON, the fact that the automatic conversation mode has been set, together with an ID identifying the device itself;
a reproduction determination unit configured to determine whether or not a preset reproduction condition is satisfied; and
a generation unit configured to generate, when the reproduction condition is satisfied, a question-type message that corresponds to the reproduction condition and requests an answer from the user.

The voice response device according to claim 1, further comprising:
an end unit configured to end the automatic conversation when the determination unit determines that the automatic conversation is OFF.

The voice response device according to claim 1 or 2, further comprising:
a personality recognition unit configured to recognize the personality of the user of the voice response device based on the input voice;
a distribution unit configured to assign the recognized personality to one of a set of personality categories prepared in advance; and
a recording unit configured to record, in a database, the user in association with the personality category to which the recognized personality has been assigned,
wherein the voice response unit is configured to select a response according to the recognized personality.

The voice response device according to any one of claims 1 to 3, further comprising:
a microphone and a speaker,
wherein the voice response unit is configured to acquire from a server a voice response to voice input through the microphone, and to output the response from the speaker.

The voice response device according to any one of claims 1 to 4,
wherein the voice response device is configured to output an indication that an error has occurred when the response is not obtained from the server within an allowable time.