JP2015064446A - Communication device, communication method and program

Info

Publication number: JP2015064446A
Application number: JP2013197366A
Authority: JP
Inventors: Yusuke Takagi; Katsutoshi Ishikura; Hirokazu Kobayashi; Fumiyo Sato; Hiroyuki Saga; Shigeto Suzuki
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2013-09-24
Filing date: 2013-09-24
Publication date: 2015-04-09

Abstract

PROBLEM TO BE SOLVED: To speed up responses by reducing the time and effort of typing text in communication in which a conversation is carried out by voice and text. SOLUTION: A communication device 10 includes: a receiving unit 106 that receives input from a user; an obtaining unit 1131 that obtains voice information; a converting unit 1132 that converts the voice information obtained by the obtaining unit 1131 into semantic information; a display unit 105 that displays information based on the semantic information converted by the converting unit 1132; a generating unit 112 that extracts an interrogative sentence from the semantic information converted by the converting unit 1132, generates options for the user to select on the basis of the extracted interrogative sentence and displays them on the display unit 105, and generates information related to a response to the semantic information on the basis of the option that the user selected from the displayed options and that the receiving unit 106 received; and a transmitting unit that transmits the information related to the response generated by the generating unit 112 to another device.

Description

The present invention relates to a communication device, a communication method, and a program.

At present, many mobile terminal devices provide functions such as e-mail and Internet access in addition to telephone calls. In the future, voice call services over data packets, such as VoLTE (Voice over LTE (Long Term Evolution)), will be launched. Currently, when people converse using mobile terminal devices, in a voice call the caller who places the call and the callee who receives it converse by voice. A mobile terminal device also makes it possible to converse by typing text, using e-mail, chat, or the like, instead of a voice call.

If the callee receives an incoming call from a caller while traveling on public transport such as a train, voice calls should be avoided as a matter of public etiquette, so the callee typically puts the call on hold without talking. When the caller learns that the callee cannot take a voice call, the caller can end the call and converse with the callee by typing text via e-mail, chat, or the like. However, with e-mail, chat, and the like, both the caller and the callee have to go to the trouble of typing text.

As a conversation method adapted to the other party's situation, there is a technique that enables communication between a hearing person and a person with a hearing impairment by making voice and text mutually convertible (see, for example, Patent Document 1). In the technique described in Patent Document 1, speech spoken by the hearing person is converted into text and displayed on the display of the mobile terminal device of the person with a hearing impairment, thereby realizing a conversation between the two.

Patent Document 1: JP 2003-188948 A

However, with the technique described in Patent Document 1, the person with a hearing impairment must reply by typing out every sentence in full, just as with e-mail or chat, so the response to the hearing person is inevitably slow. Likewise, in a call between two hearing people, if the callee is in an environment where a voice call is not possible, the callee must still reply by typing text even when the caller's speech is converted into text and displayed, so the response to the caller is slow.

The present invention has been made in view of the above points, and its object is to provide a communication device, a communication method, and a program that can speed up responses by reducing the effort of typing text in communication in which a conversation is carried out by voice and text.

The present invention has been made to solve the above problem. One aspect of the present invention is a communication device including: a reception unit that receives input from a user; an acquisition unit that acquires voice information; a conversion unit that converts the voice information acquired by the acquisition unit into semantic information; a display unit that displays information based on the semantic information converted by the conversion unit; a generation unit that extracts an interrogative sentence from the semantic information converted by the conversion unit, generates options for the user to select on the basis of the extracted interrogative sentence, causes the display unit to display the options, and generates information related to a response to the semantic information on the basis of the option that the user selected from the displayed options and that the reception unit received; and a transmission unit that transmits the information related to the response generated by the generation unit to another device.

According to the present invention, in communication in which a conversation is carried out by voice and text, the effort of typing text can be reduced and responses can be made faster.

FIG. 1 is a front view showing the external configuration of a communication device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the communication device according to the embodiment.
FIG. 3 is a sequence diagram showing call processing in the communication device according to the embodiment.
FIG. 4 is an image diagram showing an example of the options displayed by the communication device according to the embodiment.
FIG. 5 is a sequence diagram showing call processing in the communication device according to the embodiment.
FIG. 6 is an image diagram showing an example of the character input screen displayed by the communication device according to the embodiment.
FIG. 7 is a flowchart showing the procedure of call control processing in the communication device according to the embodiment.

An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 is a front view showing the external configuration of the communication device 10 according to the present embodiment. The communication device 10 is an electronic device such as a mobile phone, a smartphone, or a tablet terminal. The communication device 10 has, for example, a function for voice calls over data communication such as VoLTE. The communication device 10 includes a display unit 105 and a reception unit 106. The display unit 105 is an LCD (Liquid Crystal Display) or the like and displays information. The reception unit 106 consists of buttons (keys), a touch panel that detects contact with the screen of the display unit 105, and the like, and receives operation input from the user.

Next, the configuration of the communication device 10 according to the present embodiment is described. FIG. 2 is a block diagram showing the configuration of the communication device 10 according to the present embodiment. The communication device 10 includes a control unit 101, a transmission/reception unit 102, a voice input unit 103, a voice output unit 104, a display unit 105, and a reception unit 106.

The transmission/reception unit 102 is a communication unit that communicates wirelessly with other devices. For example, the transmission/reception unit 102 receives voice information from another communication device 10. The voice information is voice data relating to a call. The transmission/reception unit 102 also transmits information related to a response to semantic information to the other communication device 10. The semantic information is text data obtained by converting the speech represented by the voice information into text (characters). The voice input unit 103 is a microphone or the like and takes in speech. The voice output unit 104 is a speaker or the like and outputs speech.

The control unit 101 performs overall control of each unit of the communication device 10. The control unit 101 includes a generation unit 112, a speech-to-semantic conversion unit 113, and a semantic-to-speech conversion unit 114. The speech-to-semantic conversion unit 113 in turn includes an acquisition unit 1131 and a conversion unit 1132. The acquisition unit 1131 receives voice information from another communication device 10 via the transmission/reception unit 102. The conversion unit 1132 then converts the voice information received by the acquisition unit 1131 into semantic information. For example, the conversion unit 1132 converts the voice information into semantic information by a speech recognition technique that uses a general sensibility control technique. The conversion unit 1132 also converts speech taken in by the voice input unit 103 into semantic information. The conversion unit 1132 displays the converted semantic information on the display unit 105. The acquisition unit 1131 and the conversion unit 1132 are examples of the "acquisition unit" and the "conversion unit" in the claims, respectively.

The generation unit 112 generates information related to a response to the semantic information on the basis of the user input received by the reception unit 106 and the semantic information converted by the conversion unit 1132 of the speech-to-semantic conversion unit 113. Specifically, the generation unit 112 first extracts an interrogative sentence from the semantic information. For example, the generation unit 112 judges a sentence containing a word such as 「ですか？」 (desu ka?) or 「ますか？」 (masu ka?) (hereinafter referred to as a "question word") to be an interrogative sentence. Alternatively, the generation unit 112 may identify interrogative sentences by a speech recognition technique that uses a general sensibility control technique. The generation unit 112 then generates options for the user to select on the basis of the extracted interrogative sentence. However, when a word such as 「誰」 (who), 「いつ」 (when), 「どこ」 (where), 「何」 (what), 「なぜ」 (why), or 「どうして」 (how) (hereinafter referred to as a "5W1H word") appears within a predetermined number of characters before the extracted question word, the generation unit 112 instead generates an input field (character input screen) in which the user enters text. The generation unit 112 then generates the information related to the response to the semantic information on the basis of the option selected by the user or the user's input on the character input screen. Finally, the generation unit 112 transmits the generated information related to the response to the other communication device 10 via the transmission/reception unit 102.
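The decision made by the generation unit 112 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `classify`, the enum `ReplyUI`, and the window size of 10 characters are assumptions introduced here (the description only says "a predetermined number of characters"), and the word lists contain only the examples named in the description.

```python
from enum import Enum

# Example markers taken from the description; a real device would likely use more.
QUESTION_WORDS = ("ですか？", "ますか？")
FIVE_W_ONE_H_WORDS = ("誰", "いつ", "どこ", "何", "なぜ", "どうして")
WINDOW = 10  # hypothetical value for the "predetermined number of characters"


class ReplyUI(Enum):
    NONE = "none"         # no interrogative sentence: only display the text
    CHOICES = "choices"   # display selectable options such as はい / いいえ
    TEXT_INPUT = "input"  # display the character input screen


def classify(sentence: str, window: int = WINDOW) -> ReplyUI:
    """Decide which reply UI a transcribed sentence calls for."""
    for question_word in QUESTION_WORDS:
        pos = sentence.find(question_word)
        if pos == -1:
            continue
        # Inspect only the characters immediately before the question word.
        preceding = sentence[max(0, pos - window):pos]
        if any(word in preceding for word in FIVE_W_ONE_H_WORDS):
            return ReplyUI.TEXT_INPUT
        return ReplyUI.CHOICES
    return ReplyUI.NONE
```

With these assumptions, 「時間に間に合いますか？」 maps to `ReplyUI.CHOICES` and 「何が必要ですか？」 maps to `ReplyUI.TEXT_INPUT`. Plain substring matching is used only to keep the sketch short; it can misfire when a 5W1H string happens to occur inside another word.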

The semantic-to-speech conversion unit 114 converts text data (semantic information) into voice information and causes the voice output unit 104 to output the speech corresponding to the converted voice information. For example, the semantic-to-speech conversion unit 114 converts semantic information into voice information by a syllable concatenation method, a corpus-based method, a large-scale corpus-based method, or the like. Together, the semantic-to-speech conversion unit 114 and the voice output unit 104 form an output unit that converts semantic information into voice information and outputs it as speech.

The display unit 105 displays information based on the semantic information that was acquired by the acquisition unit 1131 of the speech-to-semantic conversion unit 113 and converted by the conversion unit 1132. The reception unit 106 receives the user's selection of an option or the user's input on the character input screen.

Next, the call method in the communication device 10 according to the present embodiment is described with reference to FIGS. 3 to 6. FIGS. 3 to 6 show the operation of the communication device 10 when the callee cannot take a voice call, for example because the callee is traveling on public transport. For convenience of explanation, the communication device 10 held by the caller is hereinafter referred to as the communication device 10B, and the communication device 10 held by the callee is referred to as the communication device 10A.

FIG. 3 is a sequence diagram showing call processing in the communication device 10 according to the present embodiment. First, the communication device 10B places a voice call to the communication device 10A (step S101). When a voice call arrives from the communication device 10B, the communication device 10A displays a notification to that effect on the display unit 105 to inform the user. The user performs an input operation on the reception unit 106 of the communication device 10A to select the character call mode. The character call mode is an operation mode for conducting a call by text. For example, when a voice call arrives, the communication device 10A displays on the display unit 105 options for choosing whether to answer in a "voice call mode", in which the call is conducted by voice, or in a "character call mode", in which the callee answers by text. The user then selects one of the modes. On the basis of the user's input operation received by the reception unit 106, the communication device 10A responds to the communication device 10B with information indicating that the character call mode has been selected (step S102).

The communication device 10A and the communication device 10B then start a call in the character call mode (step S103). At this time, the communication device 10B takes in the user's call speech through the voice input unit 103 and transmits voice information representing that speech to the communication device 10A. The communication device 10B also converts the voice information into semantic information using the conversion unit 1132 of the speech-to-semantic conversion unit 113 and displays the converted semantic information on the display unit 105.

The communication device 10B transmits voice information containing the interrogative expression 「ますか？」 (masu ka?). Here, assume that it transmits 「時間に間に合いますか？」 ("Will you make it in time?") (step S104). On receiving the voice information, the communication device 10A converts it into semantic information. The generation unit 112 of the communication device 10A then determines that the semantic information 「時間に間に合いますか？」 contains the interrogative expression 「ますか？」, and the communication device 10A displays on the display unit 105 the semantic information 「時間に間に合いますか？」 together with the options "Yes" and "No" generated by the generation unit 112. The communication device 10A then receives, through the reception unit 106, the selection of one of the options displayed on the display unit 105. Here, assume that the user operates the reception unit 106 and selects the option "Yes". The communication device 10A transmits text data indicating the selected option "Yes" (information related to the response to the semantic information) to the communication device 10B (step S105). On receiving the text data, the communication device 10B displays the "Yes" indicated by the text data on the display unit 105. At the same time, the communication device 10B converts the text data into speech using the semantic-to-speech conversion unit 114 and outputs "Yes" as speech from the voice output unit 104. In this example the communication device 10B both displays "Yes" on the display unit 105 and outputs it as speech from the voice output unit 104, but it may instead be controlled to perform only one of the two, that is, either the display on the display unit 105 or the speech output from the voice output unit 104, or the user of the communication device 10B may be allowed to choose which is performed.
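The caller-side handling of a received text response (display only, speech only, or both) could be sketched as below. This is only an illustration under stated assumptions: the function `present_response`, the mode strings, and the callback parameters `display` and `speak` are hypothetical stand-ins for the display unit 105 and for the semantic-to-speech conversion unit 114 with the voice output unit 104.

```python
from typing import Callable, List


def present_response(
    text: str,
    mode: str,
    display: Callable[[str], None],
    speak: Callable[[str], None],
) -> None:
    """Show and/or speak a text response received from the callee."""
    if mode not in ("display", "speech", "both"):
        raise ValueError(f"unknown output mode: {mode}")
    if mode in ("display", "both"):
        display(text)  # stands in for the display unit 105
    if mode in ("speech", "both"):
        speak(text)    # stands in for text-to-speech plus the voice output unit 104


# Usage with list-backed stand-ins instead of a real screen and speaker.
shown: List[str] = []
spoken: List[str] = []
present_response("はい", "both", shown.append, spoken.append)
```

Passing the two outputs as callbacks keeps the sketch independent of any particular display or speech synthesis API; the mode value would come from a device setting or from the caller's own choice.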

FIG. 4 is an image diagram showing an example of the options displayed by the communication device 10 according to the present embodiment. The screen shown in this figure is the one displayed when the communication device 10A receives the voice information containing the interrogative sentence 「時間に間に合いますか？」 ("Will you make it in time?") in step S104 described above. On receiving that voice information, the communication device 10A displays the interrogative sentence Q1, the option "Yes" Y, and the option "No" N on the display unit 105. Either the option "Yes" Y or the option "No" N can be selected through the reception unit 106. In this example the communication device 10 displays only the options "Yes" Y and "No" N on the display unit 105, but this is not a limitation: a character input screen may be displayed on the display unit 105 together with the options so that the user can give a reply other than the options.

FIG. 5 is a sequence diagram showing call processing in the communication device 10 according to the present embodiment. The operation shown in this figure is the one performed when the interrogative sentence in the semantic information contains a 5W1H word such as 「誰」 (who), 「いつ」 (when), 「どこ」 (where), 「何」 (what), 「なぜ」 (why), or 「どうして」 (how).

First, the communication device 10B places a voice call to the communication device 10A (step S201). When a voice call arrives from the communication device 10B, the communication device 10A displays a notification to that effect on the display unit 105 to inform the user. From the options of answering in the voice call mode or answering in the character call mode, the user performs an input operation on the reception unit 106 of the communication device 10A to select the character call mode. On the basis of the user's input operation received by the reception unit 106, the communication device 10A responds to the communication device 10B with information indicating that the character call mode has been selected (step S202). The communication device 10A and the communication device 10B then start a call in the character call mode (step S203).

The communication device 10B transmits voice information containing the interrogative sentence 「何が必要ですか？」 ("What is needed?") (step S204). At this time, the communication device 10B converts the voice information into semantic information using the conversion unit 1132 of the speech-to-semantic conversion unit 113 and displays the converted semantic information on the display unit 105. On receiving the voice information, the communication device 10A converts it into semantic information. Because the semantic information is an interrogative sentence containing 「何」 (what) and 「ですか？」 (desu ka?), the communication device 10A then displays the character input screen on the display unit 105 together with the interrogative sentence 「何が必要ですか？」. The communication device 10A receives text entered on the character input screen through the reception unit 106. The user operates the reception unit 106 and enters the reply 「印鑑が必要です」 ("A seal is required") on the character input screen. The communication device 10A transmits text data indicating the reply entered on the character input screen (information related to the response to the semantic information) to the communication device 10B (step S205). On receiving the text data, the communication device 10B displays the reply "A seal is required" indicated by the text data on the display unit 105. At the same time, the communication device 10B converts the text data into speech using the semantic-to-speech conversion unit 114 and outputs "A seal is required" as speech from the voice output unit 104. In this example the communication device 10B performs both the display on the display unit 105 and the speech output from the voice output unit 104, but it may instead be controlled to perform only one of the two. The communication device 10B may also let its user choose which is performed.

FIG. 6 is an image diagram showing an example of the character input screen displayed by the communication device 10 according to the present embodiment. The screen shown in this figure is the one displayed when the communication device 10A receives the voice information containing the interrogative sentence 「何が必要ですか？」 ("What is needed?") in step S204 described above. On receiving that voice information, the communication device 10A displays the interrogative sentence Q2 and the character input screen A on the display unit 105. Text such as "A seal is required" can be entered on the character input screen A through the reception unit 106.

Next, the call control processing that implements the call method in the communication device 10 according to the present embodiment is described. FIG. 7 is a flowchart showing the procedure of the call control processing in the communication device 10 according to the present embodiment. The processing shown in this figure is executed by the callee's communication device 10A during a call in the character call mode.

First, the control unit 101 starts a call with the caller's communication device 10 in the character call mode via the transmission/reception unit 102 (step S501). Next, the acquisition unit 1131 of the speech-to-semantic conversion unit 113 determines whether the transmission/reception unit 102 has received voice information from the caller's communication device 10 (step S502). If the acquisition unit 1131 determines that no voice information has been received (step S502: No), the processing returns to step S502. If the acquisition unit 1131 determines that voice information has been received (step S502: Yes), it acquires the voice information received by the transmission/reception unit 102. The conversion unit 1132 then converts the voice information acquired by the acquisition unit 1131 into semantic information (step S503).

Next, the generation unit 112 of the control unit 101 determines whether the acquired semantic information contains an interrogative sentence (step S504). Specifically, the generation unit 112 determines that an interrogative sentence is contained when the semantic information contains a question word, and that no interrogative sentence is contained when the semantic information contains no question word. If the generation unit 112 determines that the semantic information contains no interrogative sentence (step S504: No), it displays the semantic information on the display unit 105 (step S505), and the processing returns to step S502.

If, on the other hand, the generation unit 112 determines that the semantic information contains an interrogative sentence (step S504: Yes), it determines whether the semantic information contains a predetermined word (for example, a 5W1H word) (step S506).

If the generation unit 112 determines that the semantic information does not contain the predetermined word (step S506: No), it displays the options on the display unit 105 together with the received semantic information (step S507). The reception unit 106 receives the selection input for the options displayed on the display unit 105. The generation unit 112 then determines, through the reception unit 106, whether the user has selected an option (step S508). If the generation unit 112 determines that no option has been selected (step S508: No), the processing returns to step S508. If the generation unit 112 determines that an option has been selected (step S508: Yes), it transmits text data indicating the selection result, that is, the selected option (information related to the response to the semantic information), to the caller's communication device 10 via the transmission/reception unit 102 (step S509).

On the other hand, when the generation unit 112 determines that the semantic information contains a predetermined word (step S506: Yes), it displays a character input screen on the display unit 105 together with the received semantic information (step S510). The reception unit 106 accepts character input on the character input screen displayed on the display unit 105. The generation unit 112 then determines, via the reception unit 106, whether the user has completed the character input (step S511). If the character input has not been completed (step S511: No), the process returns to step S511; that is, it waits for the input to finish. If the character input has been completed (step S511: Yes), the generation unit 112 transmits text data indicating the entered characters (information related to a response to the semantic information) to the caller's communication device 10 (step S512).
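The two determinations in steps S504 and S506 can be sketched as a small classifier. This is only an illustrative sketch, not the patented implementation: the names `ResponseMode` and `classify` are hypothetical, the question markers are the two examples the description gives ("ですか", "ますか"), and the 5W1H word list is an assumption, since the text does not enumerate those words.

```python
from enum import Enum

# Question-word markers; these two are the examples named in the description.
QUESTION_MARKERS = ("ですか", "ますか")
# Assumed 5W1H vocabulary (hypothetical; the description gives no list).
FIVE_W_ONE_H = ("いつ", "どこ", "誰", "だれ", "何", "なに", "なぜ", "どう")


class ResponseMode(Enum):
    DISPLAY_ONLY = "display_only"  # S504: No  -> S505
    OPTIONS = "options"            # S506: No  -> S507
    TEXT_INPUT = "text_input"      # S506: Yes -> S510


def classify(semantic_info: str) -> ResponseMode:
    """Decide how the callee's device should present the received text."""
    # S504: a question sentence is assumed present if any marker occurs.
    if not any(marker in semantic_info for marker in QUESTION_MARKERS):
        return ResponseMode.DISPLAY_ONLY
    # S506: a 5W1H word means the answer is open-ended, so ask for typed text.
    if any(word in semantic_info for word in FIVE_W_ONE_H):
        return ResponseMode.TEXT_INPUT
    return ResponseMode.OPTIONS
```

Under these assumptions, "今から会えますか？" maps to `OPTIONS`, "どこにいますか？" maps to `TEXT_INPUT`, and a plain statement such as "もうすぐ着きます" maps to `DISPLAY_ONLY`. Simple substring matching is used for brevity; a real device would presumably need proper morphological analysis.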

Following step S509 or step S512, the control unit 101 determines whether the call with the caller's communication device 10 has ended (step S513). If the control unit 101 determines that the call has not ended (step S513: No), the process returns to step S502. If it determines that the call has ended (step S513: Yes), the call control process ends.
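As a rough illustration of how the steps above fit together, the call control process can be written as a loop over received semantic information. Everything here is a hypothetical sketch: `run_call_control` and its callback parameters are invented for illustration, the mode strings are arbitrary, and the end of the call (step S513) is modelled simply as the input sequence running out.

```python
from typing import Callable, Iterable


def run_call_control(
    received: Iterable[str],
    classify: Callable[[str], str],
    pick_option: Callable[[str], str],
    read_text: Callable[[str], str],
    send: Callable[[str], None],
    show: Callable[[str], None],
) -> None:
    """Handle each piece of semantic information until the call ends."""
    for semantic_info in received:      # each pass corresponds to returning to S502
        mode = classify(semantic_info)  # S504 and S506
        show(semantic_info)             # S505, S507 and S510 all display the text
        if mode == "options":
            # S507-S509: block until an option is chosen, then send it as text.
            send(pick_option(semantic_info))
        elif mode == "text_input":
            # S510-S512: block until typing is complete, then send the text.
            send(read_text(semantic_info))
        # Any other mode: nothing is sent; the text was only displayed (S505).
```

Passing the user-interface and transport operations in as callables keeps the sketch independent of any particular display, input, or network API, none of which the description specifies.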

Thus, according to the present embodiment, when a call is made in the character call mode, the communication device 10 converts the voice information received from the caller's communication device 10 into semantic information and displays it on the display unit 105. The caller can therefore convey a message to the callee while the voice call remains connected. In other words, when the callee is unable to talk by voice, the caller does not need to end the call and launch mail or chat in order to convey the message as text.

In addition, when the received voice information contains a question sentence, the communication device 10 displays options on the display unit 105 together with the received semantic information. The communication device 10 then accepts the callee's selection of an option at the reception unit 106 and transmits text data indicating the selected option to the caller's communication device 10. The callee can thus respond to the caller's question without entering any characters, which speeds up the response to the caller.

Further, when a question sentence containing a question word such as "desu ka?" (「ですか？」) or "masu ka?" (「ますか？」) also contains a 5W1H word, the communication device 10 displays a character input screen on the display unit 105 together with the received semantic information. The communication device 10 then accepts character input on the character input screen at the reception unit 106 and transmits text data indicating the entered characters to the caller's communication device 10. The callee can thus respond to the caller's question without speaking.

Note that part of the communication device 10 in the embodiment described above, for example the control unit 101, may be realized by a computer. In that case, a program for realizing this control function may be recorded on a computer-readable recording medium, and the function may be realized by having a computer system read and execute the program recorded on that medium. The "computer system" here is a computer system built into the communication device 10 and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" may further include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and a medium that holds the program for a certain period, such as volatile memory inside a computer system acting as the server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.
Part or all of the communication device 10 in the embodiment described above may also be realized as an integrated circuit such as an LSI (Large Scale Integration).
Each functional block of the communication device 10 may be implemented as an individual processor, or some or all of the blocks may be integrated into a single processor. The circuit integration technique is not limited to LSI; a dedicated circuit or a general-purpose processor may be used instead. If advances in semiconductor technology give rise to an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

An embodiment of the present invention has been described in detail above with reference to the drawings. However, the specific configuration is not limited to the one described, and various design changes and the like may be made without departing from the gist of the invention.

For example, in the embodiment described above, the callee's communication device 10 converts the voice information into semantic information. However, the invention is not limited to this; the caller's communication device 10 may convert the voice information into semantic information and transmit the converted semantic information to the callee's communication device 10. Alternatively, the voice information may be converted into semantic information at a base station of the network connecting the caller's communication device 10 and the callee's communication device 10.
Similarly, the conversion from semantic information to voice information may be performed at any of the callee's communication device 10, the caller's communication device 10, or a base station connecting the caller's communication device 10 and the callee's communication device 10.
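One consequence of these variations is that the kind of data carried on each leg of the connection depends on where the conversion takes place. The following sketch makes that explicit for the voice-to-semantic direction only. The names `Site` and `payload_per_hop`, and the simplification of the network to two hops through a single base station, are assumptions made for illustration.

```python
from enum import Enum


class Site(Enum):
    """Where voice information is converted into semantic information."""
    CALLER_DEVICE = 1
    BASE_STATION = 2
    CALLEE_DEVICE = 3


def payload_per_hop(site: Site) -> dict:
    """Return what travels on each hop for a given conversion site."""
    return {
        # Voice leaves the caller's device unless that device converts it first.
        "caller_to_base_station": (
            "semantic" if site is Site.CALLER_DEVICE else "voice"
        ),
        # Voice reaches the callee's device only if that device does the conversion.
        "base_station_to_callee": (
            "voice" if site is Site.CALLEE_DEVICE else "semantic"
        ),
    }
```

For example, converting at the base station means voice is carried on the first hop and semantic information on the second, whereas converting on the callee's device (the arrangement of the embodiment) means voice is carried end to end.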

In the embodiment described above, the communication device 10 displays the character input screen when a 5W1H word is included in a question sentence of the semantic information. However, the invention is not limited to this; for example, when a fixed expression such as "what time" (「何時」) is included, options such as "1 o'clock", "2 o'clock", and so on may be displayed. Specifically, when the semantic information contains the question sentence "What time will you arrive today?" (「今日は何時に到着しますか？」), the communication device 10 displays the options "1 o'clock", "2 o'clock", ..., "24 o'clock" on the display unit 105 together with the question sentence, and accepts the selection of an option through the reception unit 106.
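This variation amounts to a lookup from fixed expressions to prepared option lists. A minimal sketch follows, assuming a hypothetical table `FIXED_EXPRESSION_OPTIONS` and helper `options_for`. Only the "何時" entry comes from the description; the table could hold further expressions. The option labels use ASCII digits here rather than the full-width digits of the original text.

```python
from typing import List, Optional

# Fixed expression -> prepared options. Only "何時" (what time) is taken from
# the description; any other entries would be additional assumptions.
FIXED_EXPRESSION_OPTIONS = {
    "何時": [f"{hour}時" for hour in range(1, 25)],  # "1時" ... "24時"
}


def options_for(question: str) -> Optional[List[str]]:
    """Return prepared options if the question contains a fixed expression.

    Returns None when no fixed expression matches, in which case the device
    would fall back to the character input screen.
    """
    for expression, options in FIXED_EXPRESSION_OPTIONS.items():
        if expression in question:
            return options
    return None
```

With this table, `options_for("今日は何時に到着しますか？")` yields 24 options from "1時" to "24時", while a question without a registered fixed expression yields `None`.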

(1) One aspect of the present invention is a communication device comprising: a reception unit that receives input from a user; an acquisition unit that acquires voice information; a conversion unit that converts the voice information acquired by the acquisition unit into semantic information; a display unit that displays information based on the semantic information converted by the conversion unit; a generation unit that extracts a question sentence from the semantic information converted by the conversion unit, generates options to be selected by the user based on the extracted question sentence, causes the display unit to display the options, and generates information related to a response to the semantic information based on the option that the user selected from the displayed options and that the reception unit received; and a transmission unit that transmits the information related to the response generated by the generation unit to another device.

(2) Another aspect of the present invention is the communication device according to (1), wherein the generation unit extracts a question sentence from the semantic information, generates, based on the extracted question sentence, an input field for the user to enter semantic information, and generates information related to a response to the semantic information based on the user's input to that input field as received by the reception unit.

(3) Another aspect of the present invention is the communication device according to (1) or (2), comprising an output unit that converts the semantic information into voice information and outputs it as audio.

(4) Another aspect of the present invention is a communication method comprising: receiving input from a user; acquiring voice information; converting the acquired voice information into semantic information; displaying information based on the converted semantic information; generating information related to a response to the semantic information based on the received input from the user and the converted semantic information; and transmitting the generated information related to the response to another device.

(5) Another aspect of the present invention is a program that causes a computer to: receive input from a user; acquire voice information; convert the acquired voice information into semantic information; display information based on the converted semantic information; generate information related to a response to the semantic information based on the received input from the user and the converted semantic information; and transmit the generated information related to the response to another device.

DESCRIPTION OF SYMBOLS
10: communication device
101: control unit
102: transmission/reception unit
103: voice input unit
104: voice output unit
105: display unit
106: reception unit
111: acquisition unit
112: generation unit
113: voice-to-meaning conversion unit
114: meaning-to-voice conversion unit

Claims

1. A communication device comprising:
a reception unit that receives input from a user;
an acquisition unit that acquires voice information;
a conversion unit that converts the voice information acquired by the acquisition unit into semantic information;
a display unit that displays information based on the semantic information converted by the conversion unit;
a generation unit that extracts a question sentence from the semantic information converted by the conversion unit, generates options to be selected by the user based on the extracted question sentence, causes the display unit to display the options, and generates information related to a response to the semantic information based on the option that the user selected from the displayed options and that the reception unit received; and
a transmission unit that transmits the information related to the response generated by the generation unit to another device.

2. The communication device according to claim 1,
wherein the generation unit extracts a question sentence from the semantic information, generates, based on the extracted question sentence, an input field for the user to enter semantic information, and generates information related to a response to the semantic information based on the user's input to the input field as received by the reception unit.

3. The communication device according to claim 1 or 2,
comprising an output unit that converts the semantic information into voice information and outputs it as audio.

4. A communication method comprising:
receiving input from a user;
acquiring voice information;
converting the acquired voice information into semantic information;
displaying information based on the converted semantic information;
generating information related to a response to the semantic information based on the received input from the user and the converted semantic information; and
transmitting the generated information related to the response to another device.

5. A program that causes a computer to:
receive input from a user;
acquire voice information;
convert the acquired voice information into semantic information;
display information based on the converted semantic information;
generate information related to a response to the semantic information based on the received input from the user and the converted semantic information; and
transmit the generated information related to the response to another device.