JP2011180645A

JP2011180645A - Conversation management system and management server

Info

Publication number: JP2011180645A
Application number: JP2010041675A
Authority: JP
Inventors: Takehiro Yamamoto; 武洋山本
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2010-02-26
Filing date: 2010-02-26
Publication date: 2011-09-15
Anticipated expiration: 2030-02-26
Also published as: JP5067435B2

Abstract

PROBLEM TO BE SOLVED: To provide a conversation management system that accurately recognizes the start and end of a conversation among a plurality of general users and records only the necessary conversations.

SOLUTION: A wearable terminal 20 acquires, with a camera 40, partner identification information for identifying the general user who is the conversation partner. At least one of the wearable terminal 20 and an utterance content management server 10 identifies that general user from the partner identification information with a face recognition unit 12. The wearable terminal 20 acquires, with a microphone 30, at least the utterances of the general user who uses the terminal. The utterance content management server 10 records the utterances of the plurality of general users with a voice recognition result management unit 13, taking the point at which the partner identification information is mutually acquired as the start of the conversation. When a state in which the partner identification information is not mutually acquired continues for a prescribed time, a meeting history management unit 14 ends the recording of the conversation.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a conversation management system and a management server. The system has a plurality of wearable terminals used individually by a plurality of general users in conversation, and a single management server that performs data communication with those wearable terminals.

At present, when one wants to record a conversation and search or browse it later, the conventional method is to record it by explicit operation of an IC (Integrated Circuit) recorder or the like and to transcribe the audio afterwards.

This method is used only for particular, important conversations, because of the psychological resistance to and bother of the explicit operation, and because of the tedious work of transferring the audio data to a PC (Personal Computer) and converting the audio to text.

Meanwhile, wearable terminals are becoming widespread, and a person wearing one is in an environment where their own utterances can be recorded at all times through the microphone attached to the terminal.

Techniques for identifying an individual by face authentication have also emerged. More specifically, the size of a face image is normalized and contour lines are detected from it. General features, which consist of closed regions where a contour line is closed, and unique features, which consist of line segments where a contour line has open ends and of isolated points, are then detected. The result is collated by pattern matching against registered, feature-processed face image files to authenticate the person.

Because the registered face image that has undergone feature processing is collated with the feature-processed face image derived from the captured face image of a visitor, the pattern matching process can be simplified (Patent Document 1).

Furthermore, as a related technique, a communication recording system has been proposed that allows constant calling and constant recording in any unconstrained situation while increasing the evidential value of the recorded content.

In that technique, the headset includes a microphone that detects the voice of the headset wearer and generates a first audio signal, and transmitting means that transmits the first audio signal to a device by short-range wireless communication compliant with the Bluetooth standard.

The device includes: receiving means that receives the first audio signal from the headset by short-range wireless communication compliant with the Bluetooth standard; audio input means that detects the voice of a person other than the headset wearer and generates a second audio signal; time information acquisition means that acquires the times at which the first and second audio signals occurred; information creation means that associates the time information with the first and second audio signals to create the information to be stored; and storage means that stores the created information (Patent Document 2).

Patent Document 1: JP 2005-242432 A
Patent Document 2: JP 2005-237017 A

When both speakers in a conversation wear wearable terminals, there is a need to record efficiently what was said and to search and browse it later. With the conventional method, however, a wearable terminal records only its own user's utterances, so the other party's utterances cannot be searched or browsed.

The technique of Patent Document 2 can solve this problem. Even so, it is difficult to recognize accurately the start and end of a conversation among a plurality of general users and to record only the necessary conversations.

The present invention has been made in view of the above problems, and provides a conversation management system and a management server that can accurately recognize the start and end of a conversation among a plurality of general users and record only the necessary conversations.

The conversation management system of the present invention has a plurality of wearable terminals used individually by a plurality of general users in conversation, and a single management server that performs data communication with the plurality of wearable terminals. The wearable terminal has identification acquisition means for acquiring partner identification information that identifies the general user who is the conversation partner. At least one of the wearable terminal and the management server has user identification means for identifying that general user from the partner identification information. The wearable terminal has utterance acquisition means for acquiring at least the utterances of the general user who uses the terminal. The management server has recording start means for starting recording of the conversation of the plurality of general users when the partner identification information is mutually acquired, and recording end means for ending the recording of the conversation when a state in which the partner identification information is not mutually acquired has continued for a predetermined time.

The management server of the present invention is the management server of the conversation management system of the present invention. It has recording start means for starting recording of the conversation of a plurality of general users when the partner identification information is mutually acquired, and recording end means for ending the recording of the conversation when a state in which the partner identification information is not mutually acquired has continued for a predetermined time.

The various components of the present invention need only be formed so as to realize their functions. For example, each can be realized as dedicated hardware that performs a predetermined function, as an utterance content management server given a predetermined function by a computer program, as a predetermined function realized in the utterance content management server by a computer program, or as any combination of these.

The various components of the present invention also need not exist independently of one another. A plurality of components may be formed as a single member, one component may be formed of a plurality of members, one component may be part of another component, and part of one component may overlap part of another.

In the conversation management system of the present invention, the wearable terminal acquires, with the identification acquisition means, partner identification information that identifies the general user who is the conversation partner. At least one of the wearable terminal and the management server identifies that general user from the partner identification information with the user identification means. The wearable terminal acquires, with the utterance acquisition means, at least the utterances of the general user who uses the terminal. When the partner identification information is mutually acquired, the management server starts recording the conversation of the plurality of general users with the recording start means. When a state in which the partner identification information is not mutually acquired has continued for a predetermined time, the recording end means ends the recording of the conversation. The start and end of a conversation among a plurality of general users can therefore be recognized accurately, and only the necessary conversations are recorded.

FIG. 1 is a schematic block diagram showing the logical structure of a data processing system according to an embodiment of the present invention.
FIG. 2 is a schematic time chart showing the conversation recording process.
FIG. 3 is a schematic diagram showing the data structure of a conversation log.
FIG. 4 is a schematic time chart showing the process of recording a conversation between two general users.
FIG. 5 is a schematic diagram showing the data structure of the general users' IDs.
FIG. 6 is a flowchart showing the process of recognizing the start of a conversation.
FIG. 7 is a schematic diagram showing the data structure of a plurality of recorded conversations.
FIG. 8 is a schematic diagram showing the data structure of the utterance content and related data of one conversation.
FIG. 9 is a flowchart showing the process of recognizing the end of a conversation.
FIG. 10 is a schematic diagram showing the conversation history display menu displayed on the glasses-type display.
FIG. 11 is a schematic diagram showing search results displayed on the glasses-type display.
FIG. 12 is a schematic time chart showing the process of recording a conversation among three general users.

An embodiment of the present invention is described below with reference to the drawings. Referring to FIG. 1, the conversation management system of the present invention has a plurality of wearable terminals 20 used individually by a plurality of general users in conversation, and a single utterance content management server 10 that performs data communication with the plurality of wearable terminals 20.

The wearable terminal 20 has a camera 40, which serves as the identification acquisition means for acquiring partner identification information that identifies the general user who is the conversation partner. The utterance content management server 10 has a face recognition unit 12, which serves as the user identification means for identifying that general user from the partner identification information.

The wearable terminal 20 has a microphone 30, which serves as the utterance acquisition means for acquiring at least the utterances of the general user who uses the terminal. The utterance content management server 10 has a voice recognition result management unit 13, which serves as the recording start means for starting recording of the conversation of the plurality of general users when the partner identification information is mutually acquired, and a meeting history management unit 14, which serves as the recording end means for ending the recording of the conversation when a state in which the partner identification information is not mutually acquired has continued for a predetermined time.

Once the predetermined time has elapsed, the meeting history management unit 14 saves the conversation in the voice recognition result management unit 13, taking the point at which the partner identification information ceased to be mutually acquired as the end of the conversation. Because the wearable terminal 20 has a voice recognition unit 21 that converts the acquired utterances into text data by voice recognition, the voice recognition result management unit 13 records the conversation as text data.

The utterance content management server 10 lets a general user identified as a conversation partner browse the recorded conversation on the wearable terminal 20. The microphone 30 of the wearable terminal 20 acquires a search key spoken by the general user who uses the terminal, and the utterance content management server 10 searches the conversations with the acquired search key and presents the results for browsing.

The wearable terminal 20 acquires a face image of the general user who is the conversation partner as the partner identification information. The wearable terminal 20 can also acquire the partner identification information from the wearable terminal 20 of that conversation partner.

Furthermore, as described in detail later, the voice recognition result management unit 13 also records the conversation when the partner identification information of three or more general users is mutually acquired. In that case the meeting history management unit 14 ends the recording of the conversation when a state in which none of the three or more users whose conversation is being recorded mutually acquires the partner identification information has continued for the predetermined time.

More specifically, the utterance content management server 10 and the wearable terminals 20 operate under program control and are connected to each other via a data network 100 such as the Internet.

The utterance content management server 10 is an information processing apparatus, such as a workstation or server, installed on the Internet. It logically has an ID management unit 11, a face recognition unit 12, a voice recognition result management unit 13, a meeting history management unit 14, a search unit 15, and so on.

The wearable terminal 20 is an information processing apparatus such as a small personal computer, and includes a voice recognition unit 21, a microphone 30, a camera 40, a glasses-type display 50, operation keys 60, and so on.

The wearable terminal 20 has a function of transmitting the information captured by the camera 40 to the utterance content management server 10 via the data network 100. It also has a function of receiving data transmitted by the utterance content management server 10 and displaying it on the screen of the glasses-type display 50.

The wearable terminal 20 also has a function of converting the audio picked up by the microphone 30 into text with the voice recognition unit 21 and transmitting the text to the utterance content management server 10. Depending on the configuration, the voice recognition unit 21 may instead be provided on the utterance content management server 10, in which case the wearable terminal 20 may transmit the audio data as-is to the utterance content management server 10.

Next, the operation of this embodiment is described in detail with reference to FIGS. 1 to 11. The case in which user A and user B converse from 10:00 to 10:02, as shown in FIG. 2, is described in detail with reference to FIGS. 3 and 4.

First, user A's wearable terminal 20 transmits video of user B, captured with the camera 40, to the utterance content management server 10 (step A1). The wearable terminal 20 operates so as to transmit video to the utterance content management server 10 continually at fixed intervals.

Next, the utterance content management server 10 matches the received video against the ID information stored in the ID management unit 11, using face recognition by the face recognition unit 12 or recognition of the ID of the wearable terminal 20, and determines that the person is user B (step A2).

Each wearable terminal is assigned its own ID, and the ID management unit 11 manages these IDs so that the individual can be identified, as shown in FIG. 5. To allow the ID of the wearable terminal 20 to be recognized, it can be displayed, for example, as a two-dimensional code or the like on the front of the glasses-type display 50 (not shown).
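The lookup performed by the ID management unit 11 can be sketched as follows. This is a minimal illustration, not the structure of FIG. 5: the names `UserRecord`, `IdManager` and `resolve`, the field names, and the sample IDs are all hypothetical, and the face match is reduced to an exact key lookup standing in for real face recognition.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class UserRecord:
    user_id: str        # hypothetical identifier of the general user
    name: str
    terminal_id: str    # ID assigned to the user's wearable terminal
    face_template: str  # placeholder for registered face features


class IdManager:
    """Toy stand-in for the ID management unit 11."""

    def __init__(self, records: Iterable[UserRecord]) -> None:
        records = list(records)
        self._by_terminal = {r.terminal_id: r for r in records}
        self._by_face = {r.face_template: r for r in records}

    def resolve(self, terminal_id: Optional[str] = None,
                face_template: Optional[str] = None) -> Optional[UserRecord]:
        # Prefer the terminal ID (e.g. decoded from a two-dimensional code),
        # then fall back to the face features.
        if terminal_id is not None and terminal_id in self._by_terminal:
            return self._by_terminal[terminal_id]
        if face_template is not None:
            return self._by_face.get(face_template)
        return None
```

Preferring the terminal ID over the face match is a design choice made for this sketch only; the text allows either route.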

Next, using the determination logic shown in FIG. 6, the meeting history management unit 14 sets the recognition time as both the start time and the end time of the conversation, as shown in FIG. 7, sets user A as speaker 1 and user B as speaker 2, and sets the state to "waiting for recognition" (step A3).

Similarly, when user B's wearable terminal 20 transmits video of user A, captured with the camera 40, to the utterance content management server 10, the meeting history management unit 14 finds that users A and B are in the "waiting for recognition" state. It therefore judges that mutual recognition between A and B is complete and that a conversation has started, sets the recognition time as the conversation end time, and sets the state to "in conversation".
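The start-of-conversation handling in steps A1 to A3 can be sketched as a small state machine. FIG. 6 itself is not reproduced in the text, so the code below is one reading of the prose under stated assumptions: the names `State`, `MeetingRecord`, `MeetingHistory` and `on_recognized` are hypothetical, times are plain numbers, and a repeated one-way recognition while waiting is simply ignored.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class State(Enum):
    WAITING = "waiting for recognition"
    TALKING = "in conversation"
    ENDED = "ended"


@dataclass
class MeetingRecord:
    speaker1: str
    speaker2: str
    start: float
    end: float
    state: State = State.WAITING


class MeetingHistory:
    """Toy stand-in for the meeting history management unit 14."""

    def __init__(self) -> None:
        self.records: List[MeetingRecord] = []

    def _open_record(self, a: str, b: str) -> Optional[MeetingRecord]:
        for r in self.records:
            if r.state is not State.ENDED and {r.speaker1, r.speaker2} == {a, b}:
                return r
        return None

    def on_recognized(self, observer: str, observed: str, t: float) -> MeetingRecord:
        rec = self._open_record(observer, observed)
        if rec is None:
            # First one-way recognition: start == end == t, state is waiting.
            rec = MeetingRecord(observer, observed, start=t, end=t)
            self.records.append(rec)
        elif rec.state is State.WAITING and observer == rec.speaker2:
            # The other side has now recognized speaker 1: mutual recognition.
            rec.state = State.TALKING
            rec.end = t
        elif rec.state is State.TALKING:
            # Assumption: every later recognition refreshes the end time.
            rec.end = t
        return rec
```

For example, `on_recognized("A", "B", 600.0)` followed by `on_recognized("B", "A", 601.0)` leaves a single record in the "in conversation" state with start 600.0 and end 601.0.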

Next, user A's wearable terminal 20 converts the audio input through the microphone 30 into text with the voice recognition unit 21 (step A4) and transmits the text to the utterance content management server 10 (step A5).

Next, the utterance content management server 10 stores the received text in the voice recognition result management unit 13, as shown in FIG. 8 (step A6). The conversation continues in this way. Using the determination logic shown in FIG. 9, the server refers to the meeting history management unit 14; if the users have not recognized each other's faces for a certain time, so that the conversation end time has not been updated, it judges that the conversation has ended and sets the state to "ended".
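The end-of-conversation check described above can be sketched as a periodic sweep over the open conversations. FIG. 9 is not reproduced in the text, so this is only an illustration: `Conversation`, `close_stale` and the parameter `timeout_s` are hypothetical names, and the source does not give a value for the "certain time".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Conversation:
    speakers: Tuple[str, ...]
    start: float
    last_mutual_recognition: float  # plays the role of the conversation end time
    state: str = "in conversation"


def close_stale(conversations: List[Conversation], now: float,
                timeout_s: float) -> List[Conversation]:
    """Mark conversations whose end time has not been updated for timeout_s as ended."""
    closed = []
    for c in conversations:
        idle = now - c.last_mutual_recognition
        if c.state == "in conversation" and idle >= timeout_s:
            c.state = "ended"
            closed.append(c)
    return closed
```

Because the end time stops advancing at the last mutual recognition, the stored end of the conversation is that moment rather than the later moment at which the timeout fires.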

Next, when the user operates the operation keys 60, the conversation history display menu is displayed on the glasses-type display 50, as shown in FIG. 10. At this time, the wearable terminal 20 requests the utterance content management server 10 to search the conversation history (step A7).

Next, the search unit 15 in the utterance content management server 10 extracts from the meeting history management unit 14 the times at which the user was in conversation and the conversation partners, extracts the utterances made during those times from the voice recognition result management unit 13 (step A8), and transmits them to the wearable terminal 20 (step A9).
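The two-stage lookup in steps A8 and A9 (first the user's conversation periods and partners, then the utterances that fall inside those periods) can be sketched as follows. The types `Meeting` and `Utterance` and the function `conversation_history` are hypothetical, and the actual layouts of FIGS. 7 and 8 are not given in the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Meeting:
    speakers: FrozenSet[str]
    start: float
    end: float


@dataclass(frozen=True)
class Utterance:
    speaker: str
    time: float
    text: str


def conversation_history(user: str, meetings: Iterable[Meeting],
                         utterances: Iterable[Utterance]
                         ) -> List[Tuple[Meeting, List[Utterance]]]:
    """Return each meeting the user took part in, with its utterances in time order."""
    utterances = list(utterances)
    result = []
    for m in meetings:
        if user not in m.speakers:
            continue  # the user may only browse conversations they were part of
        lines = sorted(
            (u for u in utterances
             if u.speaker in m.speakers and m.start <= u.time <= m.end),
            key=lambda u: u.time,
        )
        result.append((m, lines))
    return result
```

Filtering on both the speaker set and the time window is what keeps utterances made outside the conversation, or by people who were not part of it, out of the shared history.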

Next, the wearable terminal 20 displays the conversation history on the glasses-type display 50 (step A10). As shown in FIG. 11, the system may also be arranged so that a keyword is entered, for example by voice input, and utterances and other information are searched with it.

In the conversation management system of this embodiment, as described above, the wearable terminal 20 acquires with the camera 40 the partner identification information that identifies the general user who is the conversation partner. The utterance content management server 10 identifies that general user from the partner identification information with the face recognition unit 12.

The wearable terminal 20 acquires, with the microphone 30, at least the utterances of the general user who uses the terminal. The utterance content management server 10 records the utterances of the plurality of general users with the voice recognition result management unit 13, taking the point at which the partner identification information is mutually acquired as the start of the conversation, and ends the recording of the conversation with the meeting history management unit 14 when a state in which the partner identification information is not mutually acquired has continued for the predetermined time.

The conversation management system of this embodiment can therefore accurately recognize the start and end of a conversation among a plurality of general users and record only the necessary conversations. In addition, a user can display on their own terminal the voice recognition result text recorded by the conversation partner's terminal.

This is because the ID information identified by face recognition or ID recognition is used together with the time to determine the periods during which a conversation was taking place, so that only the utterances made during those periods are shared between the conversation partners.

Moreover, even in a conversation among several people, everyone who took part can refer to one another's utterances. This is because, when several one-to-one conversation pairs are established at the same time, they are regarded as a single conversation among several people.

An example of a conversation among three people is given with reference to FIG. 12. When a conversation between A and B, a conversation between A and C, and a conversation between B and C have all been established, it is judged that a three-person conversation has been established. Using the end logic of FIG. 9, the conversation is judged to have ended when a certain time has passed without any of the three recognizing a face.
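The three-person rule above generalizes to "every pair among the participants has an established conversation". A minimal sketch, assuming hypothetical helper names `group_established` and `group_ended` and a `last_seen` map from each pair to the time of its most recent recognition:

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Sequence, Tuple


def group_established(participants: Sequence[str],
                      established_pairs: Iterable[Tuple[str, str]]) -> bool:
    """True when every one-to-one pair among the participants is established."""
    pairs = {frozenset(p) for p in established_pairs}
    return all(frozenset(p) in pairs for p in combinations(participants, 2))


def group_ended(last_seen: Dict[FrozenSet[str], float],
                now: float, timeout_s: float) -> bool:
    """True when no pair has been recognized within the last timeout_s."""
    return all(now - t >= timeout_s for t in last_seen.values())
```

With participants A, B and C, the pairs A-B and A-C alone do not establish the group; adding B-C does. The group ends only once every pair has been idle for the timeout, which matches the condition that none of the three recognizes a face for a certain time.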

The present invention is not limited to this embodiment, and various modifications are permitted without departing from its gist. For example, the above embodiment illustrated the face recognition unit 12, which is the user identification means for identifying the general user who is the conversation partner from the partner identification information, as residing in the utterance content management server 10. The face recognition unit 12 may instead reside in the wearable terminal 20.

Likewise, the voice recognition unit 21, which converts the acquired utterances into text data by voice recognition, was illustrated as residing in the wearable terminal 20. The voice recognition unit 21 may instead reside in the utterance content management server 10.

Furthermore, this embodiment illustrated the units of the utterance content management server 10 and the wearable terminal 20 as being logically realized as functions by a computer program. Each of these units can instead be formed as dedicated hardware, or realized as a combination of software and hardware.

The above embodiment gave the present-day Internet as an example of the data network 100, but the data network may instead be an NGN (Next Generation Network), the next-generation Internet.

Naturally, the embodiment and the modifications described above can be combined to the extent that their contents do not conflict. The structure of each unit has been described specifically in the embodiment and modifications above, but it can be changed in various ways within the scope of the present invention.

DESCRIPTION OF SYMBOLS
10 Utterance content management server
11 ID management unit
12 Face recognition unit
13 Voice recognition result management unit
14 Meeting history management unit
15 Search unit
20 Wearable terminal
21 Voice recognition unit
30 Microphone
40 Camera
50 Glasses-type display
60 Operation keys
100 Data network

Claims

A conversation management system comprising a plurality of wearable terminals used individually by a plurality of general users in conversation, and a single management server in data communication with the plurality of wearable terminals, wherein
The wearable terminal has identification acquisition means for acquiring partner identification information for identifying the general user who is the conversation partner,
At least one of the wearable terminal and the management server has user identification means for identifying the general user who is the conversation partner from the partner identification information,
The wearable terminal has utterance acquisition means for acquiring at least the utterances of the general user who uses the terminal, and
The management server has recording start means for starting recording of the conversation of the plurality of general users when the partner identification information is mutually acquired, and recording end means for ending the recording of the conversation when a state in which the partner identification information is not mutually acquired has continued for a predetermined time.

The conversation management system according to claim 1, wherein the recording start means also records the conversation when the partner identification information of three or more general users is mutually acquired, and
The recording end means ends the recording of the conversation when a state in which the partner identification information is not mutually acquired by any of the three or more users whose conversation is being recorded has continued for the predetermined time.

The utterance content management server according to claim 1, further comprising conversation storage means for storing the conversation, once the predetermined time has elapsed, with the point at which the partner identification information ceased to be mutually acquired as its end.

The conversation management system according to any one of claims 1 to 3, wherein the management server further includes browsing permission means for allowing the general user identified as the conversation partner to browse the recorded conversation on the wearable terminal.

The conversation management system according to claim 4, wherein the utterance acquisition means acquires a search key spoken by the general user who uses the terminal, and
The browsing permission means searches the conversation with the acquired search key and allows the result to be browsed.

The conversation management system according to any one of claims 1 to 5, wherein at least one of the wearable terminal and the management server further includes voice recognition means for converting the acquired utterances into text data by voice recognition, and
The recording start means records the conversation as the text data.

The conversation management system according to any one of claims 1 to 6, wherein the wearable terminal further includes image acquisition means for acquiring a face image of the general user of the conversation partner as the partner identification information.

The conversation management system according to claim 1, wherein the wearable terminal acquires the partner identification information from the wearable terminal of the general user who is a conversation partner.

A management server of the conversation management system according to any one of claims 1 to 8, the management server comprising:
Recording start means for starting recording of the conversation of a plurality of general users when the partner identification information is mutually acquired; and
Recording end means for ending the recording of the conversation when a state in which the partner identification information is not mutually acquired has continued for a predetermined time.

The management server according to claim 9, further comprising user identification means for identifying the general user of the conversation partner from the partner identification information for identifying the general user of the conversation partner acquired by the wearable terminal.