JP4183645B2 - Conversation leader discriminating apparatus and conversation leader discriminating method

Info

Publication number: JP4183645B2
Application number: JP2004084420A
Authority: JP
Inventors: 真弓坊農; 紀子鈴木; 恭弘片桐
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2004-03-23
Filing date: 2004-03-23
Publication date: 2008-11-19
Anticipated expiration: 2024-03-23
Also published as: JP2005275536A

Description

The present invention relates to a conversation leader discriminating apparatus and a conversation leader discriminating method, and more particularly to, for example, an apparatus and a method for discriminating the leader of a conversation between two or more persons present in a certain place.

No prior art existed for this type of conversation leader discriminating apparatus. An example of similar prior art is disclosed in Patent Document 1. According to Patent Document 1, a signal transmitted from a transmission source carried by a visitor is detected at each exhibition hall, whereby the visitor's current position and visit history are identified. The visitor's interests are estimated from the identified position data and history data, and the exhibition information provided to the visitor is created based on the estimated interests.
Patent Document 1: JP-A-11-96230

In this prior art, the leader of the conversation (conversation leader) at each exhibition hall, that is, the person who introduces (explains) the exhibits and answers questions, is fixed. The conversation state in the place (exhibition hall) where the conversation leader is present can therefore be determined from the conversation leader's voice. For example, when the conversation leader speaks frequently, it can be determined that a one-way conversation, such as an introduction (explanation) of an exhibit, is in progress. When the conversation leader's voice is interrupted, a silent period of some length (3 to 10 seconds) is detected, and the conversation leader's voice is then detected again, it can be determined that a two-way conversation, such as a question-and-answer exchange, is taking place between the conversation leader and a visitor. Furthermore, when the conversation leader's voice is absent for a long time (10 minutes or more), it can be determined that no conversation is taking place in that place.

However, when the conversation leader changes over time, as in a meeting (including informal gatherings, the so-called "idobata kaigi"), the conversation leader naturally cannot be identified in advance, and the conversation state in the meeting place therefore could not be determined easily.

Therefore, a main object of the present invention is to provide a novel conversation leader discriminating apparatus and a novel conversation leader discriminating method.

Another object of the present invention is to provide a conversation leader discriminating apparatus and a conversation leader discriminating method capable of accurately discriminating a conversation leader.

Claim 1 is a conversation leader discriminating apparatus for discriminating the leader of a conversation between two or more persons present in a certain place, comprising: a plurality of microphones for collecting the voice of each of the two or more persons; personal recognition means for individually recognizing the persons present in the place; calculation means for calculating, based on the outputs of the microphones, the utterance amount in a certain time period of each person recognized by the personal recognition means; extraction means for extracting, from the utterance amounts calculated for the persons by the calculation means, the persons whose utterance amount exceeds a predetermined value; judging means for judging, when a plurality of persons are extracted by the extraction means, whether or not the utterance amounts vary, based on the differences between the utterance amounts of the extracted persons; and leader discriminating means for discriminating that no conversation leader exists when the judging means judges that there is no variation, and for discriminating the person with the largest utterance amount as the conversation leader when the judging means judges that there is variation.

In the invention of claim 1, the conversation leader discriminating apparatus discriminates the leader of a conversation between two or more persons present in a certain place (for example, a conference room or an exhibition hall). A plurality of microphones are provided for collecting the voice of each person. The personal recognition means individually recognizes the persons present in the place. The calculation means calculates, based on the outputs of the microphones, the utterance amount in a certain time period of each person recognized by the personal recognition means. Specifically, when a plurality of persons are present in a certain place, the utterance amount of each person in the same time period is calculated. The extraction means extracts the persons whose calculated utterance amount exceeds a predetermined value. When a plurality of persons are extracted by the extraction means, the judging means judges whether or not the utterance amounts vary, based on the differences between the utterance amounts of the extracted persons. The leader discriminating means then discriminates that no conversation leader exists when the judging means judges that there is no variation, and discriminates the person with the largest utterance amount as the conversation leader when the judging means judges that there is variation.

According to the invention of claim 1, the utterance amount in a certain time period is calculated for every person present in a certain place, and, when the utterance amounts vary, the person with the largest utterance amount among the persons whose utterance amount exceeds the predetermined value is discriminated as the conversation leader. The leader of the conversation in that place during that time period can therefore be discriminated accurately.

The invention of claim 2 is dependent on claim 1, and the leader discriminating means further discriminates that no conversation leader exists when the utterance amount of none of the persons exceeds the predetermined value.

In the invention of claim 2, the leader discriminating means discriminates that no conversation leader exists when the utterance amount of none of the persons exceeds the predetermined value (a certain threshold).

In the invention of claim 2, the conversation leader is not simply the person with the largest utterance amount; rather, the person with the largest utterance amount among those whose utterance amount exceeds a certain threshold is discriminated as the conversation leader, so the conversation leader can be discriminated more accurately. This is because it is considered inappropriate to pick a conversation leader from among persons whose utterance amounts are small relative to the whole of the time period.

The invention of claim 3 is a conversation leader discriminating method for discriminating the leader of a conversation between two or more persons present in a certain place, comprising: (a) collecting the voice of each of the two or more persons; (b) individually recognizing the persons present in the place; (c) calculating, based on the voices collected in step (a), the utterance amount in a certain time period of each person recognized in step (b); (d) extracting, from the utterance amounts calculated for the persons in step (c), the persons whose utterance amount exceeds a predetermined value; (e) judging, when a plurality of persons are extracted in step (d), whether or not the utterance amounts vary, based on the differences between the utterance amounts of the extracted persons; and (f) discriminating that no conversation leader exists when it is judged in step (e) that there is no variation, and discriminating the person with the largest utterance amount as the conversation leader when it is judged that there is variation.

In the third invention as well, as in the first invention, the leader of the conversation in a certain place during a certain time period can be discriminated accurately.

According to the present invention, the utterance amount in a certain time period is calculated for every person present in a certain place, and the person with the largest utterance amount is determined to be the conversation leader, so the conversation leader in that time period can be discriminated accurately.

The above object, other objects, features, and advantages of the present invention will become more apparent from the following detailed description of the embodiment given with reference to the drawings.

Referring to FIG. 1, a conversation discriminating apparatus 10 of this embodiment includes a computer 12 such as a PC or a workstation. A plurality of wearable sensors (hereinafter simply referred to as "sensors") 14, a plurality of tag readers 16, and a database 18 are connected to the computer 12. The sensors 14 are connected to the computer 12 by wireless communication, either directly or indirectly via a network. Each sensor 14 is mainly a sensor for detecting voice and, as described later, includes a microphone 152 such as a vocal cord vibration microphone. The microphone 152 is worn on the user's throat and collects the user's voice. A unidirectional microphone can also be used as the microphone 152.

The tag reader 16 detects the unique identification information transmitted by a tag 20 worn by a user and inputs the detected identification information to the computer 12. The tag 20 is a frequency tag or an IR tag, and a tag reader 16 matching the type of tag used is provided. As described later, each user wears one sensor 14 and one tag 20, and that sensor 14 and that tag 20 are associated with each other one-to-one (in a fixed manner). As described later, the database 18 manages the utterance information input from the sensors 14 and the sensor information input from the tag readers 16, and also manages (stores) the information on the conversation leader determined by the computer 12 (conversation leader information) (see FIG. 6).

An output device 22 is connected to the conversation state discriminating apparatus 10 as necessary. Strictly speaking, the output device 22 is connected to the computer 12. The output device 22 is a display such as a CRT or an LCD.

FIG. 2 is a block diagram showing a specific configuration of the sensor 14 shown in FIG. 1. Referring to FIG. 2, the sensor 14 includes a housing 140, and a CPU 142 is provided inside the housing 140. An A/D converter 144, a memory 146, a wireless LAN 148, and a clock circuit 150 are connected to the CPU 142. A microphone (vocal cord vibration microphone) 152 disposed outside the housing 140 is connected to the A/D converter 144.

A semiconductor memory can be used as the memory 146, and a disk recording medium such as a hard disk, MD, MO, CD, or DVD can also be used.

The sensor 14 configured in this way is worn by a user. The audio signal input from the microphone 152 is converted to digital form by the A/D converter 144, and the resulting audio data is supplied to the CPU 142. By analyzing the audio data, the CPU 142 identifies the utterance start time and the utterance end time of each section in which the user is actually speaking (utterance section), based on the current time indicated by the clock circuit 150, and stores them in a table 146t (see FIG. 3) formed in the memory 146. The CPU 142 inputs the table 146t stored in the memory 146 to the computer 12 via the wireless LAN 148 at predetermined intervals (for example, every 10 minutes).

In the table 146t shown in FIG. 3, "15h02m50s", for example, means 15:02:50 (2 minutes 50 seconds past 15 o'clock). The same notation applies hereinafter.

As described above, the sensor 14 and the tag 20 worn by a user are associated with each other in a fixed manner. The sensor 14 therefore inputs the table 146t to the computer 12 with the identification information of the corresponding tag 20 (the "person No." described later) added to it.

The discrimination between utterance sections and silent sections of the audio data is now described. In this embodiment, a portion whose level (power) is greater than a predetermined threshold (50 dB in this embodiment) is judged to be an utterance, and a portion whose power is 50 dB or less is judged to be silence. However, in order to judge accurately whether an utterance has started, the utterance is judged to have started only when power greater than 50 dB has been detected for a predetermined time t1 (50 milliseconds in this embodiment). When a silent section is detected, the start of that silent section is taken as the utterance end point (utterance end time). However, so that a pause caused by, for example, the user taking a breath is not judged to be a silent section, a silent section is recognized only when the power remains at 50 dB or less for a predetermined time t2 (for example, 300 milliseconds) or longer.

Specifically, the CPU 142 provided in the sensor 14 executes the utterance detection process shown in FIG. 4. In parallel with this utterance detection process, the CPU 142 also executes a process of recording the audio data.

As shown in FIG. 4, when the CPU 142 starts the utterance detection process, it judges in step S1 whether an utterance has started. When sound with a level of 50 dB or more continues for 50 milliseconds, the utterance is regarded as having started and the result of step S1 is "YES". In step S3, the time 50 milliseconds before the current time indicated by the clock circuit 150 is stored in the table 146t as the utterance start time.

In step S5, it is judged whether the utterance has ended. When the period in which no sound with a level of 50 dB or more is input continues for 300 milliseconds or longer, the utterance is regarded as having ended and the process proceeds to step S7. In step S7, the time 300 milliseconds before the current time indicated by the clock circuit 150 is stored in the table 146t as the utterance end time. When step S7 is completed, the process returns to step S1.
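The utterance detection loop of steps S1 to S7 can be sketched roughly as follows. This is only an illustrative sketch, not the embodiment's actual firmware: it assumes the audio has already been reduced to one level value in dB per fixed-length frame, and the function name `detect_utterances`, the 10 ms frame length, and the millisecond time base are all assumptions introduced here. It uses the "50 dB or more" comparison given for steps S1 and S5.

```python
from typing import List, Tuple


def detect_utterances(
    levels_db: List[float],
    frame_ms: int = 10,          # assumed frame length; not stated in the text
    threshold_db: float = 50.0,  # level threshold from the embodiment
    onset_ms: int = 50,          # t1: sound must persist this long (step S1)
    silence_ms: int = 300,       # t2: silence must persist this long (step S5)
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) pairs for each completed utterance section."""
    sections: List[Tuple[int, int]] = []
    speaking = False
    loud_run = 0    # consecutive ms at or above the threshold while idle
    quiet_run = 0   # consecutive ms below the threshold while speaking
    start = 0
    for i, level in enumerate(levels_db):
        now = (i + 1) * frame_ms  # "current time" at the end of this frame
        loud = level >= threshold_db
        if not speaking:
            loud_run = loud_run + frame_ms if loud else 0
            if loud_run >= onset_ms:          # S1: YES
                speaking = True
                start = now - onset_ms        # S3: go back by t1
                quiet_run = 0
        else:
            quiet_run = 0 if loud else quiet_run + frame_ms
            if quiet_run >= silence_ms:       # S5: YES
                sections.append((start, now - silence_ms))  # S7: go back by t2
                speaking = False
                loud_run = 0
    # An utterance still in progress at the end of the input is not emitted.
    return sections


# 100 ms quiet, 200 ms of sound, 400 ms quiet
print(detect_utterances([40.0] * 10 + [60.0] * 20 + [40.0] * 40))
```

A 200 ms dip inside an utterance is shorter than t2, so in this sketch it does not split the section, which corresponds to the breath-pause handling described above.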

As can be seen from the table 146t of FIG. 3, fractions of a second are omitted in this embodiment for simplicity.

The conversation state discriminating apparatus 10 configured as shown in FIG. 1 is installed, for example, in a building that contains conference rooms and exhibition halls. It need not be limited to such a building, however, and can be applied to any place where conversation can occur. As shown in FIG. 5, a tag reader 16 is placed at each location such as a conference room or an exhibition hall (here, locations A, B, and C) and recognizes the persons (users) present at that location. That is, as described above, each user wears a sensor 14 and a tag 20, and the tag reader 16 detects the identification information (hereinafter sometimes referred to as the "person No.") of the tags 20 worn by the users present within its own detection range (the dotted frame in FIG. 5). The tag reader 16 executes the detection process for tags 20 at fixed intervals (every 1 second in this embodiment), adds the identification information assigned to itself (hereinafter sometimes referred to as "sensor information") to each detected person No., and inputs the result to the computer 12. In this embodiment, for ease of understanding, sensor information A is assigned to the tag reader 16 placed at location A, sensor information B to the tag reader 16 placed at location B, and sensor information C to the tag reader 16 placed at location C. In addition, the user's voice is detected by the sensor 14, and the table 146t created as described above is input to the computer 12.

Although a detailed description is omitted, the same applies to locations B and C. In FIG. 5, the computer 12, the sensors 14, and the database 18 are omitted for simplicity. Furthermore, for convenience of drawing, the users present at locations B and C are also omitted from FIG. 5.

In this embodiment, tag readers 16 are placed at three locations (locations A, B, and C) so that the users present at each location and their utterance information are detected. However, it suffices for there to be at least one such location, and there may also be four or more.

FIG. 6 is an illustrative view showing the contents of the database 18 shown in FIG. 1. The database 18 includes a stay information storage area 30, an utterance information storage area 32, and a conversation leader information storage area 34. The stay information storage area 30 stores a table 30t of stay information, which is shown in FIG. 7. The table 30t stores the sensor information, the entry time, and the exit time in association with the identification information (person No.) of each tag 20. When a person No. with sensor information added is input from a tag reader 16, the computer 12 acquires the time at that moment from a clock circuit 12a and registers (adds) it in the table 30t as the entry time, together with the person No. and the sensor information. Accordingly, as can be seen from FIG. 7, the person Nos. and sensor information are registered in the table 30t in order of entry time, earliest first. When a person No. and sensor information are no longer input from the tag reader 16, the computer 12 acquires the time at that moment from the clock circuit 12a and writes it in the corresponding entry as the exit time. By referring to this stay information table 30t, the users (person Nos.) who were present (staying) at each of locations A, B, and C at a certain time (time period) can be identified (recognized).

The utterance information storage area 32 stores a table 32t of utterance information. As shown in FIG. 8, the table 32t stores, in association with each person No., the utterance start times that define the utterance sections and the corresponding utterance end times. The table 32t holds the tables 146t (FIG. 3) input from the wearable sensors 14, stored per user (person No.). That is, the computer 12 creates the table 32t based on the tables 146t, each with a person No. added, that are input from the sensors 14.

The conversation leader information storage area 34 stores a table 34t of conversation leader information. The table 34t is created by the conversation leader discriminating process described in detail later (see FIGS. 11 and 12). Specifically, as shown in FIG. 9, the person No. of the conversation leader determined for each time period is stored in association with each location (sensor information), together with information indicating whether the conversation leader has continued (been maintained) from the immediately preceding time period or has changed (conversation leader maintenance/change information). For example, for sensor information A (location A), the conversation leader from 15:00:01 to 15:10:00 is the user with person No. 8, and the conversation leader in the next period, from 15:10:01 to 15:20:00, is the user with person No. 9. It can also be seen that the conversation leader changed when the user with person No. 9 was determined to be the conversation leader. In the following period, from 15:20:01 to 15:30:00, the conversation leader is again the user with person No. 9, so the conversation leader has been maintained from the immediately preceding time period. As described in detail later, and as the determination results for sensor information B show, it may also be determined for some time periods (here, 15:20:01 to 15:30:00) that no conversation leader exists (no conversation leader).

A blank in the conversation leader maintenance/change information column means that neither maintenance nor change of the conversation leader applies.

The conversation leader is determined at each location (locations A, B, and C) for each time period (for example, 10 minutes = 600 seconds). Since the method of determining the conversation leader is the same at every location, only location A is described here, and the description for locations B and C is omitted.

When the computer 12 has set the time period for which the conversation leader is to be determined, it extracts the users who were present at location A during that time period. That is, it extracts the person Nos. for which sensor information A is recorded in that time period. Specifically, the person Nos. are extracted by referring to the table 30t shown in FIG. 7. For example, when the time period is set to 15:00:01 to 15:10:00, the person Nos. associated with sensor information A in that time period are extracted. With the table 30t shown in FIG. 7, person Nos. 3, 5, 8, and 9 are extracted.
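The extraction of the persons present at a location during a time period can be sketched as an interval-overlap query over the stay information. The sketch below is hypothetical: the `Stay` record, the `persons_present` function, and the rule that any overlap with the period counts as "present" are assumptions made for illustration, and the sample rows are invented (FIG. 7 is not reproduced here), chosen only so that the result matches the person Nos. named in the text.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Stay(NamedTuple):
    person_no: int
    sensor_info: str             # "A", "B" or "C"
    entered: datetime
    exited: Optional[datetime]   # None while the person is still present


def t(h: int, m: int, s: int) -> datetime:
    """Helper: a time of day on an arbitrary, fixed date."""
    return datetime(2004, 3, 23, h, m, s)


def persons_present(stays: List[Stay], sensor_info: str,
                    period_start: datetime, period_end: datetime) -> List[int]:
    """Person Nos. whose stay at the given location overlaps the period."""
    present = set()
    for stay in stays:
        if stay.sensor_info != sensor_info:
            continue
        not_yet_left = stay.exited is None or stay.exited >= period_start
        if stay.entered <= period_end and not_yet_left:
            present.add(stay.person_no)
    return sorted(present)


# Invented sample rows, not the contents of FIG. 7.
stays = [
    Stay(2, "A", t(14, 0, 0), t(14, 30, 0)),
    Stay(1, "B", t(14, 40, 0), None),
    Stay(3, "A", t(14, 50, 0), None),
    Stay(5, "A", t(14, 55, 0), t(15, 30, 0)),
    Stay(8, "A", t(15, 0, 0), None),
    Stay(9, "A", t(15, 5, 0), None),
]
print(persons_present(stays, "A", t(15, 0, 1), t(15, 10, 0)))
```

Whether a person who was present for only part of the period should count is not specified in the text; the overlap rule above is one possible reading.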

Next, the computer 12 calculates, for the user corresponding to each extracted person No., the utterance production rate in that time period. The utterance production rate is the ratio of the utterance amount (total utterance section) to the set time period (here, 600 seconds) and is calculated according to Equation 1.

[Equation 1]
Utterance production rate (%) = total utterance section ÷ time period × 100
Here, the utterance amount, that is, the total utterance section, is the sum of the utterance sections contained in the time period and is obtained by referring to the utterance information table 32t shown in FIG. 8. Referring to FIG. 8 and taking the user with person No. 1 as an example, during the period from 15:00:01 to 15:10:00 this user speaks for 80 seconds, from 15:02:50 to 15:04:10, and for 30 seconds, from 15:06:10 to 15:06:40. The total utterance section in this case is therefore 110 seconds, and the utterance production rate of this user, calculated according to Equation 1, is 110 ÷ 600 × 100, or about 18.3%.
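Equation 1 can be checked with a small sketch such as the following. The function name `utterance_production_rate`, the `hms` helper, the use of seconds since midnight, and the clipping of sections that straddle the period boundary are assumptions introduced for illustration. The period is modeled as the 600 seconds from 15:00:00 to 15:10:00 so that its length matches the 600 seconds used in the text.

```python
from typing import List, Tuple


def hms(h: int, m: int, s: int) -> int:
    """Seconds since midnight."""
    return h * 3600 + m * 60 + s


def utterance_production_rate(sections: List[Tuple[int, int]],
                              period_start: int, period_end: int) -> float:
    """Equation 1: total utterance section / time period * 100 (percent)."""
    period_len = period_end - period_start
    total = 0
    for start, end in sections:
        # Count only the part of each utterance section inside the period.
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            total += overlap
    return 100.0 * total / period_len


# Utterance sections of person No. 1 as given in the text.
person1 = [
    (hms(15, 2, 50), hms(15, 4, 10)),   # 80 seconds
    (hms(15, 6, 10), hms(15, 6, 40)),   # 30 seconds
]
rate = utterance_production_rate(person1, hms(15, 0, 0), hms(15, 10, 0))
print(f"{rate:.1f}%")
```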

The result of calculating the utterance production rates in this way is shown, for example, in FIG. 10. That is, for the extracted person Nos. 3, 5, 8, and 9 mentioned above, the utterance production rates in the time period are 20%, 10%, 80%, and 60%, respectively.

The conversation leader is determined based on these utterance production rates. In the simplest approach, the user with the largest utterance production rate can be determined to be the conversation leader. In the example shown in FIG. 10, the largest utterance production rate is 80%, so the user with person No. 8 can be determined to be the conversation leader in the time period. However, in order to determine the conversation leader accurately, the following two conditions are set. Condition (1): the utterance production rate is equal to or higher than a predetermined rate (for example, 60%). Condition (2): when two or more utterance production rates satisfy condition (1), those utterance production rates vary; specifically, the difference between the utterance production rates is equal to or greater than a certain value (for example, 10%).

Condition (1) is set because, if the highest utterance production rate is lower than the predetermined rate, the user spoke only briefly, and it is not appropriate to determine that user to be the conversation leader.

Condition (2) is set because, if two or more utterance production rates exceed the predetermined rate, two or more users are candidates for the conversation leader, and unless their rates vary, it cannot be decided which of them should be determined to be the conversation leader.

Under these conditions, in the example shown in FIG. 10, persons No. 8 and 9 satisfy condition (1), and their utterance production rates satisfy condition (2). Therefore, the user of person No. 8, who has the highest utterance production rate, is determined to be the conversation leader. When the conversation leader is determined, the person number of the conversation leader is written into the table 34t in association with the sensor information and the time zone, as shown in FIG. 9. It is then determined whether the person number of the conversation leader in the immediately preceding time zone, that is, the person number indicated by the previous determination result, matches the person number indicated by the current determination result. If they match, it is determined that the conversation leader has continued, and "maintain" is written in the conversation leader maintenance/change information column. If they do not match, it is determined that the conversation leader has changed, and "change" is written.

However, if condition (1) and condition (2) are not satisfied, it is determined that there is no conversation leader in the place during the time zone. In this case, as shown in FIG. 9, "none" is written in the conversation leader person No. column in association with the sensor information and the time zone. Nothing is written in the conversation leader maintenance/change information column at this time.
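The two conditions can be illustrated with a minimal sketch. It assumes that condition (1) is tested with "equal to or higher than", as stated where the conditions are defined, and that condition (2) requires every pair of candidate rates to differ by at least the gap; the text does not spell out the case of three or more candidates, so that reading is an assumption. The function name, parameter names, and dictionary layout are hypothetical.

```python
def determine_leader(rates, min_rate=60.0, min_gap=10.0):
    """Return the person number of the conversation leader, or None.

    rates maps a person number to an utterance production rate in percent.
    """
    # Condition (1): keep only rates at or above the predetermined rate.
    candidates = {p: r for p, r in rates.items() if r >= min_rate}
    if not candidates:
        return None  # corresponds to writing "none" in the table

    # Condition (2): with two or more candidates, the rates must vary.
    # Checking gaps between neighbours in sorted order is enough, because
    # any other pair is at least as far apart as some neighbouring pair.
    if len(candidates) >= 2:
        ordered = sorted(candidates.values(), reverse=True)
        if any(a - b < min_gap for a, b in zip(ordered, ordered[1:])):
            return None

    # The candidate with the highest rate is the conversation leader.
    return max(candidates, key=candidates.get)


# Rates from the example of FIG. 10 (persons No. 3, 5, 8, and 9).
fig10 = {3: 20.0, 5: 10.0, 8: 80.0, 9: 60.0}
print(determine_leader(fig10))
```

With the FIG. 10 values, persons No. 8 and 9 pass condition (1), their rates differ by 20 points, and person No. 8 is returned. Two candidates at, say, 65% and 62% would instead yield no leader.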

Specifically, the computer 12 shown in FIG. 1 executes the conversation leader determination process shown in FIGS. 11 and 12. In this embodiment, as described above, the conversation leader is determined for each time zone at each place (A, B, C), so the conversation leader determination process described below is executed for each place.

When the computer 12 starts the conversation leader determination process, it sets, in step S11, a time zone of a predetermined length (in this embodiment, 10 minutes = 600 seconds). Here, for example, the time zone is set from 15:00:01 to 15:10:00 ("15h00m01s/15h10m00s"), as shown in FIG. 9. In the subsequent step S13, the determination count N is initialized (N = 0). The determination count N is the number of times the conversation leader has been determined. Although omitted from FIG. 1, it is counted by an internal counter of the computer 12.

Subsequently, in step S15, the person numbers for which sensor information corresponding to the place where the conversation leader is to be determined is recorded in the time zone are extracted. Next, in step S17, the utterance production rate of each person in the time zone is calculated according to Equation 1. Then, in step S19, it is determined whether there is a user (person No.) whose utterance production rate exceeds the predetermined rate (here, 60%).

If "NO" in step S19, that is, if no user exceeds the predetermined rate, it is determined that no utterance production rate satisfies condition (1). In step S21, "no conversation leader" is written into the table 34t, and the process proceeds to step S31 shown in FIG. 12. On the other hand, if "YES" in step S19, that is, if there is a user (person No.) exceeding the predetermined rate, it is determined that there is an utterance production rate that satisfies condition (1), and in step S23 that person No. is extracted as an analysis subject. Then, in step S25, it is determined whether there is variation among the extracted utterance production rates.

However, if only one utterance production rate is extracted in step S23, the determination in step S25 is not executed, and the process proceeds directly to step S27.

If "NO" in step S25, that is, if it is determined that there is no variation among the extracted utterance production rates and condition (2) is not satisfied, the process proceeds to step S21. On the other hand, if "YES" in step S25, that is, if it is determined that there is variation among the extracted utterance production rates and condition (2) is satisfied, then in step S27 the person No. with the highest utterance production rate among the extracted analysis subjects is determined to be the conversation leader. Then, in step S29 shown in FIG. 12, the person No. of the determined conversation leader is written in the corresponding column of the table 34t, and the process proceeds to step S31.

In step S31, the determination count N is incremented by 1, that is, the internal counter is incremented, and in step S33 it is determined whether the determination count N is 2 or more. If "NO" in step S33, that is, if the determination count N is 1 and the conversation leader has been determined for the first time, the process proceeds directly to step S41. On the other hand, if "YES" in step S33, that is, if the determination count N is 2 or more, it is determined that this is the second or a later determination of the conversation leader, and in step S35 it is determined whether the conversation leader determined this time is the same person as the conversation leader determined last time. That is, the table 34t is referred to in order to determine whether the person No. of the previous determination result matches the person No. of the current determination result.

If "YES" in step S35, that is, if the conversation leader determined last time and the conversation leader determined this time are the same person, "maintain" is written in the conversation leader maintenance/change information column of the table 34t in step S37, and the process proceeds to step S41. On the other hand, if "NO" in step S35, that is, if the conversation leader determined last time and the conversation leader determined this time are different persons, "change" is written in the conversation leader maintenance/change information column of the table 34t in step S39, and the process proceeds to step S41.

In step S41, the next time zone is set, and the process returns to step S17 shown in FIG. 11. For example, in step S41, 15:10:01 to 15:20:00 is set as the next time zone.

By repeating this process, the conversation leader for each time zone in a certain place (place A, B, or C) can be determined.
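The bookkeeping of steps S31 to S39 can be sketched as follows for one place. The sketch takes the per-time-zone determination results as input and produces the two columns of the table 34t. Two points are assumptions: the maintenance/change column is left blank whenever the current time zone has no leader, following the description of the "none" case, and a leader that follows a time zone with no leader is recorded as "change". The function name and the row layout are hypothetical.

```python
def annotate_zones(leaders):
    """Build (leader, maintenance/change) rows for consecutive time zones.

    leaders is a list with one entry per time zone: a person number,
    or None when no conversation leader was determined.
    """
    rows = []
    previous = None
    # n plays the role of the determination count N (steps S13 and S31).
    for n, leader in enumerate(leaders, start=1):
        if leader is None:
            status = ""  # nothing is written when there is no leader
        elif n < 2:
            status = ""  # first determination: nothing to compare (S33 "NO")
        elif leader == previous:
            status = "maintain"  # step S37
        else:
            status = "change"  # step S39
        rows.append((leader if leader is not None else "none", status))
        previous = leader
    return rows


# Hypothetical results for four consecutive time zones at one place.
rows = annotate_zones([8, 8, None, 1])
for row in rows:
    print(row)
```

For the hypothetical input above, the second time zone is marked "maintain", the third records "none" with a blank status, and the fourth is marked "change".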

In this conversation leader determination process, it is assumed that a plurality of users are present in the place where the conversation leader is determined, so that once the person numbers are extracted in step S15, the process proceeds directly to step S17. However, if one or fewer person numbers are extracted in step S15, that is, if there is one user or none, no conversation takes place, and a conversation leader cannot be determined. Therefore, a step (process) for determining whether there are two or more person numbers may be provided between step S15 and step S17, so that the process proceeds to step S17 when there are two or more person numbers and to step S23 when there is one or fewer.

When the display device 22 is connected to the computer 12 as shown in FIG. 1, the determination result of the conversation leader can be displayed on the display device 22, for example, as shown in FIG. 13. FIG. 13 shows that between 15:00 (strictly, 15:00:01) and 15:10, the conversation leader at place A was the user of person No. 8, the conversation leader at place B was the user of person No. 1, and there was no conversation leader at place C.

According to this embodiment, the voice signals of the users present in a certain place are detected, the utterance production rate of each user in a certain time zone is calculated, and the user with the highest utterance production rate among those exceeding the predetermined rate is determined to be the conversation leader. The conversation leader can therefore be determined accurately.

In this embodiment, the utterance production rate in a certain time zone is calculated, and the user with the highest utterance production rate among those satisfying condition (1) and condition (2) is determined to be the conversation leader. However, the conversation leader may instead be determined based on the utterance amount (utterance time) in a certain time zone. In that case, the predetermined rate (utterance production rate) in condition (1) is replaced with a predetermined amount (utterance amount), and the variation in condition (2), that is, the value of the difference between utterance production rates, is replaced with a value of the difference between utterance amounts.

In this embodiment, the users' voice signals are not recorded because only the conversation leader is determined. If the voice signals are also recorded, however, then after the conversation leader is determined, the conversation state of the place where that conversation leader was present during the time zone can be determined based on the voice signal of the determined conversation leader. An apparatus and method for determining such a conversation state are described in detail in Japanese Patent Application No. 2004-3976, filed earlier by the present applicant. Because they are not an essential part of the present invention, their description is omitted here.

FIG. 1 is an illustrative view showing one example of a configuration of a conversation leader discriminating apparatus of the present invention.
FIG. 2 is an illustrative view showing a configuration of the wearable sensor shown in FIG. 1.
FIG. 3 is an illustrative view showing a table formed in the memory of the wearable sensor shown in FIG. 2.
FIG. 4 is a flowchart showing the utterance detection process of the CPU of the wearable sensor.
FIG. 5 is an illustrative view for explaining an application example of the conversation leader discriminating apparatus shown in the FIG. 1 embodiment.
FIG. 6 is an illustrative view showing the contents of the database shown in the FIG. 1 embodiment.
FIG. 7 is an illustrative view showing one example of a table of stay information stored in the database shown in the FIG. 1 embodiment.
FIG. 8 is an illustrative view showing one example of a table of utterance information stored in the database shown in the FIG. 1 embodiment.
FIG. 9 is an illustrative view showing one example of a table of conversation leader information stored in the database shown in the FIG. 1 embodiment.
FIG. 10 is an illustrative view for explaining a method of determining a conversation leader from the calculated utterance production rates.
FIG. 11 is a flowchart showing a part of the conversation leader determination process of the computer shown in the FIG. 1 embodiment.
FIG. 12 is a flowchart continuing the conversation leader determination process shown in FIG. 11.
FIG. 13 is an illustrative view showing a display example of the output device shown in the FIG. 1 embodiment.

Explanation of symbols

10 ... Conversation leader discriminating apparatus
12 ... Computer
14 ... Wearable sensor
16 ... Tag reader
18 ... Database
20 ... Tag
22 ... Output device
142 ... CPU
146 ... Memory
152 ... Microphone

Claims

A conversation leader discriminating apparatus for discriminating a leader of a conversation between two or more persons existing in a certain place, comprising:
a plurality of microphones for collecting the voice of each of the two or more persons;
individual recognition means for individually recognizing each person existing in the place;
calculation means for calculating, based on the outputs of the microphones, the utterance amount in a certain time zone of each person recognized by the individual recognition means;
extraction means for extracting, from the utterance amounts calculated for each person by the calculation means, each person whose utterance amount exceeds a predetermined value;
determining means for determining, when a plurality of persons are extracted by the extraction means, whether or not there is variation in the utterance amounts based on the differences between the utterance amounts of the extracted persons; and
leader discriminating means for determining that there is no conversation leader when the determining means determines that there is no variation, and for discriminating the person whose utterance amount is the largest as the conversation leader when the determining means determines that there is variation.

The conversation leader discriminating apparatus according to claim 1, wherein the leader discriminating means further determines that there is no conversation leader when the utterance amount of no person exceeds the predetermined value.

A conversation leader discriminating method for discriminating a leader of a conversation between two or more persons existing in a certain place, comprising the steps of:
(a) collecting the voice of each of the two or more persons;
(b) individually recognizing each person existing in the place;
(c) calculating, based on the voice collected in step (a), the utterance amount in a certain time zone of each person recognized in step (b);
(d) extracting, from the utterance amounts calculated for each person in step (c), each person whose utterance amount exceeds a predetermined value;
(e) determining, when a plurality of persons are extracted in step (d), whether or not there is variation in the utterance amounts based on the differences between the utterance amounts of the extracted persons; and
(f) determining that there is no conversation leader when it is determined in step (e) that there is no variation, and discriminating the person whose utterance amount is the largest as the conversation leader when it is determined that there is variation.