JPH11259501A

JPH11259501A - Speech structure detector/display

Info

Publication number: JPH11259501A
Application number: JP10059762A
Authority: JP
Inventors: Takashi Osawa; 隆大澤; Hiroshi Katsurabayashi; 浩桂林; Eriko Tamaru; 恵理子田丸
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1998-03-11
Filing date: 1998-03-11
Publication date: 1999-09-24
Anticipated expiration: 2018-03-11
Also published as: JP3879793B2

Abstract

PROBLEM TO BE SOLVED: To provide a speech structure detection/display device capable of displaying the relation between speeches having high interaction with each other. SOLUTION: A speech section detection means 11 detects the speech section of each speaker from voice signals from voice input means, and a posture detection means 6 detects the posture of each speaker. A voice/posture recording means 13 records voice information from the voice input means and also records, in correspondence with each other, the information of the speech section of each speaker detected by the speech section detection means and the posture of each speaker detected by the posture detection means. A speech object person specifying means 21 specifies the person to whom each speech is made, based on the information recorded in the voice/posture recording means. A speech flow detection means 22 detects the flow of speech in a prescribed time section based on the recorded information of the voice/posture recording means and the result of the speech object person specifying means. Display information corresponding to the detected result of the speech flow detection means 22 is displayed.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an utterance structure detection and display device that detects and displays the structure of the utterances made by participants in a conference.

[0002]

[Description of the Related Art] Taken in a broad sense, techniques for structuring utterances can be divided into those that structure utterances by linking them to other information and those that structure utterances from the utterance information itself.

[0003] Conventional examples of the former are the conference systems described in Japanese Unexamined Patent Publication Nos. Hei 6-343146, Hei 7-226931, Hei 6-205151, Hei 6-176171, and Hei 7-182365, and in "Marquee: A Tool For Real-Time Video Logging" (CHI '94 Human Factors in Computing Systems).

[0004] These conference systems record multimedia information such as the audio and video of a conference while also recording input information from the participants, such as pen or keyboard input, together with its input time. The input time can later be used to reproduce the multimedia information related to that input. This does not structure pieces of utterance information relative to one another, but it does structure utterance information by associating it with user input information.

[0005] Some commercially available cassette tape recorders and MiniDisc recorders have a button for attaching a tag (track mark) while important information is being recorded, so that the important part is easier to find later. This kind of technique can also be regarded as related prior art.

[0006] On the other hand, one approach to the latter, that is, to structuring utterances from the utterance information itself, is speech recognition. Recognizing and understanding the natural human speech produced in settings such as conferences and lectures is extremely difficult at present. Therefore, keywords are detected with a technique such as word spotting, and the detection results are used to structure the utterances.

[0007] There is also a method of visualizing the state of the utterances so that a person can grasp it more easily. In this technique the machine does not structure the utterances itself; rather, it is positioned as a technique that assists a person in structuring them.

[0008] A typical example is the electronic conference apparatus described in Japanese Unexamined Patent Publication No. Hei 8-317365. This apparatus provides a graph display area whose horizontal axis shows the recorded amount of each utterance by each speaker and whose vertical axis shows the order of the conversation, making the state of the utterances easy to grasp visually. Hereinafter, a diagram that has the concept of a time axis and also shows the state of the utterances is called a speaker chart.

[0009] Similar speaker charts also appear in research papers. For example, "Speaker Segmentation for Browsing Recorded Audio," a paper by Donald G. Kimber et al. presented in "CHI '95 MOSAIC OF CREATIVITY," also describes speaker chart information.

[0010]

[Problems to be Solved by the Invention] In a conference, it would be convenient if the relationships between mutually related utterances could be shown on the speaker chart. For example, suppose conference participant A gives an opinion to, or puts a question to, another participant B, and participant B answers or rebuts it. If such an interactive exchange could be identified from the chart, it would show that some discussion took place there, and a person searching the recorded conference information could be expected to use it as a clue to find easily the part of the discussion to be played back.

[0011] In particular, a conference of three or more people contains both utterance structure sections with high interaction, as described above, and sections in which the speaker simply changes from one person to the next. In such cases, if the highly interactive utterance structure sections could be detected easily, access to the important parts of the conference information would be expected to become comparatively simple.

[0012] However, the conventional conference systems described above do not structure the relationships among multiple utterances. That is, the former type of conventional conference system can structure voice information by associating it with user input information, but it cannot extract the flow of utterances, including the relationships between them.

[0013] Furthermore, even if important words are picked out with a technique such as word spotting, detecting the flow of utterances described above is extremely difficult and cannot be achieved without speech recognition and understanding abilities comparable to those of a human.

[0014] Moreover, a conventional speaker chart shows the changes of speaker and the length of each utterance, but not how individual utterances relate to one another or how they flow. For example, a conventional speaker chart shows that one person spoke after another, but it cannot show whether this was a connected exchange, such as an answer to a question, or whether the speaker simply changed and a new thread of conversation began.

[0015] In view of the above, an object of the present invention is to provide an utterance structure detection and display device capable of displaying the relationships between highly interactive utterances.

[0016]

[Means for Solving the Problems] To solve the above problems, an utterance structure detection and display device according to the present invention comprises: voice input means for picking up the voices of speakers; utterance section detection means for detecting an utterance section for each speaker from the voice signal from the voice input means; posture detection means for detecting the posture of each speaker; voice/posture recording means for recording the voice information from the voice input means and for recording, in association with each other, the information on the utterance section of each speaker detected by the utterance section detection means and the posture of each speaker detected by the posture detection means; utterance target specifying means for specifying, based on the information recorded in the voice/posture recording means, to whom each utterance is directed; utterance flow detection means for detecting the flow of utterances in a predetermined time section based on the information recorded in the voice/posture recording means and the result of the utterance target specifying means; and display means for displaying display information corresponding to the detection result of the utterance flow detection means.

[0017]

[Operation] According to the present invention configured as described above, the utterance sections of the speakers are not merely arranged in chronological order: the utterance target specifying means also specifies, from the posture information of each speaker, to whom each utterance is directed. The utterance flow detection means then detects a flow of utterances that reflects this information, and the display means shows display information indicating that flow on the display screen.

[0018] As a result, portions where the interaction between utterances is high can be detected easily, and the user can infer how the discussion progressed simply by looking at the display information. The device is therefore very useful for finding required passages, such as the important parts of a conference.

[0019]

[Embodiments of the Invention] Embodiments of the utterance structure detection and display device according to the present invention are described below with reference to the drawings. In the embodiment described below, the present invention is applied to a conference information recording and reproducing apparatus for a face-to-face conference of three or more people. Before the embodiment itself is described, an outline of the face-to-face conference in this example is given with reference to FIG. 2.

[0020] Each conference participant 1 wears a gaze detection sensor 2 and a three-dimensional magnetic sensor 3 for detecting the position and direction of the head. The gaze detection sensor 2 used in this example is a sensor based on the corneal-scleral reflection method.

[0021] A microphone 4 is assigned to each conference participant 1 so that each participant's voice is recorded individually.

[0022] The information obtained from each participant's gaze detection sensor 2, three-dimensional magnetic sensor 3, and microphone 4 is input to a personal computer 5 for recording and reproducing conference information, where it is digitized and recorded. The main part of the utterance structure detection and display device of this embodiment is implemented as software on the personal computer 5.

[0023] FIG. 1 is a block diagram of the utterance structure detection and display device of this embodiment. The device comprises: a voice information processing unit 11 that processes the voice signal from each participant's microphone 4; a posture information processing unit 12 that processes posture information from a posture input device 6, which includes the gaze detection sensor 2 and the magnetic sensor 3; a voice/posture recording unit 13 that records the processed voice information and posture information; a user input unit 14 that accepts user input from conference participants and from persons searching the conference information; a control unit 15 that performs control based on the user input; a user input recording unit 16 that records the participants' user input; a chart creation unit 17 that creates, from the information in the voice/posture recording unit 13, a speaker chart that visually represents the state of the utterances; a display unit 18 that displays information such as the user input and the speaker chart; a reproduction unit 19 that reproduces the recorded voice; a voice output unit 20 that outputs the reproduced voice; an utterance target specifying unit 21 that specifies to whom each utterance was directed; and an utterance flow detection unit 22 that detects the flow of utterances.

[0024] In this example, the voice information processing unit 11 digitizes the voice information and detects the utterance sections of each conference participant. Digitization is performed by a so-called sound board connected to the personal computer. The utterance sections of each participant are detected by software on the personal computer; that is, the digitized voice information is processed in software to detect who spoke, and when, during the conference.

[0025] As described above, in this example the voice of each participant is picked up by an individual microphone 4, and the correspondence between microphones and participants is known in advance. This correspondence is set before the conference starts, by the participants or others, and is recorded in the voice/posture recording unit 13.

[0026] In this example, when the voice signal level from a microphone is at or above a predetermined level L1 and remains so for a predetermined time Δt1 or longer, the participant corresponding to that microphone is regarded as having started an utterance. When the voice signal level remains at or below a predetermined level L2 for a predetermined time Δt2 or longer, the utterance is regarded as having ended. Utterance sections are detected in this way.

[0027] FIG. 3 is a flowchart of the utterance section detection process applied to the voice signal from each microphone 4, and FIG. 4 is a conceptual diagram explaining the process. In this example, the process of FIG. 3 is started when a voice at or above level L1 is detected. In FIG. 4, L1 = L2, but levels L1 and L2 may of course differ.

[0028] As shown in the flowchart of FIG. 3, when a voice at or above level L1 is input from the microphone, the process proceeds to step 101 and monitors whether the voice stays at or above the threshold level L1 for the predetermined time Δt1 or longer. If it does not, the input is not regarded as an utterance and the utterance section detection routine ends.

[0029] As shown in FIG. 4, when the condition of step 101 is determined to be satisfied at time T1, the process proceeds to step 102, obtains the current time T1, sets the utterance start time ts to ts = T1 - Δt1, and sends this information to the voice/posture recording unit 13 to be recorded.

[0030] The process then proceeds to step 103 and monitors whether the voice signal level stays at or below the predetermined level L2 for the predetermined time Δt2 or longer. As shown in FIG. 4, when it is detected at time T2 that the voice has been below level L2 for the predetermined time Δt2 or longer, the process proceeds to step 104, sets the utterance end time te to te = T2 - Δt2, and sends this information to the voice/posture recording unit 13 to be recorded.
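The detection procedure of steps 101 to 104 can be sketched as a small state machine. The following Python sketch is only an illustration under stated assumptions, not the implementation of the embodiment: it assumes the signal level has already been sampled once per fixed-length frame, expresses Δt1 and Δt2 as frame counts (`n_on`, `n_off`), and reports times as frame indices. The function name and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def detect_utterance_sections(
    levels: Sequence[float],
    l1: float,
    l2: float,
    n_on: int,
    n_off: int,
) -> List[Tuple[int, int]]:
    """Return (ts, te) frame-index pairs for one microphone.

    levels : signal level per frame (assumed pre-computed)
    l1, l2 : start and end threshold levels (L1, L2 in the text)
    n_on   : Δt1 expressed in frames
    n_off  : Δt2 expressed in frames
    """
    sections: List[Tuple[int, int]] = []
    speaking = False
    run = 0      # length of the current run of qualifying frames
    ts = 0
    for k, level in enumerate(levels):
        now = k + 1  # time at the end of frame k
        if not speaking:
            # Step 101: has the level stayed at or above L1 for Δt1?
            run = run + 1 if level >= l1 else 0
            if run >= n_on:
                ts = now - n_on          # step 102: ts = T1 - Δt1
                speaking = True
                run = 0
        else:
            # Step 103: has the level stayed at or below L2 for Δt2?
            run = run + 1 if level <= l2 else 0
            if run >= n_off:
                sections.append((ts, now - n_off))  # step 104: te = T2 - Δt2
                speaking = False
                run = 0
    # An utterance still in progress at the end of the data is not reported.
    return sections
```

For example, with L1 = L2 = 3, Δt1 = 2 frames, and Δt2 = 3 frames, the level sequence `[0, 0, 5, 5, 5, 5, 0, 0, 0, 5, 0, 0]` yields the single section `(2, 6)`; the isolated loud frame at index 9 is shorter than Δt1 and is not treated as an utterance.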

[0031] FIG. 5 shows an example of the data structure of the utterance status table TBL1, which is the record of the utterance status held in the voice/posture recording unit 13. The "utterance ID" field is an identification number assigned sequentially to each detected utterance. The "speaker" field is the name of the conference participant whose utterance was detected. If a separate conference participant table listing all participants and their identification information is kept, the "speaker" field may instead hold the participant's identification information.

[0032] The "utterance start time" and "utterance end time" fields hold the utterance start time ts and the utterance end time te calculated by the utterance section detection process described above. The final field, "utterance target," records to whom each utterance was directed, as specified by the utterance target specifying unit 21 described later.
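The layout of the utterance status table TBL1 can be illustrated with a small data structure. This is a minimal sketch that assumes an in-memory list of rows; the class and attribute names are hypothetical, and the source does not specify the storage format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Utterance:
    """One row of TBL1."""
    utterance_id: int            # assigned sequentially on detection
    speaker: str                 # participant name or identifier
    start_time: float            # ts
    end_time: float              # te
    target: Optional[str] = None  # filled in later by unit 21


@dataclass
class UtteranceTable:
    rows: List[Utterance] = field(default_factory=list)

    def add(self, speaker: str, start_time: float, end_time: float) -> Utterance:
        # IDs start at 1 and follow detection order (an assumption of this sketch).
        row = Utterance(len(self.rows) + 1, speaker, start_time, end_time)
        self.rows.append(row)
        return row
```

The `target` attribute is left empty when a row is created, reflecting that the utterance target is determined in a later step rather than during utterance section detection.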

[0033] Next, the processing of the posture information processing unit 12 is described. The posture information processing unit 12 detects whom a speaker is looking at while speaking, from the output of the gaze detection sensor 2 and the output of the magnetic sensor 3, which constitute the posture information input unit 6.

[0034] The gaze detection sensor 2 detects the gaze direction of the user wearing it, in the head coordinate system. In this example, the gaze detection sensor 2 detects the gaze of both eyes and uses the two lines of sight to find the location being looked at, that is, the viewpoint, in the head coordinate system. The three-dimensional magnetic sensor 3 is used to convert this viewpoint position into a viewpoint position in the absolute coordinate system.

[0035] The three-dimensional magnetic sensor 3 is worn on the participant's head and gives the direction of the head in absolute space. From the information of the three-dimensional magnetic sensor 3 and the viewpoint position in the head coordinate system, each participant's viewpoint position in absolute space is obtained. The posture information processing unit 12 also records the positions of the conference participants. In this example, the latest head position output by each participant's three-dimensional magnetic sensor 3 is recorded as that position.
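The conversion from head coordinates to absolute coordinates can be sketched as a rigid transform. The source does not state how the magnetic sensor output is represented, so this sketch assumes it has been converted to a 3x3 rotation matrix and a position vector in the absolute frame; the function name is hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def head_to_absolute(
    viewpoint_head: Sequence[float],
    head_rotation: Sequence[Sequence[float]],
    head_position: Sequence[float],
) -> Vec3:
    """Map a viewpoint from head coordinates to absolute coordinates.

    Computes R * v + p, where R (head_rotation) and p (head_position) are
    assumed to be derived from the three-dimensional magnetic sensor.
    """
    x, y, z = (
        sum(head_rotation[i][j] * viewpoint_head[j] for j in range(3))
        + head_position[i]
        for i in range(3)
    )
    return (x, y, z)
```

For instance, a head at position (10, 0, 0) that is rotated 90 degrees about the vertical z axis maps a viewpoint one unit straight ahead along the head's x axis to (10, 1, 0) in absolute coordinates.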

[0036] FIG. 6 illustrates how the gaze target is obtained in this example (the gaze target of a person who is speaking is the utterance target).

[0037] As shown in FIG. 6(A), in this example, when the viewpoint position Pe of a conference participant A lies within a predetermined range (a sphere of radius R) centered on the position Ps of the three-dimensional magnetic sensor 3 worn on the head of another participant B, participant A is interpreted as gazing at participant B.
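The sphere test of FIG. 6(A) can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the boundary of the sphere is treated as inside, and choosing the nearest sensor when several spheres contain the viewpoint is an assumption that the source does not address.

```python
import math
from typing import Dict, Optional, Sequence


def find_gaze_target(
    viewer: str,
    viewpoint: Sequence[float],
    sensor_positions: Dict[str, Sequence[float]],
    radius: float,
) -> Optional[str]:
    """Return the participant whose sensor position Ps lies within `radius`
    of the viewer's viewpoint Pe, or None if there is no such participant."""
    best_name: Optional[str] = None
    best_dist = math.inf
    for name, ps in sensor_positions.items():
        if name == viewer:
            continue  # a participant cannot be their own gaze target
        d = math.dist(viewpoint, ps)
        # Assumption: if several spheres contain Pe, take the nearest sensor.
        if d <= radius and d < best_dist:
            best_name, best_dist = name, d
    return best_name
```

A result of `None` corresponds to a unit time in which the participant is not looking at any other participant.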

[0038] The gaze target of each participant is detected, for example, once per unit time. The voice/posture recording unit 13 records each participant's gaze target in each unit time as a gaze target table TBL2, as shown for example in FIG. 7. The "time" field of the gaze target table TBL2 identifies each unit time and is given as a sequential number in this example. In the example of FIG. 7, it is recorded that participant A was gazing at participant B at time 1 and time 2.
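The gaze target table TBL2 can be pictured as a mapping from unit-time index to each participant's gaze target. The sketch below assumes a nested dictionary; the class and method names are hypothetical. Only the entries for participant A at times 1 and 2 come from the description of FIG. 7.

```python
from typing import Dict, Optional


class GazeTargetTable:
    """TBL2: for each unit time, whom each participant was gazing at."""

    def __init__(self) -> None:
        self._rows: Dict[int, Dict[str, Optional[str]]] = {}

    def record(self, time_index: int, participant: str, target: Optional[str]) -> None:
        # target is None when the participant gazes at no one in that unit time
        self._rows.setdefault(time_index, {})[participant] = target

    def target_of(self, time_index: int, participant: str) -> Optional[str]:
        return self._rows.get(time_index, {}).get(participant)


tbl2 = GazeTargetTable()
tbl2.record(1, "A", "B")  # from the FIG. 7 description
tbl2.record(2, "A", "B")  # from the FIG. 7 description
```

Keying the table by sequential time index keeps lookups over a time section simple, since a section then corresponds to a contiguous range of indices.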

[0039] In the above example, the viewpoint can be obtained because gaze detection is performed for both eyes. However, the gaze target can also be detected approximately using the line of sight of one eye together with the three-dimensional magnetic sensor 3, or using the three-dimensional magnetic sensor 3 alone. In that case, though, it cannot always be confirmed that the eyes are actually fixating on the target.

[0040] FIG. 6(B) is a schematic diagram of the gaze target detection method in this case. Here, the distance d between the straight line DR representing the line of sight or the head direction and the position Ps of participant B's three-dimensional magnetic sensor 3 is calculated, and if d is within the predetermined distance R, participant A is interpreted as gazing at participant B.
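The distance d in FIG. 6(B) is the ordinary point-to-line distance, which can be computed with a cross product. The sketch below assumes the line DR is given by an origin point (for example, the viewer's head position) and a direction vector; the function names are hypothetical. Following the text, DR is treated as an infinite straight line, so a sensor located behind the viewer would also pass the test unless a forward-direction check is added.

```python
import math
from typing import Sequence


def distance_point_to_line(
    origin: Sequence[float],
    direction: Sequence[float],
    point: Sequence[float],
) -> float:
    """Distance d from `point` (Ps) to the line through `origin` along `direction` (DR)."""
    dx, dy, dz = direction
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError("direction must be a non-zero vector")
    wx, wy, wz = (point[i] - origin[i] for i in range(3))
    # |w x direction| / |direction|
    cx = wy * dz - wz * dy
    cy = wz * dx - wx * dz
    cz = wx * dy - wy * dx
    return math.sqrt(cx * cx + cy * cy + cz * cz) / norm


def is_gazing_along_line(origin, direction, sensor_pos, radius) -> bool:
    # Gazing if the line DR passes within distance R of the sensor position Ps.
    return distance_point_to_line(origin, direction, sensor_pos) <= radius
```

For a viewer at the origin looking along the x axis, a sensor at (2, 0.3, 0) lies at distance 0.3 from the line, so it counts as the gaze target for R = 0.5 but not for R = 0.2.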

[0041] As described above, the voice/posture recording unit 13 records the utterance status table TBL1, which is the record of the utterance status, and the gaze target table TBL2, which is the record of the posture status, and also records the voice signals from all microphones 4 as conference information. The voice information is recorded in an ordinary audio format of the kind provided on personal computers and workstations.

[0042] The display unit 18 has a display screen such as a CRT monitor or a liquid crystal monitor, and in this example it also serves as a pen/tablet integrated input/output device.

[0043] In this example, the user input unit 14 is the pen/tablet integrated input/output device mentioned above. The control unit 15 receives user input information from the user input unit 14 and sends it to the display unit 18 to be shown on the display screen. It then performs the processing appropriate to the received user input information, depending on whether information is being recorded (accumulated) or reproduced.

[0044] The user input information may be handwriting or figures (objects such as lines, rectangles, and circles) from a pen (or a mouse, trackball, touch panel, or the like), code information obtained by character recognition of handwriting data, or code information from a keyboard.

[0045] The user input information may also be of a kind that is not itself displayed, such as editing information indicating that displayed user input information was moved, copied, or deleted, information that a page was switched, information that the user sat down on a chair fitted with a sensor, or information that a virtual eraser was used. In such cases, a predetermined mark indicating the existence of the user input information is shown on the display unit 18. In other words, any input that a user makes to a device with computing capability while an input voice signal is present corresponds to user input information as the term is used in the present invention.

[0046] The control unit 15 also records, in the user input information recording unit 16, the user input information entered while voice information was being recorded as conference information, together with its input start time, end time, and so on.

[0047] During reproduction, the control unit 15 reads the information recorded in the user input information recording unit 16 in response to the user's instructions from the user input unit 14 and passes it to the display unit 18. In addition, as described later, when the user makes a pointing input while the speaker chart is shown on the display screen of the display unit 18, the time portion corresponding to the indicated part can be reproduced.

[0048] The display unit 18 shows the user input information passed from the control unit 15 on its display screen. As described later, it also displays the speaker chart created by the chart creation unit 17.

[0049] As described later, the user can select any item of user input information from among those shown on the display screen of the display unit 18, in order to reproduce the corresponding time-series information or to request creation of a chart. In this system, the user can also indicate a specific reproduction position, that is, a reproduction start point or a reproduction section, on the speaker chart displayed on the display unit 18, and have the voice information corresponding to that position reproduced.

【００５０】 Specifically, the recording media of the voice/posture recording unit 13 and the user input information recording unit 16 are a semiconductor memory or a hard disk built into the personal computer. A recording medium such as an MO disk or a floppy disk may be used instead.

【００５１】 The display unit 18 and the audio output unit 20 are implemented by a display and a speaker connected to the personal computer.

【００５２】 Next, the processing operations of the utterance target person specifying unit 21, the utterance flow detecting unit 22, and the chart creating unit 17, all of which are implemented by software on the personal computer, are described below.

【００５３】 First, the operation of the utterance target person specifying unit 21 is described. When the user enters a command to specify utterance target persons through the user input unit 14, the command is passed to the utterance target person specifying unit 21 via the control unit 15. The operation is as follows.

【００５４】 The utterance target person specifying unit 21 uses the information in the gaze target table TBL2 recorded in the voice/posture recording unit 13 to specify, for each utterance, the person to whom the speaker addressed it (the utterance target person). In this example, a time section is specified for each utterance, and the utterance target person is specified by referring to the gaze target table TBL2, which serves as posture status information, within that section. This section is hereinafter called the target section D.

【００５５】 Let ts be the utterance start time, te the utterance end time, and D the target section whose posture information is regarded as valid for the utterance. Given a time length t1 going back from the utterance end time te toward the utterance start time ts, and a time length t2 following the utterance end time te, the target section D is basically the section from time te - t1 to time te + t2, with the utterance end time te as the reference time. There are, however, various other ways to determine it: the entire utterance may be taken as the target section D, or a fixed proportion of the latter half of the utterance section may be taken as the target section D.

【００５６】 Four cases, Case 1 to Case 4, arise in determining the target section D. Each is described with reference to the schematic diagrams of FIG. 8.

【００５７】 First, in Case 1 shown in FIG. 8(A), the time reached by going back t1 from the utterance end time te lies between the utterance start time ts and the utterance end time te (ts < (te - t1)), and no one else speaks in the section from the utterance end time te to time te + t2. In this case, the section from time te - t1 to time te + t2 is taken as the target section D.

【００５８】 In Case 2 shown in FIG. 8(B), the time reached by going back t1 from the utterance end time te is earlier than the utterance start time ts (ts > (te - t1)), and no one else speaks in the section from the utterance end time te to time te + t2. In this case, the section from the utterance start time ts to time te + t2 is taken as the target section D.

【００５９】 In Case 3 shown in FIG. 8(C), the time reached by going back t1 from the utterance end time te lies between the utterance start time ts and the utterance end time te (ts < (te - t1)), but someone else speaks at a time tx within the section from the utterance end time te to time te + t2. In Case 3, the section from time te - t1 to time tx is taken as the target section D.

【００６０】 In Case 4 shown in FIG. 8(D), the time reached by going back t1 from the utterance end time te is earlier than the utterance start time ts (ts > (te - t1)), and someone else speaks at a time tx within the section from the utterance end time te to time te + t2. In Case 4, the section from the utterance start time ts to time tx is taken as the target section D.
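As a rough illustration, the four cases can be folded into two independent decisions: where the section starts and where it ends. The following Python sketch shows one way to do this. The function name `target_section`, the use of `None` for "no other speaker", and the handling of the boundary ts = te - t1 (which the text does not address) are assumptions made for illustration only.

```python
from typing import Optional, Tuple


def target_section(ts: float, te: float, t1: float, t2: float,
                   tx: Optional[float] = None) -> Tuple[float, float]:
    """Return (start, end) of target section D for one utterance.

    ts, te: utterance start and end times.
    t1: look-back length before te; t2: look-ahead length after te.
    tx: time at which another participant starts speaking, or None.
    """
    # Cases 2 and 4: te - t1 falls before the utterance start, so use ts.
    # Cases 1 and 3: te - t1 lies inside the utterance, so use te - t1.
    start = max(ts, te - t1)

    # Cases 3 and 4: someone else speaks at tx within [te, te + t2].
    if tx is not None and te <= tx <= te + t2:
        end = tx
    else:
        # Cases 1 and 2: nobody else speaks before te + t2.
        end = te + t2
    return start, end


if __name__ == "__main__":
    print(target_section(0, 10, 3, 2))         # Case 1
    print(target_section(8, 10, 3, 2))         # Case 2
    print(target_section(0, 10, 3, 2, tx=11))  # Case 3
    print(target_section(8, 10, 3, 2, tx=11))  # Case 4
```

Treating the start and the end separately keeps the sketch short, since Cases 1 and 3 share a start rule and Cases 1 and 2 share an end rule.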

【００６１】 Next, FIG. 9 shows a flowchart of the processing performed by the utterance target person specifying unit 21. When a command to specify utterance target persons arrives from the user input unit 14 through the control unit 15, the utterance target person specifying unit 21 starts processing and proceeds to step 201.

【００６２】 In step 201, the unit focuses on the first utterance in the utterance status table TBL1 recorded in the voice/posture recording unit 13 and obtains its utterance start time ts and utterance end time te. The process then proceeds to step 202.

【００６３】 In step 202, the target section D is determined for that utterance as described above with reference to FIG. 8, and the utterance target person for the target section D is determined by referring to the gaze target table TBL2 in the voice/posture recording unit 13. The details of this specifying method are described later.

【００６４】 The process then proceeds to step 203, which checks whether the utterance just processed is the last one recorded in the utterance status table TBL1. If it is, the utterance target person specifying process ends. Otherwise, the process proceeds to step 204, focuses on the next utterance, obtains its utterance start time ts and utterance end time te, and returns to step 202.

【００６５】 Next, the method used in this example to determine the utterance target person in step 202 is described. In this example, the gaze target table TBL2 is consulted, and when the speaker has gazed at a particular participant for three or more consecutive unit times within the target section D determined in step 202, that gazed-at participant is specified as the utterance target person.

【００６６】 This is explained using the example of FIG. 10, which shows the posture status of participant A in the gaze target table TBL2 of the voice/posture recording unit 13. Suppose that the target section D determined in step 202 comprises the unit times from time n to time n + 15.

【００６７】 As shown in FIG. 10, in this example a participant who appears three or more times in succession within the target section D becomes the utterance target person. In the example of FIG. 10, between time n and time n + 15, participant B and participant D are each specified once as the utterance target person, as indicated by the arrows in FIG. 10.
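The run-length rule described above can be sketched in Python as follows. The representation of one row of the gaze target table as a list with one entry per unit time (`None` meaning that no participant is being gazed at) and the function name `utterance_targets` are assumptions for illustration. The sample sequence is invented and is not the content of FIG. 10, although it is chosen so that B and D are each identified once.

```python
from itertools import groupby
from typing import List, Optional


def utterance_targets(gaze: List[Optional[str]], min_run: int = 3) -> List[str]:
    """Return gazed-at participants with runs of at least min_run unit times.

    gaze holds, for each unit time in target section D, the participant the
    speaker was gazing at, or None. Each qualifying run yields one entry.
    """
    targets = []
    # groupby collapses consecutive equal entries into runs.
    for person, run in groupby(gaze):
        if person is not None and len(list(run)) >= min_run:
            targets.append(person)
    return targets


if __name__ == "__main__":
    # Hypothetical gaze record of one speaker for unit times n .. n+15.
    section = ["B", "B", "B", "B", None, "C", "C", None,
               "D", "D", "D", None, "B", "B", None, None]
    print(utterance_targets(section))
```

In this sample, the two-unit runs on C and on B fall short of the threshold and are ignored.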

【００６８】 Next, the process of expressing the utterance status as a chart and displaying it on the display unit 18 is described.

【００６９】 When a chart creation command and the time section of the chart to be created (start time Ts, end time Te) are entered from the user input unit 14, the control unit 15 sends this information to the chart creating unit 17. On receiving it, the chart creating unit 17 executes the chart creation process. FIG. 11 shows a flowchart of an example of this process.

【００７０】 That is, on receiving the chart creation command and the time section (Ts, Te) of the chart to be created, the chart creating unit 17 proceeds to step 301. In step 301, it refers to the utterance status table TBL1, which is the utterance record information in the voice/posture recording unit 13, and creates a basic chart such as that shown in FIG. 12, which visually represents each participant's utterance sections on the time axis.

【００７１】 In the basic chart of FIG. 12, the conference participant names are displayed in an area 31. The area beside the participant name display area 31 is a speaker chart display area 32, in which each participant's utterance sections between the designated start time Ts and end time Te are represented by rectangular bars 33.
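To make the layout of the basic chart concrete, the following Python sketch renders it as plain text, with one row per participant and `#` characters standing in for the rectangular bars 33. The text rendering, the function name `render_basic_chart`, and the tuple layout `(speaker, start, end)` are all assumptions for illustration. The specification describes a graphical chart and does not prescribe any rendering method.

```python
from typing import List, Tuple


def render_basic_chart(utterances: List[Tuple[str, float, float]],
                       Ts: float, Te: float, width: int = 40) -> str:
    """Render utterance sections between Ts and Te as rows of text.

    Each row is "<name> |<cells>|": the name plays the role of area 31 and
    the cells the role of chart area 32, with '#' marking utterance time.
    """
    scale = width / (Te - Ts)  # character cells per unit of time
    names = sorted({speaker for speaker, _, _ in utterances})
    rows = []
    for name in names:
        cells = ["."] * width
        for speaker, start, end in utterances:
            if speaker != name:
                continue
            # Clip the utterance to the chart section, then map to cells.
            lo = max(0, int((max(start, Ts) - Ts) * scale))
            hi = min(width, int((min(end, Te) - Ts) * scale))
            for i in range(lo, hi):
                cells[i] = "#"
        rows.append(f"{name} |{''.join(cells)}|")
    return "\n".join(rows)


if __name__ == "__main__":
    table = [("A", 0, 10), ("B", 10, 20)]
    print(render_basic_chart(table, Ts=0, Te=20, width=20))
```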

【００７２】 The height and width of the speaker chart display area 32, the location of the time axis, the position of the area 31 showing speaker information, and similar layout values are held in the control unit 15, and the display information is generated with reference to them. In this embodiment, the size of the chart (the length of the time axis) varies with the length of the designated section.

【００７３】 Once the basic chart has been created as described above, the process proceeds to step 302, in which the information on the time section (Ts, Te) and a flow detection command are sent to the utterance flow detecting unit 22. In response to this command, the utterance flow detecting unit 22 detects the flow of utterances as described later and returns information on the detected flow.

【００７４】 In step 303, it is determined whether the utterance flow information has been received from the utterance flow detecting unit 22. If it has, the process proceeds to step 304. There, based on the result received from the utterance flow detecting unit 22, for each utterance section whose speaker had an utterance target person, the rectangular bar 33 of that utterance and the rectangular bar 33 of the utterance target person's next utterance are joined on the chart by a connecting line 34, as shown in FIG. 15 described later, which completes the chart. The process then proceeds to step 305, and the chart is displayed on the display screen of the display unit 18.

【００７５】 Next, the utterance flow detection processing of the utterance flow detecting unit 22, which is started by the command issued in step 302, is described with reference to the flowchart of FIG. 14.

【００７６】 When the information on the time section (Ts, Te) and the utterance flow detection command are input from the chart creating unit 17, the process proceeds to step 401, which refers to the utterance ID records of the utterance status table TBL1 in the voice/posture recording unit 13 and obtains the plural utterance IDs that fall within the designated time section (Ts, Te).

【００７７】 The process then proceeds to step 402, focuses on the first of the utterance IDs obtained, and proceeds to step 403. Step 403 refers to the utterance target person records of the utterance status table TBL1 in the voice/posture recording unit 13 and checks whether the utterance target person of the utterance in focus is the next speaker.

【００７８】 If the utterance target person of the utterance in focus is the next speaker, and moreover the utterance target person of that next speaker's utterance is the speaker of the utterance in focus, the two speakers can be regarded as gazing at each other. In that case, the speaker name, utterance start time, and utterance end time of the utterance in focus, together with the speaker name and utterance start time of the next utterance, are stored in a buffer as one set of information. FIG. 15 shows the information stored in the buffer at this point.

【００７９】 The process then proceeds to step 404, which checks whether the utterance ID in focus is the second-to-last utterance ID in the designated time section (Ts, Te). If it is, the process proceeds to step 405, sends the information stored in the buffer to the chart creating unit 17, and ends. Otherwise, the process proceeds to step 406, focuses on the next utterance ID, and returns to step 403 to repeat the same processing.
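Steps 401 to 406 amount to a single pass over consecutive pairs of utterances, keeping the pairs in which each speaker addressed the other. A minimal Python sketch follows. The `Utterance` record, the rule that an utterance belongs to (Ts, Te) when it lies entirely inside the section, and the sample data are assumptions for illustration. The specification does not define the table layout at this level of detail.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Utterance:
    uid: int
    speaker: str
    start: float
    end: float
    targets: List[str] = field(default_factory=list)  # utterance target persons


# One buffer entry: (speaker, start, end, next speaker, next start).
Flow = Tuple[str, float, float, str, float]


def detect_flows(table: List[Utterance], Ts: float, Te: float) -> List[Flow]:
    # Step 401: collect the utterances inside the designated section
    # (assumed here to mean fully contained in [Ts, Te]).
    selected = sorted((u for u in table if Ts <= u.start and u.end <= Te),
                      key=lambda u: u.start)
    buffer: List[Flow] = []
    # Steps 402 to 406: visit each utterance up to the second-to-last one
    # and compare it with the utterance that follows it.
    for cur, nxt in zip(selected, selected[1:]):
        # Step 403: the pair counts only if each speaker targeted the other.
        if nxt.speaker in cur.targets and cur.speaker in nxt.targets:
            buffer.append((cur.speaker, cur.start, cur.end,
                           nxt.speaker, nxt.start))
    return buffer  # step 405: handed to the chart creating unit


if __name__ == "__main__":
    tbl1 = [
        Utterance(1, "A", 0, 5),
        Utterance(2, "B", 6, 10, ["C"]),
        Utterance(3, "C", 11, 15, ["B"]),
        Utterance(4, "B", 20, 24),
        Utterance(5, "A", 25, 30, ["D"]),
        Utterance(6, "D", 31, 35, ["A"]),
    ]
    for entry in detect_flows(tbl1, 0, 40):
        print(entry)
```

The sample table loosely follows a sequence in which B addresses C and C answers, and later A addresses D and D answers, so two buffer entries result.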

【００８０】 FIG. 13 shows an example of a speaker chart created as described above and displayed on the display unit 18. The chart of FIG. 13 corresponds to the following sequence of utterances in a conference. After conference participant A spoke, conference participant B made an utterance prompting conference participant C to speak, and conference participant C answered. A little later, conference participant B commented on that answer. Next, conference participant A said something to conference participant D, and conference participant D said something in reply.

【００８１】 In this sequence, the part where conference participant B prompted conference participant C and conference participant C answered, and the part where conference participant A addressed conference participant D, are interactive parts. As shown in FIG. 13, their utterance bars 33 are joined by arrows 34 to indicate this.

【００８２】 By displaying the relationships between utterances on the chart in this way, derived from the utterance information and the posture information, the user can recognize not merely the alternation of speakers but also the flow of each utterance.

【００８３】 The interaction between utterance sections can be displayed in more detail by recording the utterance target person records of the utterance status table in more detail.

【００８４】 FIG. 16 shows an example of an utterance status table TBL3 for such a case. In FIG. 16, the utterance target person column is recorded in more detail than in the utterance status table TBL1 of FIG. 5. Specifically, in FIG. 16 the utterance target person record holds, for every conference participant, the number of times the speaker of the utterance ID gazed at that participant (the number of times the participant became the utterance target person) and the gaze time. The utterance target person specifying unit 21 records, for each conference participant, the number of times that participant became the utterance target person and the corresponding time.

【００８５】 That is, in this example the utterance target person specifying unit 21 records, for each conference participant, the number of times the participant became the utterance target person and the corresponding time. This reveals how much attention the speaker paid to each of the other conference participants, so that interactivity can be displayed in finer detail on the speaker chart.

【００８６】 When, as in the utterance status table TBL3 of this example, the utterance target person record holds for every conference participant the number of times the speaker of the utterance ID gazed at that participant (the number of times the participant became the utterance target person) and the gaze time, the connecting lines 35, 36, and 37 on the speaker chart can be drawn with different attributes, as shown in FIG. 17, based on the time and count for which each conference participant was the utterance target person.

【００８７】 In the example of FIG. 17, interactivity is judged from the gaze time and count, and the lines are drawn with different attributes in descending order of interactivity: a thick solid line 35, a thick broken line 36, and a thin solid line 37.
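The text does not state how gaze time and gaze count are combined into a degree of interactivity, so the following Python sketch is purely illustrative. The weighted sum, the weights, and both thresholds are hypothetical values chosen for the example. Only the three line attributes and their ordering come from the description of FIG. 17.

```python
def line_style(gaze_count: int, gaze_time: float,
               w_count: float = 1.0, w_time: float = 1.0,
               high: float = 10.0, mid: float = 4.0) -> str:
    """Map gaze statistics to a connecting-line attribute.

    The score formula and the thresholds high and mid are hypothetical.
    The specification only says that higher interactivity is drawn as a
    thick solid line, then a thick broken line, then a thin solid line.
    """
    score = w_count * gaze_count + w_time * gaze_time
    if score >= high:
        return "thick solid"   # line 35: highest interactivity
    if score >= mid:
        return "thick broken"  # line 36: intermediate interactivity
    return "thin solid"        # line 37: lowest interactivity


if __name__ == "__main__":
    print(line_style(3, 9.0))
    print(line_style(2, 3.0))
    print(line_style(1, 1.0))
```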

【００８８】 The utterance flow detecting unit 22 is not used only for creating charts. For example, when an utterance flow detection command, a time section (Ts, Te), and the names of two speakers are entered from the user input unit 14, the utterance flow detecting unit 22 outputs the times at which the two designated conference participants interacted within the time section (Ts, Te).

【００８９】 This output is sent to the display unit 18 via the control unit 15. In this example, the buffer information from the chart creation described above is output. This is an example in which the utterance flow detecting unit 22 serves as one component of a search.
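A search of this kind can be sketched in Python as a filter over the buffer entries produced during flow detection. The entry layout `(speaker, start, end, next_speaker, next_start)` mirrors the items the buffer is said to hold, but the tuple form, the function name `exchanges_between`, and the choice to report each exchange as the start time of the first utterance and the start time of the reply are assumptions for illustration.

```python
from typing import List, Tuple

# One buffer entry: (speaker, start, end, next_speaker, next_start).
Flow = Tuple[str, float, float, str, float]


def exchanges_between(buffer: List[Flow], name_a: str,
                      name_b: str) -> List[Tuple[float, float]]:
    """Return (utterance start, reply start) for exchanges between two people.

    The order of the two names does not matter: an exchange is kept whether
    name_a addressed name_b or the other way round.
    """
    pair = {name_a, name_b}
    return [(start, next_start)
            for speaker, start, _end, next_speaker, next_start in buffer
            if {speaker, next_speaker} == pair]


if __name__ == "__main__":
    # Hypothetical buffer contents for a section (Ts, Te).
    buf = [("B", 6, 10, "C", 11), ("A", 25, 30, "D", 31)]
    print(exchanges_between(buf, "C", "B"))
    print(exchanges_between(buf, "A", "B"))
```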

【００９０】 Merely tracking the transitions between speakers does not reveal accurately when a highly interactive exchange took place. By using the utterance flow detecting unit 22 as described above, however, scenes in which two particular participants interacted, such as who asked a question and who answered it, can be extracted accurately.

【００９１】 In this example, the display unit 18 is an integrated input/output display, so the user can reproduce the audio information by directly designating a displayed utterance flow. In that case, the input coordinates corresponding to the user input are sent from the display unit 18 to the control unit 15.

【００９２】 All displayed information, such as the chart information and the displayed commands, is managed by the control unit 15, so the control unit 15 interprets the meaning of the input. For example, when the user indicates an arbitrary position on the displayed chart and presses the play button, the control unit 15 converts the coordinates into a time and then sends a reproduction command and the time to the reproduction unit 19. The reproduction unit 19 reads the audio signal recorded in the voice/posture recording unit 13 for the designated time and outputs it to the audio output unit 20.
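The conversion from a chart coordinate to a time is not spelled out in the text. Assuming a horizontal time axis that maps the section (Ts, Te) linearly onto the chart display area, it could be sketched in Python as below. The parameter names `left` and `width` (the pixel position and pixel width of the chart area) and the clamping of positions outside the area are assumptions for illustration.

```python
def x_to_time(x: float, left: float, width: float,
              Ts: float, Te: float) -> float:
    """Convert a horizontal screen coordinate to a time on the chart.

    Assumes the chart area starts at pixel `left`, spans `width` pixels,
    and represents the section from Ts to Te on a linear time axis.
    """
    if width <= 0:
        raise ValueError("chart width must be positive")
    fraction = (x - left) / width
    # Clamp so that a position just outside the area maps to Ts or Te.
    fraction = min(1.0, max(0.0, fraction))
    return Ts + fraction * (Te - Ts)


if __name__ == "__main__":
    # A chart area 400 pixels wide starting at x = 100, showing 0 to 80 s.
    print(x_to_time(150, left=100, width=400, Ts=0.0, Te=80.0))
```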

【００９３】 As another embodiment, FIG. 18 shows an example in which a section specifying unit 23 is provided.

【００９４】 In this example, the section specifying unit 23 takes a time or time section that has been entered and specifies from it a time section that accords with the output of the utterance flow detecting unit 22. The specified section is used by the chart creating unit 17 and the reproduction unit 19.

【００９５】 Specifically, when the user enters a suitable time section, the time section of the series of utterance flows containing that time or time span is specified, and it can be displayed as a speaker chart or reproduced. FIG. 19 shows the concept, and FIG. 20 shows a flowchart for this embodiment.

【００９６】 As shown in FIG. 20, when a time T or a time section (T0, T1) is entered, the process proceeds to step 501 and, as shown in FIG. 19, finds the first start time Ta of the utterance flow at or before time T or T0. The process then proceeds to step 502 and finds the end time Tb at which an utterance flow first ends at or after time T or T1.

【００９７】 The process then proceeds to step 503 and outputs the start time Ta and the end time Tb to the control unit 15. As shown in FIG. 19, a slightly widened section (Ta', Tb') may be used instead. Widening the section makes the context somewhat easier to follow.
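Steps 501 to 503 can be sketched in Python as follows, under one possible reading of the text: each detected utterance flow is represented as a `(start, end)` interval, Ta is the latest flow start at or before the entered time, and Tb is the earliest flow end at or after it. The interval representation, the function name `specify_section`, the `margin` parameter used to produce the widened section (Ta', Tb'), and the `None` result when no flow qualifies are all assumptions for illustration.

```python
from typing import List, Optional, Tuple


def specify_section(flows: List[Tuple[float, float]], t0: float,
                    t1: Optional[float] = None,
                    margin: float = 0.0) -> Optional[Tuple[float, float]]:
    """Return (Ta, Tb) for an entered time T (t0 only) or section (t0, t1).

    flows holds the (start, end) interval of each detected utterance flow.
    margin widens the result on both sides, giving (Ta', Tb').
    """
    if t1 is None:
        t1 = t0  # a single time T is treated as the section (T, T)
    # Step 501: flow start times at or before T (or T0); take the nearest.
    starts = [s for s, _ in flows if s <= t0]
    # Step 502: flow end times at or after T (or T1); take the first.
    ends = [e for _, e in flows if e >= t1]
    if not starts or not ends:
        return None  # no flow brackets the entered time
    # Step 503: output the section, optionally widened.
    return max(starts) - margin, min(ends) + margin


if __name__ == "__main__":
    detected = [(6, 15), (25, 35)]  # hypothetical flow intervals
    print(specify_section(detected, 12))
    print(specify_section(detected, 10, 30))
    print(specify_section(detected, 12, margin=2))
```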

【００９８】 The time input to the section specifying unit 23 may be entered directly by the user from the user input unit 14. It may also be supplied as follows.

【００９９】 As shown in FIG. 21, when the user designates a particular point 41 on the speaker chart displayed on the display unit 18 through the user input unit 14, the control unit 15 refers, based on the time at the designated position, to the user input information and input times recorded in the user input information recording unit 16 and inputs the corresponding input time to the section specifying unit 23.

【０１００】 The embodiments above apply the present invention to an ordinary face-to-face conference such as that shown in FIG. 2, but the invention is also applicable to video conferences.

【０１０１】 FIG. 22 illustrates a conference situation in which the present invention is applied to a video conference. In this example, the conference information recording personal computer 5 is connected through a network 50 to a terminal personal computer 62 in the room 60 of each conference participant 61.

【０１０２】 The screen of the display 63 of each conference participant 61's terminal personal computer 62 shows the other conference participants in a multi-window format. A sensor constituting a line-of-sight detection unit 64 is installed on top of the terminal personal computer 62. The line-of-sight detection unit 64 detects which conference participant on the screen the conference participant 61 was gazing at. The posture information detected by the line-of-sight detection unit 64 is sent to the conference information recording personal computer 5 through the network 50.

【０１０３】 In this example, the video of each conference participant and the participant's speech are captured by a video camera 65 and sent to the conference information recording personal computer 5 through the network 50.

【０１０４】 The conference information recording personal computer 5 then records the utterance status information as the utterance status table TBL1 or TBL3 and the posture status in the gaze target table TBL2, in the same manner as described above. At reproduction time, a speaker chart such as those shown in FIG. 13 and FIG. 17 is displayed on the display screen and used as an aid to searching.

【０１０５】[0105]

[Effects of the Invention] As described above, the present invention makes it possible to detect the flow of a conversation, which conventional conference systems and the like have not achieved. Audio information can then be reproduced, and charts displayed, on the basis of that flow.

【０１０６】 As a result, portions where the interaction between utterances is high can be detected easily, and simply by looking at this display the user can infer how the discussion progressed. The invention is therefore very useful for finding the places a user needs, such as the important parts of a conference.

【０１０７】 For example, when two utterances occur in succession, it becomes possible to tell whether they belong to a single flow or whether a different conversational flow has begun. If a different flow has begun, the user can see that the two utterances are not an exchange within one conversational flow, such as a question and its answer. For people who attended the conference, seeing these flows also helps them recall what took place.

[Brief description of the drawings]

FIG. 1 is a block diagram of an embodiment of the speech structure detection and display device according to the present invention.

FIG. 2 is a diagram for explaining an outline of a conference to which the speech structure detection and display device according to the present invention is applied.

FIG. 3 is a flowchart for explaining a method of detecting utterance sections in the embodiment of the speech structure detection and display device according to the present invention.

FIG. 4 is a diagram for explaining a method of detecting utterance sections in the embodiment of the speech structure detection and display device according to the present invention.

FIG. 5 is a diagram showing an example of the recorded utterance status information in the embodiment of the speech structure detection and display device according to the present invention.

FIG. 6 is a diagram for explaining an example of a method of detecting the gaze status as the speaker's posture in the embodiment of the speech structure detection and display device according to the present invention.

FIG. 7 is a diagram for explaining an example of the recorded gaze status information as the speaker's posture in the embodiment of the speech structure detection and display device according to the present invention.

【図８】この発明による発言構造検出表示装置の実施の
形態において、発言対象者を特定する方法の説明に用い
る図である。FIG. 8 is a diagram used to describe a method of specifying a person to be uttered in the embodiment of the utterance structure detection display device according to the present invention.

【図９】この発明による発言構造検出表示装置の実施の
形態において、発言対象者を特定する処理の例を説明す
るためのフローチャートである。FIG. 9 is a flowchart for explaining an example of processing for specifying a person to be uttered in the embodiment of the utterance structure detection display device according to the present invention.

【図１０】この発明による発言構造検出表示装置の実施
の形態において、発言対象者を特定する方法の説明に用
いる図である。FIG. 10 is a diagram used to explain a method of specifying a person to be uttered in the embodiment of the utterance structure detection display device according to the present invention.

【図１１】この発明による発言構造検出表示装置の実施
の形態において、発言者チャートの作成処理の一例の説
明のためのフローチャートである。FIG. 11 is a flowchart illustrating an example of a speaker chart creation process in the embodiment of the speech structure detection display device according to the present invention.

FIG. 12 is a diagram showing an example of a basic speaker chart.

FIG. 13 is a diagram showing an example of a speaker chart in the embodiment of the speech structure detection display device according to the present invention.

FIG. 14 is a flowchart for explaining an example of speech flow detection processing in the embodiment of the speech structure detection display device according to the present invention.

FIG. 15 is a diagram used to explain an example of speech flow detection processing in the embodiment of the speech structure detection display device according to the present invention.

FIG. 16 is a diagram showing an example of recorded information on speech status in the embodiment of the speech structure detection display device according to the present invention.

FIG. 17 is a diagram showing an example of a speaker chart in the embodiment of the speech structure detection display device according to the present invention.

FIG. 18 is a block diagram of another embodiment of the speech structure detection display device according to the present invention.

FIG. 19 is a diagram used to explain the operation of another embodiment of the speech structure detection display device according to the present invention.

FIG. 20 is a flowchart used to explain the operation of another embodiment of the speech structure detection display device according to the present invention.

FIG. 21 is a diagram used to explain the operation of another embodiment of the speech structure detection display device according to the present invention.

FIG. 22 is a diagram for explaining an outline of another example of a conference to which the speech structure detection display device according to the present invention is applied.

[Explanation of symbols]

2 gaze detection sensor
3 three-dimensional magnetic sensor
4 microphone
5 conference recording personal computer
11 voice information processing unit
12 posture information processing unit
13 voice/posture recording unit
14 user input unit
15 control unit
16 user input information recording unit
17 chart creation unit
18 display unit
19 reproduction unit
20 voice output unit
21 speech target person specifying unit
22 speech flow detection unit
23 section specifying unit

Claims

1. A speech structure detection display device, comprising: voice input means for collecting the voices of speakers; speech section detection means for detecting the speech section of each speaker from a voice signal from the voice input means; posture detection means for detecting the posture of each speaker; voice/posture recording means for recording, in association with one another, voice information from the voice input means, information on the speech section of each speaker detected by the speech section detection means, and the posture of each speaker detected by the posture detection means; speech target person specifying means for specifying, based on the information recorded in the voice/posture recording means, to whom a speech is addressed; speech flow detection means for detecting the flow of speeches in a predetermined time section, based on the information recorded in the voice/posture recording means and the result of the speech target person specifying means; and display means for displaying display information according to the detection result of the speech flow detection means.

2. The speech structure detection display device according to claim 1, further comprising a chart creator for creating a speech chart that displays at least each of the speech sections in the predetermined time section detected by the speech flow detection means and the mutual relation of those speech sections.

3. The speech structure detection display device according to claim 1, further comprising section specifying means for specifying the predetermined time section.

4. The speech structure detection display device according to claim 1, further comprising user input information recording means for recording user input information entered in response to the speeches, together with the time of that input.
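
For illustration only, the data flow recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions and is not the method of the embodiment: the record types `SpeechSection` and `GazeRecord`, the functions `specify_target` and `detect_flow`, the rule that the speech target is the participant the speaker gazed at longest, and the rule that two consecutive speeches are linked when the earlier one is addressed to the next speaker are all hypothetical choices made here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SpeechSection:
    """One detected speech section (hypothetical record layout)."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float


@dataclass
class GazeRecord:
    """One interval during which `viewer` gazed at `target` (hypothetical)."""
    viewer: str
    target: str
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the overlap between two time intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def specify_target(section: SpeechSection,
                   gaze_records: List[GazeRecord]) -> Optional[str]:
    """Assumed rule: the target is whoever the speaker gazed at longest
    during the speech section; None if the speaker gazed at nobody."""
    totals = defaultdict(float)
    for g in gaze_records:
        if g.viewer == section.speaker:
            totals[g.target] += overlap(section.start, section.end, g.start, g.end)
    gazed = {target: t for target, t in totals.items() if t > 0.0}
    if not gazed:
        return None
    return max(gazed, key=gazed.get)


def detect_flow(sections: List[SpeechSection],
                gaze_records: List[GazeRecord],
                t0: float, t1: float) -> List[Tuple[SpeechSection, SpeechSection]]:
    """Assumed rule: within the time section [t0, t1], link two consecutive
    speeches when the earlier one was addressed to the next speaker."""
    in_range = sorted((s for s in sections if s.start < t1 and s.end > t0),
                      key=lambda s: s.start)
    links = []
    for prev, nxt in zip(in_range, in_range[1:]):
        if specify_target(prev, gaze_records) == nxt.speaker:
            links.append((prev, nxt))
    return links


# Usage: A speaks while mostly looking at B, B answers while looking at A,
# then C speaks without having been addressed.
sections = [SpeechSection("A", 0, 10), SpeechSection("B", 10, 20),
            SpeechSection("C", 20, 30)]
gaze = [GazeRecord("A", "C", 0, 2), GazeRecord("A", "B", 2, 9),
        GazeRecord("B", "A", 10, 18)]
links = detect_flow(sections, gaze, 0, 30)
```

The resulting pairs are the kind of relation between speech sections that a chart such as the one in claim 2 could draw as connected segments. The time section passed as `t0` and `t1` plays the role of the predetermined time section of claim 3.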