JP2006251898A

JP2006251898A - Information processor, information processing method, and program

Info

Publication number: JP2006251898A
Application number: JP2005064062A
Authority: JP
Inventors: Takeshi Mizunashi; 豪水梨
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2005-03-08
Filing date: 2005-03-08
Publication date: 2006-09-21

Abstract

PROBLEM TO BE SOLVED: To provide an information processor that accurately grasps the interests and stance of a conference participant.

SOLUTION: A conference history display device (the information processor) 1 comprises: a voice recognition part 3 that generates speech data by transcribing a speaker's voice data into text using a voice recognition technique; a speech database 4 that stores the speech data; a speaker setting part 5 for setting the speaker; a tabulating part 6 that, based on the speech data stored in the speech database 4, tabulates the number of appearances of words in the speech data in correspondence with the time axis of the speech data; and a control part 7 that displays a graph in which the number of appearances of the words in the speech data is made to correspond to the time axis of speaker data. Consequently, when the conference is reviewed later, it is easy to see who spoke, when, and about what, so the interests and stance of each participant can be accurately grasped.

COPYRIGHT: (C)2006, JPO & NCIPI

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

Conventionally, a playback apparatus has been proposed that can record information such as conferences without omission and can later play back only a selected portion, with the user easily specifying the information required (Patent Document 1).

Patent Document 2 proposes a speech structure detection and display device that detects and displays the structure of participants' remarks in a conference. The device described in Patent Document 2 identifies, for each speaker, the speech sections, the speaker's posture, and the person addressed, and detects the flow of remarks within a predetermined time section, so that the relationships among highly interactive remarks can be displayed.

A speech structure information presentation device has also been proposed that identifies and presents an appropriate speech structure section for each group of related remarks, so that a desired location can be accessed efficiently when conference information is played back (Patent Document 3).

Patent Document 1: JP-A-6-343146
Patent Document 2: JP-A-11-259501
Patent Document 3: JP-A-11-272679

However, with the prior art it is difficult to grasp accurately, for example, what a given conference participant talked about at the conference and how, or what interests and stance that participant holds.

The present invention has been made in view of the above problem, and its object is to provide an information processing apparatus, an information processing method, and a program capable of accurately grasping each person's interests and stance.

To solve the above problem, the present invention is an information processing apparatus comprising storage means for storing utterance data, and totaling means for totaling, based on the utterance data stored in the storage means, the number of appearances of words in the utterance data in correspondence with the time axis of the utterance data. According to the present invention, totaling the number of appearances (appearance frequency) of words in the utterance data in correspondence with the time axis of the utterance data makes it easy to see, for example when reviewing a meeting later, who said what and when, so that the interests and stance (attitude) of each participant can be accurately grasped.

The present invention further comprises control means for displaying a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data. This makes it easy to see, for example when reviewing a meeting later, who said what and when, so that the interests, attitude, and stance of each participant can be accurately grasped. The present invention further comprises control means for displaying, for each speaker, a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data. The control means displays, in association with the graph, the words that appear most frequently among the words in the utterance data. This shows at a glance which words were spoken.

The control means displays, in association with the graph, the words that appear most frequently among the words in the utterance data together with their utterance times. This shows at a glance when and which words were spoken. The control means displays the utterance partner in association with the graph. This shows at a glance who the speaker was talking to. The control means plays back the corresponding video when a predetermined operation is performed on the graph. This allows the speaker data to be checked directly.

The present invention further comprises setting means for setting the speakers to be totaled by the totaling means. This makes it possible to find the number of word appearances for a specific speaker. The present invention further comprises voice processing means for creating the utterance data by generating a transcript from a speaker's voice data using a predetermined voice recognition technique. Because the transcript is created in advance, results can be displayed quickly. The utterance data is created from voice data detected during a meeting, which supports reviewing the meeting afterwards. The utterance data includes an utterance start time, an utterance end time, a speaker, utterance content, and an utterance partner.

The present invention is also an information processing method comprising a step of acquiring utterance data stored in predetermined storage means, and a step of totaling, based on the utterance data, the number of appearances of words in the utterance data in correspondence with the time axis of the utterance data. According to the present invention, totaling the number of appearances (appearance frequency) of words in the utterance data in correspondence with the time axis of the utterance data makes it easy to see, when reviewing a meeting later, who said what and when, so that the interests, attitude, and stance of each participant can be accurately grasped. The method further comprises a step of displaying a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data.

The present invention is also a program for causing a computer to execute a step of acquiring utterance data, a step of totaling, based on the utterance data, the number of appearances of words in the utterance data in correspondence with the time axis, and a step of generating information for displaying a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data. According to the present invention, totaling the number of appearances (appearance frequency) of words in the utterance data in correspondence with the time axis of the utterance data makes it easy to see, when reviewing a meeting later, who said what and when, so that the interests, attitude, and stance of each participant can be accurately grasped.

According to the present invention, it is possible to provide an information processing apparatus, an information processing method, and a program that can accurately grasp each person's interests and stance.

The best mode for carrying out the present invention is described below.

FIG. 1 is a block diagram of a conference history display device (information processing apparatus) 1. As shown in FIG. 1, the conference history display device 1 includes a conference record database 2, a voice recognition unit 3, an utterance database 4, a speaker setting unit 5, a totaling unit 6, a control unit 7, a display unit 8, and a speaker 9.

The conference history display device 1 displays the history of a conference. The conference record database 2 stores the voice data of the conference participants (speakers) detected during the conference, together with the conference video data. The voice recognition unit 3 uses a predetermined voice recognition technique to generate a transcript from the speakers' voice data stored in the conference record database 2, thereby creating utterance data that includes the utterance content, and stores utterance data comprising the utterance start time, utterance end time, speaker, utterance partner, and utterance content in the utterance database 4. In this way, the utterance data is created from voice data detected during the conference.

FIG. 2 shows the contents of the utterance database 4. As shown in FIG. 2, the utterance database 4 consists of the following fields: utterance start time, utterance end time, speaker, utterance content (utterance data), and utterance partner. Here, the conference participants (speakers) are A, B, C, and D, so the utterance partners are likewise A, B, C, and D. The utterance content is held as text data. The speaker setting unit 5 is composed of, for example, a keyboard and a mouse. Using the speaker setting unit 5, the user can set the speakers to be totaled by the totaling unit 6.
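The record layout described for FIG. 2 can be sketched as a small Python data structure. This is a minimal illustration only: the class name `Utterance`, the field names, the helper `utterances_by`, and the sample rows (including the calendar date) are assumptions made for the example, not part of the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Utterance:
    """One row of the utterance database (names are illustrative assumptions)."""
    start: datetime   # utterance start time
    end: datetime     # utterance end time
    speaker: str      # e.g. "C"
    content: str      # transcribed text of the utterance
    partner: str      # utterance partner, e.g. "B"


def utterances_by(rows, speaker):
    """Return the rows spoken by the given speaker, in chronological order."""
    return sorted((r for r in rows if r.speaker == speaker), key=lambda r: r.start)


# Invented sample rows; the date is arbitrary.
rows = [
    Utterance(datetime(2005, 1, 1, 15, 43, 4), datetime(2005, 1, 1, 15, 43, 30),
              "C", "cost is a problem", "B"),
    Utterance(datetime(2005, 1, 1, 15, 42, 0), datetime(2005, 1, 1, 15, 42, 50),
              "A", "let us begin", "C"),
    Utterance(datetime(2005, 1, 1, 15, 43, 45), datetime(2005, 1, 1, 15, 44, 10),
              "C", "the base has a cost", "B"),
]

c_rows = utterances_by(rows, "C")
```

Filtering by speaker in this way corresponds to selecting a speaker with the speaker setting unit 5 before totaling.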

The totaling unit 6 extracts from the utterance database 4 the utterance data spoken by the set speaker. FIG. 3 shows the utterance data extracted from the utterance database 4 when participant C is set as the speaker. The totaling unit 6 analyzes the text in the "utterance content" field of the extracted utterance data and totals the number of appearances of words in the utterance data in correspondence with the time axis of the utterance data. Here, "in correspondence with the time axis of the utterance data" means while retaining the times of the utterance data, or while taking the time axis of the utterance data into account. FIG. 4 shows an example in which the totaling unit 6 has extracted only the nouns from the text in the utterance content field shown in FIG. 3. As shown in FIG. 4, the totaling unit 6 extracts only the nouns in the utterance content field as words.
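The totaling step can be sketched as follows. The source does not say how nouns are identified or how finely the time axis is divided, so this example makes two loud assumptions: a toy vocabulary `NOUNS` stands in for a real part-of-speech analysis, and utterances are grouped into fixed 60-second bins by their start time. All names and sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Toy stand-in for noun extraction (assumption): a real system would use
# a morphological analyzer or part-of-speech tagger instead.
NOUNS = {"cost", "problem", "base"}


def extract_nouns(text):
    """Return the nouns in text according to the toy NOUNS vocabulary."""
    return [w for w in text.lower().split() if w in NOUNS]


def count_words_per_bin(utterances, meeting_start, bin_seconds=60):
    """Map each time-bin index to a Counter of nouns spoken in that bin.

    utterances: iterable of (start_time, text) pairs for one speaker.
    """
    bins = {}
    for start, text in utterances:
        index = int((start - meeting_start).total_seconds() // bin_seconds)
        bins.setdefault(index, Counter()).update(extract_nouns(text))
    return bins


# Invented sample data for speaker C; the date is arbitrary.
meeting_start = datetime(2005, 1, 1, 15, 40, 0)
c_utterances = [
    (datetime(2005, 1, 1, 15, 43, 4), "cost is a problem"),
    (datetime(2005, 1, 1, 15, 43, 45), "the base has a cost"),
    (datetime(2005, 1, 1, 15, 44, 45), "cost again"),
]
bins = count_words_per_bin(c_utterances, meeting_start)
```

Keeping a `Counter` per bin retains both the total word count (the height of a bar) and the per-word counts needed to pick out frequent words later.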

The display unit 8 is, for example, a display device. Based on the totaling result, the control unit 7 displays on a graph (for example, as a bar graph) the total number of words that appeared and, among them, the words that appeared most often, and also displays an icon of the utterance partner at that time.

Next, a display example of the display unit 8 is described. FIG. 5 shows a display example when participant C is set as the speaker. As shown in FIG. 5, the control unit 7 displays a bar graph in which the number of appearances of words in the utterance content of the utterance data is associated with the time axis of the speaker data. In FIG. 5, the horizontal axis indicates the elapsed conference time and is displayed so that the conference start time and end time can be seen. The vertical axis indicates the number of words spoken by C. The window 20 displays the speaker name (C) set by the speaker setting unit 5.

The control unit 7 also displays, in association with the graph, the words that appear most frequently (the top-ranked words by number of appearances) among the words in the utterance content of the utterance data: "cost", "problem", "base", and "××". Further, the control unit 7 displays these frequent words in association with the graph together with the utterance times "15:43:04", "15:43:45", and "15:44:45". When displaying the frequent words, the control unit 7 displays an icon B representing participant B, the conversation partner, in association with the graph. When the user performs a predetermined operation on the graph, such as moving or clicking the bar 21, the control unit 7 plays back the corresponding conference video and voice data through the display unit 8 and the speaker 9.
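The information shown alongside each bar (total word count, most frequent words, utterance time, partner icon) and the lookup that turns a click into a playback position can be sketched as below. This is an illustrative assumption about how such data might be organized, not the method of the source: the functions `annotate` and `seek_offset`, the 60-second bin width, and the sample records are all hypothetical.

```python
from collections import Counter


def annotate(records, bin_seconds=60, top_n=2):
    """Summarize each time bin for display next to the bar graph.

    records: list of (offset_seconds, partner, nouns) for one speaker, where
    offset_seconds is measured from the start of the meeting.
    """
    grouped = {}
    for offset, partner, nouns in records:
        grouped.setdefault(offset // bin_seconds, []).append((offset, partner, nouns))

    summary = {}
    for index, items in sorted(grouped.items()):
        counts = Counter(n for _, _, nouns in items for n in nouns)
        summary[index] = {
            "total": sum(counts.values()),                      # bar height
            "top_words": [w for w, _ in counts.most_common(top_n)],
            "first_offset": min(o for o, _, _ in items),        # time label
            "partners": sorted({p for _, p, _ in items}),       # icons to draw
        }
    return summary


def seek_offset(bin_index, summary):
    """Playback position in seconds for a click on a bar, or None if empty."""
    entry = summary.get(bin_index)
    return None if entry is None else entry["first_offset"]


# Invented sample data: speaker C talking to B.
records = [
    (184, "B", ["cost", "problem"]),
    (225, "B", ["base", "cost"]),
    (285, "B", ["cost"]),
]
summary = annotate(records)
```

In this sketch a click on a bar seeks to the earliest utterance in that bin; a real implementation could equally seek to the utterance containing the displayed word.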

Next, another display example is described. FIG. 6 shows a display example when A, B, and C are set as speakers. As shown in FIG. 6, the control unit 7 displays, for each speaker (A, B, C), a line graph in which the number of appearances of words in the utterance content of the utterance data is associated with the time axis of the speaker data. In FIG. 6, the horizontal axis indicates the elapsed conference time and is displayed so that the conference start time and end time can be seen. The vertical axis indicates the number of words spoken by each speaker. The window 20 displays the speaker names A, B, and C set by the speaker setting unit 5.

When displaying the line graph in which the number of appearances of words in the utterance content of the utterance data is associated with the time axis of the speaker data, the control unit 7 displays icons A, B, and C, representing the conversation partners A, B, and C, in association with the graph. The line L1, "A's remarks", literally represents when and how many words A spoke. A face icon appearing on a line represents the conversation partner at that point; on the line L1 of "A's remarks", it represents A's conversation partner at that time. In a meeting of four people, A to D, the face icon of one of the other participants B, C, or D appears on the line L1 of "A's remarks". Conversely, A's own face icon never appears on the line of A's remarks. L21 is the line representing B's remarks. It rises only at that point because B did not speak at all in the periods immediately before and after it (B spoke for a while after the start of the meeting, then did not speak for a period, and spoke a little toward the end of the meeting). When the user performs a predetermined operation on the graph, such as moving or clicking the bar 21, the control unit 7 plays back the corresponding conference video.
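The per-speaker line graph can be sketched as one series per speaker over a shared time axis, with zeros in the bins where that speaker said nothing, which is what makes a line rise only where the speaker actually spoke. The function `speaker_series`, the 60-second bin width, and the sample records are assumptions made for illustration.

```python
def speaker_series(records, speakers, n_bins, bin_seconds=60):
    """Return {speaker: [word count per bin]}, with 0 where the speaker was silent.

    records: list of (offset_seconds, speaker, word_count), with offsets
    measured from the start of the meeting.
    """
    series = {s: [0] * n_bins for s in speakers}
    for offset, speaker, word_count in records:
        index = offset // bin_seconds
        # Ignore speakers that were not selected and offsets outside the meeting.
        if speaker in series and 0 <= index < n_bins:
            series[speaker][index] += word_count
    return series


# Invented sample data for a five-minute stretch of a meeting.
records = [
    (10, "A", 5), (70, "A", 3), (130, "B", 4), (200, "A", 2), (250, "C", 6),
]
series = speaker_series(records, ["A", "B", "C"], n_bins=5)
```

Each list in `series` can be handed to any plotting library as the y-values of one line, with the bin index (or the corresponding clock time) as the x-axis.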

Next, the operation of the conference history display device 1 is described. FIG. 7 is an operation flowchart of the conference history display device 1. In step S1, the voice recognition unit 3 uses a predetermined voice recognition technique to generate a transcript from the speakers' voice data stored in the conference record database 2, thereby creating utterance data, and stores the speaker data in the utterance database 4 in association with the utterance start time, utterance end time, speaker, and utterance partner. In step S2, the user uses the speaker setting unit 5 to set the speakers to be totaled by the totaling unit 6.

In step S3, the totaling unit 6 analyzes the text in the "utterance content" field of the extracted data and totals the number of appearances of words in the utterance content of the utterance data in correspondence with the time axis of the utterance data. In step S4, the control unit 7 displays a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data, as shown in FIGS. 5 and 6.

According to the conference history display device of this embodiment, when a participant is set, the device searches the data that has been voice-recognized and converted into text in advance, and visually displays the result of analyzing that participant's utterance frequency and utterance content. This makes it easy to see who said what and when, and allows the interests and stance of each participant to be grasped accurately and intuitively. That is, when a meeting in which at least the participants' voices were recorded is reviewed later, the interests and attitude of each participant can be grasped easily and intuitively.

The information processing apparatus according to the present invention is realized using, for example, a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. Each step of the information processing method is realized by installing the program from a hard disk device or a portable storage medium such as a CD-ROM, DVD, or flexible disk, or by downloading it over a communication circuit, and having the CPU execute the program. The program causes a computer to execute a step of acquiring utterance data, a step of totaling, based on the utterance data, the number of appearances of words in the utterance data in correspondence with the time axis, and a step of generating information for displaying a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data. The utterance database 4 corresponds to the storage means for storing utterance data, and the voice recognition unit 3 corresponds to the voice processing means.

Although preferred embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

FIG. 1 is a block diagram of the conference history display device.
FIG. 2 is a diagram showing the contents of the utterance database 4.
FIG. 3 is a diagram showing the utterance data extracted from the utterance database 4 when C is set as the speaker.
FIG. 4 is an example in which only nouns are extracted from the text.
FIG. 5 is a display example when C is set as the speaker.
FIG. 6 is a display example when A, B, and C are set as speakers.
FIG. 7 is an operation flowchart of the conference history display device.

Explanation of symbols

1 Conference history display device
2 Conference record database
3 Voice recognition unit
4 Utterance database
5 Speaker setting unit
6 Totaling unit
7 Control unit
8 Display unit
9 Speaker

Claims

An information processing apparatus comprising: storage means for storing utterance data; and aggregation means for aggregating, based on the utterance data stored in the storage means, the number of appearances of words in the utterance data in correspondence with a time axis of the utterance data.

The information processing apparatus according to claim 1, further comprising a control unit that displays a graph in which the number of appearances of words in the utterance data is associated with a time axis of the speaker data.

The information processing apparatus according to claim 1, further comprising a control unit that displays, for each speaker, a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data.

The information processing apparatus according to claim 2, wherein the control unit displays, in association with the graph, word names having a large number of appearances among the words in the utterance data.

The information processing apparatus according to claim 2, wherein the control unit displays, in association with the graph, word names having a large number of appearances among the words in the utterance data together with their utterance times.

The information processing apparatus according to claim 2, wherein the control unit displays an utterance partner in association with the graph.

The information processing apparatus according to claim 2, wherein the control unit reproduces a corresponding video by performing a predetermined operation on the graph.

The information processing apparatus according to claim 1, further comprising setting means for setting a speaker to be aggregated by the aggregation means.

The information processing apparatus according to claim 1, further comprising voice processing means for creating the utterance data by generating a transcript from the voice data of a speaker using a predetermined voice recognition technology.

The information processing apparatus according to claim 1, wherein the utterance data is created from voice data detected during a meeting.

The information processing apparatus according to claim 1, wherein the utterance data includes an utterance start time, an utterance end time, a speaker, an utterance content, and an utterance partner.

An information processing method comprising: a step of acquiring utterance data stored in predetermined storage means; and a step of counting, based on the utterance data, the number of appearances of words in the utterance data in correspondence with a time axis of the utterance data.

The information processing method according to claim 12, further comprising a step of displaying a graph in which the number of appearances of words in the utterance data is associated with a time axis of the speaker data.

A program for causing a computer to execute: a step of acquiring utterance data; a step of counting, based on the utterance data, the number of appearances of words in the utterance data in correspondence with the time axis; and a step of generating information for displaying a graph in which the number of appearances of words in the utterance data is associated with the time axis of the speaker data.