JP2013222347A - Minute book generation device and minute book generation method

Info

Publication number: JP2013222347A
Application number: JP2012094157A
Authority: JP
Inventors: Toru Aida; 徹相田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2012-04-17
Filing date: 2012-04-17
Publication date: 2013-10-28

Abstract

PROBLEM TO BE SOLVED: To make the situation of a meeting and its reference materials graspable at a glance, and to make necessary information easy to access. SOLUTION: The minute book generation device includes: voice processing means for collecting the voices of the meeting attendees with a microphone, converting the collected voice into voice data, and storing the voice data in a storage part; voice recognition means for analyzing the voice data to identify the attendees of the meeting and to specify the utterance sections and non-utterance sections of each attendee; reference material processing means for storing, in the storage part, the reference path of a file opened by an attendee and the reference page of that material; operation means for setting the reference materials, inputting the meeting information, and instructing reproduction of the minute book; and operation/display processing means for displaying the data stored in the storage part by the voice processing means and the reference material processing means as bands along a per-attendee time axis of the minute book, together with icons for the reference materials and reference pages and icons showing the utterance situation.

Description

The present invention relates to a minutes generation apparatus and a minutes generation method, and more particularly to a technique well suited to letting a reader grasp the situation of a meeting and its reference materials at a glance.

Conventionally, electronic minutes generation apparatuses for meetings have been proposed.
For example, the recording device described in Patent Document 1 maps the situation (atmosphere) of a meeting onto a timeline chart. In utterance sections it identifies cheerful voices, angry voices, and the like; in non-utterance sections it identifies applause, laughter, noises, and the like. These are mapped using icons, and a desired section can be played back by an operation such as a click.

The minutes creation apparatus described in Patent Document 2 can automatically generate minutes that record the statements made at a meeting in chronological order without omission and that identify the participant who made each statement. It can also record media information, such as the content written on a whiteboard used in the meeting and images shown with a projector or the like, in the minutes in chronological order, and can output the statements and media information to a terminal.

The information processing apparatus of the electronic conference system described in Patent Document 3 displays each speaking time slot of each conference member together with the content of the statement. For example, the content of each member's statement is displayed as an icon, and the icon briefly suggests the content of the statement or indicates the type of media used. Through this display, the progress of the proceedings, for example what questions and answers took place, can be seen at a glance. The apparatus also includes statement playback instruction means that makes it possible to instruct playback of the statement corresponding to each statement display area.

In the automatic recording device described in Patent Document 4, while a material used in the meeting is displayed on the display device, temporary storage means stores the material number, voice input means continuously inputs the participants' statements, and statement selection means determines whether each statement is public or private and selects the public ones. When a statement to be stored is determined, index creating means creates an index that associates the material number with that statement, and statement recording means stores the statement and its index in a storage device once both have been determined.

JP 2010-054991 A
JP 2004-287201 A
JP-A-6-266632
JP-A-4-82357
JP 2002-91482 A

Patent Documents 1 to 4 described above share the problem that a person who did not attend the meeting needs time to understand how the meeting went. In terms of understanding the situation, the device of Patent Document 1 is the most effective. According to Patent Document 1, minutes such as those shown in FIG. 10 are generated: the utterance sections and non-utterance sections of the attendees are displayed separately, the atmosphere of each section is displayed in a list, and the statements of a desired section can be referenced.

However, it is not possible to tell at a glance which material was being discussed. That is, there is the problem that one cannot understand what the discussion was about without referring to the statements themselves.
In view of the above problems, an object of the present invention is to make the situation of a meeting and its reference materials graspable at a glance, and to make necessary information easy to access.

The minutes generation apparatus of the present invention is a minutes generation apparatus that creates and plays back the minutes of a meeting, comprising: voice processing means for picking up the voices of the attendees of the meeting with a microphone, converting the picked-up voice into digitized voice data, and storing it in a storage unit; voice recognition means for analyzing the voice data to identify the attendees of the meeting and to identify the utterance sections and non-utterance sections of each attendee; reference material processing means for detecting a material referenced by an attendee while the meeting is in progress and storing the path of the material and the reference page of the reference material in a storage unit; operation means for setting the reference materials, inputting meeting information, and instructing playback of the minutes; and operation/display processing means for displaying the data stored in the respective storage units by the voice processing means and the reference material processing means as bands along a per-attendee time axis on the minutes of the meeting, together with icons for the reference materials and reference pages and icons indicating the utterance situation.

According to the present invention, information on a meeting that has been held is expressed in time series using icons. Each icon represents the situation of the meeting or a reference material, and links to the discussion content, the reference material, and the reference page. This makes it possible to grasp the situation of the meeting and its reference materials at a glance, and to access desired information easily.

A diagram showing an embodiment of the present invention and a configuration example of the minutes generation system.
A block diagram showing a configuration example of the minutes generation apparatus of the first embodiment.
A diagram showing an example of the minutes before the start of a meeting in the first embodiment.
A diagram showing an example of the minutes after a meeting in the first embodiment.
A flowchart showing the flow of processing in the minutes generation apparatus of the first embodiment.
A block diagram showing a configuration example of the minutes generation apparatus of the second embodiment.
A diagram showing an example of the minutes before the start of a meeting in the second embodiment.
A diagram showing an example of the minutes after a meeting in the second embodiment.
A flowchart showing the flow of processing in the minutes generation apparatus of the second embodiment.
A diagram showing an example of minutes in the prior art.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[First Embodiment]
The first embodiment of the present invention is described below using the configuration diagram of FIG. 1, the block diagram of FIG. 2, the diagram of the minutes template before the start of a meeting in FIG. 3, and the diagram of the minutes after a meeting in FIG. 4.
FIG. 1 is a diagram showing a configuration example of the minutes generation apparatus of this embodiment, and FIG. 2 is a block diagram of the minutes generation apparatus.

In the configuration diagram of FIG. 1, the minutes generation apparatus 101 is installed in the room where the meeting is held and is connected to a microphone 102 for picking up the voices of the attendees. Because the minutes generation apparatus 101 is also used as the terminal that displays the meeting materials (the presentation terminal), it is connected to a display device 103 such as a projector or a large-screen display. The operator in FIG. 1 may be the person presenting the materials, or a dedicated operator may be assigned; there is no particular limitation.

The minutes generation apparatus 101 of this embodiment has functions for creating and playing back the minutes of a meeting and, as shown in FIG. 2, comprises a voice processing unit 21, a reference material processing unit 22, a text minutes generation unit 23, and an operation/display processing unit 24.
The voice processing unit 21 comprises a sound collection unit 201, a voice data storage unit 202, a voice recognition unit 203, an emotion classification unit 204, and a playback unit 205.

The reference material processing unit 22 comprises a reference material path storage unit 206 and a reference page section information storage unit 207.
The text minutes generation unit 23 comprises a voice/text conversion unit 211 and a minutes addition/storage unit 212.
The operation/display processing unit 24 comprises an operation unit 208, a display/playback control unit 209, and a display unit 210. The operation unit 208 is used to set the reference materials, input meeting information, and instruct playback of the minutes. The storage means in each storage unit is realized by a CPU, RAM, ROM, hard disk drive, and the like (not shown).

Next, the operation of the minutes generation apparatus 101 of this embodiment will be described.
First, before the meeting, the person in charge of the minutes starts the minutes generation apparatus 101 and, in a minutes template such as that shown in FIG. 3, registers the meeting information in the meeting information field 301, the attendee names in the attendee field 302, and each attendee's materials in the material field 303. Here, the meeting information is information such as the meeting name and venue, and is entered as necessary. The materials may be referenced over a network such as a LAN, or may be downloaded to an external storage device of the minutes generation apparatus 101 and referenced from there.
When these preparations are complete and the meeting is ready to start, the meeting start button 304 is pressed. Voice processing and reference material processing then continue until the meeting end button 305 is pressed.

Each process will be described with reference to FIG. 2.
First, the voice processing unit 21 will be described. Pressing the meeting start button 304 starts the recording of voice data. Voice is picked up by the sound collection unit 201, which collects the voices of a plurality of attendees (for example, attendees A, B, C, and D), digitizes them into digital voice signals, and inputs them to the voice data storage unit 202. The sound collection unit 201 has one or more microphones; these may be a plurality of microphones assigned to individual attendees, or a single microphone that picks up the voices of all attendees together.

The voice recognition unit 203 analyzes the digital voice signal input from the voice data storage unit 202 and detects the breaks between each attendee's statements, thereby determining the utterance sections and non-utterance sections of each attendee. For each utterance section it identifies the speaker, and for each non-utterance section it identifies the situation in that section (silence, applause, noise, and so on). It then outputs the identified voice data to the emotion classification unit 204.
Each attendee's statements can be distinguished by recognizing the attendee's voice waveform, which is registered in advance. Alternatively, when each attendee has an individual microphone, the speaker can be identified by which microphone picked up the voice.
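The per-microphone variant described above can be illustrated with a minimal Python sketch. It is not the method of the patent: the frame-energy representation, the fixed threshold, and the names `Section` and `segment` are assumptions made only for illustration. Each frame is attributed to the loudest microphone if that microphone exceeds the threshold, and consecutive frames with the same result are merged into one section.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Section:
    start: int               # first frame index (inclusive)
    end: int                 # last frame index (exclusive)
    speaker: Optional[str]   # None marks a non-utterance section


def segment(energies: Dict[str, List[float]], threshold: float) -> List[Section]:
    """Split frames into utterance and non-utterance sections.

    `energies` maps each attendee's microphone to per-frame energy values
    (a hypothetical input format). All lists must have the same length.
    """
    n_frames = len(next(iter(energies.values())))
    sections: List[Section] = []
    for i in range(n_frames):
        # Pick the loudest microphone for this frame.
        name, level = max(((n, e[i]) for n, e in energies.items()),
                          key=lambda pair: pair[1])
        speaker = name if level > threshold else None
        if sections and sections[-1].speaker == speaker:
            sections[-1].end = i + 1          # extend the current section
        else:
            sections.append(Section(i, i + 1, speaker))
    return sections
```

With a single shared microphone this approach does not apply; speaker identification would then have to rely on the pre-registered voice waveforms mentioned in the text.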

The emotion classification unit 204 analyzes the utterance sections and classifies the results as the atmosphere of the meeting. This emotion analysis can be realized, for example, by applying the technique described in Patent Document 5.
Note that the storage of voice data, voice recognition, and emotion classification can also be performed simultaneously in parallel.

Next, the reference material processing unit 22 will be described.
The minutes generation apparatus 101 is also used as a terminal for displaying presentation materials. Therefore, when a material in the material field is selected by a click or the like, it is opened and displayed by an appropriate application. At this time, the opened file is detected by means of an access command or the like, and its reference path is stored by the reference material path storage unit 206. The reference page number is also obtained by the same access command, and the reference page information is stored in the reference page section information storage unit 207. Because the voice data and this reference material information are managed with timestamps on a common time base, the voice data and the reference material data are linked in time.
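One way to picture the timestamp link between voice data and reference material data is a log of page-display events that can be queried by time. The sketch below is an assumption-laden illustration, not the patent's storage format: `ReferenceLog`, `record`, and `at` are hypothetical names, times are seconds from the start of the meeting, and events are assumed to be recorded in chronological order.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ReferenceLog:
    times: List[float] = field(default_factory=list)
    entries: List[Tuple[str, int]] = field(default_factory=list)

    def record(self, t: float, path: str, page: int) -> None:
        """Store that `page` of the file at `path` was displayed from time t."""
        self.times.append(t)
        self.entries.append((path, page))

    def at(self, t: float) -> Optional[Tuple[str, int]]:
        """Return the (path, page) on display at time t, or None if nothing yet."""
        i = bisect.bisect_right(self.times, t) - 1
        return self.entries[i] if i >= 0 else None
```

Given the timestamp of any point in the recorded voice data, `at` returns the material and page that were on screen at that moment, which is the lookup needed when a section of the minutes is played back.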

Next, the text minutes generation unit 23 will be described. The voice data is converted into text by the voice/text conversion unit 211 and, together with the identification information of each attendee and the reference material information, is continually updated and appended as text-based minutes by the minutes addition/storage unit 212.

Next, the operation/display processing unit 24 will be described.
The operation/display processing unit 24 of this embodiment displays the data stored by the voice processing unit 21 and the reference material processing unit 22 as bands along a per-attendee time axis on the minutes, together with icons for the reference materials and icons indicating the utterance situation.

FIG. 4 shows the screen listing statements and reference materials that is displayed on the display of the minutes generation apparatus. In this display, attendees are placed on the vertical axis and elapsed time on the horizontal axis. Each reference material is shown as a band-shaped icon whose length corresponds to the time for which it was referenced, while the voice data is subdivided into fixed unit times (for example, 3 minutes) and an icon indicating the situation is placed for each attendee. Of course, it is also possible to show the reference materials as icons per unit time and the voice data as band-shaped icons whose length corresponds to the speaking time.
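The layout described above, with attendees on the vertical axis and unit-time cells on the horizontal axis, can be sketched in Python as a simple grid. The tuple layout of an utterance, the single-character icons, and the function names `build_grid` and `render` are assumptions for illustration only; only the 3-minute unit comes from the text.

```python
from typing import Dict, List, Sequence, Tuple

UNIT_SECONDS = 180  # the 3-minute unit given as an example in the text

# (attendee, start second, end second (exclusive), icon): a hypothetical layout
Utterance = Tuple[str, int, int, str]


def build_grid(utterances: Sequence[Utterance], attendees: Sequence[str],
               meeting_seconds: int, unit: int = UNIT_SECONDS) -> Dict[str, List[str]]:
    """Return one row of icons per attendee, one cell per unit of time."""
    n_slots = -(-meeting_seconds // unit)  # ceiling division
    grid = {name: ["."] * n_slots for name in attendees}
    for name, start, end, icon in utterances:
        first = start // unit
        last = min((end - 1) // unit, n_slots - 1)
        for slot in range(first, last + 1):
            grid[name][slot] = icon  # a later utterance overwrites an earlier one
    return grid


def render(grid: Dict[str, List[str]]) -> str:
    """Draw attendees on the vertical axis and elapsed time on the horizontal axis."""
    return "\n".join(f"{name} |{''.join(cells)}|" for name, cells in grid.items())
```

A real display would draw graphical icons and add a separate band for the reference materials; the text grid only shows how utterances fall into unit-time cells.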

FIG. 4 shows an example of the minutes generated in the first embodiment of the present invention, corresponding to the conventional minutes shown in FIG. 10. This embodiment is a minutes generation apparatus that is especially useful to those who did not attend the meeting. For example, the minutes are particularly useful as a report to a supervisor who did not attend the meeting but is responsible for its conclusions.

In the minutes generated by the minutes generation apparatus of this embodiment, icons indicating the atmosphere of the voice and icons indicating the reference materials are displayed in correspondence with the discussion time, which makes the following points clear.
(1) Which material the discussion was about.
(2) What the atmosphere was like in the process leading to the conclusion.
(3) The discussion time, that is, how the discussion went in the process leading to the conclusion, can be seen at a glance together with information on the referenced materials. For example, the following situations can be understood at a glance.
(4) The discussion of one material reached a conclusion in a short time (suggesting that the conclusion is likely to be sound).
(5) The discussion of another material reached a conclusion only after many twists and turns (suggesting that it is better to check how the conclusion was reached and judge its validity).
As a result, even a person who did not attend the meeting can understand the key points, and can easily and quickly refer, by a mouse click or the like, to the voice and materials of any discussion judged to need checking.

Because a change in the situation of a meeting often marks an important discussion in that meeting, such changes can also be shown with icons so that they are easy to see. For example, the point at which a calm atmosphere turns noisy is highlighted with an icon indicating the change. Conversely, a transition from a noisy atmosphere to silence is highlighted in the same way. This allows the minutes to be displayed accurately during or after the meeting, so that changes in the situation of the meeting can be seen at a glance.
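Finding the points to highlight amounts to locating positions where the atmosphere label changes from one unit of time to the next. A minimal sketch, assuming the atmosphere has already been reduced to one label per unit of time (the labels and the function name `change_points` are hypothetical):

```python
from typing import List, Tuple


def change_points(labels: List[str]) -> List[Tuple[int, str, str]]:
    """Return (index, previous, current) wherever the atmosphere label changes."""
    return [
        (i, labels[i - 1], labels[i])
        for i in range(1, len(labels))
        if labels[i] != labels[i - 1]
    ]
```

For the sequence calm, calm, noisy, noisy, silent, this reports a change at index 2 (calm to noisy) and at index 4 (noisy to silent), matching the two kinds of transition named in the text.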

Next, an example of the processing for generating the minutes in this embodiment will be described with reference to the flowchart of FIG. 5.
First, in S501, the initial settings needed to display the minutes template are made. Here, the meeting information, attendees, reference materials, and so on are registered. After the initial settings, once the meeting is ready to start, the meeting start button 304 is pressed in S502 and the process proceeds to S503.

In S503, recording of the proceedings continues until the meeting end button 305 is pressed ("Yes" in S503). While the determination in S503 is "No", the flow for voice processing and the flow for reference material processing are executed in parallel.

First, the flow of voice processing will be described.
In S504, the voices of a plurality of meeting attendees (for example, attendees A, B, C, and D) are picked up, converted into digital voice signals, and recorded as voice data. In the subsequent S505, the digital voice signal is analyzed and the breaks between each attendee's statements are detected, thereby detecting the utterance sections and non-utterance sections of each attendee. For each utterance section the speaker is identified, and for each non-utterance section the situation in that section (silence, applause, noise, and so on) is identified.

In the subsequent S506, the utterance sections are analyzed and the results are classified into atmospheres such as "laughing", "angry", and "gloomy". This emotion analysis can be realized, for example, by applying the technique described in Patent Document 5 mentioned above. Note that the storage of voice data, voice recognition, and emotion classification can also be performed simultaneously in parallel.

Next, the flow of reference material processing will be described. The minutes generation apparatus 101 is also used as a terminal for displaying presentation materials. Therefore, when a material in the material field is selected by a click or the like, it is opened and displayed by an appropriate application. At this time, in S507, the opened file is detected by means of an access command or the like, and its reference path is stored.
Next, in S508, the reference page number is obtained by the same access command, and the reference page information is stored.

The voice data is further converted into text in S509, and in the subsequent S510 it is continually updated and appended as text-based minutes, together with the identification information of each attendee and the reference material information.
In S511, the data stored by the voice data processing and the reference material processing are displayed as bands along a per-attendee time axis on the minutes, together with icons for the reference materials and icons indicating the utterance situation.
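The recording loop of FIG. 5, in which the voice flow and the reference material flow run in parallel until the end button is pressed, might be sketched as follows. This is only an illustration under stated assumptions: `record_meeting`, `voice_step`, and `reference_step` are hypothetical names, and the end button is modelled as a sequence of booleans polled once per iteration rather than a real user interface event.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple


def record_meeting(end_pressed_checks: Iterable[bool],
                   voice_step: Callable[[], object],
                   reference_step: Callable[[], object]) -> List[Tuple[object, object]]:
    """Run both flows in parallel for each pass through the S503 decision."""
    results: List[Tuple[object, object]] = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for end_pressed in end_pressed_checks:
            if end_pressed:                        # "Yes" in S503: stop recording
                break
            voice = pool.submit(voice_step)        # stands in for S504 to S506
            ref = pool.submit(reference_step)      # stands in for S507 and S508
            results.append((voice.result(), ref.result()))
    return results
```

Waiting for both results on every pass keeps the sketch deterministic; an actual device would more likely let each flow run continuously and synchronize only through the shared timestamps.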

Next, the playback operation will be described.
The database needed to play back the minutes is stored in the storage area of the minutes generation apparatus 101 and can be played back at any time as needed. When playback is started, minutes such as those shown in FIG. 4 are displayed.

In FIG. 4, 401 is an icon indicating a link to Mr. A's material, 402 a link to a statement by Mr. B in a cheerful mood, 403 a link to a statement by Mr. C in a gloomy mood, 404 a link to a statement by Mr. A in an angry mood, and 405 a link to a statement by Mr. D in a laughing mood. Further, 406 indicates a voiced section and 407 indicates a silent section.

Further, 408 denotes an icon that links to the text-based minutes. When the operator clicks the reference material icon or voice icon of a desired section with a mouse or the like, the material that was being referenced in that section is displayed and the voice from the same time is played back. This playback can be stopped by a certain operation (for example, pressing the Esc key). After stopping playback, the operator can continue reading through the minutes by specifying the next desired section.

As described above, the information on a meeting that has been held can be expressed in time series by voice icons indicating the atmosphere of the meeting and icons indicating the reference materials, showing the periods of speaking time and reference time. Furthermore, the voice icons can highlight the points at which the voice situation changes, which are likely to be key points of the meeting.

Each icon links to the voice data or to the reference page of a reference material. This makes it possible to grasp the situation of the meeting at a glance and to access desired information easily.
In addition, clicking the link 408 to the text-based minutes displays those minutes for viewing.
As described above, playback of the minutes is driven by the viewer's operations, so no flowchart like the one for recording is defined for it.

[Second Embodiment]
The second embodiment of the invention is described below using the block diagram of FIG. 6, the diagram of the minutes template before the start of a meeting in FIG. 7, and the diagram of the minutes after a meeting in FIG. 8. The configuration diagram of the second embodiment is the same as FIG. 1, the configuration diagram of the first embodiment, so its description is omitted.

FIG. 6 is a block diagram of the minutes generation apparatus of the second embodiment. In this embodiment, the minutes generation apparatus comprises a voice processing unit 61, a reference material processing unit 62, a text minutes generation unit 63, and an operation/display processing unit 64. The storage means in each storage unit is realized by a CPU, RAM, ROM, hard disk drive, and the like (not shown).

First, before the meeting, the person in charge of the minutes starts the minutes generation apparatus and, in a minutes template such as that shown in FIG. 7, registers the meeting information in the meeting information field 701, the patient names in the patient name field 702, each patient's materials in the material field 703, and the attendees in the attendee field 704. This minutes template is characterized in that the vertical axis is organized not by attendee but by item to be discussed.

This example assumes a preoperative conference at a hospital, with patient names on the vertical axis. When these preparations are complete and the meeting is ready to start, the meeting start button 706 is pressed. Voice processing and reference material processing then continue until the meeting end button 707 is pressed.

Next, each process will be described with reference to FIG. 6.
The audio processing unit 61 will be described first. Pressing the conference start button 706 starts the recording of audio data. The sound collection unit 601 through the reproduction unit 605 of the audio processing unit operate in the same way as the sound collection unit 201 through the reproduction unit 205 shown in FIG. 2 of the first embodiment, so their description is omitted.

Next, the reference material processing unit 62 will be described. The reference material path storage unit 606 and the reference page section information storage unit 607 operate in the same way as the reference material path storage unit 206 and the reference page section information storage unit 207 shown in FIG. 2 of the first embodiment, so their description is omitted.
Next, the text minutes generation unit 63 will be described. The voice/text conversion unit 611 and the minutes addition/storage unit 612 operate in the same way as the voice/text conversion unit 211 and the minutes addition/storage unit 212 shown in FIG. 2 of the first embodiment, so their description is omitted.

For the data converted to text by the voice/text conversion unit 611, the keyword detection unit 613 detects, by voice recognition, a pre-registered start keyword (for example, “summarization starts from now”). Using this detection as a trigger, display control is performed so that subsequent remarks are shown as text in the summary field for each patient.

In addition, a pre-registered end keyword (for example, “This will end summarization”) is detected by voice recognition, and the function is suspended until the start keyword is next detected. This allows all participants to reconfirm and share the decisions made in the conference, avoids differences in understanding, and makes the decisions visible at a glance even after the conference.
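The start/end keyword behavior can be sketched as a two-state toggle. This is an illustration under stated assumptions, not the apparatus itself: the class `SummaryRecorder` and its `feed` method are hypothetical, the keyword strings are the English renderings of the examples in the text, recognition is reduced to a case-insensitive substring match, and the patient associated with each remark is assumed to be supplied by the caller.

```python
# English renderings of the example keywords; real keywords would be pre-registered.
START_KEYWORD = "summarization starts from now"
END_KEYWORD = "this will end summarization"


class SummaryRecorder:
    """Hypothetical sketch of keyword-triggered summary capture."""

    def __init__(self):
        self.active = False   # True while remarks are copied to the summary field
        self.summaries = {}   # patient name -> list of summary remarks

    def feed(self, patient, utterance):
        """Process one recognized remark that has already been converted to text."""
        text = utterance.strip().lower()
        if START_KEYWORD in text:
            self.active = True    # start keyword: begin writing to the summary field
            return
        if END_KEYWORD in text:
            self.active = False   # end keyword: suspend until the next start keyword
            return
        if self.active:
            self.summaries.setdefault(patient, []).append(utterance)
```

The keyword remarks themselves are not copied into the summary in this sketch; whether the apparatus displays them is not stated in the text.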

Next, the operation/display processing unit 64 will be described. The operation unit 608 through the display unit 610 operate in the same way as the operation unit 208 through the display unit 210 shown in FIG. 2 of the first embodiment, so their description is omitted.
FIG. 8 is a diagram showing the remark/reference material list screen displayed on the display of the minutes generation apparatus. In this display, patients are arranged on the vertical axis and elapsed time on the horizontal axis. For each reference material, a band-shaped icon whose length corresponds to the reference time is displayed, while the audio data is subdivided into unit-time segments (for example, 3 minutes) and icons indicating the situation are arranged by patient and by attendee. The patient time axis on which an audio icon should be displayed can be determined from which patient's material is being referenced.
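One way to place each unit-time audio icon on a patient row can be sketched as follows. The text says only that the row is determined from which patient's material is being referenced; the tie-breaking rule used here (the patient whose material overlaps the segment longest) is an assumption, as are the function name `assign_segments` and the `(start, end, patient)` tuple layout.

```python
UNIT_SECONDS = 180  # unit time of 3 minutes, as in the example in the text


def assign_segments(conference_seconds, reference_intervals, unit=UNIT_SECONDS):
    """Return (segment_start, patient) pairs, one per unit-time audio segment.

    reference_intervals: list of (start_sec, end_sec, patient) tuples recording
    when each patient's material was referenced (hypothetical layout).
    patient is None when no material was referenced during the segment.
    """
    placed = []
    for seg_start in range(0, conference_seconds, unit):
        seg_end = min(seg_start + unit, conference_seconds)
        overlap = {}
        for start, end, patient in reference_intervals:
            shared = min(end, seg_end) - max(start, seg_start)
            if shared > 0:
                overlap[patient] = overlap.get(patient, 0) + shared
        # Assumed rule: choose the patient whose material was open the longest.
        patient = max(overlap, key=overlap.get) if overlap else None
        placed.append((seg_start, patient))
    return placed
```

For a 9-minute conference in which Patient A's material is open for the first 200 seconds and Patient B's for the remainder, the first segment falls on A's row and the other two on B's.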

Because the moments at which the situation in a conference changes often coincide with important discussion, these changes can also be shown with easily recognized icons. For example, the point at which a calm atmosphere turns noisy is highlighted with an icon indicating the change. Conversely, a transition from a noisy atmosphere to silence is highlighted in the same way. Changes in the situation during the conference can thus be seen at a glance.
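Finding the points to highlight amounts to comparing the atmosphere of each unit-time segment with that of the preceding one. The sketch below assumes that each segment has already been classified into a label; the label strings and the function name `change_points` are hypothetical.

```python
def change_points(atmospheres):
    """Return indices of segments whose atmosphere differs from the previous segment.

    atmospheres: list of labels, one per unit-time segment
    (for example "calm", "noisy", "silent"; the labels are illustrative).
    """
    return [
        i for i in range(1, len(atmospheres))
        if atmospheres[i] != atmospheres[i - 1]
    ]
```

With the sequence calm, calm, noisy, noisy, silent, the third and fifth segments (indices 2 and 4) would be highlighted, covering both the calm-to-noisy and the noisy-to-silence transitions mentioned above.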

Next, the processing for minutes generation in this embodiment will be described with reference to the flowchart of FIG. 9.
First, in S901, the initial settings needed to display the minutes template are made. Here, the conference information, patients, attendees, reference materials, and so on are registered. After the initial settings, once the conference is ready to begin, the conference start button is pressed in S902.
The subsequent steps S903 to S911 are the same as steps S503 to S511 shown in FIG. 5 of the first embodiment, so their description is omitted.

In S912, a pre-registered start keyword (for example, “summarization starts from now”) is detected by voice recognition. If the start keyword is recognized (“Yes” in S912), then in S913 this serves as the trigger for writing text to the summary field.

If the start keyword is not recognized (“No” in S912), then in S914 a pre-registered end keyword (for example, “This will end summarization”) is detected by voice recognition. If the end keyword is recognized (“Yes” in S914), text display in the summary field is stopped in S915. If the end keyword is not recognized (“No” in S914), the process returns to S903.
The reproduction operation is the same as in the first embodiment, so its description is omitted.

As described above, information on a conference that has been held can be presented as audio icons indicating the atmosphere of the conference and icons indicating the reference materials, laid out in time series over periods corresponding to the speaking time and the reference time. In addition, the audio icons can highlight change points in the audio situation, which are likely to mark key points of the conference.

Each icon is linked to the audio data or to the referenced page of the reference material. This makes it possible to grasp the status of the conference at a glance and to access the desired information easily.
Furthermore, displaying a summary for each patient is useful for reconfirming and sharing information.
The text-based minutes can also be displayed and viewed by clicking the link to them.

(Other embodiments)
The present invention can also be realized by executing the following processing. That is, software (a computer program) that realizes the functions of the above-described embodiments is supplied to a system or apparatus via a network or various computer-readable storage media, and a computer (or CPU, MPU, or the like) of the system or apparatus reads out and executes the program.

DESCRIPTION OF SYMBOLS
201 sound collection unit, 202 audio data storage unit, 203 voice recognition unit, 204 emotion classification unit, 205 reproduction unit, 206 reference material path storage unit, 207 reference page section information storage unit, 208 operation unit, 209 display/reproduction control unit, 210 display unit, 211 voice/text conversion unit, 212 minutes addition/storage unit

Claims

1. A minutes generation apparatus for creating and playing back minutes of a conference, comprising:
voice processing means for collecting the voices of conference attendees with a microphone, converting the collected voices into digitized voice data, and storing the voice data in a storage unit;
voice recognition means for analyzing the voice data to identify the conference attendees and to identify the speech sections and non-speech sections of each attendee;
reference material processing means for detecting material referred to by an attendee during the conference and storing a path of the material and a reference page of the reference material in a storage unit;
operation means for setting the reference material, inputting conference information, and instructing reproduction of the minutes; and
operation/display processing means for displaying the data stored in the respective storage units by the voice processing means and the reference material processing means, on the minutes of the conference, in bands along a time axis for each attendee, together with icons of the reference material and reference page and icons indicating the speech status.

2. The minutes generation apparatus according to claim 1, further comprising emotion classification means for analyzing the speech sections and the non-speech sections and classifying the result as the atmosphere of the conference.

3. The minutes generation apparatus according to claim 1 or 2, wherein the operation/display processing means displays, on a display device, a minutes template having a conference information field for inputting conference information, an attendee field for inputting attendee names, and a material field for registering materials for each attendee.

4. The minutes generation apparatus according to any one of claims 1 to 3, further comprising: a reproduction unit that reproduces audio in response to an instruction from the operation/display processing means; and a display unit that displays conference data and the minutes during or after the conference.

5. The minutes generation apparatus according to claim 1, further comprising voice/text conversion means for converting the voice data into text, and minutes addition/storage means for appending the text to the minutes and storing it.

6. The minutes generation apparatus according to claim 1, further comprising keyword detection means for detecting a keyword from the voice data and performing display control.

7. The minutes generation apparatus according to any one of claims 1 to 6, wherein the operation/display processing means displays the audio information of the minutes, together with an icon indicating the atmosphere, along the time axis with a length corresponding to its duration, and displays the referenced material, together with an icon indicating that material, along the time axis with a length corresponding to the reference time.

8. The minutes generation apparatus according to any one of claims 1 to 6, wherein the operation/display processing means highlights a change point of the voice situation.

9. The minutes generation apparatus according to any one of claims 1 to 6, wherein the operation/display processing means displays the speech status and the reference material in bands for each attendee along the time axis.

10. The minutes generation apparatus according to claim 1, wherein the operation/display processing means displays the speech status and the reference material in bands for each item to be discussed along the time axis.

11. A minutes generation method for creating and playing back minutes of a conference, comprising:
a voice processing step of collecting the voices of conference attendees with a microphone, converting the collected voices into digitized voice data, and storing the voice data in a storage unit;
a voice recognition step of analyzing the voice data to identify the conference attendees and to identify the speech sections and non-speech sections of each attendee;
a reference material processing step of detecting material referred to by an attendee during the conference and storing a path of the material and a reference page of the reference material in a storage unit;
an operation step of setting the reference material, inputting conference information, and instructing reproduction of the minutes; and
an operation/display processing step of displaying the data stored in the respective storage units in the voice processing step and the reference material processing step, on the minutes of the conference, in bands along a time axis for each attendee, together with icons of the reference material and reference page and icons indicating the speech status.

12. A program that causes a computer to execute the steps of a minutes generation method for creating and playing back minutes of a conference, the steps comprising:
a voice processing step of collecting the voices of conference attendees with a microphone, converting the collected voices into digitized voice data, and storing the voice data in a storage unit;
a voice recognition step of analyzing the voice data to identify the conference attendees and to identify the speech sections and non-speech sections of each attendee;
a reference material processing step of detecting material referred to by an attendee during the conference and storing a path of the material and a reference page of the reference material in a storage unit;
an operation step of setting the reference material, inputting conference information, and instructing reproduction of the minutes; and
an operation/display processing step of displaying the data stored in the respective storage units in the voice processing step and the reference material processing step, on the minutes of the conference, in bands along a time axis for each attendee, together with icons of the reference material and reference page and icons indicating the speech status.

13. A computer-readable storage medium storing the program according to claim 12.