JPH07191690A

JPH07191690A - Minutes generation device and multispot minutes generation system

Info

Publication number: JPH07191690A
Application number: JP5348281A
Authority: JP
Inventors: Masaki Haranishi; 正樹原西
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1993-12-24
Filing date: 1993-12-24
Publication date: 1995-07-28

Abstract

PURPOSE: To provide a minutes generation device that can generate accurate minutes easily, speedily, and automatically without requiring a dedicated recording person. CONSTITUTION: Speech signals of first and second speakers, input through microphones 109 and 110, have their noise components removed by a noise processing part 102 and are converted by a speaker detection part 103 into speaker codes corresponding to the vocalizing speakers; the speaker codes are stored together with the speech signals as speech data in a speech data memory part 104. A speech recognition part 105 converts the speech data into character codes, and a system controller 107 converts the speaker codes in the speech data into the corresponding speakers' names. The character codes and speakers' names are output to the outside through an output part 106 to generate the minutes.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a minutes creating apparatus that creates minutes of ordinary meetings, and to a multipoint minutes creating system that creates minutes of meetings held over a multipoint video conference system or the like.

[0002]

[Prior Art] Conventionally, minutes have been created by having a dedicated recorder (secretary) either write down what each speaker says by hand as the meeting proceeds or record the meeting on a recording device such as a tape recorder, and then edit the record with a word processor or the like after the meeting ends.

[0003]

[Problems to Be Solved by the Invention] However, the conventional method described above has the problem that creating minutes requires a dedicated recorder with a certain amount of experience. In addition, recording by hand consumes a great deal of time and labor, and recording with a recording device requires playing back the recorded audio after the meeting ends in order to create the minutes, so that creating the minutes takes a great deal of time.

[0004] Furthermore, when creating minutes of a meeting held over a multipoint video conference system, it is difficult to determine whose voice each utterance from the multiple sites belongs to, so that creating the minutes takes a great deal of time.

[0005] The present invention has been made to solve the above problems. Its first object is to provide a minutes creating apparatus that can create accurate minutes easily, quickly, and automatically, without requiring a dedicated recorder.

[0006] A second object of the present invention is to provide a minutes creating apparatus that can create accurate minutes of a meeting held over a multipoint video conference system easily, quickly, and automatically.

[0007]

[Means for Solving the Problems] To achieve the first object, a first invention of the present invention (claim 1) comprises: voice signal input means that receives a speaker's voice signal and has a speaker identification unit that converts the voice signal into a speaker code identifying the corresponding speaker; voice data storage means that stores voice data formed by adding the speaker code output from the speaker identification unit to the voice signal input from the voice signal input means; voice recognition means that converts the voice signal in the voice data stored in the voice data storage means into character codes; and output means that outputs the speaker code in the voice data stored in the voice data storage means together with the character codes output from the voice recognition means.

[0008] To achieve the same object, a second invention of the present invention (claim 2) comprises: voice signal input means that receives a speaker's voice signal and has a speaker identification unit that converts the voice signal into a speaker code identifying the corresponding speaker; voice signal storage means that stores the voice signal input from the voice signal input means; speaker code storage means that stores the speaker code output from the speaker identification unit; voice recognition means that converts the voice signal stored in the voice signal storage means into character codes; and output means that outputs the character codes output from the voice recognition means together with the speaker code output from the speaker code conversion means.

[0009] To achieve the same object, a third invention of the present invention (claim 5) comprises: voice signal input means that receives a speaker's voice signal and has a speaker identification unit that converts the voice signal into a speaker code identifying the corresponding speaker; timekeeping means that measures the generation time of the voice signal input from the voice signal input means; voice data storage means that stores voice data formed by adding the speaker code output from the speaker identification unit to the voice signal input from the voice signal input means; voice recognition means that converts the voice signal in the voice data stored in the voice data storage means into character codes; character code storage means that stores the character codes output from the voice recognition means; display means that displays the character codes stored in the character code storage means as characters; editing means that edits the content displayed by the display means; speaker name registration means in which the names of the speakers corresponding to the speaker codes are registered in advance; speaker name conversion means that converts the speaker code stored in the speaker code storage means into the speaker's name registered in the speaker name registration means; and output means that outputs the character codes output from the voice recognition means with the speaker's name output from the speaker name conversion means, the generation time of the voice signal measured by the timekeeping means, and a device number identifying the apparatus added to them.

[0010] To achieve the same object, a fourth invention of the present invention (claim 6) comprises: voice signal input means that receives a speaker's voice signal and has a speaker identification unit that converts the voice signal into a speaker code identifying the corresponding speaker; timekeeping means that measures the generation time of the voice signal input from the voice signal input means; voice signal storage means that stores the voice signal input from the voice signal input means; speaker code storage means that stores the speaker code output from the speaker identification unit; voice recognition means that converts the voice signal stored in the voice signal storage means into character codes; character code storage means that stores the character codes output from the voice recognition means; display means that displays the character codes stored in the character code storage means as characters; editing means that edits the content displayed by the display means; speaker name registration means in which the names of the speakers corresponding to the speaker codes are registered in advance; speaker name conversion means that converts the speaker code stored in the speaker code storage means into the speaker's name registered in the speaker name registration means; and output means that outputs the character codes output from the voice recognition means with the speaker's name output from the speaker name conversion means, the generation time of the voice signal measured by the timekeeping means, and a device number identifying the apparatus added to them.

[0011] To achieve the same object, it is desirable to further provide speaker name registration means in which the names of the speakers corresponding to the speaker codes are registered in advance, and speaker name conversion means that converts the speaker code in the voice data stored in the voice data storage means into the speaker's name registered in the speaker name registration means, with the output means adding the speaker's name output from the speaker name conversion means to the character codes output from the voice recognition means before outputting them (claim 3).

[0012] To achieve the same object, it is also preferable to provide character code storage means that stores the character codes output from the voice recognition means, display means that displays the character codes stored in the character code storage means as characters, and editing means that edits the content displayed by the display means (claim 4).

[0013] To achieve the second object, a fifth invention of the present invention (claim 7) places the minutes creating apparatus according to the third or fourth invention at multiple sites and provides multipoint editing means that gathers the output data output from the output means of the minutes creating apparatus at each site in one place and edits it. The multipoint editing means creates the minutes of a multipoint conference by reordering the output data on the basis of the generation times of the voice signals measured by the timekeeping means.
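The reordering performed by the multipoint editing means can be pictured with a minimal Python sketch. The record layout, field names, speaker names, and sample utterances below are illustrative assumptions rather than anything specified in the source, and the sketch assumes that the timekeeping means at the different sites share a common time base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinutesEntry:
    """One output record from a single site's apparatus (field names are illustrative)."""
    generated_at: float   # generation time of the voice signal, in seconds
    device_number: int    # identifies the apparatus (site) that produced the entry
    speaker_name: str
    text: str


def merge_minutes(*site_outputs: list[MinutesEntry]) -> list[MinutesEntry]:
    """Gather entries from every site and reorder them by generation time."""
    gathered = [entry for site in site_outputs for entry in site]
    # sorted() is stable, so entries with equal times keep their gathering order.
    return sorted(gathered, key=lambda entry: entry.generated_at)


site_a = [MinutesEntry(0.0, 1, "Sato", "Shall we begin?"),
          MinutesEntry(9.5, 1, "Sato", "Agreed.")]
site_b = [MinutesEntry(4.2, 2, "Tanaka", "Yes, please go ahead.")]

merged = merge_minutes(site_a, site_b)
for entry in merged:
    print(f"[{entry.generated_at:5.1f}s] site {entry.device_number} "
          f"{entry.speaker_name}: {entry.text}")
```

Keeping the device number on every entry lets the merged minutes still show which site each utterance came from after sorting.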

[0014] To achieve the same object, a sixth invention of the present invention (claim 8) places the minutes creating apparatus according to the third or fourth invention at multiple sites and uses the editing means of one of those minutes creating apparatuses as multipoint editing means that gathers the output data output from the output means of the minutes creating apparatus at each site in one place and edits it. The multipoint editing means creates the minutes of a multipoint conference by reordering the output data on the basis of the generation times of the voice signals measured by the timekeeping means.

[0015]

[Operation] In the minutes creating apparatus of the first invention, the voice recognition means converts the voice signal in the voice data stored in the voice data storage means into character codes, the speaker identification unit converts the speaker's voice signal input from the voice signal input means into a speaker code, and the output means outputs the character codes and the speaker code in the voice data stored in the voice data storage means.

[0016] In the minutes creating apparatus of the second invention, the voice recognition means converts the voice signal stored in the voice signal storage means into character codes, the speaker identification unit converts the speaker's voice signal input from the voice signal input means into a speaker code, and the output means outputs the character codes and the speaker code stored in the speaker code storage means.

[0017] In the minutes creating apparatus of the third invention, the voice recognition means converts the voice signal in the voice data stored in the voice data storage means into character codes; the speaker identification unit converts the speaker's voice signal input from the voice signal input means into a speaker code; the names of the speakers corresponding to the speaker codes are registered in advance in the speaker name registration means; the speaker name conversion means converts the speaker code stored in the speaker code storage means into the speaker's name registered in the speaker name registration means; the display means displays the character codes stored in the character code storage means as characters; the editing means edits the content displayed by the display means; and the output means outputs the edited character codes with the speaker's name output from the speaker name conversion means, the generation time of the voice signal measured by the timekeeping means, and the device number identifying the apparatus added to them.

[0018] In the minutes creating apparatus of the fourth invention, the voice recognition means converts the voice signal stored in the voice signal storage means into character codes; the speaker identification unit converts the speaker's voice signal input from the voice signal input means into a speaker code; the names of the speakers corresponding to the speaker codes are registered in advance in the speaker name registration means; the speaker name conversion means converts the speaker code stored in the speaker code storage means into the speaker's name registered in the speaker name registration means; the display means displays the character codes stored in the character code storage means as characters; the editing means edits the content displayed by the display means; and the output means outputs the edited character codes with the speaker's name output from the speaker name conversion means, the generation time of the voice signal measured by the timekeeping means, and the device number identifying the apparatus added to them.

[0019] In the multipoint minutes creating system of the fifth invention, the output data output from the output means of the minutes creating apparatuses according to the third or fourth invention, which are placed at multiple sites, is gathered in one place and edited by the multipoint editing means. The multipoint editing means reorders the output data on the basis of the generation times of the voice signals measured by the timekeeping means, thereby creating the minutes of the multipoint conference.

[0020] In the minutes creating system of the sixth invention, the output data output from the output means of the minutes creating apparatuses according to the third or fourth invention, which are placed at multiple sites, is gathered in one place and edited by the editing means of one of those minutes creating apparatuses. That editing means reorders the output data on the basis of the generation times of the voice signals measured by the timekeeping means, thereby creating the minutes of the multipoint conference.

[0021]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings. (First Embodiment) First, a first embodiment of the present invention is described with reference to FIGS. 1 to 4.

[0022] FIG. 1 is a block diagram showing the schematic configuration of a minutes creating apparatus according to this embodiment.

[0023] In the figure, reference numeral 101 denotes a voice signal input unit to which a speaker's voice is input directly as a voice signal. The voice signal input unit 101 has a noise processing unit 102 that removes noise contained in the input voice signal, and a speaker detection unit 103 that detects which of a plurality of speakers input the voice signal and outputs the speaker code corresponding to that speaker. The noise processing unit 102 is controlled by a system controller described later.

[0024] The outputs of the noise processing unit 102 and the speaker detection unit 103 are connected to a voice data memory unit 104, which stores voice data formed by adding the speaker code output from the speaker detection unit 103 to the voice signal that has undergone noise processing in the noise processing unit 102.

[0025] The output of the voice data memory unit 104 is connected to a voice recognition unit 105 and a system controller 107. The voice recognition unit 105 recognizes the voice signal in the voice data sent from the voice data memory unit 104, converts it into the character codes corresponding to the recognized voice signal, and outputs them. In the voice recognition unit 105, a BUSY flag is set while a voice signal is being processed and is cleared when processing ends. Data transfer from the voice data memory unit 104 to the voice recognition unit 105 is controlled by the system controller, which permits the voice data memory unit 104 to transfer data only while the BUSY flag is cleared.
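The BUSY-flag handshake described above can be sketched as follows. This is a simplified Python model under stated assumptions: the class and function names are hypothetical, the stored voice data is represented by plain strings, and an item is removed from memory at transfer time purely to keep the sketch short.

```python
from collections import deque


class Recognizer:
    """Stand-in for voice recognition unit 105; only the BUSY flag is modeled."""

    def __init__(self) -> None:
        self.busy = False
        self.received: list[str] = []

    def start(self, voice_data: str) -> None:
        self.busy = True              # set while a voice signal is being processed
        self.received.append(voice_data)

    def finish(self) -> None:
        self.busy = False             # cleared when processing ends


def try_transfer(memory: deque, recognizer: Recognizer) -> bool:
    """Controller-side check: transfer one item only while BUSY is clear."""
    if recognizer.busy or not memory:
        return False
    recognizer.start(memory.popleft())
    return True


memory = deque(["utterance-1", "utterance-2"])
unit = Recognizer()
first = try_transfer(memory, unit)    # BUSY clear, so the transfer is permitted
second = try_transfer(memory, unit)   # BUSY set, so utterance-2 stays in memory
unit.finish()
third = try_transfer(memory, unit)    # permitted again after BUSY is cleared
```

Because the memory acts as a queue in this model, utterances that arrive while recognition is in progress simply wait their turn rather than being lost.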

[0026] The system controller 107 converts the speaker code in the voice data stored in the voice data memory unit 104 into the actual speaker's name on the basis of the speaker names registered in a speaker name registration unit 108, and sends the name to an output unit 106. The system controller 107 is also connected to the voice signal input unit 101, the voice recognition unit 105, and the output unit 106, and controls their operation. In the output unit 106, the speaker name identified by the system controller 107 is added to the character codes output from the voice recognition unit 105, and the result is then output from the output unit 106 to an externally connected device such as a printer or other printing device, or a display.
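As a rough illustration of the speaker code to speaker name conversion, the registration unit can be modeled as a lookup table. The names, the output format, and the fallback for unregistered codes in this Python sketch are all assumptions made for the example.

```python
# Hypothetical contents of speaker name registration unit 108.
SPEAKER_NAMES = {1: "Sato", 2: "Tanaka"}


def format_minutes_line(speaker_code: int, text: str) -> str:
    """Replace the speaker code with the registered name and prepend it to the text."""
    # Falling back to a placeholder for unregistered codes is an assumption.
    name = SPEAKER_NAMES.get(speaker_code, f"Speaker {speaker_code}")
    return f"{name}: {text}"


line = format_minutes_line(1, "Let us review the schedule.")
print(line)
```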

[0027] FIG. 2 is a block diagram showing the schematic configuration of the noise processing unit 102 shown in FIG. 1. In the figure, the noise processing unit 102 comprises a microphone 111 that picks up ambient noise as a reference signal; analog-to-digital converters (hereinafter, A/D converters) 112, 113, and 114 that discretize the signals input via microphones 109, 111, and 110, respectively; and an adaptive filter 115 that filters out the noise signal from the voice signals input via microphones 109 and 110. The adaptive filter 115 is connected to the outputs of the A/D converters 112 to 114 and, as described later, removes noise from the voice signals input from the A/D converters 112 and 114 on the basis of the reference signal input from the A/D converter 113.

[0028] FIG. 3 is a block diagram showing the schematic configuration of the speaker detection unit 103 shown in FIG. 1. The speaker detection unit 103 comprises analog-to-digital converters (hereinafter, A/D converters) 116 and 117 that discretize the input signals; a level detection unit 118 that detects the level of the voice signals; a multiplexer 119 that distinguishes speakers by switching; and a speaker code generation unit 120 that generates the speaker code corresponding to the speaker switch turned on by the multiplexer 119. To detect whether the first speaker or the second speaker has spoken, the level detection unit 118 determines whether the input levels of the voice signals s1 and s2 input from microphones 109 and 110 are at or above a threshold stored in advance and, when a level is at or above the threshold, outputs a switching signal to the multiplexer 119. The multiplexer 119 turns on the speaker switch corresponding to the speaker whose voice level was detected by the level detection unit 118.

[0029] Next, the minutes creating process performed by the minutes creating apparatus configured as above is described in detail with reference to FIG. 4. FIG. 4 is a flowchart showing the minutes creating operation of the minutes creating apparatus according to this embodiment.

[0030] Consider the case where the first speaker speaks. The first speaker's voice signal s1 is input to the noise processing unit 102 and the speaker detection unit 103 via one microphone 109 (step S401). The noise processing unit 102 then removes noise from the input voice signal (step S402), and the speaker detection unit 103 detects the speaker on the basis of the input voice signal (step S403).

[0031] The noise processing in step S402 follows the flowchart shown in FIG. 5.

[0032] The voice signal input to the A/D converter 112 is converted into a digital signal and then input to the adaptive filter 115 (step S501).

[0033] As shown in FIG. 2, the voice signal s1 arrives as a signal (s1 + n1) on which a noise component n1 is superimposed. To remove this noise component n1, the noise input to the microphone 111 is taken into the adaptive filter 115 as a reference signal n (step S502), and the filter coefficients are calculated from the reference signal n and the previous output voice signal o1 (steps S503 and S504). The adaptive filter 115 applies adaptive filtering to the voice signal (s1 + n1) using the calculated filter coefficients (step S505), outputs the new voice signal o1, from which the noise n1 has been removed, to the voice data memory unit 104 in the following stage, and the processing ends (step S506). The new voice signal o1 is also fed back to the adaptive filter 115 in FIG. 2 and used in the next filter coefficient calculation.
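The source does not name the adaptation algorithm used by the adaptive filter 115. One common choice that matches the description, where the coefficients are updated from the reference signal n and the fed-back output o1, is the least-mean-squares (LMS) noise canceller. The Python sketch below assumes LMS; the tap count, step size, test signals, and noise path are all illustrative assumptions.

```python
import math
import random


def lms_noise_canceller(primary, reference, taps=4, mu=0.01):
    """Return the cleaned signal o[k] = primary[k] - w . (recent reference samples)."""
    w = [0.0] * taps
    history = [0.0] * taps          # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(wi * xi for wi, xi in zip(w, history))
        o = d - noise_estimate      # output sample, also the error signal
        # Coefficient update from the reference and the output just produced;
        # the updated coefficients are used for the next sample.
        w = [wi + mu * o * xi for wi, xi in zip(w, history)]
        cleaned.append(o)
    return cleaned


random.seed(0)
n_samples = 4000
speech = [0.5 * math.sin(2 * math.pi * 0.01 * k) for k in range(n_samples)]
noise = [random.uniform(-1.0, 1.0) for _ in range(n_samples)]
# Assumed noise path: the noise reaching the speech microphone is a scaled,
# one-sample-delayed copy of what the reference microphone picks up.
primary = [s + 0.8 * (noise[k - 1] if k else 0.0) for k, s in enumerate(speech)]

cleaned = lms_noise_canceller(primary, noise)


def mean_square_error(signal, start=3000):
    """Mean squared deviation from the clean speech after the filter has settled."""
    pairs = zip(signal[start:], speech[start:])
    return sum((a - b) ** 2 for a, b in pairs) / (len(signal) - start)


print(mean_square_error(primary), mean_square_error(cleaned))
```

In this arrangement the filter output doubles as the error signal, which is why feeding o1 back is enough to drive the coefficient update without any separate clean-speech reference.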

[0034] The speaker detection process in step S403 of FIG. 4 follows the flowchart shown in FIG. 6. First, the voice signals converted by the A/D converters 116 and 117 are input to the level detection unit 118 (step S601), which detects the levels of the first and second speakers' voice signals. Specifically, it determines whether the input levels of the voice signals s1 and s2 input from microphones 109 and 110 are at or above the threshold stored in advance (step S602). If the answer is negative (NO), the process waits, repeating steps S601 and S602. If the answer in step S602 is affirmative (YES), an utterance is confirmed. Because only the first speaker is assumed to be speaking here, only the first speaker's voice signal level is determined to be at or above the threshold, and the first speaker's utterance is confirmed. The multiplexer 119 then turns on the speaker switch of the speaker whose voice signal level was detected (the first speaker) (step S603), so that the voice signal is input to the speaker code generation unit 120 in FIG. 3. The speaker code generation unit 120 thereby detects that the speaker switch is on and outputs the speaker code corresponding to that switch to the following circuit (step S604). The speaker codes are set so that, for example, the first speaker's code is "1" and the second speaker's code is "2".
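Steps S601 to S604 amount to a per-microphone threshold test followed by a code lookup. The Python sketch below is one way to picture that logic; the threshold value, the use of mean absolute amplitude as the level measure, and the tie-breaking rule are assumptions not taken from the source.

```python
THRESHOLD = 0.1  # assumed value; the source only says the threshold is stored in advance


def signal_level(frame):
    """Mean absolute amplitude of one frame of samples (one possible level measure)."""
    return sum(abs(sample) for sample in frame) / len(frame)


def detect_speaker(frames_by_code, threshold=THRESHOLD):
    """Return the speaker code whose input level reaches the threshold, else None.

    frames_by_code maps a speaker code (1 for microphone 109, 2 for microphone 110)
    to the current frame from that speaker's microphone.
    """
    levels = {code: signal_level(frame) for code, frame in frames_by_code.items()}
    active = {code: level for code, level in levels.items() if level >= threshold}
    if not active:
        return None  # S602 answered NO: keep waiting
    # Picking the loudest channel when several are active is an assumption;
    # the source only walks through the case where one speaker is talking.
    return max(active, key=active.get)


code = detect_speaker({1: [0.4, -0.5, 0.3, -0.2], 2: [0.01, -0.02, 0.0, 0.01]})
silence = detect_speaker({1: [0.0, 0.01], 2: [0.0, -0.01]})
```

Returning `None` for silence mirrors the waiting loop between S601 and S602: the caller simply tries again with the next frame.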

[0035] Returning to FIG. 4, the speaker code output in step S604 of FIG. 6 is stored in the voice data memory unit 104, together with the voice signal o1 that has undergone noise processing in the noise processing unit 102 of FIG. 2, as voice data in the format shown in FIG. 7 (step S404). One item of voice data ends when an unvoiced portion is detected, and an end code (EOD) is appended to the end of the voice data.
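FIG. 7 is not reproduced in this text, so the exact voice data format is unknown. As a hedged illustration only, the Python sketch below assumes a layout in which the speaker code comes first, the signal samples follow, and an EOD marker closes the item; the marker value and all names are hypothetical.

```python
from dataclasses import dataclass

EOD = "EOD"  # stand-in for the end code; its real encoding is not given in the source


@dataclass
class VoiceData:
    speaker_code: int
    samples: list


def pack(voice_data: VoiceData) -> list:
    """Lay out one voice data item as [speaker code, samples..., EOD]."""
    return [voice_data.speaker_code, *voice_data.samples, EOD]


def unpack_all(stream: list) -> list:
    """Split a memory image holding several items back into VoiceData records."""
    records, current = [], []
    for value in stream:
        if value == EOD:
            records.append(VoiceData(current[0], current[1:]))
            current = []
        else:
            current.append(value)
    return records


memory_image = pack(VoiceData(1, [0.2, 0.1])) + pack(VoiceData(2, [0.3]))
records = unpack_all(memory_image)
```

The end marker is what allows variable-length utterances to be stored back to back and later handed to the recognizer one item at a time.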

[0036] The system controller 107 of FIG. 1 then determines whether the BUSY flag of the voice recognition unit 105 is cleared (step S405). If the answer is negative (NO), the voice recognition unit 105 has not finished recognizing the previous voice data, so the process waits. If the answer in step S405 is affirmative (YES), the system controller 107 permits the voice data memory unit 104 to transfer voice data.

[0037] The voice data stored in the voice data memory unit 104 in step S404 is transferred, one item at a time, to the voice recognition unit 105 and the system controller 107 of FIG. 1 and converted into a character code and a speaker name (step S406). That is, the voice recognition unit 105 converts the input voice data into the corresponding character code. The system controller 107, by referring to the speaker names registered in advance in the speaker name registration unit 108 of FIG. 1, converts the speaker code in the input voice data into the speaker name corresponding to that speaker code.

[0038] When the speaker name has been added to the character code by the system controller 107 and output to the outside from the output unit 106 of FIG. 1 (step S407), the voice data stored in the memory unit 104 is released from the memory unit (step S408), and the minutes creating process for one item of voice data ends.
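Steps S405 to S408 amount to a small loop over the stored voice data. The sketch below is one hypothetical way to express it; the function name, the `(speaker_code, signal)` tuple layout, the `recognize` callback, and the `busy` callback are assumptions introduced for illustration.

```python
from collections import deque


def create_minutes(memory, recognize, speaker_names, busy=lambda: False):
    """Drain `memory` (a deque of (speaker_code, signal) items) into minutes lines."""
    lines = []
    while memory:
        if busy():                        # S405: recognizer still busy, keep waiting
            continue
        code, signal = memory[0]
        text = recognize(signal)          # S406: voice data -> character code
        name = speaker_names[code]        # S406: speaker code -> registered name
        lines.append(f"{name}: {text}")   # S407: output the name with the text
        memory.popleft()                  # S408: release the voice data
    return lines


memory = deque([(1, "sig-a"), (2, "sig-b")])
minutes = create_minutes(memory, recognize=str.upper, speaker_names={1: "A", 2: "B"})
```

Here `str.upper` merely stands in for a recognizer so that the flow can be followed end to end.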

[0039] When the second speaker speaks, as shown in FIG. 2, the voice signal s2 becomes a signal (s2 + n2) on which the noise component n2 is superimposed and is input from the other microphone 110 to the noise processing unit 102. After the same processing as for the first speaker's utterance, the character code and the speaker name are output from the output unit 106.

[0040] As described above, according to this embodiment, the speaker's voice is recognized and the recognized voice signal is automatically converted into character codes, so minutes can be created easily, quickly, and automatically without recording the meeting by hand or with a video or audio recorder. In addition, because the speaker is identified using the speaker code at the same time as voice recognition, the speaker can also be identified reliably.

(Second Embodiment)
Next, a second embodiment of the present invention will be described with reference to FIG. 8. Since the basic configuration of the minutes creating apparatus in this embodiment is the same as that of FIG. 1 of the first embodiment, that figure is also referred to in the description.

[0041] FIG. 8 is a block diagram showing the schematic configuration of the noise processing unit 102 used in the minutes creating apparatus according to this embodiment. Components identical to those in FIG. 2 of the first embodiment carry the same reference numerals.

[0042] FIG. 8 differs from FIG. 2 in three respects: the microphone 111 and the A/D converter 113, which take in ambient noise as a reference signal, are removed from the configuration of FIG. 2; amplifiers 121 and 122 for amplifying the voice signals s1 and s2 are provided between the microphones 109, 110 and the A/D converters 112, 114; and the adaptive filter is split into two, a first adaptive filter 115a and a second adaptive filter 115b. The rest of the configuration is the same as in FIG. 2, so its description is omitted.

[0043] In this embodiment, when two people speak at the same time, for example, the signal i1 input from one microphone 109 is, as shown in FIG. 8, the first speaker's voice signal s1 with the second speaker's voice signal s2 superimposed on it as a noise signal n2, and the signal i2 input from the other microphone 110 is the second speaker's voice signal s2 with the first speaker's voice signal s1 superimposed on it as a noise signal n1.

[0044] The input signals i1 and i2 are amplified by the amplifiers 121 and 122, converted into digital signals by the A/D converters 112 and 114, and then input to the first adaptive filter 115a and the second adaptive filter 115b, respectively.

[0045] To remove the noise signal n2 from the digitized input signal i1, the filter coefficients of the second adaptive filter 115b are first calculated and updated using the previous output signal o1. The noise component e1 contained in the input signal i1 is then calculated from these filter coefficients and the signal i2 input to the second adaptive filter 115b. Removing the calculated noise component e1 from the input signal i1 extracts the first speaker's voice component from the input signal, which is output as the output signal o1.

[0046] Similarly, the filter coefficients of the first adaptive filter 115a are calculated and updated using the previous output signal o2, and the noise component e2 contained in the input signal i2 is calculated from these filter coefficients and the input signal i1 to the first adaptive filter 115a. Removing the calculated noise component e2 from the input signal i2 extracts the second speaker's voice component from the input signal, which is output as the output signal o2.
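The two-filter arrangement can be illustrated with a cross-coupled canceller in which each channel's noise estimate is computed from the other microphone's input. The description does not name the coefficient update rule, so the LMS update used below, the tap count, and the step size `mu` are assumptions chosen only to make the sketch concrete.

```python
def cross_cancel(i1, i2, taps=4, mu=0.05):
    """Cross-coupled LMS sketch: estimate each channel's noise from the other input."""
    w_a = [0.0] * taps  # role of filter 115a: estimates e2 from i1
    w_b = [0.0] * taps  # role of filter 115b: estimates e1 from i2
    o1, o2 = [], []
    for n in range(len(i1)):
        # Latest `taps` samples of each input, newest first, zero-padded at the start.
        x1 = [i1[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        x2 = [i2[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        e1 = sum(w * x for w, x in zip(w_b, x2))  # noise estimate for channel 1
        e2 = sum(w * x for w, x in zip(w_a, x1))  # noise estimate for channel 2
        y1 = i1[n] - e1                           # output o1 for this sample
        y2 = i2[n] - e2                           # output o2 for this sample
        # The output just produced drives the coefficients used for the next sample.
        w_b = [w + mu * y1 * x for w, x in zip(w_b, x2)]
        w_a = [w + mu * y2 * x for w, x in zip(w_a, x1)]
        o1.append(y1)
        o2.append(y2)
    return o1, o2
```

One caveat of this simple form: when both speakers talk at once, each filter also sees the wanted speech in its reference input, so a practical design would need to limit or gate adaptation. The sketch is meant only to show the signal flow between i1, i2, e1, e2, o1, and o2.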

[0047] The two output signals o1 and o2 are each stored in the voice data memory unit 104 as voice data, together with the speaker code output from the speaker detection unit 103 of FIG. 1. This voice data is processed in the same way as in the first embodiment, converted into character data and a speaker name, and output from the output unit 106.

[0048] As described above, according to this embodiment, even when two people speak at the same time, the other party's voice, which acts as noise for each speaker, can be removed. Therefore, in addition to the effects of the first embodiment, speakers can be identified even more reliably.

(Third Embodiment)
Next, a third embodiment of the present invention will be described with reference to FIG. 9. Since the basic configuration of the minutes creating apparatus in this embodiment is the same as that of FIG. 1 of the first embodiment, that figure is also referred to in the description.

[0049] FIG. 9 is a block diagram showing the schematic configuration of the minutes creating apparatus according to this embodiment. Parts identical to those in FIG. 1 of the first embodiment carry the same reference numerals. FIG. 9 differs from FIG. 1 in that the voice data memory unit 104 is removed from the configuration of FIG. 1 and replaced by a voice signal memory unit (voice signal storage means) 123a, which stores the voice signal as binary data, and a speaker code memory unit (speaker code storage means) 123b, which stores the speaker code as text data. The rest of the configuration is the same as in FIG. 1 of the first embodiment, so a detailed description is omitted.

[0050] In this embodiment, the voice signal from which noise has been removed by the noise processing unit 102 is stored as-is in the voice signal memory unit 123a. One voice signal ends when an unvoiced portion is detected, and an end code is appended to the end of the voice signal. The speaker code output from the speaker detection unit 103, meanwhile, is stored in the speaker code memory unit 123b. As with the voice signal, an end code is appended to the end of the speaker code at the same timing as for the voice signal.

[0051] The voice signal stored in the voice signal memory unit 123a is converted into character codes by the voice recognition unit 105, and the speaker code stored in the speaker code memory unit 123b is converted into the actual speaker name by the system controller 107.

[0052] When the character code and the speaker name have been output from the output unit 106, the voice signal in the voice signal memory unit 123a and the speaker code in the speaker code memory unit 123b that were output are released from the memories 123a and 123b.
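The idea of two separate memories kept in step by end codes written at the same timing can be sketched as below. The class name, method names, and the choice to store one speaker code per utterance are assumptions made for illustration; the description specifies only that both memories receive an end code together and are released after output.

```python
EOD = "EOD"  # stand-in for the end code


class SplitMemories:
    """Separate stores for voice signals (role of 123a) and speaker codes (role of 123b)."""

    def __init__(self):
        self.signal_memory = []
        self.code_memory = []

    def store(self, speaker_code, samples):
        # Both memories receive their end code at the same moment.
        self.signal_memory += list(samples) + [EOD]
        self.code_memory += [speaker_code, EOD]

    def release_next(self):
        """Return (speaker_code, samples) for the oldest utterance and free both memories."""
        s_end = self.signal_memory.index(EOD)
        c_end = self.code_memory.index(EOD)
        samples = self.signal_memory[:s_end]
        code = self.code_memory[0]
        del self.signal_memory[: s_end + 1]
        del self.code_memory[: c_end + 1]
        return code, samples


mem = SplitMemories()
mem.store(1, [0.2, 0.3])
mem.store(2, [0.5])
first = mem.release_next()
```

Because the end codes are written in lockstep, the n-th segment of one memory always pairs with the n-th segment of the other.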

[0053] Even with a configuration such as this one, in which the memory unit that stores the voice signal and the memory unit that stores the speaker code are separate, accurate minutes can be created easily, quickly, and automatically, as in the first and second embodiments.

[0054] In this embodiment, the speaker code output from the speaker detection unit 103 is temporarily stored in the speaker code memory unit 123b, but the speaker code memory unit 123b may be omitted and the speaker code sent directly to the system controller 107.

(Fourth Embodiment)
Next, a fourth embodiment of the present invention will be described with reference to FIGS. 10 to 15.

[0055] FIG. 10 is a block diagram showing the schematic configuration of the minutes creating apparatus according to this embodiment. Parts identical to those in FIG. 1 of the first embodiment carry the same reference numerals.

[0056] FIG. 10 differs from FIG. 1 in that a character code memory unit 124 (character code storage means) is provided between the voice recognition unit 105 and the output unit 106 of FIG. 1, and an editing unit 125 having a display unit 126 (display means) such as a CRT is connected to the character code memory unit 124. The rest of the configuration is the same as in FIG. 1, so its description is omitted. The character code memory unit 124 is connected to the system controller 107, which controls its operation.

[0057] The minutes creating process of the minutes creating apparatus configured as above follows the flowchart shown in FIG. 11.

[0058] In that figure, the processing from step S1101 to step S1106 is the same as the processing from step S401 to step S406 shown in FIG. 4 of the first embodiment, so its description is omitted. As noted above, one item of voice data ends when an unvoiced portion is detected, and an end code is appended to the end of the voice data.

[0059] The character code output from the voice recognition unit 105 of FIG. 10 in step S1106 has the speaker name attached by the system controller 107 and is stored in the character code memory unit 124 (step S1107). Because the stored character codes may contain recognition errors, the character codes stored in the character code memory unit 124 are displayed on the display unit 126, and the operator edits any misrecognized character codes (step S1108). The edited character codes are stored again in the character code memory unit 124 and output from the output unit 106 (step S1109).

[0060] This minutes creating apparatus will now be described using a specific example.

[0061] Suppose the following conversation takes place between the first speaker and the second speaker.

[0062] First speaker: I am A. Are you B?

[0063] Second speaker: No.

[0064] First speaker: Are you C?

[0065] Second speaker: Yes, I am.

[0066] At this time, the voice signal whose noise has been reduced by the noise processing unit 102 of FIG. 10 is stored, together with the speaker code, as voice data in the voice data memory unit 104 of FIG. 10, for example in the format shown in FIG. 12. This voice signal is transferred to the voice recognition unit 105 one item at a time, recognized phoneme by phoneme, converted into character codes, and output.

[0067] Meanwhile, the speaker code in the voice data is converted by the system controller 107 into a speaker name registered in advance in the speaker name registration unit 108 of FIG. 10, for example in the format shown in FIG. 13. The character code is attached to the speaker name, transferred to the character code memory unit 124, and stored. The stored character codes are displayed in sequence on the display unit 126, which is, for example, the display of a computer serving as the editing unit 125. An example of the display format is shown in FIG. 14.

[0068] Here, the editor corrects recognition errors found in the conversation displayed on the display unit 126 and converts text into kanji or alphabetic characters. In the case of FIG. 14, "は" (ha) has been misrecognized as "わ" (wa), so it is corrected to "は", and the kana "え" (e), "びぃ" (bii), and "しぃ" (shii) are converted into the letters "A", "B", and "C", respectively. Conversions of this kind, and the addition of punctuation to improve readability, are performed by the editor in the editing unit 125. The result is then output to the outside via the output unit 106 in the format shown in FIG. 15.
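The editor's corrections are manual, but their effect on the stored character codes can be pictured as a list of positional replacements. In the sketch below, the function name, the `(start, end, replacement)` tuple form, and the sample recognized string are all hypothetical; the actual content of FIG. 14 is not reproduced here.

```python
def apply_corrections(text, corrections):
    """Apply (start, end, replacement) edits, given as character offsets into `text`."""
    # Work from the rightmost edit backwards so earlier offsets stay valid.
    for start, end, replacement in sorted(corrections, reverse=True):
        text = text[:start] + replacement + text[end:]
    return text


# Hypothetical recognizer output for "I am A": the particle is misrecognized
# as "わ" and the letter A appears as the kana "え".
recognized = "わたしわえです"
edited = apply_corrections(recognized, [(3, 4, "は"), (4, 5, "A")])
```

Positional edits are used rather than a global search and replace because "わ" also appears legitimately at the start of "わたし" and must not be changed there.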

[0069] Thus, according to this embodiment, the editing unit 125 having the display unit 126 is added to the configuration of the first to third embodiments so that an editor can correct voice recognition errors and convert text into kanji or alphabetic characters. As in the first to third embodiments, accurate minutes can be created easily, quickly, and automatically, and in addition the minutes can be made still more accurate and readable.

[0070] In this embodiment, the display unit 126 and the editing unit 125 are added to the minutes creating apparatus shown in FIG. 1 of the first embodiment so that an editor can correct character code recognition errors and convert text into kanji or alphabetic characters. Needless to say, the editing unit 125 and the display unit 126 may also be added to the configurations of the second and third embodiments.

(Fifth Embodiment)
Next, a fifth embodiment of the present invention will be described with reference to FIG. 16. FIG. 16 is a block diagram showing the configuration of the minutes creating apparatus according to this embodiment; parts identical to those in FIG. 10 of the fourth embodiment carry the same reference numerals. FIG. 16 differs from FIG. 10 in that the speaker detection unit 103 and the noise processing unit 102 of the voice signal input unit 101 are connected in series.

[0071] In this case, the input voice signal has already been converted into a digital signal by the A/D converters 116 and 117 in the speaker detection unit 103 shown in FIG. 3, so the A/D converters 112 and 114 in the noise processing unit 102 shown in FIG. 2 can be omitted.

(Sixth Embodiment)
Next, a sixth embodiment of the present invention will be described with reference to FIGS. 17 to 19. In this embodiment, the minutes creating apparatuses of the first to fifth embodiments are placed at multiple sites to form a multipoint minutes creating system that creates minutes of a conference held over a multipoint video conference system.

[0072] FIG. 17 is a conceptual diagram for explaining the multipoint video conference. In this embodiment, a video conference is held among three sites, the main venue, venue A, and venue B, with two attendees at each venue. For convenience, the two speakers at the main venue are named "I" (イ) and "Ro" (ロ), the two at venue A "Ha" (ハ) and "Ni" (ニ), and the two at venue B "Ho" (ホ) and "He" (ヘ). The minutes creating apparatus installed at the main venue has device number "1", and those installed at venues A and B have device numbers "2" and "3", respectively. The configuration of the minutes creating apparatus at each venue (site) used in this embodiment is the same as that of FIG. 16 of the fifth embodiment, so that figure is also referred to in the description.

[0073] The voice data generation time attached to the character codes generated by each minutes creating apparatus is recorded to the hour, minute, and second. Here, for convenience, it is instead recorded as the numbers 1, 2, 3, and so on, in order from the earliest generation time.

[0074] As shown in FIG. 18, the three minutes creating apparatuses are connected to a multipoint editing apparatus (multipoint editing means) 210 via a public ISDN (Integrated Services Digital Network) line. FIG. 19 is a block diagram showing the schematic configuration of the multipoint editing apparatus 210, which comprises a data buffer 211, connected to the public ISDN line and serving as the input unit for data generated by the plurality of minutes creating apparatuses; a system controller 212; and an output unit 213.

[0075] In the above configuration, the minutes of the multipoint video conference are created according to the flowchart shown in FIG. 18.

[0076] Suppose now that the attendees introduce themselves as follows.

[0077] First speaker: I am I (イ).

[0078] Third speaker: I am Ha (ハ).

[0079] Fifth speaker: I am Ho (ホ).

[0080] Second speaker: I am Ro (ロ).

[0081] Fourth speaker: I am Ni (ニ).

[0082] Sixth speaker: I am He (ヘ).

[0083] In that figure, in steps S2001 to S2008, the minutes creating apparatus installed at each venue performs the same processing as steps S1101 to S1108 of FIG. 11 of the fourth embodiment.

[0084] That is, each speaker's voice is input to the noise processing unit 102 and the speaker detection unit 103 of the minutes creating apparatuses 1 to 3 at the respective venues (step S2201). The noise processing unit 102 then performs noise processing on the input voice signal (step S2202), and the speaker detection unit 103 outputs a speaker code based on the input voice signal (step S2203). The noise-processed voice signal has the speaker code attached and is stored as voice data in the voice data memory unit 104 (step S2204). When the system controller 107 determines that the BUSY flag of the voice recognition unit 105 is cleared (step S2205), the voice data is transferred from the voice data memory unit 104 to the voice recognition unit 105 and the system controller 107, which output the character code and the speaker name, respectively (step S2206). The output character code has the speaker name and the generation time of the voice signal attached and is stored in the character code memory unit 124 (step S2207). The character code stored in the character code memory unit 124 is displayed on the display unit 126 and edited in the editing unit 125 (step S2208).

[0085] The minutes data (output data) obtained by the editing process in each minutes creating apparatus is sent from the output unit 106, via the public ISDN line, to the multipoint editing apparatus 210 installed at the main venue and stored in the data buffer 211 of the multipoint editing apparatus 210 (step S2009). The system controller 212 then sorts the data accumulated in the data buffer 211 in order from the earliest generation time, based on the voice signal generation time in the header attached to the data (step S2010). The data, now in the correct order, is output as minutes from the output unit 213 to an externally connected printer or the like, and the minutes are generated (step S2011).
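The reordering in step S2010 is a sort on the generation time carried in each record's header. The sketch below uses the self-introduction example, with generation times written as the numbers 1 to 6; the dictionary field names and the function name are assumptions made for illustration.

```python
def merge_minutes(buffer):
    """Order buffered minutes records by voice signal generation time, earliest first."""
    return sorted(buffer, key=lambda record: record["time"])


# Records arrive grouped by device (1 = main venue, 2 = venue A, 3 = venue B).
data_buffer = [
    {"device": 1, "time": 1, "name": "I", "text": "I am I."},
    {"device": 1, "time": 4, "name": "Ro", "text": "I am Ro."},
    {"device": 2, "time": 2, "name": "Ha", "text": "I am Ha."},
    {"device": 2, "time": 5, "name": "Ni", "text": "I am Ni."},
    {"device": 3, "time": 3, "name": "Ho", "text": "I am Ho."},
    {"device": 3, "time": 6, "name": "He", "text": "I am He."},
]
ordered_names = [record["name"] for record in merge_minutes(data_buffer)]
```

In a real system the key would be the recorded hour, minute, and second, and the clocks of the three devices would need to agree for the order to be meaningful.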

[0086] Thus, according to the multipoint minutes creating system of this embodiment, the minutes creating apparatuses of the first to fifth embodiments are placed at multiple sites and the multipoint editing apparatus 210 is connected to each of them via the public ISDN line, so accurate minutes can be created easily, quickly, and automatically even when a conference is held at multiple sites.

(Seventh Embodiment)
Next, a seventh embodiment of the present invention will be described with reference to FIGS. 21 to 24. In the sixth embodiment, the minutes creating apparatus at each venue created the character codes, and the multipoint editing apparatus 210 at the main venue edited and output the minutes. This embodiment differs from the sixth in that the minutes creating apparatus at each venue performs processing only up to attaching a header to the voice data that stores the voice level values, that is, the voice data stored in the voice data memory unit 104 of FIG. 16, while conversion of this data into character codes, editing, and output are performed by the minutes creating apparatus at the main venue.

[0087] FIGS. 21 and 22 are block diagrams showing the schematic configuration of the minutes creating apparatuses according to this embodiment. In both figures, parts identical to those in FIG. 16, referred to in the sixth embodiment, carry the same reference numerals. FIG. 21 shows the minutes creating apparatus installed at venues A and B (hereinafter, the sub-venues), and FIG. 22 shows the minutes creating apparatus installed at the main venue.

[0088] In FIG. 21, the minutes creating apparatus installed at a sub-venue (hereinafter, the subsystem) has the configuration of FIG. 16 of the fifth embodiment with the voice recognition unit 105 and everything after it removed, and its voice data memory unit 104 is connected directly to the public ISDN line. In FIG. 22, the minutes creating apparatus installed at the main venue (hereinafter, the main system) has the public ISDN line connected to the voice data memory unit 104 shown in FIG. 16 and is configured to temporarily accumulate the voice data sent from the subsystems of FIG. 21. The voice data memory unit 104 of the main system of FIG. 22 preferably has a large memory capacity so that it can hold the data sent from the subsystems of FIG. 21. The rest of the configuration in FIGS. 21 and 22 is the same as that shown in FIG. 16, so a detailed description is omitted.

[0089] FIG. 23 is a block diagram illustrating the multipoint minutes creating system composed of the main system of FIG. 22 and the subsystems of FIG. 21. In that figure, the multipoint minutes creating system is formed by connecting the main system and the subsystems via the public ISDN line.

[0090] In the above configuration, minutes of a multipoint video conference or the like are created according to the flowchart shown in FIG. 24.

[0091] Conversation at the sub-venues is input to the subsystem (device numbers 2 and 3) installed at each venue. The voice signal noise-processed by the noise processing unit 102 has attached to it the speaker code output from the speaker detection unit 103 and the voice signal generation time measured by the system controller, and is temporarily stored as voice data in the voice data memory unit 104 (step S2401). The voice data stored in the voice data memory unit 104 is transferred via the public ISDN line to the voice data memory unit 104 of the main system (device number 1) (step S2402). Meanwhile, conversation at the main venue is input to the main system (device number 1), processed in the same way as in the subsystems (device numbers 2 and 3), and stored in the voice data memory unit 104.

[0092] Once the system controller 107 confirms that the BUSY flag of the voice recognition unit 105 has been cleared (step S2403), the voice data stored in the voice data memory unit 104 is sent, together with the voice data received from the subsystems, to the voice recognition unit 105 and the system controller 107 and converted into character codes and speaker names (step S2404). The converted character codes are stored in the character code memory unit 124 (step S2405). The system controller 107 then sorts the voice data accumulated in the voice data memory unit 104 in order from the earliest generation time, referring to the voice signal generation time in the header attached to the voice data (step S2406), and the minutes data, now in the correct order, is output from the output unit 106 (step S2407), generating the minutes.
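In this embodiment the subsystems only attach a header, and the main system does the conversion and ordering. A minimal sketch of that division of work follows. The field names, the `recognize` callback, and the use of a (device number, speaker code) pair to look up names are assumptions; in particular, the description says the header carries the speaker code and generation time, and including the device number there is a guess made so that speaker codes from different venues can be told apart.

```python
def make_voice_data(device, speaker_code, time, signal):
    """Subsystem side: wrap a noise-processed signal with a header."""
    return {"device": device, "speaker_code": speaker_code, "time": time, "signal": signal}


def build_minutes(received, recognize, speaker_names):
    """Main system side: convert every item, then order the result by generation time."""
    converted = [
        (item["time"],
         speaker_names[(item["device"], item["speaker_code"])],
         recognize(item["signal"]))
        for item in received
    ]
    converted.sort(key=lambda entry: entry[0])
    return [f"{name}: {text}" for _, name, text in converted]


received = [
    make_voice_data(1, 1, 1, "a"),
    make_voice_data(1, 2, 4, "d"),
    make_voice_data(2, 1, 2, "b"),
    make_voice_data(3, 1, 3, "c"),
]
names = {(1, 1): "I", (1, 2): "Ro", (2, 1): "Ha", (3, 1): "Ho"}
minutes = build_minutes(received, recognize=str.upper, speaker_names=names)
```

As before, `str.upper` is only a placeholder for the recognizer.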

【００９３】このように、サブ会場に設置された議事録
作成装置により完全な議事録データを作成してファイル
化しなくても、メイン会場に設置された議事録作成装置
のメモリ容量を大きくすることにより、上述した第６実
施例と同様に、容易且つ迅速に、しかも自動的に、正確
な議事録を作成することが可能となる。In this way, even if the minutes creating apparatuses installed at the sub-venues do not create complete minutes data and save it as a file, increasing the memory capacity of the minutes creating apparatus installed at the main venue makes it possible, as in the sixth embodiment described above, to create accurate minutes easily, quickly, and automatically.

【００９４】[0094]

【発明の効果】以上説明したように、請求項１及び２の
議事録作成装置によれば、話者の音声信号を文字コード
に変換し、話者を識別するための話者コードを話者名に
変換した後、前記文字コード及び前記話者名を出力する
ので、容易且つ迅速に、しかも自動的に、正確な議事録
を作成することができるという効果を奏する。As described above, according to the minutes creating apparatus of claims 1 and 2, the voice signal of a speaker is converted into a character code, the speaker code for identifying the speaker is converted into a speaker name, and the character code and the speaker name are then output, so that accurate minutes can be created easily, quickly, and automatically.

【００９５】また、請求項３の議事録作成装置によれ
ば、予め登録された話者名に基づいて話者コードが話者
名に変換され、文字コードと共に出力されるので、上述
した請求項１及び２の効果に加えて、話者の認識を確実
に行うことができるという効果を奏する。According to the minutes creating apparatus of claim 3, the speaker code is converted into a speaker name based on speaker names registered in advance and is output together with the character code, so that, in addition to the effects of claims 1 and 2 described above, the speaker can be reliably identified.
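
The conversion from speaker code to registered speaker name can be pictured as a simple table lookup. The following Python sketch assumes a dictionary as the registration table; the names `speaker_names` and `format_utterance`, the sample entries, and the fallback for unregistered codes are all hypothetical and are not taken from the source.

```python
from typing import Dict

# Hypothetical registration table: speaker code -> registered speaker name.
speaker_names: Dict[int, str] = {0: "Speaker A", 1: "Speaker B"}


def format_utterance(speaker_code: int, text: str, names: Dict[int, str]) -> str:
    """Prefix recognized text with the registered speaker name."""
    # Fall back to a placeholder when the code was never registered
    # (assumed behavior; the source does not describe this case).
    name = names.get(speaker_code, f"Unknown speaker {speaker_code}")
    return f"{name}: {text}"


print(format_utterance(1, "Let us begin.", speaker_names))
```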

【００９６】また、請求項４の議事録作成装置によれ
ば、表示手段と編集手段とにより、音声認識ミスや漢字
・アルファベットへの変換等の編集を行なえるので、上
述した請求項１及び２の議事録作成装置の効果に加え
て、誤字・脱字・変換ミス等がなくなり、正確で読みや
すい議事録を作成することができるという効果を奏す
る。Further, according to the minutes creating apparatus of claim 4, the display means and the editing means allow editing such as correcting voice recognition errors and converting text into kanji or alphabetic characters, so that, in addition to the effects of the minutes creating apparatus of claims 1 and 2 described above, typographical errors, omissions, conversion errors, and the like are eliminated, and accurate, easy-to-read minutes can be created.

【００９７】また、請求項５及び６の議事録作成装置に
よれば、文字コードに音声信号の生成時間を付加し、該
音声信号生成時間に基づいて文字コードを編集できるよ
うにしたので、上記請求項１〜請求項６の効果に加え
て、複数の話者が会話をしていた場合にも、会議の時間
経過に正確に対応した、判りやすい議事録を作成するこ
とができるという効果を奏する。Further, according to the minutes creating apparatus of claims 5 and 6, the generation time of the voice signal is added to the character code, and the character code can be edited based on that generation time, so that, in addition to the effects of claims 1 to 6 described above, easy-to-understand minutes that accurately follow the time course of the meeting can be created even when a plurality of speakers are conversing.

【００９８】また、請求項７の多地点議事録作成システ
ムによれば、多地点に設置された議事録作成装置により
生成された文字コードを多地点編集手段に集め、該多地
点編集手段により文字コードに付加されている生成時間
に基づいて文字コードの編集を行えるので、複数の場所
で会議が行われる多地点テレビ会議の正確な議事録を、
容易且つ迅速に、しかも自動的に作成できるという効果
を奏する。According to the multipoint minutes creating system of claim 7, the character codes generated by the minutes creating apparatuses installed at multiple points are collected in the multipoint editing means, and the multipoint editing means edits the character codes based on the generation time added to them, so that accurate minutes of a multipoint video conference held at a plurality of locations can be created easily, quickly, and automatically.
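
As a rough illustration of editing by generation time across sites, the Python sketch below merges per-site entry lists into one chronological list. It assumes that each site's entries are already in time order and uses a hypothetical tuple layout (generation time, device number, speaker name, text); neither assumption comes from the source.

```python
import heapq
from typing import Iterable, List, Tuple

# (generation_time, device_number, speaker_name, text); layout is hypothetical.
Entry = Tuple[float, int, str, str]


def merge_sites(*streams: Iterable[Entry]) -> List[Entry]:
    """Merge per-site entry streams into one list ordered by generation time."""
    # heapq.merge requires each input stream to be sorted by the key already.
    return list(heapq.merge(*streams, key=lambda e: e[0]))


site1 = [(1.0, 1, "Speaker A", "Good morning."), (9.0, 1, "Speaker A", "Agreed.")]
site2 = [(4.0, 2, "Speaker B", "Good morning."), (6.5, 2, "Speaker C", "One question.")]

for t, device, name, text in merge_sites(site1, site2):
    print(f"[{t:6.1f}s] site {device} {name}: {text}")
```

If the per-site entries cannot be assumed to arrive in order, a plain sort on the generation time would be the safer choice.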

【００９９】更に、請求項８の多地点議事録作成システ
ムによれば、多地点に設置された議事録作成装置の出力
手段から出力された出力データが、前記多地点の議事録
作成装置の内の一つの議事録作成装置の編集手段に集め
られ、該編集手段により編集して議事録を作成できるの
で、上述した請求項７の多地点議事録作成システムの効
果に加えて、多地点編集手段を格別設ける必要がないの
で、構成が簡素化されるという効果を奏する。Further, according to the multipoint minutes creating system of claim 8, the output data output from the output means of the minutes creating apparatuses installed at multiple points is collected in the editing means of one of those apparatuses and edited there to create the minutes, so that, in addition to the effect of the multipoint minutes creating system of claim 7 described above, no separate multipoint editing means needs to be provided and the configuration is simplified.

[Brief description of drawings]

【図１】本発明の第１実施例に係る議事録作成装置の概
略構成を示すブロック図である。FIG. 1 is a block diagram showing a schematic configuration of a minutes creating apparatus according to a first embodiment of the present invention.

【図２】図１に示した議事録作成装置の雑音処理部の概
略構成を示すブロック図である。FIG. 2 is a block diagram showing a schematic configuration of the noise processing unit of the minutes creating apparatus shown in FIG. 1.

【図３】図１に示した議事録作成装置の話者検出部の概
略構成を示すブロック図である。FIG. 3 is a block diagram showing a schematic configuration of the speaker detection unit of the minutes creating apparatus shown in FIG. 1.

【図４】図１に示した議事録作成装置の議事録作成処理
手順を示すフローチャートである。FIG. 4 is a flowchart showing the minutes creation procedure of the minutes creating apparatus shown in FIG. 1.

【図５】図１に示した議事録作成装置における雑音処理
部の雑音処理手順を示すフローチャートである。FIG. 5 is a flowchart showing the noise processing procedure of the noise processing unit in the minutes creating apparatus shown in FIG. 1.

【図６】図１に示した議事録作成装置における話者検出
部の話者検出手順を示すフローチャートである。FIG. 6 is a flowchart showing the speaker detection procedure of the speaker detection unit in the minutes creating apparatus shown in FIG. 1.

【図７】図１に示した議事録作成装置における音声デー
タの形式を示す説明図である。FIG. 7 is an explanatory diagram showing the format of voice data in the minutes creating apparatus shown in FIG. 1.

【図８】本発明の第２実施例に係る議事録作成装置にお
ける雑音処理部の概略構成を示すブロック図である。FIG. 8 is a block diagram showing a schematic configuration of a noise processing unit in the minutes creating apparatus according to the second embodiment of the present invention.

【図９】本発明の第３実施例に係る議事録作成装置の概
略構成を示すブロック図である。FIG. 9 is a block diagram showing a schematic configuration of a minutes creating apparatus according to a third embodiment of the present invention.

【図１０】本発明の第４実施例に係る議事録作成装置の
概略構成を示すブロック図である。FIG. 10 is a block diagram showing a schematic configuration of a minutes creating apparatus according to a fourth embodiment of the present invention.

【図１１】図１０に示した議事録作成装置の議事録作成
処理手順を示すフローチャートである。FIG. 11 is a flowchart showing the minutes creation procedure of the minutes creating apparatus shown in FIG. 10.

【図１２】図１０に示した議事録作成装置の音声データ
メモリ部に記憶される音声信号の記憶形式を示す説明図
である。FIG. 12 is an explanatory diagram showing the storage format of a voice signal stored in the voice data memory unit of the minutes creating apparatus shown in FIG. 10.

【図１３】図１０に示した議事録作成装置の話者名登録
部の話者名登録形式を示す説明図である。FIG. 13 is an explanatory diagram showing the speaker name registration format of the speaker name registration unit of the minutes creating apparatus shown in FIG. 10.

【図１４】図１０に示した議事録作成装置の文字コード
メモリ部により記憶される音声信号の記憶形式を示す説
明図である。FIG. 14 is an explanatory diagram showing the storage format of a voice signal stored by the character code memory unit of the minutes creating apparatus shown in FIG. 10.

【図１５】図１０に示した議事録作成装置により作成さ
れた議事録の出力の一例を示す説明図である。FIG. 15 is an explanatory diagram showing an example of the output of minutes created by the minutes creating apparatus shown in FIG. 10.

【図１６】本発明の第５実施例に係る議事録作成装置の
概略構成を示すブロック図である。FIG. 16 is a block diagram showing a schematic configuration of a minutes creating device according to a fifth embodiment of the present invention.

【図１７】本発明の第６実施例に係る多地点議事録作成
システムが適用される多地点テレビ会議を説明するため
の説明図である。FIG. 17 is an explanatory diagram illustrating a multipoint video conference to which the multipoint minutes recording system according to the sixth embodiment of the present invention is applied.

【図１８】同実施例に係る多地点議事録作成システムに
おける議事録作成装置の概略構成を示すブロック図であ
る。FIG. 18 is a block diagram showing a schematic configuration of a minutes creating device in the multipoint minutes creating system according to the embodiment.

【図１９】同実施例に係る多地点議事録作成システムに
おける議事録作成装置の概略構成を示すブロック図であ
る。FIG. 19 is a block diagram showing a schematic configuration of a minutes creating device in the multipoint minutes creating system according to the embodiment.

【図２０】同実施例に係る多地点議事録作成システムの
多地点議事録作成処理手順を示すフローチャートであ
る。FIG. 20 is a flowchart showing a multipoint minutes preparation process procedure of the multipoint minutes preparation system according to the embodiment.

【図２１】本発明の第７実施例に係る多地点議事録作成
システムにおける議事録作成装置（サブシステム）の概
略構成を示すブロック図である。FIG. 21 is a block diagram showing a schematic configuration of a minutes creating device (subsystem) in a multipoint minutes creating system according to a seventh embodiment of the present invention.

【図２２】同実施例に係る多地点議事録作成システムに
おける議事録作成装置（メインシステム）の概略構成を
示すブロック図である。FIG. 22 is a block diagram showing a schematic configuration of a minutes creating device (main system) in the multipoint minutes creating system according to the embodiment.

【図２３】同実施例に係る多地点議事録作成システムの
概略構成を示すブロック図である。FIG. 23 is a block diagram showing a schematic configuration of a multipoint minutes preparation system according to the embodiment.

【図２４】同実施例に係る多地点議事録作成システムの
多地点議事録作成処理手順を示すフローチャートであ
る。FIG. 24 is a flowchart showing a multipoint minutes preparation process procedure of the multipoint minutes preparation system according to the embodiment.

[Explanation of symbols]

１０１ 音声信号入力部（音声信号入力部） 101 voice signal input unit (voice signal input unit)
１０３ 話者検出部（話者識別部） 103 speaker detection unit (speaker identification unit)
１０４ 音声データメモリ部（音声データ記憶手段） 104 voice data memory unit (voice data storage means)
１０５ 音声認識部（音声認識手段） 105 voice recognition unit (voice recognition means)
１０６ 出力部（出力手段） 106 output unit (output means)
１０７ システムコントローラ（話者コード変換手段、計時手段） 107 system controller (speaker code conversion means, timekeeping means)
１０８ 話者名登録部（話者名登録手段） 108 speaker name registration unit (speaker name registration means)
１０９ マイクロホン 109 microphone
１１０ マイクロホン 110 microphone
１２３ａ 音声信号メモリ部（音声信号記憶手段） 123a voice signal memory unit (voice signal storage means)
１２３ｂ 話者コードメモリ部（話者コード記憶手段） 123b speaker code memory unit (speaker code storage means)
１２４ 文字コードメモリ部（文字コード記憶手段） 124 character code memory unit (character code storage means)
１２５ 編集部（編集手段） 125 editing unit (editing means)
１２６ 表示部（表示手段） 126 display unit (display means)
２１０ 多地点編集装置（多地点編集手段） 210 multipoint editing device (multipoint editing means)

Claims

[Claims]

1. A minutes creating apparatus comprising: voice signal input means having a speaker identification unit that receives a voice signal of a speaker and converts the voice signal into a speaker code for identifying the speaker corresponding to the voice signal; voice data storage means for storing voice data obtained by adding the speaker code output from the speaker identification unit to the voice signal input from the voice signal input means; voice recognition means for converting the voice signal in the voice data stored in the voice data storage means into a character code; and output means for outputting the speaker code in the voice data stored in the voice data storage means and the character code output from the voice recognition means.

2. A minutes creating apparatus comprising: voice signal input means having a speaker identification unit that receives a voice signal of a speaker and converts the voice signal into a speaker code for identifying the speaker corresponding to the voice signal; voice signal storage means for storing the voice signal input from the voice signal input means; speaker code storage means for storing the speaker code output from the speaker identification unit; voice recognition means for converting the voice signal stored in the voice signal storage means into a character code; and output means for outputting the character code output from the voice recognition means and the speaker code output from the speaker code conversion means.

3. The minutes creating apparatus according to claim 1, further comprising: speaker name registration means for registering in advance the name of the speaker corresponding to the speaker code; and speaker name conversion means for converting the speaker code in the voice data stored in the voice data storage means into the speaker name registered in the speaker name registration means, wherein the speaker name output from the speaker name conversion means is added to the character code output from the voice recognition means and output.

4. The minutes creating apparatus according to claim 1, further comprising: character code storage means for storing the character code output from the voice recognition means; display means for displaying the character code stored in the character code storage means as characters; and editing means for editing the content displayed by the display means.

5. A minutes creating apparatus comprising: voice signal input means having a speaker identification unit that receives a voice signal of a speaker and converts the voice signal into a speaker code for identifying the speaker corresponding to the voice signal; timekeeping means for measuring the generation time of the voice signal input from the voice signal input means; voice data storage means for storing voice data obtained by adding the speaker code output from the speaker identification unit to the voice signal input from the voice signal input means; voice recognition means for converting the voice signal in the voice data stored in the voice data storage means into a character code; character code storage means for storing the character code output from the voice recognition means; display means for displaying the character code stored in the character code storage means as characters; editing means for editing the content displayed by the display means; speaker name registration means for registering in advance the speaker name corresponding to the speaker code; speaker name conversion means for converting the stored speaker code into the speaker name registered in the speaker name registration means; and output means for adding, to the character code output from the voice recognition means, the speaker name output from the speaker name conversion means, the generation time of the voice signal measured by the timekeeping means, and a device number for identifying the apparatus, and outputting the result.

6. A minutes creating apparatus comprising: voice signal input means having a speaker identification unit that receives a voice signal of a speaker and converts the voice signal into a speaker code for identifying the speaker corresponding to the voice signal; timekeeping means for measuring the generation time of the voice signal input from the voice signal input means; voice signal storage means for storing the voice signal input from the voice signal input means; speaker code storage means for storing the speaker code output from the speaker identification unit; voice recognition means for converting the voice signal stored in the voice signal storage means into a character code; character code storage means for storing the character code output from the voice recognition means; display means for displaying the character code stored in the character code storage means as characters; editing means for editing the content displayed by the display means; speaker name registration means for registering in advance the speaker name corresponding to the speaker code; speaker name conversion means for converting the speaker code stored in the speaker code storage means into the speaker name registered in the speaker name registration means; and output means for adding, to the character code output from the voice recognition means, the speaker name output from the speaker name conversion means, the generation time of the voice signal measured by the timekeeping means, and a device number for identifying the apparatus, and outputting the result.

7. A multipoint minutes creating system in which the minutes creating apparatus according to claim 5 or 6 is arranged at each of multiple points, the system comprising multipoint editing means for collecting in one place the output data output from the output means of the minutes creating apparatus at each point, wherein the multipoint editing means creates the minutes of a multipoint conference by rearranging the output data based on the generation time of the voice signal measured by the timekeeping means.

8. A multipoint minutes creating system in which the minutes creating apparatus according to claim 5 or 6 is arranged at each of multiple points, wherein the editing means of one of the minutes creating apparatuses at the multiple points serves as multipoint editing means for editing in one place the output data output from the output means of the minutes creating apparatus at each point, and the multipoint editing means creates the minutes of a multipoint conference by rearranging the output data based on the generation time of the voice signal measured by the timekeeping means.