JP3446530B2

JP3446530B2 - Multimedia minutes preparation method and apparatus

Info

Publication number: JP3446530B2
Application number: JP10204997A
Authority: JP
Inventors: 弘行松井; 久靖高田; 喜義山中
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1997-04-18
Filing date: 1997-04-18
Publication date: 2003-09-16
Anticipated expiration: 2017-04-18
Also published as: JPH10294798A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a multimedia minutes preparation method and apparatus, and more particularly to a multimedia minutes preparation method and apparatus capable of automatically preparing minutes by processing the image information and audio information of a video conference.

[0002]

[Prior Art] Meeting minutes are normally prepared after a meeting ends, while confirming what each attendee said, and confirming the contents takes time. For this reason, minutes preparation techniques whose main purpose is to speed up the preparation of meeting minutes have been disclosed. One such example is disclosed in JP-A-6-225302. In this minutes preparation method, each conference participant inputs an ID number, name, and the like as personal information, and this input information is stored together with the progress of the conference and the moving images, still images, audio, and data provided to the conference. After the conference ends, a minutes report is created from the stored information. The created report is then displayed on the monitor of each station and, after corrections are accepted, printed.

[0003] When minutes are prepared by processing image information and audio information and the speaker must be identified in the process, one of two methods is adopted: the speaker's information is first attached using a personal ID card, or the conference seating positions are fixed and the speaker is identified by seat position.

[0004]

[Problems to Be Solved by the Invention] However, the conventional speaker identification methods described above merely store personal information or identify a person by seat position. Consequently, when a person repudiates a statement by claiming "I never said that," there is no means to counter the denial, and accurate meeting minutes cannot be produced.

[0005] The present invention has been made in view of the above points. Its object is to provide a multimedia minutes preparation method and apparatus for video conferences that make it impossible for a person to repudiate what he or she said in a conference and that make the preparation of minutes more efficient.

[0006]

[Means for Solving the Problems] FIG. 1 is a diagram for explaining the principle of the present invention. The present invention is a multimedia minutes preparation method that generates minutes consisting of text information (hereinafter, "text minutes"). Information unique to the speaker of a statement is embedded in the statement contents, which consist of text information, using a digital watermarking technique (step 1), and the result is stored in text information storage means (step 2).

[0007] In the present invention, the text minutes are created by speech recognition of audio information spoken in a remote video conference held over a network. Further, in the present invention, the attendees of the remote video conference are associated with the video conference terminals they use, and the speaker is identified from the video conference terminal at which the audio information was input.

[0008] Further, in the present invention, the text minutes are edited after the video conference ends. Further, in the present invention, speaker-specific font change control is applied to the statement contents.

[0009] Further, in the present invention, the text minutes are converted from text information into audio information to create audio minutes, information unique to the speaker of a statement is embedded in the statement contents, which consist of audio information, using a digital watermarking technique, and the result is stored in audio information storage means. Further, in the present invention, audio minutes consisting of the audio information spoken in a remote video conference held over a network are created, information unique to the speaker of a statement is embedded in the statement contents, which consist of audio information, using a digital watermarking technique, and the result is stored in the audio information storage means. Further, in the present invention, the attendees of the remote video conference are associated with the video conference terminals they use, and the speaker is identified from the video conference terminal at which the audio information was input.

[0010] FIG. 2 is a diagram showing the principle configuration of the present invention. The present invention is a multimedia minutes preparation apparatus that generates minutes consisting of text information (hereinafter, "text minutes"). It comprises digital watermark embedding means 274, which embeds information unique to the speaker of a statement in the statement contents, which consist of text information, using a digital watermarking technique, and storing means 273, which stores the watermarked information in text information storage means 278.

[0011] The present invention has means for creating the text minutes by speech recognition of audio information spoken in a remote video conference held over a network.

[0012] The present invention further has means for associating the attendees of the remote video conference with the video conference terminals they use, and means for identifying the speaker from the video conference terminal at which the audio information was input.
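
The association between attendees and terminals described above can be pictured as a simple lookup table. The following Python fragment is only a minimal sketch: the registry `terminal_to_attendee`, the function names, and the terminal identifiers are hypothetical and do not appear in the patent.

```python
from typing import Dict, Optional

# Hypothetical registry: terminal ID -> attendee name, filled in at login time.
terminal_to_attendee: Dict[str, str] = {}


def register_attendee(terminal_id: str, attendee: str) -> None:
    """Associate an attendee with the video conference terminal they use."""
    terminal_to_attendee[terminal_id] = attendee


def identify_speaker(terminal_id: str) -> Optional[str]:
    """Return the attendee registered at the terminal that received the audio."""
    return terminal_to_attendee.get(terminal_id)


register_attendee("terminal-1", "A")
register_attendee("terminal-2", "B")
speaker = identify_speaker("terminal-1")  # audio arrived at terminal-1
```

An unregistered terminal yields `None`, so a caller can decide how to treat audio whose origin is unknown.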

[0013] The present invention also includes means for editing the text minutes after the video conference ends. The present invention also includes means for applying speaker-specific font change control to the statement contents.

[0014] The present invention further has means for converting the text minutes from text information into audio information to create audio minutes, and means for embedding information unique to the speaker of a statement in the statement contents, which consist of audio information, using a digital watermarking technique and storing the result in audio information storage means. The present invention also has means for creating audio minutes consisting of the audio information spoken in a remote video conference held over a network, embedding information unique to the speaker of a statement in the statement contents, which consist of audio information, using a digital watermarking technique, and storing the result in the audio information storage means.

[0015] The present invention further has means for associating the attendees of the remote video conference with the video conference terminals they use and identifying the speaker from the video conference terminal at which the audio information was input. As described above, according to the present invention, evidence of the speaker's identity is attached to that person's statements in the minutes in a way that the attendee does not notice. By embedding digital watermark information in the video conference image information at this time, the embedded information can serve as evidence if the attendee later denies having made the statement.

[0016] Further, in the present invention, each statement in the proceedings is associated with its speaker, and then either digital watermark information unique to the speaker is added to the text of the statement and displayed, or font change control is applied to the displayed statement text using a cryptographic function. As a result, the conference attendee cannot recognize this from the statement contents, whereas the system can recognize the statement as that attendee's.

[0017] Further, when audio minutes are created, adding the speaker's name to each statement allows the speaker to be determined when the audio information is played back. In the present invention it is also possible to convert the speaker's audio information into text information, store it as minutes, embed digital watermark information, and display the minutes on a screen, or to convert the text minutes into audio information and output the minutes as audio.

[0018] Further, information unique to an attendee, such as a name or ID, can be attached directly to the audio information of a statement and stored as audio minutes, so that attendees can no longer deny their own statements once the minutes are made public.

[0019]

[Embodiments of the Invention] FIG. 3 shows the configuration of a video conference system to which the present invention is applied. As shown in the figure, the video conference system comprises video conference control terminals 20 connected via a network 10. Each conference control terminal 20 is used by one or more attendees of the video conference. Operation of the entire video conference system proceeds according to instructions from the video conference control terminals 20.

[0020] It is assumed that the conference control terminal 20 has an information processing function for the images (persons and text) in the conference. FIG. 4 shows the configuration of the video conference control terminal of the present invention. The terminal 20 connected to the network 10 comprises a video input camera 21, a video output screen 22, an audio input microphone 23, an audio output device 24, an ID card reader 25, a personal identification device 26, a video conference information processing device 27, a video conference control terminal 28, and a network connection and video conference control device 29.

[0021] Of these, the ID card reader 25, the personal identification device 26, and the video conference information processing device 27 provide the automatic minutes preparation function that is the gist of the present invention. The video input camera 21 photographs the attendees and supplies the moving image information to the other video conference control terminals 20 via the network 10 and to the video output screen 22. When an attendee faces the camera, the video input camera 21 supplies the attendee's face image to the personal identification device 26. When there are several attendees at one terminal, a face image is captured for each attendee.

[0022] The video output screen 22 displays the moving image information input from the video input camera 21 and the minutes and the like processed by the terminal's own video conference information processing device 27 or by another video conference control terminal 20. The audio input microphone 23 receives the audio information spoken by the attendees and supplies it to the audio output device 24, the video conference information processing device 27, and the other video conference control terminals 20. Each attendee is assumed to state his or her name at the start of the conference, and the name is passed to the personal identification device 26 and the video conference information processing device 27. When there are several attendees, either an audio input microphone 23 is provided for each attendee, or a number is assigned to each attendee and each utterance is linked to the attendee who made it.

[0023] The audio output device 24 outputs, to a speaker or the like, the audio information supplied from the audio input microphone 23 and the contents of audio minutes supplied from the video conference information processing device 27 of its own video conference control terminal 20 or of another video conference control terminal 20. The ID card reader 25 reads the ID card held by an attendee, obtains the attendee's ID number and name written on it, and supplies this information to the personal identification device 26 and the video conference information processing device 27.

[0024] The personal identification device 26 compares information held in advance with the ID or password input by the attendee and determines whether they match. The video conference information processing device 27 acquires, as audio information, the statements passed from the audio input microphone 23 of its own video conference control terminal 20 or of another terminal 20, and either converts them into text information or handles them as audio information as they are. It also creates minutes from the text information or audio information and, using the digital watermarking technique that characterizes the present invention, embeds information identifying the speaker in the text information or audio information of the minutes.
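
The matching performed by the personal identification device 26 is described only as a comparison against information held in advance. As an illustration, the sketch below assumes a hypothetical table `REGISTERED` of user IDs and passwords; the table, the function name, and the use of a constant-time comparison are assumptions, not details from the patent.

```python
import hmac

# Hypothetical pre-registered credentials: user ID -> password.
# A production system would store salted password hashes, not plain text.
REGISTERED = {"A": "secret-a", "B": "secret-b"}


def verify_attendee(user_id: str, password: str) -> bool:
    """Return True when the entered password matches the registered one."""
    expected = REGISTERED.get(user_id)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode("utf-8"), password.encode("utf-8"))


is_attendee_a = verify_attendee("A", "secret-a")
```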

[0025] FIG. 5 shows the configuration of the video conference information processing device of the present invention. As shown in the figure, the device comprises an information acquisition unit 271, an image information editing unit 272, an audio information editing unit 273, a digital watermark embedding unit 274, an image information storage unit 275, an audio information storage unit 276, a text information storage unit 278, and a minutes editing unit 277. The information acquisition unit 271 acquires the image information of attendees captured by the video input camera 21, the audio information of attendees collected by the audio input microphone 23 of its own or another video conference control terminal 20, the recognition result from the personal identification device 26, and so on. It passes the audio information to the audio information editing unit 273 and the image information to the image information editing unit 272.

[0026] The image information editing unit 272 either edits the minutes passed from the minutes editing unit 277 (described later) by embedding the speaker's unique information with the digital watermarking technique, or edits minutes information obtained from the audio information editing unit 273, from the minutes editing unit 277, or via the network 10 for display on the video output screen 22. The information acquired from the video input camera 21 and the edited minutes are stored in the image information storage unit 275.

[0027] When the minutes are to be expressed as text information, the audio information editing unit 273 performs speech recognition on the attendees' utterances, which are the audio information input from the audio input microphone 23, converts the audio information into text information, and stores the converted text information in the text information storage unit 278 together with the speaker's unique information. When the minutes are to be expressed as audio information, it attaches the speaker's name, obtained from the audio input microphone 23, to the audio information and stores it in the audio information storage unit 276.

[0028] When the video conference ends, the minutes editing unit 277 edits into minutes the text information that the audio information editing unit 273 converted utterance by utterance and accumulated in the text information storage unit 278. The minutes editing unit 277 can also convert the text information stored as minutes in the text information storage unit 278 into audio information, store it temporarily in the audio information storage unit 276 as audio minutes, and output that audio information from the audio output device 24 at output time.

[0029] The image information editing unit 272 edits, for on-screen display, either the minutes edited by the minutes editing unit 277 or the information converted into text for each utterance by the audio information editing unit 273 and stored as minutes in the text information storage unit 278. At this time, it instructs the digital watermark embedding processing unit 274 to embed the speaker's unique information using the digital watermarking technique.

[0030] The digital watermark embedding processing unit 274 has the following functions. FIG. 6 is a diagram for explaining its operation. Digital watermark embedding is a technique that embeds separate information in digital information (images, both still and moving, and audio) so that it is not perceived by humans, and that allows the embedded information to be extracted when needed.

[0031] The principle of the digital watermarking technique is described below with reference to FIG. 6 for the case where the digital information is image information. For details, see Japanese Patent Application Nos. 8-305370 and 8-338769. FIG. 6(a) shows the flow of processing for embedding separate information (embedding information) in image information. In the decomposition process (step 101), the original image is decomposed into a plurality of blocks, each of n pixels × m pixels. For a moving image, the image is first divided into frames or the like, and each frame is decomposed into a plurality of blocks.
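
The decomposition process (step 101) can be sketched as follows. This is a minimal illustration that assumes the image is a 2-D list of pixel values whose height and width are exact multiples of the block size, and that n counts rows and m counts columns; the function name `split_blocks` is hypothetical.

```python
def split_blocks(image, n, m):
    """Split a 2-D list of pixels into blocks of n rows by m columns.

    Assumes the image height is a multiple of n and the width a multiple
    of m. Blocks are returned in row-major order.
    """
    height, width = len(image), len(image[0])
    if height % n or width % m:
        raise ValueError("image size must be a multiple of the block size")
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, m):
            blocks.append([row[left:left + m] for row in image[top:top + n]])
    return blocks


# A 4x4 test image with pixel values 0..15, split into four 2x2 blocks.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = split_blocks(image, 2, 2)
```

For a moving image, the same function would be applied to each frame in turn.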

[0032] In the orthogonal transform process (step 102), an orthogonal transform such as the discrete cosine transform (DCT) is applied to each block obtained in the decomposition process (step 101), yielding an n × m frequency component matrix. Before the embedding information is embedded, the embedding positions, which determine where in the frequency component matrix the embedding information is placed, are determined by random numbers, and a change amount, which indicates how much the value of the frequency component at each position is to be changed, is also determined. The determined embedding positions and change amount are held as key information. In the embedding process (step 103), the embedding information need not all be embedded in the frequency component matrix of a single block; it may be spread across the frequency component matrices of several blocks. By selecting, for example, the low-frequency part of the frequency component matrix as the embedding position, the information can be embedded so that humans cannot perceive it. Because changing the change amount changes the difference from the original values of the frequency component matrix, degradation of image quality can be controlled. The embedding process changes the values in the frequency component matrix of each block according to the embedding positions and change amount in the key information, thereby embedding the embedding information.
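
Steps 102 and 103 can be illustrated with a small pure-Python sketch. The patent does not state the exact rule by which a coefficient is changed, so the fragment below substitutes a simple parity quantisation rule as an assumption: each keyed coefficient is moved to a multiple of the change amount `delta` whose index parity equals the bit. One bit per block, the exclusion of the DC coefficient, and all function names are likewise assumptions made for illustration.

```python
import math
import random


def dct2(block):
    """Orthonormal 2-D DCT-II of an n x m block (pure Python, small blocks only)."""
    n, m = len(block), len(block[0])

    def alpha(k, size):
        return math.sqrt((1 if k == 0 else 2) / size)

    out = [[0.0] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            total = 0.0
            for x in range(n):
                for y in range(m):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * m)))
            out[u][v] = alpha(u, n) * alpha(v, m) * total
    return out


def make_key(num_blocks, n, m, delta, seed):
    """Choose one random low-frequency position per block (blocks of 2x2 or larger).

    The positions and the change amount delta together form the key information.
    """
    rng = random.Random(seed)
    low = [(u, v) for u in range((n + 1) // 2) for v in range((m + 1) // 2)
           if (u, v) != (0, 0)]  # assumption: leave the DC term untouched
    return {"positions": [rng.choice(low) for _ in range(num_blocks)],
            "delta": delta}


def embed_bits(freq_blocks, bits, key):
    """Embed one bit per block by quantising the keyed coefficient (in place)."""
    delta = key["delta"]
    for coeffs, (u, v), bit in zip(freq_blocks, key["positions"], bits):
        q = round(coeffs[u][v] / delta)
        if q % 2 != bit:
            q += 1
        coeffs[u][v] = q * delta


block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
bits = [1, 0, 1]
key = make_key(len(bits), 4, 4, delta=8.0, seed=1997)
freq = [dct2(block) for _ in bits]  # three identical blocks, for brevity
embed_bits(freq, bits, key)
```

A larger `delta` makes the embedded bit more robust but moves the coefficient further from its original value, which is the quality trade-off the paragraph describes.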

[0033] In the inverse orthogonal transform process (step 104), the frequency component matrix of each block in which the embedding information has been embedded by the embedding process is inversely transformed to obtain a block image of n pixels × m pixels. In the reconstruction process (step 105), the block images obtained in the inverse orthogonal transform process (step 104) are joined together to obtain a watermarked image in which the embedding information is embedded.
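
Steps 104 and 105 can be sketched in the same spirit. The fragment assumes that the forward transform was an orthonormal 2-D DCT-II, so that its inverse is the DCT-III written out below, and that blocks are held in row-major order; the function names are hypothetical.

```python
import math


def idct2(coeffs):
    """Inverse of the orthonormal 2-D DCT-II (pure Python, small blocks only)."""
    n, m = len(coeffs), len(coeffs[0])

    def alpha(k, size):
        return math.sqrt((1 if k == 0 else 2) / size)

    out = [[0.0] * m for _ in range(n)]
    for x in range(n):
        for y in range(m):
            total = 0.0
            for u in range(n):
                for v in range(m):
                    total += (alpha(u, n) * alpha(v, m) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * m)))
            out[x][y] = total
    return out


def join_blocks(blocks, blocks_per_row):
    """Stitch row-major blocks back into a single 2-D image."""
    image = []
    for start in range(0, len(blocks), blocks_per_row):
        band = blocks[start:start + blocks_per_row]
        for r in range(len(band[0])):
            image.append([px for blk in band for px in blk[r]])
    return image


# A matrix holding only a DC term turns back into a flat block.
flat = idct2([[4.0, 0.0], [0.0, 0.0]])
joined = join_blocks([[[0, 1], [4, 5]], [[2, 3], [6, 7]]], blocks_per_row=2)
```

In practice the reconstructed pixel values would also be rounded and clipped to the image's pixel range, a step this sketch leaves out.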

[0034] FIG. 6(b) shows the flow of processing for extracting the embedded information from a watermarked image. In the decomposition process (step 201), the watermarked image is decomposed into a plurality of blocks, each of n pixels × m pixels. In the orthogonal transform process (step 202), an orthogonal transform is applied to each block obtained in the decomposition process (step 201), yielding an n × m frequency component matrix. In the extraction process (step 203), the embedding positions and change amount are obtained from the key information used in the embedding process (step 103), and the embedded information is extracted from the frequency component matrix of each block.
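
The extraction process (step 203) depends on how the bits were written. The sketch below assumes, purely for illustration, that one bit was written per block by moving the keyed coefficient to a multiple of the change amount `delta` whose index parity equals the bit; the patent itself only says that the positions and change amount are taken from the key information. The input is the list of frequency component matrices produced by step 202.

```python
def extract_bits(freq_blocks, key):
    """Read one bit per block from the coefficient position named in the key.

    Assumed embedding rule: the coefficient sits near q * delta and the
    parity of q is the embedded bit.
    """
    delta = key["delta"]
    bits = []
    for coeffs, (u, v) in zip(freq_blocks, key["positions"]):
        bits.append(round(coeffs[u][v] / delta) % 2)
    return bits


# Hand-made frequency matrices standing in for the output of step 202.
key = {"positions": [(0, 1), (1, 0)], "delta": 8.0}
freq_blocks = [
    [[100.0, 24.1], [3.0, 0.0]],    # 24.1 / 8 rounds to 3, odd, so bit 1
    [[90.0, 1.0], [-15.8, 2.0]],    # -15.8 / 8 rounds to -2, even, so bit 0
]
recovered = extract_bits(freq_blocks, key)
```

Without the key, a reader does not know which coefficient in each block carries a bit or which step size to use, which mirrors the property that extraction requires the key information.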

[0035] When the digital information is audio information, the specific processing for embedding and extraction differs from the image case. As with image information, however, the embedding information is embedded in a redundant part of the audio information, the position information and the like are used as key information, and the embedding information can be embedded and extracted based on this key information. As described above, the digital watermarking technique has the following features: (1) the embedded information cannot be extracted without the key information used for embedding; (2) the embedding positions held in the key information are generated from random numbers and are therefore not fixed, which makes the embedded information difficult to decipher; (3) by choosing the embedding positions carefully, the information can be embedded so that humans cannot perceive it; and (4) by varying the change amount, the degree of image quality degradation can be controlled.
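
The patent gives no concrete procedure for audio. As one possible reading of "a redundant part of the audio information", the sketch below hides bits in the least significant bit of integer PCM samples at randomly chosen positions, with those positions serving as the key. Least-significant-bit embedding is a substitute technique chosen for illustration, not the method of the patent, and all names are hypothetical.

```python
import random


def make_audio_key(num_samples, num_bits, seed):
    """Pick distinct random sample positions; the list of positions is the key."""
    return random.Random(seed).sample(range(num_samples), num_bits)


def embed_audio(samples, bits, key):
    """Return a copy of the samples with each bit written into a keyed LSB."""
    marked = list(samples)
    for pos, bit in zip(key, bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked


def extract_audio(samples, key):
    """Read the embedded bits back from the keyed sample positions."""
    return [samples[pos] & 1 for pos in key]


samples = [1000, -2000, 3000, -4001, 5000, 6001, -7000, 8000]
bits = [1, 0, 1, 1]
audio_key = make_audio_key(len(samples), len(bits), seed=7)
marked = embed_audio(samples, bits, audio_key)
recovered = extract_audio(marked, audio_key)
```

Each sample changes by at most one quantisation step, which is why the change is hard to hear, although this simple variant would not survive lossy audio compression.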

[0036] The digital watermark embedding processing unit 274 embeds the embedding information in the minutes, based on the key information acquired from the information acquisition unit 271 and the minutes acquired from the image information editing unit 272. The image information storage unit 275 stores the image information of the minutes edited by the image information editing unit 272.

[0037] The text information storage unit 278 stores the statement contents converted into text information by the audio information editing unit 273. The video conference control terminal 28 controls the video input camera 21, the video output screen 22, the audio input microphone 23, the audio output device 24, the ID card reader 25, the personal identification device 26, the video conference information processing device 27, the video conference control terminal 28, and the like as the video conference proceeds.

[0038]

[Examples] An embodiment of the present invention is described below with reference to the drawings. FIG. 7 is a flowchart of the minutes preparation process of one embodiment of the present invention.
Step 301) When using the video conference terminal 20, attendee A first enters a user ID or password into the video conference terminal 20, or inserts an ID card into the ID card reader 25 and enters a user ID or password. Alternatively, attendee A faces the video input camera 21 so that it captures his or her face and says "I am A" into his or her audio input microphone 23.

[0039] Step 302) The personal identification device 26 compares the information entered by any of the above methods with the information registered in advance and, if they match, determines that the terminal is being used by attendee A.
Step 303) Next, when the video conference starts, preparation of the minutes begins in parallel.

[0040] First, each attendee's statements are acquired by the information acquisition unit 271 of the video conference information processing device 27, and speech recognition is performed by the speech recognition function of the audio information editing unit 273 in order to prepare the minutes.
Step 304) The recognized audio information is converted into text information in the audio information editing unit 273.

[0041] Step 305) The converted text information is stored in the text information storage unit 278. Since who is speaking is already known from steps 301 and 302, the speaker's name is appended after the statement contents obtained by speech conversion.
Step 306) The minutes editing unit 277 performs processing such as correcting the proceedings stored in the text information storage unit 278, and the final minutes are produced.
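
Steps 305 and 306 can be pictured as building one line of minutes per utterance, with the speaker's name placed after the recognized text. The sketch below assumes that speech recognition has already produced the text and that a terminal-to-attendee table exists from steps 301 and 302; the `Utterance` class, the parenthesized name format, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    terminal_id: str  # terminal at which the audio was input
    text: str         # output of speech recognition (not modelled here)


def to_minutes_line(utterance, terminal_to_attendee):
    """Append the speaker's name after the recognized statement text."""
    speaker = terminal_to_attendee[utterance.terminal_id]
    return f"{utterance.text} ({speaker})"


def compile_minutes(utterances, terminal_to_attendee):
    """Join the per-utterance lines into one minutes document."""
    return "\n".join(to_minutes_line(u, terminal_to_attendee) for u in utterances)


attendees = {"terminal-1": "A", "terminal-2": "B"}
utterances = [
    Utterance("terminal-1", "I propose option X."),
    Utterance("terminal-2", "I agree."),
]
minutes = compile_minutes(utterances, attendees)
```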

[0042] Step 307) In the image information editing unit 272, the speaker information (name, etc.) produced by the digital watermark embedding processing unit 274 is embedded into the character information of the meeting stored in the character information storage unit 278.
Step 308) The image information editing unit 272 edits the result so that the embedded speaker information is displayed together, in visible form, at an arbitrary place such as the end of the image. Alternatively, in synchronization with the proceedings, it may edit the character information provided by the minutes editing unit 277, in which the speaker's unique information has been embedded by the digital watermark embedding process, so that it is displayed in arbitrarily divided regions of the video output screen 22.

[0043] Next, the series of processes for embedding the speaker's name after the utterance by digital watermarking and displaying the result as image information is described. The digital watermark embedding processing unit 274 embeds the speaker's name, ID, and the like into the utterance that the voice information editing unit 273 has converted into character information. The image information editing unit 272 acquires the resulting information and stores it in the image information storage unit 275.
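The patent does not state which watermarking technique is applied to the character information. As one possible illustration, and purely as an assumption, the speaker ID can be hidden in text as a run of zero-width Unicode characters appended after the utterance, so the utterance reads normally while the ID stays invisible. The function names are hypothetical.

```python
ZW0 = "\u200b"  # zero-width space, encodes bit 0
ZW1 = "\u200c"  # zero-width non-joiner, encodes bit 1

def embed_speaker_id(utterance, speaker_id):
    """Append the speaker ID to the utterance as invisible zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in speaker_id.encode("utf-8"))
    mark = "".join(ZW1 if bit == "1" else ZW0 for bit in bits)
    return utterance + mark

def extract_speaker_id(marked):
    """Recover the speaker ID from the zero-width characters in the text."""
    bits = "".join("1" if ch == ZW1 else "0" for ch in marked if ch in (ZW0, ZW1))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")

def visible_text(marked):
    """Return the text as a reader sees it, without the hidden mark."""
    return "".join(ch for ch in marked if ch not in (ZW0, ZW1))
```

This toy scheme is easy to strip and assumes the utterance itself contains no zero-width characters; a robust watermark would have to survive editing and reformatting.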

[0044] If stricter embedding is required, the digital watermark embedding processing unit 274 embeds information that has been digitally signed with the speaker's private key (RSA or similar). This private key is stored in the ID card or in the video conference terminal. The present invention places no restriction on whether the speaker knows that such a watermark, which helps prevent repudiation of the utterance, is embedded in his or her own utterances.
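The signing step can be illustrated with a toy textbook RSA signature built only on the Python standard library. This is a sketch under stated assumptions, not a secure implementation: the primes are well-known Mersenne primes, there is no padding, and the names `sign` and `verify` are hypothetical. A real system would use a vetted cryptographic library and a key held on the ID card or terminal.

```python
import hashlib

# Toy key material (assumption for illustration; publicly known primes, NOT secure).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (Python 3.8+)

def _digest(message):
    # Hash the message and reduce it into the range of the modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Sign with the speaker's private key."""
    return pow(_digest(message), D, N)

def verify(message, signature):
    """Check the signature with the speaker's public key."""
    return pow(signature, E, N) == _digest(message)
```

Because only the holder of the private key can produce a signature that verifies, a signed utterance is hard for the speaker to repudiate later, which is the property the paragraph above relies on.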

[0045] The character information stored in the character information storage unit 278 is edited in the minutes editing unit 277 and sent to the network 10 via the image information editing unit 272. A video conference terminal 20 that receives it over the network 10 can thus obtain image information in which the watermark has been inserted. FIG. 8 shows an image of minutes obtained over the network 10 in this way. In practice, the name, ID, and the like shown in the figure are not visible, because they are encrypted uniquely for each speaker before being embedded.

[0046] Furthermore, when the image information editing unit 272 displays the attendees' utterances on the video output screen 22 as part of displaying the minutes, font change control can be applied at the time of digital embedding. FIG. 9 shows an example. In it, font change control displays the utterances of (Yamamoto) in a block typeface, while the utterances of (Onishi) and (Nishikawa) are displayed in a Mincho typeface. The method shown in FIG. 10 is used to achieve this. In FIG. 10, when an utterance is input in a block typeface, font change control in the digital embedding process takes a conversion key P as input, converts the block typeface to Mincho, and outputs the result. The resulting sequence of fonts is thus determined by specifying the conversion key P from the terminal and feeding the utterance into the conversion (encryption) function F.
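The patent does not define the conversion function F. As a sketch only, assuming F is a keyed hash, the font assigned to each character can be derived from an HMAC of the utterance under the conversion key P, so that the font sequence depends on both the key and the content. The function name `font_sequence` and the two font labels are hypothetical.

```python
import hashlib
import hmac

FONTS = ("block", "mincho")  # the two typefaces mentioned for FIG. 9

def font_sequence(conversion_key, utterance):
    """Derive one font label per character from the key P and the utterance."""
    digest = hmac.new(conversion_key, utterance.encode("utf-8"), hashlib.sha256).digest()
    fonts = []
    for i in range(len(utterance)):
        byte = digest[(i // 8) % len(digest)]   # reuse digest bytes cyclically
        bit = (byte >> (i % 8)) & 1             # one bit selects one of two fonts
        fonts.append(FONTS[bit])
    return fonts
```

Someone who knows the key can recompute the sequence and check that the displayed fonts match, whereas the pattern carries no obvious meaning to other viewers.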

[0047] In the example above, voice information is converted into character information and held in the character information storage unit 278, and the image information editing unit 272 outputs that character information as the minutes. The invention is not limited to this example, however; the minutes can also be provided as voice information. Such an example is described with reference to FIG. 11, which is a flowchart showing the operation of creating voice minutes according to one embodiment of the present invention.

[0048] Step 401) Operation of the video conference system starts, and the personal identification device 26 associates attendees with terminals.
Step 402) If voice minutes are to be created, processing proceeds to the steps below. If minutes consisting of character information are to be created instead, processing proceeds to the procedure described earlier, and this procedure ends.

[0049] Step 403) Condition selection: if the minutes creation type is one in which the speaker's name is added to the voice information of the utterance, the voice information editing unit 273 does not convert the acquired utterances into character information. Instead, in parallel with the proceedings and until the video conference ends, it stores them as voice information in the voice information storage unit 276, and processing proceeds to Step 406.

[0050] Step 404) If the utterances have been converted into character information, the voice information editing unit 273 acquires and edits the character information, on which the embedding process has been performed, each time an utterance is made.
Step 405) The voice information editing unit 273 converts the edited character information into voice information and stores it in the voice information storage unit 276.

[0051] Step 406) The voice information stored in the voice information storage unit 276 is output.
As described above, the ways of editing the minutes in the present invention are broadly classified as follows.
(1) Unique information identifying the speaker (name, ID, etc.) is attached to the voice information acquired from the speaker, and the result is stored as voice minutes.

[0052] (2) The voice information acquired from the speaker is converted into character information, the embedding process using digital watermarking is applied to that information, and the result is output as the minutes.
(3) The text minutes are converted back into voice information and stored as voice minutes.
(4) The minutes are generated at the moment an utterance is made during the video conference, and the above processing is performed for each utterance.

[0053] (5) The minutes are generated when the video conference ends. Each utterance is simply held as utterance information (character information or voice information), and after the conference ends, the text minutes or voice minutes accumulated during the conference are consolidated and edited into the minutes.
(6) For voice information, information identifying the speaker is set either with or without applying digital watermarking.
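The two timing options in the list above, generating a minutes entry at each utterance or consolidating everything when the conference ends, can be sketched as follows. The class `MinutesRecorder`, its method names, and the `"content (speaker)"` entry format are assumptions made for illustration.

```python
class MinutesRecorder:
    """Generate minutes per utterance, or hold utterances until the conference ends."""

    def __init__(self, per_utterance):
        self.per_utterance = per_utterance
        self.pending = []   # raw utterances held until the conference ends
        self.minutes = []   # finished minutes entries

    @staticmethod
    def _make_entry(speaker, content):
        # Placeholder for the embedding and editing applied to each utterance.
        return f"{content} ({speaker})"

    def on_utterance(self, speaker, content):
        if self.per_utterance:
            self.minutes.append(self._make_entry(speaker, content))
        else:
            self.pending.append((speaker, content))

    def on_conference_end(self):
        # Consolidate anything that was only held during the conference.
        for speaker, content in self.pending:
            self.minutes.append(self._make_entry(speaker, content))
        self.pending.clear()
        return list(self.minutes)
```

Both modes yield the same final minutes; they differ only in when the per-utterance processing cost is paid and in whether partial minutes are available during the conference.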

[0054] In the present invention, the minutes editing methods described above can be selected or combined as appropriate. The present invention is not limited to the above embodiments, and various modifications and applications are possible within the scope of the claims.

【００５５】[0055]

[Effects of the Invention] As described above, according to the multimedia minutes creation method and system of the present invention, the speaker is identified and unique information such as the speaker's name is embedded in the utterance (voice information or character information) using digital watermarking (the unique information may also be embedded directly). As a result, the speaker cannot deny what he or she said.

[0056] Further, according to the present invention, various methods and combinations can be selected when creating the minutes. To keep the minutes as a document, the minutes expressed as character information can be printed. To display them on a screen, the character information can be placed at a position on the screen and displayed. To output the minutes as audio, the minutes expressed as voice information can be output from a speaker or similar device.

[0057] In addition, with digital watermarking the utterance itself can be seen, but the watermark information, that is, the unique information, cannot be seen at the speaker's video conference terminal. Moreover, when the minutes are generated as voice information and digital watermarking is not applied to that voice information, the speaker's name is embedded directly in the voice information, so the speaker cannot deny the utterance, because it carries both its content and his or her name.

[Brief description of drawings]

FIG. 1 is a diagram for explaining the principle of the present invention.

FIG. 2 is a diagram showing the principle configuration of the present invention.

FIG. 3 is a configuration diagram of the video conference system of the present invention.

FIG. 4 is a configuration diagram of the video conference terminal of the present invention.

FIG. 5 is a configuration diagram of the video conference information processing apparatus of the present invention.

FIG. 6 is a diagram for explaining the operation of the digital watermark embedding processing unit of the present invention.

FIG. 7 is a flowchart of the minutes creation process according to one embodiment of the present invention.

FIG. 8 is a diagram showing an image of the minutes according to one embodiment of the present invention.

FIG. 9 is an example of identifying utterance contents by font change control according to one embodiment of the present invention.

FIG. 10 is a diagram for explaining the font control operation according to one embodiment of the present invention.

FIG. 11 is a flowchart showing the operation of creating voice minutes according to one embodiment of the present invention.

[Explanation of symbols]

10 network
20 video conference terminal
21 video input camera
22 video output screen
23 voice input microphone
24 audio output device
25 ID card reader
26 personal identification device
27 video conference information processing apparatus
28 video conference control terminal
29 network connection and video conference control device
100 voice information processing means
110 voice recognition means
200 minutes generation means
210 embedding means
220 unique information adding means
271 information acquisition unit
272 image information editing unit
273 voice information editing unit
274 digital watermark embedding processing unit
275 image information storage unit
276 voice information storage unit
277 minutes editing unit
278 character information storage unit
300 image conversion means
400 attendee association means
500 character information conversion means
600 text minutes storage means
700 minutes editing means
800 voice minutes storage means
900 minutes editing means

Continuation of front page (56) References: JP-A-7-191690 (JP, A) (58) Fields searched (Int. Cl.⁷, DB name): H04M 3/56

Claims

(57) [Claims]

1. A multimedia minutes creation method for generating minutes consisting of character information (hereinafter, text minutes), characterized in that information unique to the speaker of an utterance is embedded, using digital watermarking technology, in the utterance content consisting of character information, and the result is stored in character information storage means.

2. The multimedia minutes creation method according to claim 1, wherein the text minutes are created by speech recognition of voice information uttered in a remote video conference held over a network.

3. The multimedia minutes creation method according to claim 2, wherein attendees of the remote video conference are associated with the video conference terminals they use, and the speaker is identified from the video conference terminal at which the voice information was input.

4. The multimedia minutes creation method according to claim 2 or 3, wherein the text minutes are edited after the video conference ends.

5. The multimedia minutes creation method according to claim 1, wherein font change control unique to the speaker is performed on the utterance content.

6. The multimedia minutes creation method according to claim 5, wherein the text minutes are converted from character information into voice information to create voice minutes, information unique to the speaker of an utterance is embedded, using digital watermarking technology, in the utterance content consisting of voice information, and the result is stored in voice information storage means.

7. A multimedia minutes creation method, characterized in that, in a remote video conference held over a network, voice minutes consisting of voice information are created, and information unique to the speaker of an utterance is embedded, using digital watermarking technology, in the utterance content consisting of voice information and stored in voice information storage means.

8. The multimedia minutes creation method according to claim 7, wherein attendees of the remote video conference are associated with the video conference terminals they use, and the speaker is identified from the video conference terminal at which the voice information was input.

9. A multimedia minutes creation apparatus for generating minutes consisting of character information (hereinafter, text minutes), characterized by having means for embedding, using digital watermarking technology, information unique to the speaker of an utterance in the utterance content consisting of character information, and storing the result in character information storage means.

10. The multimedia minutes creation apparatus according to claim 9, further comprising means for creating the text minutes by speech recognition of voice information uttered in a remote video conference held over a network.

11. The multimedia minutes creation apparatus according to claim 10, comprising means for associating attendees of the remote video conference with the video conference terminals they use, and means for identifying the speaker from the video conference terminal at which the voice information was input.

12. The multimedia minutes creation apparatus according to claim 9, further comprising means for editing the text minutes after the video conference ends.

13. The described multimedia minutes creation apparatus, including means for performing font change control unique to the speaker on the utterance content.

14. The multimedia minutes creation apparatus according to claims 9 through 12, having means for converting the text minutes from character information into voice information to create voice minutes, and means for embedding, using digital watermarking technology, information unique to the speaker of an utterance in the utterance content consisting of voice information and storing the result in voice information storage means.

15. A multimedia minutes creation apparatus, characterized by having means for creating, in a remote video conference held over a network, voice minutes consisting of voice information, embedding, using digital watermarking technology, information unique to the speaker of an utterance in the utterance content consisting of voice information, and storing the result in voice information storage means.

16. The multimedia minutes creation apparatus according to claim 15, having means for associating attendees of the remote video conference with the video conference terminals they use, and means for identifying the speaker from the video conference terminal at which the voice information was input.