JP3552338B2

JP3552338B2 - Complex information processing device

Info

Publication number: JP3552338B2
Application number: JP12510795A
Authority: JP
Inventors: 浩史川本; 宏石川
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1995-05-24
Filing date: 1995-05-24
Publication date: 2004-08-11
Anticipated expiration: 2019-08-11
Also published as: JPH08320940A

Description

【０００１】
【産業上の利用分野】
本発明は、画像と音声との双方を入力して画像と音声との双方からなる複合情報を生成する複合情報処理装置、および、その複合情報に基づいて画像と音声の双方を出力する複合情報処理装置に関する。
【０００２】
【従来の技術】
従来、文章や画像、図表のような視覚的情報と、発話音声や音楽、効果音のような音声情報は、もっぱら各々の情報モードに切りわけて取り扱われて来た。例えば紙メディアに代表される視覚的情報は、物理的な郵送やファクシミリ、ＬＡＮ上での電子メール、テレックス、電報などの通信経路で流通し、利用者には紙面、もしくは、電子的あるいは光学的表示画面の上に提供されてきた。その一方で音声情報は、電話やラジオ、ＴＶ等の放送などの通信路、およびＣＤ、テープなどのオフライン媒体を中心に伝達、流通されてきた。
【０００３】
しかし近年、複合情報、すなわち視覚情報と音声情報との双方を合わせてデジタル記録し保存・再利用するものとして、パーソナルコンピュータ上でのマルチメディア媒体（あるいはシステム）が普及しはじめている。例えばアップルコンピューターではクイックタイム（ＱｕｉｃｋＴｉｍｅ）というソフトウエアがマッキントッシュ上に蓄積された音声と動画を再生できるし、いくつかの動画を作るアプリケーションソフトも発売されている。
【０００４】
特に最近では、ネットワーク上の通信端末から視覚情報と音声情報との双方をアクセスする技術も開発されつつある。特開平４−１０５１１４号公報に記載された装置においては、音声に加えて画像情報をスキャナで取り込み、タブレット等のポインティング・デバイスで双方向に操作することのできる通信端末が報告されている。
【０００５】
また、プレゼンテーション用として、音声つきの簡単なアニメーションを編集するソフトウエア・ツールも開発されている。
【０００６】
【発明が解決しようとする課題】
しかしながら、従来のマルチメディア・コミュニケーションにおいては、利用者が作成した画像情報と音声情報は、ページ単位と文章単位のように、比較的大きな単位で分離され、保存、伝達されてきた。そのため、ボイスつき電子メールなどでは、文書中に音声を挿入することはできるが、画像の作成過程と音声の説明とを関連づけることはできない。したがって、例えば地図を使って道順を説明したり、設計図を用いて機械の動作を説明しようとするときに、音声情報が画像情報のどこを指示しているかが明確にならず、マルチメディアとしての効力が十分には発揮されない。
【０００７】
前述のアニメーション編集ソフトウエア・ツールには、画像を細かく制御する機能も提供されているものがあるが、これを利用するためには多くの時間をかけて細かい編集作業を遂行する必要があり、利用者がリアルタイムに記録、通信に利用するための道具としてまったく不向きである。
本発明は、上記事情に鑑み、目と耳との双方から統一のとれた情報を受け取ることのできるシステムに適した複合情報処理装置を提供することを目的とする。
【０００８】
【課題を解決するための手段】
上記目的を達成する本発明の第１の複合情報処理装置は、
（１−１）複数のストロークからなる線画を入力する線画入力操作子
（１−２）音声を収録する音声収録手段
（１−３）線画入力操作子の操作により入力された各ストロークの入力順もしくは入力時刻を計測する計測手段
（１−４）線画入力操作子の操作により入力された各ストロークを表わす各ストローク情報と、上記計測手段により計測された、各ストロークの入力順もしくは入力時刻と、各ストロークの入力期間中に音声収録手段により収録された各音声を表わす各音声情報とが互いに対応づけられてなる複合情報を生成する複合情報生成手段
を備えたことを特徴とする。
【０００９】
また上記目的を達成する本発明の第２の複合情報処理装置は、（２−１）表示画面上に、複数のストロークから構成される線画を描画する線画描画手段
（２−２）音声を発音する発音手段
（２−３）線画を構成する各ストロークを表わす各ストローク情報と、各ストロークの入力順もしくは入力時刻と、各ストロークに対応する各音声を表わす各音声情報とが互いに対応づけられてなる複合情報を構成する上記入力順もしくは入力時刻に基づいて互いに対応をとりながら、上記複合情報を構成する各ストローク情報に基づく各ストロークを上記表示画面に表示させるとともに、上記複合情報を構成する各音声情報に基づく各音声を上記発音手段により発音させる出力制御手段
を備えたことを特徴とする。
【００１０】
ここで、上記本発明の第２の複合情報処理装置において、上記出力制御手段が、各ストロークに対応する各音声の持続時間に応じた描画速度で、各ストロークを表示画面に描画させるものであることが好ましい。
あるいは、上記本発明の第２の複合情報処理装置において、上記出力制御手段が、各ストロークに対応する各音声が上記発音手段により発音されている間、発音中の音声に対応する上記表示画面上のストロークを、その表示画面上に表示されている他のストロークの属性とは異なる属性で表示させるものであることも好ましい態様である。
【００１１】
ここで、上記「入力時刻」は、典型的にはそのストロークの入力開始時刻を指すが、必ずしも入力開始時刻である必要はなく、そのストロークの入力終了時刻、あるいはそのストロークの入力開始時刻と入力終了時刻との中間の時刻等、そのストロークに対応する代表的な時刻であればよい。また、その「入力時刻」は、絶対的な時刻であってもよいが、それに限られず、例えば一連のストロークのうちの最初のストロークの入力開始時刻を基準とした相対時刻、あるいは一連のストロークの入力順と、直前に入力されたストロークの入力開始時刻ないし入力終了時刻と今回入力されたストロークの入力開始時刻との間の時間差との組合せ等、ストロークとストロークとの入力タイミングの関係がわかる情報であればどのようなものであってもよい。
【００１２】
尚、上記本発明の第１の複合情報処理装置は複合情報の入力を担当する装置であり、また本発明の第２の複合情報処理装置は複合情報の出力を担当する装置であり、これら第１及び第２の複合情報処理装置を統合して複合情報の入力と出力との双方を担う１つの複合情報処理装置を構成してもよい。ただし、１つの装置に統合する必要はなく、それら第１の複合情報処理装置及び第２の複合情報処理装置が通信回線を経由して互いに接続されていてもよく、あるいは独立に設置され、例えばフロッピィディスク、ＭＯ等の可搬性記憶媒体を介して複合情報を授受するものであってもよい。
【００１３】
【作用】
本発明の第１の複合情報処理装置によれば、各ストロークの入力順ないし入力時刻を計測し、各ストロークと、各ストロークの入力順ないし入力時刻と、各音声とが互いに対応づけられた複合情報を生成するものであるため、情報出力にあたっては、目と耳とで対応のとれた情報を受けることができ、表現力と理解性の高いコミュニケーション手段が実現される。
【００１４】
また、本発明の第２の複合情報処理装置は、複合情報中の入力順ないし入力時刻に基づいて、ストロークと音声を、互いに対応をとりながら出力するものであり、上述の、表現力と理解性の高いコミュニケーションが実現する。
本発明の第２の複合情報処理装置における、ストロークと音声との対応のとり方は、特定の対応のとり方に限定されるものではないが、例えば、音声の持続時間に応じた描画速度で、その音声に対応するストロークを表示画面上に描画すると、描画中のストロークが発音中の音声に対応することが直ちに認識される。あるいは描画速度で対応をとる代わりに、例えば発音中の音声に対応するストロークを他のストロークとは別の属性で表示すること、例えば別の色で表示したり、太く表示したり、そのストロークをブリンクさせたりすることによっても、発音中の音声に対応するストロークが容易に認識される。
【００１５】
尚、本発明では、ストロークと音声とを対応させているため、入力時に例えばストロークの開始よりも遅れて音声が入力された場合であっても、出力時にはストロークの出力開始時刻と音声出力開始時刻とを容易に揃えることができ、入力時に、出力時を想定して対応する音声の入力時点を精密に管理する必要はなく、使い勝手の良い装置ないしシステムが実現する。
【００１６】
【実施例】
以下、本発明の実施例について説明する。ここでは、本発明の本実施例として、ポインティングデバイスによるフリーハンド作画と音声による解説入力を同期させて記録する装置について説明する。
図１は、本発明の第１の複合情報処理装置および本発明の第２の複合情報処理装置の双方の構成を含む複合情報処理装置の一実施例のブロック図である。
【００１７】
詳細説明は後述することとし、ここでは、先ず、この図１を参照して、本実施例の全体についてその概要を説明する。ストローク入力部１には、マウス（図示せず）が備えられており、そのマウスを操作することにより、線画を構成する各ストロークを表すストロークデータが入力される。また、音声入力部２では音声が収録されて音声情報が入力される。
【００１８】
ストローク入力部１から入力されたストローク情報には、パケットデータ構成部３においてクロック／カウンタ１０から得る現在時刻のデータが添付される。この時刻情報つきストロークデータは、そのストロークを入力する際に収録された音声データとともに、書き出し処理部４により、複合情報ファイル５に出力される。複合情報ファイル５は描画ストローク・データと音声データとを時刻データつきでキュー形式のシーケンス・データにしたものである。複合情報ファイル５に記録された複合情報は、読み出し処理部６により読み出され、複合情報を構成する各時刻情報つきデータが記録時と同じ時間間隔で生起される。これは、時刻調整部７が、クロック／カウンタ１０を参照しながら、各時刻情報つきデータを管理することによって実行される。生起時刻となって起動された各時刻情報つきデータは、ストローク描画部８もしくは音声再生部９に伝送される。
【００１９】
図２は、図１にブロックで示す複合情報ファイルを構成する、ストロークデータと音声データのファイル構造を示す図である。
まず入力されたストロークデータを保存するストローク・データ領域から説明する。
パケット・インデックスは、各ストロークの生起順序を示す続き番号である。これは、音声のデータ・パケットにも同様に添付されており、２系統の通し番号となっている。ストローク描画起動時刻は、ストロークの入力が生起した時刻を、ストローク入力作業開始時点からの経過時間で記録するものである。計時に関して、ストロークの入力はマウス左ボタンのクリックをもって生起したものとする。経路点リストはストロークを形成する座標点の集合であり、ストローク点数はそのストロークを構成する座標点の数である。音声参照ポインタは、同時に再生する音声データ・パケットのアドレス（パケット・インデックス）を指定するものである。対応する音声が存在しない場合はヌル・ポインタが記録される。
【００２０】
次に、音声データ領域を説明する。
パケット・インデックスは、各音声データの生起順序を示す続き番号である。描画参照ポインタは、同時に表示すべきストロークデータ・パケットのインデックスを指定するものである。対応するストロークが存在しない場合はヌル・ポインタが記録される。音声データ長には、音声信号データのバイト長を記録しており、音声再生の所要時間はこれから容易に算出される。本実施例では圧縮の手段を用いていないが、記憶容量の効率的活用のために音声データ圧縮を行う場合は、この音声データ長に加えて音声持続時間を与えれば同様に装置が構成される。
【００２１】
次に、本実施例の動作を説明する。
図３は、複合情報ファイルに描画データと音声データとを関連づけて記録する処理順序を示すフローチャートである。
記録作業が開始されると（Ｓ２１）、マウスクリックおよびそれに続く音声入力の待ち状態にはいる。始めてのストローク・データ入力があった時点、すなわちここではマウスボタンの最初の押下を記録作業の起点としてクロックを初期化する。以降発生したデータ入力はすべてこの起点からの経過時間により記録される。マウス左ボタンが押下される（Ｓ２２）と、ストローク座標点の入力処理（Ｓ２３）に移り、そこでは、ストロークの開始点と一定時間間隔でサンプリングされた経路座標点とのデータセットを得ることで入力処理を完了する。ストローク情報の採取処理終了はマウス左ボタンの解放をもってする（Ｓ２４）。
【００２２】
ストローク情報の採取開始と同時に、音声データのサンプリングが並行して起動される。まずデータの入力待ち状態（Ｓ２５）に入り、あらかじめ定めた閾値強度Ａｔｈ以上の信号が閾値区間Ｔｔｈ１以上連続して入力されると、Ａ／Ｄコンバータ（図示せず）からのデータ・サンプリングを開始する（Ｓ２６）。音声のサンプリング終了については、開始と同様に、閾値強度Ａｔｈ以下の信号が閾値区間Ｔｔｈ２以上連続すると無音区間と判断し、データ・サンプリングを終了する（Ｓ２７）。
【００２３】
ストローク・データと音声データからなるデータ・パケットはＲＡＭ上のパケットバッファ１１へ書き込まれる（Ｓ２９）。ストロークデータの採取と音声データの採取が共に完了した時点で、図１に示すクロック／カウンタ１０のカウンタがインクリメントされる（Ｓ２８）。無音区間を検知する前にマウスがオフされ、その後未だ無音区間を検知していない時点で、次のストローク入力すなわちマウス左ボタンが押下されてもその押下は無効化されている。
【００２４】
また、音声（発話）を伴わないストローク入力は、そのストロークパケットが、音声データが空の音声データ・パケットを参照することによって表現される。一方、描画を伴わない音声情報を表現するために、入力待ち時点（Ｓ２５）においてマウス右ボタンが押下されると、その右ボタンが押下されている期間だけ音声信号のサンプリングがなされる。この音声データ・パケットは、空のストローク・パケットを参照するように組み合わされて記憶される。
【００２５】
ストローク入力もしくは音声入力が終わると、上述したようにカウンタがインクリメントされ（Ｓ２８）、パケットバッファ１１内のストロークパケットおよび音声パケットが、このカウンタ値をパケット・インデックスとし、生起時刻とともに、ディスク上の複合情報ファイル５（図１参照）に出力される（Ｓ２９）。
【００２６】
以上のデータ入力、入力データの複合情報ファイル５への書き出しの過程は、操作終了シグナルを受け取るまで（Ｓ３０）繰り返される。このシグナルはマウス左ボタンにて終了ボタンアイコンをクリックすることにより発生する。操作終了シグナルを受け取るとファイルの終了（ＥＯＦ）を複合情報ファイル５に書き込んで終了する（Ｓ３１）。
【００２７】
次に、複合情報ファイル５に記録された複合情報を再生出力する再生出力モードについて説明する。
本実施例における複合情報の再生出力モードには、リアルタイムモードとステップモードが用意されており、リアルタイムモードでは、ストローク記録時の時間経過をそのまま再現し、ステップモードでは、ストローク記録の再生順序にしたがって、ステップ式に描画／音声を再生出力する。
【００２８】
図４は、リアルタイムモードによる再生を示すフローチャートである。
リアルタイム再生では、各パケットを、記録時に入力された時刻に再生しなければならないので、まず再生処理の起点となる時点においてクロックを初期化する（Ｓ４１）。以降の各パケット再生は、このクロックとの比較によって進行する。
【００２９】
先ずストロークパケットが複合情報ファイル５より読み込まれ（Ｓ４３）、続いてこのパケットに対応する音声パケットが読み出され（Ｓ４４）、パケットバッファ１０に保持される。次に、パケットに記載されたストローク描画起動時刻とクロックとが比較され、出力デバイスへのデータ送信起動時刻に達するのを待つ（Ｓ４５）。起動時刻になると、描画および音声再生に必要なデータがストローク描画部８および音声再生部９（図１参照）に送信され、ストローク描画と音声再生が行なわれる（Ｓ４６，Ｓ４７）。このリアルタイム再生は、パケットの終了を示すＥＯＦを検知するか（Ｓ４２）、利用者からの割り込み終了シグナルを特定のキーにより受けるまで（Ｓ４８）繰り返される。
【００３０】
続いて、図５により、ステップモードにおける再生処理を説明する。
リアルタイム再生と同様に、まず複合情報ファイル５からパケット・データをフェッチしパケットバッファ１１に保持する（Ｓ５２，Ｓ５３）。パケットデータはタイプ別にストローク描画部８および音声再生部９（図１参照）に送信され、利用者からの進行を指示するシグナルを待つ（Ｓ５４）。指示はキー入力もしくはマウス左ボタンの押下による。指示が入力されると、ストローク描画と音声再生が行なわれる（Ｓ５５，Ｓ５６）。以上のステップは、複合情報ファイル５からそのパケットのデータをすべて読み出すか（Ｓ５１）、あるいは利用者からの中止指示があるまで（Ｓ５７）、繰り返される。中止シグナルの受信は、リアルタイム再生の場合と同様である。
【００３１】
図６は、ストローク描画の詳細を示すフローチャートである。
ストローク描画にあたっては、先ず、パケットバッファ１１から音声データ長およびストローク点数が読み出される（Ｓ６１，図２参照）。
１）表示モード（Ｓ６３）が描画時間制御の場合には、音声の持続時間Ｔｓを均等に割った時間間隔Ｔｉｎｔにて経路点を結び、描画される。そのために、まず音声データ長Ｌｖ（ｂｙｔｅ）は、実時間Ｔｓ（ｓｅｃ）へ変換される（Ｓ６２）。この変換はＡ／Ｄ変換のサンプリング周波数Ｎ（Ｈｚ）、データ幅Ｗｂを用いて、式Ｔｓ＝Ｌｖ／（Ｎ×Ｗｂ）によって簡単に求まる。この音声持続時間をストローク経路点の数Ｌで均等に割り（Ｓ６４）、求められた平均値Ｔｉｎｔ＝Ｔｓ／Ｌを描画速度として、経路点をスプライン補間しながら、曲線を形成する（Ｓ６５）。
【００３２】
２）表示モードとして、ストロークの特殊描画が選択されているときには、特殊属性｛ブリンク、線幅・線種変更、色変更｝のいずれかあるいは組合せがセットされ（Ｓ６６）、ストロークを描画する（Ｓ６７）。このときの経路点はパラメータバッファ７０に保存される。その描画したストロークに対応する音声出力が終了すると、その描画ストロークと同じストロークの属性が標準値にセットされ（Ｓ６８）その同じストロークがパラメータバッファから読み出されて再度描画される（Ｓ６９）。
【００３３】
図７は、本実施例による複合情報処理装置がネットワーク上に配置された状態を示す図である。
本複合情報処理装置は、高機能ワークステーション７０を用いて実現したり、携帯情報端末７１に実装することが可能である。それらがＬＡＮあるいは公衆回線を通じて通信することにより、従来のように、画像のみ、あるいは音声のみ、さらには単純な音声つき画像によっても困難であった複合情報の有機的なコミュニケーションが可能となる。
【００３４】
ここで、上述の実施例は、本発明にいう第１の複合情報処理装置の機能と本発明にいう第２の複合情報処理装置の機能との双方の機能をもった複合情報処理装置であるが、本発明にいう第１の複合情報処理装置と本発明にいう第２複合情報処理装置は１つの装置に統合されている必要はない。例えば図７に示すワークステーション７０は線画と音声の入力のみを担当するもの（本発明にいう第１の複合情報処理装置の一例）であって、携帯情報端末７１は線画と音声の出力のみを担当するもの（本発明にいう第２の複合情報処理装置の一例）であってもよい。
【００３５】
【発明の効果】
以上説明したように、本発明によれば、簡便な操作性のもとで、経時的に細かく同期した線画情報と音声情報とを記録、伝達することができ、表現力と理解性のより高いコミュニケーションが可能となる。
【図面の簡単な説明】
【図１】本発明の実施例のブロック図である。
【図２】データ・パケットの構成図である。
【図３】記録処理を示すフローチャートである。
【図４】リアルタイムモードでの再生処理を示すフローチャートである。
【図５】ステップモードでの再生処理を示すフローチャートである。
【図６】ストローク描画の詳細動作を示すフローチャートである。
【図７】複合情報処理装置がネットワーク上に配置された状態を示す図である。
【符号の説明】
１ストローク入力部
２音声入力部
３パケットデータ構成部
４書き出し処理部
５複合情報ファイル
６読み出し処理部
７時刻調整部
８ストローク描画部
９音声再生部
１０クロック／カウンタ
１１パケットバッファ
１２パラメータバッファ

[0001]
[Industrial applications]
The present invention relates to a composite information processing apparatus that receives both images and sound as input and generates composite information comprising both, and to a composite information processing apparatus that outputs both images and sound based on that composite information.
[0002]
[Prior art]
Conventionally, visual information such as text, images, and charts, and audio information such as speech, music, and sound effects have been handled separately, each in its own information mode. For example, visual information, typified by paper media, has been distributed through channels such as physical mail, facsimile, e-mail over a LAN, telex, and telegram, and has been presented to users on paper or on electronic or optical display screens. Audio information, on the other hand, has been transmitted and distributed mainly through communication channels such as telephone, radio, and TV broadcasting, and through off-line media such as CDs and tapes.
[0003]
In recent years, however, multimedia media (or systems) on personal computers have begun to spread as a means of digitally recording, storing, and reusing composite information, that is, visual information and audio information together. For example, Apple Computer's QuickTime software can play back audio and moving images stored on a Macintosh, and several application programs for creating moving images are on the market.
[0004]
More recently, technology for accessing both visual information and audio information from a communication terminal on a network has also been under development. Japanese Patent Application Laid-Open No. 4-105114 describes a communication terminal that, in addition to audio, captures image information with a scanner and can be operated interactively with a pointing device such as a tablet.
[0005]
Software tools for editing simple animations with sound have also been developed for presentations.
[0006]
[Problems to be solved by the invention]
However, in conventional multimedia communication, the image information and audio information created by a user have been separated, stored, and transmitted in relatively large units, such as page by page or document by document. Consequently, although voice-annotated e-mail and the like allow audio to be inserted into a document, they cannot associate the process of creating an image with the spoken explanation of it. Therefore, when explaining a route using a map or explaining the operation of a machine using a design drawing, for example, it is unclear which part of the image the audio refers to, and the potential of multimedia is not fully realized.
[0007]
Some of the animation editing software tools mentioned above do provide functions for fine control of images, but using them requires a great deal of time spent on detailed editing work, which makes them entirely unsuitable as tools for users to record and communicate in real time.
In view of the above circumstances, an object of the present invention is to provide a composite information processing apparatus suited to a system in which coherent information can be received through both the eyes and the ears.
[0008]
[Means for Solving the Problems]
A first composite information processing apparatus of the present invention that achieves the above object comprises:
(1-1) a line drawing input operator for inputting a line drawing composed of a plurality of strokes;
(1-2) voice recording means for recording voice;
(1-3) measuring means for measuring the input order or input time of each stroke input by operating the line drawing input operator; and
(1-4) composite information generating means for generating composite information in which the stroke information representing each stroke input by operating the line drawing input operator, the input order or input time of each stroke measured by the measuring means, and the audio information representing each voice recorded by the voice recording means during the input period of each stroke are associated with one another.
[0009]
A second composite information processing apparatus of the present invention that achieves the above object comprises:
(2-1) line drawing means for drawing a line drawing composed of a plurality of strokes on a display screen;
(2-2) sounding means for producing sound; and
(2-3) output control means which, for composite information in which the stroke information representing each stroke of the line drawing, the input order or input time of each stroke, and the audio information representing the voice corresponding to each stroke are associated with one another, displays each stroke based on the stroke information on the display screen and causes the sounding means to produce each voice based on the audio information, keeping the two in correspondence on the basis of the input order or input time contained in the composite information.
[0010]
In the second composite information processing apparatus of the present invention, the output control means preferably draws each stroke on the display screen at a drawing speed that depends on the duration of the voice corresponding to that stroke.
Alternatively, in another preferred embodiment of the second composite information processing apparatus, while the voice corresponding to a stroke is being produced by the sounding means, the output control means displays the stroke corresponding to that voice with an attribute different from the attributes of the other strokes shown on the display screen.
[0011]
Here, the “input time” typically refers to the input start time of the stroke, but it need not be the input start time; any representative time for the stroke will do, such as its input end time or a time midway between its input start time and input end time. The “input time” may be an absolute time, but it is not limited to this. It may be any information from which the relative input timing of the strokes can be determined: for example, a time relative to the input start time of the first stroke in a series, or a combination of the input order of the strokes and the time difference between the input start time (or input end time) of the immediately preceding stroke and the input start time of the current stroke.
[0012]
Note that the first composite information processing apparatus of the present invention is an apparatus in charge of inputting composite information, and the second composite information processing apparatus of the present invention is a device in charge of output of composite information. The first and second composite information processing apparatuses may be integrated to form one composite information processing apparatus that performs both input and output of composite information. However, it is not necessary to integrate them into one device, and the first composite information processing device and the second composite information processing device may be connected to each other via a communication line, or may be installed independently. The composite information may be transmitted and received via a portable storage medium such as a floppy disk or MO.
[0013]
[Action]
According to the first composite information processing apparatus of the present invention, the input order or input time of each stroke is measured, and composite information is generated in which each stroke, its input order or input time, and each voice are associated with one another. On output, therefore, the user receives information in which what is seen and what is heard correspond, realizing a means of communication with high expressiveness and comprehensibility.
[0014]
The second composite information processing apparatus of the present invention outputs strokes and voices in correspondence with each other, based on the input order or input time in the composite information, and thereby realizes the highly expressive and comprehensible communication described above.
In the second composite information processing apparatus, the way strokes and voices are put into correspondence is not limited to any particular method. For example, if the stroke corresponding to a voice is drawn on the display screen at a drawing speed that depends on the duration of that voice, the viewer immediately recognizes that the stroke being drawn corresponds to the voice being produced. Alternatively, instead of using the drawing speed, the stroke corresponding to the voice being produced may be displayed with an attribute different from that of the other strokes, for example in a different color, with a thicker line, or blinking, so that the stroke corresponding to the voice is easily recognized.
[0015]
Because the present invention associates strokes with voices, even if, for example, the voice is input later than the start of the stroke at input time, the output start time of the stroke and the output start time of the voice can easily be aligned at output time. There is thus no need, at input time, to manage precisely when the corresponding voice is input in anticipation of output, and an easy-to-use apparatus or system is realized.
[0016]
[Example]
Hereinafter, examples of the present invention will be described. Here, as an embodiment of the present invention, an apparatus for synchronizing and recording freehand drawing by a pointing device and commentary input by voice will be described.
FIG. 1 is a block diagram of an embodiment of a composite information processing apparatus including both the configuration of the first composite information processing apparatus of the present invention and the configuration of the second composite information processing apparatus of the present invention.
[0017]
Details are given later; first, an outline of the embodiment as a whole is described with reference to FIG. 1. The stroke input unit 1 is equipped with a mouse (not shown), and operating the mouse inputs stroke data representing each stroke of the line drawing. The voice input unit 2 records voice and inputs audio information.
[0018]
In the packet data configuration unit 3, the current time obtained from the clock/counter 10 is attached to the stroke information input from the stroke input unit 1. This time-stamped stroke data is output to the composite information file 5 by the write-out processing unit 4, together with the audio data recorded while the stroke was being input. The composite information file 5 holds the drawing stroke data and the audio data, with their time data, as sequence data in queue form. The composite information recorded in the composite information file 5 is read by the read processing unit 6, and each time-stamped data item in it is triggered at the same time intervals as during recording. This is done by the time adjustment unit 7, which manages each time-stamped data item while referring to the clock/counter 10. Each time-stamped data item, once triggered at its occurrence time, is sent to the stroke drawing unit 8 or the audio reproduction unit 9.
[0019]
FIG. 2 is a diagram showing a file structure of stroke data and audio data, which constitute the composite information file indicated by the blocks in FIG.
First, the stroke data area for storing the input stroke data will be described.
The packet index is a serial number indicating the order in which each stroke occurred. An index is likewise attached to each audio data packet, so that both streams carry serial numbers. The stroke drawing start time records the time at which input of the stroke occurred, as the time elapsed since the start of the stroke input session. For timing purposes, input of a stroke is taken to occur when the left mouse button is clicked. The path point list is the set of coordinate points forming the stroke, and the number of stroke points is the number of coordinate points in the stroke. The audio reference pointer specifies the address (packet index) of the audio data packet to be reproduced at the same time. If there is no corresponding audio, a null pointer is recorded.
[0020]
Next, the audio data area will be described.
The packet index is a serial number indicating the order in which each item of audio data occurred. The drawing reference pointer specifies the index of the stroke data packet to be displayed at the same time. If there is no corresponding stroke, a null pointer is recorded. The audio data length records the byte length of the audio signal data, from which the time required for audio reproduction is easily calculated. No compression is used in this embodiment, but if audio data is compressed to make efficient use of storage capacity, the apparatus can be configured in the same way by supplying the audio duration in addition to the audio data length.
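The two packet layouts described above can be sketched as Python data classes. This is only an illustrative sketch: the class and field names (`StrokePacket`, `AudioPacket`, `audio_ref`, `stroke_ref`, and so on) are assumptions chosen for this example and do not appear in the specification, and `None` stands in for the null pointer.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StrokePacket:
    index: int                       # packet index (order of occurrence)
    start_time: float                # seconds elapsed since recording began
    points: List[Tuple[int, int]] = field(default_factory=list)  # path point list
    audio_ref: Optional[int] = None  # index of the paired audio packet, or None

    @property
    def point_count(self) -> int:
        # The "number of stroke points" field, derived from the list here.
        return len(self.points)


@dataclass
class AudioPacket:
    index: int                        # packet index (order of occurrence)
    samples: bytes = b""              # uncompressed audio signal data
    stroke_ref: Optional[int] = None  # index of the paired stroke packet, or None

    @property
    def data_length(self) -> int:
        # The "audio data length" field: byte length of the signal data.
        return len(self.samples)


# Example values are arbitrary.
stroke = StrokePacket(index=1, start_time=0.0,
                      points=[(0, 0), (5, 3), (9, 4)], audio_ref=1)
audio = AudioPacket(index=1, samples=bytes(16000), stroke_ref=1)
```

Deriving the point count and data length from the stored data, rather than storing them separately as the file format does, is a convenience of the sketch; a serialized file would still need explicit length fields.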
[0021]
Next, the operation of this embodiment will be described.
FIG. 3 is a flowchart showing a processing order for recording drawing data and audio data in the composite information file in association with each other.
When recording starts (S21), the apparatus waits for a mouse click and the voice input that follows it. The clock is initialized at the first stroke data input, which here means the first press of the mouse button, and this moment serves as the origin of the recording session. All subsequent data input is recorded as elapsed time from this origin. When the left mouse button is pressed (S22), processing moves to stroke coordinate input (S23), which completes once it has obtained a data set consisting of the stroke's start point and the path coordinate points sampled at fixed time intervals. Collection of stroke information ends when the left mouse button is released (S24).
[0022]
Sampling of audio data is started in parallel with the start of stroke information collection. First, a data input wait state (S25) is entered, and when a signal having a predetermined threshold strength Ath or more is continuously input for a threshold section Tth1 or more, data sampling from an A / D converter (not shown) is started. (S26). As to the end of the voice sampling, similarly to the start, when a signal having the threshold strength Ath or less continues for the threshold section Tth2 or more, it is determined that the section is a silent section, and the data sampling ends (S27).
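The start and end conditions for audio sampling can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `find_utterance` is hypothetical, the input is a list of already-measured signal levels, and the threshold intervals Tth1 and Tth2 are expressed as counts of consecutive levels rather than as durations.

```python
from typing import List, Optional, Tuple


def find_utterance(levels: List[float], a_th: float,
                   t_th1: int, t_th2: int) -> Optional[Tuple[int, int]]:
    """Return (start, end) indices of the first sampled span, or None.

    Sampling starts at the index where `t_th1` consecutive levels >= a_th
    have been seen (S26) and stops at the index where `t_th2` consecutive
    levels <= a_th have been seen (S27). `end` is exclusive.
    """
    start: Optional[int] = None
    run = 0
    for i, level in enumerate(levels):
        if start is None:
            # Waiting for input (S25): count consecutive loud levels.
            run = run + 1 if level >= a_th else 0
            if run >= t_th1:
                start = i
                run = 0
        else:
            # Sampling: count consecutive quiet levels.
            run = run + 1 if level <= a_th else 0
            if run >= t_th2:
                return start, i + 1
    # Input ended while still sampling, or sampling never started.
    return (start, len(levels)) if start is not None else None
```

Using a longer quiet run than loud run (Tth2 greater than Tth1) would keep short pauses within an utterance from ending the sampling early; the specification does not state the relative sizes, so that choice is left to the caller.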
[0023]
A data packet consisting of the stroke data and the audio data is written to the packet buffer 11 in RAM (S29). When collection of both the stroke data and the audio data is complete, the counter of the clock/counter 10 shown in FIG. 1 is incremented (S28). If the mouse button is released before a silent interval has been detected, any press of the left mouse button (that is, the next stroke input) made while the silent interval has still not been detected is ignored.
[0024]
A stroke input without voice (utterance) is represented by having its stroke packet refer to an audio data packet whose audio data is empty. Conversely, to represent audio information without drawing, if the right mouse button is pressed while waiting for input (S25), the audio signal is sampled only while the right button is held down. This audio data packet is stored paired with, and referring to, an empty stroke packet.
[0025]
When the stroke input or the voice input ends, the counter is incremented as described above (S28), and the stroke packet and audio packet in the packet buffer 11 are output, with this counter value as their packet index and together with their occurrence time, to the composite information file 5 on disk (see FIG. 1) (S29).
[0026]
The process of inputting data and writing the input data to the composite information file 5 is repeated until an operation end signal is received (S30). This signal is generated by clicking the end button icon with the left mouse button. When the operation end signal is received, the end of the file (EOF) is written into the composite information file 5 and the process ends (S31).
[0027]
Next, a reproduction output mode for reproducing and outputting the composite information recorded in the composite information file 5 will be described.
This embodiment provides two reproduction output modes for the composite information: a real-time mode and a step mode. In the real-time mode, the passage of time during stroke recording is reproduced as it was. In the step mode, the drawing and audio are reproduced step by step, following the order in which the strokes were recorded.
[0028]
FIG. 4 is a flowchart showing the reproduction in the real-time mode.
In real-time reproduction, since each packet must be reproduced at the time input at the time of recording, first, the clock is initialized at the time when the reproduction process starts (S41). Subsequent packet reproduction proceeds by comparison with this clock.
[0029]
First, a stroke packet is read from the composite information file 5 (S43), then the audio packet corresponding to it is read (S44), and both are held in the packet buffer 11. Next, the stroke drawing start time in the packet is compared with the clock, and the apparatus waits until the time to start sending data to the output devices is reached (S45). At the start time, the data needed for drawing and audio reproduction is sent to the stroke drawing unit 8 and the audio reproduction unit 9 (see FIG. 1), and the stroke is drawn and the audio reproduced (S46, S47). This real-time reproduction is repeated until the EOF marking the end of the packets is detected (S42) or an interrupt end signal from the user is received via a specific key (S48).
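The real-time loop can be sketched as below. This is an illustrative sketch, not the embodiment's implementation: `play_realtime` and its parameters are hypothetical names, each packet is reduced to a `(start_time, payload)` pair, the clock and sleep functions are injectable so the loop can be exercised without real delays, and the user interrupt check (S48) is omitted.

```python
import time
from typing import Callable, Iterable, Tuple

Packet = Tuple[float, str]  # (start time in seconds, payload): assumed layout


def play_realtime(packets: Iterable[Packet],
                  emit: Callable[[str], None],
                  now: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Emit each packet when the playback clock reaches its recorded start time."""
    origin = now()                       # S41: initialize the playback clock
    for start_time, payload in packets:  # S43/S44: read the next packet
        remaining = start_time - (now() - origin)
        if remaining > 0:                # S45: wait for the start time
            sleep(remaining)
        emit(payload)                    # S46/S47: draw the stroke, play the audio
```

Comparing against the elapsed time since `origin` on every iteration, rather than sleeping for the gap between consecutive packets, keeps delays in drawing or audio output from accumulating across packets.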
[0030]
Next, the reproduction process in the step mode will be described with reference to FIG.
As in real-time reproduction, packet data is first fetched from the composite information file 5 and held in the packet buffer 11 (S52, S53). The packet data is sent, according to its type, to the stroke drawing unit 8 and the audio reproduction unit 9 (see FIG. 1), and the apparatus waits for a signal from the user instructing it to proceed (S54). The instruction is given by a key press or by pressing the left mouse button. When the instruction is input, the stroke is drawn and the audio reproduced (S55, S56). These steps are repeated until all the packet data has been read from the composite information file 5 (S51) or until the user gives an instruction to stop (S57). The stop signal is received in the same way as in real-time reproduction.
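Step-mode reproduction can be sketched in the same style. The names `play_stepwise` and `wait_for_signal`, and the signal values `"next"` and `"stop"`, are assumptions made for this example; as a simplification, the stop check (S57) is folded into the same wait as the proceed signal (S54).

```python
from typing import Callable, Iterable


def play_stepwise(packets: Iterable[str],
                  wait_for_signal: Callable[[], str],
                  emit: Callable[[str], None]) -> int:
    """Emit one packet per "next" signal; stop on "stop" or when packets run out.

    Returns the number of packets emitted.
    """
    emitted = 0
    for payload in packets:              # S52/S53: fetch the next packet
        if wait_for_signal() == "stop":  # S54/S57: proceed or stop
            break
        emit(payload)                    # S55/S56: draw the stroke, play the audio
        emitted += 1
    return emitted
```

In an interactive program, `wait_for_signal` would block on a key press or mouse click; here it is any callable that returns the user's choice.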
[0031]
FIG. 6 is a flowchart showing details of stroke drawing.
In drawing a stroke, first, the audio data length and the number of stroke points are read from the packet buffer 11 (S61, see FIG. 2).
1) When the display mode (S63) is drawing time control, the stroke is drawn by connecting its path points at a time interval Tint obtained by dividing the audio duration Ts equally. To do this, the audio data length Lv (bytes) is first converted to real time Ts (seconds) (S62). Using the sampling frequency N (Hz) of the A/D conversion and the data width Wb, this conversion is given simply by Ts = Lv / (N × Wb). The audio duration is then divided equally by the number L of stroke path points (S64), and the resulting average Tint = Ts / L is used as the drawing speed while the path points are spline-interpolated to form a curve (S65).
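As a worked example of the two formulas, the following sketch computes Ts and Tint. It assumes that the data width Wb is measured in bytes per sample, which is what makes the units of Ts = Lv / (N × Wb) come out in seconds when Lv is in bytes; the function names and the example values are illustrative only.

```python
def audio_duration(lv_bytes: int, n_hz: int, wb_bytes: int) -> float:
    """Ts = Lv / (N * Wb): playback time in seconds of uncompressed audio."""
    return lv_bytes / (n_hz * wb_bytes)


def point_interval(ts: float, point_count: int) -> float:
    """Tint = Ts / L: time allotted to each path point of the stroke."""
    if point_count <= 0:
        raise ValueError("a stroke needs at least one path point")
    return ts / point_count


# 32000 bytes of 2-byte samples at 8000 Hz, for a stroke of 40 path points.
ts = audio_duration(lv_bytes=32000, n_hz=8000, wb_bytes=2)
tint = point_interval(ts, point_count=40)
```

With these values the audio lasts 2 seconds, so each of the 40 path points is allotted 0.05 seconds and the stroke finishes drawing as the audio ends.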
[0032]
2) When special drawing of the stroke is selected as the display mode, one or a combination of the special attributes {blink, line width / line type change, color change} is set (S66), and the stroke is drawn (S67). The path points are saved in the parameter buffer 12 at this time. When the audio output corresponding to the drawn stroke ends, the attributes of that same stroke are set to their standard values (S68), and the stroke is read back from the parameter buffer and drawn again (S69).
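The highlight-then-restore sequence can be sketched as follows. Everything here is an assumption made for illustration: the `canvas` list stands in for both the display and the parameter buffer, and the attribute names and standard values are invented for the example.

```python
from typing import Dict, List, Tuple

STANDARD = {"color": "black", "width": 1, "blink": False}


def draw_with_highlight(points: List[Tuple[int, int]],
                        special: Dict[str, object],
                        canvas: List[dict]) -> int:
    """Draw a stroke with special attributes; return its slot in `canvas`."""
    attrs = {**STANDARD, **special}  # S66: set the special attributes
    # S67: draw the stroke and keep its path points for the later redraw.
    canvas.append({"points": list(points), "attrs": attrs})
    return len(canvas) - 1


def restore_standard(slot: int, canvas: List[dict]) -> None:
    """Once the audio has ended, redraw the same stroke with standard attributes."""
    canvas[slot]["attrs"] = dict(STANDARD)  # S68/S69: reset attributes, redraw
```

A caller would invoke `draw_with_highlight` when the audio for a stroke starts and `restore_standard` when that audio finishes, so that only the stroke currently being explained stands out.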
[0033]
FIG. 7 is a diagram illustrating a state in which the composite information processing apparatus according to the present embodiment is arranged on a network.
The composite information processing apparatus can be realized using a high-performance workstation 70 or implemented on a portable information terminal 71. When such devices communicate over a LAN or a public line, they enable integrated communication of composite information, which was difficult with conventional approaches using images alone, audio alone, or even simple images with sound.
[0034]
The embodiment described above is a composite information processing apparatus having the functions of both the first and the second composite information processing apparatus of the present invention, but the first and second composite information processing apparatuses need not be integrated into a single device. For example, the workstation 70 shown in FIG. 7 may handle only the input of line drawings and audio (an example of the first composite information processing apparatus of the present invention), while the portable information terminal 71 handles only the output of line drawings and audio (an example of the second composite information processing apparatus of the present invention).
[0035]
[Effects of the Invention]
As described above, the present invention makes it possible, with simple operation, to record and transmit line drawing information and audio information that are finely synchronized over time, enabling communication with greater expressiveness and comprehensibility.
[Brief description of the drawings]
FIG. 1 is a block diagram of an embodiment of the present invention.
FIG. 2 is a configuration diagram of a data packet.
FIG. 3 is a flowchart illustrating a recording process.
FIG. 4 is a flowchart showing a reproduction process in a real-time mode.
FIG. 5 is a flowchart showing a reproduction process in a step mode.
FIG. 6 is a flowchart illustrating a detailed operation of stroke drawing.
FIG. 7 is a diagram illustrating a state in which the composite information processing apparatus is arranged on a network.
[Explanation of symbols]
1 stroke input unit
2 voice input unit
3 packet data configuration unit
4 write-out processing unit
5 composite information file
6 read processing unit
7 time adjustment unit
8 stroke drawing unit
9 audio reproduction unit
10 clock / counter
11 packet buffer
12 parameter buffer

Claims

1. A composite information processing apparatus comprising:
a line drawing input operator for inputting a line drawing composed of a plurality of strokes;
voice recording means for recording voice;
measuring means for measuring the input order or input time of each stroke input by operating the line drawing input operator; and
composite information generating means for generating composite information in which the stroke information representing each stroke input by operating the line drawing input operator, the input order or input time of each stroke measured by the measuring means, and the audio information representing each voice recorded by the voice recording means during the input period of each stroke are associated with one another.

2. A composite information processing apparatus comprising:
line drawing means for drawing a line drawing composed of a plurality of strokes on a display screen;
sounding means for producing sound; and
output control means which, for composite information in which the stroke information representing each stroke of the line drawing, the input order or input time of each stroke, and the audio information representing the voice corresponding to each stroke are associated with one another, displays each stroke based on the stroke information on the display screen and causes the sounding means to produce each voice based on the audio information, keeping the two in correspondence on the basis of the input order or input time contained in the composite information.

3. The composite information processing apparatus according to claim 2, wherein the output control means draws each stroke on the display screen at a drawing speed that depends on the duration of the voice corresponding to that stroke.

4. The composite information processing apparatus according to claim 2, wherein, while the voice corresponding to a stroke is being produced by the sounding means, the output control means displays the stroke on the display screen corresponding to that voice with an attribute different from the attributes of the other strokes displayed on the display screen.