JP4109367B2

JP4109367B2 - Non-linear video editing apparatus and method

Info

Publication number: JP4109367B2
Application number: JP00783699A
Authority: JP
Inventors: 良英藤岡; 智晃吉田
Original assignee: Thomson Licensing SAS
Current assignee: Thomson Licensing SAS
Priority date: 1999-01-14
Filing date: 1999-01-14
Publication date: 2008-07-02
Anticipated expiration: 2019-01-14
Also published as: JP2000209543A

Description

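[0001]
[Technical field of the invention]
The present invention relates to a nonlinear video editing apparatus, and more particularly to handling variations in the sampling rate of audio data.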
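[0002]
[Related art]
DV (digital video) nonlinear video editing is attracting attention today. In this method, a DV video signal is stored on the HDD (hard disk) of a personal computer via a video capture board, and arbitrary frames are retrieved to edit the images.
[0003]
This editing process is briefly described below. DV data as shown in FIG. 6A is captured from a DV device such as a digital video (DV) camera connected via an IEEE 1394 interface (not shown). This digital data is stream data composed of a plurality of frames. Each frame consists of DV-compressed video data and audio data digitized at a predetermined sampling rate.
[0004]
As shown in FIG. 6B, the DV video capture board separates each frame into video data and audio data. The separated video data and audio data are each stored on the hard disk as stream data in which the frames follow one another.
[0005]
When the operator gives a display command, the video data stored on the hard disk is decompressed and the image of a given frame is displayed on the monitor. The operator views the displayed image and performs the desired editing, for example deleting the image of the seventh frame. The audio data corresponding to the seventh frame is then deleted as well. This deletion is carried out as follows. Because the sampling rate of the audio data is determined in advance, the average number of samples per frame can be calculated. Multiplying this average by 6 gives the start position of the audio data corresponding to the seventh frame, so one frame's worth of audio data is deleted from that position.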
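[0006]
[Problems to be solved by the invention]
However, the nonlinear video editing apparatus described above has the following problems. With some DV video cameras, the sampling rate of the audio data deviates slightly from the predetermined sampling rate. For example, when the sampling rate is higher than the predetermined rate, the amount of data per frame becomes larger, as shown in FIG. 6C. In the editing operation, however, the position of the audio data corresponding to the video frame being edited is determined from the predetermined rate, specifically from the number of audio samples per video frame. The audio data selected for editing is therefore misaligned. Conversely, when the sampling rate is lower than the predetermined rate, the amount of data per frame becomes smaller, as shown in FIG. 6D, and misalignment occurs in this case too. Such misalignment between video and audio is a problem in video editing.
[0007]
Nonlinear video editing also has the following problem. Some DV video cameras allow the sampling rate of the audio data to be switched within a single tape. In that case the sampling rate of the audio data changes partway through, and either the capture stops because the audio data cannot be written to the hard disk, or the data is captured but stored as silence.
[0008]
An object of the present invention is to solve the above problems and to provide a digital data storage device, or a method therefor, that can store video data and audio data on a storage medium in accurate correspondence with each other, regardless of variations in the sampling rate of the audio data, for digital stream data in which video data and audio data are interleaved.
[0009]
A further object is to provide a nonlinear video editing apparatus, or a method therefor, that enables nonlinear video editing with video data and audio data in accurate correspondence, regardless of variations in the sampling rate of the audio data, for digital stream data in which video data and audio data are interleaved.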
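[0010]
[Means for solving the problems and effects of the invention]
The nonlinear video editing apparatus according to the present invention comprises: 1) separation storage means that, when given the mixed stream digital data, separates it into video stream data and audio stream data and stores them; 2) determination means that determines whether the audio stream data was generated at the target sampling rate used to locate data during editing; 3) sampling rate conversion means that, when the determination means determines that the audio stream data was not generated at the target sampling rate, converts the data so that the sampling rate of the audio stream data becomes the target sampling rate; and 4) editing means that, when given an editing command for the audio stream data, identifies the data to be edited in the audio stream data on the basis of the target sampling rate and edits the audio stream data. Accordingly, when the audio stream data was not generated at the target sampling rate used to locate data during editing, it is converted so that its sampling rate becomes the target sampling rate. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, the editing means can accurately identify and edit the corresponding audio data, so nonlinear video editing free of audio misalignment becomes possible.
[0011]
The mixed stream digital data separation storage device according to the present invention comprises: 1) separation storage means that, when given the mixed stream digital data, separates it into video stream data and audio stream data and stores them; 2) determination means that determines whether the audio stream data was generated at the target sampling rate used to locate data during editing; and 3) sampling rate conversion means that, when the determination means determines that the audio stream data was not generated at the target sampling rate, converts the data so that the sampling rate of the audio stream data becomes the target sampling rate. Accordingly, when the audio stream data was not generated at the target sampling rate, it is converted so that its sampling rate becomes the target sampling rate. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, a nonlinear video editing apparatus can accurately identify and edit the corresponding audio data.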
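[0012]
The mixed stream digital data separation device according to the present invention separates mixed stream digital data, in which video data and audio data are interleaved, into video stream data and audio stream data, and comprises:
separation means that, when given the mixed stream digital data, separates it into video stream data and audio stream data;
determination means that determines whether the audio stream data was generated at the target sampling rate used to locate data during editing; and
sampling rate conversion means that, when the determination means determines that the audio stream data was not generated at the target sampling rate, converts the data so that the sampling rate of the audio stream data becomes the target sampling rate.
[0013]
Accordingly, even when the audio sampling rate of the supplied mixed stream digital data varies, storing the video stream data and audio stream data in a nonlinear video editing apparatus allows the corresponding audio data to be accurately identified and edited.
[0014]
In the nonlinear video editing method according to the present invention, when the mixed stream digital data is given, it is separated into video stream data and audio stream data and stored; it is determined whether the audio stream data was generated at the target sampling rate used to locate data during editing; when it was not, the data is converted so that the sampling rate of the audio stream data becomes the target sampling rate; and when an editing command for the audio stream data is given, the data to be edited is identified on the basis of the target sampling rate and the audio stream data is edited. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, the corresponding audio data can be accurately identified and edited.
[0015]
In the data conversion method for mixed stream digital data according to the present invention, when mixed stream digital data in which video data and audio data are interleaved is given, it is separated into video stream data and audio stream data; the sampling rate of the audio stream data is measured; and when the measured sampling rate shows that the audio stream data was generated at a sampling rate exceeding the tolerance for association with the frame data constituting the video stream data, the audio data is converted so that its sampling rate falls within that association tolerance. Accordingly, the sampling rate of the audio stream data is converted whenever it is outside the association tolerance. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, the corresponding audio data can be accurately identified and edited.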
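[0016]
In the storage medium storing a program according to the present invention, the program determines whether the audio stream data was generated at the target sampling rate on the basis of a predetermined amount of data from the beginning, and when it determines that this leading portion was not generated at the target sampling rate, it converts the remaining audio stream data without repeating the determination. Whether the adjustment is needed can therefore be decided simply by obtaining the sampling rate of a predetermined amount of audio stream data from the beginning, which shortens the overall processing time.
[0017]
In the storage medium storing a program according to the present invention, the values of the individual samples constituting the given audio stream data are arranged in time series and linearly interpolated to obtain a virtual line, and the virtual line is resampled at the target sampling rate to convert the sampling rate. This makes simple, fast, and accurate conversion possible.
[0018]
In the storage medium storing a program according to the present invention, the program detects the sampling rate stored in the control data of the given audio stream data and uses it as the target sampling rate. The data can therefore be converted to the sampling rate specified by the control data.
[0019]
In the storage medium storing a program according to the present invention, the sampling rate first extracted from the mixed stream digital data is used as the target sampling rate, and the data is converted to the target sampling rate even when the sampling rate stored in the control data area of the supplied mixed stream digital data differs from it. Audio digitized at different sampling rates can therefore be converted into audio data at a single target sampling rate.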
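[0020]
[Embodiments of the invention]
1. Functional block diagram
An embodiment of the present invention is described with reference to the drawings. The nonlinear video editing apparatus 1 shown in FIG. 1, when given mixed stream digital data in which video data and audio data are interleaved, separates it into video stream data and audio stream data, stores them, and performs nonlinear video editing of the mixed stream digital data. It comprises separation storage means 3, determination means 5, sampling rate conversion means 7, and editing means 9.
[0021]
When given the mixed stream digital data, the separation storage means 3 captures the data, separates it into video stream data and audio stream data, and stores them. The determination means 5 detects the sampling rate of the audio stream data and determines whether the audio stream data was generated at the target sampling rate used to locate data during editing.
[0022]
When the determination means 5 determines that the audio stream data was not generated at the target sampling rate, the sampling rate conversion means 7 converts the data so that the sampling rate of the audio stream data becomes the target sampling rate. When given an editing command for the audio stream data, the editing means 9 identifies the data to be edited in the audio stream data on the basis of the target sampling rate and edits the audio stream data.
[0023]
In this way, the sampling rate of the audio stream data is measured, and when the measured rate shows that the audio stream data was generated at a sampling rate exceeding the tolerance for association with the frame data constituting the video stream data, the sampling rate of the audio data is adjusted to fall within that tolerance. This enables nonlinear video editing free of audio misalignment.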
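[0024]
2. Hardware configuration
FIG. 2 shows an example of a hardware configuration in which the nonlinear video editing apparatus 1 shown in FIG. 1 is implemented using a CPU.
[0025]
As shown in FIG. 2, the nonlinear video editing system 1 comprises a CPU 23, a memory 27, a hard disk 26, a display unit 30, an FDD 25, a keyboard 28, a mouse 31, a DV video capture board 41, and a bus line 29.
[0026]
The CPU 23 controls each unit via the bus line 29 in accordance with a program stored on the hard disk 26. This program was read via the FDD 25 from a flexible disk 25a on which it is stored and installed on the hard disk 26. Besides a flexible disk, the program may be installed on the hard disk from another computer-readable storage medium that physically holds the program, such as a CD-ROM or an IC card. It may also be downloaded over a communication line.
[0027]
In this embodiment, the program stored on the flexible disk is executed by the computer indirectly, by installing it from the flexible disk onto the hard disk 26. However, the invention is not limited to this, and the program stored on the flexible disk may be executed directly from the FDD 25. Programs executable by a computer include not only those that can be run directly once installed, but also those that must first be converted into another form (for example, compressed programs that must be decompressed) and those that can be run in combination with other modules.
[0028]
The hard disk 26 stores a video editing program 26e, a data capture program 26s, video data 26v, audio data 26a, and an operating system (OS) 26w. In this embodiment, Microsoft Windows 98 is used as the operating system.
[0029]
The video data 26v and audio data 26a are obtained as follows: DV data from the video camera 43 is captured by the DV video capture board 41 and temporarily stored in the memory 27, then separated into video data and audio data by the CPU 23 and stored. Details are given later.
[0030]
The video editing program 26e is the same as in conventional systems. Briefly, when the operator gives a read command, the image data of the specified frame in the video data 26v is decompressed and passed to the display unit 30, so that the images to be edited are displayed on the display unit 30 one frame at a time. The operator cuts and pastes or composites the displayed frame images. The audio data is processed at the same time, in units of one frame, to match the video data. When the desired editing is finished, the operator gives a write command. The video data of each frame is then compressed in a predetermined format and stored on the hard disk as video data. The video editing program 26e is an application program that runs on the operating system 26w.
[0031]
The memory 27 also stores various calculation results and the like. The display unit 30 comprises a graphics card 30a and a monitor 30b; when the video editing program 26e runs, the image of each frame to be edited is displayed in its video editing screen. The keyboard 28 and mouse 31 are input means for entering various commands (edit start command, edit end command, and so on).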
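[0032]
3. Flowchart of the data capture program 26s
Next, the data capture program 26s stored on the hard disk 26 is described with reference to FIGS. 3 and 4.
[0033]
In addition to the main program of FIGS. 3 and 4, the data capture program 26s has an interrupt program. Once the operator gives a capture start command, the interrupt program checks whether DV stream data, in which video data and audio data are interleaved, is being supplied from the DV video capture board 41; if so, it temporarily stores the data on the hard disk 26. If no DV stream data is supplied, no storage is performed. This interrupt processing is repeated once per video frame period.
[0034]
The main program is described with reference to FIGS. 3 and 4. The following covers the data capture process for the case where, as shown in FIG. 6C, the sampling rate of the audio data is higher than the standard and the sampling rate read from the stream does not change partway through.
[0035]
The CPU 23 sets the sampling rate flag to 0 (step ST10 in FIG. 3) and sets the conversion flag f to 0 (step ST11). The CPU 23 then determines whether new DV stream data has been stored on the hard disk 26 (step ST13). If so, it separates the data into video data and audio data and stores them on the hard disk (step ST15). If not, it repeats step ST13.
[0036]
The CPU 23 obtains the sampling rate from the control data of the DV stream data (step ST17). Specifically, the video camera 43 embeds the sampling rate in the control data when it digitizes the audio data, so the rate can simply be read from the control data.
[0037]
The CPU 23 determines whether the sampling rate flag is 0 (step ST18). Here the flag was set to 0 in step ST10, so the sampling rate read in step ST17 is stored as the standard value (step ST41), and the CPU 23 sets the sampling rate flag to 1 (step ST43).
[0038]
The CPU 23 determines whether the sampling rate stored as the standard value matches the sampling rate just read (step ST19). Here the read sampling rate does not change partway through, so the CPU 23 next determines whether the conversion flag is f = 0 (step ST21). Since f was set to 0 in step ST11, the CPU 23 counts the individual samples of the separated audio data and adds them to a running total (step ST23 in FIG. 4). Because the DV stream data is divided into blocks of one video frame each (see FIG. 6A), it suffices to count the number of samples in each block. In the DV standard, one second of video in NTSC mode consists of 29.97 frames, so the count added each time is the number of samples in the duration of one video frame, 1/29.97 second.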
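[0039]
The CPU 23 determines whether counting has finished for a predetermined number of frames (step ST25). If not, it repeats the processing from step ST13 in FIG. 3.
[0040]
If counting has finished for the predetermined number of frames in step ST25 of FIG. 4, the CPU 23 determines whether the error is within the allowable range (step ST27 in FIG. 4). For example, if the sampling rate obtained in step ST17 of FIG. 3 is 48 kHz, the average number of samples per video frame period should be 48 kHz / 29.97 = 1601.60. In this embodiment, the average number of samples over the predetermined number of frames is calculated, and it is determined whether the error between this measured average and the expected average is within plus or minus 0.5%. The average is used because the number of samples stored in each block varies slightly.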
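The tolerance check of step ST27 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `within_tolerance`, its parameters, and the per-frame sample counts in the usage lines are all hypothetical.

```python
def within_tolerance(sample_counts, nominal_rate_hz, fps=29.97, tolerance=0.005):
    """Return True if the measured average samples per frame is within
    +/- `tolerance` (0.005 = 0.5%) of the value expected from the nominal rate.

    `sample_counts` holds the number of audio samples found in each of the
    first N frame blocks (hypothetical input for this sketch).
    """
    expected = nominal_rate_hz / fps               # e.g. 48000 / 29.97 = 1601.60
    measured = sum(sample_counts) / len(sample_counts)
    return abs(measured - expected) / expected <= tolerance


# Block sizes vary slightly from frame to frame, hence the averaging.
print(within_tolerance([1602, 1601, 1602, 1601, 1602], 48000))  # True
print(within_tolerance([1610] * 5, 48000))                      # False
```

The `tolerance` argument corresponds to the allowable range, which the text says may be set to other values or entered by the operator.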
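[0041]
The allowable error is not limited to this value and may be set to any range acceptable for the data being edited, for example plus or minus 0.3% or plus or minus 1%. The operator may also be allowed to enter the allowable range.
[0042]
If the error is within the allowable range, the CPU 23 sets the conversion flag f to 2 and repeats the processing from step ST13. If it exceeds the allowable range, the CPU 23 sets the conversion flag f to 1 (step ST28 in FIG. 4) and performs sampling rate conversion (step ST29). Sampling rate conversion is described with reference to FIG. 5. As shown in FIG. 5A, the audio data requiring conversion is arranged in time series. In this case the sampling rate is high, as shown in FIG. 6C, so the number of samples per unit time (for example, per frame) is large, and the interval Δtx between samples is shorter than the standard value. The CPU 23 computes a virtual line L1 connecting the sample values. As shown in FIG. 5B, this virtual line L1 is then resampled at the interval Δts that applies between samples at the standard rate. This process converts the sampling rate while leaving the frequency and amplitude of the audio data almost unchanged.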
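The linear-interpolation resampling described for FIG. 5 can be sketched as below. The function name `resample_linear` and its arguments are hypothetical, and the sketch assumes a single channel held in a Python list; it applies no filtering, since the text describes none.

```python
def resample_linear(samples, source_rate_hz, target_rate_hz):
    """Resample by connecting neighbouring samples with straight lines
    (the "virtual line") and reading that line at the target interval."""
    if len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    # Number of target-rate sample instants that fit inside the input span.
    n_out = int(last * target_rate_hz // source_rate_hz) + 1
    out = []
    for k in range(n_out):
        # Position of output sample k measured in input-sample units.
        whole, rem = divmod(k * source_rate_hz, target_rate_hz)
        i = int(whole)
        if i >= last:
            out.append(samples[last])
            continue
        frac = rem / target_rate_hz
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
    return out


# Input sampled too fast (4 Hz) brought down to the standard 2 Hz.
print(resample_linear([0, 10, 20, 30, 40], 4, 2))  # [0.0, 20.0, 40]
```

When the source rate is higher than the target, as in the FIG. 6C case, the output contains fewer samples than the input, which is what brings the per-frame sample count back to the expected value.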
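[0043]
When sampling rate conversion of the separated audio data is finished, the CPU 23 stores the converted audio data in place of the unconverted audio data and repeats the processing from step ST13 in FIG. 3.
[0044]
In this embodiment, step ST27 of FIG. 4 determines whether the sampling rate is within the allowable range for a predetermined number of frames from the beginning, and the remaining frames are processed in the same way on the basis of that determination. That is, once the processing from step ST13 in FIG. 3 through step ST27 in FIG. 4 has made the determination, the conversion flag f is either 1 or 2. Processing therefore proceeds from step ST21 in FIG. 3 to step ST22; if f = 1, the determination of steps ST23 to ST27 is skipped and the sampling rate conversion of step ST29 is performed. If f = 2, the conversion of step ST29 is not performed and the processing from step ST13 is repeated. This is because, in many video cameras, the audio sampling rate does not fluctuate but is stable at a value offset above or below the standard. The determination may of course be made for every frame instead.
[0045]
In other words, in this embodiment the conversion flag f means the following: f = 0, sampling rate determination required; f = 1, sampling rate conversion required; f = 2, sampling rate conversion not required.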
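The three states of the conversion flag f can be modelled as a small state transition, sketched below. The names `ConvertFlag` and `next_flag` are hypothetical and chosen only for illustration; the flag values 0, 1, and 2 follow the description of steps ST21 to ST28.

```python
from enum import IntEnum


class ConvertFlag(IntEnum):
    CHECK_NEEDED = 0   # sampling rate still has to be measured
    CONVERT = 1        # outside tolerance: convert every later frame
    NO_CONVERT = 2     # within tolerance: pass later frames through


def next_flag(flag, frames_counted, frames_required, within_tolerance):
    """Return the flag value after one pass of the capture loop."""
    if flag != ConvertFlag.CHECK_NEEDED:
        return flag                        # decision already made, keep it
    if frames_counted < frames_required:
        return ConvertFlag.CHECK_NEEDED    # keep counting leading frames
    return ConvertFlag.NO_CONVERT if within_tolerance else ConvertFlag.CONVERT


print(next_flag(ConvertFlag.CHECK_NEEDED, 30, 30, False).name)  # CONVERT
```

Once the flag leaves `CHECK_NEEDED` it is not re-evaluated, which mirrors the text's point that the measurement is made only on the leading frames.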
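[0046]
In this way, even when the sampling rate of the supplied audio data is off, converting it in advance to the reference sampling rate that the nonlinear video editing software uses to locate the corresponding audio data allows the position to be edited in the audio stream data to be identified on the basis of the target sampling rate when an editing command is given. This avoids audio misalignment during editing.
[0047]
Thus, correcting the sampling rate error when the audio data is captured enables video editing free of audio misalignment.
[0048]
Next, the case where the read sampling rate changes partway through is described. The following four audio formats are used in DV data.
[0049]
1) Sampling rate 48 kHz, data length 16 bits
2) Sampling rate 44.1 kHz, data length 16 bits
3) Sampling rate 32 kHz, data length 16 bits
4) Sampling rate 32 kHz, data length 12 bits
A video camera can switch the sampling rate of the audio data partway through a recording. When this happened, however, nonlinear video editing software could not handle the switch: capture either stopped, or the data was captured but stored as silence, so the audio data could not be captured correctly.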
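[0050]
In this embodiment, therefore, even when the sampling rate switches partway through, sampling rate conversion to the first extracted sampling rate is performed automatically, so that editing in the nonlinear video editing software remains possible. The following uses a switch from 48 kHz to 44.1 kHz as an example.
[0051]
After separating the video data and audio data, the CPU 23 obtains the sampling rate from the control data (step ST17). It then checks the sampling rate flag in step ST18 and, only if no standard value has been stored yet, stores the rate as the standard value (step ST41) and sets the sampling rate flag to 1 (step ST43).
[0052]
Once the standard value has been stored, the CPU 23 determines whether the stored standard value matches the sampling rate just read (step ST19). In this example, the check detects that the read rate has changed from the stored standard value of 48 kHz to 44.1 kHz.
[0053]
When the sampling rate has switched partway through, the CPU 23 sets the audio conversion flag f to 0 (step ST31) and clears the counter (step ST33). It then performs the processing from step ST29 onward.
[0054]
In this way, when the sampling rate switches partway through, the audio is not converted to the rate specified by the later control data but is resampled at the sampling rate detected first.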
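The idea of keeping the first detected rate as the target (steps ST18, ST41, ST43, and ST19) can be sketched as a small latch. The class name `TargetRateLatch` and its method are hypothetical; the sketch only shows the bookkeeping, not the resampling itself.

```python
class TargetRateLatch:
    """Remember the first declared sampling rate and flag later mismatches."""

    def __init__(self):
        self.target_rate_hz = None   # plays the role of the stored standard value

    def observe(self, declared_rate_hz):
        """Return True if this frame's declared rate differs from the target,
        meaning its audio should be converted to the target rate."""
        if self.target_rate_hz is None:
            self.target_rate_hz = declared_rate_hz   # first frame: latch it
            return False
        return declared_rate_hz != self.target_rate_hz


latch = TargetRateLatch()
# Declared rates read from the control data of successive frames.
for rate in (48000, 48000, 44100):
    print(rate, latch.observe(rate))
# 48000 False
# 48000 False
# 44100 True
```

After the switch to 44.1 kHz the target stays at 48 kHz, so all audio ends up at one rate regardless of what the later control data declares.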
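[0055]
4. Other embodiments
The DV data supplied from the DV video capture board 41 may come from any type of storage medium as long as it is stream data; the same applies to a DVD disc as well as tape.
[0056]
The determination of whether the audio stream data was generated at the target sampling rate may be made on the basis of a predetermined amount of data from the beginning, and when that leading portion is determined not to have been generated at the target sampling rate, the remaining audio stream data may be converted without repeating the determination.
[0057]
In this embodiment, the values of the individual samples constituting the given audio stream data are arranged in time series and linearly interpolated to obtain a virtual line, and this line is resampled at the target sampling rate to convert the sampling rate. With this approach, the higher the frequency, the smaller the error, and the computation is simple. Other interpolation methods may also be used; for example, a virtual spline curve may be obtained and the data converted on the basis of that curve.
[0058]
In this embodiment, the sampling rate stored in the subcode storage area of the given DV data is detected and used as the target sampling rate, but a fixed value may be used instead. The target sampling rate may also be detected from the control data area of mixed stream digital data other than DV data, such as MPEG data.
[0059]
The sampling rate first detected from the DV stream data may also be used as the target sampling rate.
[0060]
When the converter is built into a general-purpose computer, the operating system (OS) program may execute the above program. That is, the processing may be performed by the program alone or shared with the operating system (OS).
[0061]
In this embodiment, the functions shown in FIG. 1 are implemented in software using the CPU 23. However, some or all of them may be implemented in hardware such as logic circuits. For example, sampling rate conversion may be done by a logic circuit and the tolerance determination by software. Conversely, processing performed by some of the hardware may be executed by the CPU.
[0062]
In this embodiment, the program shown in FIGS. 3 and 4 is installed on the hard disk, and DV stream data supplied from the DV capture board 41 is first stored on the hard disk 26 before deciding whether to convert it. Alternatively, a separate CPU and ROM may be mounted on the DV capture board 41 with the program stored in the ROM, so that the DV capture board 41 outputs DV stream data whose sampling rate has already been matched. In that case, memory is provided to hold the audio data for the predetermined number of leading frames needed for the determination of step ST27. Since video data needs no conversion, it may be transferred to the hard disk 26 as it arrives. The memory on the DV capture board 41 then only needs to hold audio data, whose volume is relatively small. Memory large enough to hold the entire stream until the determination is complete may of course be mounted instead.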
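[Brief description of the drawings]
[FIG. 1] Functional block diagram of the nonlinear video editing system 1 according to the present invention.
[FIG. 2] Diagram showing an example of a hardware configuration in which the nonlinear video editing system 1 is implemented using a CPU.
[FIG. 3] Main flowchart of the data capture program.
[FIG. 4] Main flowchart of the data capture program.
[FIG. 5] Diagram for explaining sampling rate conversion.
[FIG. 6] Diagram showing the correspondence between video data and audio data.
[Description of reference numerals]
1 ... Nonlinear video editing apparatus
23 ... CPU
26s ... Data capture program
26e ... Video editing program
41 ... DV video capture board
[0001]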
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a nonlinear video editing apparatus, and more particularly to handling variations in the sampling rate of audio data.
[0002]
[Related technologies]
Today, DV (digital video) nonlinear video editing is attracting attention. In this method, a DV video signal is stored in an HDD (hard disk) of a personal computer via a video capture board, an arbitrary frame is taken out, and an image is edited.
[0003]
This editing process is briefly described below. DV data as shown in FIG. 6A is captured from a DV device such as a digital video (DV) camera connected via an IEEE 1394 interface (not shown). This digital data is stream data composed of a plurality of frames. Each frame consists of DV-compressed video data and audio data digitized at a predetermined sampling rate.
[0004]
As shown in FIG. 6B, the DV video capture board separates video data and audio data for each frame. The separated video data and audio data are stored in the hard disk as stream data in which each frame is continued.
[0005]
When the operator gives a display command, the video data stored on the hard disk is decompressed and the image of a given frame is displayed on the monitor. The operator views the displayed image and performs the desired editing, for example deleting the image of the seventh frame. The audio data corresponding to the seventh frame is then deleted as well. This deletion is carried out as follows. Because the sampling rate of the audio data is determined in advance, the average number of samples per frame can be calculated. Multiplying this average by 6 gives the start position of the audio data corresponding to the seventh frame, so one frame's worth of audio data is deleted from that position.
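The position arithmetic in this paragraph can be illustrated with a short sketch. The function name `frame_audio_span` is hypothetical, and the 48 kHz rate and 29.97 frames per second are example values taken from elsewhere in this description, not something this paragraph fixes.

```python
from fractions import Fraction

# Frame rate quoted in the description for NTSC-mode DV.
FPS = Fraction(2997, 100)


def frame_audio_span(frame_number, sample_rate_hz, fps=FPS):
    """Return (start, end) audio sample indices for a 1-based video frame.

    Positions are derived purely from the nominal sampling rate, as the
    editing step in the text does; `end` is exclusive.
    """
    samples_per_frame = Fraction(sample_rate_hz) / fps
    start = int((frame_number - 1) * samples_per_frame)
    end = int(frame_number * samples_per_frame)
    return start, end


# Seventh frame at 48 kHz: the start lies six average frame lengths in.
print(frame_audio_span(7, 48000))  # (9609, 11211)
```

Because the span is computed only from the nominal rate, any difference between the nominal and the real rate translates directly into a wrong position, which is the problem the following paragraphs describe.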
[0006]
[Problems to be solved by the invention]
However, the nonlinear video editing apparatus has the following problems. Depending on the DV video camera, the sampling rate of the audio data may be slightly different from a predetermined sampling rate. For example, when the sampling rate is higher than a predetermined rate, the data amount per frame becomes large as shown in FIG. 6C. However, in the editing operation, the position of the audio data corresponding to the video frame to be edited is determined based on the predetermined rate, specifically, the number of audio data per video frame. Therefore, the audio data to be edited is shifted. On the other hand, when the sampling rate is lower than the predetermined rate, the data amount per frame becomes small as shown in FIG. 6D. Therefore, also in this case, a problem of deviation occurs. Such a shift between video and audio is a problem in video editing.
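A rough sketch of how such a deviation accumulates is given below. The function name `drift_seconds` is hypothetical, and the 0.5% deviation and ten-minute duration are illustrative numbers, not measurements from the patent.

```python
def drift_seconds(elapsed_s, nominal_rate_hz, actual_rate_hz):
    """Seconds of mismatch between picture and the audio located by nominal rate.

    After `elapsed_s` seconds of video the editor looks at sample index
    elapsed_s * nominal_rate_hz, but that sample was really recorded
    index / actual_rate_hz seconds into the clip.
    """
    index = elapsed_s * nominal_rate_hz
    return elapsed_s - index / actual_rate_hz


# A camera running 0.5% fast (48240 Hz instead of 48000 Hz), ten minutes in.
print(round(drift_seconds(600, 48000, 48240), 3))  # 2.985
```

A positive result corresponds to the FIG. 6C case (rate too high) and a negative one to the FIG. 6D case (rate too low); in both, the error grows in proportion to the position in the clip.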
[0007]
Nonlinear video editing also has the following problem. Some DV video cameras allow the sampling rate of the audio data to be switched within a single tape. In that case the sampling rate of the audio data changes partway through, and either the capture stops because the audio data cannot be written to the hard disk, or the data is captured but stored as silence.
[0008]
An object of the present invention is to solve the above problems and to provide a digital data storage device, or a method therefor, that can store video data and audio data on a storage medium in accurate correspondence with each other, regardless of variations in the sampling rate of the audio data, for digital stream data in which video data and audio data are interleaved.
[0009]
A further object is to provide a nonlinear video editing apparatus, or a method therefor, that enables nonlinear video editing with video data and audio data in accurate correspondence, regardless of variations in the sampling rate of the audio data, for digital stream data in which video data and audio data are interleaved.
[0010]
[Means for Solving the Problems and Effects of the Invention]
The nonlinear video editing apparatus according to the present invention comprises: 1) separation storage means that, when given the mixed stream digital data, separates it into video stream data and audio stream data and stores them; 2) determination means that determines whether the audio stream data was generated at the target sampling rate used to locate data during editing; 3) sampling rate conversion means that, when the determination means determines that the audio stream data was not generated at the target sampling rate, converts the data so that the sampling rate of the audio stream data becomes the target sampling rate; and 4) editing means that, when given an editing command for the audio stream data, identifies the data to be edited in the audio stream data on the basis of the target sampling rate and edits the audio stream data. Accordingly, when the audio stream data was not generated at the target sampling rate used to locate data during editing, it is converted so that its sampling rate becomes the target sampling rate. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, the editing means can accurately identify and edit the corresponding audio data, so nonlinear video editing free of audio misalignment becomes possible.
[0011]
The mixed stream digital data separation storage device according to the present invention comprises: 1) separation storage means that, when given the mixed stream digital data, separates it into video stream data and audio stream data and stores them; 2) determination means that determines whether the audio stream data was generated at the target sampling rate used to locate data during editing; and 3) sampling rate conversion means that, when the determination means determines that the audio stream data was not generated at the target sampling rate, converts the data so that the sampling rate of the audio stream data becomes the target sampling rate. Accordingly, when the audio stream data was not generated at the target sampling rate, it is converted so that its sampling rate becomes the target sampling rate. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, a nonlinear video editing apparatus can accurately identify and edit the corresponding audio data.
[0012]
The mixed stream digital data separation device according to the present invention separates mixed stream digital data, in which video data and audio data are interleaved, into video stream data and audio stream data, and comprises:
separation means that, when given the mixed stream digital data, separates it into video stream data and audio stream data;
determination means that determines whether the audio stream data was generated at the target sampling rate used to locate data during editing; and
sampling rate conversion means that, when the determination means determines that the audio stream data was not generated at the target sampling rate, converts the data so that the sampling rate of the audio stream data becomes the target sampling rate.
[0013]
Accordingly, even when the audio sampling rate of the supplied mixed stream digital data varies, storing the video stream data and audio stream data in a nonlinear video editing apparatus allows the corresponding audio data to be accurately identified and edited.
[0014]
In the nonlinear video editing method according to the present invention, when the mixed stream digital data is given, it is separated into video stream data and audio stream data and stored; it is determined whether the audio stream data was generated at the target sampling rate used to locate data during editing; when it was not, the data is converted so that the sampling rate of the audio stream data becomes the target sampling rate; and when an editing command for the audio stream data is given, the data to be edited is identified on the basis of the target sampling rate and the audio stream data is edited. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, the corresponding audio data can be accurately identified and edited.
[0015]
In the data conversion method for mixed stream digital data according to the present invention, when mixed stream digital data in which video data and audio data are interleaved is given, it is separated into video stream data and audio stream data; the sampling rate of the audio stream data is measured; and when the measured sampling rate shows that the audio stream data was generated at a sampling rate exceeding the tolerance for association with the frame data constituting the video stream data, the audio data is converted so that its sampling rate falls within that association tolerance. Accordingly, the sampling rate of the audio stream data is converted whenever it is outside the association tolerance. As a result, even when the audio sampling rate of the supplied mixed stream digital data varies, the corresponding audio data can be accurately identified and edited.
[0016]
In the storage medium storing a program according to the present invention, the program determines whether the audio stream data was generated at the target sampling rate on the basis of a predetermined amount of data from the beginning, and when it determines that this leading portion was not generated at the target sampling rate, it converts the remaining audio stream data without repeating the determination. Whether the adjustment is needed can therefore be decided simply by obtaining the sampling rate of a predetermined amount of audio stream data from the beginning, which shortens the overall processing time.
[0017]
In the storage medium storing a program according to the present invention, the values of the individual samples constituting the given audio stream data are arranged in time series and linearly interpolated to obtain a virtual line, and the virtual line is resampled at the target sampling rate to convert the sampling rate. This makes simple, fast, and accurate conversion possible.
[0018]
In the storage medium storing the program according to the present invention, the program detects the sampling rate stored in the control data of the given audio stream data and sets it as the target sampling rate. Therefore, it is possible to convert to the sampling rate specified by the control data.
[0019]
In the storage medium storing the program according to the present invention, the sampling rate first extracted from the mixed stream digital data is set as the target sampling rate, and the sampling rate stored in the control data area of the given mixed stream digital data is different. Even in this case, the sampling rate is converted at the target sampling rate. Therefore, even when digitized at different sampling rates, it can be converted into audio data of one target sampling rate.
[0020]
DETAILED DESCRIPTION OF THE INVENTION
1. Functional block diagram description
An embodiment of the present invention is described with reference to the drawings. The nonlinear video editing apparatus 1 shown in FIG. 1, when given mixed stream digital data in which video data and audio data are interleaved, separates it into video stream data and audio stream data, stores them, and performs nonlinear video editing of the mixed stream digital data. It comprises separation storage means 3, determination means 5, sampling rate conversion means 7, and editing means 9.
[0021]
When given the mixed stream digital data, the separation storage means 3 captures the data, separates it into video stream data and audio stream data, and stores them. The determination means 5 detects the sampling rate of the audio stream data and determines whether the audio stream data was generated at the target sampling rate used to locate data during editing.
[0022]
When the determination means 5 determines that the audio stream data was not generated at the target sampling rate, the sampling rate conversion means 7 converts the data so that the sampling rate of the audio stream data becomes the target sampling rate. When given an editing command for the audio stream data, the editing means 9 identifies the data to be edited in the audio stream data on the basis of the target sampling rate and edits the audio stream data.
[0023]
In this way, the sampling rate of the audio stream data is measured, and when the measured rate shows that the audio stream data was generated at a sampling rate exceeding the tolerance for association with the frame data constituting the video stream data, the sampling rate of the audio data is adjusted to fall within that tolerance. This enables nonlinear video editing free of audio misalignment.
[0024]
2. Hardware configuration
FIG. 2 shows an example of a hardware configuration in which the nonlinear video editing apparatus 1 shown in FIG. 1 is realized using a CPU.
[0025]
As shown in FIG. 2, the nonlinear video editing system 1 comprises a CPU 23, a memory 27, a hard disk 26, a display unit 30, an FDD 25, a keyboard 28, a mouse 31, a DV video capture board 41, and a bus line 29.
[0026]
The CPU 23 controls each unit via the bus line 29 in accordance with a program stored on the hard disk 26. This program was read via the FDD 25 from a flexible disk 25a on which it is stored and installed on the hard disk 26. Besides a flexible disk, the program may be installed on the hard disk from another computer-readable storage medium that physically holds the program, such as a CD-ROM or an IC card. It may also be downloaded over a communication line.
[0027]
In this embodiment, the program stored on the flexible disk is executed by the computer indirectly, by installing it from the flexible disk onto the hard disk 26. However, the invention is not limited to this, and the program stored on the flexible disk may be executed directly from the FDD 25. Programs executable by a computer include not only those that can be run directly once installed, but also those that must first be converted into another form (for example, compressed programs that must be decompressed) and those that can be run in combination with other modules.
[0028]
The hard disk 26 stores a video editing program 26e, a data capturing program 26s, video data 26v, audio data 26a, and an operating system (OS) 26w. In this embodiment, Windows 98 of Microsoft Corporation is adopted as the operating system.
[0029]
The video data 26v and audio data 26a are obtained as follows: DV data from the video camera 43 is captured by the DV video capture board 41 and temporarily stored in the memory 27, then separated into video data and audio data by the CPU 23 and stored. Details are given later.
[0030]
The video editing program 26e is the same as the conventional one. Briefly, when the operator gives a read command, the image data of the specified frame in the video data 26v is decompressed and given to the display unit 30. As a result, the image to be edited is displayed on the display unit 30 frame by frame. The operator cuts and pastes the images of the displayed frames and combines them. At this time, the audio data is simultaneously processed in units of one frame in accordance with the video data. When the desired editing operation is completed, the operator gives a write command. The video data of each frame is compressed in a predetermined format and stored as video data on the hard disk. The video editing program 26e is an application program that runs on the operating system 26w.
[0031]
In addition, the memory 27 stores various calculation results and the like. The display unit 30 includes a graphic card 30a and a monitor 30b, and an image of each frame to be edited is displayed on the video editing screen of the video editing program 26e by executing the video editing program 26e. The keyboard 28 and mouse 31 are input means for inputting various commands (edit start command, edit end command, etc.).
[0032]
3. Flow chart of data import program 26s
Next, the data capture program 26s stored on the hard disk 26 is described with reference to FIGS. 3 and 4.
[0033]
The data capture program 26s has an interrupt program in addition to the main program shown in FIGS. When the operator gives a capture start command, the interrupt program determines whether or not DV stream data in which video data and audio data are interleaved is provided from the DV video capture board 41, and from the DV video capture board 41. When the DV stream data is given, it is temporarily stored in the hard disk 26. On the other hand, if DV stream data is not given, such storage processing is not performed. Such interruption processing is repeated for each frame period of video.
[0034]
The main program will be described with reference to FIGS. 3 and 4. The following describes the data capturing process for the case shown in FIG. 6C, in which the sampling rate of the audio data is higher than the standard and the read sampling rate does not change midway.
[0035]
The CPU 23 sets the sampling rate flag = 0 (step ST10 in FIG. 3), and sets the conversion flag f = 0 (step ST11). The CPU 23 determines whether or not new DV stream data is stored in the hard disk 26 (step ST13). If new DV stream data is stored in the hard disk 26, it is separated into video data and audio data and stored in the hard disk (step ST15). If no new DV stream data is stored in the hard disk 26, the process of step ST13 is repeated.
[0036]
The CPU 23 acquires the sampling rate from the control data of the DV stream data (step ST17). Specifically, when the audio data is digitized by the video camera 43, the sampling rate is embedded in the control data, so it suffices to read out this control data.
[0037]
The CPU 23 determines whether or not the sampling rate flag = 0 (step ST18). In this case, since the sampling rate flag = 0 is set in step ST10, the sampling rate read in step ST17 is stored as a standard value (step ST41). The CPU 23 sets the sampling rate flag = 1 (step ST43).
[0038]
The CPU 23 determines whether or not the sampling rate stored as the standard value matches the read sampling rate (step ST19). In this case, since the read sampling rate is not changed midway, the CPU 23 determines whether or not the conversion flag is f = 0 (step ST21). In this case, since f = 0 was set in step ST11, the CPU 23 counts the individual samples of the separated audio data and adds the count to a running total (step ST23 in FIG. 4). Since the DV stream data is divided into blocks of one video frame each (see FIG. 6A), it suffices to count the number of samples in each block. In the DV standard, one second of moving image in the NTSC mode is composed of 29.97 frames. Accordingly, each addition covers the number of samples in the time corresponding to one video frame, that is, 1/29.97 seconds.
[0039]
The CPU 23 determines whether or not the counting for a predetermined number of frames has been completed (step ST25). If the counting for the predetermined number of frames has not been completed, the processes in and after step ST13 in FIG. 3 are repeated.
[0040]
If the count for the predetermined number of frames has been completed in step ST25 in FIG. 4, the CPU 23 determines whether or not the error is within an allowable range (step ST27 in FIG. 4). For example, when the sampling rate acquired in step ST17 in FIG. 3 is 48 kHz, the average number of samples per video frame period should be 48 kHz / 29.97 = 1601.60. In the present embodiment, the average number of samples per frame is obtained over the predetermined number of frames, and it is determined whether the error between this measured average and the expected average number of samples is within 0.5%. The average is used because the number of samples stored per block varies slightly.
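The determination of step ST27 can be illustrated by the following minimal sketch. The function names, the list-based input, and the default arguments are assumptions introduced for illustration only and do not appear in the embodiment; only the frame rate of 29.97, the 48 kHz example, and the 0.5% threshold are taken from the description.

```python
NTSC_FRAME_RATE = 29.97  # frames per second in NTSC mode, as stated in the text


def expected_samples_per_frame(sampling_rate_hz, frame_rate=NTSC_FRAME_RATE):
    """Average number of audio samples that one video frame period should hold."""
    return sampling_rate_hz / frame_rate


def within_tolerance(samples_per_frame, nominal_rate_hz,
                     tolerance=0.005, frame_rate=NTSC_FRAME_RATE):
    """Return True if the measured per-frame average is within the tolerance.

    samples_per_frame: sample counts of the first N frames (hypothetical input).
    tolerance: relative error limit; 0.005 corresponds to 0.5%.
    """
    if not samples_per_frame:
        raise ValueError("at least one frame is needed")
    measured = sum(samples_per_frame) / len(samples_per_frame)
    expected = expected_samples_per_frame(nominal_rate_hz, frame_rate)
    return abs(measured - expected) / expected <= tolerance
```

Averaging over several frames, rather than testing a single frame, reflects the remark that the number of samples stored per block varies slightly.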
[0041]
Note that the allowable range of this error is not limited to this value. It may be set arbitrarily to any range that is acceptable for the data to be edited, for example within plus or minus 0.3% or within plus or minus 1%. The operator may also be allowed to input the allowable range.
[0042]
If the error is within the allowable range, the CPU 23 sets the conversion flag f to f = 2 and repeats the processing from step ST13 onward. If the allowable range is exceeded, the conversion flag f is set to f = 1 (step ST28 in FIG. 4), and sampling rate conversion processing is performed (step ST29). The sampling rate conversion process will be described with reference to FIG. 5. As shown in FIG. 5A, the audio data that requires sampling rate conversion is arranged in time series. In this case, as shown in FIG. 6C, the sampling rate is high, so the number of samples per predetermined time (for example, one frame) is large. Therefore, the time Δtx between samples is smaller than the standard value. The CPU 23 calculates a virtual straight line L1 that connects the values of the samples. As shown in FIG. 5B, the virtual straight line L1 is then sampled again at the inter-sample time Δts of the standard value. By this processing, the sampling rate can be converted with almost no change in the frequency and intensity of the audio data.
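The resampling along the virtual straight line L1 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the embodiment's program: the function name, the list-based sample representation, and the handling of the final sample are hypothetical choices.

```python
def resample_linear(samples, source_rate_hz, target_rate_hz):
    """Resample by connecting neighbouring samples with straight lines.

    samples: amplitude values spaced 1/source_rate_hz apart (the interval dtx).
    Returns values spaced 1/target_rate_hz apart (the interval dts).
    """
    if len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    # Number of output points that fit inside the duration of the input.
    out_count = int(last * target_rate_hz / source_rate_hz) + 1
    out = []
    for n in range(out_count):
        # Position of output sample n, measured in units of input samples.
        pos = n * source_rate_hz / target_rate_hz
        i = int(pos)
        if i >= last:
            out.append(samples[last])
            continue
        frac = pos - i
        # Point on the straight line between samples[i] and samples[i + 1].
        out.append(samples[i] + frac * (samples[i + 1] - samples[i]))
    return out
```

For instance, `resample_linear(block, 48100, 48000)` would map audio captured at a hypothetical 48.1 kHz onto a 48 kHz grid; the 48.1 kHz figure is an invented example and is not taken from the description.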
[0043]
When the sampling rate conversion of the separated audio data is completed, the CPU 23 stores the converted audio data in place of the audio data before the conversion, and repeats the processes from step ST13 in FIG. 3 onward.
[0044]
In this embodiment, step ST27 in FIG. 4 determines whether the sampling rate is within the allowable range for a predetermined number of frames from the beginning, and the remaining frames are processed in the same way based on that determination. That is, once the processing from step ST13 in FIG. 3 to step ST27 has determined whether the rate is within the allowable range, the conversion flag f becomes f = 1 or f = 2. Thereafter, the process proceeds from step ST21 to step ST22 in FIG. 3. If the conversion flag f is f = 1, the sampling rate conversion process of step ST29 is performed without repeating the allowable-range determination of steps ST23 to ST27. On the other hand, if the conversion flag f is f = 2, the processing from step ST13 onward is repeated without performing the sampling rate conversion process of step ST29. This is because, in many video cameras, the sampling rate of the audio data does not fluctuate but is stable at a value shifted slightly above or below the standard value. Of course, the determination of whether the rate is within the allowable range may instead be made for all frames.
[0045]
That is, in the present embodiment, the conversion flag f takes the following values: f = 0, sampling rate determination is required; f = 1, sampling rate conversion is required; and f = 2, sampling rate conversion is not required.
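The role of the conversion flag can be summarized in the following simplified sketch, which uses the values f = 0, 1, and 2 set in steps ST11, ST28, and ST27. The enum, the function name, the action labels, and the default of three measured frames are assumptions for illustration; the sketch only labels each frame and does not show whether the frames used for measurement are themselves converted.

```python
from enum import IntEnum


class ConversionFlag(IntEnum):
    NEEDS_DETERMINATION = 0  # f = 0: sampling rate determination is required
    CONVERT = 1              # f = 1: sampling rate conversion is required
    PASS_THROUGH = 2         # f = 2: sampling rate conversion is not required


def plan_actions(frame_sample_counts, nominal_rate_hz,
                 frames_to_measure=3, tolerance=0.005, frame_rate=29.97):
    """Label each frame 'measure', 'convert' or 'pass' (hypothetical labels)."""
    flag = ConversionFlag.NEEDS_DETERMINATION
    expected = nominal_rate_hz / frame_rate
    total = 0
    counted = 0
    actions = []
    for count in frame_sample_counts:
        if flag is ConversionFlag.NEEDS_DETERMINATION:
            total += count
            counted += 1
            if counted < frames_to_measure:
                actions.append("measure")
                continue
            # Enough frames counted: decide once, then reuse the decision.
            average = total / counted
            if abs(average - expected) / expected <= tolerance:
                flag = ConversionFlag.PASS_THROUGH
            else:
                flag = ConversionFlag.CONVERT
        actions.append("convert" if flag is ConversionFlag.CONVERT else "pass")
    return actions
```

Deciding once and reusing the result mirrors the stated rationale that many cameras run at a stable, slightly offset rate rather than a fluctuating one.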
[0046]
In this way, even when the sampling rate of the given audio data deviates, the audio stream data is converted to the reference sampling rate that the non-linear video editing software uses to identify the corresponding audio data. When an editing command is given, the editing target position in the audio stream data can therefore be specified based on the target sampling rate, which avoids audio misalignment during editing.
[0047]
In this way, by adjusting the sampling rate error at the time of capturing audio data, video editing can be performed without any sound shift.
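Once the audio stream is at the target sampling rate, the audio belonging to a given video frame can be located by arithmetic alone. The following sketch shows one way to do this; the function name, the zero-based frame index, and the truncation to whole sample indices are assumptions for illustration.

```python
def audio_span_for_frame(frame_index, target_rate_hz, frame_rate=29.97):
    """Return (start, end) sample indices for a zero-based video frame index.

    The span is half-open: samples start .. end - 1 belong to the frame.
    """
    per_frame = target_rate_hz / frame_rate  # average samples per video frame
    start = int(frame_index * per_frame)
    end = int((frame_index + 1) * per_frame)
    return start, end
```

With a 48 kHz target rate, the seventh frame (index 6) begins at six times the average number of samples per frame, so deleting that frame's audio amounts to removing the samples in the returned span.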
[0048]
Next, a case where the read sampling rate is changed midway will be described. DV data employs the following four audio data formats.
[0049]
1) Sampling rate 48kHz, data length 16 bits
2) Sampling rate 44.1kHz, data length 16 bits
3) Sampling rate 32kHz, data length 16 bits
4) Sampling rate 32kHz, data length 12 bits
The video camera can switch the sampling rate of the audio data during recording. However, conventional non-linear video editing software cannot cope with a sampling rate that is switched midway: capture either stops because the data cannot be captured, or the data is captured but stored as silence. In either case the audio data cannot be imported correctly.
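The four formats listed above, and the points at which a recording switches between them, can be represented as in the following sketch. The type and function names are hypothetical, and the per-frame list of modes is an assumed input format.

```python
from typing import NamedTuple


class AudioMode(NamedTuple):
    sampling_rate_hz: int
    bits_per_sample: int


# The four audio formats listed in the description.
DV_AUDIO_MODES = (
    AudioMode(48000, 16),
    AudioMode(44100, 16),
    AudioMode(32000, 16),
    AudioMode(32000, 12),
)


def rate_switch_points(frame_modes):
    """Indices of frames whose audio mode differs from the preceding frame."""
    return [i for i in range(1, len(frame_modes))
            if frame_modes[i] != frame_modes[i - 1]]
```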
[0050]
Therefore, in the present embodiment, even when the sampling rate is switched midway, editing in the non-linear video editing software is made possible by automatically converting the audio data to the sampling rate that was extracted first. Hereinafter, a case where the sampling rate is switched from 48 kHz to 44.1 kHz will be described as an example.
[0051]
The CPU 23 obtains the sampling rate from the control data after separating the video data and the audio data (step ST17). Then, referring to the sampling rate flag in step ST18, only when it is not stored as a standard value, it is stored as a standard value (step ST41), and the sampling rate flag = 1 is set (step ST43).
[0052]
On the other hand, once the standard value has been stored, it is determined whether or not the sampling rate stored as the standard value matches the read sampling rate (step ST19). In this example, the determination amounts to detecting that the read sampling rate has changed from the 48 kHz stored as the standard value to 44.1 kHz.
[0053]
When the sampling rate is switched midway, the CPU 23 sets the audio data conversion flag f to f = 0 (step ST31) and clears the counter (step ST33). The processing from step ST29 onward is then performed.
[0054]
In this way, when the sampling rate is changed midway, the audio data is resampled at the sampling rate detected first, rather than being converted to the sampling rate designated by the subsequent control data.
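The handling of a mid-stream switch in steps ST17 to ST19, ST41, and ST43 can be sketched as follows. The function name and the tuple layout of the result are assumptions for illustration; a value of `None` stands in for the state in which the sampling rate flag is 0.

```python
def conversion_targets(frame_rates_hz):
    """For each frame, return (read_rate, standard_rate, needs_conversion).

    The first rate read from the control data becomes the standard value;
    later frames whose rate differs are marked for conversion to it.
    """
    standard = None  # corresponds to sampling rate flag = 0 (nothing stored yet)
    plan = []
    for rate in frame_rates_hz:
        if standard is None:
            standard = rate  # store as the standard value, flag becomes 1
        plan.append((rate, standard, rate != standard))
    return plan
```

For a recording that starts at 48 kHz and switches to 44.1 kHz, every frame after the switch is marked for conversion back to 48 kHz, which matches the example given in the description.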
[0055]
4. Other embodiments
Note that the DV data provided from the DV video capture board 41 is stream data, and the same processing applies regardless of the type of storage medium, not only to tape but also, for example, to a DVD disk.
[0056]
Whether or not the audio stream data was generated at the target sampling rate may be determined based on a predetermined amount of data from the beginning. If it is determined that this predetermined amount of audio stream data from the beginning was not generated at the target sampling rate, the data conversion may be performed on the remaining audio stream data without repeating this determination.
[0057]
Further, in this embodiment, the values of the individual data constituting the given audio stream data are arranged in time series, a virtual straight line is obtained by linear interpolation, and the virtual straight line is sampled again at the target sampling rate to convert the sampling rate. With this approach, if the frequency is high, the error is small and the calculation is easy. However, the interpolation may be performed by another method; for example, a virtual spline curve may be obtained and the data conversion performed based on this curve.
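As one possible illustration of curve-based interpolation, the sketch below resamples along a Catmull-Rom cubic curve. The choice of Catmull-Rom, the function name, and the clamping of neighbours at both ends of the data are assumptions made here; the description only states that a spline curve may be used and does not specify which one.

```python
def resample_cubic(samples, source_rate_hz, target_rate_hz):
    """Resample along a Catmull-Rom curve through the input samples."""
    if len(samples) < 2:
        return list(samples)
    last = len(samples) - 1

    def at(index):
        # Clamp so that the first and last samples act as their own neighbours.
        return samples[min(max(index, 0), last)]

    out_count = int(last * target_rate_hz / source_rate_hz) + 1
    out = []
    for n in range(out_count):
        pos = n * source_rate_hz / target_rate_hz
        i = int(pos)
        t = pos - i
        p0, p1, p2, p3 = at(i - 1), at(i), at(i + 1), at(i + 2)
        # Catmull-Rom basis: passes through p1 at t = 0 and p2 at t = 1.
        out.append(0.5 * (2 * p1
                          + (p2 - p0) * t
                          + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                          + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3))
    return out
```

Compared with straight-line interpolation, a cubic curve costs more arithmetic per output sample in exchange for a smoother reconstruction between samples.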
[0058]
In this embodiment, the sampling rate stored in the subcode storage area in the given DV data is detected and set as the target sampling rate, but it may be a fixed value. Furthermore, a sampling rate stored in a control data area of mixed stream digital data other than DV data, for example, MPEG data may be detected and set as a target sampling rate.
[0059]
Further, as the target sampling rate, a sampling rate first detected from the DV stream data may be used.
[0060]
When the converter is implemented on a general-purpose computer, part of the processing may be carried out by the operating system (OS). In other words, the program may run on its own or share the processing with the OS.
[0061]
In the present embodiment, the CPU 23 is used to realize the function shown in FIG. 1, and this is realized by software. However, some or all of them may be realized by hardware such as a logic circuit. For example, sampling rate conversion may be performed by a logic circuit, and sampling rate tolerance determination may be performed by software. Further, processing performed by some hardware may be executed by the CPU.
[0062]
In the present embodiment, the program shown in FIGS. 3 and 4 is installed on the hard disk, and the DV stream data supplied from the DV capture board 41 is temporarily stored in the hard disk 26 before it is determined whether or not to convert the data. However, a separate CPU and ROM may be mounted on the DV capture board 41 and the program stored in the ROM, so that the DV capture board 41 outputs DV stream data with a uniform sampling rate. In this case, a memory may be provided for storing the audio data of the predetermined number of frames from the beginning that are needed for the determination in step ST27. Since the video data need not be converted, it may be transferred to the hard disk 26 sequentially. As a result, the memory provided on the DV capture board 41 need only hold audio data, whose volume is relatively small. Of course, a memory that holds all the stream data until the determination is completed may be mounted instead.
[Brief description of the drawings]
FIG. 1 is a functional block diagram of a non-linear video editing system 1 according to the present invention.
FIG. 2 is a diagram illustrating an example of a hardware configuration in which the nonlinear video editing system 1 is configured using a CPU.
FIG. 3 is a main flowchart of a data capturing program.
FIG. 4 is a main flowchart of a data capturing program.
FIG. 5 is a diagram for explaining sampling rate conversion.
FIG. 6 is a diagram illustrating the correspondence between video data and audio data.
[Explanation of symbols]
1 ... Nonlinear video editing device
23 ... CPU
26s ... Data capturing program
26e ... Video editing program
41 ... DV video capture board

Claims

A non-linear video editing apparatus that, when given mixed stream digital data in which video data and audio data are interleaved, separates and stores video stream data and audio stream data, and performs non-linear video editing of the mixed stream digital data,
When the mixed stream digital data is given, separate storage means for separating and storing video stream data and audio stream data;
Judgment means for judging whether the sampling rate of the audio stream data is generated at a target sampling rate used for data specification at the time of editing,
Sampling rate conversion means for performing data conversion so that the sampling rate of the audio stream data becomes the target sampling rate when the determining means determines that the audio stream data is not generated at the target sampling rate;
An editing means for editing the audio stream data by specifying editing target data of the audio stream data based on the target sampling rate when an audio stream data editing command is given;
A non-linear video editing apparatus characterized by comprising:

A mixed stream digital data separation and storage device that separates and stores video stream data and audio stream data when mixed stream digital data in which video data and audio data are interleaved is provided,
When the mixed stream digital data is given, separate storage means for separating and storing video stream data and audio stream data;
Judgment means for judging whether the sampling rate of the audio stream data is generated at a target sampling rate used for data specification at the time of editing,
Sampling rate conversion means for performing data conversion so that the sampling rate of the audio stream data becomes the target sampling rate when the determining means determines that the audio stream data is not generated at the target sampling rate;
A mixed stream digital data separation and storage device comprising:

A mixed stream digital data separation device that separates video stream data and audio stream data when mixed stream digital data in which video data and audio data are interleaved is provided,
Separation means for separating video stream data and audio stream data when the mixed stream digital data is given,
Judgment means for judging whether the sampling rate of the audio stream data is generated at a target sampling rate used for data specification at the time of editing,
Sampling rate conversion means for performing data conversion so that the sampling rate of the audio stream data becomes the target sampling rate when the determining means determines that the audio stream data is not generated at the target sampling rate;
A mixed stream digital data separation device comprising:

A non-linear video editing method in which, when mixed stream digital data in which video data and audio data are interleaved is given, video stream data and audio stream data are separated and stored, and non-linear video editing of the mixed stream digital data is performed,
Given the mixed stream digital data, separate and store video stream data and audio stream data,
It is determined whether the sampling rate of the audio stream data is generated at a target sampling rate used for data specification at the time of editing,
If the audio stream data is not generated at the target sampling rate, data conversion is performed so that the sampling rate of the audio stream data becomes the target sampling rate,
When an audio stream data editing command is given, based on the target sampling rate, the editing target data of the audio stream data is specified, and the audio stream data is edited.
A non-linear video editing method.

When mixed stream digital data in which video data and audio data are interleaved is given, it is separated into video stream data and audio stream data,
Measuring the sampling rate of the audio stream data;
Based on the measured sampling rate, if the audio stream data is generated at a sampling rate that exceeds the association tolerance with each frame data constituting the video stream data, data conversion is performed so that the sampling rate of the audio data falls within the range of the association tolerance;
A mixed stream digital data conversion method characterized by the above.

A storage medium storing a program that causes a computer to function as a mixed stream digital data separation and storage device that, when given mixed stream digital data in which video data and audio data are interleaved, separates and stores the data as video stream data and audio stream data, the program causing the computer to perform the following processing:
Given the mixed stream digital data, separate and store video stream data and audio stream data,
It is determined whether the sampling rate of the audio stream data is generated at a target sampling rate used for data specification at the time of editing,
When the audio stream data is not generated at the target sampling rate, data conversion is performed so that the sampling rate of the audio stream data becomes the target sampling rate.
A storage medium storing a program characterized by the above.

The storage medium storing the program according to Claim 6, characterized in that:
Whether the audio stream data is generated at the target sampling rate is determined based on a predetermined amount of data from the beginning, and
If it is determined that the predetermined amount of audio stream data from the beginning is not generated at the target sampling rate, the remaining audio stream data is converted without performing this determination.

The storage medium storing the program according to Claim 6, characterized in that:
The values of the individual data constituting the given audio stream data are arranged in time series, a virtual straight line is obtained by linear interpolation, and the virtual straight line is sampled again at the target sampling rate to convert the sampling rate.

The storage medium storing the program according to Claim 8, characterized in that:
The program detects the sampling rate stored in the control data of the given audio stream data and sets it as the target sampling rate.

The storage medium storing the program according to Claim 9, characterized in that:
The program uses the sampling rate first extracted from the mixed stream digital data as the target sampling rate, and performs sampling rate conversion to that target sampling rate even if the sampling rate stored in the control data area of the subsequently given mixed stream digital data differs.