JP2004129040A

JP2004129040A - Device and method for processing video

Info

Publication number: JP2004129040A
Application number: JP2002292255A
Authority: JP
Inventors: Hiroshi Nishikawa; 西川　寛
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2002-10-04
Filing date: 2002-10-04
Publication date: 2004-04-22

Abstract

PROBLEM TO BE SOLVED: To provide a rewind/fast-forward method that detects scene changes with a simple detection method and uses the scene change positions so that a position of interest to the user is not overlooked.
SOLUTION: A video processing device has storage means that stores, as video content, a plurality of packets, each containing a plurality of compressed images and management information for that packet, and scene change detection means that detects scene changes in the video content based on the management information of each packet. Because scene changes are detected from the packet management information, they can be detected without decompressing the compressed video or audio data, so detection requires only simple processing.
COPYRIGHT: (C)2004,JPO

Description

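[0001]
[Technical Field of the Invention]
The present invention relates to video processing technology that compresses and stores video and can play back, rewind, fast-forward, and print the stored video, and in particular to video processing technology that performs scene change detection.
[0002]
[Prior Art]
Devices that compress and store video in the MPEG (Moving Picture Experts Group) format and decompress the stored video for display have become widespread. In MPEG, the frames of a video consist of I pictures, which are intra-picture compressed, and P pictures and B pictures, which are inter-picture compressed. An I picture is an independent frame compressed within that picture alone. A P picture, a dependent frame, is inter-picture compressed by encoding its difference from the preceding I or P picture; a B picture, likewise a dependent frame, is inter-picture compressed by encoding its difference from the preceding and following I or P pictures. Inter-picture compression of this kind allows MPEG to compress video to one part in several tens of its original size.
[0003]
A plurality of these I, P, and B pictures are grouped into a packet, and a time stamp indicating the playback time, the number of bytes of the packet, and similar data are stored as management information for each packet. Displaying the video according to this management information yields smooth video synchronized with the audio data.
[0004]
As a method of fast-forward and rewind playback (hereinafter also called special playback) of MPEG-compressed video, Patent Document 1 below, for example, proposes measuring a predetermined interval during fast-forward or rewind playback and displaying only the I picture closest to that interval.
[0005]
Patent Documents 2 and 3 below, among others, propose methods of detecting scene changes during rewinding and fast-forwarding.
[0006]
[Patent Document 1]
JP-A-5-344494
[Patent Document 2]
JP-A-2000-333117
[Patent Document 3]
JP-A-2001-6236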
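[0007]
[Problems to Be Solved by the Invention]
Special playback such as fast-forward or rewind playback is used because the user wants to reach video of interest quickly. The fast-forward and rewind playback of Patent Document 1 processes the video at fixed intervals.
[0008]
For example, suppose a soccer video is stored and the user, having played a goal scene, wants to watch it again. The user enters rewind playback, in which I pictures are played at fixed intervals, watches the rewinding video carefully, and presses the play key just before the goal scene to reach and view the video of interest again.
[0009]
With this technique, however, the video is rewound at a fixed interval without any variation in pace, so a brief goal scene may be missed. Moreover, because only I pictures are played, the user may be unable to tell whether a given I picture belongs to the goal scene.
[0010]
The first problem addressed by the present invention is that, during fast-forward or rewind, the user may miss a position of interest or pass over it without recognizing it.
[0011]
Proposals have also been made to detect scene changes and use them for fast-forward and rewind. For example, Patent Document 2 detects scene changes from the video signal and changes the playback position in units of the detected scene changes.
[0012]
Detecting scene changes from the video signal, however, requires decompressing the compressed video signal and providing a difference circuit for comparison, which complicates the circuitry.
[0013]
Patent Document 3 proposes detecting scene changes from the intensity of the audio signal to determine the playback position. This, however, requires decompressing the compressed audio signal and detecting its intensity, which again complicates the circuitry.
[0014]
The second problem addressed by the present invention is this circuit complexity in Patent Documents 2 and 3.
[0015]
A first object of the present invention is to provide a rewind/fast-forward method that detects scene changes by a simple detection method and uses the scene change positions so that the user does not miss a position of interest.
[0016]
A video processing device that prints to a printer based on a video signal is also desirable, but no good method of choosing which scenes to extract and print has been proposed so far.
[0017]
A second object is therefore to provide a method that detects scene changes by a simple detection method and prints video using the detection results.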
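[0018]
[Means for Solving the Problems]
According to one aspect of the present invention, there is provided a video processing device comprising: storage means for storing a plurality of packets as video content, each packet consisting of a plurality of compressed images and management information for that packet; and scene change detection means for detecting a scene change in the video content based on the management information of each packet.
According to another aspect of the present invention, there is provided a video processing device comprising: storage means for storing video content composed of a plurality of compressed images including intra-picture compressed images and inter-picture compressed images; and special video playback means for, when playing back the video content as fast-forward or rewind processing, playing it back based on all of the intra-picture compressed images and some of the inter-picture compressed images.
According to still another aspect of the present invention, there is provided a video processing device comprising: storage means for storing video content composed of a plurality of compressed images; scene change detection means for detecting the compressed image of a scene change in the video content; and print image generation means for generating a print image based on the compressed image detected by the scene change detection means.
According to still another aspect of the present invention, there is provided a video processing device comprising: storage means for storing video content that includes video packets containing a plurality of compressed images and a plurality of audio packets, each audio packet consisting of a plurality of compressed audio items and management information for that packet; and scene change detection means for detecting a scene change in the video content based on the management information of each audio packet.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing a plurality of packets as video content, each packet consisting of a plurality of compressed images and management information for that packet; and a scene change detecting step of detecting a scene change in the video content based on the management information of each packet.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing video content composed of a plurality of compressed images including intra-picture compressed images and inter-picture compressed images; and a special video playback step of, when playing back the video content as fast-forward or rewind processing, playing it back based on all of the intra-picture compressed images and some of the inter-picture compressed images.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing video content composed of a plurality of compressed images; a scene change detecting step of detecting the compressed image of a scene change in the video content; and a print image generating step of generating a print image based on the detected compressed image of the scene change.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing video content that includes video packets containing a plurality of compressed images and a plurality of audio packets, each audio packet consisting of a plurality of compressed audio items and management information for that packet; and a scene change detecting step of detecting a scene change in the video content based on the management information of each audio packet.
[0019]
According to the present invention, scene changes are detected using the packet management information, so they can be detected without decompressing the compressed video or audio data. Scene change detection is therefore possible with simple processing. In addition, during fast-forward or rewind playback, the packet in which a scene change was detected is played back using not only the intra-picture compressed images but also the inter-picture compressed images, which prevents the user from missing a scene of interest.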
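[0020]
[Embodiments of the Invention]
(First Embodiment)
A first embodiment of the present invention is described below with reference to the drawings.
FIG. 2 is a schematic diagram showing the external appearance of a portable video device.
In the figure, B3 is a CF card (CompactFlash (R) card), a large-capacity semiconductor memory card containing nonvolatile memory.
B1 is the portable video device. It has a slot for the CF card B3; pressing the play key on the keyboard B4 starts the internal circuitry, which decompresses the compressed video and audio encoded data stored on the CF card B3 and displays the video on the display B2.
[0021]
The "Play" key on the keyboard B4 starts video playback. The "<" and ">" keys perform rewind and fast-forward during playback. The "Stop" key interrupts playback, rewind, and fast-forward. B5 is a charger that converts the supplied AC power to low-voltage DC power and supplies it to the receptacle B6.
[0022]
The portable video device B1 of this embodiment has no recording function. Encoded data recorded and compressed on another device (for example, a personal computer) is stored on a CF card, and the video is played back by inserting that CF card.
[0023]
The portable video device B1 may of course have a recording function, and it may receive video data over a wired interface such as USB or by wireless communication such as Bluetooth.
[0024]
In this embodiment, the encoded data stored on the CF card uses the Simple Profile of the MPEG-4 standard for video compression and AAC (Advanced Audio Coding) for audio compression.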
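[0025]
The MPEG-4 (Moving Picture Experts Group Phase 4) standard was standardized by ISO, an international body. As shown in FIG. 4, a stream consists of an I picture I(2), P pictures P(5) and P(8), and B pictures B(0), B(1), B(3), B(4), B(6), and B(7). An I picture is compressed independently of the preceding and following images: the variation within the image is coded using the discrete cosine transform (DCT), the resulting values are quantized, and the result is run-length and Huffman coded. A P picture is compressed with reference to a past I or P picture, and a B picture is compressed with reference to past or future I or P pictures. For both P and B pictures, the difference from the referenced image is compressed using the DCT and Huffman coding, as for I pictures.
[0026]
In video, each image is strongly correlated with its neighbors, so P and B pictures contain less data than I pictures. However, compressing with P and B pictures alone would let compression errors accumulate, so an I picture is generated about once every 15 images.
[0027]
The MPEG-4 Simple Profile used in this embodiment simplifies processing by using only I and P pictures. As shown in FIG. 5, a stream consists of an I picture I(0) and P pictures P(1) to P(14). The compressed images I(0) to P(14) are called a GOP (Group of Pictures), and each GOP, with control codes added, is stored on the CF card as one packet. Even the MPEG-4 Simple Profile can compress video to between 1/50 and 1/100 of the original.
[0028]
Audio is compressed according to the AAC (Advanced Audio Coding) standard: audio data within a fixed time window is difference-detected and compressed using the discrete cosine transform (DCT), run-length coding, Huffman coding, and other techniques also used in MPEG-4 compression.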
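[0029]
FIG. 3 shows the storage structure of the encoded data stored on the CF card B3 in this embodiment. As illustrated, one content item consists of a plurality of packets grouped into a pack.
[0030]
Each pack begins with a pack header G1, which stores the content name, the total number of packets in the content, and similar data.
[0031]
G2 is a system header, which stores the horizontal and vertical dot counts of the video, the frame rate (frames per second), the video compression method, the audio compression method, and similar data.
[0032]
G3 to G7 are the packets that store the encoded data. Packets holding images compressed with the MPEG-4 Simple Profile described with FIG. 5, grouped by GOP, and packets holding audio compressed with the AAC described above are stored in time order.
[0033]
Note that the packet size in bytes varies from packet to packet, for video and audio alike. Packets with a large amount of change are large, and packets with little change are small. As described later, this embodiment uses this property to detect scene changes for fast-forward and rewind processing.
[0034]
G8 to G14 show the structure of a video packet. G8 stores a code indicating that the packet type is video. The ID in G9 stores the packet's sequence number.
[0035]
G10 stores the number of bytes indicating the size of the packet. This byte count is large for video with much change and small for video with little change.
[0036]
The DTS in G11 is the sum of the time stamps. A time stamp is stored with each compressed image and is the time from displaying the previous image, through decompressing the encoded data, to starting display of this image. During compression, raw data is compressed according to the frame rate (the number of images compressed per second), but complex or rapidly changing video cannot be compressed within the time the frame rate allows. At 30 frames per second, for example, each image must be compressed within 33 ms (1 s / 30 frames), but complex video needs about 90 ms. The time stamp therefore varies, from 33 ms (1 s / 30 frames) at the shortest to 90 ms at the longest.
[0037]
Because the DTS is the sum of the time stamps and one packet consists of 15 pictures, the minimum DTS is 33 ms × 15 frames ≈ 0.5 s and the maximum is 90 ms × 15 frames = 1.35 s.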
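The DTS bounds quoted in paragraph [0037] follow directly from the per-picture time stamp bounds. The short Python check below reproduces that arithmetic; the constant and function names are illustrative and do not come from the patent.

```python
# Values taken from the description; the names are illustrative only.
PICTURES_PER_PACKET = 15   # one I picture plus 14 P pictures per GOP
MIN_TIMESTAMP_MS = 33      # rounded value of 1 s / 30 frames
MAX_TIMESTAMP_MS = 90      # per-picture time quoted for complex video


def dts_range_ms(pictures=PICTURES_PER_PACKET,
                 t_min=MIN_TIMESTAMP_MS,
                 t_max=MAX_TIMESTAMP_MS):
    """Return (smallest, largest) possible DTS of one packet, in milliseconds."""
    return pictures * t_min, pictures * t_max


low_ms, high_ms = dts_range_ms()
print(low_ms / 1000, high_ms / 1000)  # 0.495 1.35
```

The lower bound of 0.495 s is what the text rounds to "about 0.5 s".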
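[0038]
Because the DTS, as the sum of the time stamps, reflects the complexity of the video, this embodiment also uses the DTS (G11) as a criterion in scene change detection for fast-forward and rewind processing, as described later.
[0039]
G12 is the group of images defined by the MPEG-4 Simple Profile and, as described with FIG. 5, is called a GOP (Group of Pictures). It consists of one I picture (G13) and 14 P pictures (G14).
[0040]
G15 to G20 show the structure of an audio packet. G15 stores a code indicating that the packet type is audio. The ID in G16 stores the packet's sequence number.
[0041]
G17 stores the number of bytes indicating the size of the packet.
The DTS in G18 is the sum of the time stamps. The byte count and DTS have the same properties as for video data; see the descriptions of G10 and G11 for details.
G19 is the group of audio data defined by AAC; each audio data item (G20) is stored in fixed time units, synchronized with the video data.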
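The packet layout of FIG. 3 can be pictured as a small record type. The following is a minimal sketch, not the stored byte format: the numeric type codes, the field names, and the helper `make_video_packet` are hypothetical, and the byte count is approximated here as the payload size only.

```python
from dataclasses import dataclass, field
from typing import List

VIDEO_CODE = 0x01  # hypothetical value for the G8 type code
AUDIO_CODE = 0x02  # hypothetical value for the G15 type code


@dataclass
class Packet:
    type_code: int    # G8 / G15: video or audio
    packet_id: int    # G9 / G16: sequence number of the packet
    byte_count: int   # G10 / G17: size of the packet in bytes
    dts_ms: int       # G11 / G18: sum of the time stamps
    units: List[bytes] = field(default_factory=list)  # G12 GOP or G19 audio data

    @property
    def is_video(self) -> bool:
        return self.type_code == VIDEO_CODE


def make_video_packet(packet_id, pictures, timestamps_ms):
    """Build a video packet from compressed pictures and per-picture time stamps."""
    if len(pictures) != len(timestamps_ms):
        raise ValueError("one time stamp is needed per picture")
    return Packet(
        type_code=VIDEO_CODE,
        packet_id=packet_id,
        byte_count=sum(len(p) for p in pictures),  # assumption: payload bytes only
        dts_ms=sum(timestamps_ms),
        units=list(pictures),
    )
```

For example, a GOP with one 100-byte I picture and fourteen 10-byte P pictures, each stamped 33 ms, yields `byte_count == 240` and `dts_ms == 495`. Only these two management fields are needed by the detection steps described later.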
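[0042]
FIG. 1 is a block diagram of the portable video device B1 of this embodiment.
In the figure, B10 is an information processing circuit that receives encoded data with the structure described with FIG. 3 from the CF card B3, decompresses the compressed images and audio, and outputs them. When compressed video data is decompressed for display, the encoded data from the CF card B3 is stored, several packets at a time, in the buffer Buff0 (B11) and, addressed by the address counter ADR0 (B14), is sent to the VLC circuit (B20) in units of compressed data corresponding to 8x8 pixels of one screen of image data.
[0043]
The VLC circuit (B20) decompresses data compressed by run-length and Huffman coding and outputs the result to the inverse quantization circuit B21.
The inverse quantization circuit B21 multiplies by the quantization values and outputs the result to the inverse DCT circuit B22.
The inverse DCT circuit B22 converts the discrete-cosine-transformed, cosine-function-based data into 8x8 blocks of pixel data and outputs them to the adder B23.
When the compressed image being decompressed is a P picture, the adder adds the data pixel by pixel to the reference image to combine them, and outputs the result as decoded data.
[0044]
The above processing decompresses the compressed data of each 8x8 pixel block, and repeating it decompresses a whole image. The decoded video data is output to the display control circuit B28, converted to a displayable signal level, and output to the display B2.
[0045]
The image memory B24 stores the reference image used when decoding P pictures.
The switch B25 passes the data in the image memory B24 to the adder B23 for P pictures.
[0046]
Audio encoded data output to the speaker B30 is likewise decompressed by the VLC circuit B20, the inverse quantization circuit B21, and the inverse DCT circuit B22, filtered by the audio filter B27, and converted from a digital to an analog signal by the DA converter B26 before being output to the audio amplifier B29. The audio amplifier B29 amplifies the current and drives the speaker B30 to produce sound.
[0047]
Buff1 (B12) is a buffer used to detect scene changes during fast-forward and rewind processing and is addressed by the address counter ADR1 (B15).
Buff2 (B13) is a buffer that stores the ID (G9), byte count (G10), and DTS (G11) shown in FIG. 3 when a scene change is detected, and ADR2 (B16) is an address counter that stores the address at which the scene change was detected.
[0048]
Switches B17 and B18 pass the buffer contents and address value of Buff1 and ADR1 to Buff2 and ADR2 when a scene change is detected.
The buffers Buff1 and Buff2, the address counters ADR1 and ADR2, and the switches B17 and B18 are the characteristic components of this embodiment; their operation is described in detail with the flowcharts of FIGS. 7 and 8.
[0049]
The information processing circuit B10 may be a single semiconductor chip or several chips. For example, the buffers Buff0, Buff1, and Buff2 may be implemented as a separate chip using semiconductor memory such as SDRAM.
[0050]
A DSP (digital signal processor) may also be used to provide the functions of the VLC circuit, inverse quantization circuit, inverse DCT circuit, and so on.
[0051]
The keyboard B4 carries the keys described with FIG. 2 and outputs a key signal to the control circuit B7 when the user presses a key.
The control circuit B7 consists of a central processing unit (CPU), ROM, and RAM, and issues commands to execute processing according to the key signal.
The CPG (B8) is a clock pulse generator.
The power supply circuit B5 supplies voltage to each circuit in the device and includes a rechargeable battery, which is charged by the charger.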
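[0052]
FIG. 6 is a flowchart of the processing performed when the play key is pressed.
When the play key is pressed, step F1 checks whether fast-forward or rewind processing was in progress; if so, processing proceeds to step F4.
Steps F2 and F3 are initialization. Step F2 reads the compressed content stored on the CF card and stores it in the playback buffer Buff0.
If the content is too large to fit in Buff0, the leading part of the content is stored in Buff0.
If several content items are stored on the CF card, the user may be allowed to select one. Alternatively, content that has not yet been played, or the content with the most recent recording date, may be selected.
[0053]
Step F3 stores the start address of the playback buffer Buff0 in ADR0 so that playback begins at the start of the video content.
Steps F4 to F19 decompress the encoded data stored in one packet of the content structure described with FIG. 3.
Step F4 determines whether the encoded data in the packet is audio data. As shown in FIG. 3, the data at the head of the packet is the video code (G8) for video data and the audio code (G15) for audio data. Processing branches on this data.
[0054]
For video data, processing proceeds to step F5, which uses the VLC circuit, inverse quantization circuit, inverse DCT circuit, and so on shown in FIG. 1 to decompress each 8x8 pixel block and store it in the image memory B24. Repeating this decompresses one screen of compressed image.
If the video data is a P picture, the adder B23 shown in FIG. 1 combines it with the immediately preceding image.
After decompression, step F6 checks the time stamp and, once the display time has arrived, step F7 displays the decompressed screen.
[0055]
Step F8 updates the address counter ADR0 to point to the next compressed data.
Step F9 checks for the end of the GOP (Group of Pictures).
In the video packet structure shown in FIG. 3, if compressed images remain in the GOP (G12) to be decompressed and displayed, steps F5 to F9 are repeated.
When all compressed images in the GOP have been decompressed and displayed, processing proceeds to step F16.
[0056]
Steps F16 to F19 fetch the next packet.
Step F16 determines whether the address counter ADR0 points to the last packet, that is, whether all compressed data held in the playback buffer Buff0 has been decompressed and displayed or output as audio.
If ADR0 does not point to the last packet, processing proceeds to step F17, which selects the next packet.
If ADR0 points to the last packet, step F18 reads the next part of the content from the CF card into the playback buffer Buff0.
[0057]
Step F19 stores the start address of Buff0 in ADR0 so that playback resumes from the start of the playback buffer.
Step F20 is the end check, which determines whether playback of the content has finished.
If compressed data remains to be played, steps F4 to F20 are repeated.
For audio playback, the check in step F4 leads to step F10, which decompresses the audio data, and step F11, in which the DA converter B26 shown in FIG. 1 performs DA conversion.
Step F12 checks the time stamp and, once the output time has arrived, step F13 outputs the decompressed, DA-converted audio data to the speaker B30.
[0058]
Step F14 updates the address counter ADR0 to point to the next compressed data.
Step F15 checks for the end of the packet.
In the audio packet structure shown in FIG. 3, if compressed data remains in the audio data group (G19) to be decompressed and output, steps F10 to F15 are repeated.
When all compressed audio data in the packet has been processed, processing proceeds to step F16 to select the next packet.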
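[0059]
FIG. 7 is a flowchart of the processing performed when the fast-forward key is pressed.
When the fast-forward key is pressed, step F30 initializes the scene change search. The contents of the playback buffer Buff0 are transferred to Buff1, the search buffer used to detect scene changes during fast-forward, and the contents of ADR0, which counts addresses in Buff0, are transferred to ADR1, which counts addresses in Buff1.
The scene change buffer Buff2, which stores scene change data, and ADR2, which stores the scene change address, are cleared, as is the variable X, which stores the scene change search period.
[0060]
Step F31 determines whether the packet in Buff1 pointed to by ADR1 is audio data; if so, processing skips to step F39 onward to select the next packet.
Steps F32 to F36 are characteristic of this embodiment: they detect a scene change and store its location and contents in ADR2 and Buff2.
[0061]
Steps F32 and F33 compare the byte count of the packet in Buff1 pointed to by ADR1 with the byte count stored in Buff2. As shown in FIG. 3, the byte count of each packet is stored in G10. A large byte count means the packet holds compressed images with a large amount of change, and this embodiment uses that property to detect scene changes.
[0062]
If the byte count of the packet being examined is larger than the value stored in Buff2, step F35 transfers the value of ADR1 to ADR2. ADR2 is the address counter that identifies where the scene change occurred.
Step F36 stores the ID, byte count, and DTS of the packet being examined in Buff2.
If step F33 finds that the byte count of the packet being examined is smaller than the value stored in Buff2, steps F35 and F36 are skipped and processing proceeds to step F37.
[0063]
Step F34 applies when step F32 finds the byte count of the packet being examined equal to the value stored in Buff2; it compares the DTS stored in Buff2 with the DTS of the packet being examined. If the packet being examined has the larger DTS, that is, the larger total time-stamp duration, processing proceeds to step F35, which transfers ADR1 to ADR2 to revise the scene change location, and step F36 transfers the packet's ID, byte count, and DTS to Buff2.
[0064]
Step F37 adds the DTS (sum of time stamps) of the packet being examined to the variable X, which stores the scene change search period, and step F38 determines whether X exceeds 180 seconds. In this embodiment, one scene change is detected every 3 minutes (180 seconds).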
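The search in steps F31 to F38 amounts to finding, within one time window, the video packet with the largest byte count, with the DTS as a tie-breaker. The Python sketch below illustrates that logic under stated assumptions: the `Pkt` record and the function name are hypothetical, `best` stands in for ADR2/Buff2, `elapsed` for the variable X, and `i` for ADR1. It simplifies the flowchart by returning the index after the packet that closed the window, and it ignores buffer refills.

```python
from typing import List, NamedTuple, Optional, Tuple


class Pkt(NamedTuple):
    is_audio: bool
    packet_id: int
    byte_count: int   # G10
    dts_ms: int       # G11


def find_scene_change(packets: List[Pkt], start: int,
                      window_ms: int = 180_000) -> Tuple[Optional[int], int]:
    """Scan forward from `start`; return (index of detected packet, next index)."""
    best: Optional[int] = None   # role of ADR2 / Buff2
    elapsed = 0                  # role of variable X
    i = start                    # role of ADR1
    while i < len(packets):
        p = packets[i]
        if not p.is_audio:       # audio packets are skipped (step F31)
            if best is None:
                best = i
            else:
                b = packets[best]
                # Larger byte count wins (F32/F33); equal counts fall back to DTS (F34).
                if (p.byte_count, p.dts_ms) > (b.byte_count, b.dts_ms):
                    best = i
            elapsed += p.dts_ms  # step F37
            if elapsed > window_ms:   # step F38
                return best, i + 1
        i += 1
    return best, i
```

No picture data is touched: the function reads only the two management fields, which is the point the description makes about avoiding decompression.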
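[0065]
Alternatively, the keyboard B4 may be provided with setting input means for entering 3, 5, 7, 10 minutes, and so on, so that the scene change detection period can be changed.
A threshold may also be set for scene change detection so that, when no packet reaches a given byte size, a flag is raised indicating that no scene change occurred.
[0066]
If step F38 finds X greater than 180 seconds, scene change detection ends for the time being and processing proceeds to step F42 onward. If X is less than 180 seconds, processing proceeds to step F39, which determines whether the address counter ADR1 is at the last address. If it is, every packet in Buff1 has been examined, so detection ends for the time being and processing proceeds to step F42.
[0067]
Step F40 updates ADR1, which counts addresses in Buff1, to select the next packet for scene change detection.
Step F41 checks for the end of the content; if the content has not ended, steps F31 to F41 are repeated to continue scene change detection.
Step F42 onward is the routine that performs fast-forward playback and display.
[0068]
Step F42 uses the address counter ADR0 to select the packet in Buff0 to display. If step F43 finds that the packet is an audio packet, processing proceeds to the end check in step F49.
Otherwise, processing proceeds to step F44, which plays back and displays the I picture stored at the head of each packet, as shown in FIG. 3.
[0069]
Step F45 compares ADR0, which counts the address of the displayed packet, with ADR2, which holds the address where the scene change occurred. If they match, step F46 onward is executed; if not, processing proceeds to step F49.
Steps F46, F47, and F48 display the packet in which the scene change occurred.
Step F46 plays back and displays a P picture. As described with FIG. 1, a P picture encodes the difference from the preceding picture, so during playback the adder B23 combines it with the contents of the image memory B24, which holds the data of the preceding picture.
[0070]
Step F47 checks for the end of the GOP. If the GOP has not ended, that is, if P pictures remain to be played back and displayed, steps F46 and F47 are repeated.
Step F48 waits for one second, so the final P picture is displayed for one second.
[0071]
Step F49 checks whether the content has ended. If not, step F50 determines whether ADR0 points to the last packet. If it does not, processing proceeds to step F51, which advances ADR0 to the next packet.
[0072]
Step F52 determines whether ADR0 and ADR1 hold the same address. If they do, playback and display of the period searched for scene changes is complete, so step F55 clears X and processing returns to step F31 to detect the next scene change.
If the addresses differ, processing returns to step F42 to continue fast-forward playback and display.
[0073]
Steps F53 and F54 handle the case where ADR0 points to the last packet.
Step F53 reads the next part of the encoded data being processed from the CF card into Buff0, the playback and display buffer, and Buff1, the scene change detection buffer.
Step F54 sets the address counter ADR0 to the first packet so that subsequent playback and display starts from the first packet of Buff0. It also sets ADR1 to the first packet so that subsequent scene change detection starts from the first packet of Buff1. After step F55 clears X, processing proceeds to step F31 for scene change detection.
This concludes the description of fast-forward processing in this embodiment.
[0074]
To summarize the characteristic parts of this embodiment: in steps F31 to F41, steps F32, F33, and F34 use the packet information, namely the packet byte count and the time-stamp information DTS, to decide whether a packet is a scene change, and if so steps F35 and F36 update ADR2 and Buff2. Repeating this loop until X exceeds 180 seconds finds the largest scene change within a 180-second period. Detection relies only on the byte count and DTS in the packet information and does not process the compressed picture data. Scene change detection in this embodiment is therefore simple.
[0075]
In steps F42 to F55, for a packet without a scene change, step F44 displays only the I picture in the packet, step F45 determines that no scene change occurred in the packet, and the P pictures are not displayed. For a packet with a scene change, step F44 displays the I picture, step F45 determines that the packet contains the scene change, and step F46 plays back and displays the P pictures. Because P pictures are displayed only where a scene change was detected, more images are shown there, and the user can accurately find a scene of interest during fast-forward.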
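The display rule of steps F44 to F47 can be stated compactly: every packet contributes its leading I picture, and only the packet flagged as the scene change also contributes its P pictures. The sketch below is an illustration only; the function name and the list-of-lists representation of GOPs are assumptions, and the one-second hold of step F48 and the audio-packet branch are left out.

```python
from typing import List, Sequence, Tuple


def pictures_to_show(gops: Sequence[Sequence[str]],
                     scene_change_index: int) -> List[Tuple[int, int]]:
    """Return (packet index, picture index) pairs displayed during fast-forward."""
    shown: List[Tuple[int, int]] = []
    for idx, gop in enumerate(gops):
        shown.append((idx, 0))            # step F44: the I picture leads every GOP
        if idx == scene_change_index:     # step F45: ADR0 equals ADR2
            # steps F46/F47: also show every P picture of this GOP
            shown.extend((idx, k) for k in range(1, len(gop)))
    return shown


gops = [["I"] + ["P"] * 14 for _ in range(3)]
print(len(pictures_to_show(gops, 1)))  # 17
```

With three 15-picture GOPs and the scene change in the second one, 17 pictures are shown: one each from the first and third packets and all 15 from the second.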
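[0076]
FIG. 8 is a flowchart of rewind processing.
As in fast-forward, in steps F61 to F71, steps F62, F63, and F64 use the packet information, namely the packet byte count and the time-stamp information DTS, to decide whether a packet is a scene change, and if so steps F65 and F66 update ADR2 and Buff2. Repeating this loop until X exceeds 180 seconds finds the largest scene change within a 180-second period. Because rewind processing searches the encoded data toward the beginning, step F70 updates ADR1 to select the preceding packet as the next one to examine. Step F69 determines whether ADR1 is at the first address, checking whether all the encoded data in Buff1 has been searched for scene changes, and step F71 checks for the end of the content.
[0077]
In steps F72 to F85, for a packet without a scene change, step F74 displays only the I picture in the packet, step F75 determines that no scene change occurred in the packet, and the P pictures are not displayed. For a packet with a scene change, step F74 displays the I picture, step F75 determines that the packet contains the scene change, and step F76 plays back and displays the P pictures. Because P pictures are displayed only where a scene change was detected, more images are shown there, and the user can accurately find a scene of interest during rewind.
[0078]
In this embodiment, the video at the detected scene change position is played back including its P pictures, but fast-forward or rewind playback including P pictures may instead begin one or two packets before or after the scene change position.
[0079]
Also, although this embodiment detects scene changes from the packet information of video data alone, detection may be combined with the packet information of audio data.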
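[0080]
(Second Embodiment)
FIG. 9 is a flowchart of fast-forward processing in a second embodiment of the present invention.
This embodiment detects scene changes based on the packet information of audio data. Accordingly, when step F91 finds video data, processing branches to step F99 without performing steps F94 to F98, which detect a climax scene. For audio data, compressed data with large variations in volume or frequency has a large byte count. This embodiment exploits that fact to detect climax scenes from step F94 onward. In step F94, if the byte count G17, one item of the audio packet information shown in FIG. 3, is larger than the value stored in Buff2, processing proceeds to steps F95 and F96, which update ADR2 and Buff2. Repeating this until X counts 180 seconds stores in ADR2 the address of the packet with the largest byte count in the 180 seconds and stores its packet information in Buff2. In other words, the climax scene is detected from the size of the audio packet byte count, and the result is stored in ADR2 and Buff2.
[0081]
Step F99 determines whether ADR1 is at the last address, checking whether all the encoded data in Buff1 has been searched for scene changes. Step F100 updates ADR1 to select the next packet for scene change detection, and step F101 checks for the end of the content.
Steps F90 to F101 thus detect the packet with the largest amount of audio data.
[0082]
This processing detects the climax scene, the moment of greatest variation in volume or frequency, such as the moment a goal is scored in soccer. Steps F102 to F110 therefore detect a silent position and store it as the scene change position.
[0083]
Step F102 stores the value of ADR2 in ADR1 so that the search for a silent position starts from the climax scene. It also clears the variable X, which counts the silent-position search period.
[0084]
Step F103 determines whether the audio packet byte count is smaller than the value in Buff2.
If it is, processing proceeds to steps F104 and F105, which update ADR2 and Buff2. Repeating this until X counts 60 seconds stores in ADR2 the address of the packet with the smallest byte count in the 60 seconds and stores its packet information in Buff2. In other words, a silent position, where the volume is low and frequency variation is small, is detected from a small audio packet byte count, and the result is stored in ADR2 and Buff2.
[0085]
Steps F106 and F107 use the variable X to time the 60 seconds.
Step F106 adds the packet's DTS, its time-stamp period, to X, and step F107 checks for 60 seconds.
If X exceeds 60 seconds, the silent-position search ends, the silent position is stored as the scene change position, and processing proceeds to step F111 for playback and display.
If X has not reached 60 seconds, processing proceeds to step F108, which determines whether ADR1 is at the last address, checking whether all the encoded data in Buff1 has been searched for a silent position. Step F109 updates ADR1 to select the next packet for silent-position detection, and step F110 checks for the end of the content.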
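The two-stage search of the second embodiment (largest audio packet within 180 seconds, then smallest audio packet within the following 60 seconds) can be sketched as below. This is an illustration under assumptions: the input list is taken to contain audio packets only, the names `AudioPkt`, `find_climax`, and `find_silence_after` are hypothetical, and buffer refills and the end-of-content checks are omitted.

```python
from typing import List, NamedTuple, Optional


class AudioPkt(NamedTuple):
    byte_count: int   # G17
    dts_ms: int       # G18


def find_climax(packets: List[AudioPkt], start: int = 0,
                window_ms: int = 180_000) -> Optional[int]:
    """Index of the largest audio packet within the window (steps F94 to F98)."""
    best, elapsed = None, 0
    for i in range(start, len(packets)):
        if best is None or packets[i].byte_count > packets[best].byte_count:
            best = i
        elapsed += packets[i].dts_ms
        if elapsed > window_ms:
            break
    return best


def find_silence_after(packets: List[AudioPkt], climax: int,
                       window_ms: int = 60_000) -> int:
    """Index of the smallest audio packet in the window that starts at the climax
    (steps F102 to F107); this index is treated as the scene change position."""
    quietest, elapsed = climax, 0
    for i in range(climax, len(packets)):
        if packets[i].byte_count < packets[quietest].byte_count:
            quietest = i
        elapsed += packets[i].dts_ms
        if elapsed > window_ms:
            break
    return quietest
```

For byte counts 200, 900, 400, 100, 300 at one second per packet, the climax is index 1 and the silent position found after it is index 3.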
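[0086]
Step F111 is the playback and display routine; it is similar to FIG. 7 of the first embodiment, so its description is omitted.
This concludes the description of the second embodiment.
[0087]
This embodiment detects scene changes from the climax scene and the silent position of the audio data. Alternatively, by combining it with the first embodiment, the climax scene may be detected from the audio packet information and the scene change then detected from the video packet information.
[0088]
Conversely, silence detection using the audio packet information may follow scene change detection using the video packet information.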
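[0089]
(Third Embodiment)
FIGS. 10 and 11 relate to a third embodiment of the present invention.
In the schematic diagram of FIG. 10, the keyboard B4 of the portable video device B1 has a print key. To print, the user presses the print key, the processing shown in FIG. 11 is performed, and print data is output via the cable B51 to the printer B50 for printing.
[0090]
In the flowchart of FIG. 11, steps F120 to F133 detect scene changes. After initialization in step F120, step F121 divides the size of the whole content by 8 and stores the result in the variable Z, and step F122 tells the printer to print in 8-up print mode.
[0091]
In this embodiment, the video data is printed on a sheet of paper divided into eight areas.
The content is therefore divided into eight equal parts and a scene change is detected in each. Steps F121 and F122 prepare for this.
[0092]
Step F123 determines whether the data is audio or video. For video data, step F126 checks whether the byte count G10, one item of the video packet information shown in FIG. 3, is larger than the value stored in Buff2; if so, processing proceeds to steps F127 and F128, which update ADR2 and Buff2. This is repeated until the check in step F130, described below, finds that X has counted up to Z, so that the address of the packet with the largest byte count within the time held in Z is stored in ADR2 and its packet information in Buff2. ADR2 and Buff2 thus identify the video with the greatest scene variation.
[0093]
Step F129 adds the sum of the packet's time stamps to X, and step F130 compares X with Z.
Step F131 determines whether ADR1 is at the last address, checking whether all the encoded data in Buff1 has been searched for scene changes. Step F132 updates ADR1 to select the next packet for scene change detection, and step F133 checks for the end of the content.
The above processing identifies the scene change image within the period Z, and step F134 onward performs printing.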
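The selection of eight print images can be sketched as one pass over the video packets that keeps the largest packet in each eighth of the content. The sketch rests on an assumption: the text describes Z as the content size divided by 8 but compares it with the accumulated time stamps, so the code below treats Z as one eighth of the total duration. The function name and the `(byte_count, dts_ms)` tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple


def pick_print_packets(packets: List[Tuple[int, int]],
                       segments: int = 8) -> List[int]:
    """packets: (byte_count, dts_ms) per video packet, in time order.
    Returns the index of the largest packet in each of `segments` equal periods."""
    total_ms = sum(dts for _, dts in packets)   # Z corresponds to total_ms / segments
    picks: List[int] = []
    best: Optional[int] = None                  # role of ADR2 / Buff2
    elapsed = 0                                 # role of variable X
    for i, (size, dts) in enumerate(packets):
        if best is None or size > packets[best][0]:   # steps F126 to F128
            best = i
        elapsed += dts                                # step F129
        # Step F130, written with integers to avoid rounding: elapsed >= total / segments.
        if elapsed * segments >= total_ms and len(picks) < segments:
            picks.append(best)
            best, elapsed = None, 0
    if best is not None and len(picks) < segments:
        picks.append(best)
    return picks
```

Each returned index identifies one packet whose image would then be decompressed, filtered, and sent to the printer in steps F135 to F138.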
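[0094]
Step F134 adds 1 to the variable Y, which counts the number of images printed.
Step F135 transfers ADR2, the address of the image where the scene change occurred, to ADR0; step F136 decompresses the compressed image using the VLC circuit, inverse quantization circuit, inverse DCT circuit, and so on shown in FIG. 1; step F137 applies filtering to turn it into a print image; and step F138 outputs the video data to the printer.
[0095]
Step F139 determines whether Y has reached 8.
This checks whether eight images have been output to the printer for 8-up printing; if so, processing proceeds to step F140, which outputs a print start command to the printer and ends.
[0096]
Step F141 checks whether the content has ended, and step F142 determines whether ADR1 points to the last address of Buff1.
If it does, all video data in Buff1 has been searched for scene changes, so step F143 transfers the next part of the video data from the CF card to Buff0 and Buff1.
[0097]
Step F144 sets ADR1 to the start address of Buff1 so that scene change detection resumes from there.
After step F145 clears X, processing returns to step F123 to continue scene change processing.
This concludes the description of the third embodiment.
[0098]
Like the first embodiment, this embodiment detects scene changes from the packet information of video data, but detection may instead be based on the packet information of audio data, as in the second embodiment, or on a combination of video and audio packet information.
[0099]
In this embodiment, the content size divided by 8 is stored in Z and one scene change is detected within each such period. Alternatively, a value independent of the content size may be stored in Z, and any video data within that period whose packet size is at or above a fixed threshold may be regarded as a scene change.
[0100]
According to the first to third embodiments, scene changes are detected using the packet information, so they can be detected without decompressing the compressed video or audio data. Scene change detection is therefore possible with simple processing. In addition, during fast-forward and rewind, the scene change packet is played back including its inter-picture compressed P pictures, which prevents the user from missing a scene of interest.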
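[0101]
The above embodiments can be summarized as follows.
(1) A video processing device comprises storage means that stores a plurality of packets as video content, each packet consisting of a plurality of compressed images and management information for the packet; an information processing circuit that plays back the video content; and scene change detection means that detects scene changes in the video content, wherein the scene change detection means detects scene changes based at least on the management information of each packet. That is, scene change detection needs no complex circuitry for decompressing compressed images or detecting differences, and a simple detection method suffices.
[0102]
(2) In the video processing device, the management information of each packet includes at least information representing the storage size of the packet.
[0103]
(3) In the video processing device, the management information of each packet includes at least information on time stamps, which represent the time at which the images in the packet are displayed.
[0104]
(4) A video processing device includes storage means that stores video content composed of a plurality of compressed images, video playback means that plays the video content at normal speed, and special video playback means that plays the video content as fast-forward or rewind processing, wherein the compressed images include intra-picture compressed images and inter-picture compressed images, and the special video playback means performs at least processing that plays back only intra-picture compressed images and processing that also plays back inter-picture compressed images. That is, during rewind or fast-forward, images near a scene change, which is presumed to be a position of interest to the user, are played back using intra-picture compressed I pictures and inter-picture compressed P and B pictures, while elsewhere only the I pictures are played back. Rewind and fast-forward that do not miss a position of interest to the user are thus possible.
[0105]
(5) The video processing device has scene change detection means and address means that identifies the detected image, or the detected packet containing a plurality of images, found by the scene change detection means, and when executing the special video playback means it also plays back the inter-picture compressed images among the plurality of compressed images that include the detected image or detected packet identified by the address means.
[0106]
(6) A video processing device comprises storage means that stores video content composed of a plurality of compressed images; scene change detection means; address means that identifies the detected image, or the detected packet containing a plurality of images, found by the scene change detection means; and print image generation means that generates a print image based on the detected image or a specific compressed image contained in the detected packet. The image identified by scene change detection can thus be printed.
[0107]
(7) A video processing device comprises storage means that stores a plurality of audio packets as part of video content, each audio packet consisting of a plurality of compressed audio items and management information for the packet; an information processing circuit that plays back the video content; and scene change detection means that detects scene changes in the video content, wherein the scene change detection means detects scene changes based at least on the management information of each audio packet. Scene changes can thus be detected from the packet information of compressed audio.
[0108]
(8) In the video processing device, the management information of the audio packets includes at least information representing the storage size of the audio packet.
[0109]
(9) In the video processing device, the scene change detection means uses the information representing the storage size of the audio packets to detect a climax scene and then performs silence detection.
[0110]
This embodiment can be realized by a computer executing a program. Means for supplying the program to a computer, for example a computer-readable recording medium such as a CD-ROM on which the program is recorded, or a transmission medium such as the Internet that transmits the program, can also be applied as embodiments of the present invention. A program product such as a computer-readable recording medium on which the program is recorded can likewise be applied as an embodiment of the present invention. The program, recording medium, transmission medium, and program product fall within the scope of the present invention. Usable recording media include, for example, flexible disks, hard disks, optical disks, magneto-optical disks, CD-ROMs, magnetic tape, nonvolatile memory cards, and ROMs.
[0111]
The above embodiments are merely concrete examples of implementing the present invention, and the technical scope of the invention must not be construed restrictively on their basis. The invention can be implemented in various forms without departing from its technical idea or its principal features.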
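[0112]
[Effects of the Invention]
As described above, according to the present invention, scene changes are detected using the packet management information, so they can be detected without decompressing the compressed video or audio data. Scene change detection is therefore possible with simple processing. In addition, during fast-forward or rewind playback, the packet in which a scene change was detected is played back using not only the intra-picture compressed images but also the inter-picture compressed images, which prevents the user from missing a scene of interest.
[Brief Description of the Drawings]
[FIG. 1] A block diagram of the portable video device in the first embodiment.
[FIG. 2] A schematic diagram of the portable video device in the first embodiment.
[FIG. 3] A schematic diagram showing how video content is stored in the first embodiment.
[FIG. 4] A schematic diagram showing the picture structure in MPEG-4.
[FIG. 5] A schematic diagram showing the picture structure in the MPEG-4 Simple Profile.
[FIG. 6] A flowchart of playback processing in the first embodiment.
[FIG. 7] A flowchart of fast-forward playback processing in the first embodiment.
[FIG. 8] A flowchart of rewind playback processing in the first embodiment.
[FIG. 9] A flowchart of fast-forward playback processing in the second embodiment.
[FIG. 10] A schematic diagram of the portable video device in the third embodiment.
[FIG. 11] A flowchart of print processing in the third embodiment.
[Explanation of Reference Signs]
B1 Portable video device
B2 Display
B3 CF card
B4 Keyboard
B5 Charger
B6 Receptacle
[0001]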
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a video processing technology capable of compressing and storing a video, and playing, rewinding, fast-forwarding, and printing the stored video, and more particularly to a video processing technology for detecting a scene change.
[0002]
[Prior art]
Devices that compress and store video in the MPEG (Moving Picture Experts Group) format and decompress the stored video for display have become widespread. In MPEG, the frames of a video consist of I pictures, which are intra-picture compressed, and P pictures and B pictures, which are inter-picture compressed. An I picture is an independent frame compressed within that picture alone. A P picture, a dependent frame, is inter-picture compressed by encoding its difference from the preceding I or P picture; a B picture, likewise a dependent frame, is inter-picture compressed by encoding its difference from the preceding and following I or P pictures. Inter-picture compression of this kind allows MPEG to compress video to one part in several tens of its original size.
[0003]
A plurality of these I, P, and B pictures are grouped into a packet, and a time stamp indicating the playback time, the number of bytes of the packet, and similar data are stored as management information for each packet. Displaying the video according to this management information yields smooth video synchronized with the audio data.
[0004]
As a method of fast-forward and rewind playback (hereinafter also called special playback) of MPEG-compressed video, Patent Document 1 below, for example, proposes measuring a predetermined interval during fast-forward or rewind playback and displaying only the I picture closest to that interval.
[0005]
Further, as a method of detecting a scene change in rewinding or fast-forwarding, the following Patent Documents 2 and 3 are proposed.
[0006]
[Patent Document 1]
JP-A-5-344494
[Patent Document 2]
JP 2000-333117 A
[Patent Document 3]
JP 2001-6236 A
[0007]
[Problems to be solved by the invention]
The special reproduction such as fast-forward or rewind reproduction is performed in order to quickly reach a video of interest to the user. In the fast forward / rewind reproduction described in Patent Document 1, processing is performed at regular intervals.
[0008]
For example, if a video of soccer is stored and a goal scene is reproduced in the video and the user wants to see the goal scene again, the I-picture is reproduced at regular intervals in a rewind reproduction state. The user can watch the video scene being rewound carefully and press the play key immediately before the goal scene, so that the user can reach and view the video of interest again.
[0009]
With this technique, however, the video is rewound at a fixed interval without any variation in pace, so a brief goal scene may be missed. Moreover, because only I pictures are played, the user may be unable to tell whether a given I picture belongs to the goal scene.
[0010]
In the present invention, a first problem is that a user may miss a position of interest at the time of fast-forward or rewind as described above, or may overlook without being able to recognize it.
[0011]
Further, a proposal has been made to detect a scene change and perform fast forward and rewind. For example, in Patent Document 2 described above, a scene change is detected from a video signal, and the playback position is changed in units of a plurality of detected scene changes.
[0012]
However, when trying to detect a scene change from a video signal, the circuit becomes complicated because expansion processing is performed from the compressed video signal and a difference circuit for comparison is required.
[0013]
Patent Document 3 proposes a method of detecting a scene change from the intensity of an audio signal and specifying a reproduction position. However, in order to detect a scene change from the intensity of the audio signal, it is necessary to expand the compressed audio signal and detect the audio intensity, which again complicates the circuit.
[0014]
In the present invention, a second problem is that the circuits in Patent Documents 2 and 3 become complicated.
[0015]
A first object of the present invention is to provide a rewind/fast-forward method that detects scene changes by a simple detection method and uses the scene change positions so that the user does not miss a position of interest.
[0016]
A video processing device that prints to a printer based on a video signal is also desirable, but no good method of choosing which scenes to extract and print has been proposed so far.
[0017]
Therefore, it is a second object to provide a method of detecting a scene change by a simple detection method and printing a video using the result of the scene change.
[0018]
[Means for Solving the Problems]
According to one aspect of the present invention, there is provided a video processing device comprising: storage means for storing a plurality of packets as video content, each packet consisting of a plurality of compressed images and management information for that packet; and scene change detection means for detecting a scene change in the video content based on the management information of each packet.
According to another aspect of the present invention, there is provided a video processing device comprising: storage means for storing video content composed of a plurality of compressed images including intra-picture compressed images and inter-picture compressed images; and special video playback means for, when playing back the video content as fast-forward or rewind processing, playing it back based on all of the intra-picture compressed images and some of the inter-picture compressed images.
According to still another aspect of the present invention, there is provided a video processing device comprising: storage means for storing video content composed of a plurality of compressed images; scene change detection means for detecting the compressed image of a scene change in the video content; and print image generation means for generating a print image based on the compressed image detected by the scene change detection means.
According to still another aspect of the present invention, there is provided a video processing device comprising: storage means for storing video content that includes video packets containing a plurality of compressed images and a plurality of audio packets, each audio packet consisting of a plurality of compressed audio items and management information for that packet; and scene change detection means for detecting a scene change in the video content based on the management information of each audio packet.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing a plurality of packets as video content, each packet consisting of a plurality of compressed images and management information for that packet; and a scene change detecting step of detecting a scene change in the video content based on the management information of each packet.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing video content composed of a plurality of compressed images including intra-picture compressed images and inter-picture compressed images; and a special video playback step of, when playing back the video content as fast-forward or rewind processing, playing it back based on all of the intra-picture compressed images and some of the inter-picture compressed images.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing video content composed of a plurality of compressed images; a scene change detecting step of detecting the compressed image of a scene change in the video content; and a print image generating step of generating a print image based on the detected compressed image of the scene change.
According to still another aspect of the present invention, there is provided a video processing method comprising: a storing step of storing video content that includes video packets containing a plurality of compressed images and a plurality of audio packets, each audio packet consisting of a plurality of compressed audio items and management information for that packet; and a scene change detecting step of detecting a scene change in the video content based on the management information of each audio packet.
[0019]
According to the present invention, scene changes are detected using the packet management information, so they can be detected without decompressing the compressed video or audio data. Scene change detection is therefore possible with simple processing. In addition, during fast-forward or rewind playback, the packet in which a scene change was detected is played back using not only the intra-picture compressed images but also the inter-picture compressed images, which prevents the user from missing a scene of interest.
[0020]
[Embodiments of the Invention]
(1st Embodiment)
Hereinafter, a first embodiment of the present invention will be described with reference to the drawings.
FIG. 2 is a schematic diagram showing the appearance of the portable video device.
In the figure, B3 is a CF card (Compact Flash (R) card), which is a large-capacity semiconductor memory card having a nonvolatile memory.
B1 is a portable video device having a slot for mounting the CF card B3. When the play key of the keyboard B4 is pressed, the internal circuits are driven, and the compressed and encoded video and audio data stored in the CF card B3 is decompressed and displayed as video on the display B2.
[0021]
A "play" key on the keyboard B4 is a key for starting video playback. The "<" and ">" keys execute rewind and fast forward when reproducing video information. The "stop" key is a key for interrupting video playback, rewinding, and fast-forwarding. B5 is a charger that converts supplied AC power into low-voltage DC power and supplies it to the receptacle B6.
[0022]
Note that the portable video device B1 in the present embodiment does not have a video recording function. Encoded data obtained by recording and compressing video on another device (for example, a personal computer) is stored in a CF card, and the video is played back by mounting that CF card in the device.
[0023]
Of course, the portable video device B1 may have a recording function, and may receive video data by wire using an interface such as USB, or may receive video data by wireless communication such as Bluetooth.
[0024]
In the present embodiment, the encoded data stored in the CF card uses the MPEG4 standard simple profile method for video compression. For audio compression, AAC (Advanced Audio Coding) is employed.
[0025]
The MPEG4 (Moving Picture Experts Group Phase 4) standard is a standard established by ISO, an international organization. As shown in FIG. 4, a video is composed of an I picture I(2), P pictures P(5) and P(8), and B pictures B(0), B(1), B(3), B(4), B(6), and B(7). An I picture is coded independently of the preceding and succeeding images: the amount of change within the image is coded using the discrete cosine transform (DCT), the values are quantized, and the result is compressed by run-length coding and Huffman coding. A P picture is compressed with reference to an I or P picture that is a preceding (past) compressed image, and a B picture is compressed with reference to I or P pictures that are preceding and succeeding (past and future) compressed images. For both P pictures and B pictures, the difference from the referenced image is compressed using DCT, Huffman coding, and the like, in the same way as for an I picture.
[0026]
In video, consecutive images are strongly correlated, so the data amount of P and B pictures is smaller than that of I pictures; however, the compression error grows if compression is performed using only P and B pictures. For this reason, an I picture is created about once every 15 images.
[0027]
The MPEG4 simple profile used in the present embodiment is a compression method composed of only I pictures and P pictures in order to simplify the processing. As shown in FIG. 5, the compressed images consist of I(0) and P(1) to P(14). The compressed images from I(0) to P(14) are called a GOP (Group of Pictures); a control code is added to the GOP, and it is stored in the CF card as one packet. Even with the MPEG4 simple profile, the data can be compressed to 1/50 to 1/100 of the size of the original video.
[0028]
Audio is compressed based on the AAC (Advanced Audio Coding) standard: differences in the audio data within a fixed period are detected, and the data is compressed using the discrete cosine transform (DCT), run-length encoding, and Huffman coding, which are also used in MPEG4 compression.
[0029]
FIG. 3 shows a storage structure of encoded data stored in the CF card B3 in the present embodiment. As shown, one content is composed of a plurality of packets and packed.
[0030]
The pack header G1 is placed at the beginning of each pack and stores the name of the content, the total number of packets of the content, and the like.
[0031]
G2 is a system header that stores the number of dots in the horizontal direction of the video, the number of dots in the vertical direction, the number of frames / second, the video compression method, the audio compression method, and the like.
[0032]
G3 to G7 are groups of packets storing encoded data. The group of packets is a group of GOPs of images compressed by the MPEG4 simple profile described with reference to FIG. 5 and the group of packets of audio compressed by the AAC described above. They are stored in chronological order.
[0033]
It should be noted that, for both video and audio, the size of a packet in bytes differs from packet to packet. A packet with a large amount of change is large, and a packet with a small amount of change is small. As will be described later, this embodiment uses this feature to detect scene changes and to perform fast-forward and rewind processing.
[0034]
G8 to G14 show the structure of a video packet, and G8 stores a code indicating that the packet type is video. The ID of G9 stores the number of the packet.
[0035]
G10 stores the number of bytes indicating the size of each packet. In a video with a large amount of change, the number of bytes has a large value, and in a video with a small amount of change, the number of bytes is small.
[0036]
The DTS of G11 is the sum of the time stamps. A time stamp is stored with each compressed image and indicates the time from the display of the immediately preceding image until the display of this image starts after its compressed data has been expanded. When video is compressed, the raw data is compressed according to the frame rate (the number of images compressed per second). However, a complex video or a video with a large amount of change cannot be compressed within the time determined by the frame rate. For example, at 30 frames per second one image must be compressed within 33 ms (1 second / 30 frames), but a complex image takes about 90 ms. The time stamp values therefore vary, from a minimum of 33 ms (1 second / 30 frames) to a maximum of about 90 ms.
[0037]
The DTS is the sum of the time stamps, and one packet is composed of 15 pictures. Therefore, the minimum DTS is 33 ms × 15 frames = about 0.5 seconds, and the maximum is 90 ms × 15 frames = about 1.35 seconds.
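The range quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant and function names are hypothetical, and the 33 ms and 90 ms figures are the rounded values given in the description.

```python
# Illustrative check of the DTS range; all names here are hypothetical.
FRAMES_PER_PACKET = 15   # one I picture plus 14 P pictures per GOP
MIN_TIMESTAMP_MS = 33    # about 1 second / 30 frames, rounded as in the text
MAX_TIMESTAMP_MS = 90    # approximate figure for a complex image


def packet_dts_ms(timestamps_ms):
    """DTS of a packet: the sum of the time stamps of its pictures."""
    return sum(timestamps_ms)


min_dts = packet_dts_ms([MIN_TIMESTAMP_MS] * FRAMES_PER_PACKET)
max_dts = packet_dts_ms([MAX_TIMESTAMP_MS] * FRAMES_PER_PACKET)
print(min_dts, max_dts)  # 495 1350 (milliseconds)
```

That is, 495 ms (about 0.5 seconds) for the simplest packet and 1350 ms (1.35 seconds) for the most complex one.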
[0038]
Since the DTS, which is the sum of the time stamps, represents the complexity of the video, this embodiment uses the DTS (G11) as discrimination data, as will be described later, to detect scene changes and to perform fast-forward and rewind processing.
[0039]
G12 is the image group defined by the MPEG4 simple profile and, as described with reference to FIG. 5, is called a GOP (Group of Pictures). It consists of one I picture (G13) and 14 P pictures (G14).
[0040]
G15 to G20 show the configuration of the voice packet, and G15 stores a code indicating that the packet type is voice. The ID of G16 stores the number of the packet.
[0041]
G17 stores the number of bytes indicating the size of each packet.
G18's DTS is the sum of the time stamps. The number of bytes and the DTS have the same properties as the video data. For details, see the description of G10 and G11 of the video data.
G19 is an audio data group defined by the AAC, and each audio data (G20) is stored for each fixed time unit in synchronization with the video data.
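The packet layout described for FIG. 3 can be pictured as a small record type. This is a minimal sketch for illustration only: the class, field, and function names are hypothetical, and the byte-level encoding of the fields is not specified in the description.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical packet-type codes; the actual code values are not given in the text.
VIDEO = "video"
AUDIO = "audio"


@dataclass
class Packet:
    kind: str        # packet type code (G8 for video, G15 for audio)
    packet_id: int   # packet number (G9 / G16)
    num_bytes: int   # packet size in bytes (G10 / G17)
    dts_ms: int      # sum of the time stamps (G11 / G18), here in milliseconds
    payload: List[bytes] = field(default_factory=list)  # GOP (G12) or audio data group (G19)


def management_info(packet):
    """Return the fields used for scene change detection; the payload is never touched."""
    return (packet.packet_id, packet.num_bytes, packet.dts_ms)
```

For example, `management_info(Packet(VIDEO, 7, 12000, 495))` yields `(7, 12000, 495)` without reading any compressed picture data.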
[0042]
FIG. 1 shows a block diagram of a portable video device B1 in the present embodiment.
In the figure, reference numeral B10 denotes an information processing circuit, which receives encoded data having the structure described with reference to FIG. 3 from the CF card B3, decompresses the compressed images and audio, and outputs the result. When compressed video data is decompressed and displayed, the encoded data from the CF card B3 is stored, several packets at a time, in the buffer Buff0 (B11). The compressed data designated by the address counter ADR0 (B14), corresponding to 8 × 8 pixels of the image data of one screen, is then sent to the VLC circuit (B20).
[0043]
The VLC circuit (B20) is a circuit for expanding data compressed by run-length encoding and Huffman coding; after performing the expansion processing, it outputs the result to the inverse quantization circuit B21.
The inverse quantization circuit B21 is a circuit that multiplies by a quantization value, and outputs the value to the inverse DCT circuit B22.
The inverse DCT circuit B22 converts the data based on the cosine function subjected to the discrete cosine transform into 8 × 8 pixel data, and outputs the pixel data to the adder B23.
When the decompressed image is a P picture, the adder B23 adds the data pixel by pixel to the reference image in order to combine the two, and outputs the result as decoded data.
[0044]
With the above processing, the expansion processing of the compressed data for each 8 × 8 pixel is performed, and by repeating this processing, the expansion processing for each image is performed. The decompressed and decoded video data is output to the display control circuit B28, converted into a displayable signal level, and output to the display B2.
[0045]
The image memory B24 is a memory for storing a reference image at the time of decoding a P picture.
The switch B25 is a switch for transmitting data of the image memory B24 to the adder B23 at the time of a P picture.
[0046]
When encoded audio data is expanded and output to the speaker B30, the audio data is expanded by the VLC circuit B20, the inverse quantization circuit B21, and the inverse DCT circuit B22, filtered by the audio filter B27, converted from a digital signal into an analog signal by the DA converter B26, and output to the audio amplifier B29. The audio amplifier B29 amplifies the current and outputs it to the speaker B30, which converts it into sound.
[0047]
Buff1 of B12 is a buffer for detecting a scene change when performing fast-forward and rewind processing, and its address is designated by the address counter ADR1 of B15.
Buff2 (B13) is a buffer for storing the ID of G9, the number of bytes of G10, and the DTS of G11 shown in FIG. 3 when a scene change is detected, and ADR2 is an address counter that stores the address at which the scene change was detected.
[0048]
Switches B17 and B18 are switches for transmitting the buffer values and the address values of Buff1 and ADR1 to Buff2 and ADR2 when a scene change is detected.
The buffers Buff1, Buff2, the address counters ADR1, ADR2, and the switches B17, B18 are characteristic configurations of the present embodiment, and the processing will be described in detail with reference to the flowcharts of FIGS.
[0049]
Note that the information processing circuit B10 may be a single semiconductor chip, or may be composed of a plurality of semiconductor chips. For example, the buffers Buff0, Buff1, and Buff2 may use a semiconductor memory such as an SDRAM as a separate chip.
[0050]
Further, a DSP (Digital Signal Processor) may be used to have functions such as a VLC circuit, an inverse quantization circuit, and an inverse DCT circuit.
[0051]
The keyboard B4 is provided with the keys described with reference to FIG. 2, and outputs a key signal to the control circuit B7 when a user presses a key.
The control circuit B7 includes a central processing unit CPU, a ROM, and a RAM, and is a circuit for instructing execution of processing according to a key signal.
B8 CPG is a clock pulse generator.
The power supply circuit B5 is a circuit for supplying a voltage to each circuit in the device, and includes a rechargeable battery. The battery in the power supply circuit B5 is charged by the charger.
[0052]
FIG. 6 is a flowchart showing the flow of processing when the play key is pressed.
In flow F1, if the play key was pressed during fast-forward or rewind processing, the process proceeds to flow F4.
Flows F2 and F3 are initialization processing. In flow F2, the compressed data stored in the CF card is called and stored in the buffer Buff0 for reproduction.
If the content size is too large to fit in Buff0, the head of the content is stored in Buff0.
When a plurality of contents are stored in the CF card, the contents may be selected. Alternatively, a content that has not been reproduced may be selected, or the content with the latest recording date and time may be selected.
[0053]
In order to reproduce the video content from the beginning in the flow F3, the head address of the reproduction buffer Buff0 is stored in ADR0.
Flows F4 to F19 are the flows of processing for expanding the encoded data stored in one packet in the content configuration described with reference to FIG.
In flow F4, it is determined whether or not the encoded data stored in the packet is audio data. As shown in FIG. 3, for video data the first field of the packet (G8) holds the video code, and for audio data the first field of the packet (G15) holds the audio code. The flow branches according to this code.
[0054]
If it is video data, the flow proceeds to flow F5, where the image data is expanded for every 8 × 8 pixels using the VLC circuit, the inverse quantization circuit, the inverse DCT circuit, etc. shown in FIG. 1 and stored in the image memory B24. By repeating this, the expansion processing of the compressed image of one screen is executed.
If the video data is a P-picture, the combining process with the immediately preceding image is performed using the adder B23 shown in FIG.
After executing the decompression processing, the time stamp is determined in flow F6, and if the display time has come, display processing of one screen subjected to the decompression processing in flow F7 is performed.
[0055]
In the flow F8, the address counter ADR0 is updated to specify the next compressed data.
In the flow F9, the end of the GOP (Group of Picture) is determined.
In the structure of the video packet shown in FIG. 3, if a compressed image to be decompressed and displayed still remains in the GOP of G12, the flow F5 to F9 are repeated to execute the decompression process and the display process.
When all the compressed images in the GOP are decompressed and displayed, the flow proceeds to flow F16.
[0056]
Flow F16 to flow F19 show a process of calling the next packet.
In the flow F16, it is determined whether or not the address counter ADR0 indicates the last packet. This is to determine whether or not all the compressed data stored in the reproduction buffer Buff0 has been decompressed and displayed or output as audio.
If it is determined that the address counter ADR0 does not indicate the last packet, the flow proceeds to step F17 to specify the next packet.
If the address counter ADR0 indicates the last packet, a part of the content is called from the CF card in a flow F18 and stored in the buffer Buff0.
[0057]
In a flow F19, an address indicating the head address of Buff0 is stored in ADR0 in order to reproduce from the beginning of the reproduction buffer Buff0.
Flow F20 is an end determination, that is, a determination as to whether or not the content reproduction processing has ended.
If it is determined that there is compressed data to be reproduced, the flow F4 to F20 are repeated to execute the above-described reproduction processing.
In the audio reproduction process, the flow proceeds to flow F10 in the determination of the flow F4, the audio data is expanded, and the DA converter B26 shown in FIG. 1 performs the DA conversion in the flow F11.
The time stamp is determined in the flow F12, and if it is time to output the audio, the audio data subjected to the decompression processing and the DA conversion is output to the speaker B30 in the flow F13.
[0058]
In the flow F14, the address counter ADR0 is updated to specify the next compressed data.
The end of the packet is determined in the flow F15.
In the structure of the audio packet shown in FIG. 3, if compressed data to be decompressed and output as audio still remains in the audio data group of G19, flows F10 to F15 are repeated to execute the decompression processing and the audio output.
When all the audio compression data in the packet has been processed, the process proceeds to flow F16, and the process of instructing the next packet is performed.
[0059]
FIG. 7 is a flowchart showing the flow of processing when the fast forward key is pressed.
When the fast forward key is pressed, initialization for scene change search is performed in flow F30. In the fast-forward processing, the contents of the reproduction buffer Buff0 are transferred to Buff1, which is a search buffer for detecting a scene change, and the contents of ADR0 that counts the address of Buff0 are transferred to ADR1 that counts the address of Buff1.
Also, the contents of the scene change buffer Buff2 for storing the scene change data and the contents of the ADR2 for storing the scene change address are cleared, and the variable X for storing the search period of the scene change is cleared.
[0060]
In the flow F31, it is determined whether or not the packet in the Buff1 indicated by the ADR1 is audio data. If the data is audio data, the process proceeds to the flow F39 and the subsequent process for specifying the next packet is performed.
The processing of the flows F32 to F36 is a characteristic part of the present embodiment, and is a processing of detecting a scene change and storing the location and content of the occurrence in the ADR2 and the Buff2.
[0061]
In flows F32 and F33, the number of bytes of the packet in Buff1 indicated by ADR1 is compared with the number of bytes stored in Buff2. As shown in FIG. 3, the number of bytes of each packet is stored in G10. If the number of bytes is large, a compressed image having a large amount of change is stored in the packet. In this embodiment, a scene change is detected using this feature.
[0062]
If the number of bytes of the packet being searched is larger than the value stored in Buff2, the value of ADR1 is transferred to ADR2 in flow F35. ADR2 is an address counter that specifies a location where a scene change has occurred.
In flow F36, the ID, the number of bytes, and the DTS of the packet being searched are stored in Buff2.
If it is determined in the flow F33 that the number of bytes of the packet being searched is smaller than the value stored in Buff2, the flow proceeds to the flow F37 without executing the flows F35 and F36.
[0063]
Flow F34 is executed when flow F32 finds that the number of bytes stored in Buff2 and the number of bytes of the packet being searched are the same; it compares the DTS value stored in Buff2 with the DTS value of the packet being searched. If the DTS of the packet being searched is larger, that is, if its total time stamp time is larger, the process proceeds to flow F35, where the value of ADR1 is transferred to ADR2 to correct the location of the scene change, and then to flow F36, where the ID, the number of bytes, and the DTS of the packet being searched are transferred to Buff2.
[0064]
In flow F37, the DTS (total of the time stamps) of the packet being searched is added to the variable X, which stores the scene change search period, and in flow F38 it is determined whether the value of the variable X is greater than 180 seconds. In the present embodiment, a scene change is detected every three minutes (180 seconds).
[0065]
As another method, a setting input unit may be provided on the keyboard B4 so that the user can input three minutes, five minutes, seven minutes, ten minutes, or the like, so that the scene change detection period can be changed.
In addition, a threshold value may be set for scene change detection, and a flag indicating that there is no scene change may be set when there is no packet having a certain byte size or more.
[0066]
If it is determined in flow F38 that the variable X is greater than 180 seconds, the scene change detection is temporarily ended, and the flow proceeds to the processing from flow F42 onward. If the variable X is smaller than 180 seconds, the process proceeds to flow F39, where it is determined whether or not the address counter ADR1 indicates the last address. If it does, scene change detection has been performed on all the packets in Buff1, so the scene change detection is temporarily ended and the flow proceeds to flow F42.
[0067]
In the flow F40, the value of ADR1 for counting the address of Buff1 is updated in order to specify the next packet and detect a scene change.
The end of the content is determined in the flow F41, and if the content is not ended, the processing from the flow F31 to the flow F41 is repeated to detect a scene change.
The processing after the flow F42 is an execution routine of the fast forward reproduction and display processing.
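The search loop of flows F31 to F41 can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuit described in the embodiment: the `Packet` tuple and the function name are hypothetical, the list index stands in for the address counters, and the comparison state is reset at the start of each search window.

```python
from collections import namedtuple

# Hypothetical stand-in for the management information of one packet.
Packet = namedtuple("Packet", "kind packet_id num_bytes dts_ms")


def find_scene_change(packets, start=0, window_ms=180_000):
    """Return (best, last): the index of the largest video packet in the window
    (the role of ADR2) and the index of the last packet examined (the role of ADR1)."""
    best = None       # index of the current scene change candidate
    elapsed_ms = 0    # variable X: accumulated DTS of the searched packets
    last = start
    for i in range(start, len(packets)):
        last = i
        p = packets[i]
        if p.kind != "video":   # F31: audio packets are skipped
            continue
        # F32-F34: larger byte count wins; on a tie, larger DTS wins
        if best is None or (p.num_bytes, p.dts_ms) > (
            packets[best].num_bytes, packets[best].dts_ms
        ):
            best = i            # F35/F36: remember location and contents
        elapsed_ms += p.dts_ms  # F37
        if elapsed_ms > window_ms:  # F38: search period exceeded
            break
    return best, last
```

Only the byte count and the DTS are compared; no picture data is expanded. The search period is a parameter here, which also covers the variant in which the user selects three, five, seven, or ten minutes.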
[0068]
In the flow F42, the display packet of the buffer Buff0 is specified by the address counter ADR0, and in the flow F43, if the specified packet is the packet of the audio data, the flow proceeds to the flow F49 to perform the end determination processing.
If the packet is not a packet of audio data, the flow proceeds to flow F44, where the reproduction and display processing of the I picture stored at the head of each packet is performed as shown in FIG.
[0069]
In flow F45, ADR0 which counts the address of the display packet and ADR2 which counts the address where the scene change has occurred are compared. If they match, the process from flow F46 is executed, and if they do not match, the flow proceeds to flow F49.
Flows F46, F47, and F48 are processing for displaying a packet in which a scene change has occurred.
In flow F46, the P pictures are reproduced and displayed. As described with reference to FIG. 1, a P picture encodes the difference from the preceding picture, so at the time of reproduction it is combined by the adder B23 with the preceding picture held in the image memory B24.
[0070]
The end of the GOP is determined in the flow F47. If the GOP is not completed, that is, if there is a P picture that has not been reproduced or displayed, the processing in the flows F46 and F47 is repeated.
The process waits for one second in flow F48, whereby the last P picture is displayed for one second.
[0071]
In flow F49, it is determined whether or not the content has ended. If not, in flow F50, it is determined whether or not ADR0 is the address indicating the last packet. If the address is not the last address, the flow proceeds to flow F51, and ADR0 specifies the next packet.
[0072]
The flow F52 is for determining whether or not the addresses stored in ADR0 and ADR1 are the same. If the addresses are the same, it means that the reproduction / display processing in the period in which the scene change is detected has been completed, and the value of the variable X is cleared in the flow F55 and the flow proceeds to the flow F31 in order to detect the scene change again.
If the addresses are different, the flow returns to step F42 to continue the reproduction / display processing as the fast-forward processing.
[0073]
The processing of the flows F53 and F54 is processing when ADR0 specifies the last packet.
In flow F53, encoded data of the content being processed is read from the CF card and stored in Buff0, the buffer for reproduction and display processing, and in Buff1, the buffer for scene change detection.
In the flow F54, ADR0 of the address counter designates the first packet, and in the subsequent reproduction / display processing, processing is performed from the first packet of Buff0. Also, ADR1 specifies the first packet, and subsequent scene change detection is performed from the first packet of Buff1. After clearing the variable X in flow F55, the flow proceeds to flow F31 to detect a scene change.
The fast-forward processing in this embodiment has been described above.
[0074]
To summarize the characteristic part of the present embodiment: in flows F31 to F41, flows F32, F33, and F34 determine whether or not a scene change has occurred from the packet information, namely the number of bytes of the packet and the time stamp information DTS, and if a scene change is determined, the information stored in ADR2 and Buff2 is updated in flows F35 and F36. By repeating this processing loop until the variable X exceeds 180 seconds, the largest scene change in a 180-second period can be detected. This detection relies only on the number of bytes and the time stamp information DTS in the packet information; the compressed pictures themselves are not expanded or analyzed. Therefore, in the present embodiment, scene change detection can be performed by a simple method.
[0075]
In the processing of flows F42 to F55, for a packet in which no scene change has occurred, only the I picture in the packet is displayed in flow F44; flow F45 then detects that the packet contains no scene change, and the P pictures are not displayed. For a packet in which a scene change has occurred, the I picture in the packet is displayed in flow F44, flow F45 then detects that the packet contains a scene change, and the P pictures are reproduced and displayed in flow F46. That is, because P pictures are displayed only where a scene change is detected, many images are displayed only at scene changes, and the user can reliably find a scene of interest during fast-forwarding.
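The display rule of flows F44 to F47 can be illustrated with a short sketch. The function names and the use of picture labels in place of decoded images are assumptions made for illustration; the one-second pause of flow F48 is omitted.

```python
def pictures_to_display(gop, is_scene_change):
    """gop is a list of pictures with the I picture first, e.g. ["I0", "P1", ...].

    A normal packet contributes only its I picture (F44); a packet at the
    detected scene change contributes its P pictures as well (F46, F47)."""
    if is_scene_change:
        return list(gop)
    return gop[:1]


def fast_forward(gops, scene_change_index):
    """Collect the pictures shown while fast-forwarding over a list of GOPs."""
    shown = []
    for index, gop in enumerate(gops):
        shown.extend(pictures_to_display(gop, index == scene_change_index))
    return shown


gops = [["I0", "P1", "P2"], ["I3", "P4", "P5"], ["I6", "P7", "P8"]]
print(fast_forward(gops, 1))  # ['I0', 'I3', 'P4', 'P5', 'I6']
```

Only the second GOP, which is marked as the scene change, is shown in full; the others are represented by their I picture alone.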
[0076]
FIG. 8 is a flowchart of the rewinding process.
Basically, as in the fast-forward processing, in flows F61 to F71 whether or not a scene change has occurred is determined in flows F62, F63, and F64 from the packet information, namely the number of bytes of the packet and the DTS as time stamp information. If it is determined that a scene change has occurred, the information stored in ADR2 and Buff2 is updated in flows F65 and F66. By repeating this processing loop until the variable X exceeds 180 seconds, the largest scene change in a 180-second period can be detected. In the rewinding process, the scene change search proceeds toward the beginning of the encoded data, so in flow F70 ADR1 is updated to specify the immediately preceding packet. In flow F69, it is determined whether or not ADR1 is the head address, to check whether the scene change search has covered all the encoded data in Buff1. The end of the content is determined in flow F71.
[0077]
In the processing of flows F72 to F85, for a packet in which no scene change has occurred, only the I picture in the packet is displayed in flow F74; flow F75 then detects that the packet contains no scene change, and the P pictures are not displayed. For a packet in which a scene change has occurred, the I picture in the packet is displayed in flow F74, flow F75 then detects that the packet contains a scene change, and the P pictures are reproduced and displayed in flow F76. That is, because P pictures are displayed only where a scene change is detected, many images are displayed only at scene changes, and the user can reliably find a scene of interest during rewinding.
[0078]
In the present embodiment, the video at the position where the scene change is detected is reproduced including the P pictures; however, fast-forward or rewind reproduction including the P pictures may instead start from a position within one or two packets of the position where the scene change was detected.
[0079]
Further, in the present embodiment, the scene change detection is detected only by the packet information of the video data. However, the scene change detection may be performed in combination with the packet information of the audio data.
[0080]
(Second embodiment)
FIG. 9 is a flowchart of the fast-forward processing according to the second embodiment of the present invention.
In the present embodiment, a scene change is detected based on the packet information of the audio data. For this reason, when flow F91 determines that a packet is video data, the flow branches to flow F99 without performing flows F94 to F98, which are the processing for detecting a lively scene. In audio data, compressed data with large fluctuations in volume and frequency has a large number of bytes. The present embodiment takes advantage of this, and the lively scene is detected in flow F94 and the subsequent flows. In flow F94, if the number of bytes G17, one item of the packet information of the audio data shown in FIG. 3, is larger than the value stored in Buff2, the flow proceeds to flows F95 and F96, and the contents of ADR2 and Buff2 are updated. By repeating this process until the variable X counts 180 seconds, the address of the packet having the largest number of bytes in 180 seconds is stored in ADR2, and its packet information is stored in Buff2. That is, the lively scene is detected from the size of the audio packets in bytes, and the result is stored in ADR2 and Buff2.
[0081]
In the flow F99, it is determined whether or not ADR1 is the last address, and it is checked whether or not a scene change search has been performed for all the encoded data of Buff1. In the flow F100, the ADR1 is updated for detecting the scene change of the next packet, and the end of the content is determined in the flow F101.
As described above, the processing of flows F90 to F101 detects the audio packet with the largest amount of data.
[0082]
This process detects an exciting scene, that is, a moment of large fluctuation in volume and frequency, such as the moment a goal is scored in soccer. For this reason, in flows F102 to F110, a silent position is detected, and the silent position is stored as the position of the scene change.
[0083]
In flow F102, the value of ADR2 is stored in ADR1. This is to start the silent position detection process from the exciting scene. Further, the variable X for counting the silent position detection period is cleared.
[0084]
In flow F103, it is determined whether or not the number of bytes of the voice packet is smaller than the value of Buff2.
If it is smaller, the process proceeds to flows F104 and F105, and the storage contents of ADR2 and Buff2 are updated. By repeating this process until the variable X counts 60 seconds, the address of the packet having the smallest number of bytes in 60 seconds is stored in ADR2, and the packet information is stored in Buff2. That is, the detection of the silent position where the volume is small and the frequency fluctuation is small is detected by the small number of bytes of the voice packet, and the result is stored in ADR2 and Buff2.
[0085]
Flows F106 and F107 are processing for detecting 60 seconds using the variable X.
In flow F106, the value of DTS, which is the time stamp period of the packet, is added to the variable X, and in flow F107, a determination of 60 seconds is made.
If the variable X exceeds 60 seconds, the detection of the silent position is terminated, the silent position is stored as the position at which the scene is changed, the flow proceeds to F111, and the reproduction / display processing is performed.
If the variable X has not reached 60 seconds, the flow proceeds to flow F108, where it is determined whether or not ADR1 is the last address, and it is checked whether or not the silent position search has been performed for all the coded data of Buff1. In flow F109, ADR1 is updated for silent position detection of the next packet, and the end of the content is determined in flow F110.
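The two-stage search of the second embodiment (largest audio packet within 180 seconds, then smallest audio packet within the following 60 seconds) can be sketched as below. The `Packet` tuple and both function names are hypothetical, list indices stand in for ADR1 and ADR2, and the sketch covers only the packets currently held in the buffer.

```python
from collections import namedtuple

# Hypothetical stand-in for the management information of one packet.
Packet = namedtuple("Packet", "kind packet_id num_bytes dts_ms")


def _search_audio(packets, start, window_ms, better):
    """Scan audio packets from `start`, keeping the index whose byte count is 'better'."""
    best = None
    elapsed_ms = 0  # variable X
    for i in range(start, len(packets)):
        p = packets[i]
        if p.kind != "audio":   # video packets are skipped (F91)
            continue
        if best is None or better(p.num_bytes, packets[best].num_bytes):
            best = i
        elapsed_ms += p.dts_ms
        if elapsed_ms > window_ms:
            break
    return best


def find_audio_scene_change(packets, start=0, lively_ms=180_000, silent_ms=60_000):
    """Return (lively, silent): indices of the lively scene and of the silent
    position that follows it; the silent position is used as the scene change."""
    lively = _search_audio(packets, start, lively_ms, lambda a, b: a > b)
    if lively is None:
        return None, None
    # F102: the silent-position search starts from the lively scene itself
    silent = _search_audio(packets, lively, silent_ms, lambda a, b: a < b)
    return lively, silent
```

As in the first embodiment, only byte counts and DTS values are read; no audio data is expanded.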
[0086]
The flow F111 is a reproduction and display processing routine, which is similar to that of the first embodiment shown in FIG.
The second embodiment has been described above.
[0087]
In the present embodiment, a scene change is detected based on a lively scene of audio data and a silent position. As another method, by combining with the first embodiment, a scene change may be detected based on video packet information after detecting a lively scene using audio packet information.
[0088]
Alternatively, after a scene change is detected based on the video packet information, silence detection may be performed based on the audio packet information.
[0089]
(Third embodiment)
FIG. 10 and FIG. 11 are drawings related to the third embodiment of the present invention.
In the schematic diagram of FIG. 10, a print key is provided on the keyboard B4 of the portable video device B1. When the user presses the print key, the process shown in FIG. 11 is performed, and the print data is output via the cable B51 to the printer B50 for print processing.
[0090]
In the flowchart of FIG. 11, a scene change is detected in flows F120 to F133. After the initialization processing in flow F120, the size of the entire content is divided by 8 in flow F121 and the result is stored in a variable Z; in flow F122, the printer is notified that printing is to be performed in the 8-division print mode.
[0091]
In the present embodiment, one sheet of paper is divided into eight areas, and image data is printed in each area.
For this reason, the content is divided into eight equal parts, and a scene change is detected in each part. Flows F121 and F122 are the preparatory processing for this.
[0092]
In flow F123, it is determined whether the data is audio data or video data. If the data is video data, in flow F126 the number of bytes G10, which is one item of the packet information of the video data shown in FIG., is compared with the value of Buff2. If it is larger, the flow proceeds to flows F127 and F128, and the storage contents of ADR2 and Buff2 are updated. This processing is repeated until the variable X counts up to the value of the variable Z, as determined in flow F130 described later, so that the address of the packet having the largest number of bytes within the time stored in the variable Z is stored in ADR2, and its packet information is stored in Buff2. In other words, the video with the largest scene variation can be identified by ADR2 and Buff2.
[0093]
In a flow F129, the total value of the packet time stamps is added to the variable X, and in a flow F130, the variables X and Z are compared.
In the flow F131, it is determined whether or not ADR1 is the last address, and it is checked whether or not a scene change search has been performed for all the encoded data of Buff1. In flow F132, ADR1 is updated to detect the scene change of the next packet, and the end of the content is determined in flow F133.
With the above processing, the scene change image within the period of the variable Z is specified, and the printing processing is performed in the processing after the flow F134.
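The selection performed in flows F121 to F133 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the "size" divided by 8 is taken to be playback time, the `VideoPacket` record and the function name are hypothetical, and the direction of the comparison in flow F130 is assumed to be "X has reached Z".

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class VideoPacket:
    """Hypothetical view of one video packet's management information."""
    address: int     # storage address of the packet
    num_bytes: int   # byte count G10 from the management information
    duration: float  # playback time covered by the packet, in seconds


def pick_print_packets(packets: Sequence[VideoPacket],
                       parts: int = 8) -> List[VideoPacket]:
    """Split the content into `parts` equal time spans and return the
    largest packet found in each span."""
    total = sum(p.duration for p in packets)
    span = total / parts                   # variable Z (F121)
    picked: List[VideoPacket] = []
    best: Optional[VideoPacket] = None     # plays the role of ADR2/Buff2
    elapsed = 0.0                          # plays the role of variable X
    for pkt in packets:
        if best is None or pkt.num_bytes > best.num_bytes:  # F126
            best = pkt                                      # F127, F128
        elapsed += pkt.duration                             # F129
        if elapsed >= span and len(picked) < parts:         # F130
            picked.append(best)       # handed to the print steps from F134 on
            best, elapsed = None, 0.0                       # F145 clears X
    # Rounding can leave a final, slightly short span; keep its best packet.
    if best is not None and len(picked) < parts:
        picked.append(best)
    return picked
```

Each returned packet would then be decompressed and sent to the printer; only these eight packets need decoding.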
[0094]
In a flow F134, 1 is added to a variable Y for counting the number of print images.
In flow F135, the value of ADR2, which is the address of the image where the scene change has occurred, is transferred to ADR0. In flow F136, the compressed image is decompressed using the VLC circuit, inverse quantization circuit, inverse DCT circuit, etc. shown in FIG. After the decompressed image is subjected to filter processing for making a print image in flow F137, the video data is output to the printer in flow F138.
[0095]
In the flow F139, it is determined whether or not the variable Y has become 8.
This is a determination as to whether or not eight images have been output to the printer in order to perform eight-division printing. If they match, the flow advances to F140 to output a print start instruction to the printer, and ends.
[0096]
In flow F141, it is determined whether or not the content is completed, and in flow F142, it is determined whether ADR1 designates the last address of Buff1.
As a result of the determination, if the last address is specified, scene change detection has been completed for all the video data of Buff1, and the continuation of the video data is transferred from the CF card to Buff0 and Buff1 in flow F143.
[0097]
In the flow F144, ADR1 specifies the head address of Buff1, and detects a scene change from the head address of Buff1.
After clearing the variable X in the flow F145, the flow returns to the flow F123, and the scene change processing is continuously performed.
The third embodiment has been described above.
[0098]
In the present embodiment, as in the first embodiment, scene change detection is performed based on packet information of video data. However, as in the second embodiment, scene change detection may instead be performed based on packet information of audio data. Alternatively, scene change detection may be performed based on a combination of packet information of video data and audio data.
[0099]
In the present embodiment, a value obtained by dividing the content size by 8 is entered in the variable Z, and one scene change is detected during that period. However, a value independent of the content size may be entered in the variable Z, and video data having a packet size equal to or larger than a predetermined threshold value during that period may be regarded as a scene change.
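The threshold variant can be illustrated with a short Python sketch. The function name and the `(address, num_bytes)` pair layout are assumptions made for illustration, and the bookkeeping of the period held in the variable Z is omitted.

```python
from typing import List, Sequence, Tuple


def scene_changes_by_threshold(packets: Sequence[Tuple[int, int]],
                               threshold_bytes: int) -> List[int]:
    """Return the addresses of video packets whose byte count is equal to
    or larger than the threshold. Each packet is an (address, num_bytes) pair."""
    return [addr for addr, num_bytes in packets if num_bytes >= threshold_bytes]
```

Unlike the eight-part division, this variant can report several scene changes, or none, for a given stretch of content.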
[0100]
According to the first to third embodiments, since a scene change is detected using packet information, a scene change can be detected without decompressing compressed video or audio data. For this reason, scene change detection with simple processing contents becomes possible. Further, when fast-forward / rewind processing is executed, the P pictures subjected to inter-image compression processing are also reproduced for the images of a scene change packet, so that it is possible to prevent the user from missing a scene of interest.
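The picture selection during fast-forward described above can be sketched in Python as follows. The `Picture` record and the function name are hypothetical; the sketch assumes that the set of scene change packets has already been determined from the packet information and that pictures are listed in playback order (for rewind, the resulting list would be traversed packet by packet in reverse).

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass
class Picture:
    """Hypothetical picture record; `kind` is 'I', 'P' or 'B'."""
    packet: int  # index of the packet that holds the picture
    kind: str


def pictures_for_fast_forward(pictures: Sequence[Picture],
                              scene_change_packets: Set[int]) -> List[Picture]:
    """Keep every I picture, plus the inter-coded pictures of packets
    flagged as scene changes."""
    return [pic for pic in pictures
            if pic.kind == "I" or pic.packet in scene_change_packets]
```

Outside the flagged packets only I pictures remain, so playback skips quickly; inside them every picture is kept, so the scene of interest is shown in full.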
[0101]
The above embodiments are summarized as follows.
(1) The video processing device includes a storage unit that stores a plurality of compressed images and management information for each packet as one packet and stores a plurality of the packets as video content, an information processing circuit that reproduces the video content, and a scene change detecting means that detects a scene change in the video content, wherein the scene change detecting means detects a scene change based on at least the management information for each packet. That is, a complicated circuit for decompressing compressed images and detecting differences is unnecessary, and a scene change can be detected by a simple detection method.
[0102]
(2) The video processing device is characterized in that the management information for each packet includes at least information indicating a storage size of the packet.
[0103]
(3) The video processing device is characterized in that the management information for each packet includes at least information on a time stamp indicating a time for displaying an image in the packet.
[0104]
(4) The video processing device includes a storage unit for storing video content composed of a plurality of compressed images, a video reproduction unit for reproducing the video content at a normal reproduction speed, and a special video reproduction unit for reproducing the video content as fast-forward or rewind processing, wherein the plurality of compressed images include compressed images subjected to intra-image compression processing and compressed images subjected to inter-image compression processing, and, when fast-forward or rewind processing is executed, at least a process of reproducing only compressed images subjected to intra-image compression processing and a process of reproducing that also includes compressed images subjected to inter-image compression processing are performed. That is, when rewinding or fast-forwarding, for images near a scene change, which is presumed to be a position of interest to the user, both the intra-image-compressed I pictures and the inter-image-compressed P pictures or B pictures are reproduced; elsewhere, only the intra-image-compressed I pictures are reproduced. For this reason, it is possible to perform the rewind / fast-forward processing without missing the position in which the user is interested.
[0105]
(5) The video processing device includes a scene change detection unit and an address unit that specifies a detected image detected by the scene change detection unit or a detected packet including a plurality of images, wherein, for the plurality of compressed images including the detected image or contained in the detected packet specified by the address unit, compressed images subjected to inter-image compression processing are also reproduced.
[0106]
(6) The video processing device includes a storage unit for storing video content composed of a plurality of compressed images, a scene change detection unit, a unit that specifies a detected image detected by the scene change detection unit or a detected packet including a plurality of images, and a print image generating unit that generates a print image based on the detected image or a specified compressed image included in the detected packet. For this reason, it becomes possible to print the image specified by the scene change detection.
[0107]
(7) The video processing device includes a storage unit that stores a plurality of items of compressed audio and management information for each packet as one audio packet and stores a plurality of the audio packets as a part of video content, an information processing circuit that reproduces the video content, and a scene change detecting unit that detects a scene change in the video content, wherein the scene change detecting unit detects a scene change based on at least the management information for each audio packet. For this reason, it is possible to detect a scene change based on the packet information of the compressed audio.
[0108]
(8) The video processing device is characterized in that the management information of the audio packet includes at least information indicating a storage size of the audio packet.
[0109]
(9) The video processing device is characterized in that the scene change detection means performs silent detection after detecting a lively scene using information representing the storage size of the audio packet.
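The two-stage audio detection of item (9) can be sketched in Python as follows. This is only an illustration under stated assumptions: the lively scene is taken to be the largest audio packet in the list, the silent position is taken to be the smallest packet at or after it, and the function name and the use of a plain list of sizes are hypothetical. The 60-second search window of the second embodiment is omitted for brevity.

```python
from typing import Sequence, Tuple


def lively_then_silent(sizes: Sequence[int]) -> Tuple[int, int]:
    """Return (index of the largest audio packet, index of the smallest
    audio packet at or after it)."""
    if not sizes:
        raise ValueError("sizes must not be empty")
    # Stage 1: the lively scene is assumed to be the largest packet.
    peak = max(range(len(sizes)), key=sizes.__getitem__)
    # Stage 2: the silent position is the smallest packet from the peak onward.
    quiet = min(range(peak, len(sizes)), key=sizes.__getitem__)
    return peak, quiet
```

Searching for silence only after the peak means that a quiet passage preceding the lively scene is not mistaken for the scene change that follows it.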
[0110]
This embodiment can be realized by a computer executing a program. Further, means for supplying the program to the computer, for example, a computer-readable recording medium such as a CD-ROM on which the program is recorded, or a transmission medium such as the Internet for transmitting the program, can also be applied as an embodiment of the present invention. Further, a program product such as a computer-readable recording medium on which the above-mentioned program is recorded can be applied as an embodiment of the present invention. The above programs, recording media, transmission media, and program products are included in the scope of the present invention. As the recording medium, for example, a flexible disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a magnetic tape, a nonvolatile memory card, a ROM, and the like can be used.
[0111]
It should be noted that each of the above-described embodiments is merely a concrete example of carrying out the present invention, and the technical scope of the present invention should not be construed as limited by them. That is, the present invention can be implemented in various forms without departing from its technical idea or main features.
[0112]
[Effect of the invention]
As described above, according to the present invention, since a scene change is detected using packet management information, a scene change can be detected without decompressing compressed video or audio data. For this reason, scene change detection with simple processing contents becomes possible. In addition, when playing back as fast-forward or rewind processing, video playback of a packet in which a scene change is detected is performed not only for a compressed image subjected to intra-image compression processing but also for a compressed image subjected to inter-image compression processing. This makes it possible to prevent a scene of interest to the user from being missed.
[Brief description of the drawings]
FIG. 1 is a block diagram of a portable video device according to a first embodiment.
FIG. 2 is a schematic diagram of a portable video device according to the first embodiment.
FIG. 3 is a schematic diagram illustrating a storage format of video content according to the first embodiment.
FIG. 4 is a schematic diagram showing a picture structure in MPEG4.
FIG. 5 is a schematic diagram showing a picture structure in an MPEG4 simple profile.
FIG. 6 is a flowchart illustrating a flow of a reproduction process according to the first embodiment.
FIG. 7 is a flowchart illustrating a flow of a fast-forward playback process according to the first embodiment.
FIG. 8 is a flowchart showing a flow of a rewind reproduction process according to the first embodiment.
FIG. 9 is a flowchart illustrating a flow of fast-forward playback processing according to the second embodiment.
FIG. 10 is a schematic diagram of a portable video device according to the third embodiment.
FIG. 11 is a flowchart illustrating a flow of a printing process according to the third embodiment.
[Explanation of symbols]
B1 Portable video device
B2 Display
B3 CF card
B4 Keyboard
B5 Charger
B6 Receptacle

Claims

1. A video processing apparatus comprising:
storage means for storing a plurality of compressed images and management information for each packet as one packet, and storing a plurality of the packets as video content; and
scene change detection means for detecting a scene change in the video content based on the management information for each packet.

2. The video processing apparatus according to claim 1, wherein the management information for each packet includes at least information indicating a size of the packet.

3. The video processing apparatus according to claim 1, wherein the management information for each packet includes at least information on a time stamp indicating a time at which a video in the packet is displayed.

4. A video processing apparatus comprising:
storage means for storing video content composed of a plurality of compressed images including a compressed image subjected to intra-image compression processing and a compressed image subjected to inter-image compression processing; and
special video reproduction means for, when reproducing the video content as fast-forward or rewind processing, reproducing based on all of the compressed images subjected to the intra-image compression processing and some of the compressed images subjected to the inter-image compression processing.

5. The video processing apparatus according to claim 4, further comprising scene change detecting means for detecting a compressed image of a scene change in the video content,
wherein the special video reproduction means reproduces the detected compressed image that has been subjected to the inter-image compression processing.

6. A video processing apparatus comprising:
storage means for storing video content composed of a plurality of compressed images;
scene change detection means for detecting a compressed image of a scene change in the video content; and
print image generation means for generating a print image based on the compressed image detected by the scene change detection means.

7. A video processing apparatus comprising:
storage means for storing a plurality of items of compressed audio and management information for each packet as one audio packet, and storing video content including video packets, each including a plurality of compressed images, and a plurality of the audio packets; and
scene change detecting means for detecting a scene change in the video content based on the management information for each audio packet.

8. The video processing apparatus according to claim 7, wherein the management information of the audio packet includes at least information indicating a size of the audio packet.

9. The video processing apparatus according to claim 8, wherein the scene change detecting means detects a scene change by detecting a climax scene using information representing the size of the audio packet and then performing silence detection.

10. The video processing apparatus according to claim 8, wherein the scene change detecting means detects a scene change by detecting an audio packet having a large size using the information indicating the size of the audio packet, and then detecting an audio packet having a small size.

11. A video processing method comprising:
a storage step of storing a plurality of compressed images and management information for each packet as one packet, and storing a plurality of the packets as video content; and
a scene change detecting step of detecting a scene change in the video content based on the management information for each packet.

12. A video processing method comprising:
a storage step of storing video content including a plurality of compressed images including a compressed image subjected to intra-image compression processing and a compressed image subjected to inter-image compression processing; and
a special video playback step of, when playing back the video content as fast-forward or rewind processing, playing back based on all of the compressed images subjected to the intra-image compression processing and some of the compressed images subjected to the inter-image compression processing.

13. A video processing method comprising:
a storage step of storing video content composed of a plurality of compressed images;
a scene change detecting step of detecting a compressed image of a scene change in the video content; and
a print image generating step of generating a print image based on the compressed image of the detected scene change.

14. A video processing method comprising:
a storage step of storing a plurality of items of compressed audio and management information for each packet as one audio packet, and storing video content including video packets, each including a plurality of compressed images, and a plurality of the audio packets; and
a scene change detecting step of detecting a scene change in the video content based on the management information for each audio packet.