JP3918772B2

JP3918772B2 - Video editing apparatus, video editing method, and video editing program

Info

Publication number: JP3918772B2
Application number: JP2003132040A
Authority: JP
Inventors: 直人木内
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2003-05-09
Filing date: 2003-05-09
Publication date: 2007-05-23
Anticipated expiration: 2023-05-09
Also published as: JP2004336556A

Description

【０００１】
【発明の属する技術分野】
本発明は、映像データを映像のシーンごとに分割する映像編集装置、映像編集方法および映像編集プログラムに関する。
【０００２】
【従来の技術】
近年、パーソナルコンピュータ等で動画像を扱うようになり、動画像の検索や、編集等の作業をいかに効率よく行うかが重要になってきている。動画像の検索や編集等を行う場合、動画像を構成しているシーンに動画像を分解し、動画像内の各シーンの配置や、構成等を把握することが必要である。ここで、シーンとは、動画像においてカメラが切り替わる単位、または音声（例えば、話者。）が切り替わる単位等を意味する。
【０００３】
動画像である映像を編集する際に、映像を符号化した信号である映像データを用いて映像を編集する映像編集システムの一例として、特許文献１に記載されている従来の映像編集システムの一構成例を図１２のブロック図に示す。図１２に示す構成の従来の映像編集システムは、符号化パラメータ抽出部１２１が、符号化された映像データから符号化パラメータを抽出し、シーンチェンジフレーム検出部１２２が、符号化パラメータに基づいて映像のシーンの切り替わりのフレームであるシーンチェンジフレームを検出する。そして、シーン群決定部１２３が、シーンチェンジフレームの位置に基づいてシーンを区切ってシーンの集まりであるシーン群を特定し、シーン群における先頭フレームの位置の情報であるシーン群情報を生成する。シーン情報階層化部１２４は、映像全体にわたってシーンチェンジフレームの位置の情報とシーン群情報とを、階層化して蓄積部１２５に蓄積させる。
【０００４】
ここで、シーン群決定部１２３は、隣り合うシーンのシーンチェンジフレームの位置の差分であるシーンチェンジフレーム間の時間差を算出する。そして、算出したフレーム間の時間差と所定のしきい値とを比較してシーンチェンジフレーム間の時間差が所定のしきい値以上であれば隣り合うシーンはそれぞれ異なるシーン群に属すると決定して、隣り合うシーンのうち時間的に後にあるシーンを新たなシーン群の先頭シーンとする。また、シーンチェンジフレーム間の時間差が所定のしきい値以下であれば隣り合うシーンは同一のシーン群に属すると決定する。このようにシーン群決定部１２３は、隣り合う全てのシーンについてシーンチェンジフレーム間の時間差を所定のしきい値と比較して、映像全体をシーン群に区分する。
【０００５】
また、例えば特許文献２に記載されている従来の映像構造化装置の一構成例を、図１３のブロック図に示す。図１３に示す構成の映像構造化装置は、特徴量抽出部１３１が、入力された映像の時間的に分割された区間の特徴量ベクトルを抽出し、量子化部１３２が、特徴量ベクトルを番号に変換し映像を番号列で表現する。そして、計数部１３３が、番号列の出現回数を数え、出現頻度の高い部分列を抽出する。このように、入力された映像の時間的に分割された区間のうち、特定のパターンで高い頻度で出現する区間の並びを抽出する。
【０００６】
【特許文献１】
特開２００１−３２６９０１号公報（第４−７頁、第１図）
【特許文献２】
特開平１１−２４２６８５号公報（第４−１０頁、第１図）
【０００７】
【発明が解決しようとする課題】
特許文献１に記載された映像編集システムは、シーン群の判定をシーンチェンジフレーム間の時間差で判定している。そのため、類似したシーンが繰り返し同じ順番で出現するという繰り返し構造が所定の時間内に含まれる場合、シーン群決定部１２３は、繰り返し構造を構成する各シーンをシーン群として特定しない。
【０００８】
また、特許文献２に記載された映像構造化装置は、入力された映像の時間的に分割された区間のうち、特定のパターンで高い頻度で出現する区間の並びを抽出するが、高い頻度で出現する区間の並びとして抽出されなかった区間に対する処理を行わない。そのため、入力された映像の中で高い頻度で出現する区間の並びとして抽出されなかった区間に対する編集作業を、特許文献２に記載された映像構造化装置以外の手段を用いて行わなくてはならない。
【０００９】
そこで、本発明は、入力された映像に含まれるシーンの出現順と出現回数とを利用して、入力された映像全体を自動的にシーン群に区分する映像編集装置、映像編集方法、および映像編集プログラムを提供することを目的とする。
【００１０】
【課題を解決するための手段】
本発明による映像編集装置は、入力された映像データによる映像のシーンが変わるタイミングであるシーンチェンジを検出して、映像データを複数のシーンに分割するシーン検出手段と、各シーンの特徴量を抽出し、抽出した特徴量に応じて、映像データにおける各シーンを複数のグループに分類したシーングループを生成し、各シーングループを特定する対応情報を各シーンに対応付けるシーン分類手段と、時間軸上で複数回同じ並びで出現する対応情報の並びに応じたシーンの集まりをシーン群と特定して抽出するシーン群抽出手段と、シーン群抽出手段の抽出の対象とならなかったシーングループの並びと、シーン群抽出手段が抽出したシーン群とのマッチングを行い、マッチングの結果に応じてシーングループの並びをシーン群に含めるシーン群決定手段と、映像データをシーン群に分類した結果の情報を蓄積する蓄積手段とを備えたことを特徴とする。
【００１１】
シーン群決定手段は、シーン群の最初のシーンのシーングループと、シーングループの並びのうち最初のシーンのシーングループとが一致するか否か判定し、シーン群の最後のシーンのシーングループと、シーングループの並びのうち最後のシーンのシーングループとが一致するか否か判定し、ともに一致すると判定されたシーングループの並びをシーン群に含めてもよい。そのような構成によれば、シーン群の構成に類似するシーングループの並びを、シーン群と特定することができる。
【００１２】
シーン群決定手段は、映像データ中に登場する回数の多い順にシーン群を選択してマッチングを行ってもよく、映像データ中に登場する回数が同じシーン群が複数存在する場合は、時間軸上で登場する順にシーン群を選択してマッチングを行ってもよい。そのような構成によれば、類似したシーンが繰り返し同じ順番で出現するという繰り返し構造を構成する各シーンをシーン群として特定することができる。
【００１３】
シーン群決定手段は、マッチングの結果、シーン群と特定されなかったシーンの並びを、シーン群と特定してもよい。そのような構成によれば、すべてのシーンの並びを、シーン群に特定することができる。
【００１４】
本発明による映像編集装置は、入力された映像データによる映像の場面が切り替わるタイミングまたは音声が切り替わるタイミングであるシーンチェンジを検出してシーンチェンジの位置を特定する位置情報を生成し、時間軸上でシーンチェンジに挟まれた複数の区間に映像データを区分し、区分された複数の区間に時間順にシーン番号を付与してシーンを作成し、作成された複数のシーンを出力するシーン検出手段と、
シーン検出手段で作成された複数のシーンの特徴量を抽出してシーン間の類似度を算出し、シーンの間の類似度に基づいて複数のシーンを複数のグループに分類し、分類した複数のグループにシーングループＩＤを付与して複数のシーングループを作成し、複数のシーンのそれぞれにシーングループを特定するシーングループＩＤを付与するシーン分類手段と、シーングループＩＤが繰り返し同じ順番で出現するシーングループＩＤの組を抽出し、抽出された複数のシーングループＩＤの組にシーン群グループＩＤを付与して複数のシーン群グループを作成し、複数のシーン群グループが映像データに出現する回数を数えて出現回数の多い順に並べ、シーン群グループごとにシーン群グループＩＤとシーングループＩＤの出現順と入力された映像データ中の出現回数とで構成されるシーン群グループ情報を出力し、シーングループＩＤの出現順がシーン群グループのシーングループＩＤの出現順に一致する時間軸に沿った複数のシーンの組をシーン群として抽出し、抽出された複数のシーン群にシーン群を特定するシーン群ＩＤを付与し、シーン群ＩＤとシーングループＩＤの出現順が抽出されたシーン群でのシーングループＩＤの出現順に一致するシーン群グループのシーン群グループＩＤと抽出されたシーン群の先頭のシーンチェンジの位置情報とで構成されるシーン群情報を出力するシーン群抽出手段と、シーン群グループを映像データ中の出現回数が多い順から１つずつ選択し、シーン群抽出手段でシーン群として抽出されなかった残りのシーンから時間軸上で連続したシーンの並びを１つずつ選択し、選択したシーンの並びの中に選択したシーン群グループの最初のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンが存在し、そのシーンより時間軸上で後ろに、選択したシーン群グループの最後のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンが存在する場合に、選択したシーン群グループの最初のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンで始まり、選択したシーン群グループの最後のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンで終わる複数のシーンの組をシーン群として抽出し、抽出した複数のシーン群にシーン群を特定するシーン群ＩＤを付与し、抽出されたシーン群のシーン群ＩＤと選択したシーン群グループのシーン群グループＩＤと抽出されたシーン群の先頭のシーンチェンジの位置情報とで構成されるシーン群情報を生成し、シーン群の抽出を選択した全てのシーンの並びと選択した全てのシーン群グループとについて繰り返しおこない、シーン群グループ情報を利用してもシーン群として抽出されなかった時間軸に沿った１つ以上のシーンの並びをシーングループの出現順が一致するシーン群グループが存在しないシーン群と決定し、決定された複数のシーン群にシーン群ＩＤを付与し、シーン群とシーングループの出現順が一致するシーン群グループが存在しないことを意味する値と決定されたシーン群の先頭のシーンチェンジの位置情報とで構成されるシーン群情報を生成するシーン決定手段と、シーン群情報をシーン群情報データベースに蓄積する蓄積手段とを備えたことを特徴とする。
【００１５】
本発明による映像編集方法は、入力された映像データによる映像のシーンが変わるタイミングであるシーンチェンジを検出して、映像データをシーンに分割し、シーンの特徴量を抽出し、抽出した特徴量に応じてシーンをグループに分類したシーングループを生成し、映像データの時間軸上に複数回同じ並びで出現するシーングループの並びをシーンの集まりであるシーン群と特定して抽出し、抽出の対象とならなかったシーングループの並びとシーン群とのマッチングを行い、マッチングの結果に応じてシーングループの並びをシーン群と特定し、映像データをシーン群に分類した結果の情報を蓄積することを特徴とする。
【００１６】
本発明による映像編集方法は、入力された映像データによる映像の場面が切り替わるタイミングまたは音声が切り替わるタイミングであるシーンチェンジを検出してシーンチェンジの位置を特定する位置情報を生成し、時間軸上でシーンチェンジに挟まれた複数の区間に映像データを区分し、区分された複数の区間に時間順にシーン番号を付与してシーンを作成し、作成された複数のシーンを出力し、作成された複数のシーンの特徴量を抽出してシーン間の類似度を算出し、シーンの間の類似度に基づいて複数のシーンを複数のグループに分類し、分類した複数のグループを構成するフレームにシーングループＩＤを付与して複数のシーングループを作成し、複数のシーンのそれぞれにシーングループを特定するシーングループＩＤを付与し、シーングループＩＤが繰り返し同じ順番で出現するシーングループＩＤの組を抽出し、抽出された複数のシーングループＩＤの組にシーン群グループＩＤを付与して複数のシーン群グループを作成し、複数のシーン群グループが映像データに出現する回数を数えて出現回数の多い順に並べ、シーン群グループごとにシーン群グループＩＤとシーングループＩＤの出現順と入力された映像データ中の出現回数とで構成されるシーン群グループ情報を出力し、シーングループＩＤの出現順がシーン群グループのシーングループＩＤの出現順に一致する時間軸に沿った複数のシーンの組をシーン群として抽出し、抽出された複数のシーン群にシーン群を特定するシーン群ＩＤを付与し、シーン群ＩＤとシーングループＩＤの出現順が抽出されたシーン群でのシーングループＩＤの出現順に一致するシーン群グループのシーン群グループＩＤと抽出されたシーン群の先頭のシーンチェンジの位置情報とで構成されるシーン群情報を出力し、シーン群グループを映像データ中の出現回数が多い順から１つずつ選択し、シーン群として抽出されなかった残りのシーンから時間軸上で連続したシーンの並びを１つずつ選択し、選択したシーンの並びの中に選択したシーン群グループの最初のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンが存在し、そのシーンより時間軸上で後ろに、選択したシーン群グループの最後のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンが存在する場合に、選択したシーン群グループの最初のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンで始まり、選択したシーン群グループの最後のシーンのシーングループＩＤと同じシーングループＩＤを持つシーンで終わる複数のシーンの組をシーン群として抽出し、抽出した複数のシーン群にシーン群を特定するシーン群ＩＤを付与し、抽出されたシーン群のシーン群ＩＤと選択したシーン群グループのシーン群グループＩＤと抽出されたシーン群の先頭のシーンチェンジの位置情報とで構成されるシーン群情報を生成し、シーン群の抽出を選択した全てのシーンの並びと選択した全てのシーン群グループとについて繰り返しおこない、シーン群グループ情報を利用してもシーン群として抽出されなかった時間軸に沿った１つ以上のシーンの並びをシーングループの出現順が一致するシーン群グループが存在しないシーン群と決定し、決定された複数のシーン群にシーン群ＩＤを付与し、シーン群とシーングループの出現順が一致するシーン群グループが存在しないことを意味する値と決定されたシーン群の先頭のシーンチェンジの位置情報とで構成されるシーン群情報を生成し、シーン群情報をシーン群情報データベースに蓄積することを特徴とする。
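[0017]
A video editing program according to the present invention is installed in a video editing apparatus that edits video data to generate scene groups (シーン群), each of which is a collection of scenes. The program causes a computer to execute: a process of detecting scene changes, which are the timings at which the scene of the video represented by the input video data changes, and dividing the video data into scenes; a process of extracting a feature quantity from each scene and generating scene classes (シーングループ) by classifying the scenes into groups according to the extracted feature quantities; a process of identifying and extracting, as a scene group, each sequence of scene classes that appears in the same order multiple times on the time axis of the video data; a process of matching the sequences of scene classes that were not extracted against the extracted scene groups and including those sequences in scene groups according to the matching result; and a process of storing, in a storage device, information on the result of classifying the video data into scene groups.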
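[0018]
[Embodiments of the invention]
An embodiment of the present invention is described with reference to the drawings. FIG. 1 is a block diagram showing a configuration example of the embodiment. The video editing apparatus according to the present invention includes: scene detection means 102, which detects scene changes in the input video data 101, that is, the timings at which the camera switches or the audio (for example, the speaker) switches, and divides the video data 101 into scenes; scene classification means 103, which calculates a feature quantity for each scene and generates scene classes (シーングループ) by classifying the scenes into groups according to the calculated feature quantities; scene group extraction means 104, which extracts scene groups (シーン群), each being a set of scene classes that repeatedly appear in the same order; scene group determination means 105, which classifies the scene classes that were not extracted and generates scene groups from them; and storage means 106, which stores information on the result of classifying the video data into scene groups in a scene group information database 107. The video editing apparatus is realized by a computer or the like, and each means is realized by a program or the like.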
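[0019]
When video data 101, which is moving-image data, is input, the scene detection means 102 detects scene changes in the video of the video data 101, that is, the timings at which the camera switches or the audio switches. A scene change is detected, for example, by calculating the difference in the layout of pixel color information between consecutive frames of the video and judging that a scene change has occurred when the calculated difference is equal to or greater than a predetermined threshold. The scene detection means 102 splits the video data 101 at the timings at which scene changes were detected. It then generates a scene information file, which is a file of information corresponding to each scene, that is, to each section of the split video data 101.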
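The threshold test described in this paragraph can be sketched as follows. This is a minimal illustration, not the patented implementation: each frame is assumed to be already reduced to a short list of numbers standing in for its color layout, the function name `detect_scene_changes` is hypothetical, and treating frame 0 as the start of the first scene is an assumption.

```python
def detect_scene_changes(frames, threshold):
    """Return the indices of frames that start a new scene.

    frames: list of equal-length number sequences, one color-layout
            descriptor per frame (an assumed, simplified representation).
    threshold: a difference at or above this value counts as a scene change.
    """
    changes = [0] if frames else []  # assumption: the first frame opens scene 1
    for i in range(1, len(frames)):
        # Sum of absolute differences between consecutive frame descriptors.
        diff = sum(abs(a - b) for a, b in zip(frames[i], frames[i - 1]))
        if diff >= threshold:
            changes.append(i)
    return changes


frames = [[0, 0], [0, 1], [9, 9], [9, 8], [0, 0]]
print(detect_scene_changes(frames, threshold=5))
```

A real detector would compute the descriptor from decoded pixel data; only the comparison against a fixed threshold is taken from the text.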
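[0020]
The scene detection means 102 assigns numbers to the scenes in time order from the beginning and records each number in the scene information file in association with its scene. It then outputs the video data 101 split into scenes, together with the scene information file, to the scene classification means 103. The scene detection means 102 also generates scene change position information, which identifies the scene change frames (the frames at which scene changes occur), and outputs it to the scene group determination means 105. The video data 101 may be in any signal format in which the scene detection means 102 can detect scene changes, for example data recorded on recording media such as analog VTR or DV (Digital Video), or data such as MPEG.
[0021]
When the video data 101 split into scenes is input, the scene classification means 103 extracts a feature quantity from each scene and compares the feature quantities of the scenes with one another to calculate their similarity. Here, the feature quantity of a scene is the layout of the color information of the pixels in the frames of the video. The type of audio signal contained in the video data (stereo, monaural, multiplexed audio, and so on) or the waveform of the audio signal may also be used as a feature quantity, as may information such as the presence or absence of subtitles, their display position, and their display language. The similarity is, for example, the sum of absolute differences between the feature quantities of two scenes. Scenes whose similarity value is smaller than a predetermined threshold are classified into the same scene class (シーングループ), and correspondence information (for example, a scene class ID), which is a symbol identifying the scene class, is assigned to each scene. In other words, the scene classification means 103 associates with each scene the scene class ID identifying its scene class.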
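[0022]
The scene classification means 103 classifies the scenes into groups based on the similarity of their feature quantities using an existing method such as singular value decomposition (SVD). Specifically, when the feature quantity of each scene is extracted and mapped into a feature space, scenes with similar feature quantities form clusters in that space. Feature quantities whose distance in the feature space is smaller than a predetermined threshold are regarded as belonging to one cluster, and each cluster is treated as one group. The scene classification means 103 records the scene class ID (シーングループＩＤ) assigned to each scene in the scene information file in association with that scene, and outputs the video data 101 and the scene information file to the scene group extraction means 104.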
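The grouping rule in this paragraph (features closer than a threshold belong to one cluster) might be sketched like this. It is a simplified stand-in, not the SVD-based method the text mentions: it uses the sum of absolute differences as the distance and merges scenes with a small union-find, and the names `sad` and `assign_scene_class_ids` as well as the letter labels are assumptions for illustration.

```python
from string import ascii_lowercase


def sad(f1, f2):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(f1, f2))


def assign_scene_class_ids(features, threshold):
    """Give every scene a class label; scenes closer than threshold share one.

    Labels are 'a', 'b', ... in order of first appearance (this sketch
    therefore supports at most 26 classes).
    """
    n = len(features)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sad(features[i], features[j]) < threshold:
                parent[find(j)] = find(i)  # merge the two clusters

    labels, ids = {}, []
    for i in range(n):
        root = find(i)
        if root not in labels:
            labels[root] = ascii_lowercase[len(labels)]
        ids.append(labels[root])
    return ids


features = [[0, 0], [10, 10], [0, 1], [10, 9], [50, 50]]
print(assign_scene_class_ids(features, threshold=3))
```

Merging transitively (single linkage) is one possible reading of "belonging to one cluster"; other clustering rules would fit the text equally well.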
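[0023]
When the video data 101 is input, the scene group extraction means 104 extracts scene group patterns (シーン群グループ), each being a set of scene class IDs (シーングループＩＤ) in which the same scene class IDs repeatedly appear in the same order, based on the scene class IDs recorded in the scene information file. Scene group patterns may be extracted using an existing technique such as dynamic programming, a technique used in text data mining. A scene group pattern ID, which is a symbol identifying the scene group pattern, is then assigned to each scene. The scene group extraction means 104 records the scene group pattern ID assigned to each scene in the scene information file in association with that scene. It then determines how many times each scene group pattern occurs in the video data 101 and the order of the scene group patterns on the time axis of the video data 101.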
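As a rough illustration of finding ID runs that repeat in the same order, the sketch below counts fixed-length windows over the ID sequence and then marks non-overlapping occurrences. This is a deliberately simple substitute for the dynamic-programming techniques the text refers to; the fixed window length, the function names, and the sample sequence are all assumptions.

```python
from collections import Counter


def find_repeated_patterns(ids, length=3, min_count=2):
    """Return [(pattern, count)] for ID runs of the given length that occur
    at least min_count times, most frequent first (ties: earliest first)."""
    counts = Counter()
    first_pos = {}
    for i in range(len(ids) - length + 1):
        pattern = tuple(ids[i:i + length])
        counts[pattern] += 1
        first_pos.setdefault(pattern, i)
    repeated = [(p, n) for p, n in counts.items() if n >= min_count]
    repeated.sort(key=lambda pn: (-pn[1], first_pos[pn[0]]))
    return repeated


def extract_occurrences(ids, patterns):
    """Scan left to right and return (start, end, pattern) for each
    non-overlapping occurrence of any pattern (end index inclusive)."""
    found = []
    i = 0
    while i < len(ids):
        for p in patterns:
            if tuple(ids[i:i + len(p)]) == p:
                found.append((i, i + len(p) - 1, p))
                i += len(p)
                break
        else:
            i += 1  # no pattern starts here; this scene stays unassigned
    return found


ids = list("abcxabcyabc")  # hypothetical ID sequence, not the data of FIG. 5
patterns = [p for p, _ in find_repeated_patterns(ids)]
print(find_repeated_patterns(ids))
print(extract_occurrences(ids, patterns))
```

Scenes that fall outside every occurrence (here the ones labeled "x" and "y") are the leftover scenes that the scene group determination means handles later.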
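[0024]
The scene group extraction means 104 generates scene group pattern information consisting of the scene group pattern ID, the number of times the scene group pattern (シーン群グループ) occurs in the video data 101, and the order of the scene group pattern on the time axis of the video data 101. The scene group pattern IDs in the scene group pattern information may be arranged in descending order of the number of occurrences in the video data 101.
[0025]
The scene group extraction means 104 extracts the scene group patterns and treats each one extracted as a scene group (シーン群). It assigns a scene group ID, a symbol identifying the scene group, to each scene constituting the scene group, and records the scene group ID in the scene information file in association with each scene. It then generates scene group information consisting of the scene group ID, the scene group pattern IDs of the scenes constituting the scene group, and information identifying the first scene change frame of each scene group. The scene group extraction means 104 outputs the scene groups, the portions of the video data 101 not extracted as scene groups, the scene group pattern information, and the scene group information to the scene group determination means 105. It may also output the scene group information to the storage means 106 to have it stored in the scene group information database.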
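[0026]
Based on the scene change position information output by the scene detection means 102, the scene group determination means 105 splits into scenes the portions of the video data 101 that were not extracted as scene groups. It then selects one scene group pattern (シーン群グループ) and extracts the scene class ID (シーングループＩＤ) of the first scene and the scene class ID of the last scene of the selected pattern. It also identifies, among the remaining scenes that the scene group extraction means 104 did not extract as scene groups, a sequence of scenes that is continuous on the time axis. Then, among those remaining scenes, it identifies a scene having the same scene class ID as the first scene of the selected scene group pattern.
[0027]
If, among the scenes that continue from the identified scene, there is a later scene on the time axis having the same scene class ID as the last scene of the selected scene group pattern, the scenes spanning from the scene having the same scene class ID as the first scene of the pattern to the scene having the same scene class ID as the last scene of the pattern are extracted as a scene group, and a scene group ID identifying the scene group is assigned to each scene constituting it. The scene group determination means 105 records the scene group ID in the scene information file in association with each scene.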
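[0028]
The scene group determination means 105 performs the above processing for all of the input scene groups, selecting them in descending order of number of appearances; when several scene groups have the same number of appearances, it selects them in the order in which they appear on the time axis. Once the processing has been performed for all of the input scene groups, the scenes that still have not been extracted are themselves extracted as scene groups, and a scene group ID identifying each such scene group is assigned to the scenes constituting it. The scene group determination means 105 records the scene group ID in the scene information file in association with each scene.
[0029]
The scene group determination means 105 generates scene group information that newly includes the scene group IDs of the scene groups it extracted, the scene group pattern IDs (シーン群グループＩＤ) of the scenes constituting those scene groups, and information identifying the first scene change frame of each scene group. It then outputs the scene information file, the scene group pattern information, and the scene group information to the storage means 106.
[0030]
The storage means 106 includes the scene group information database 107 and stores the input scene information file, scene group pattern information, and scene group information in the scene group information database 107.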
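[0031]
A video editing program according to the present invention is installed in a video editing apparatus that edits video data to generate scene groups (シーン群), each a collection of scenes, and realizes each of the means described above. The program causes a computer to execute: a process of detecting scene changes, which are the timings at which the scene of the video represented by the input video data changes, and dividing the video data into scenes; a process of extracting a feature quantity from each scene and generating scene classes (シーングループ) by classifying the scenes into groups according to the extracted feature quantities; a process of identifying and extracting, as a scene group, each sequence of scene classes that appears in the same order multiple times on the time axis of the video data; a process of matching the sequences of scene classes that were not extracted against the scene groups and identifying those sequences as scene groups according to the matching result; and a process of storing, in a storage device, information on the result of classifying the video data into scene groups.
[0032]
Next, the operation of the embodiment is described with reference to the drawings. FIG. 2 is a flowchart showing the operation of the video editing apparatus of the embodiment. In FIG. 2, step S201 represents the operation of the scene detection means 102, step S202 that of the scene classification means 103, step S203 that of the scene group extraction means 104, step S204 that of the scene group determination means 105, and step S205 that of the storage means 106.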
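[0033]
The operation of each means is now described, beginning with the scene detection means 102. FIG. 3 is a flowchart showing the operation of the scene detection means 102. FIG. 4 schematically shows video data to be edited by the video editing apparatus. When video data 101 to be edited is input (step S301), the scene detection means 102 detects the scene changes in the video of the input video data 101 (step S302). That is, as in the example of FIG. 4, it detects the scene change frames 701 to 724. It then generates scene change position information identifying the scene change frames and outputs it to the scene group determination means 105.
[0034]
The scene detection means 102 divides the video data 101 at the detected scene change timings (step S303), creates scenes by assigning scene numbers, in time order from the beginning, to the frames of each resulting section (step S304), and records the scene numbers in the scene information file. The scene number column of FIG. 5 shows an example of the scene numbers assigned. The scene made up of the frames with scene number "S1" runs from scene change frame 701 up to the frame immediately before scene change frame 702. Likewise, the scene made up of the frames with scene number "S2" runs from scene change frame 702 up to the frame immediately before scene change frame 703. The created scenes are then output to the scene classification means 103 (step S305).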
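The numbering rule of steps S303 and S304 (each scene runs from one scene change frame up to the frame just before the next) can be illustrated with a short sketch. The function name `split_into_scenes`, the use of 0-based frame indices, and the `total_frames` argument are assumptions; the numerals 701 to 724 in the text are reference signs in FIG. 4, not frame indices.

```python
def split_into_scenes(change_frames, total_frames):
    """Return (scene_number, first_frame, last_frame) for each scene.

    change_frames: ascending frame indices at which a scene change occurs.
    total_frames:  number of frames in the video (closes the last scene).
    """
    scenes = []
    for k, start in enumerate(change_frames):
        if k + 1 < len(change_frames):
            end = change_frames[k + 1] - 1  # frame just before the next change
        else:
            end = total_frames - 1          # last scene runs to the end
        scenes.append((f"S{k + 1}", start, end))
    return scenes


print(split_into_scenes([0, 30, 75], total_frames=100))
```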
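[0035]
Next, the operation of the scene classification means 103 is described. FIG. 6 is a flowchart showing the operation of the scene classification means 103. When the scenes into which the scene detection means 102 divided the video data 101 are input (step S401), the scene classification means 103 extracts the feature quantity of each scene (step S402) and classifies the scenes into a plurality of groups based on the similarity of their feature quantities (step S403). It then creates scene classes (シーングループ) by assigning a scene class ID to each group (step S404) and assigns the corresponding scene class ID to each scene (step S405). The scene classification means 103 records the scene class ID assigned to each scene in the scene information file in association with that scene. The scene class ID column of FIG. 5 shows an example of the scene class IDs assigned. Here, the scene classification means 103 is assumed to have assigned scene class ID "a" to the frames with scene number "S1" and, as shown in FIG. 5, scene class IDs "b" through "f" to the others. The scene classification means 103 outputs the video data 101 and the scene information file to the scene group extraction means 104.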
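[0036]
Next, the operation of the scene group extraction means 104 is described. FIG. 7 is a flowchart showing the operation of the scene group extraction means 104. When the video data 101 and the scene information file are input (step S501), the scene group extraction means 104 extracts, based on the scene class IDs (シーングループＩＤ) recorded in the scene information file, the sets of scene class IDs that repeatedly appear in the same order in the video data 101 (step S502). Referring to FIG. 5, there are sets in which the scene class IDs occur consecutively in the order "abc" and sets in which they occur consecutively in the order "dbe". A scene group pattern ID is then assigned to each extracted set of scene class IDs to create a plurality of scene group patterns (シーン群グループ) (step S503).
[0037]
The scene group extraction means 104 records the scene group pattern ID assigned to each scene in the scene information file in association with that scene. As shown in FIG. 8, it is assumed that scene group pattern ID "sg1" is assigned to the frames constituting the sets of scene class IDs in the order "abc", creating one scene group pattern, and that scene group pattern ID "sg2" is assigned to the frames constituting the sets in the order "dbe", creating another.
[0038]
The scene group extraction means 104 refers to the scene information file and determines how many times each scene group pattern occurs in the video data 101 (step S504). Referring to FIG. 5, and as shown in the appearance count column of FIG. 8, scene group pattern "sg1" appears three times and scene group pattern "sg2" appears twice. The scene group patterns are then sorted in descending order of appearance count, and scene group pattern information is generated consisting of the scene group pattern ID, the number of times the pattern occurs in the video data 101, and the order of the pattern on the time axis of the video data 101. The scene group pattern information is thus structured as shown in FIG. 8. The scene group extraction means 104 outputs the scene group pattern information to the scene group determination means 105 (step S506).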
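[0039]
The scene group extraction means 104 extracts the scene group patterns (シーン群グループ) and treats each one extracted as a scene group (シーン群) (step S507). It then assigns a scene group ID to each extracted scene group (step S508). The scene group extraction means 104 records the scene group pattern ID assigned to each scene in the scene information file in association with that scene. Scene group ID "SG1" is assumed to be assigned to scene numbers "S1, S2, S3", and scene group IDs "SG2" through "SG5" to be assigned as shown in FIG. 9. Scene group information consisting of the scene group ID, the scene group pattern IDs of the scenes constituting the scene group, and information identifying the first scene change frame of each scene group is then generated and output to the scene group determination means 105 (step S509). The scene group extraction means 104 also outputs the scene information file, the scene groups, and the scenes not extracted as scene groups to the scene group determination means 105.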
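[0040]
Next, the operation of the scene group determination means 105 is described. FIG. 10 is a flowchart showing the operation of the scene group determination means 105. When the scene group pattern information and the remaining scenes that the scene group extraction means 104 did not extract as scene groups are input (step S601), the scene group determination means 105 delimits the scenes based on the scene change position information. It then selects one scene group pattern (シーン群グループ), taking the patterns in descending order of appearance count (step S602). Referring to FIG. 8, scene group pattern "sg01" has the highest appearance count, so "sg01" is selected. Next, among the sequences of remaining scenes that the scene group extraction means 104 did not extract as scene groups, it identifies one sequence of scenes that is continuous along the time axis (step S603), for example scene numbers "S10" through "S12" in FIG. 9.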
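[0041]
The identified sequence of scenes is then matched against the scene group pattern (シーン群グループ). That is, among the remaining scenes that the scene group extraction means 104 did not extract as scene groups, a scene having the same scene class ID (シーングループＩＤ) as the first scene of the selected scene group pattern is identified. If, among the scenes that continue from the identified scene, there is a later scene on the time axis having the same scene class ID as the last scene of the selected pattern, the scenes from the former through the latter are extracted as a scene group and assigned a scene group ID (steps S604, S605). The scene group determination means 105 records the scene group ID in the scene information file in association with each scene. At this point the scene group determination means 105 may generate scene group information newly including the information of the extracted scene group and output it to the storage means 106.
[0042]
Referring to FIG. 8, the scene class ID of the first scene of scene group pattern "sg01" is "a". Searching scenes "S10" through "S12" of FIG. 9 in the scene information file for a scene whose scene class ID is "a" shows that scene "S10" has scene class ID "a". The scene class ID of the last scene of scene group pattern "sg01" is "c". Scenes "S10" through "S12" are therefore searched, later than "S10" on the time axis, for a scene whose scene class ID is "c", which shows that scene "S12" has scene class ID "c". Scenes "S10" through "S12" are therefore extracted as a scene group and assigned scene group ID "SG06".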
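[0043]
The scene group determination means 105 judges whether the identified sequence of scenes contains scenes that have not been matched against the scene group pattern (step S604) and, if it does, performs the operation of step S605. Because the identified sequence "S10" through "S12" contains no scenes left unmatched against the scene group pattern, the operation of step S605 is not performed. It then judges whether the operations of steps S605 and S606 have been performed for the other continuous sequences of scenes that were not extracted as scene groups (step S607), selects a sequence for which they have not been performed (step S603), and performs the same operations.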
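[0044]
Referring to FIG. 9, the same operations are performed for scenes "S16" through "S20". Searching scenes "S16" through "S20" for a scene whose scene class ID (シーングループＩＤ) is "a" shows that scene "S16" has scene class ID "a". The scene class ID of the last scene of scene group pattern "sg01" is "c". Scenes "S16" through "S20" are therefore searched, later than "S16" on the time axis, for a scene whose scene class ID is "c", which shows that scene "S17" has scene class ID "c". Scenes "S16" through "S17" are therefore extracted as a scene group and assigned scene group ID "SG07".
[0045]
Next, scenes "S18" through "S20" are searched for a scene whose scene class ID is "a", but no such scene exists. When matching against scene group pattern "sg01" has finished for all sequences of scenes, the scene group determination means 105 judges whether matching has finished for all scene group patterns (step S607). Referring to FIG. 8, scene group pattern "sg02" has not yet been matched, so the scene group determination means 105 selects scene group pattern "sg02" (step S602) and performs matching. Among the sequences of remaining scenes not extracted as scene groups, it identifies a sequence of scenes that is continuous along the time axis (step S603); this identifies scene numbers "S18" through "S20" in FIG. 9.
[0046]
Referring to FIG. 8, the scene class ID of the first scene of scene group pattern "sg02" is "d". Searching scenes "S18" through "S20" of FIG. 9 for a scene whose scene class ID is "d" shows that scene "S18" has scene class ID "d". The scene class ID of the last scene of scene group pattern "sg02" is "e". Scenes "S18" through "S20" are therefore searched, later than "S18" on the time axis, for a scene whose scene class ID is "e", which shows that scene "S19" has scene class ID "e". Scenes "S18" through "S19" are therefore extracted as a scene group and assigned scene group ID "SG08".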
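[0047]
The only remaining scene is scene "S20". When there is only a single scene, the scene group determination means 105 does not perform scene group extraction. The scene group determination means 105 judges that matching has finished for all sequences of scenes and all scene group patterns (シーン群グループ).
[0048]
The scene group determination means 105 determines each sequence of one or more scenes along the time axis that was not extracted as a scene group, even after all sequences of scenes were matched against all scene group patterns, to be a scene group for which no scene group pattern with a matching order of scene classes (シーングループ) exists (step S608), and assigns scene group IDs to the scene groups so determined (step S609). Scene "S20" is assigned scene group ID "SG09". FIG. 11 shows the result of assigning scene group IDs to all scenes of the video data 101.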
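The matching walk-through above (first and last ID of a pattern bracketing a span, patterns tried in order, leftovers becoming their own groups) can be traced with the sketch below. It is a minimal illustration under stated assumptions: the function names are hypothetical, the IDs of scenes S11 and S20 are not given in the text and are set to "f" purely for the example, and `None` stands in for the "no matching pattern" value.

```python
def match_run(run_ids, pattern):
    """Return (start, end) of the first span in run_ids that begins with the
    pattern's first ID and ends with its last ID at a later position."""
    first, last = pattern[0], pattern[-1]
    for s, gid in enumerate(run_ids):
        if gid == first:
            for e in range(s + 1, len(run_ids)):
                if run_ids[e] == last:
                    return s, e
    return None


def assign_leftover_groups(runs, patterns):
    """runs: lists of (scene_number, id) for consecutive leftover scenes.
    patterns: [(pattern_id, ids)] already ordered by appearance count.
    Returns [(scene_numbers, pattern_id or None)]."""
    results = []
    pending = [list(r) for r in runs]
    for pattern_id, ids in patterns:
        next_pending = []
        for run in pending:
            while len(run) > 1:  # a single scene is never matched
                span = match_run([g for _, g in run], ids)
                if span is None:
                    break
                s, e = span
                results.append(([n for n, _ in run[s:e + 1]], pattern_id))
                if s > 0:
                    next_pending.append(run[:s])  # scenes before the match
                run = run[e + 1:]
            if run:
                next_pending.append(run)
        pending = next_pending
    # Whatever is still pending becomes a group with no matching pattern.
    results.extend(([n for n, _ in run], None) for run in pending)
    return results


runs = [
    [("S10", "a"), ("S11", "f"), ("S12", "c")],  # "f" for S11 is assumed
    [("S16", "a"), ("S17", "c"), ("S18", "d"), ("S19", "e"), ("S20", "f")],  # "f" for S20 is assumed
]
patterns = [("sg01", ["a", "b", "c"]), ("sg02", ["d", "b", "e"])]
for scene_numbers, pattern_id in assign_leftover_groups(runs, patterns):
    print(scene_numbers, pattern_id)
```

Under these assumptions the four results correspond, in order, to the groups the text labels SG06 (S10 to S12), SG07 (S16 to S17), SG08 (S18 to S19) and SG09 (S20).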
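[0049]
The scene group determination means 105 generates scene group information that newly includes the scene group IDs of the scene groups it extracted, the scene group pattern IDs (シーン群グループＩＤ) of the scenes constituting those scene groups, and information identifying the first scene change frame of each scene group. The scene group determination means 105 then outputs the scene group pattern information and the scene group information to the storage means 106 (step S609).
[0050]
The storage means 106 stores the scene group pattern information and the scene group information output by the scene group determination means 105 in the scene group information database 107 (step S205).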
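[0051]
[Effects of the invention]
As described above, according to the present invention, editing that classifies the input video by the feature quantities of its scenes and partitions it into scene groups, which are collections of scenes, can be performed automatically. In addition, when the input video contains a repeating structure, that is, a structure in which scenes repeatedly appear in the same order, the repeating structure can be extracted as scene groups. As a result, after the video has been edited by the video editing apparatus according to the present invention, a desired scene contained in a repeating structure can be found by searching the scene groups.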
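[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration example of an embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the embodiment of the present invention.
FIG. 3 is a flowchart showing the operation of the scene detection means according to the present invention.
FIG. 4 is a diagram showing an example of the frame structure of video data to be edited by the embodiment of the present invention.
FIG. 5 is a diagram showing the scene number of each scene of the video data and the scene class ID (シーングループＩＤ) assigned to each scene.
FIG. 6 is a flowchart showing the operation of the scene classification means according to the present invention.
FIG. 7 is a flowchart showing the operation of the scene group extraction means according to the present invention.
FIG. 8 is a diagram showing the scene group pattern IDs (シーン群グループＩＤ), the order of appearance of the scene class IDs, and the appearance counts of the scene group patterns.
FIG. 9 is a diagram showing the scene numbers, the scene class IDs, and the scene group IDs of the scene groups extracted by the scene group extraction means.
FIG. 10 is a flowchart showing the operation of the scene group determination means according to the present invention.
FIG. 11 is a diagram showing the scene numbers, the scene class IDs, the scene group IDs of the scene groups extracted by the scene group extraction means, and the scene group IDs of the scene groups determined by the scene group determination means.
FIG. 12 is a block diagram showing the configuration of the conventional video editing system described in Patent Document 1.
FIG. 13 is a block diagram showing the configuration of the conventional video structuring apparatus described in Patent Document 2.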
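[Explanation of reference numerals]
101 Video data
102 Scene detection means
103 Scene classification means
104 Scene group extraction means
105 Scene group determination means
106 Storage means
107 Scene group information database
121 Encoding parameter extraction unit
122 Scene change frame detection unit
123 Scene group determination unit
124 Scene information hierarchizing unit
125 Storage unit
131 Feature quantity extraction unit
132 Quantization unit
133 Counting unit
701 Leading frame of the 1st scene change
702 Leading frame of the 2nd scene change
703 Leading frame of the 3rd scene change
704 Leading frame of the 4th scene change
705 Leading frame of the 5th scene change
706 Leading frame of the 6th scene change
707 Leading frame of the 7th scene change
708 Leading frame of the 8th scene change
709 Leading frame of the 9th scene change
710 Leading frame of the 10th scene change
711 Leading frame of the 11th scene change
712 Leading frame of the 12th scene change
713 Leading frame of the 13th scene change
714 Leading frame of the 14th scene change
715 Leading frame of the 15th scene change
716 Leading frame of the 16th scene change
717 Leading frame of the 17th scene change
718 Leading frame of the 18th scene change
719 Leading frame of the 19th scene change
720 Leading frame of the 20th scene change
721 Leading frame of the 21st scene change
722 Leading frame of the 22nd scene change
723 Leading frame of the 23rd scene change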
724 Leading frame of the 24th scene change
[0001]
[Technical field of the invention]
The present invention relates to a video editing apparatus, a video editing method, and a video editing program that divide video data into the individual scenes of the video.
[0002]
[Prior art]
In recent years, moving images have come to be handled on personal computers and the like, and performing tasks such as searching and editing moving images efficiently has become important. To search or edit a moving image, it is necessary to decompose it into its constituent scenes and to grasp the arrangement and structure of those scenes. Here, a scene means a unit delimited by camera switches in the moving image, or a unit delimited by switches in the audio (for example, a change of speaker).
[0003]
FIG. 12 is a block diagram showing a configuration example of the conventional video editing system described in Patent Document 1, which is an example of a system that edits video (moving images) using video data, that is, a signal obtained by encoding the video. In the conventional video editing system of FIG. 12, the encoding parameter extraction unit 121 extracts encoding parameters from the encoded video data, and the scene change frame detection unit 122 detects, based on the encoding parameters, the scene change frames, which are the frames at which the scene of the video switches. The scene group determination unit 123 then delimits the scenes based on the positions of the scene change frames, identifies scene groups, which are collections of scenes, and generates scene group information, which is information on the position of the first frame of each scene group. The scene information hierarchizing unit 124 hierarchizes the scene change frame position information and the scene group information for the entire video and stores them in the storage unit 125.
[0004]
Here, the scene group determination unit 123 calculates the time difference between scene change frames, that is, the difference between the positions of the scene change frames of adjacent scenes. It compares the calculated time difference with a predetermined threshold; if the time difference is equal to or greater than the threshold, it determines that the adjacent scenes belong to different scene groups and makes the temporally later of the two the first scene of a new scene group. If the time difference is equal to or smaller than the threshold, it determines that the adjacent scenes belong to the same scene group. In this way, the scene group determination unit 123 compares the time difference between scene change frames with the predetermined threshold for every pair of adjacent scenes and partitions the entire video into scene groups.
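A minimal sketch of this prior-art rule, assuming scene change positions are given as start times in seconds; the function name `group_scenes_by_gap` is hypothetical. The text lets both branches apply when the gap equals the threshold, so this sketch arbitrarily treats equality as starting a new group.

```python
def group_scenes_by_gap(change_times, threshold):
    """Group scenes using only the gap between consecutive scene changes.

    change_times: ascending start times of the scenes (scene i starts at
                  change_times[i]).
    Returns a list of groups, each a list of scene indices.
    """
    if not change_times:
        return []
    groups = [[0]]
    for i in range(1, len(change_times)):
        gap = change_times[i] - change_times[i - 1]
        if gap >= threshold:
            groups.append([i])    # the later scene opens a new scene group
        else:
            groups[-1].append(i)  # same scene group as the previous scene
    return groups


print(group_scenes_by_gap([0, 2, 4, 20, 22], threshold=10))
```

Because only timing is compared, scenes that recur in the same order within a short span end up in a single group, which is the limitation the patent goes on to describe.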
[0005]
FIG. 13 is a block diagram showing a configuration example of the conventional video structuring apparatus described in Patent Document 2. In the video structuring apparatus of FIG. 13, the feature quantity extraction unit 131 extracts a feature quantity vector from each temporally divided section of the input video, and the quantization unit 132 converts the feature quantity vectors into numbers so that the video is expressed as a number sequence. The counting unit 133 then counts the occurrences of subsequences of the number sequence and extracts the subsequences that appear frequently. In this way, sequences of sections that appear frequently in a specific pattern are extracted from the temporally divided sections of the input video.
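The pipeline of this prior-art apparatus (vectors to numbers, then counting frequent subsequences) might look roughly like the sketch below. The grid quantizer, the fixed subsequence length, and the function names are assumptions made for illustration; Patent Document 2 is not quoted here and may use different techniques.

```python
from collections import Counter


def quantize(vectors, step=1.0):
    """Map each feature vector to an integer code; vectors falling in the
    same grid cell of width `step` receive the same code."""
    codebook = {}
    codes = []
    for v in vectors:
        cell = tuple(int(x // step) for x in v)
        codes.append(codebook.setdefault(cell, len(codebook)))
    return codes


def frequent_subsequences(codes, length, min_count=2):
    """Count every window of the given length and keep the frequent ones."""
    counts = Counter(
        tuple(codes[i:i + length]) for i in range(len(codes) - length + 1)
    )
    return {seq: n for seq, n in counts.items() if n >= min_count}


vectors = [(0.1,), (1.2,), (0.3,), (1.4,), (2.5,)]
codes = quantize(vectors)
print(codes)
print(frequent_subsequences(codes, length=2))
```

Sections whose codes never form a frequent subsequence (the last one in this example) receive no further treatment, which is the gap the invention addresses.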
[0006]
[Patent Document 1]
JP 2001-326901 A (pages 4-7, FIG. 1)
[Patent Document 2]
Japanese Patent Laid-Open No. 11-242685 (pages 4-10, FIG. 1)
[0007]
[Problems to be solved by the invention]
The video editing system described in Patent Document 1 determines a scene group based on a time difference between scene change frames. Therefore, when a repetitive structure in which similar scenes repeatedly appear in the same order is included within a predetermined time, the scene group determination unit 123 does not identify each scene constituting the repetitive structure as a scene group.
[0008]
In addition, the video structuring apparatus described in Patent Document 2 extracts, from the temporally divided sections of the input video, the sequences of sections that appear frequently in a specific pattern, but it performs no processing on the sections that were not extracted as part of such a frequently appearing sequence. Editing of those sections must therefore be carried out using some means other than the video structuring apparatus described in Patent Document 2.
[0009]
Accordingly, an object of the present invention is to provide a video editing apparatus, a video editing method, and a video editing program that automatically partition the entire input video into scene groups using the order of appearance and the number of appearances of the scenes contained in the input video.
[0010]
[Means for Solving the Problems]
A video editing apparatus according to the present invention comprises: scene detection means that detects scene changes, which are the timings at which the scene of the video represented by the input video data changes, and divides the video data into a plurality of scenes; scene classification means that extracts a feature quantity from each scene, generates scene classes (シーングループ) by classifying the scenes in the video data into a plurality of groups according to the extracted feature quantities, and associates with each scene correspondence information identifying its scene class; scene group extraction means that identifies and extracts, as a scene group (シーン群), each collection of scenes corresponding to a sequence of correspondence information that appears in the same order multiple times on the time axis; scene group determination means that matches the sequences of scene classes that were not extracted by the scene group extraction means against the scene groups extracted by the scene group extraction means and includes those sequences in scene groups according to the matching result; and storage means that stores information on the result of classifying the video data into scene groups.
[0011]
The scene group determination means may judge whether the scene class (シーングループ, a group of mutually similar scenes) of the first scene of a scene group matches the scene class of the first scene in a sequence of scene classes, judge whether the scene class of the last scene of the scene group matches the scene class of the last scene in that sequence, and include in the scene group a sequence for which both are judged to match. With such a configuration, a sequence of scene classes resembling the structure of a scene group can be identified as a scene group.
[0012]
The scene group determination means may perform the matching by selecting scene groups in descending order of the number of times they appear in the video data and, when several scene groups appear the same number of times, by selecting them in the order in which they appear on the time axis. With such a configuration, the scenes constituting a repeating structure, in which similar scenes repeatedly appear in the same order, can be identified as scene groups.
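The selection order described here amounts to a two-key sort, sketched below. The dictionary fields `id`, `count`, and `first_pos` and the function name `matching_order` are hypothetical names chosen for the example.

```python
def matching_order(groups):
    """Order candidates for matching: most appearances first; for equal
    counts, the one that appears earlier on the time axis first.

    groups: list of dicts with 'id', 'count' (number of appearances) and
            'first_pos' (position of the first appearance).
    """
    return sorted(groups, key=lambda g: (-g["count"], g["first_pos"]))


candidates = [
    {"id": "sg2", "count": 2, "first_pos": 3},
    {"id": "sg1", "count": 3, "first_pos": 0},
    {"id": "sg3", "count": 2, "first_pos": 1},
]
print([g["id"] for g in matching_order(candidates)])
```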
[0013]
The scene group determination means may identify as a scene group a sequence of scenes that was not identified as a scene group by the matching. With such a configuration, every sequence of scenes can be assigned to a scene group.
[0014]
A video editing apparatus according to the present invention comprises: scene detection means that detects scene changes, which are the timings at which the picture or the audio of the video represented by the input video data switches, generates position information identifying the positions of the scene changes, partitions the video data into a plurality of sections each bounded by scene changes on the time axis, creates scenes by assigning scene numbers to the sections in time order, and outputs the created scenes;
scene classification means that extracts feature quantities from the scenes created by the scene detection means, calculates the similarity between scenes, classifies the scenes into a plurality of groups based on that similarity, creates a plurality of scene classes (シーングループ) by assigning a scene class ID to each group, and assigns to each scene the scene class ID identifying its scene class;
scene group extraction means that extracts sets of scene class IDs in which the scene class IDs repeatedly appear in the same order, creates a plurality of scene group patterns (シーン群グループ) by assigning a scene group pattern ID to each extracted set, counts the number of times each scene group pattern appears in the video data and sorts the patterns in descending order of appearance count, outputs, for each scene group pattern, scene group pattern information consisting of the scene group pattern ID, the order of appearance of the scene class IDs, and the number of appearances in the input video data, extracts as a scene group (シーン群) each set of scenes along the time axis whose order of scene class IDs matches that of a scene group pattern, assigns to each extracted scene group a scene group ID identifying it, and outputs scene group information consisting of the scene group ID, the scene group pattern ID of the scene group pattern whose order of scene class IDs matches that of the extracted scene group, and the position information of the first scene change of the extracted scene group;
scene group determination means that selects the scene group patterns one at a time in descending order of appearance count in the video data, selects one at a time the sequences of scenes that are continuous on the time axis among the remaining scenes not extracted as scene groups by the scene group extraction means, and, when the selected sequence contains a scene having the same scene class ID as the first scene of the selected scene group pattern and, later on the time axis than that scene, a scene having the same scene class ID as the last scene of the selected scene group pattern, extracts as a scene group the set of scenes beginning with the former and ending with the latter, assigns a scene group ID to each scene group so extracted, generates scene group information consisting of the scene group ID of the extracted scene group, the scene group pattern ID of the selected scene group pattern, and the position information of the first scene change of the extracted scene group, repeats this extraction for all selected sequences of scenes and all selected scene group patterns, determines each sequence of one or more scenes along the time axis that was not extracted as a scene group even using the scene group pattern information to be a scene group for which no scene group pattern with a matching order of scene classes exists, assigns scene group IDs to the scene groups so determined, and generates scene group information consisting of a value signifying that no scene group pattern matches the scene group's order of scene classes and the position information of the first scene change of the determined scene group; and
storage means that stores the scene group information in a scene group information database.
[0015]
A video editing method according to the present invention comprises: detecting scene changes, which are the timings at which the scene of the video represented by the input video data changes, and dividing the video data into scenes; extracting a feature quantity from each scene and generating scene classes (シーングループ) by classifying the scenes into groups according to the extracted feature quantities; identifying and extracting, as a scene group (シーン群), each sequence of scene classes that appears in the same order multiple times on the time axis of the video data; matching the sequences of scene classes that were not extracted against the scene groups and identifying those sequences as scene groups according to the matching result; and storing information on the result of classifying the video data into scene groups.
[0016]
According to the video editing method of the present invention, scene changes, each being a timing at which the video scene or the audio switches in input video data, are detected, and position information specifying the position of each scene change on the time axis is generated. The video data is divided into a plurality of sections bounded by the scene changes, scenes are created by assigning scene numbers to the divided sections in chronological order, and the created scenes are output. A feature value of each scene is extracted to calculate the similarity between scenes, and the scenes are classified into a plurality of groups based on that similarity. A plurality of scene groups are created by assigning a scene group ID to each group, and the scene group ID is assigned to each of the scenes. Sets of scene group IDs in which the scene group IDs repeatedly appear in the same order are extracted, and a plurality of scene group groups are created by assigning a scene group group ID to each extracted set. The number of times each scene group group appears in the video data is counted, the scene group groups are arranged in descending order of appearance count, and scene group group information composed, for each scene group group, of the scene group group ID, the order of appearance of the scene group IDs, and the number of appearances in the input video data is output. Each set of scenes along the time axis whose order of scene group IDs matches the order of scene group IDs of a scene group group is extracted as a scene group that is a collection of scenes, and a scene group ID specifying that scene group is assigned to each extracted scene group. Scene group information is output that is composed of the scene group ID, the scene group group ID of the scene group group whose order of scene group IDs matches that of the extracted scene group, and the position information of the first scene change of the extracted scene group. The scene group groups are then selected one by one in descending order of appearance count in the video data, and one sequence of scenes that are consecutive on the time axis is selected from the remaining scenes not extracted as a scene group. When the selected sequence contains a scene having the same scene group ID as the first scene of the selected scene group group and, behind that scene on the time axis, a scene having the same scene group ID as the last scene of the selected scene group group, the set of scenes starting with the former and ending with the latter is extracted as a scene group. A scene group ID specifying the scene group is assigned to each scene group so extracted, and scene group information composed of the scene group ID of the extracted scene group, the scene group group ID of the selected scene group group, and the position information of the first scene change of the extracted scene group is generated. This extraction is repeated for all selected scene sequences and all selected scene group groups. Each sequence of one or more scenes along the time axis that is not extracted as a scene group even when all of the scene group group information is used is determined to be a scene group for which no scene group group has a matching order of scene groups. A scene group ID is assigned to each scene group so determined, and scene group information composed of that scene group ID, a value meaning that there is no scene group group whose order of scene groups matches, and the position information of the first scene change of the scene group is generated. The scene group information is stored in a scene group information database.
[0017]
A video editing program according to the present invention is a video editing program installed in a video editing apparatus that edits video data to generate scene groups, each being a collection of scenes. The program comprises: a process of detecting scene changes, each being a timing at which the video scene changes, from input video data and dividing the video data into scenes; a process of extracting a feature value of each scene and generating scene groups in which the scenes are classified into groups according to the extracted feature values; a process of identifying and extracting, as a scene group that is a collection of scenes, a sequence of scene groups that appears multiple times in the same order on the time axis of the video data; a process of matching the sequences of scene groups that were not subject to extraction against the extracted scene groups and including each such sequence in a scene group according to the matching result; and a process of storing, in a storage device, information on the result of classifying the video data into scene groups.
[0018]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing a configuration example of an embodiment of the present invention. The video editing apparatus according to the present invention includes: scene detection unit 102, which detects scene changes in input video data 101, each being a point where the camera is switched or the audio (for example, the speaker) is switched, and divides the video data 101 into scenes; scene classification unit 103, which calculates a feature amount of each scene and generates scene groups in which the scenes are classified into groups according to the calculated feature amounts; scene group extraction unit 104, which extracts, as a scene group, each scene group group, that is, a set of scene groups that repeatedly appear in the same order; scene group determination unit 105, which generates scene groups by classifying the scene groups that were not targets of extraction by the scene group extraction unit 104; and storage means 106, which stores information on the result of the classification into scene groups in the scene group information database 107. The video editing apparatus according to the present invention is realized by a computer or the like, and each means is realized by a program or the like.
[0019]
When the video data 101 that is moving image data is input, the scene detection unit 102 detects scene changes in the video of the video data 101, each being a timing at which the camera is switched or a timing at which the audio is switched. For example, a scene change is detected by calculating the difference in the layout of pixel color information between successive frames of the video based on the video data 101 and determining that a scene change has occurred if the calculated difference exceeds a predetermined threshold value. The scene detection means 102 divides the video data 101 at the timing when a scene change is detected. Then, a scene information file, which is a file of information corresponding to each scene, that is, each section of the divided video data 101, is generated.
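The frame-difference test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: each frame is assumed to be a flat sequence of per-pixel color values, the difference is taken as a sum of absolute per-pixel differences, and the function name `detect_scene_changes` and the threshold value are hypothetical.

```python
from typing import List, Sequence


def detect_scene_changes(frames: Sequence[Sequence[int]], threshold: int) -> List[int]:
    """Return indices of frames judged to start a new scene.

    Hypothetical helper: the difference between successive frames is the
    sum of absolute per-pixel differences, which is only one possible
    reading of "difference in the color information layout".
    """
    changes = []
    for i in range(1, len(frames)):
        diff = sum(abs(a - b) for a, b in zip(frames[i - 1], frames[i]))
        if diff > threshold:
            changes.append(i)  # frame i is the first frame after the change
    return changes


# Toy data: three dark frames followed by two bright frames.
frames = [[10, 10, 10], [11, 10, 9], [10, 12, 10], [200, 190, 210], [201, 191, 209]]
scene_change_frames = detect_scene_changes(frames, threshold=100)
```

With this toy input only the jump from the third to the fourth frame exceeds the threshold, so a single scene change frame is reported.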
[0020]
The scene detection unit 102 assigns numbers to the divided scenes from the top in time order, and records the assigned numbers in the scene information file in association with each scene. Then, the scene detection unit 102 outputs the video data 101 divided into scenes and the scene information file to the scene classification unit 103. Further, the scene detection unit 102 generates scene change position information, which is information specifying each scene change frame, that is, each frame at which a scene change occurs, and outputs the scene change position information to the scene group determination unit 105. Here, the video data 101 may be in any signal format that allows the scene detection unit 102 to detect scene changes. For example, the video data 101 may be data recorded on a recording medium such as an analog VTR, or data in a format such as DV (Digital Video) or MPEG.
[0021]
When the video data 101 divided into scenes is input, the scene classification unit 103 extracts the feature amount of each scene and compares the extracted feature amounts of the scenes with each other to calculate the similarity. Here, the feature amount of a scene is a layout of color information of each pixel in a video frame based on the video data 101. In addition, the type of audio signal, such as whether the audio signal included in the video data is stereo audio, monaural audio, or multiplexed audio, and the waveform of the audio signal may be used as the feature amount. Furthermore, information such as the presence or absence of captions, the display position of captions, and the display language may be used as feature amounts. The similarity is, for example, a sum of absolute differences of feature amounts between scenes. Then, scenes whose mutual similarity value is smaller than a predetermined threshold are classified into the same scene group, and correspondence information (for example, a scene group ID), which is a symbol identifying the scene group, is assigned to each scene. That is, the scene classification unit 103 associates a scene group ID that identifies each scene group with each scene.
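As a small illustration of the sum-of-absolute-differences measure mentioned above, the following sketch compares hypothetical per-scene feature vectors. The function name `scene_distance` and the three-element vectors are assumptions made for the example; the text does not fix a vector layout.

```python
from typing import Sequence


def scene_distance(feat_a: Sequence[float], feat_b: Sequence[float]) -> float:
    """Sum of absolute differences between two scene feature vectors.

    Smaller values mean the two scenes are more alike.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(feat_a, feat_b))


# Hypothetical feature vectors, e.g. a coarse color summary per scene.
scene_1 = [12, 40, 8]
scene_2 = [10, 42, 9]
scene_3 = [90, 5, 60]

near = scene_distance(scene_1, scene_2)
far = scene_distance(scene_1, scene_3)
```

Here `near` is much smaller than `far`, so with a threshold between the two values the first and second scenes would fall into the same scene group and the third would not.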
[0022]
The scene classification unit 103 classifies scenes into groups based on the similarity of feature values using an existing method such as a singular value decomposition (SVD) method. Specifically, when the feature quantity of each scene is extracted and mapped into the feature quantity space, scenes having similar feature quantities form clusters in that space. Scenes whose feature quantities lie within a predetermined threshold distance of one another in the feature quantity space are treated as belonging to one group, and each such group is a scene group. The scene classification unit 103 records the scene group ID assigned to each scene in the scene information file in association with each scene. The scene classification unit 103 outputs the video data 101 and the scene information file to the scene group extraction unit 104.
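The threshold-based grouping in feature space can be illustrated with a simple greedy procedure. Note that this is a stand-in chosen for brevity, not the SVD-based method the text mentions: a scene joins the first existing group whose representative lies within the threshold, and otherwise opens a new group. The function name `classify_scenes`, the feature vectors, and the threshold are all hypothetical.

```python
from typing import List, Sequence


def classify_scenes(features: Sequence[Sequence[float]], threshold: float) -> List[str]:
    """Assign a scene group ID ("a", "b", ...) to each scene.

    Greedy stand-in for the grouping step: the representative of a group
    is simply the first scene placed in it. Supports at most 26 groups
    because IDs are single lowercase letters.
    """
    representatives: List[Sequence[float]] = []
    group_ids: List[str] = []
    for feat in features:
        for idx, rep in enumerate(representatives):
            distance = sum(abs(x - y) for x, y in zip(feat, rep))
            if distance < threshold:
                group_ids.append(chr(ord("a") + idx))
                break
        else:
            # No existing group is close enough, so open a new one.
            representatives.append(feat)
            group_ids.append(chr(ord("a") + len(representatives) - 1))
    return group_ids


features = [[0.0, 0.0], [5.0, 5.0], [0.2, 0.1], [5.1, 4.9], [9.0, 0.0]]
scene_group_ids = classify_scenes(features, threshold=1.0)
```

The resulting list plays the role of the scene group ID column recorded in the scene information file.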
[0023]
When the video data 101 is input, the scene group extraction unit 104 extracts scene group groups, each being a set of scene group IDs in which the same scene group IDs repeatedly appear in the same order, based on the scene group IDs recorded in the scene information file. The extraction of scene group groups may be performed using an existing method such as dynamic programming, as used in text data mining. Then, a scene group group ID, which is a symbol identifying the scene group group, is assigned to each scene. The scene group extraction unit 104 records the scene group group ID assigned to each scene in the scene information file in association with each scene. Then, the scene group extraction unit 104 specifies the number of times each scene group group appears in the video data 101 and the order of each scene group group on the time axis in the video data 101.
[0024]
The scene group extraction means 104 generates scene group group information composed of the scene group group ID, information on the number of times the scene group group appears in the video data 101, and information on the order of the scene group group on the time axis in the video data 101. At this time, the scene group group IDs in the scene group group information may be arranged in descending order of the number of times each scene group group appears in the video data 101.
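The extraction and counting of repeated ID sequences can be sketched as below. This is a simplified stand-in, not the dynamic-programming method the text refers to: it counts every contiguous run of a fixed length and keeps those seen at least twice, so overlapping occurrences would be counted too. The function name `extract_scene_group_groups`, the fixed `length` parameter, and the ID string are illustrative assumptions and do not reproduce the figures.

```python
from collections import Counter
from typing import Dict, List


def extract_scene_group_groups(ids: List[str], length: int, min_count: int = 2) -> List[Dict[str, object]]:
    """Find ID sequences of a fixed length that repeat in the same order.

    Returns records sorted by descending number of appearances; ties keep
    the order in which the sequences first appear on the time axis.
    """
    counts = Counter(tuple(ids[i:i + length]) for i in range(len(ids) - length + 1))
    repeated = [(seq, n) for seq, n in counts.items() if n >= min_count]
    repeated.sort(key=lambda item: -item[1])  # stable sort preserves first-seen order on ties
    return [
        {"scene_group_group_id": f"sg{k}", "order": list(seq), "appearances": n}
        for k, (seq, n) in enumerate(repeated, start=1)
    ]


# Illustrative sequence of scene group IDs along the time axis.
ids = list("abcdbefabcgdbehabc")
scene_group_group_info = extract_scene_group_groups(ids, length=3)
```

Each record corresponds to one row of scene group group information: the ID, the order of scene group IDs, and the appearance count.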
[0025]
The scene group extraction unit 104 extracts the scene group groups and sets each extracted scene group group as a scene group. Then, a scene group ID, which is a symbol for specifying the scene group, is assigned to each scene constituting the scene group. The scene group extraction unit 104 records the scene group ID in the scene information file in association with each scene. The scene group extraction unit 104 generates scene group information composed of the scene group ID of the scene group, information on the scene group group ID of each scene constituting the scene group, and information specifying the first scene change frame of each scene group. The scene group extraction unit 104 outputs the scene groups, the video data 101 of the portion not extracted as a scene group, the scene group group information, and the scene group information to the scene group determination unit 105. The scene group extraction unit 104 may also output the scene group information to the storage unit 106 and store it in the scene group information database.
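A rough sketch of turning each occurrence of a scene group group into a scene group is shown below. It assumes exact, non-overlapping matches scanned from the start of the video; the function name `extract_scene_groups`, the record keys, and the use of a scene index in place of the first scene change frame are hypothetical choices for illustration.

```python
from typing import Dict, List, Optional, Tuple


def extract_scene_groups(
    ids: List[str], patterns: Dict[str, List[str]]
) -> Tuple[List[Optional[str]], List[Dict[str, object]]]:
    """Mark every exact occurrence of a pattern as one scene group.

    `patterns` maps a scene group group ID to its ordered scene group IDs.
    Scenes that match no pattern stay unassigned (None).
    """
    assigned: List[Optional[str]] = [None] * len(ids)
    info: List[Dict[str, object]] = []
    i = 0
    while i < len(ids):
        for sgg_id, order in patterns.items():
            if ids[i:i + len(order)] == order:
                sg_id = f"SG{len(info) + 1}"
                assigned[i:i + len(order)] = [sg_id] * len(order)
                info.append({
                    "scene_group_id": sg_id,
                    "scene_group_group_id": sgg_id,
                    "first_scene_index": i,  # stands in for the first scene change frame
                })
                i += len(order)
                break
        else:
            i += 1  # no pattern starts here; leave the scene unassigned
    return assigned, info


ids = list("abcdbefabc")
patterns = {"sg1": list("abc"), "sg2": list("dbe")}
assigned, scene_group_info = extract_scene_groups(ids, patterns)
```

The unassigned entries are the scenes that would be passed on to the scene group determination step.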
[0026]
The scene group determination unit 105 divides the video data 101 of the portion not extracted as a scene group into scenes based on the scene change position information output from the scene detection unit 102. Then, the scene group determination unit 105 selects one scene group group. The scene group ID of the first scene of the selected scene group group and the scene group ID of the last scene of the selected scene group group are extracted. In addition, among the remaining scenes that are not extracted as scene groups by the scene group extraction unit 104, a sequence of scenes that are continuous on the time axis is specified. Then, a scene having the same scene group ID as the scene group ID of the first scene of the selected scene group group is specified from the remaining scenes that are not extracted as scene groups by the scene group extraction unit 104.
[0027]
If, among the scenes that are continuous with the specified scene and located after it on the time axis, there is a scene having the same scene group ID as the last scene of the selected scene group group, the scenes from the scene having the same scene group ID as the first scene of the selected scene group group through the scene having the same scene group ID as the last scene of the selected scene group group are treated as a scene group, and a scene group ID for specifying the scene group is assigned to each scene constituting the scene group. The scene group determination unit 105 records the scene group ID in the scene information file in association with each scene.
[0028]
The scene group determining means 105 selects the input scene group groups in descending order of appearance frequency and performs the above processing for each. If there are a plurality of scene group groups having the same appearance frequency, they are selected in order of appearance, and the above processing is performed to extract scene groups. The above processing is performed for all input scene group groups. Scenes that remain unextracted even after the above processing are themselves extracted as scene groups, and a scene group ID that identifies the scene group is assigned to each scene constituting the scene group. The scene group determination unit 105 records the scene group ID in the scene information file in association with each scene.
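The matching of leftover scenes against the first and last scene group IDs of each scene group group can be sketched as follows. This covers only the matching step, under the assumption that the nearest qualifying last scene is taken; the helper names `unassigned_runs` and `match_remaining`, the record keys, and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def unassigned_runs(assigned: List[Optional[str]]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of consecutive unassigned scenes."""
    runs, start = [], None
    for i, sg in enumerate(list(assigned) + ["end"]):  # sentinel closes a trailing run
        if sg is None and start is None:
            start = i
        elif sg is not None and start is not None:
            runs.append((start, i))
            start = None
    return runs


def match_remaining(ids: List[str], assigned: List[Optional[str]],
                    patterns: List[Tuple[str, List[str]]], next_number: int):
    """Match leftover runs against each pattern's first and last scene group ID.

    `patterns` must already be sorted by descending number of appearances.
    """
    assigned = list(assigned)
    info: List[Dict[str, object]] = []
    for sgg_id, order in patterns:
        first_id, last_id = order[0], order[-1]
        found = True
        while found:
            found = False
            for start, end in unassigned_runs(assigned):
                head = next((i for i in range(start, end) if ids[i] == first_id), None)
                if head is None:
                    continue
                tail = next((j for j in range(head + 1, end) if ids[j] == last_id), None)
                if tail is None:
                    continue
                sg_id = f"SG{next_number}"
                next_number += 1
                assigned[head:tail + 1] = [sg_id] * (tail - head + 1)
                info.append({"scene_group_id": sg_id, "scene_group_group_id": sgg_id,
                             "first_scene_index": head})
                found = True
                break  # the runs changed, so recompute them
    return assigned, info


ids = list("abcaxcabcacdef")
before = ["SG1"] * 3 + [None] * 3 + ["SG2"] * 3 + [None] * 5
patterns = [("sg1", list("abc")), ("sg2", list("dbe"))]
after, new_info = match_remaining(ids, before, patterns, next_number=3)
```

In this toy run the sequence "a x c" is accepted as a scene group for "sg1" even though its middle scene differs, which is the point of matching only the first and last IDs; the final scene matches nothing and stays unassigned.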
[0029]
The scene group determination unit 105 newly generates scene group information composed of the scene group ID of each scene group extracted by the scene group determination unit 105, the scene group group ID of each scene constituting the scene group, and information specifying the first scene change frame of each scene group. Then, the scene group determination unit 105 outputs the scene information file, the scene group group information, and the scene group information to the storage unit 106.
[0030]
The storage unit 106 includes a scene group information database 107 and stores the input scene information file, scene group group information, and scene group information in the scene group information database 107.
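One way to picture the scene group information database is as a small relational table. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table name, column names, and sample rows are assumptions made for illustration, since the text does not specify a schema or a database product.

```python
import sqlite3

# In-memory database standing in for the scene group information database 107.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE scene_group_info (
        scene_group_id TEXT PRIMARY KEY,
        scene_group_group_id TEXT,          -- NULL means no matching scene group group
        first_scene_change_frame INTEGER
    )
    """
)

# Hypothetical records of the kind produced by the earlier steps.
rows = [
    ("SG1", "sg1", 0),
    ("SG2", "sg2", 300),
    ("SG3", None, 720),
]
conn.executemany("INSERT INTO scene_group_info VALUES (?, ?, ?)", rows)
conn.commit()

# Look up every scene group that instantiates the scene group group "sg1".
stored = conn.execute(
    "SELECT scene_group_id FROM scene_group_info WHERE scene_group_group_id = ?",
    ("sg1",),
).fetchall()
```

Keeping the scene group group ID alongside each scene group is what later allows a search for all occurrences of one repeated structure.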
[0031]
A video editing program according to the present invention is a video editing program that is installed in a video editing apparatus that edits video data to generate scene groups, each being a collection of scenes, and that realizes each of the above means. The program causes a computer to execute: a process of detecting scene changes, each being a timing at which the video scene changes, from input video data and dividing the video data into scenes; a process of extracting a feature value of each scene and generating scene groups in which the scenes are classified into groups according to the extracted feature values; a process of identifying and extracting, as a scene group, a sequence of scene groups that appears multiple times in the same order on the time axis of the video data; a process of matching the sequences of scene groups that were not subject to extraction against the scene groups and identifying each such sequence as a scene group according to the matching result; and a process of storing, in a storage device, information on the result of classifying the video data into scene groups.
[0032]
Next, the operation of the embodiment of the present invention will be described with reference to the drawings. FIG. 2 is a flowchart showing the operation of the video editing apparatus according to the embodiment of the present invention. In FIG. 2, step S201 represents the operation of the scene detection unit 102, step S202 represents the operation of the scene classification unit 103, step S203 represents the operation of the scene group extraction unit 104, step S204 represents the operation of the scene group determination unit 105, and step S205 represents the operation of the storage means 106.
[0033]
The operation of each means will be described. First, the operation of the scene detection unit 102 will be described. FIG. 3 is a flowchart showing the operation of the scene detection means 102 in the present invention. FIG. 4 is a diagram schematically showing video data to be subjected to video editing by the video editing apparatus according to the present invention. When the video data 101 to be edited is input (step S301), the scene detection unit 102 detects a scene change of the video based on the input video data 101 (step S302). That is, as shown in the example of FIG. 4, scene change frames 701 to 724 are detected. Then, scene change position information for specifying a scene change frame is generated and output to the scene group determining means 105.
[0034]
The scene detection means 102 divides the video data 101 at the detected scene change timing (step S303), and creates scenes by assigning scene numbers from the beginning to the frames of each scene that are the divided sections. (Step S304), the scene number is recorded in the scene information file. An example of the scene number assigned to each scene is shown in the scene number column of FIG. Here, the scene constituted by the frame to which the scene number “S1” is assigned includes the frame after the scene change frame 701 and before the scene change frame 702. Hereinafter, similarly, a scene constituted by a frame to which the scene number “S2” is assigned includes a frame after the scene change frame 702 and before the scene change frame 703. Then, each created scene is output to the scene classification means 103 (step S305).
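The division into numbered scenes can be sketched as below. The sketch assumes each scene starts at a scene change frame and runs up to, but not including, the next one, and that the first change frame is the start of the video; the function name `build_scenes`, the record keys, and the frame numbers are hypothetical.

```python
from typing import Dict, List


def build_scenes(change_frames: List[int], total_frames: int) -> List[Dict[str, object]]:
    """Split the frame range at each scene change and number the sections.

    Returns one record per scene, in time order, with scene numbers
    "S1", "S2", ... as in the scene information file.
    """
    boundaries = sorted(change_frames) + [total_frames]
    scenes = []
    for n, (start, end) in enumerate(zip(boundaries, boundaries[1:]), start=1):
        scenes.append({"scene_number": f"S{n}", "start_frame": start, "end_frame": end})
    return scenes


# Hypothetical scene change frame positions in a 450-frame video.
scene_info_file = build_scenes([0, 120, 300], total_frames=450)
```

Whether the scene change frame itself belongs to the scene it opens is a choice made here for simplicity.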
[0035]
Next, the operation of the scene classification unit 103 will be described. FIG. 6 is a flowchart showing the operation of the scene classification means 103 in the present invention. When the scene classification unit 103 receives the scenes obtained by dividing the video data 101 by the scene detection unit 102 (step S401), the scene classification unit 103 extracts the feature amount of each scene (step S402) and classifies the plurality of scenes into a plurality of groups based on the similarity of the feature amounts of the scenes (step S403). Then, a scene group ID that is an ID of each classified group is assigned to each group to create a scene group (step S404), and each scene group ID is assigned to each scene (step S405). The scene classification unit 103 records the scene group ID assigned to each scene in the scene information file in association with each scene. An example of the scene group ID assigned to each scene is shown in the scene group ID column of FIG. The scene classification unit 103 assigns the scene group ID “a” to each frame assigned the scene number “S1”, and thereafter assigns the scene group IDs “b” through “f” as shown in FIG. The scene classification unit 103 outputs the video data 101 and the scene information file to the scene group extraction unit 104.
[0036]
Next, the operation of the scene group extraction unit 104 will be described. FIG. 7 is a flowchart showing the operation of the scene group extraction means 104 in the present invention. When the video data 101 and the scene information file are input to the scene group extraction unit 104 (step S501), sets of scene group IDs that repeatedly appear in the same order in the video data 101 are extracted based on the scene group IDs recorded in the scene information file (step S502). Referring to FIG. 5, it can be seen that there are sets in which the scene group IDs are continuous in the order of “abc”, and sets in which the scene group IDs are continuous in the order of “dbe”. Then, a scene group group ID is assigned to each set of extracted scene group IDs to create a plurality of scene group groups (step S503).
[0037]
The scene group extraction unit 104 records the scene group group ID assigned to each scene in the scene information file in association with each scene. As shown in FIG. 8, a scene group group ID “sg1” is assigned to each frame constituting a set of consecutive scene group IDs in the order of “abc” to create a scene group group, and the order of “dbe” It is assumed that a scene group group is created by assigning a scene group group ID “sg2” to each frame constituting a set of consecutive scene group IDs.
[0038]
The scene group extraction unit 104 refers to the scene information file and specifies the number of times each scene group group appears in the video data 101 (step S504). Referring to FIG. 5, the scene group group “sg1” appears three times and the scene group group “sg2” appears twice as shown in the appearance number column of FIG. Then, the scene group groups are arranged in descending order of appearance frequency, and scene group group information composed of the scene group group ID, information on the number of times the scene group group appears in the video data 101, and information on the order of the scene group group on the time axis in the video data 101 is generated. That is, the scene group group information is configured as shown in FIG. The scene group extraction unit 104 outputs the scene group group information to the scene group determination unit 105 (step S506).
[0039]
The scene group extraction means 104 extracts the scene group groups and sets each extracted scene group group as a scene group (step S507). Then, a scene group ID is assigned to each extracted scene group (step S508). The scene group extraction unit 104 records the scene group ID assigned to each scene in the scene information file in association with each scene. Assume that the scene group ID “SG1” is assigned to the scene numbers “S1, S2, S3”, and the scene group IDs “SG2” to “SG5” are assigned as shown in FIG. Then, scene group information composed of the scene group ID of the scene group, the information of the scene group group ID of each scene constituting the scene group, and the information specifying the first scene change frame of each scene group is generated and output to the scene group determining means 105 (step S509). The scene group extraction unit 104 outputs the scene information file, the scene groups, and each scene that has not been extracted as a scene group to the scene group determination unit 105.
[0040]
Next, the operation of the scene group determination unit 105 will be described. FIG. 10 is a flowchart showing the operation of the scene group determining means 105 in the present invention. When the scene group determination unit 105 receives the scene group group information and the remaining scenes that the scene group extraction unit 104 did not extract as scene groups (step S601), the scene group determination unit 105 divides them into scenes based on the scene change position information. Then, one scene group group is selected in descending order of appearance frequency (step S602). Referring to FIG. 8, since the scene group group “sg01” has the highest number of appearances, the scene group group “sg01” is selected. Then, one sequence of consecutive scenes along the time axis is specified from the remaining scenes that are not extracted as scene groups by the scene group extraction unit 104 (step S603). For example, scene numbers “S10” to “S12” in FIG. 9 are specified.
[0041]
Then, matching between the specified scene sequence and the scene group group is performed. That is, among the remaining scenes not extracted as scene groups by the scene group extraction unit 104, the scene having the same scene group ID as the first scene of the selected scene group group is specified. If a scene that is continuous with the specified scene and located after it on the time axis has the same scene group ID as the last scene of the selected scene group group, the scenes from the scene having the same scene group ID as the first scene of the selected scene group group through the scene having the same scene group ID as the last scene of the selected scene group group are extracted as a scene group, and a scene group ID is assigned (steps S604 and S605). The scene group determination unit 105 records the scene group ID in the scene information file in association with each scene. The scene group determination unit 105 may newly generate scene group information that includes the information of the scene group extracted at this time and output the scene group information to the storage unit 106.
[0042]
Referring to FIG. 8, the scene group ID of the first scene of the scene group group “sg01” is “a”. Therefore, when a scene having the scene group ID “a” in the scenes “S10” to “S12” in FIG. 9 is searched using the scene information file, the scene group ID of the scene “S10” is “a”. I understand that. The scene group ID of the last scene of the scene group group “sg01” is “c”. Therefore, the scene having the scene group ID “c” is searched from the scenes “S10” to “S12” on the time axis behind the scene “S10”. Then, it can be seen that the scene group ID of the scene “S12” is “c”. Therefore, scenes “S10” to “S12” are extracted as scene groups, and a scene group ID “SG06” is assigned.
[0043]
The scene group determination means 105 determines whether or not the specified sequence of scenes contains a scene that has not been matched with the scene group group (step S604), and if there is such a scene, performs the operation of step S605. Since the specified sequence of scenes “S10” to “S12” contains no scenes that have not been matched with the scene group group, the operation of step S605 is not performed. Then, it is determined whether or not the operations in steps S605 and S606 have been performed on the other sequences of consecutive scenes along the time axis among the scenes not extracted as scene groups (step S607); a sequence on which the operations have not yet been performed is selected (step S603), and the same operation is performed.
[0044]
Referring to FIG. 9, the same operation is performed for scenes “S16” to “S20”. That is, when a scene with the scene group ID “a” is searched from the scenes “S16” to “S20”, it is found that the scene group ID of the scene “S16” is “a”. The scene group ID of the last scene of the scene group group “sg01” is “c”. Therefore, among the scenes “S16” to “S20”, the scene having the scene group ID “c” is searched behind the scene “S16” on the time axis. Then, it can be seen that the scene group ID of the scene “S17” is “c”. Therefore, scenes “S16” to “S17” are extracted as scene groups, and a scene group ID “SG07” is assigned.
[0045]
Next, when a scene with the scene group ID “a” is searched from the scenes “S18” to “S20”, there is no scene with the scene group ID “a”. When the matching with the scene group group “sg01” is finished for all the scene arrangements, the scene group determining unit 105 determines whether the matching with all the scene group groups is finished (step S607). Referring to FIG. 8, since the scene group group “sg02” is not matched, the scene group determination unit 105 selects the scene group group “sg02” (step S602) and performs matching. Then, a sequence of continuous scenes along the time axis is specified from the sequence of the remaining scenes not extracted as the scene group (step S603). Then, scene numbers “S18” to “S20” in FIG. 9 are specified.
[0046]
Referring to FIG. 8, the scene group ID of the first scene of the scene group group “sg02” is “d”. Therefore, when the scene having the scene group ID “d” is searched from the scenes “S18” to “S20” in FIG. 9, it is found that the scene group ID of the scene “S18” is “d”. The scene group ID of the last scene of the scene group group “sg02” is “e”. Therefore, among the scenes “S18” to “S20”, the scene having the scene group ID “e” is searched behind the scene “S18” on the time axis. Then, it can be seen that the scene group ID of the scene “S19” is “e”. Therefore, scenes “S18” to “S19” are extracted as a scene group, and a scene group ID “SG08” is assigned.
[0047]
The only remaining scene is the scene “S20”. If there is only one scene, the scene group determination means 105 does not extract a scene group by matching. The scene group determination means 105 determines that matching has been completed for all scene sequences and all scene group groups.
[0048]
The scene group determination means 105 determines each sequence of one or more scenes along the time axis that has not been extracted as a scene group, even after all scene sequences have been matched against all scene group groups, to be a scene group for which no scene group group has a matching order of scene groups (step S608), and assigns a scene group ID to each scene group so determined (step S609). The scene group ID “SG09” is assigned to the scene “S20”. FIG. 11 shows the result of assigning scene group IDs to all the scenes of the video data 101.
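The handling of scenes left over after all matching can be sketched as follows: every consecutive run of still-unassigned scenes becomes its own scene group, recorded with a value meaning that no scene group group matches. The function name `assign_leftover_groups`, the use of `None` as that value, the two-digit ID format, and the sample list are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple


def assign_leftover_groups(
    assigned: List[Optional[str]], next_number: int
) -> Tuple[List[str], List[Dict[str, object]]]:
    """Turn each consecutive run of unassigned scenes into its own scene group."""
    result = list(assigned)
    info: List[Dict[str, object]] = []
    i = 0
    while i < len(result):
        if result[i] is not None:
            i += 1
            continue
        start = i
        while i < len(result) and result[i] is None:
            i += 1  # extend the run of unassigned scenes
        sg_id = f"SG{next_number:02d}"
        next_number += 1
        result[start:i] = [sg_id] * (i - start)
        info.append({
            "scene_group_id": sg_id,
            "scene_group_group_id": None,  # no scene group group has a matching order
            "first_scene_index": start,
        })
    return result, info


# Five scenes where the first four were matched and the last was not.
partial = ["SG07", "SG07", "SG08", "SG08", None]
final, leftover_info = assign_leftover_groups(partial, next_number=9)
```

After this step every scene carries a scene group ID, so the whole video is covered by scene groups.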
[0049]
The scene group determination unit 105 newly generates scene group information composed of the scene group ID of each scene group extracted by the scene group determination unit 105, information on the scene group group ID of each scene constituting the scene group, and information specifying the first scene change frame of each scene group. Then, the scene group determination unit 105 outputs the scene group group information and the scene group information to the storage unit 106 (step S609).
[0050]
The accumulating unit 106 accumulates the scene group group information output by the scene group determining unit 105 and the scene group information in the scene group information database 107 (step S205).
[0051]
[Effect of the invention]
As described above, according to the present invention, it is possible to automatically perform editing in which the input video is classified by scene feature amount and divided into scene groups, each being a collection of scenes. In addition, when a repeated structure, which is a structure in which scenes repeatedly appear in the same order, is included in the input video, the repeated structure can be extracted as a scene group. Consequently, after the video has been edited by the video editing apparatus according to the present invention, a search for one scene included in a repeated structure can find the video of the desired scene by searching for the scene group.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration example of an embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the exemplary embodiment of the present invention.
FIG. 3 is a flowchart showing the operation of the scene detection means according to the present invention.
FIG. 4 is a diagram showing a frame configuration example of video data to be edited by the embodiment of the present invention.
FIG. 5 is a diagram showing a scene number of each scene of video data and a scene group ID assigned to each scene.
FIG. 6 is a flowchart showing the operation of the scene classification means according to the present invention.
FIG. 7 is a flowchart showing the operation of the scene extracting means according to the present invention.
FIG. 8 is a diagram showing scene group group IDs, appearance order of scene group IDs, and number of appearances of scene group groups.
FIG. 9 is a diagram showing a scene number, a scene group ID, and a scene group ID of a scene group extracted by a scene extracting unit.
FIG. 10 is a flowchart showing the operation of the scene determining means according to the present invention.
FIG. 11 is a diagram showing a scene number, a scene group ID, a scene group ID of a scene group extracted by a scene extraction unit, and a scene group ID of a scene group determined by a scene determination unit.
FIG. 12 is a block diagram showing a configuration of an embodiment of a conventional video editing system described in Patent Document 1.
FIG. 13 is a block diagram showing a configuration of an embodiment of a conventional video editing system described in Patent Document 2.
[Explanation of symbols]
101 Video data
102 Scene detection means
103 Scene classification means
104 Scene group extraction means
105 Scene group determining means
106 Storage means
107 Scene group information database
121 Encoding parameter extraction unit
122 Scene change frame detector
123 Scene group determination unit
124 Scene information layering section
125 Accumulator
131 Feature extraction unit
132 Quantization unit
133 Counting unit
701 First frame of the first scene change
702 First frame of the second scene change
703 First frame of the third scene change
704 First frame of the fourth scene change
705 First frame of the fifth scene change
706 First frame of the sixth scene change
707 First frame of the seventh scene change
708 First frame of the eighth scene change
709 First frame of the ninth scene change
710 First frame of the 10th scene change
711 First frame of the 11th scene change
712 First frame of the 12th scene change
713 First frame of the 13th scene change
714 First frame of the 14th scene change
715 First frame of the 15th scene change
716 First frame of the 16th scene change
717 First frame of the 17th scene change
718 First frame of the 18th scene change
719 First frame of the 19th scene change
720 First frame of the 20th scene change
721 First frame of the 21st scene change
722 First frame of the 22nd scene change
723 First frame of the 23rd scene change
724 First frame of the 24th scene change

Claims

A video editing apparatus comprising:
scene detection means that detects, from input video data, scene changes, each being a timing at which the scene of the video switches, and divides the video data into a plurality of scenes;
scene classification means that extracts a feature amount of each scene, generates scene groups by classifying the scenes in the video data into a plurality of groups according to the extracted feature amounts, and associates with each scene correspondence information identifying the scene group into which that scene is classified;
scene group extraction means that identifies and extracts, as a scene group, a set of scenes corresponding to a series of correspondence information that appears in the same order a plurality of times on the time axis;
scene group determination means that matches a sequence of scene groups not extracted by the scene group extraction means against the scene groups extracted by the scene group extraction means and, according to the matching result, includes that sequence of scene groups in a scene group; and
storage means that stores information resulting from classifying the video data into scene groups.
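
As a rough illustration of the scene group extraction step in the claim above, the following Python sketch counts every contiguous run of scene group IDs and keeps the runs that occur more than once. The function name `repeated_patterns`, the minimum run length, and the decision to count overlapping occurrences are assumptions made for illustration; the claim does not prescribe how repeated sequences are searched for.

```python
from collections import Counter


def repeated_patterns(group_ids, min_len=2, min_count=2):
    """Return {pattern: count} for contiguous runs of scene group IDs
    that appear at least min_count times (overlaps are counted)."""
    counts = Counter()
    n = len(group_ids)
    for length in range(min_len, n + 1):
        for start in range(n - length + 1):
            # A pattern is the tuple of IDs in time order.
            counts[tuple(group_ids[start:start + length])] += 1
    return {p: c for p, c in counts.items() if c >= min_count}


# One scene group ID per scene, in time order (illustrative data).
ids = ["A", "B", "C", "A", "B", "C", "D"]
patterns = repeated_patterns(ids)
```

For the ID sequence A B C A B C D, the runs (A, B), (B, C), and (A, B, C) each occur twice; how to choose among nested runs is left open here.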

The video editing apparatus according to claim 1, wherein the scene group determination means:
determines whether the scene group of the first scene in the extracted scene group matches the scene group of the first scene in the sequence of scene groups;
determines whether the scene group of the last scene in the extracted scene group matches the scene group of the last scene in the sequence of scene groups; and
includes in the scene group a sequence of scene groups for which both are determined to match.
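
The first-scene and last-scene check described above can be sketched as follows. `match_first_last` is a hypothetical helper. Taking the earliest matching start and the nearest later end is an assumption, because the claim states only that the first and last scene groups must match.

```python
def match_first_last(pattern, run):
    """Find a span in `run` (scene group IDs of leftover scenes, in time
    order) that starts with pattern[0] and ends, later, with pattern[-1].

    Returns (start_index, end_index) into `run`, or None if no span exists.
    """
    first, last = pattern[0], pattern[-1]
    try:
        start = run.index(first)            # earliest scene matching the first ID
        end = run.index(last, start + 1)    # nearest later scene matching the last ID
    except ValueError:
        return None
    return start, end


span = match_first_last(("A", "B", "C"), ["D", "A", "E", "C", "F"])
```

In this example the span covers the scenes at indices 1 through 3, even though the middle scene differs from the pattern.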

The video editing apparatus according to claim 2, wherein the scene group determination means performs the matching by selecting scene groups in descending order of number of appearances in the video data and, when a plurality of scene groups have the same number of appearances in the video data, selects those scene groups in the order in which they appear on the time axis.
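
The selection order in the claim above amounts to a two-key sort. The sketch below assumes each candidate is a tuple of a label, its number of appearances, and the position of its first appearance on the time axis; this tuple layout and the name `selection_order` are illustrative assumptions.

```python
def selection_order(candidates):
    """Sort candidates by appearance count (descending), breaking ties by
    first position on the time axis (ascending).

    Each candidate is (label, appearance_count, first_position).
    """
    return sorted(candidates, key=lambda c: (-c[1], c[2]))


ordered = selection_order([("G1", 2, 5), ("G2", 3, 9), ("G3", 2, 0)])
```

Here G2 is selected first because it appears most often, and G3 precedes G1 because both appear twice and G3 appears earlier.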

The video editing apparatus according to claim 2, wherein the scene group determination means identifies, as a scene group, a sequence of scenes that was not identified as a scene group as a result of the matching.

A video editing apparatus comprising:
scene detection means that detects, based on input video data, scene changes, each being a timing at which the scene of the video switches or a timing at which the audio switches, generates position information identifying the position of each scene change, divides the video data into a plurality of sections, creates scenes by assigning scene numbers to the sections in time order, and outputs the created scenes;
scene classification means that extracts feature quantities of the scenes created by the scene detection means, calculates the similarity between scenes, classifies the scenes into a plurality of groups based on the similarity between scenes, creates a plurality of scene groups by assigning a scene group ID to each of the classified groups, and assigns to each of the scenes the scene group ID that specifies its scene group;
scene group extraction means that extracts sets of scene group IDs in which the scene group IDs repeatedly appear in the same order; creates a plurality of scene group groups by assigning a scene group group ID to each extracted set of scene group IDs; counts the number of appearances of each scene group group in the video data and arranges the scene group groups in descending order of number of appearances; generates, for each scene group group, scene group group information composed of the scene group group ID, the appearance order of the scene group IDs, and the number of appearances in the input video data; extracts, as a scene group, a set of a plurality of scenes along the time axis whose appearance order of scene group IDs matches the appearance order of scene group IDs of a scene group group; assigns to each extracted scene group a scene group ID that specifies the scene group; and outputs scene group information composed of that scene group ID, the scene group group ID of the scene group group whose appearance order of scene group IDs matches that of the extracted scene group, and the position information of the first scene change of the extracted scene group;
scene group determination means that selects the scene group groups one at a time in descending order of number of appearances in the video data; selects, one at a time, a sequence of scenes that is continuous on the time axis from among the remaining scenes not extracted as a scene group by the scene group extraction means; when the selected sequence of scenes contains a scene having the same scene group ID as the first scene of the selected scene group group and, behind that scene on the time axis, a scene having the same scene group ID as the last scene of the selected scene group group, extracts as a scene group the set of scenes that begins with the former scene and ends with the latter scene; assigns to each extracted scene group a scene group ID that specifies the scene group; generates scene group information composed of the scene group ID of the extracted scene group, the scene group group ID of the selected scene group group, and the position information of the first scene change of the extracted scene group; repeats the processing of extracting scene groups for all the selected sequences of scenes and all the selected scene group groups; determines each sequence of one or more scenes along the time axis that was not extracted as a scene group even by using the scene group group information to be a scene group for which no scene group group with a matching appearance order exists; assigns a scene group ID to each scene group so determined; and generates scene group information composed of that scene group ID, a value meaning that no scene group group with a matching appearance order exists, and the position information of the first scene change of the determined scene group; and
storage means that stores the scene group information in a scene group information database.
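
The scene group information described in the claim above has three fields, which suggests a simple record type. The sketch below is one possible representation and not the stored format of the scene group information database. The class name, the field names, and the use of `None` as the value meaning that no matching scene group group exists are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed sentinel for "no scene group group with a matching appearance order".
NO_MATCHING_GROUP = None


@dataclass(frozen=True)
class SceneGroupInfo:
    scene_group_id: int
    scene_group_group_id: Optional[int]   # NO_MATCHING_GROUP when nothing matched
    first_scene_change_position: int      # position of the first scene change


def records_for_unmatched(first_positions: List[int],
                          next_scene_group_id: int) -> List[SceneGroupInfo]:
    """Build records for leftover scene sequences that matched no pattern,
    numbering them consecutively from next_scene_group_id."""
    return [
        SceneGroupInfo(next_scene_group_id + offset, NO_MATCHING_GROUP, position)
        for offset, position in enumerate(first_positions)
    ]


records = records_for_unmatched([120, 480], next_scene_group_id=7)
```

Records for matched scene groups would use the same type with the ID of the matching scene group group in the second field.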

A video editing method comprising:
detecting, from input video data, scene changes, each being a timing at which the scene of the video switches, and dividing the video data into a plurality of scenes;
extracting a feature amount of each scene, generating scene groups by classifying the scenes in the video data into a plurality of groups according to the extracted feature amounts, and associating with each scene correspondence information identifying the scene group into which that scene is classified;
identifying and extracting, as a scene group, a set of scenes corresponding to a series of correspondence information that appears in the same order a plurality of times on the time axis;
matching the extracted scene groups against a sequence of scene groups that was not a target of the extraction, and specifying that sequence of scene groups as a scene group according to the matching result; and
storing information resulting from classifying the video data into scene groups.
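
The first step of the method, dividing the video data at detected scene changes, might look like the sketch below. It assumes frame indices that start at 0 and scene change positions given as the index of the first frame after each change. `split_into_scenes` is a hypothetical name, and scene change detection itself is outside the sketch.

```python
def split_into_scenes(num_frames, change_frames):
    """Divide frames 0..num_frames-1 into scenes at the given scene changes.

    Returns a list of (scene_number, first_frame, last_frame), with scene
    numbers assigned from 1 in time order.
    """
    # Frame 0 always starts a scene; duplicates are dropped.
    starts = sorted(set(change_frames) | {0})
    bounds = starts + [num_frames]
    return [(i + 1, bounds[i], bounds[i + 1] - 1) for i in range(len(starts))]


scenes = split_into_scenes(10, [4, 7])
```

With ten frames and scene changes at frames 4 and 7, this yields three scenes covering frames 0 to 3, 4 to 6, and 7 to 9.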

A video editing method comprising:
detecting, based on input video data, scene changes, each being a timing at which the scene of the video switches or a timing at which the audio switches, generating position information identifying the position of each scene change, dividing the video data into a plurality of sections, creating scenes by assigning scene numbers to the sections in time order, and outputting the created scenes;
extracting feature quantities of the created scenes, calculating the similarity between scenes, classifying the scenes into a plurality of groups based on the similarity between scenes, creating a plurality of scene groups by assigning scene group IDs to the frames constituting the classified groups, and assigning to each of the scenes the scene group ID that specifies its scene group;
extracting sets of scene group IDs in which the scene group IDs repeatedly appear in the same order; creating a plurality of scene group groups by assigning a scene group group ID to each extracted set of scene group IDs; counting the number of appearances of each scene group group in the video data and arranging the scene group groups in descending order of number of appearances; generating, for each scene group group, scene group group information composed of the scene group group ID, the appearance order of the scene group IDs, and the number of appearances in the input video data; extracting, as a scene group, a set of a plurality of scenes along the time axis whose appearance order of scene group IDs matches the appearance order of scene group IDs of a scene group group; assigning to each extracted scene group a scene group ID that specifies the scene group; and outputting scene group information composed of that scene group ID, the scene group group ID of the scene group group whose appearance order of scene group IDs matches that of the extracted scene group, and the position information of the first scene change of the extracted scene group;
selecting the scene group groups one at a time in descending order of number of appearances in the video data; selecting, one at a time, a sequence of scenes that is continuous on the time axis from among the remaining scenes not extracted as a scene group; when the selected sequence of scenes contains a scene having the same scene group ID as the first scene of the selected scene group group and, behind that scene on the time axis, a scene having the same scene group ID as the last scene of the selected scene group group, extracting as a scene group the set of scenes that begins with the former scene and ends with the latter scene; assigning to each extracted scene group a scene group ID that specifies the scene group; generating scene group information composed of the scene group ID of the extracted scene group, the scene group group ID of the selected scene group group, and the position information of the first scene change of the extracted scene group; repeating the processing of extracting scene groups for all the selected sequences of scenes and all the selected scene group groups; determining each sequence of one or more scenes along the time axis that was not extracted as a scene group even by using the scene group group information to be a scene group for which no scene group group with a matching appearance order exists; assigning a scene group ID to each scene group so determined; and generating scene group information composed of that scene group ID, a value meaning that no scene group group with a matching appearance order exists, and the position information of the first scene change of the determined scene group; and
storing the scene group information in a scene group information database.
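
The classification step of the method above, which groups scenes by similarity and assigns scene group IDs, can be sketched in a few lines. The claim does not specify the feature quantity or the similarity measure. Cosine similarity on plain numeric vectors, greedy comparison against the first member of each group, and the 0.9 threshold are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_scenes(features, threshold=0.9):
    """Assign a scene group ID (0, 1, 2, ...) to each scene feature vector.

    A scene joins the first existing group whose representative is similar
    enough; otherwise it starts a new group and becomes its representative.
    """
    representatives = []
    group_ids = []
    for feature in features:
        for gid, rep in enumerate(representatives):
            if cosine(feature, rep) >= threshold:
                group_ids.append(gid)
                break
        else:
            representatives.append(feature)
            group_ids.append(len(representatives) - 1)
    return group_ids


ids = classify_scenes([(1.0, 0.0), (0.0, 1.0), (1.0, 0.01), (0.0, 2.0)])
```

The resulting list holds one scene group ID per scene in time order, which is the input form that repeated-order extraction would operate on.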

A video editing program installed in a video editing apparatus that edits video data and generates scene groups, each being a collection of scenes, the program causing a computer to execute:
a process of detecting, from input video data, scene changes, each being a timing at which the scene of the video switches, and dividing the video data into a plurality of scenes;
a process of extracting a feature amount of each scene, generating scene groups by classifying the scenes in the video data into a plurality of groups according to the extracted feature amounts, and associating with each scene correspondence information identifying the scene group into which that scene is classified;
a process of identifying and extracting, as a scene group, a set of scenes corresponding to a series of correspondence information that appears in the same order a plurality of times on the time axis;
a process of matching a sequence of scene groups that was not extracted against the extracted scene groups and, according to the matching result, including that sequence of scene groups in a scene group; and
a process of storing, in a storage device, information resulting from classifying the video data into scene groups.