JPH10304242A

JPH10304242A - Dramatic video image production support method and device

Info

Publication number: JPH10304242A
Application number: JP10623997A
Authority: JP
Inventors: Yasumasa Niikura; 康巨新倉; Kenichi Minami; 憲一南; Akito Akutsu; 明人阿久津; Yoshinobu Tonomura; 佳伸外村
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1997-04-23
Filing date: 1997-04-23
Publication date: 1998-11-13
Anticipated expiration: 2017-04-23
Also published as: JP3506410B2

Abstract

PROBLEM TO BE SOLVED: To make it easy to produce attractive, dramatic video by supporting the video production work. SOLUTION: First, in the video section / music section extraction step 1, a video section 13 to which music is to be added and a music section 14 to be used are selected from video information 11 and music information 12. Next, in the video information extraction step 2, scene changes, the start and end points of camera work, and the start and end points of the video section are extracted from the video section 13 as video climax candidates 15. In the music information extraction step 3, change points of the music volume and change points of the frequency distribution are extracted from the music section 14 as music climax candidates 16. Then, in the climax matching step 4, the music section 14 is adjusted on the basis of the climax candidates 15 and 16 so that it is synchronized with the video section 13. Finally, in the information addition step 5, the music information is added to the video to obtain the final dramatic video work 19.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method and apparatus for producing and editing a multimedia title by combining video information and music information, and more particularly to a method and apparatus that assist in combining music information with video information so as to add dramatic effect.

[0002]

[Prior Art] Advances in digital technology have made it easy for ordinary people, not only broadcasters and production companies, to handle video information. Video captured by a camera or the like is sufficient to convey facts, but to emphasize something or to capture the viewer's interest it is common not merely to show the footage but to apply devices such as shot cutting and background music (BGM). For example, simply adding BGM to a video greatly increases the appeal of the video work.

[0003] Several tools that support such editing and production work have therefore been developed and commercialized. For example, the non-linear editing system developed by AVID stores video information and music information as digital data and displays time-series information spatially on a display. This visualization lets the user grasp time-direction data intuitively in spatial terms and edit easily by randomly accessing the video and music information.

[0004] To make a video more appealing, however, it is not enough simply to add music to it: the music must heighten and express mood, emotion, and psychological state so that the synergy between video and music is exploited. Selecting appropriate music in accordance with the atmosphere and scenario of the video, and adding it to the video at the right timing, creates this synergy and makes the video work appealing.

[0005] There are many techniques for adding music to video, and a few of them have long been in wide use. Examples are given below.
1) For footage containing camera work, raise the music volume along with the camera work and bring it to its maximum when the camera work ends, expressing exhilaration and exciting the viewer.
2) Start or end the music at the same moment the picture switches, giving the viewer an impact and making the picture at the start or end of the music memorable.
3) Gradually lower the music volume, or gradually slow the performance, as the scene progresses, giving the viewer a calm feeling or evoking sadness.
4) Match a change in the mood of the music to a change of shot, conveying to the viewer a transition from intensity to stillness or the reverse.

[0006] In all of these techniques, the video and music are synchronized against reference information: from the video, the edit points, that is, scene changes and the start and end points of camera work; and from the music, volume change points, rhythm points, tempo, and the like.

[0007] Ideally, a tool would automatically select suitable music on the basis of the atmosphere and scenario of the video, taking the musical idea, melody, and so on into account, cut out only the required length, and add it to the video appropriately using the techniques above. However, this requires human creativity and the ability to understand the semantic content of the video and music information, so full automation is almost impossible at present.

[0008]

[Problems to Be Solved by the Invention] Consequently, music is selected and edited for effect using a conventional editing apparatus. However, the information that the techniques above rely on, such as camera work and edit change points in the video, and the volume, tempo, rhythm, and the change points and peaks of the frequency distribution of the music, is not known in advance. The video producer has had to analyze the video and music information individually, find the characteristic information, synchronize the two, and raise the quality through repeated trial and error. Producing a more polished video work therefore demands an enormous amount of time and labor.

[0009] This problem with the prior art arises because conventional video production and editing apparatuses handle video and music only as raw data, so that producing and editing a more polished video work requires the work of finding the characteristic information and combining it.

[0010] An object of the present invention is to provide a dramatic video production support method and apparatus that process video and sound information to extract the characteristic information commonly used in video production techniques, automatically process and combine the video and music in accordance with the user's request using the characteristic information obtained from each as the reference, and thereby support video production work and make it easy to produce appealing, dramatic video.

[0011]

[Means for Solving the Problems] The dramatic video production support method of the present invention comprises: a video section / music section extraction step of extracting, from video information and music information respectively, the video section and music section that the user wishes to work on; a video information extraction step of extracting from the video section, as video climax candidate information, the start and end points of the video section, the start and end points of scene changes, and the start and end points of camera work; a music information extraction step of extracting from the music section, as music climax candidate information, the start and end points of the music section and the change points of volume and frequency; and a climax matching step of matching the video climax candidate information with the music climax candidate information using a single climax candidate matching model prepared in advance, or one selected from a plurality of such models, and processing the music section in accordance with the matched climax candidates so that it can be synchronized with the video section, thereby synchronizing the music section with the video section.

[0012] The present invention extracts in advance, as climax candidate information, the information likely to be used in video production, namely scene changes and camera work in the video and the change points of volume and frequency distribution in the music, and adds music to the video in accordance with the operations of a video production technique model, using this climax candidate information as the reference.

[0013] The present invention makes it possible to automate synchronization based on the video feature points and music feature points that serve as references when video production techniques are applied, and thus supports the creation of video works with dramatic effect.

[0014] According to embodiments of the present invention, one of the climax candidate matching models prepared in advance is any of the following, each of which adds music to the video using the matched values as the reference:
1) a model that matches the end point of camera work extracted as a video climax candidate with a volume change point extracted as a music climax candidate;
2) a model that matches the end point of camera work extracted as a video climax candidate with a frequency change point extracted as a music climax candidate;
3) a model that matches the start point of camera work extracted as a video climax candidate with a volume change point extracted as a music climax candidate;
4) a model that matches the start point of camera work extracted as a video climax candidate with a frequency change point extracted as a music climax candidate;
5) a model that matches a scene change extracted as a video climax candidate with a volume change point extracted as a music climax candidate;
6) a model that matches a scene change extracted as a video climax candidate with a frequency change point extracted as a music climax candidate;
7) a model that matches any one of the start and end points of the video section extracted as video climax candidates with any one of the volume and frequency change points extracted as music climax candidates;
8) a model that matches any one of the start and end points of scene changes and the start and end points of camera work within the video section, extracted as video climax candidates, with any one of the start and end points of the music section extracted as music climax candidates.

[0015] Using these models makes it possible to generate a video work automatically.

[0016] The dramatic video production support apparatus of the present invention comprises: video section / music section extraction means for extracting, from video information and music information respectively, the video section and music section that the user wishes to work on; video information extraction means for extracting from the video section, as video climax candidate information, the start and end points of the video section, the start and end points of scene changes, and the start and end points of camera work; music information extraction means for extracting from the music section, as music climax candidate information, the start and end points of the music section and the change points of volume and frequency; and climax matching means for matching the video climax candidates with the music climax candidates using a climax candidate matching model prepared in advance, or one selected from a plurality of such models, and processing the music section in accordance with the matched climax candidates so that it can be synchronized with the video section, thereby synchronizing the music section with the video section.

[0017]

[Embodiments of the Invention] Embodiments of the present invention will now be described with reference to the drawings.

[0018] FIG. 1 is a flowchart of a dramatic video production support method according to one embodiment of the present invention.

[0019] First, in the video section / music section extraction step 1, the video section 13 to which the user wishes to add music and the music section 14 the user wishes to use, both specified by the user, are extracted from the video information 11 and the music information 12. The two sections 13 and 14 need not be the same length. To extract the video section 13, the time information of the video information 11 is shown on an output device such as a display, and the user selects the start and end points of the desired video section interactively with the display and a mouse, by a command-style specification, or the like. Representative images from the video information may be displayed along the time information as well. Representative images here are images immediately after a scene change, images immediately after the end of camera work, images sampled at a suitable time interval, or images supplied manually in advance. Alternatively, the time information may be ignored and the selection made from this image information alone. Existing media such as video tape, laser discs, and computer files store music and audio information together with the video information. In the editing work, however, the time information of such sound information is assumed to have been separated in advance. To extract the music section 14, the time information of the music information is shown on an output device such as a display, and the user selects the start and end points of the music section interactively with the display and a mouse or by a command-style specification. Feature quantities extracted from the music information can also be displayed along the time axis together with the video information. The parameters extracted from the music information here are the frequency distribution, volume changes, and the like. Depending on the medium, additional information such as the song title, singer's name, and lyrics may be stored with the music information; this information may be displayed on the time axis and used for selection.

[0020] Next, in the video information extraction step 2, the start and end points of scene changes, the start and end points of camera work, and the start and end points of the video section are extracted from the video section 13 as video climax candidates 15. Meanwhile, in the music information extraction step 3, the start and end points of the music section, the volume change points, and the change points of the frequency distribution are extracted from the music section 14 as music climax candidates 16. Then, in the climax matching step 4, the video climax candidates 15 and the music climax candidates 16 are matched using the climax candidate matching model 17 specified by the user. In this step the music section 14 is adjusted on the basis of the climax candidates 15 and 16 so that it can be synchronized with the video section 13, and a synchronization signal 18 is output. In the information addition step 5, the music information is added to the video in accordance with the synchronization signal 18 to obtain the final dramatic video work 19. The dramatic video work 19 can of course be edited further to produce a new work.

[0021] The synchronization signal 18 is based on the time signal needed to manage and process the video information and the music information. For example, taking as the reference the time information required to play back the video and music information, any information proportional to that time information may be used. The synchronization signal 18 is the signal needed to play back the video information and music information in synchronization with other video or audio information and with other additional information (for example, characters, captions, and lyrics displayed on the screen). For example, when the video information and music information are organized as data files on a computer, the target information is played back by randomly accessing multiple locations in multiple data files. The synchronization signal then holds the files to access, the data positions within those files, the buffering timing required for access, and so on. This signal may be recorded and saved, and the recording medium is not particularly limited; it is stored, for example, as a data file on a computer or as data recorded on an optical, magnetic, or other readable medium.

[0022] FIG. 2 is a block diagram showing the schematic configuration of a dramatic video production support apparatus according to an embodiment of the present invention. The input video buffer 20 and the input music buffer 21 hold the input video information 11 and the input music information 12, respectively. The input video information 11 and the input music information 12 have been converted to digital data in advance and may be compressed in any of various compression formats, provided they can be processed. Although they are called buffers in this embodiment for convenience, any medium that stores digital data, such as a CD-ROM or hard disk, may be used.

[0023] From the video information and music information shown on the video feature display unit 22 and the music feature display unit 23, the user selects the video section 13 and the music section 14 and cuts out the useful sections by means of the video section cutout unit 25 and the music section cutout unit 26. To cut out the two sections, the user gives instructions through the user instruction unit 24. The video feature display unit 22 may display time-series information, characteristic images, and the like on a personal computer display monitor. Alternatively, for a video display unit consisting of a video deck and a television monitor, the monitor may be used together with information such as the video deck's timer or tape counter; the counter of a laser disc player may be used in the same way. The music feature display unit 23 may display feature quantities of the music, or may display the time series through, for example, the counter display of a CD player or other music player. Information added externally in advance, such as karaoke subtitles, may also be displayed as a feature quantity even though it is not a feature obtained from the music itself. The user looks at the displayed feature quantities, plays back the video and music information where necessary, and decides on the required video and music sections. The instruction may be given by pointing at the on-screen display with a pointing device such as a mouse, or by script-style commands specific to a particular operating system. Through this series of operations, the video section cutout unit 25 stores the required video section 13 from the video information in the video section buffer 27, and the music section cutout unit 26 stores the required music section 14 from the music information in the music section buffer 28. The user instruction unit 24 may consist of an interface screen shown on a monitor together with a keyboard and mouse, or may be a script created in advance with an editor on a computer; anything that can issue instructions on the data will do. The video section buffer 27 and the music section buffer 28 may be computer memory or a virtual disk; they need only be able to hold the digitized video section 13 and music section 14 temporarily, and storage that can be accessed at high speed is preferable.

[0024] Once the video section 13 and the music section 14 have been stored in the video section buffer 27 and the music section buffer 28, the video information extraction unit 30 and the music information extraction unit 31 extract the video climax candidates 15 and the music climax candidates 16, respectively. The video information extraction unit 30 and the music information extraction unit 31 are described in detail later. In this embodiment, the video climax candidates 15 are the start and end points of the target video section 13, the start and end points of scene changes within the video section 13, and the start and end points of camera work within the video section 13; the music climax candidates 16 are the start and end points of the target music section 14, the volume change points, the change points of the frequency distribution, and the like.

[0025] Once the video climax candidates 15 and the music climax candidates 16 have been extracted, the user selects a climax candidate matching model 17 from the climax candidate matching model group 29, which holds the models for matching the candidates 15 and 16, and the climax matching unit 32 pairs and matches the video climax candidates 15 and the music climax candidates 16 in accordance with the selected model 17. Possible pairings include the start point of the video section 13 with the start point of the music section 14, or the end point of the video section 13 with the end point of the music section 14. Another is the end point of camera work in the video section 13 with the change point at which the volume of the music section 14 reaches its maximum. Yet another is the end point of the video section 13 with the start point of the music section 14.
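The pairings described above can be sketched as a small lookup plus a subtraction. This is only an illustration under stated assumptions: the `MatchingModel` class, the candidate dictionaries, their key names, and the choice of always using the first candidate of each kind are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchingModel:
    """Hypothetical stand-in for a climax candidate matching model 17."""
    video_kind: str  # which video climax candidate to use, e.g. "camera_work_end"
    music_kind: str  # which music climax candidate to use, e.g. "max_volume"


def music_offset(model, video_candidates, music_candidates):
    """Return where (in seconds on the video timeline) the music section must
    start so that the two selected candidates fall on the same instant.

    Assumption: the first candidate of each kind is the one to be matched.
    """
    video_time = video_candidates[model.video_kind][0]
    music_time = music_candidates[model.music_kind][0]
    return video_time - music_time


# Candidate times in seconds, measured from the start of each section.
video = {"section_start": [0.0], "section_end": [30.0], "camera_work_end": [12.5]}
music = {"section_start": [0.0], "section_end": [45.0], "max_volume": [20.0]}

# Camera work end point matched with the maximum-volume point.
print(music_offset(MatchingModel("camera_work_end", "max_volume"), video, music))
# Video section end point matched with the music section start point.
print(music_offset(MatchingModel("section_end", "section_start"), video, music))
```

A negative offset means the music would have to begin before the video section does, so in practice its head would be trimmed by that amount.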

[0026] In addition to synchronizing the video section and the music section with reference to the two climax candidates specified by the climax candidate matching model 17, the climax matching unit 32 allows further processing to be described, such as adjusting the volume around the target music climax candidate, gradually lowering the volume, or removing only certain frequencies, and processes the music section accordingly. Depending on how the climax candidates are chosen, the durations may differ so that complete synchronization is not possible. In that case the music section is lengthened or shortened to fit the video section. If the input music information as a whole is shorter than the video section, the input music information is appended repeatedly. Other combination processing is also possible.
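As a rough sketch of the length adjustment and volume processing described above, the following assumes the music section is a plain list of samples. The function names are hypothetical, and shortening by truncation and a linear fade are assumptions; the text itself only says the section is lengthened or shortened, repeated when too short, and that the volume may be lowered gradually.

```python
def fit_to_length(samples, target_len):
    """Repeat or truncate the music samples so the result has target_len samples."""
    if not samples:
        raise ValueError("music section is empty")
    repeats = -(-target_len // len(samples))  # ceiling division
    return (samples * repeats)[:target_len]


def fade_out(samples, start):
    """Lower the volume linearly from index `start` to the end of the section."""
    n = len(samples) - start
    out = list(samples[:start])
    for i in range(n):
        gain = 1.0 - i / n  # 1.0 at the fade start, approaching 0 at the end
        out.append(samples[start + i] * gain)
    return out


looped = fit_to_length([1, 2, 3], 8)       # music shorter than the video: repeat it
trimmed = fit_to_length([1, 2, 3, 4], 2)   # music longer than the video: cut it
faded = fade_out([1.0, 1.0, 1.0, 1.0], 2)  # fade over the last two samples
print(looped, trimmed, faded)
```

A real implementation might instead time-stretch the music or cut at musically sensible points; this sketch only shows the bookkeeping.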

[0027] FIG. 3 is a block diagram of the video information extraction unit 30. The video start point / end point detection unit 41, the scene change detection unit 42, and the camera work detection unit 43 extract the video section start and end points 44, the scene changes 45, and the camera work start and end points 46, respectively, and these are output as the video climax candidates 15. When the target is uncompressed digital video, scene changes and camera work are extracted by computing the correlation between frames from the information of each pixel. When the video is compression-coded, for example MPEG-coded, they are extracted from the motion vector information and prediction error information contained in the MPEG data. Besides scene changes and camera work, the video climax candidates may also include subject motion, changes of subject, the entrance or exit of a particular subject, and so on.
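For the uncompressed case, inter-frame correlation can be illustrated with a minimal sketch. It assumes each frame is a flat list of pixel intensities and uses the Pearson correlation coefficient with a fixed threshold; the patent does not specify which correlation measure or threshold is used, so both are assumptions, as are the function names.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation between two frames given as equal-length pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Flat frames have no defined correlation; treat identical frames as matching.
        return 1.0 if a == b else 0.0
    return cov / sqrt(var_a * var_b)


def scene_changes(frames, threshold=0.5):
    """Return indices of frames whose correlation with the previous frame is low."""
    return [
        i
        for i in range(1, len(frames))
        if correlation(frames[i - 1], frames[i]) < threshold
    ]


frames = [
    [10, 20, 30, 40],
    [11, 21, 29, 41],  # nearly the same picture
    [40, 10, 35, 5],   # different picture: a cut
    [41, 11, 34, 6],
]
print(scene_changes(frames))  # [2]
```

Camera work such as a pan or zoom changes frames gradually rather than abruptly, so detecting it would need a different criterion than this single-threshold test.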

[0028] FIG. 4 is a block diagram of the music information extraction unit 31. The music section start point / end point detection unit 51, the maximum volume extraction unit 52, the volume change continuation section extraction unit 53, and the frequency distribution extraction unit 54 extract the following as the music climax candidates 16: the music section start and end points 55; the maximum volume point 56; the start and end points 57 of constant-volume sections, in which the same volume continues for a certain time; the maximum frequency density section end point 58, which is the end point of the section in which the frequency distribution per unit time is greatest; and the frequency distribution inflection points 59, at which the frequency distribution, observed per frequency component, changes.

[0029] The start point / end point 57 of the same-volume continuation section are found by observing the volume level. They are the start and end points of a section in which a constant volume continues, that is, a section in which the derivative of the volume is 0 and the volume itself is not 0. The maximum-frequency-density section end point 58 is found by observing the frequency components. It is the point at which a state in which many frequency components are present comes to an end. The frequency distribution inflection point 59 is found by observing the distribution from the high-frequency components to the low-frequency components. The transitions from present to absent and from absent to present are each counted per unit time, and a point at which the number of present-to-absent or absent-to-present transitions changes is taken as an inflection point. Such a point indicates a shift of the music from a high-frequency center to a low-frequency center, or the reverse, and is closely related to a change in the character of the tune.
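The definition of the same-volume continuation section (derivative of volume equal to 0 while the volume is non-zero) can be sketched in Python on a sampled volume envelope. The minimum section length `min_len` and the tolerance `tol` are assumptions added for illustration. The specification says only that the volume continues "for a certain time".

```python
def constant_volume_sections(volume, min_len=3, tol=0.0):
    """Find (start, end) index pairs where volume is constant and non-zero.

    `volume` is a list of volume levels sampled at a fixed interval.
    A discrete difference within `tol` plays the role of "derivative is 0".
    """
    sections = []
    start = None
    for i in range(1, len(volume)):
        flat = (
            abs(volume[i] - volume[i - 1]) <= tol
            and volume[i] > 0
            and volume[i - 1] > 0
        )
        if flat:
            if start is None:
                start = i - 1  # section begins at the previous sample
        elif start is not None:
            if i - start >= min_len:
                sections.append((start, i - 1))
            start = None
    # Close a section that runs to the end of the data.
    if start is not None and len(volume) - start >= min_len:
        sections.append((start, len(volume) - 1))
    return sections


if __name__ == "__main__":
    envelope = [0, 0, 0, 5, 5, 5, 5, 2, 3, 3]
    print(constant_volume_sections(envelope))
```

Silence is excluded by the non-zero check, matching the condition stated in the paragraph above.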

[0030] Other feature points can also be used as the change point of the frequency distribution. For example, the frequency distribution is first divided into a high-frequency band and a low-frequency band, and the power at each frequency is extracted. The number of frequencies whose power exceeds a predetermined threshold is counted separately in the high-frequency band and the low-frequency band, and the change in each count is tracked. The point at which the low-band count and the high-band count change places is taken as the change point of the frequency distribution. Various other methods of finding the change point of the frequency distribution exist. The method described here uses only counts, but the density along the frequency axis may be used as the feature to be compared, or the distribution may be evaluated after weighting by the magnitude of the power. The evaluation need not be limited to two distributions, high and low; it may be carried out over a larger number of bands. In addition to the items above, information such as a change in rhythm, a change in tempo, or, for music with vocals, a change in the lyrics may also be used as the music climax candidates 16.
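The two-band counting method of paragraph [0030] can be sketched in Python as follows. The sketch assumes the power spectrum of each time frame has already been computed (for example by a short-time Fourier transform) and is passed in as a list of per-frequency power values ordered from low to high frequency. The parameter names `split` and `threshold` are illustrative.

```python
def band_counts(spectrum, split, threshold):
    """Count bins above `threshold` in the low band and in the high band."""
    low = sum(1 for p in spectrum[:split] if p > threshold)
    high = sum(1 for p in spectrum[split:] if p > threshold)
    return low, high


def distribution_change_points(spectra, split, threshold):
    """Return frame indices where the low/high count ordering reverses."""
    changes = []
    prev_sign = 0  # +1: low band dominates, -1: high band dominates
    for t, spectrum in enumerate(spectra):
        low, high = band_counts(spectrum, split, threshold)
        sign = (low > high) - (low < high)
        if sign == 0:
            continue  # a tie is not treated as a reversal
        if prev_sign != 0 and sign != prev_sign:
            changes.append(t)
        prev_sign = sign
    return changes


if __name__ == "__main__":
    spectra = [
        [5, 5, 5, 0, 0, 1],  # low band dominant
        [5, 5, 0, 0, 5, 0],
        [0, 1, 0, 5, 5, 5],  # high band takes over
        [0, 0, 0, 5, 5, 0],
        [5, 5, 5, 0, 0, 0],  # low band returns
    ]
    print(distribution_change_points(spectra, split=3, threshold=2))
```

Weighting each bin by its power, or using more than two bands, as the paragraph suggests, would change only `band_counts`.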

[0031] Next, the climax candidate matching model 17 is described. The climax candidate matching model 17 reflects techniques used in video production. For example, it corresponds to a record of the operations needed to realize the four production techniques described above in terms of parameters.

[0032] The models corresponding to each of the four examples above are listed below.

1) For video that contains camera work, the volume of the music is raised in step with the camera work and brought to its maximum when the camera work ends, expressing a sense of elation and exciting the viewer.

[0033] In this case, the end point of the camera work, taken from the video climax candidate points, and the maximum volume point, taken from the music climax candidates 16, are together used as the reference for synchronization.

2) The music is started or ended at the moment the video switches, which makes an impact on the viewer and makes the video shown at the start or end of the music more memorable.

[0034] In this case, a scene change, taken from the video climax candidates 15, and the start point or end point of the music section, taken from the music climax candidates 16, are together used as the reference for synchronization.

3) As the video scene progresses, the volume of the music is gradually lowered, or the performance is gradually slowed, giving the viewer a feeling of calm or evoking sadness.

[0035] In this case, the end point of the video section, taken from the video climax candidates 15, and the maximum volume point of the music section, taken from the music climax candidates 16, are together used as the reference for synchronization.

4) A change in the character of the music is matched with a change of shot in the video, conveying to the viewer a shift from intensity to stillness, or the reverse.

[0036] In this case, a scene change in the video section, taken from the video climax candidates 15, and the frequency distribution inflection point of the music section, taken from the music climax candidates 16, are together used as the reference for synchronization.
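The four models above each pair one kind of video climax candidate with one kind of music climax candidate. One possible way to hold such a model group in software is sketched below in Python. The class name `MatchModel`, the field names, and the string labels are hypothetical. The specification describes the models only in prose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchModel:
    """One climax candidate matching model: which two candidates to align."""
    technique: str        # short description of the production technique
    video_candidate: str  # kind of video climax candidate used as reference
    music_candidate: str  # kind of music climax candidate used as reference


# Model group keyed by the numbering 1) to 4) used in the text.
MATCH_MODELS = {
    1: MatchModel("raise volume with camera work",
                  "camera_work_end", "max_volume_point"),
    2: MatchModel("start or end music on a video switch",
                  "scene_change", "music_section_start_or_end"),
    3: MatchModel("fade or slow music as the scene progresses",
                  "video_section_end", "max_volume_point"),
    4: MatchModel("match change of tune with change of shot",
                  "scene_change", "frequency_inflection_point"),
}


def reference_pair(model_id):
    """Return the (video, music) candidate kinds for the chosen model."""
    model = MATCH_MODELS[model_id]
    return model.video_candidate, model.music_candidate


if __name__ == "__main__":
    for model_id in sorted(MATCH_MODELS):
        print(model_id, reference_pair(model_id))
```

Adding a further production technique then amounts to adding one more entry to the table.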

[0037] This embodiment targets the video production techniques described above, and the climax candidate matching model 17 described above makes such production easier.

[0038] Furthermore, if climax candidate matching models that satisfy other production techniques are added to the model group and made selectable, the production of more polished video works can be supported.

[0039] FIG. 5 shows how the climax matching unit 32 combines video and music to produce a dramatic video. In this case, the figure shows the process in which the camera work end point, from the video climax candidates 15, and the maximum volume point, from the music climax candidates 16, are synchronized by the climax candidate matching model 17 to produce the final dramatic video work 19.

[0040] FIG. 6 shows how the climax matching unit 32 combines video and music to produce the dramatic video work 19. In this case, the figure shows the process in which a scene change, from the video climax candidates 15, and a change point of the frequency distribution, from the music climax candidates 16, are synchronized by the climax candidate matching model 17 to produce the final dramatic video work 19. The frequency-distribution maximum-density end point is used here as the change point of the frequency distribution.

[0041] As described above, once the video climax candidates and the music climax candidates have been extracted in advance, all that remains is to take one of the climax candidates as the reference, match the candidates to each other, and synchronize. The same processing can be applied to other climax candidates.
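Once one video climax candidate and one music climax candidate have been chosen, synchronizing them reduces to shifting the music on the video timeline. The Python sketch below assumes all times are in seconds and that the music is simply shifted without time stretching; the function name `place_music` is hypothetical.

```python
def place_music(video_point, music_point, music_len, video_len):
    """Align a music climax point with a video climax point.

    Returns (offset, audible_start, audible_end) on the video timeline:
    - offset: where the start of the music falls (may be negative, meaning
      the beginning of the music would precede the video and must be cut),
    - audible_start / audible_end: the part of the timeline actually covered.
    """
    offset = video_point - music_point
    audible_start = max(offset, 0.0)
    audible_end = min(offset + music_len, video_len)
    return offset, audible_start, audible_end


if __name__ == "__main__":
    # Camera work ends at 12 s; the music is loudest 8 s into the track.
    print(place_music(12.0, 8.0, music_len=20.0, video_len=25.0))
    # Climax early in the video: the first 3 s of music must be trimmed.
    print(place_music(5.0, 8.0, music_len=20.0, video_len=25.0))
```

Any part of the video timeline left uncovered would then be handled by the lengthening, shortening, or repetition of the music section described earlier in the specification.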

[0042]

[Effects of the Invention] As described above, when a polished video work is produced and edited by combining video and music information, the present invention extracts in advance, as climax candidates, the information used in video production techniques, such as scene changes in the video, the start and end points of camera work, and the change points of the music volume and frequency waveform, and feeds these candidates into a video production model constructed beforehand. Processing based on basic video production techniques is thereby carried out automatically, so that a polished video work can be created easily.

[0043] The present invention can be realized as a device by installing software that implements the method on a general-purpose computer.

[Brief description of the drawings]

FIG. 1 is a flowchart of a dramatic video production support method according to an embodiment of the present invention.

FIG. 2 is a block diagram of a dramatic video production support device according to an embodiment of the present invention.

FIG. 3 is a block diagram of the video information extraction unit 30.

FIG. 4 is a block diagram of the music information extraction unit 31.

FIG. 5 is a schematic diagram of dramatic video production.

FIG. 6 is a schematic diagram of dramatic video production.

[Explanation of symbols]

1 Video section / music extraction stage
2 Video information extraction stage
3 Music information extraction stage
4 Climax matching stage
5 Information addition stage
11 Video information
12 Music information
13 Video section
14 Music section
15 Video climax candidate
16 Music climax candidate
17 Climax candidate matching model
18 Synchronization signal
19 Dramatic video work
20 Input video buffer
21 Input music buffer
22 Video feature display unit
23 Music feature display unit
24 User instruction unit
25 Video section cutout unit
26 Music section cutout unit
27 Video section buffer
28 Music section buffer
29 Climax candidate matching model group
30 Video information extraction unit
31 Music information extraction unit
32 Climax matching unit
33 Information addition unit
34 Video work memory
41 Video start point / end point detection unit
42 Scene change detection unit
43 Camera work detection unit
44 Video section start point / end point
45 Scene change
46 Camera work start point / end point
51 Music section start point / end point detection unit
52 Maximum volume extraction unit
53 Volume change continuation section extraction unit
54 Frequency distribution extraction unit
55 Music section start point / end point
56 Maximum volume point
57 Same-volume continuation section start point / end point
58 Maximum-frequency-density section end point
59 Frequency distribution inflection point

Continued from the front page: (72) Inventor: Yoshinobu Tonomura, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

Claims

[Claims]

1. A dramatic video production support method for adding music information to video information in accordance with an instruction so as to give the video a dramatic effect, the method comprising: a video section / music section extraction step of extracting a video section and a music section desired by the user from the video information and the music information; a video information extraction step of extracting, from the video section, the start point and end point of the video section, the start point and end point of a scene change, and the start point and end point of camera work, which are video climax candidate information; a music information extraction step of extracting, from the music section, the start point and end point of the music section, which are music climax candidate information; and a climax matching step of matching the video climax candidate information with the music climax candidate information using a climax candidate matching model, or a climax candidate matching model selected from a plurality of climax candidate matching models, processing the music section in accordance with the matched climax candidate information so that it can be synchronized with the video section, and synchronizing the music section with the video section.

2. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches the end point of camera work extracted as a video climax candidate with a volume change point extracted as a music climax candidate, and adds music to the video with those values as the reference.

3. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches the end point of camera work extracted as a video climax candidate with a frequency change point extracted as a music climax candidate, and adds music to the video with those values as the reference.

4. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches the start point of camera work extracted as a video climax candidate with a volume change point extracted as a music climax candidate, and adds music to the video with those values as the reference.

5. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches the start point of camera work extracted as a video climax candidate with a frequency change point extracted as a music climax candidate, and adds music to the video with those values as the reference.

6. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches a scene change extracted as a video climax candidate with a volume change point extracted as a music climax candidate, and adds music to the video with those values as the reference.

7. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches a scene change extracted as a video climax candidate with a frequency change point extracted as a music climax candidate, and adds music to the video with those values as the reference.

8. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches either the start point or the end point of the video section extracted as a video climax candidate with either the volume change point or the frequency change point extracted as a music climax candidate, and adds music to the video with those values as the reference.

9. The dramatic video production support method according to claim 1, wherein one of the climax candidate matching models prepared in advance is a model that matches any one of the start point and end point of a scene change in the video section and the start point and end point of camera work, extracted as video climax candidates, with either the start point or the end point of the music section extracted as a music climax candidate, and adds music to the video with those values as the reference.

10. A dramatic video production support device that adds music information to video information in accordance with an instruction so as to produce a more dramatic effect, the device comprising: means for extracting a video section and a music section desired by the user from the video information and the music information; video information extraction means for extracting, from the video section, audio information in the video including people's conversation, the start point and end point of the video section, the start point of a scene change, and the start point and end point of camera work, which are video climax candidate information; music information extraction means for extracting, from the music section, the start point and end point of the music section and the change points of volume and frequency, which are music climax candidate information; and climax matching means for matching the video climax candidate information with the music climax candidate information using a climax candidate matching model prepared in advance, or a climax candidate matching model selected from a plurality of climax candidate matching models, processing the music section in accordance with the matched climax candidates so that it can be synchronized with the video section, and synchronizing the music section with the video section.