JP7456232B2

JP7456232B2 - Photo movie generation system, photo movie generation device, user terminal, photo movie generation method, and program

Info

Publication number: JP7456232B2
Application number: JP2020055808A
Authority: JP
Inventors: 健秀岸本; 健二朗村田
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2024-03-27
Anticipated expiration: 2040-03-26
Also published as: JP2021157007A

Description

The present invention relates to a photo movie generation system and the like that generates and provides photo movies.

Some cloud photo storage services offer a function that automatically creates a photo movie from photos stored in cloud storage (Non-Patent Document 1). For example, in the cloud photo storage service of Non-Patent Document 1, when a user selects the photos or videos to be made into a movie from the images saved in cloud storage and then selects a desired theme from several themes prepared in advance, the service automatically generates the movie.

A method has also been reported for adding music that matches the impression of a captured video (Non-Patent Document 2). Non-Patent Document 2 proposes a method that extracts features of a captured video (video features), extracts features of music (music features), calculates the relationship between the video and the impressions of melody and rhythm, and generates music that matches the user's impression.

Patent Document 1, on the other hand, describes a sound reproduction device comprising an expression acquisition unit that acquires an acoustic expression representing an impression or timing in arbitrary acoustic data, an effect imparting unit that applies visual effects and the like to images and the like according to the acoustic expression, and a display that reproduces the acoustic data together with the images and the like to which the effects have been applied. This makes it possible to acquire the acoustic expression of audio content such as commercially available music data and to apply visual effects and the like according to that acoustic expression.

Patent Document 1: Japanese Patent Application Publication No. 2013-114088

Non-Patent Document 1: Google (registered trademark) Photos, "Create a movie", [online], [searched on March 23, 2020], Internet <URL: https://photos.google.com/movies/create>
Non-Patent Document 2: Yurina Shimizu and 4 others, "Automatic material selection for video BGM based on impression estimation from video features", NICOGRAPH 2016, pp. 177-184

However, with the movie creation function of Non-Patent Document 1 described above, the user can select the music (BGM) to be added to the photo movie, but the selectable music is limited either to default categories (dramatic, electronic, rockin', upbeat, and so on) or to those songs in the music app on the user terminal that are not protected by DRM (Digital Rights Management). The method proposed in Non-Patent Document 2, in turn, generates music that matches the impression of a video. Therefore, neither method allows music obtained from a commercially available CD or a music distribution service to be used as the BGM of a photo movie. In contrast, the method of Patent Document 1 extracts acoustic expressions from commercially available music data and makes it possible to apply visual effects and the like according to those expressions. However, because the visual effect set for each expression is fixed, the user may become bored after several viewings.

The present invention has been made in view of the above problems, and its object is to provide a photo movie generation system and the like that can play a user's favorite songs together with a photo movie, so that the user can enjoy listening to music.

A first invention for solving the above problems is a photo movie generation system in which a server and a user terminal are communicatively connected via a network. The user terminal comprises: music designation means that accepts designation of a song to be used in a photo movie and transmits the sound source data of the designated song, or a song ID that is identification information of the song, to the server; image designation means that accepts designation of a plurality of images to be used as image materials of the photo movie and transmits the designated image data or image designation information to the server; and playback means that receives and plays back the photo movie transmitted from the server. The server comprises: acoustic feature data acquisition means that acquires acoustic feature data of the song designated on the user terminal; image acquisition means that acquires the images to be used in the photo movie; photo movie generation means that uses the images acquired by the image acquisition means to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song, and transmits it to the user terminal; and storage means that stores in advance an effect template in which at least some effects are associated with a number of used images of two or more, and an effect correspondence table that associates the acoustic feature data with the effect templates suited to it and in which a plurality of patterns of effect application examples are defined. The storage means of the server stores, in association with the song ID, movie configuration information, which is information constituting the photo movie and which holds the number of used images acquired from the effect template and the applied effects acquired from the effect correspondence table. The photo movie generation means refers to the movie configuration information stored in association with the designated song ID; if no such movie configuration information is stored in the storage means, it reads an arbitrary pattern from the effect correspondence table; if such movie configuration information is already stored in the storage means, it reads an unused pattern from the effect correspondence table; and it generates the photo movie according to the effect correspondence table of the pattern that was read.
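The pattern selection rule of the first invention can be illustrated with a minimal Python sketch. The table contents, the function names `choose_pattern` and `generate_config`, and the dictionary layout of the stored movie configuration information are all hypothetical and chosen only for illustration. The source does not say what happens once every pattern has been used for a song, so returning `None` in that case is an assumption.

```python
import random
from typing import Dict, List, Optional

# Hypothetical effect correspondence table:
# pattern name -> {acoustic feature label -> effect template name}
EFFECT_TABLE: Dict[str, Dict[str, str]] = {
    "pattern_a": {"intro": "fade", "chorus": "zoom"},
    "pattern_b": {"intro": "slide", "chorus": "collage"},
    "pattern_c": {"intro": "zoom", "chorus": "fade"},
}


def choose_pattern(
    song_id: str,
    config_store: Dict[str, List[dict]],
    table: Dict[str, Dict[str, str]],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick any pattern for a new song ID, or an unused one for a known song ID."""
    if rng is None:
        rng = random.Random()
    # Patterns already recorded in the movie configuration information for this song.
    used = {cfg["pattern"] for cfg in config_store.get(song_id, [])}
    # With nothing stored, every pattern is a candidate ("arbitrary pattern").
    candidates = sorted(name for name in table if name not in used)
    if not candidates:
        return None  # assumption: the source does not define this case
    return rng.choice(candidates)


def generate_config(
    song_id: str,
    config_store: Dict[str, List[dict]],
    table: Dict[str, Dict[str, str]],
    rng: Optional[random.Random] = None,
) -> Optional[dict]:
    """Choose a pattern and store the resulting configuration under the song ID."""
    pattern = choose_pattern(song_id, config_store, table, rng)
    if pattern is None:
        return None
    config = {"pattern": pattern, "effects": dict(table[pattern])}
    config_store.setdefault(song_id, []).append(config)
    return config
```

Calling `generate_config` repeatedly with the same song ID yields a different pattern each time until the table is exhausted, which corresponds to the claim's requirement that an already stored configuration leads to an unused pattern being read.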

According to the first invention, the server acquires the acoustic feature data of the song designated on the user terminal and the images to be used in the photo movie, uses the acquired images to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song, and transmits it to the user terminal. The user terminal receives and plays back the photo movie. As a result, a photo movie whose display changes according to changes in the song can be generated for a song designated by the user and provided to the user terminal, so the user can listen to a favorite song accompanied by a photo movie and enjoy listening to music even more.

In the photo movie generation system of the first invention, the server comprises storage means that stores movie configuration information, which is information constituting the photo movie, in association with the song ID, and the photo movie generation means refers to the movie configuration information stored in association with the designated song ID and generates a photo movie whose configuration differs from the movie configuration information stored in the storage means. This allows different movie representations to be generated for the same song and provided to the user terminal, giving the user a variety of enjoyment.

It is also desirable that the storage means of the server stores the acoustic feature data of the song in association with the song ID, and that the acoustic feature data acquisition means acquires the acoustic feature data stored in the storage means. Because acoustic analysis can then be omitted for a song that has already been analyzed, the photo movie can be provided in a short time. Alternatively, the acoustic feature data acquisition means of the server may acquire the acoustic feature data of the song designated on the user terminal from a music distribution server communicatively connected via a network. This allows the acoustic feature data to be obtained without performing acoustic analysis on the server, reducing the load on the server.
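As a rough illustration of storing acoustic feature data per song ID so that analysis runs only once, the following Python sketch wraps an analysis function in a small cache. The class name `FeatureCache`, the `analysis_runs` counter, and the toy analyzer are hypothetical. The actual acoustic analysis and the format of the feature data are not specified in this passage.

```python
from typing import Callable, Dict, List

Features = List[dict]


class FeatureCache:
    """Stores acoustic feature data keyed by song ID and analyzes on first use only."""

    def __init__(self, analyze: Callable[[bytes], Features]) -> None:
        self._analyze = analyze
        self._by_song_id: Dict[str, Features] = {}
        self.analysis_runs = 0  # counts how often the analysis actually ran

    def get(self, song_id: str, load_audio: Callable[[], bytes]) -> Features:
        """Return stored features, running the analysis only for unseen song IDs."""
        if song_id not in self._by_song_id:
            self.analysis_runs += 1
            self._by_song_id[song_id] = self._analyze(load_audio())
        return self._by_song_id[song_id]


def toy_analyze(audio: bytes) -> Features:
    """Placeholder analysis (assumption): reports only the byte length of the audio."""
    return [{"start": 0.0, "label": "whole_song", "size": len(audio)}]
```

A second request for the same song ID returns the stored features without invoking the analyzer or loading the audio again, which is the time saving the paragraph above describes.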

The server may further comprise acoustic analysis means that performs acoustic analysis on the song to obtain acoustic feature data, and the acoustic feature data acquisition means may acquire the acoustic feature data obtained by the acoustic analysis means. In this case, the server may acquire the sound source data of the song either from a music distribution server communicatively connected via a network or from the user terminal. This makes it possible to generate a photo movie that matches the song the user wants.

It is also desirable that the photo movie generation means of the server generates the photo movie without adding a sound source and outputs it in association with the song ID of the song, and that the playback means of the user terminal acquires the sound source data of the song corresponding to the song ID associated with the photo movie and plays back the acquired sound source data in synchronization with the photo movie. This yields a photo movie with a smaller file size than a photo movie with music, saving the storage capacity needed to store photo movies.
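One possible way to picture playback of a silent photo movie linked to a song ID is sketched below in Python. The source does not describe how synchronization is achieved, so driving the displayed frame from the audio playback position is an assumption, as are the names `SilentPhotoMovie`, `resolve_audio`, and `frame_for_audio_time` and the use of a fixed frame rate.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SilentPhotoMovie:
    """A photo movie without an audio track, linked to its song by song ID."""
    song_id: str
    frame_count: int
    fps: float


def resolve_audio(movie: SilentPhotoMovie, audio_sources: Dict[str, str]) -> str:
    """Look up the sound source for the song ID attached to the movie."""
    try:
        return audio_sources[movie.song_id]
    except KeyError:
        raise LookupError(f"no sound source available for {movie.song_id}") from None


def frame_for_audio_time(movie: SilentPhotoMovie, audio_seconds: float) -> int:
    """Map the current audio position to the frame that should be on screen."""
    if audio_seconds < 0:
        raise ValueError("audio position must not be negative")
    index = int(audio_seconds * movie.fps)
    # Hold the last frame if the audio runs longer than the movie.
    return min(index, movie.frame_count - 1)
```

Using the audio position as the master clock keeps the images aligned with the song even if the terminal obtains the sound source from a different place than the movie file.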

It is also desirable that the server further comprises image material generation means that generates image materials from the images acquired by the image acquisition means, and that the photo movie generation means generates the photo movie using the image materials generated by the image material generation means.
It is also desirable that the user terminal further comprises image transmission means that transmits the images to be used in the photo movie to the server, and that the image acquisition means of the server receives the images transmitted from the user terminal. This makes it possible to generate a photo movie using the images designated by the user and the image materials generated from those images.

A second invention is a photo movie generation device comprising: image acquisition means that acquires images; acoustic feature data acquisition means that acquires acoustic feature data of a song; photo movie generation means that uses the images acquired by the image acquisition means to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song; and storage means that stores in advance an effect template in which at least some effects are associated with a number of used images of two or more, and an effect correspondence table that associates the acoustic feature data with the effect templates suited to it and in which a plurality of patterns of effect application examples are defined. The storage means stores, in association with a song ID that is identification information of a song, movie configuration information, which is information constituting the photo movie and which holds the number of used images acquired from the effect template and the applied effects acquired from the effect correspondence table. The photo movie generation means refers to the movie configuration information stored in association with the designated song ID; if no such movie configuration information is stored in the storage means, it reads an arbitrary pattern from the effect correspondence table; if such movie configuration information is already stored in the storage means, it reads an unused pattern from the effect correspondence table; and it generates the photo movie according to the effect correspondence table of the pattern that was read.
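The relationship between the effect template, one pattern of the effect correspondence table, and the resulting movie configuration information might be modeled as in the Python sketch below. The template names, image counts, feature labels, and the field layout of each configuration entry are hypothetical. Reusing images cyclically when the user supplies fewer images than the effects require is an assumption that the source does not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EffectTemplate:
    """An effect and the number of images it displays at once."""
    name: str
    image_count: int


# Hypothetical effect templates; some use two or more images.
TEMPLATES: Dict[str, EffectTemplate] = {
    "fade": EffectTemplate("fade", 1),
    "split": EffectTemplate("split", 2),
    "collage": EffectTemplate("collage", 4),
}

# One hypothetical pattern of the effect correspondence table:
# acoustic feature label -> effect template name.
PATTERN: Dict[str, str] = {"quiet": "fade", "rising": "split", "climax": "collage"}


def build_movie_config(
    segments: List[Tuple[float, str]],
    pattern: Dict[str, str],
    templates: Dict[str, EffectTemplate],
    images: List[str],
) -> List[dict]:
    """Turn (start time, feature label) segments into per-segment effect entries."""
    if not images:
        raise ValueError("at least one image is required")
    config = []
    cursor = 0
    for start, label in segments:
        template = templates[pattern[label]]
        # Assumption: images are consumed in order and reused cyclically.
        picked = [images[(cursor + i) % len(images)] for i in range(template.image_count)]
        cursor += template.image_count
        config.append({
            "start": start,
            "effect": template.name,
            "image_count": template.image_count,
            "images": picked,
        })
    return config
```

Each entry records both the applied effect and the number of images it uses, mirroring the two pieces of information that the movie configuration information is said to hold.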

The second invention makes it possible to generate, based on the acoustic feature data of a song, a photo movie whose display changes according to changes in the song.

A third invention is a photo movie generation device comprising: image acquisition means that acquires images; sound source data acquisition means that acquires sound source data of a song; acoustic analysis means that analyzes the sound source data to acquire acoustic feature data; photo movie generation means that uses the images acquired by the image acquisition means to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song; and storage means that stores in advance an effect template in which at least some effects are associated with a number of used images of two or more, and an effect correspondence table that associates the acoustic feature data with the effect templates suited to it and in which a plurality of patterns of effect application examples are defined. The storage means stores, in association with a song ID that is identification information of a song, movie configuration information, which is information constituting the photo movie and which holds the number of used images acquired from the effect template and the applied effects acquired from the effect correspondence table. The photo movie generation means refers to the movie configuration information stored in association with the designated song ID; if no such movie configuration information is stored in the storage means, it reads an arbitrary pattern from the effect correspondence table; if such movie configuration information is already stored in the storage means, it reads an unused pattern from the effect correspondence table; and it generates the photo movie according to the effect correspondence table of the pattern that was read.

The third invention makes it possible to perform acoustic analysis on the acquired sound source data to obtain acoustic feature data, and to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song.

A fourth invention is a user terminal that can be communicatively connected via a network to a server that is the photo movie generation device of the second or third invention, the user terminal comprising: music designation means that accepts designation of a song and transmits the sound source data of the designated song, or a song ID that is identification information of the song, to the server; image designation means that accepts designation of a plurality of images and transmits the designated image data or image designation information to the server; and playback means that receives and plays back a photo movie transmitted from the server, the photo movie displaying a plurality of images with a display that changes according to changes in the song.
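The information that the user terminal sends to the server can be pictured as a simple request object, sketched below in Python. The class name `MovieRequest`, its field names, and the validation messages are hypothetical. The only rules taken from the text are that the song is given either as sound source data or as a song ID, and that a plurality of images is given either as image data or as designation information. Whether the two image forms may be mixed in one request is not stated, and the sketch allows it as an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MovieRequest:
    """What the user terminal sends to the server (hypothetical layout)."""
    song_id: Optional[str] = None          # identification information of the song
    audio_data: Optional[bytes] = None     # sound source data, if sent directly
    image_data: List[bytes] = field(default_factory=list)  # images sent directly
    image_refs: List[str] = field(default_factory=list)    # image designation information


def validate_request(request: MovieRequest) -> List[str]:
    """Return a list of problems; an empty list means the request is usable."""
    problems = []
    if request.song_id is None and request.audio_data is None:
        problems.append("no song specified")
    # "A plurality of images" is read here as at least two images in total.
    if len(request.image_data) + len(request.image_refs) < 2:
        problems.append("fewer than two images specified")
    return problems
```

A request that carries only a song ID and references to images already held by the server would keep the upload small, while a request with raw audio and image data would let the server work without any prior stored material.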

According to the fourth invention, when a song is designated on the user terminal and the sound source data of the song, or the song ID that is identification information of the song, is transmitted to the server, the user terminal can receive from the server and play back a photo movie whose display changes according to changes in the song.

A fifth invention is a photo movie generation method in a photo movie generation system in which a server and a user terminal are communicatively connected via a network, the method including: a step in which the server stores in advance an effect template in which at least some effects are associated with a number of used images of two or more, and an effect correspondence table that associates acoustic feature data with the effect templates suited to it and in which a plurality of patterns of effect application examples are defined; a step in which the user terminal accepts designation of a song to be used in a photo movie and transmits the sound source data of the designated song, or a song ID that is identification information of the song, to the server; a step in which the user terminal accepts designation of a plurality of images to be used as image materials of the photo movie and transmits the designated image data or image designation information to the server; a step in which the server acquires the images; a step in which the server acquires acoustic feature data of the song designated on the user terminal; a step in which the server uses the images to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song, and transmits it to the user terminal; and a step in which the user terminal receives and plays back the photo movie transmitted from the server. In the step in which the server generates the photo movie and transmits it to the user terminal, movie configuration information for the generated photo movie, which is information constituting the photo movie and which holds the number of used images acquired from the effect template and the applied effects acquired from the effect correspondence table, is stored in the server in association with the song ID. When a photo movie is generated, the movie configuration information stored in association with the designated song ID is referred to; if no such movie configuration information is stored in the server, an arbitrary pattern is read from the effect correspondence table; if such movie configuration information is already stored in the server, an unused pattern is read from the effect correspondence table; and the photo movie is generated according to the effect correspondence table of the pattern that was read.

According to the fifth invention, the server acquires the acoustic feature data of the song designated on the user terminal and the images to be used in the photo movie, uses the acquired images to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song, and transmits it to the user terminal. The user terminal receives and plays back the photo movie. As a result, a photo movie whose display changes according to changes in the song can be generated for a song designated by the user and provided to the user terminal, so the user can listen to a favorite song accompanied by a photo movie and enjoy listening to music even more.

A sixth invention is a program for causing a computer to function as: image acquisition means that acquires images; acoustic feature data acquisition means that acquires acoustic feature data of a song; and photo movie generation means that uses the images acquired by the image acquisition means to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song. The program further causes the computer to function as storage means that stores in advance an effect template in which at least some effects are associated with a number of used images of two or more, and an effect correspondence table that associates the acoustic feature data with the effect templates suited to it and in which a plurality of patterns of effect application examples are defined. The storage means stores, in association with a song ID that is identification information of a song, movie configuration information, which is information constituting the photo movie and which holds the number of used images acquired from the effect template and the applied effects acquired from the effect correspondence table. The photo movie generation means refers to the movie configuration information stored in association with the designated song ID; if no such movie configuration information is stored in the storage means, it reads an arbitrary pattern from the effect correspondence table; if such movie configuration information is already stored in the storage means, it reads an unused pattern from the effect correspondence table; and it generates the photo movie according to the effect correspondence table of the pattern that was read.

The sixth invention makes it possible to cause a computer to function as the photo movie generation device of the second invention.

A seventh invention is a program for causing a computer to function as: image acquisition means that acquires images; sound source data acquisition means that acquires sound source data of a song; acoustic analysis means that analyzes the sound source data to acquire acoustic feature data; and photo movie generation means that uses the images acquired by the image acquisition means to generate, based on the acoustic feature data, a photo movie whose display changes according to changes in the song. The program further causes the computer to function as storage means that stores in advance an effect template in which at least some effects are associated with a number of used images of two or more, and an effect correspondence table that associates the acoustic feature data with the effect templates suited to it and in which a plurality of patterns of effect application examples are defined. The storage means stores, in association with a song ID that is identification information of a song, movie configuration information, which is information constituting the photo movie and which holds the number of used images acquired from the effect template and the applied effects acquired from the effect correspondence table. The photo movie generation means refers to the movie configuration information stored in association with the designated song ID; if no such movie configuration information is stored in the storage means, it reads an arbitrary pattern from the effect correspondence table; if such movie configuration information is already stored in the storage means, it reads an unused pattern from the effect correspondence table; and it generates the photo movie according to the effect correspondence table of the pattern that was read.

The seventh invention makes it possible to cause a computer to function as the photo movie generation device of the third invention.

An eighth invention is a program for causing a computer that can be communicatively connected via a network to a server executing the program of the sixth or seventh invention to function as: music designation means that accepts designation of a song and transmits the sound source data of the designated song, or a song ID that is identification information of the song, to the server; image designation means that accepts designation of a plurality of images and transmits the designated image data or image designation information to the server; and playback means that receives and plays back a photo movie transmitted from the server, the photo movie displaying a plurality of images with a display that changes according to changes in the song.

第８の発明により、コンピュータを第１及び第４の発明のユーザ端末として機能させることが可能となる。 The eighth invention allows a computer to function as the user terminal of the first and fourth inventions.

本発明により、ユーザの好みの楽曲をフォトムービー付きで再生可能とし、ユーザが楽しんで音楽を視聴することができるフォトムービー生成システム等を提供することが可能となる。 According to the present invention, it is possible to provide a photo movie generation system and the like that allows the user to play back a user's favorite music along with a photo movie, thereby allowing the user to enjoy listening to the music.

フォトムービー生成システム１の全体構成を示す図 Diagram showing the overall configuration of the photo movie generation system 1
サーバ２のハードウェア構成を示す図 Diagram showing the hardware configuration of the server 2
ユーザ端末３のハードウェア構成を示す図 Diagram showing the hardware configuration of the user terminal 3
フォトムービー生成システム１の機能構成を示す図 Diagram showing the functional configuration of the photo movie generation system 1
音響特徴量データ４５のデータ構成例を示す図 Diagram showing an example of the data structure of the acoustic feature data 45
フォトムービー生成システム１が実行する処理の流れを示すフローチャート Flowchart showing the flow of processing executed by the photo movie generation system 1
音響解析処理の流れを示すフローチャート Flowchart showing the flow of the acoustic analysis processing
エフェクトテンプレート２２１の例を示す図 Diagram showing an example of the effect templates 221
エフェクト対応テーブル２２３の例を示す図 Diagram showing an example of the effect correspondence table 223
フォトムービー生成処理の流れを示すフローチャート Flowchart showing the flow of the photo movie generation processing
フォトムービー６１、６２、６３の例を示す図 Diagram showing examples of photo movies 61, 62, and 63
ムービー構成情報２２４の例を示す図 Diagram showing an example of the movie configuration information 224
フォトムービー生成システム１Ａの全体構成を示す図 Diagram showing the overall configuration of the photo movie generation system 1A
音楽配信サーバ４のハードウェア構成を示す図 Diagram showing the hardware configuration of the music distribution server 4
フォトムービー生成システム１Ａの機能構成を示す図 Diagram showing the functional configuration of the photo movie generation system 1A
音源データ４４のデータ構成例を示す図 Diagram showing an example of the data structure of the sound source data 44
フォトムービー生成システム１Ａが実行する処理の流れを示すフローチャート Flowchart showing the flow of processing executed by the photo movie generation system 1A

以下、図面に基づいて本発明の好適な実施形態について詳細に説明する。 Hereinafter, preferred embodiments of the present invention will be described in detail based on the drawings.

［第１の実施形態］ [First embodiment]
図１は本発明の第１の実施形態に係るフォトムービー生成システム１の全体構成を示す図である。図１に示すように、本発明に係るフォトムービー生成システム１は、ユーザ端末３及びサーバ２がネットワーク５を介して通信接続される。サーバ２は、ユーザ端末３からの要求に応答してフォトムービーを生成するコンピュータ（フォトムービー生成装置）であり、ユーザ端末３と通信接続される。 FIG. 1 is a diagram showing the overall configuration of a photo movie generation system 1 according to a first embodiment of the present invention. As shown in FIG. 1, in the photo movie generation system 1 according to the present invention, a user terminal 3 and a server 2 are communicably connected via a network 5. The server 2 is a computer (photo movie generation device) that generates a photo movie in response to a request from the user terminal 3, and is communicatively connected to the user terminal 3.

ユーザ端末３は、ユーザが利用する電子機器であり、フォトムービー生成システム１を利用するための専用のアプリケーションプログラム（以下、フォトムービーアプリという）を搭載する。またはユーザ端末３は、サーバ２がネットワーク５上に開設したＷＥＢサイトを閲覧可能なブラウザを搭載し、ＷＥＢサイトを介してサーバ２との間で処理を行うことによりフォトムービーの注文及び受信を行うものとしてもよい。ユーザ端末３は、例えば、スマートフォンやタブレット、ＰＣ、音楽プレーヤー、ゲーム機等により構成される。 The user terminal 3 is an electronic device used by the user, and is equipped with a dedicated application program (hereinafter referred to as the photo movie application) for using the photo movie generation system 1. Alternatively, the user terminal 3 may be equipped with a browser capable of viewing a website that the server 2 has set up on the network 5, and may order and receive photo movies by exchanging processing with the server 2 via that website. The user terminal 3 is, for example, a smartphone, a tablet, a PC, a music player, a game console, or the like.

図２は、サーバ２の構成を示す図である。図に示すように、サーバ２は、例えば制御部２０１、記憶部２０２、通信部２０３等をバス２０４等により接続して構成したコンピュータにより実現できる。但しこれに限ることなく、適宜様々な構成をとることができる。 FIG. 2 is a diagram showing the configuration of the server 2. As shown in the figure, the server 2 can be realized by, for example, a computer in which a control unit 201, a storage unit 202, a communication unit 203, and the like are connected via a bus 204 or the like. However, the present invention is not limited to this, and various configurations can be adopted as appropriate.

制御部２０１は、ＣＰＵ（Central Processing Unit）、ＲＯＭ（Read Only Memory）、ＲＡＭ（Random Access Memory）等により構成される。ＣＰＵは、記憶部２０２、ＲＯＭ、記録媒体等に格納されるプログラムをＲＡＭ上のワークメモリ領域に呼び出して実行し、バス２０４を介して接続された各部を駆動制御する。ＲＯＭは、コンピュータのブートプログラムやＢＩＯＳ等のプログラム、データ等を恒久的に保持する。ＲＡＭは、ロードしたプログラムやデータを一時的に保持するとともに、制御部２０１が各種処理を行うために使用するワークエリアを備える。制御部２０１は、上記プログラムを読み出して実行することにより、サーバ２の各手段として機能する。 The control unit 201 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. The CPU calls programs stored in the storage unit 202, ROM, recording medium, etc. to a work memory area on the RAM and executes them, and drives and controls each unit connected via the bus 204. The ROM permanently stores programs such as a computer boot program and BIOS, data, and the like. The RAM temporarily stores loaded programs and data, and includes a work area used by the control unit 201 to perform various processes. The control unit 201 functions as each means of the server 2 by reading and executing the above programs.

記憶部２０２は、例えば、ハードディスクドライブやソリッドステートドライブ、フラッシュメモリ等の記憶装置である。記憶部２０２には制御部２０１が実行するプログラムや、プログラム実行に必要なデータ、オペレーティングシステム等が格納されている。これらのプログラムコードは、制御部２０１により必要に応じて読み出されてＲＡＭに移され、ＣＰＵに読み出されて実行される。また、サーバ２の記憶部２０２には、図４に示すように、エフェクトテンプレート２２１、一般画像素材２２２、エフェクト対応テーブル２２３、及びムービー構成情報２２４等のフォトムービーの生成に必要なデータが記憶される。これらのデータの詳細については後述する。 The storage unit 202 is, for example, a storage device such as a hard disk drive, solid state drive, or flash memory. The storage unit 202 stores programs executed by the control unit 201, data necessary for program execution, an operating system, and the like. These program codes are read out by the control unit 201 as needed, moved to the RAM, and read out and executed by the CPU. In addition, as shown in FIG. 4, the storage unit 202 of the server 2 stores the data needed to generate a photo movie, such as the effect templates 221, the general image materials 222, the effect correspondence table 223, and the movie configuration information 224. Details of these data will be described later.

通信部２０３は、通信制御装置、通信ポート等を有し、ネットワーク５等との通信を制御する。ネットワーク５は、ＬＡＮ（Local Area Network）や、より広域に通信接続されたＷＡＮ（Wide Area Network）、またはインターネット等の公衆の通信回線、基地局等を含む。ネットワーク５は有線、無線を問わない。サーバ２はネットワーク５を介してユーザ端末３と通信接続し、各種のデータを送受信可能である。 The communication unit 203 has a communication control device, a communication port, and the like, and controls communication with the network 5 and the like. The network 5 includes a LAN (Local Area Network), a WAN (Wide Area Network) connected over a wider area, public communication lines such as the Internet, base stations, and the like. The network 5 may be wired or wireless. The server 2 is communicatively connected to the user terminal 3 via the network 5 and can transmit and receive various data.
バス２０４は、各装置間の制御信号、データ信号等の授受を媒介する経路である。 The bus 204 is a path that mediates the exchange of control signals, data signals, and the like between the devices.

図３は、ユーザ端末３の構成を示す図である。図に示すように、ユーザ端末３は、例えば制御部３０１、記憶部３０２、通信部３０３、表示部３０４、入力部３０５、周辺機器Ｉ／Ｆ部３０６、及び音声出力部３０７等をバス３０８等により接続して構成したコンピュータ等により実現できる。但しこれに限ることなく、適宜様々な構成をとることができる。制御部３０１、記憶部３０２、通信部３０３の構成は、サーバ２の制御部２０１、記憶部２０２、通信部２０３の構成と同様である。 FIG. 3 is a diagram showing the configuration of the user terminal 3. As shown in the figure, the user terminal 3 can be realized by, for example, a computer in which a control unit 301, a storage unit 302, a communication unit 303, a display unit 304, an input unit 305, a peripheral device I/F unit 306, an audio output unit 307, and the like are connected via a bus 308 or the like. However, the present invention is not limited to this, and various configurations can be adopted as appropriate. The configurations of the control unit 301, storage unit 302, and communication unit 303 are similar to those of the control unit 201, storage unit 202, and communication unit 203 of the server 2.

表示部３０４は、例えば液晶パネル、ＣＲＴモニタ等のディスプレイ装置と、ディスプレイ装置と連携して表示処理を実行するための論理回路（ビデオアダプタ等）で構成され、制御部３０１の制御により入力された表示情報をディスプレイ装置上に表示させる。なお、入力部３０５及び表示部３０４は、表示画面にタッチパネル等の入力装置を一体的に設けたタッチパネルディスプレイとしてもよい。 The display unit 304 includes a display device such as a liquid crystal panel or a CRT monitor, and a logic circuit (such as a video adapter) for performing display processing in cooperation with the display device, and displays on the display device the display information input under the control of the control unit 301. Note that the input unit 305 and the display unit 304 may be implemented as a touch panel display in which an input device such as a touch panel is integrated with the display screen.

入力部３０５は、例えば、タッチパネル、キーボード、マウス等のポインティング・デバイス、テンキー等の入力装置であり、入力されたデータを制御部３０１へ出力する。 The input unit 305 is, for example, an input device such as a touch panel, a keyboard, a pointing device such as a mouse, or a numeric keypad, and outputs input data to the control unit 301.

周辺機器Ｉ／Ｆ（インタフェース）部３０６は、周辺機器を接続させるためのポートであり、周辺機器Ｉ／Ｆ部３０６を介して周辺機器とのデータの送受信を行う。周辺機器Ｉ／Ｆ部３０６は、ＵＳＢ等で構成されており、通常複数の周辺機器Ｉ／Ｆを有する。周辺機器との接続形態は有線、無線を問わない。 The peripheral device I/F (interface) section 306 is a port for connecting peripheral devices, and transmits and receives data to and from the peripheral devices via the peripheral device I/F section 306. The peripheral device I/F section 306 is configured with a USB or the like, and usually has a plurality of peripheral device I/Fs. Connection with peripheral devices can be wired or wireless.

音声出力部３０７は、制御部３０１から入力された音声データをスピーカから出力する。 The audio output unit 307 outputs the audio data input from the control unit 301 from the speaker.

なお、ユーザ端末３の記憶部３０２には後述する処理（図６）を実施するためのアプリケーションプログラム（フォトムービーアプリ）が格納され、このアプリケーションプログラムに従って後述する処理をユーザ端末３の制御部３０１が実行する。 Note that the storage unit 302 of the user terminal 3 stores an application program (the photo movie application) for carrying out the processing described later (FIG. 6), and the control unit 301 of the user terminal 3 executes that processing according to this application program.

次に、図４を参照してフォトムービー生成システム１の機能構成について説明する。図４に示すように、フォトムービー生成システム１において、サーバ２は、上述の記憶部２０２の他、音源データ取得部２１、音響解析部２２、画像取得部２３、画像素材生成部２４、及びフォトムービー生成部２５等を備える。ユーザ端末３は、上述の記憶部３０２、表示部３０４、音声出力部３０７の他、楽曲指定部３１、画像指定部３２、送信部３３、及び受信部３４等を備える。 Next, the functional configuration of the photo movie generation system 1 will be explained with reference to FIG. 4. As shown in FIG. 4, in the photo movie generation system 1, the server 2 includes, in addition to the storage unit 202 described above, a sound source data acquisition unit 21, an acoustic analysis unit 22, an image acquisition unit 23, an image material generation unit 24, a photo movie generation unit 25, and the like. The user terminal 3 includes, in addition to the storage unit 302, display unit 304, and audio output unit 307 described above, a music designation unit 31, an image designation unit 32, a transmission unit 33, a reception unit 34, and the like.

ユーザ端末３の楽曲指定部３１は、ユーザから楽曲の指定を受け付ける。第１の実施形態において楽曲指定部３１は、ユーザ端末３の記憶部３０２に記憶されている音源データ、或いはＣＤ等から読み取った音源データの中からユーザの操作によって楽曲を指定する。 The music designation unit 31 of the user terminal 3 receives music designation from the user. In the first embodiment, the music designation unit 31 designates a music piece from the sound source data stored in the storage unit 302 of the user terminal 3 or the sound source data read from a CD or the like by a user's operation.

画像指定部３２は、ユーザからフォトムービーの画像素材として使用する画像の指定を受け付ける。指定する画像は、ユーザ端末３の記憶部３０２に記憶されている画像でもよいし、サーバ２の記憶部２０２に記憶されている一般画像素材２２２から指定してもよい。サーバ２の記憶部２０２に記憶されている一般画像素材２２２を指定する場合、例えば、サーバ２はユーザ端末３に対して一般画像素材２２２を選択するための画像選択画面を送信し、ユーザによる選択を受け付ける。 The image designation unit 32 accepts the user's designation of images to be used as image materials for the photo movie. The designated images may be images stored in the storage unit 302 of the user terminal 3, or may be chosen from the general image materials 222 stored in the storage unit 202 of the server 2. When general image materials 222 stored in the storage unit 202 of the server 2 are designated, for example, the server 2 transmits to the user terminal 3 an image selection screen for selecting general image materials 222 and accepts the user's selection.

送信部３３は、楽曲指定部３１により指定された楽曲の音源データや画像指定部３２により指定された画像データを記憶部３０２から取得してサーバ２に送信する。指定画像がサーバ２の一般画像素材２２２である場合は、画像の指定情報（画像の識別情報等）を送信する。 The transmitting unit 33 acquires the sound source data of the music specified by the music specifying unit 31 and the image data specified by the image specifying unit 32 from the storage unit 302 and transmits them to the server 2 . If the designated image is the general image material 222 of the server 2, image designation information (image identification information, etc.) is transmitted.

記憶部３０２には、楽曲の音源データが楽曲の識別情報である楽曲ＩＤと紐づけて記憶される。楽曲の音源データはＣＤから読み込んだ音源データ、或いは音楽配信サービスのサーバからダウンロードした音源データ等である。また、記憶部３０２には、画像データが記憶される。画像データは、ユーザがカメラで撮影した画像やスキャナで読み取った画像、またはネットワーク５を介してダウンロードした画像等である。 In the storage unit 302, sound source data of a song is stored in association with a song ID, which is identification information of the song. The sound source data of a song is sound source data read from a CD, sound source data downloaded from a server of a music distribution service, or the like. Furthermore, the storage unit 302 stores image data. The image data is an image taken by a user with a camera, an image read with a scanner, an image downloaded via the network 5, or the like.

楽曲ＩＤは、例えば、アルバム名＋アーティスト名＋曲名＋再生時間等の複合情報を使用する。音楽ＣＤデータベース（ＣＤＤＢ；Compact Disc DataBase）では、アルバム名、アーティスト名、及び曲名がＤｉｓｃＩＤと紐づけて管理されている。ユーザ端末３に搭載されている音楽アプリは、音楽ＣＤのＴＯＣ（Table of Contents）という領域に記録されている情報からＤｉｓｃＩＤを算出し、ネットワーク５を介して音楽ＣＤデータベースにアクセスして、ＤｉｓｃＩＤに対応するアルバム名、アーティスト名、及び曲名を得ることができる。本実施形態では、このようにして得たアルバム名、アーティスト名、及び曲名と、楽曲の再生時間等とを複合した情報を楽曲ＩＤとして使用するものとする。 The song ID uses, for example, composite information such as album name + artist name + song title + playback time. In a music CD database (CDDB; Compact Disc DataBase), album names, artist names, and song titles are managed in association with a Disc ID. The music application installed on the user terminal 3 calculates the Disc ID from the information recorded in the TOC (Table of Contents) area of a music CD, accesses the music CD database via the network 5, and obtains the album name, artist name, and song title corresponding to that Disc ID. In this embodiment, information combining the album name, artist name, and song title obtained in this way with the playback time of the song and the like is used as the song ID.
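The composite song ID described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `TrackInfo` class, the `make_song_id` function, the `|` separator, and the whitespace and case normalisation are all assumptions introduced here; the text only states that album name, artist name, song title, and playback time are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackInfo:
    """Metadata for one track (hypothetical field names, not from the patent)."""
    album: str
    artist: str
    title: str
    duration_sec: int


def make_song_id(track: TrackInfo, sep: str = "|") -> str:
    """Join album, artist, title and playback time into one composite key."""
    parts = [track.album, track.artist, track.title, str(track.duration_sec)]
    # Assumption: collapse whitespace and fold case so that trivially
    # different spellings of the same metadata map to the same ID.
    return sep.join(" ".join(p.split()).casefold() for p in parts)
```

Including the playback time helps distinguish tracks that share album, artist, and title, for example different edits of the same song.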

サーバ２の音源データ取得部２１は、ユーザ端末３から送信された音源データ及び楽曲ＩＤを取得する。 The sound source data acquisition unit 21 of the server 2 acquires the sound source data and song ID transmitted from the user terminal 3.

音響解析部２２は、音源データ取得部２１が取得した音源データについて音響解析処理を行い、音響特徴量データ４５を取得する。 The acoustic analysis unit 22 performs acoustic analysis processing on the sound source data acquired by the sound source data acquisition unit 21 and acquires acoustic feature amount data 45.

ここで、音響解析部２２による音響解析処理及び音響特徴量データ４５について説明する。図５は音響特徴量データ４５の一例を示す図である。音響解析処理では、楽曲構成（Ａメロ、Ｂメロ、サビ等）、楽曲のテンポ（ＢＰＭ；Beats Per Minute）、ビート（拍子）、特定の楽器の発音タイミング等の特徴量が解析される。また、これらに加え、和音進行（音楽の持つ曲調の流れ）、音量変化（聴感上の音量変化）、周波数重心（音楽の盛り上がり度合い）、音程パワー分離（音楽の持つ複雑さ）等の特徴量について解析してもよい。 Here, the acoustic analysis process performed by the acoustic analysis unit 22 and the acoustic feature data 45 will be described. FIG. 5 is a diagram showing an example of the acoustic feature data 45. In the acoustic analysis process, features such as the music structure (verse A, verse B, chorus, etc.), music tempo (BPM; Beats Per Minute), beat (time signature), and timing of pronunciation of specific instruments are analyzed. In addition to these, features such as chord progression (the flow of the musical melody), volume change (audible volume change), frequency center of gravity (degree of excitement in the music), and pitch power separation (complexity of the music) may also be analyzed.

音響解析処理は、公知の手法（例えば、株式会社ＣＲＩ・ミドルウェア、“超高速・高精度楽曲解析ミドルウェアBEATWIZ（登録商標）”、[online]、[令和２年３月１５日検索]、インターネット、< URL ：https://www.cri-mw.co.jp/product/amusement/beatwiz/index.html>、または、堀内直明、他５名共著、"Song Surfing：類似フレーズで音楽ライブラリを散策する音楽再生システム"、PIONEER R&D（Vol.17,No.2/2007）等）を利用して行うことができる。なお、図５に示す音響特徴量データ４５は、一例であり、これに限定されない。音響解析部２２は、図５に示す音響特徴量データ４５の各項目以外の特徴量についても解析してもよい。 The acoustic analysis processing can be performed using known methods (for example, CRI Middleware Co., Ltd., "Ultra-high-speed, high-precision music analysis middleware BEATWIZ (registered trademark)", [online], [retrieved March 15, 2020], Internet, <URL: https://www.cri-mw.co.jp/product/amusement/beatwiz/index.html>, or Naoaki Horiuchi and five others, "Song Surfing: A music playback system that lets you explore a music library using similar phrases", PIONEER R&D (Vol.17, No.2/2007), etc.). Note that the acoustic feature data 45 shown in FIG. 5 is an example and is not limiting. The acoustic analysis unit 22 may also analyze feature amounts other than the items of the acoustic feature data 45 shown in FIG. 5.

音響解析部２２は、音響解析処理の結果、取得した音響特徴量データ４５をフォトムービー生成部２５に出力するとともに、記憶部２０２に楽曲ＩＤと紐づけて記憶する。なお、音響解析部２２は、一度音響解析処理を実施した楽曲については、記憶部２０２に記憶されている音響特徴量データ４５を取得するのみとする。これにより、音響解析処理に要するサーバ２の負荷を削減でき、かつ処理時間を低減できる。 The acoustic analysis unit 22 outputs the acquired acoustic feature data 45 as a result of the acoustic analysis process to the photo movie generation unit 25, and stores it in the storage unit 202 in association with the song ID. Note that the acoustic analysis unit 22 only acquires the acoustic feature amount data 45 stored in the storage unit 202 for songs that have been subjected to acoustic analysis processing once. Thereby, the load on the server 2 required for acoustic analysis processing can be reduced, and the processing time can be reduced.
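The reuse of stored analysis results can be sketched as a cache keyed by song ID. This is a minimal sketch under stated assumptions: `FeatureCache`, its in-memory dictionary (standing in for the storage unit 202), and the `analyze` callback (standing in for the acoustic analysis processing) are hypothetical names, and the feature data is represented as a plain `dict`.

```python
from typing import Callable, Dict

Features = dict  # placeholder for the acoustic feature data 45


class FeatureCache:
    """Runs the analysis once per song ID and reuses the stored result."""

    def __init__(self, analyze: Callable[[bytes], Features]) -> None:
        self._analyze = analyze               # stand-in for the acoustic analysis
        self._store: Dict[str, Features] = {}  # stand-in for storage unit 202
        self.analysis_runs = 0                 # counts how often analysis actually ran

    def get(self, song_id: str, audio: bytes) -> Features:
        if song_id not in self._store:
            # First request for this song: analyze and store under the song ID.
            self._store[song_id] = self._analyze(audio)
            self.analysis_runs += 1
        # Later requests only read the stored feature data.
        return self._store[song_id]
```

With this arrangement, a second request for the same song ID returns the stored data without invoking the analysis again, which is the load reduction the text describes.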

画像取得部２３は、ユーザ端末３から送信された画像データ、または一般画像素材２２２の指定情報を取得する。一般画像素材２２２の指定情報を取得した場合は、指定情報に従って記憶部２０２から一般画像素材２２２を取得する。 The image acquisition unit 23 acquires image data transmitted from the user terminal 3, or designation information for the general image material 222. When designation information for the general image material 222 is acquired, the image acquisition unit 23 acquires the general image material 222 from the storage unit 202 according to the designation information.

画像素材生成部２４は、画像取得部２３により取得した画像データから画像素材を生成する。画像素材生成部２４は、例えば、写真等の画像データから人物、動物、物品等の対象物を検出し、検出した対象物を切り出してそれぞれ個々の画像素材とする。画像素材生成部２４は、生成した画像素材をフォトムービー生成部２５に出力する。 The image material generation section 24 generates an image material from the image data acquired by the image acquisition section 23. The image material generation unit 24 detects objects such as people, animals, and articles from image data such as photographs, cuts out the detected objects, and uses them as individual image materials. The image material generation section 24 outputs the generated image material to the photo movie generation section 25.

フォトムービー生成部２５は、画像取得部２３により取得した画像または画像素材生成部２４により生成した画像素材を使用し、音響解析部２２から取得した音響特徴量データ４５に基づき楽曲の変化に応じて表示が変化するフォトムービーを生成する。フォトムービー生成部２５は、記憶部２０２に記憶されているエフェクト対応テーブル２２３及びエフェクトテンプレート２２１を参照して楽曲の変化に応じて表示が変化するフォトムービーを生成する。楽曲の変化は、音響特徴量データ４５の「楽曲構成」や「特定の楽器の発音タイミング」等から検出される。フォトムービー生成処理の詳細については、後述する。 The photo movie generating unit 25 uses the images acquired by the image acquiring unit 23 or the image material generated by the image material generating unit 24 to generate a photo movie in which the display changes in response to changes in the music based on the acoustic feature data 45 acquired from the acoustic analysis unit 22. The photo movie generating unit 25 generates a photo movie in which the display changes in response to changes in the music by referring to the effect correspondence table 223 and the effect template 221 stored in the storage unit 202. Changes in the music are detected from the "music composition" and "timing of sound production of specific instruments" of the acoustic feature data 45, etc. Details of the photo movie generating process will be described later.

なお、フォトムービー生成部２５が生成するフォトムービーは、音楽付きフォトムービーまたは音楽無しフォトムービーである。音楽付きフォトムービーはユーザ端末３から指定された楽曲の音源データをＢＧＭとして付加したフォトムービーである。すなわち、映像データと音声データとが含まれるフォトムービーである。音楽無しフォトムービーは、楽曲の音源（ＢＧＭ）を付加せずに映像データのみのフォトムービーである。音楽無しフォトムービーは再生時に楽曲の音源データと同期して再生される。 Note that the photo movie generated by the photo movie generation unit 25 is a photo movie with music or a photo movie without music. A photo movie with music is a photo movie to which the sound source data of a song specified by the user terminal 3 is added as BGM. That is, it is a photo movie that includes video data and audio data. A photo movie without music is a photo movie that includes only video data without adding a music sound source (BGM). When a photo movie without music is played, it is played in synchronization with the sound source data of the song.

フォトムービー生成部２５は、フォトムービーを生成すると、データ（映像データ）をmpeg4方式等で圧縮符号化し、ユーザ端末３において再生可能なデータ形式に変換する。そして、フォトムービーの構成を表す情報であるムービー構成情報２２４を楽曲ＩＤと紐づけて記憶部２０２に記憶するとともに、フォトムービーのデータを楽曲ＩＤと紐づけてユーザ端末３に送信する。 When the photo movie generation unit 25 generates the photo movie, it compresses and encodes the data (video data) using the MPEG4 method or the like, and converts it into a data format that can be played back on the user terminal 3. Then, the movie configuration information 224, which is information representing the configuration of the photo movie, is stored in the storage unit 202 in association with the song ID, and the data of the photo movie is transmitted to the user terminal 3 in association with the song ID.

ユーザ端末３の受信部３４は、サーバ２から送信されたフォトムービーを受信し、再生処理部３５へ送る。 The receiving unit 34 of the user terminal 3 receives the photo movie transmitted from the server 2, and sends it to the reproduction processing unit 35.

再生処理部３５は、フォトムービーを再生する。音楽無しフォトムービーを受信した場合は、再生処理部３５は、フォトムービーに紐づけられた楽曲ＩＤに対応する楽曲の音源データを記憶部３０２から取得し、取得した音源データとフォトムービー（音楽無しフォトムービー）とを同期して再生する。再生処理部３５は、フォトムービーを復号して符号化前の映像データとし、表示部３０４に表示するとともに、音源データを同期再生して音声出力部３０７から出力する。 The playback processing unit 35 plays back the photo movie. When a photo movie without music is received, the playback processing unit 35 acquires, from the storage unit 302, the sound source data of the song corresponding to the song ID linked to the photo movie, and plays back the acquired sound source data in synchronization with the photo movie (the photo movie without music). The playback processing unit 35 decodes the photo movie into the pre-encoding video data and displays it on the display unit 304, while playing back the sound source data in synchronization and outputting it from the audio output unit 307.

音楽付きフォトムービーを受信した場合は、再生処理部３５は、受信したフォトムービーのデータを復号して符号化前の映像データ及び音声データとし、映像を表示部３０４に表示するとともに、音声（音源）を音声出力部３０７から出力する。なお、フォトムービーの再生方法は、ストリーミング再生でもダウンロード再生でもよい。 When a photo movie with music is received, the playback processing unit 35 decodes the received photo movie data into the pre-encoding video data and audio data, displays the video on the display unit 304, and outputs the audio (sound source) from the audio output unit 307. Note that the photo movie may be played back by either streaming or download playback.
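The two playback cases (movie with music, movie without music) differ only in where the audio comes from. The sketch below illustrates that branch; the `PhotoMovie` class, the `resolve_audio` function, and the dictionary standing in for the storage unit 302 are assumptions for illustration, and decoding and synchronized output are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PhotoMovie:
    """Received photo movie (hypothetical container, not from the patent)."""
    song_id: str
    video: bytes
    audio: Optional[bytes] = None  # None means a photo movie without music


def resolve_audio(movie: PhotoMovie, local_library: Dict[str, bytes]) -> bytes:
    """Return the audio to play alongside the movie's video."""
    if movie.audio is not None:
        # Movie with music: the audio travels inside the movie data.
        return movie.audio
    # Movie without music: look up the local sound source by the linked song ID.
    try:
        return local_library[movie.song_id]
    except KeyError:
        raise LookupError(f"no local sound source for song ID {movie.song_id!r}")
```

Raising an error when the local sound source is missing is a design choice of this sketch; the text does not state how the terminal behaves in that case.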

次に、図６を参照して、フォトムービー生成システム１が実行する処理の流れを説明する。以下の説明では、ユーザ端末３の記憶部３０２には、フォトムービー生成システム１を利用するためのアプリであるフォトムービーアプリがインストールされ、ユーザ端末３の制御部３０１がこのフォトムービーアプリを読み出して実行する手順について説明する。 Next, with reference to FIG. 6, the flow of processing executed by the photo movie generation system 1 will be described. The following description covers the procedure in which the photo movie application, an application for using the photo movie generation system 1, is installed in the storage unit 302 of the user terminal 3, and the control unit 301 of the user terminal 3 reads out and executes this photo movie application.

ユーザ端末３において、フォトムービーアプリを起動すると、ユーザ端末３の制御部３０１は、フォトムービーに使用する楽曲の指定（ステップＳ１０１）、及び画像の指定（ステップＳ１０２）を受け付ける。ステップＳ１０１～ステップＳ１０２において、制御部３０１は、楽曲や画像を選択入力するための入力画面を表示してもよい。 When the photo movie application is started on the user terminal 3, the control unit 301 of the user terminal 3 accepts the designation of a song to be used in the photo movie (step S101) and the designation of an image (step S102). In steps S101 and S102, the control unit 301 may display an input screen for selectively inputting songs and images.

第１の実施形態では、指定できる楽曲は、ユーザ端末３の記憶部３０２に記憶されている楽曲（音源データ）またはＣＤドライブにより読み取り可能な楽曲の音源データとする。また、指定できる画像は、ユーザ端末３の記憶部３０２に記憶されている画像、及びサーバ２に記憶されている一般画像素材２２２とする。 In the first embodiment, the music that can be specified is music (sound source data) stored in the storage unit 302 of the user terminal 3 or sound source data of the music that can be read by a CD drive. Furthermore, images that can be specified include images stored in the storage unit 302 of the user terminal 3 and general image materials 222 stored in the server 2.

楽曲及び画像が指定されると、制御部３０１は、指定された音源データ及び画像データを記憶部３０２から取得し（ステップＳ１０３）、サーバ２に送信する（ステップＳ１０４）。なお、一般画像素材２２２が指定された場合は、制御部３０１は、指定された一般画像素材２２２の識別情報をサーバ２に送信する。 When a song and an image are specified, the control unit 301 acquires the specified sound source data and image data from the storage unit 302 (step S103), and transmits them to the server 2 (step S104). Note that when the general image material 222 is specified, the control unit 301 transmits the identification information of the specified general image material 222 to the server 2.

サーバ２は、音源データ及び画像データ（一般画像素材２２２が指定された場合は、一般画像素材２２２の識別情報）を受信すると（ステップＳ１０５）、制御部２０１（音響解析部２２）は、受信した音源データについて音響解析処理を実行し、楽曲の音響特徴量データ４５を取得する（ステップＳ１０６）。 When the server 2 receives the sound source data and the image data (if the general image material 222 is specified, the identification information of the general image material 222) (step S105), the control unit 201 (acoustic analysis unit 22) Acoustic analysis processing is performed on the sound source data to obtain acoustic feature data 45 of the song (step S106).

図７は、音響解析処理の流れを示すフローチャートである。図７に示すように、制御部２０１（音響解析部２２）は楽曲の音源データを取得し（ステップＳ２０１）、音響解析処理を実行する（ステップＳ２０２）。音響解析処理では、楽曲のテンポ（ＢＰＭ；Beats Per Minute）、ビート（拍子）、楽曲構成（Ａメロ、Ｂメロ、サビ等）、特定の楽器の発音タイミング等が解析される。 FIG. 7 is a flowchart showing the flow of acoustic analysis processing. As shown in FIG. 7, the control unit 201 (acoustic analysis unit 22) acquires sound source data of a song (step S201), and executes acoustic analysis processing (step S202). In the acoustic analysis process, the tempo (BPM; Beats Per Minute) of the song, the beat (time signature), the song structure (A melody, B melody, chorus, etc.), the sound timing of a specific instrument, etc. are analyzed.

制御部２０１は、音響解析結果を音響特徴量データ４５として、楽曲ＩＤと紐づけて記憶部２０２に記憶する（ステップＳ２０３）。音響特徴量データ４５を楽曲ＩＤと紐づけてサーバ２に記憶しておくことにより、一度解析した楽曲について、２回目以降は解析が不要となる。 The control unit 201 stores the acoustic analysis results as acoustic feature data 45 in the storage unit 202 in association with the song ID (step S203). By storing the acoustic feature data 45 in association with the song ID in the server 2, a song that has already been analyzed does not need to be analyzed again from the second time onwards.

また、サーバ２の制御部２０１（画像素材生成部２４）は、ステップＳ１０５で受信した画像データについて必要に応じて画像素材生成処理を実行し、画像データから画像素材を切り出す（ステップＳ１０７）。なお、受信した画像データをそのまま画像素材として使用してもよい。 Further, the control unit 201 (image material generation unit 24) of the server 2 executes image material generation processing as necessary on the image data received in step S105, and cuts out image materials from the image data (step S107). Note that the received image data may be used as is as the image material.

サーバ２の制御部２０１（フォトムービー生成部２５）は、画像素材と所定のエフェクトテンプレート２２１を使用し、音響特徴量データ４５（例えば、「楽曲構成」の変化）に応じて画面が変化するフォトムービーを生成する（ステップＳ１０８）。 The control unit 201 (photo movie generation unit 25) of the server 2 uses the image materials and predetermined effect templates 221 to generate a photo movie whose screen changes according to the acoustic feature data 45 (for example, changes in the "song structure") (step S108).

ここで、エフェクトテンプレート２２１及びエフェクト対応テーブル２２３について説明する。 Here, the effect templates 221 and the effect correspondence table 223 will be explained.
エフェクトテンプレート２２１は、画像に所定の加工（エフェクト）を加えるためのプログラムであり、図８に示すように、複数のエフェクトテンプレート２２１がエフェクトＮｏ．及び使用画像枚数と対応付けて記憶されている。 An effect template 221 is a program for applying a predetermined processing (effect) to an image, and as shown in FIG. 8, a plurality of effect templates 221 are stored in association with an effect No. and the number of images used.

エフェクトテンプレート２２１は、例えば、複数の表示画像を順に切り替える「画像切替」、画像をモノクロからカラーに変化させる「モノクロ→カラー」、画像の透過率を変化させる「透過率変化」、画像の表示位置を画面内で移動させる「画面移動」、画像を回転させる「回転」、画像を拡大させる「拡大」、画像を縮小させる「縮小」、表示画面内における画像の表示割合を変化させる「表示割合変化」、画像を出現させる「出現」、画像を拡大させながら移動させる「拡大移動」、画像を拡大させながら回転させる「拡大回転」、画像を縮小させながら移動させる「縮小移動」、画像を縮小させながら回転させる「縮小回転」、小画像を徐々に増加させていく「小画像増加」、複数の画像を一画面内に表示する「複数画像一画面表示」、一画面内に複数表示させた画像をスライドさせる「複数画像一画面表示＋スライド」等がある。 The effect templates 221 include, for example, "image switching", which switches between multiple displayed images in sequence; "monochrome → color", which changes an image from monochrome to color; "transmittance change", which changes the transmittance of an image; "screen movement", which moves the display position of an image within the screen; "rotation", which rotates an image; "enlargement", which enlarges an image; "reduction", which shrinks an image; "display ratio change", which changes the proportion of the display screen that an image occupies; "appearance", which makes an image appear; "enlarge and move", which moves an image while enlarging it; "enlarge and rotate", which rotates an image while enlarging it; "shrink and move", which moves an image while shrinking it; "shrink and rotate", which rotates an image while shrinking it; "small image increase", which gradually increases the number of small images; "multiple images on one screen", which displays multiple images within one screen; and "multiple images on one screen + slide", which slides the multiple images displayed within one screen.

なお、図８に示すエフェクトテンプレート２２１は一例であり、他のエフェクトを含むものとしてもよい。また、各エフェクトについて移動量や変化の速さ等のパラメータを複数パターン用意してもよい。例えば、エフェクト「拡大」について、拡大率によって「拡大１」、「拡大２」、「拡大３」、…を用意したり、エフェクト「透過率変化」について、変化の速さによって「透過率変化１」、「透過率変化２」…を用意したりする。 Note that the effect templates 221 shown in FIG. 8 are an example, and other effects may be included. In addition, multiple parameter variants, such as the amount of movement or the speed of change, may be prepared for each effect. For example, for the effect "enlargement", "enlargement 1", "enlargement 2", "enlargement 3", and so on may be prepared according to the enlargement ratio, and for the effect "transmittance change", "transmittance change 1", "transmittance change 2", and so on may be prepared according to the speed of change.

制御部２０１（フォトムービー生成部２５）は、フォトムービーを生成するにあたり、どのエフェクトを使用するかを、エフェクト対応テーブル２２３を参照して決定する。図９はエフェクト対応テーブル２２３の例を示す図である。エフェクト対応テーブル２２３は、楽曲構成要素（イントロ、Ａメロ、Ｂメロ、Ｃメロ、ソロ、サビ等）とそれに適したエフェクトテンプレート２２１とを対応づけたテーブルであり、図９に示すように、複数のパターンが定義されている。 When generating a photo movie, the control unit 201 (photo movie generation unit 25) refers to the effect correspondence table 223 to determine which effects to use. FIG. 9 is a diagram showing an example of the effect correspondence table 223. The effect correspondence table 223 is a table that associates song structure elements (intro, A melody, B melody, C melody, solo, chorus, etc.) with the effect templates 221 suited to them, and as shown in FIG. 9, a plurality of patterns are defined.

図９の例では、パターン（１）の場合、制御部２０１（フォトムービー生成部２５）は、楽曲の「イントロ」でエフェクト「モノクロ→カラー」を適用し、楽曲の「Ａメロ」でエフェクト「移動３」を適用し、楽曲の「Ｂメロ」でエフェクト「小画像増加」を適用し、楽曲の「Ｃメロ」で「複数画像一画面表示１」を適用し、楽曲の「ソロ」でエフェクト「拡大１」を適用し、楽曲の「サビ」でエフェクト「画像切替１」を適用する。 In the example of FIG. 9, in the case of pattern (1), the control unit 201 (photo movie generation unit 25) applies the effect "monochrome → color" in the "intro" of the song, the effect "movement 3" in the "A melody", the effect "small image increase" in the "B melody", the effect "multiple images on one screen 1" in the "C melody", the effect "enlargement 1" in the "solo", and the effect "image switching 1" in the "chorus".

また、図９のパターン（２）の場合、制御部２０１（フォトムービー生成部２５）は、楽曲の「イントロ」でエフェクト「画像切替１」を適用し、楽曲の「Ａメロ」でエフェクト「縮小回転１」を適用し、楽曲の「Ｂメロ」でエフェクト「拡大回転１」を適用し、「Ｃメロ」でエフェクト「画面移動」を適用し、楽曲の「ソロ」でエフェクト「拡大２」を適用し、楽曲の「サビ」でエフェクト「画像切替２」を適用する。このように、複数のパターンのエフェクト適用例がエフェクト対応テーブル２２３に定義されている。 In the case of pattern (2) in FIG. 9, the control unit 201 (photo movie generation unit 25) applies the effect "image switching 1" in the "intro" of the song, the effect "shrink and rotate 1" in the "A melody", the effect "enlarge and rotate 1" in the "B melody", the effect "screen movement" in the "C melody", the effect "enlargement 2" in the "solo", and the effect "image switching 2" in the "chorus". In this way, effect application examples of a plurality of patterns are defined in the effect correspondence table 223.
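Patterns (1) and (2) of the effect correspondence table can be written down as a nested mapping from pattern number to section to effect. The sketch below is illustrative only: the dictionary layout, the English key names, and the `plan_effects` helper are assumptions, and the song structure is assumed to arrive as a list of (start time in seconds, section name) pairs taken from the acoustic feature data.

```python
from typing import Dict, List, Tuple

# Patterns (1) and (2) as described for FIG. 9; key and effect names are
# English stand-ins chosen for this sketch.
EFFECT_CORRESPONDENCE_TABLE: Dict[int, Dict[str, str]] = {
    1: {
        "intro": "monochrome_to_color",
        "a_melody": "movement_3",
        "b_melody": "small_image_increase",
        "c_melody": "multi_image_single_screen_1",
        "solo": "enlarge_1",
        "chorus": "image_switching_1",
    },
    2: {
        "intro": "image_switching_1",
        "a_melody": "shrink_rotate_1",
        "b_melody": "enlarge_rotate_1",
        "c_melody": "screen_movement",
        "solo": "enlarge_2",
        "chorus": "image_switching_2",
    },
}


def plan_effects(
    pattern: int, structure: List[Tuple[float, str]]
) -> List[Tuple[float, str, str]]:
    """Attach the pattern's effect to each (start_sec, section) of the song."""
    table = EFFECT_CORRESPONDENCE_TABLE[pattern]
    return [(start, section, table[section]) for start, section in structure]
```

For a song whose structure is intro, A melody, chorus, `plan_effects(1, ...)` yields one effect per section boundary, which is the point at which the display would change.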

ステップＳ１０８において、制御部２０１（フォトムービー生成部２５）は、エフェクト対応テーブル２２３のどのパターンでフォトムービーを生成するかを、例えば図１０に示す処理により決定する。 In step S108, the control unit 201 (photo movie generation unit 25) determines which pattern in the effect correspondence table 223 is used to generate the photo movie, for example, by the process shown in FIG. 10.

図１０に示すように、サーバ２の制御部２０１（フォトムービー生成部２５）は、まず、楽曲ＩＤに対応するムービー構成情報２２４が記憶部２０２に記憶されているか否かを判定する（ステップＳ３０１）。ムービー構成情報２２４は、フォトムービーに使用する画像データや画像の切替タイミング、画像に適用するエフェクト等のフォトムービーを構成するための情報が格納されたデータである（図１２参照）。サーバ２で生成済みのフォトムービーについては、楽曲ＩＤと対応づけてムービー構成情報２２４が記憶部２０２に記憶されている。 As shown in FIG. 10, the control unit 201 (photo movie generation unit 25) of the server 2 first determines whether or not the movie configuration information 224 corresponding to the song ID is stored in the storage unit 202 (step S301). The movie configuration information 224 is data that stores information for configuring the photo movie, such as the image data used in the photo movie, the image switching timing, and the effects applied to the images (see FIG. 12). For photo movies already generated by the server 2, the movie configuration information 224 is stored in the storage unit 202 in association with the song ID.

ステップＳ３０１において、ユーザ端末３から送信された音源データの楽曲ＩＤに対応するムービー構成情報２２４が記憶部２０２に記憶されていない場合は（ステップＳ３０１；Ｎｏ）、その楽曲についてはフォトムービーを作成した履歴が無いため、サーバ２の制御部２０１は、エフェクト対応テーブル２２３から任意のパターンを読み出し（ステップＳ３０２）、読み出したパターンのエフェクト対応テーブル２２３に従ってフォトムービーを生成する（ステップＳ３０３）。 In step S301, if the movie configuration information 224 corresponding to the song ID of the sound source data transmitted from the user terminal 3 is not stored in the storage unit 202 (step S301; No), there is no history of a photo movie having been created for that song. The control unit 201 of the server 2 therefore reads an arbitrary pattern from the effect correspondence table 223 (step S302) and generates a photo movie according to the read pattern of the effect correspondence table 223 (step S303).

一方、ステップＳ３０１において、ユーザ端末３から送信された音源データの楽曲ＩＤに対応するムービー構成情報２２４が既に記憶部２０２に記憶されている場合は（ステップＳ３０１；Ｙｅｓ）、制御部２０１は、エフェクト対応テーブル２２３から未使用のパターンを読み出し（ステップＳ３０４）、読み出したパターンのエフェクト対応テーブル２２３に従ってフォトムービーを生成する（ステップＳ３０３）。このように、フォトムービーを作成したことがある楽曲については、２回目以降は別のパターンのフォトムービーが作成される。このため、同じ楽曲に対し異なるムービー表現を生成することができ、ユーザに多様な楽しみを提供できる。 On the other hand, if in step S301 the movie configuration information 224 corresponding to the song ID of the sound source data transmitted from the user terminal 3 is already stored in the storage unit 202 (step S301; Yes), the control unit 201 reads an unused pattern from the effect correspondence table 223 (step S304) and generates a photo movie according to the read pattern of the effect correspondence table 223 (step S303). In this way, for a song for which a photo movie has been created before, a photo movie of a different pattern is created from the second time onward. Different movie expressions can therefore be generated for the same song, providing users with a variety of entertainment.
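The branch in FIG. 10 (steps S301, S302, S304) can be sketched as follows. This is a minimal illustration under stated assumptions: `history` is a hypothetical stand-in for the stored movie configuration information 224, reduced to the pattern IDs already used per song ID, and the fallback when every pattern has been used is not covered by the text and is marked as an assumption in the code.

```python
import random


def select_pattern(song_id, history, pattern_ids, rng=random):
    """Choose an effect pattern for song_id following the flow of FIG. 10.

    history maps a song ID to the list of pattern IDs already used for it
    (a simplified stand-in for the movie configuration information 224).
    """
    used = history.get(song_id, [])
    if not used:
        # S301: No -> read an arbitrary pattern (S302).
        candidates = list(pattern_ids)
    else:
        # S301: Yes -> read an unused pattern (S304).
        candidates = [p for p in pattern_ids if p not in used]
        if not candidates:
            # Assumption: the source does not say what happens once every
            # pattern has been used; here all patterns become eligible again.
            candidates = list(pattern_ids)
    choice = rng.choice(candidates)
    history.setdefault(song_id, []).append(choice)
    return choice
```

With two patterns defined, the first two calls for the same song ID are guaranteed to return different patterns, which is the behavior the paragraph above describes for the second and later generations.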

図１１は、楽曲（音源データ４４）について生成される様々なフォトムービー６１、６２、６３の例を示す図である。図１１（ａ）のフォトムービー６１は、画像取得部２３により取得した画像７１、７２、７３、…をそのまま使用して生成したものである。楽曲（音源データ４４）の楽曲構成が変化するタイミング（ＡメロからＢメロに変わるタイミングｔ１、Ｂメロからサビに変わるタイミングｔ２等）で表示する画像が変更される。 FIG. 11 is a diagram showing examples of various photo movies 61, 62, and 63 generated for a song (sound source data 44). The photo movie 61 in FIG. 11(a) is generated by using the images 71, 72, 73, ... acquired by the image acquisition unit 23 as they are. The displayed image is changed at the timings when the music composition of the song (sound source data 44) changes (timing t1 when the A melody changes to the B melody, timing t2 when the B melody changes to the chorus, etc.).

また、図１１（ｂ）のフォトムービー６２は、画像取得部２３により取得した画像７１、７２、７３、…にエフェクトテンプレート２２１を適用したものである。楽曲（音源データ４４）の楽曲構成が変化するタイミング（ＡメロからＢメロに変わるタイミングｔ１、Ｂメロからサビに変わるタイミングｔ２）で画像が変更されるとともに、各区間（ｔ０～ｔ１、ｔ１～ｔ２、ｔ２～）で所定のエフェクトが画像に施される。 The photo movie 62 in FIG. 11(b) is obtained by applying the effect templates 221 to the images 71, 72, 73, ... acquired by the image acquisition unit 23. The image is changed at the timings when the music composition of the song (sound source data 44) changes (timing t1 when the A melody changes to the B melody, timing t2 when the B melody changes to the chorus), and a predetermined effect is applied to the image in each section (t0 to t1, t1 to t2, and t2 onward).

図１１（ｂ）の例では、「Ａメロ」の区間ｔ０～ｔ１において画像７１に対しエフェクト「モノクロ→カラー」が適用され、モノクロの画像７１ａからカラーの画像７１ｂへと徐々に変更される。次に、ＡメロからＢメロへ変わる時刻ｔ１で画像が切り替えるとともにエフェクトも「縮小回転１」に変更される。区間ｔ１～ｔ２では、背景に画像７１、前景に画像７２を使用し、前景画像７２が画像７２ａの状態から画像７２ｂの状態へ縮小回転するアニメーションが生成される。次に、Ｂメロからサビへ変わる時刻ｔ２で画像が切り替えられるとともにエフェクトも「縮小回転２」に変更される。区間ｔ２～は、背景に画像７１、前景に画像７３を使用し、前景画像７３が画像７３ａの状態から画像７３ｂの状態へ縮小回転するアニメーションが生成される。 In the example of FIG. 11(b), the effect "monochrome→color" is applied to the image 71 in the section t0 to t1 of "A melody", and the monochrome image 71a is gradually changed to the color image 71b. Next, at time t1 when the melody A changes to the melody B, the image is switched and the effect is also changed to "reduction rotation 1". In the interval t1 to t2, an animation is generated in which the image 71 is used as the background and the image 72 is used as the foreground, and the foreground image 72 is reduced and rotated from the state of the image 72a to the state of the image 72b. Next, at time t2 when the B melody changes to the chorus, the image is switched and the effect is also changed to "reduction rotation 2". In the interval t2~, the image 71 is used as the background and the image 73 is used as the foreground, and an animation is generated in which the foreground image 73 is reduced and rotated from the state of the image 73a to the state of the image 73b.

また、図１１（ｃ）のフォトムービー６３は、画像取得部２３により取得した画像７４、７６、７９、…を背景画像とし、前景に画像素材７５、７７、８１、８２（画像素材生成部２４により生成した画像素材）を使用して生成したものである。楽曲（音源データ４４）の楽曲構成が変化するタイミング（ＡメロからＢメロに変わるタイミングｔ１、Ｂメロからサビに変わるタイミングｔ２）で背景画像が変更されるとともに、各区間（ｔ０～ｔ１、ｔ１～ｔ２、ｔ２～）で前景画像に所定のエフェクトが施される。 The photo movie 63 in FIG. 11(c) is generated by using the images 74, 76, 79, ... acquired by the image acquisition unit 23 as background images and the image materials 75, 77, 81, and 82 (image materials generated by the image material generation unit 24) in the foreground. The background image is changed at the timings when the music composition of the song (sound source data 44) changes (timing t1 when the A melody changes to the B melody, timing t2 when the B melody changes to the chorus), and a predetermined effect is applied to the foreground image in each section (t0 to t1, t1 to t2, and t2 onward).

図１１（ｃ）の例では、Ａメロの区間ｔ０～ｔ１において画像７４が背景として表示され、前景の画像素材７５に対しエフェクト「画面移動」が適用されて画像７５ａ→７５ｂ→７５ｃに移動する。次に、ＡメロからＢメロへ変わる時刻ｔ１で背景画像が画像７６に切り替えられるとともに前景の画像素材７７に対し、エフェクト「交互表示」が適用されて、画像７７ａと画像７７ｂが交互に表示されるアニメーションが生成される。次に、Ｂメロからサビへ変わる時刻ｔ２で背景画像が画像７９に切り替えられるとともに前景の画像も画像素材８１、８２に変更される。区間ｔ２～は、前景の画像素材８１、８２に対し、エフェクト「出現」が適用されて、バスドラムの発音タイミングに合わせて画像素材８１、８２が順に出現するように表示される。 In the example of FIG. 11(c), the image 74 is displayed as the background in the A melody section t0 to t1, and the effect "Screen movement" is applied to the foreground image material 75 so that it moves from image 75a to 75b to 75c. Next, at time t1, when the A melody changes to the B melody, the background image is switched to the image 76 and the effect "Alternate display" is applied to the foreground image material 77, generating an animation in which the images 77a and 77b are displayed alternately. Next, at time t2, when the B melody changes to the chorus, the background image is switched to the image 79 and the foreground is changed to the image materials 81 and 82. In the section from t2 onward, the effect "Appearance" is applied to the foreground image materials 81 and 82, which are displayed so as to appear in sequence in time with the sounding of the bass drum.
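What the three examples of FIG. 11 have in common is that the image changes at each boundary of the music composition and one effect is attached to each section. The sketch below illustrates that idea only; the function name `build_timeline`, the tuple layout of `sections`, and the cyclic reuse of images are assumptions that do not come from the source.

```python
def build_timeline(sections, images, effects):
    """Assign one image and one effect to each section of the song.

    sections: list of (name, start, end) tuples, as could be derived from the
              acoustic feature data 45 (times in seconds, an assumed unit).
    images:   list of image file names to show, in order.
    effects:  dict mapping a section name to an effect name; sections without
              an entry get None, i.e. the image is shown as is (FIG. 11(a)).
    """
    if not images:
        raise ValueError("at least one image is required")
    timeline = []
    for i, (name, start, end) in enumerate(sections):
        timeline.append({
            "section": name,
            "start": start,
            "end": end,
            # Assumption: images are reused cyclically when there are fewer
            # images than sections; the source does not specify this case.
            "image": images[i % len(images)],
            "effect": effects.get(name),
        })
    return timeline
```

Passing an empty `effects` dict gives the plain slideshow of FIG. 11(a), while passing one pattern of the effect correspondence table gives the behavior of FIG. 11(b).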

制御部２０１（フォトムービー生成部２５）はフォトムービーを生成すると、生成したフォトムービーのムービー構成情報２２４を楽曲ＩＤと紐づけて記憶部２０２に記憶する。ムービー構成情報２２４は、図１２に示すように、楽曲ＩＤと紐づけて、楽曲構成、時間、適用エフェクト、使用画像枚数、使用画像ファイル名が格納される。 When the control unit 201 (photo movie generating unit 25) generates a photo movie, it links the movie configuration information 224 of the generated photo movie to a music ID and stores it in the storage unit 202. As shown in FIG. 12, the movie configuration information 224 stores the music configuration, time, applied effects, number of images used, and file names of the images used, linked to the music ID.

図１２の例は、楽曲ＩＤ「＊＊＊＊＊（アルバム名＋アーティスト名＋曲名＋再生時間等）」の楽曲をＢＧＭとするフォトムービーのムービー構成情報２２４である。楽曲構成「イントロ」の区間の時間は「０－１５」であり、「モノクロ→カラー１」のエフェクトが適用される。このエフェクトでの使用画像枚数は「１」枚であり、使用する画像のファイル名は「image15」である。 The example in FIG. 12 is movie configuration information 224 of a photo movie that uses the song with the song ID "***** (album name + artist name + song title + playback time, etc.)" as BGM. The time of the section of the music composition "Intro" is "0-15", and the effect of "Monochrome→Color 1" is applied. The number of images used in this effect is "1", and the file name of the image used is "image15".

また、楽曲構成「Ａメロ」の区間の時間は「１５－５０」であり、「拡大回転３」のエフェクトが適用される。このエフェクトでの使用画像枚数は「２」枚であり、使用する画像のファイル名は、前景は「image8」（拡大回転する画像）、背景は「image2」である。 Further, the time of the section of the music composition "A melody" is "15-50", and the effect of "enlargement rotation 3" is applied. The number of images used in this effect is "2", and the file names of the images used are "image8" (enlarged and rotated image) for the foreground and "image2" for the background.

また、楽曲構成「Ｂメロ」の区間の時間は「５０－７０」であり、「縮小回転２」のエフェクトが適用される。このエフェクトでの使用画像枚数は「２」枚であり、使用する画像のファイル名は、前景は「image8」（縮小回転する画像）、背景は「image17」である。 Further, the time of the section of the music composition "B melody" is "50-70", and the effect of "reduced rotation 2" is applied. The number of images used in this effect is "2", and the file names of the images used are "image8" for the foreground (an image to be reduced and rotated) and "image17" for the background.

また、楽曲構成「サビ」の区間の時間は「７０－１２０」であり、「画像切替５」のエフェクトが適用される。このエフェクトでの使用画像枚数は「５」枚であり、使用する画像のファイル名は、「image10」（最初に表示される画像）、「image11」（２番目に表示される画像）、「image12」（３番目に表示される画像）、「image13」（４番目に表示される画像）、「image14」（５番目に表示される画像）である。 Further, the time of the section of the music composition "Chorus" is "70-120", and the effect of "Image switching 5" is applied. The number of images used in this effect is "5", and the file names of the images used are "image10" (the first image displayed), "image11" (the second image displayed), and "image12". ” (the third displayed image), “image13” (the fourth displayed image), and “image14” (the fifth displayed image).

また、楽曲構成「バスドラムタイミング」の時間（時刻）は「７０、１１０、１１１、１１２、１１３」であり、「出現５」のエフェクトが適用される。このエフェクトでの使用画像枚数は「６」枚であり、使用する画像のファイル名は、背景は「image15」、前景は「人物１」（最初のバスドラムタイミング（７０）において出現する画像）、「人物２」（２番目のバスドラムタイミング（１１０）において出現する画像）、「人物３」（３番目のバスドラムタイミング（１１１）において出現する画像）、「人物４」（４番目のバスドラムタイミング（１１２）において出現する画像）、「人物５」（５番目のバスドラムタイミング（１１３）において出現する画像）である。 Further, the times of the music composition "Bass drum timing" are "70, 110, 111, 112, 113", and the effect "Appearance 5" is applied. The number of images used in this effect is "6", and the file names of the images used are "image15" for the background and, for the foreground, "Person 1" (the image that appears at the first bass drum timing (70)), "Person 2" (the image that appears at the second bass drum timing (110)), "Person 3" (the image that appears at the third bass drum timing (111)), "Person 4" (the image that appears at the fourth bass drum timing (112)), and "Person 5" (the image that appears at the fifth bass drum timing (113)).
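The FIG. 12 example walked through above can be written down as a small record structure. This is a sketch only: the class `ConfigEntry`, its field names, and the helper `appearance_schedule` are hypothetical, and the convention that the first listed image is the background for the bass drum entry is an assumption made for the illustration. The values themselves are the ones quoted in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConfigEntry:
    structure: str            # music composition element
    times: List[int]          # [start, end], or a list of onset times
    effect: str               # applied effect
    images: List[str] = field(default_factory=list)  # file names used

    @property
    def image_count(self) -> int:
        """Number of images used, derived from the file name list."""
        return len(self.images)


# Values as described for FIG. 12 (song ID omitted, as in the source).
MOVIE_CONFIG = [
    ConfigEntry("intro", [0, 15], "Monochrome -> color 1", ["image15"]),
    ConfigEntry("A melody", [15, 50], "Enlargement rotation 3",
                ["image8", "image2"]),      # foreground, background
    ConfigEntry("B melody", [50, 70], "Reduction rotation 2",
                ["image8", "image17"]),     # foreground, background
    ConfigEntry("chorus", [70, 120], "Image switching 5",
                ["image10", "image11", "image12", "image13", "image14"]),
    ConfigEntry("bass drum timing", [70, 110, 111, 112, 113], "Appearance 5",
                ["image15", "Person 1", "Person 2", "Person 3",
                 "Person 4", "Person 5"]),  # background first (assumed order)
]


def appearance_schedule(entry):
    """Pair each bass drum time with the foreground image appearing at it."""
    foreground = entry.images[1:]  # skip the background image
    return list(zip(entry.times, foreground))
```

Deriving the image count from the file name list keeps the two fields from drifting apart; in the patent they are stored as separate items of the movie configuration information 224.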

ムービー構成情報２２４の「楽曲構成」と「時間」の項目は、楽曲の音響特徴量データ４５に対応している。「適用エフェクト」の項目は、図９に示すエフェクト対応テーブル２２３から取得され、「使用画像枚数」の項目は、図８に示すエフェクトテンプレート２２１から取得される。どのエフェクト対応テーブル２２３を使用するかは、制御部２０１（フォトムービー生成部２５）が図１０の処理を実行することにより決定される。また、「使用画像ファイル名」の項目は、画像取得部２３により取得した画像や画像素材生成部２４により生成した画像から、制御部２０１（フォトムービー生成部２５）が任意に決定する。 The items "music composition" and "time" of the movie configuration information 224 correspond to the acoustic feature data 45 of the song. The item "applied effect" is obtained from the effect correspondence table 223 shown in FIG. 9, and the item "number of images used" is obtained from the effect template 221 shown in FIG. 8. Which effect correspondence table 223 is to be used is determined by the control unit 201 (photo movie generation unit 25) executing the process shown in FIG. 10. The item "used image file name" is determined arbitrarily by the control unit 201 (photo movie generation unit 25) from the images acquired by the image acquisition unit 23 and the images generated by the image material generation unit 24.

図６の説明に戻る。 Returning to the explanation of FIG. 6.
ステップＳ１０８の処理によりフォトムービーが生成されると、サーバ２は生成したフォトムービーをユーザ端末３に送信する（ステップＳ１０９）。 When the photo movie is generated by the process in step S108, the server 2 transmits the generated photo movie to the user terminal 3 (step S109).

ユーザ端末３はサーバ２から送信されるフォトムービーを受信し（ステップＳ１１０）、フォトムービーを再生する（ステップＳ１１１）。受信したフォトムービーが音楽付きフォトムービーである場合は、制御部３０１（再生処理部３５）は、受信したフォトムービーを復号し再生することで音楽及び映像が同期再生される。 The user terminal 3 receives the photo movie transmitted from the server 2 (step S110), and plays the photo movie (step S111). If the received photo movie is a photo movie with music, the control unit 301 (playback processing unit 35) decodes and plays the received photo movie, thereby synchronously playing back the music and video.

また、受信したフォトムービーが音楽無しフォトムービーである場合は、制御部３０１（再生処理部３５）は、指定した楽曲を記憶部３０２から読み出し、フォトムービーと同期させて再生する。ユーザ端末３の制御部３０１（再生処理部３５）は、フォトムービーと音源データとを同期再生することで、楽曲構成の変化に応じて画面が変化するフォトムービーが再生される。 Further, when the received photo movie is a photo movie without music, the control unit 301 (playback processing unit 35) reads the specified music from the storage unit 302 and plays it in synchronization with the photo movie. The control unit 301 (playback processing unit 35) of the user terminal 3 plays back the photo movie and the sound source data in synchronization, thereby playing back the photo movie whose screen changes according to changes in the music composition.

以上説明したように、第１の実施形態のフォトムービー生成システム１は、ユーザ端末３において楽曲及び画像を指定してサーバ２に音源データや画像データを送信すると、サーバ２側で音源データの音響解析処理を行い、音響特徴量データ４５に基づいてフォトムービーを生成する。ユーザ端末３は、サーバ２で生成したフォトムービーを受信し、音楽無しフォトムービーの場合は、音源データと同期再生する。 As explained above, in the photo movie generation system 1 of the first embodiment, when a song and images are specified on the user terminal 3 and the sound source data and image data are transmitted to the server 2, the server 2 performs acoustic analysis processing on the sound source data and generates a photo movie based on the acoustic feature data 45. The user terminal 3 receives the photo movie generated by the server 2 and, in the case of a photo movie without music, plays it in synchronization with the sound source data.

このように、フォトムービー生成システム１は、音響特徴量データ４５から検出できる楽曲の変化に応じて画面が変化するフォトムービーを生成してユーザに提供できる。また、サーバ２は、以前にフォトムービーを生成した楽曲については、ムービー構成情報２２４を蓄積記憶しておき、別のパターンのフォトムービーを生成するため、ユーザに多様な楽しみを提供できる。また、サーバ２は、以前に取得した音源データについては、音響特徴量データ４５を蓄積記憶しておくため、音響解析処理を省略して短時間でフォトムービーを生成できる。また、音楽無しフォトムービーを生成してユーザ端末３側で音源データと同期再生させることにより、音楽付きフォトムービーよりもファイルサイズを小さくでき、保存容量や通信容量を低減できる。 In this way, the photo movie generation system 1 can generate and provide to the user a photo movie whose screen changes according to changes in music that can be detected from the acoustic feature amount data 45. Further, the server 2 stores the movie configuration information 224 for songs for which photo movies have been previously generated, and generates photo movies of different patterns, so that it can provide a variety of entertainment to the user. Further, since the server 2 accumulates and stores acoustic feature data 45 for previously acquired sound source data, it is possible to generate a photo movie in a short time without performing acoustic analysis processing. Further, by generating a photo movie without music and playing it back in synchronization with the sound source data on the user terminal 3 side, the file size can be made smaller than that of a photo movie with music, and storage capacity and communication capacity can be reduced.

［第２の実施形態］ [Second embodiment]
次に、第２の実施形態のフォトムービー生成システム１Ａについて説明する。第２の実施形態では、ユーザ端末３が音源データを持たない場合のシステム構成について説明する。 Next, a photo movie generation system 1A according to a second embodiment will be described. The second embodiment describes a system configuration for the case where the user terminal 3 does not have sound source data.

図１３は本発明の第２の実施形態に係るフォトムービー生成システム１Ａの全体構成を示す図である。図に示すように、フォトムービー生成システム１Ａは、ユーザ端末３及びサーバ２がネットワーク５を介して通信接続される。また、ユーザ端末３及びサーバ２は、楽曲データベース４０を備えた音楽配信サーバ４とネットワーク５を介して通信接続可能となっている。サーバ２及びユーザ端末３のハードウェア構成は第１の実施形態と同様である。第１の実施形態と同一の各部は同一の符号を付し、重複する説明を省略する。 FIG. 13 is a diagram showing the overall configuration of a photo movie generation system 1A according to the second embodiment of the present invention. As shown in the figure, in the photo movie generation system 1A, a user terminal 3 and a server 2 are communicatively connected via a network 5. The user terminal 3 and the server 2 can also be communicatively connected via the network 5 to a music distribution server 4 having a music database 40. The hardware configurations of the server 2 and the user terminal 3 are the same as in the first embodiment. Parts identical to those of the first embodiment are given the same reference numerals, and duplicate explanations are omitted.

図１４は、音楽配信サーバ４の構成を示す図である。図に示すように、音楽配信サーバ４は、例えば制御部４０１、記憶部４０２、通信部４０３等をバス４０４等により接続して構成したコンピュータにより実現できる。但しこれに限ることなく、適宜様々な構成をとることができる。制御部４０１、記憶部４０２、通信部４０３の構成は、サーバ２の制御部２０１、記憶部２０２、通信部２０３の構成と同様である。音楽配信サーバ４の楽曲データベース４０には、楽曲の音源データ４４及び音響特徴量データ４５が記憶される。 FIG. 14 is a diagram showing the configuration of the music distribution server 4. As shown in the figure, the music distribution server 4 can be realized by, for example, a computer configured by connecting a control unit 401, a storage unit 402, a communication unit 403, and the like via a bus 404 or the like. However, the configuration is not limited to this, and various configurations can be adopted as appropriate. The configurations of the control unit 401, storage unit 402, and communication unit 403 are the same as those of the control unit 201, storage unit 202, and communication unit 203 of the server 2. The music database 40 of the music distribution server 4 stores the sound source data 44 and the acoustic feature data 45 of songs.

次に、図１５を参照して第２の実施形態のフォトムービー生成システム１Ａの機能構成について説明する。図に示すように、フォトムービー生成システム１Ａにおいて、サーバ２は、記憶部２０２、音響特徴量データ取得部２６、画像取得部２３、画像素材生成部２４、及びフォトムービー生成部２５等を備える。ユーザ端末３は、記憶部３０２、表示部３０４、音声出力部３０７の他、楽曲指定部３１、画像指定部３２、送信部３３、受信部３４、音源データ取得部３６等を備える。音楽配信サーバ４は、楽曲データベース４０、送信部４１、及び音響解析部４２を備える。 Next, the functional configuration of the photo movie generation system 1A of the second embodiment will be described with reference to FIG. 15. As shown in the figure, in the photo movie generation system 1A, the server 2 includes a storage section 202, an acoustic feature amount data acquisition section 26, an image acquisition section 23, an image material generation section 24, a photo movie generation section 25, and the like. The user terminal 3 includes a storage unit 302, a display unit 304, an audio output unit 307, a music designation unit 31, an image designation unit 32, a transmission unit 33, a reception unit 34, a sound source data acquisition unit 36, and the like. The music distribution server 4 includes a music database 40, a transmitter 41, and an acoustic analyzer 42.

第２の実施形態のフォトムービー生成システム１Ａにおいて、第１の実施形態のフォトムービー生成システム１と異なる点は、サーバ２に音源データ取得部２１及び音響解析部２２を設けず、音響特徴量データ取得部２６を設けた点、ユーザ端末３の記憶部３０２に音源データを記憶せず音源データ取得部３６を設けた点、及び楽曲データベース４０及び音響解析部４２を備えた音楽配信サーバ４を備えた点である。 The photo movie generation system 1A of the second embodiment differs from the photo movie generation system 1 of the first embodiment in three respects: the server 2 is provided with the acoustic feature data acquisition unit 26 instead of the sound source data acquisition unit 21 and the acoustic analysis unit 22; the user terminal 3 does not store sound source data in the storage unit 302 and is instead provided with the sound source data acquisition unit 36; and the system includes the music distribution server 4, which has the music database 40 and the acoustic analysis unit 42.

音楽配信サーバ４は、様々な楽曲の音源データ４４や音響特徴量データ４５、楽曲歌詞情報、楽曲書誌情報等を楽曲ＩＤと紐づけて楽曲データベース４０に記憶している。音源データ４４は、図１６に示すように、楽曲ＩＤに紐づけられて記憶されている。 The music distribution server 4 stores sound source data 44, acoustic feature data 45, song lyrics information, song bibliographic information, etc. of various songs in a song database 40 in association with song IDs. As shown in FIG. 16, the sound source data 44 is stored in association with a music ID.

音楽配信サーバ４の音響解析部４２は、楽曲データベース４０に記憶されている楽曲について音響解析処理を実施し、解析結果として音響特徴量データ４５を求め、楽曲ＩＤと紐づけて楽曲データベース４０に記憶する。音響特徴量データ４５は、図５に示す音響特徴量データ４５と同様とする。 The acoustic analysis unit 42 of the music distribution server 4 performs acoustic analysis processing on the songs stored in the music database 40, obtains the acoustic feature data 45 as the analysis result, and stores it in the music database 40 in association with the song ID. The acoustic feature data 45 is the same as the acoustic feature data 45 shown in FIG. 5.

音楽配信サーバ４の送信部４１は、ユーザ端末３からの要求に応じて音源データ４４をユーザ端末３に送信する。また、サーバ２からの要求に応じて音源データ４４や音響特徴量データ４５をサーバ２に送信する。 The transmission unit 41 of the music distribution server 4 transmits the sound source data 44 to the user terminal 3 in response to a request from the user terminal 3. In addition, the transmission unit 41 transmits the sound source data 44 and the acoustic feature data 45 to the server 2 in response to a request from the server 2.

サーバ２の音響特徴量データ取得部２６は、ユーザ端末３の楽曲指定部３１により指定された楽曲の楽曲ＩＤを受信すると、楽曲ＩＤに対応する楽曲の音響特徴量データ４５を音楽配信サーバ４に要求し、取得する。 Upon receiving the song ID of the song specified by the music designation unit 31 of the user terminal 3, the acoustic feature data acquisition unit 26 of the server 2 requests the acoustic feature data 45 of the song corresponding to that song ID from the music distribution server 4 and acquires it.

ユーザ端末３の音源データ取得部３６は、楽曲指定部３１により指定された楽曲の音源データ４４を音楽配信サーバ４に要求し、取得する。音楽無しフォトムービーがサーバ２から送信された場合、ユーザ端末３の再生処理部３５は、音源データ取得部３６によって音楽配信サーバ４から音源データを取得し、受信した音楽無しフォトムービーと同期再生する。 The sound source data acquisition unit 36 of the user terminal 3 requests the sound source data 44 of the song specified by the music designation unit 31 from the music distribution server 4 and acquires it. When a photo movie without music is transmitted from the server 2, the playback processing unit 35 of the user terminal 3 acquires the sound source data from the music distribution server 4 via the sound source data acquisition unit 36 and plays it in synchronization with the received photo movie without music.

次に、図１７を参照して、第２の実施形態のフォトムービー生成システム１Ａにおける処理の流れを説明する。 Next, with reference to FIG. 17, the flow of processing in the photo movie generation system 1A of the second embodiment will be described.
ユーザ端末３において、フォトムービーアプリを起動すると、ユーザ端末３の制御部３０１は、フォトムービーに使用する楽曲の指定（ステップＳ４０１）、及び画像の指定（ステップＳ４０２）を受け付ける。ステップＳ４０１～ステップＳ４０２において、制御部３０１は、楽曲や画像を選択入力するための入力画面を表示してもよい。 When the photo movie application is started on the user terminal 3, the control unit 301 of the user terminal 3 accepts the designation of a song to be used in the photo movie (step S401) and the designation of images (step S402). In steps S401 to S402, the control unit 301 may display an input screen for selecting and entering songs and images.

第２の実施形態では、指定できる楽曲は、音楽配信サーバ４の楽曲データベース４０に記憶されている楽曲（音源データ４４）とする。また、指定できる画像は、第１の実施形態と同様に、ユーザ端末３の記憶部３０２に記憶されている画像、及びサーバ２に記憶されている一般画像素材２２２とする。 In the second embodiment, the music that can be specified is the music (sound source data 44) stored in the music database 40 of the music distribution server 4. Furthermore, the images that can be specified are the images stored in the storage unit 302 of the user terminal 3 and the general image materials 222 stored in the server 2, as in the first embodiment.

楽曲及び画像が指定されると、制御部３０１は、指定された画像データを記憶部３０２から取得し（ステップＳ４０３）、指定された楽曲の楽曲ＩＤとともにサーバ２に送信する（ステップＳ４０４）。なお、一般画像素材２２２が指定された場合は、制御部３０１は、指定された一般画像素材２２２の識別情報をサーバ２に送信する。 When a song and an image are specified, the control unit 301 acquires the specified image data from the storage unit 302 (step S403), and transmits it to the server 2 together with the song ID of the specified song (step S404). Note that when the general image material 222 is specified, the control unit 301 transmits the identification information of the specified general image material 222 to the server 2.

サーバ２は、楽曲ＩＤ及び画像データ（一般画像素材２２２が指定された場合は、一般画像素材２２２の識別情報）を受信すると（ステップＳ４０５）、制御部２０１（音響特徴量データ取得部２６）は、受信した楽曲ＩＤに対応する音響特徴量データ４５を音楽配信サーバ４に要求する（ステップＳ４０６）。 When the server 2 receives the song ID and the image data (or, if a general image material 222 is specified, the identification information of the general image material 222) (step S405), the control unit 201 (acoustic feature data acquisition unit 26) requests the acoustic feature data 45 corresponding to the received song ID from the music distribution server 4 (step S406).

音楽配信サーバ４はサーバ２から楽曲ＩＤに対応する音響特徴量データ４５の要求を受信すると（ステップＳ４０７）、要求された楽曲ＩＤに対応する音響特徴量データ４５を楽曲データベース４０から読み出し、サーバ２に送信する（ステップＳ４０８）。サーバ２は楽曲ＩＤに対応する音響特徴量データ４５を受信する（ステップＳ４０９）。 When the music distribution server 4 receives a request for the acoustic feature data 45 corresponding to the song ID from the server 2 (step S407), it reads the acoustic feature data 45 corresponding to the requested song ID from the song database 40, and sends it to the server 2. (Step S408). The server 2 receives the acoustic feature data 45 corresponding to the song ID (step S409).

なお、ステップＳ４０８において、音楽配信サーバ４の楽曲データベース４０に該当する音響特徴量データ４５が記憶されていない場合、音楽配信サーバ４は要求された楽曲ＩＤの楽曲について音響解析部４２により音響解析処理を実行し、音響特徴量データ４５を得る。音楽配信サーバ４は音響特徴量データ４５を楽曲ＩＤと紐づけて楽曲データベース４０に記憶するとともに、要求元のサーバ２に送信する。 In step S408, if the corresponding acoustic feature data 45 is not stored in the music database 40 of the music distribution server 4, the music distribution server 4 executes acoustic analysis processing for the song with the requested song ID using the acoustic analysis unit 42 to obtain acoustic feature data 45. The music distribution server 4 associates the acoustic feature data 45 with the song ID, stores it in the music database 40, and transmits it to the server 2 that made the request.
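The lookup-then-analyze behavior of step S408 amounts to a cache in front of the acoustic analysis. The sketch below is an assumption-laden illustration: `feature_db` and `source_db` are plain dictionaries standing in for the music database 40, and `analyze` is a caller-supplied placeholder for the acoustic analysis unit 42, whose actual algorithm is not described in this section.

```python
def get_acoustic_features(song_id, feature_db, source_db, analyze):
    """Return acoustic feature data for song_id, analyzing only on a miss.

    feature_db: dict of song ID -> acoustic feature data (stand-in for the
                feature part of the music database 40).
    source_db:  dict of song ID -> sound source data.
    analyze:    callable taking sound source data and returning feature data
                (hypothetical placeholder for the acoustic analysis unit 42).
    """
    if song_id not in feature_db:
        # No stored features: run the analysis and store the result
        # under the song ID so later requests skip the analysis.
        feature_db[song_id] = analyze(source_db[song_id])
    return feature_db[song_id]
```

Because the result is stored under the song ID, a second request for the same song is answered from storage without running the analysis again.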

また、サーバ２の制御部２０１（画像素材生成部２４）は、ステップＳ４０５で受信した画像データについて必要に応じて画像素材生成処理を実行し、画像データから画像素材を切り出す（ステップＳ４１０）。なお、受信した画像データをそのまま画像素材として使用してもよい。 Further, the control unit 201 (image material generation unit 24) of the server 2 executes image material generation processing as necessary on the image data received in step S405, and cuts out image materials from the image data (step S410). Note that the received image data may be used as is as the image material.

サーバ２の制御部２０１（フォトムービー生成部２５）は、画像素材と所定のエフェクトテンプレート２２１を使用し、音響特徴量データ４５（例えば、「楽曲構成」の変化）に応じて画面が変化するフォトムービー（音楽無しフォトムービー）を生成する（ステップＳ４１１）。 The control unit 201 (photo movie generation unit 25) of the server 2 uses the image materials and predetermined effect templates 221 to generate a photo movie (a photo movie without music) whose screen changes according to the acoustic feature data 45 (for example, changes in the "music composition") (step S411).

フォトムービーの生成方法は第１の実施形態と同様である。すなわち、制御部２０１（フォトムービー生成部２５）は、図１０に示すように、ユーザ端末３から指定された楽曲ＩＤに対応するムービー構成情報２２４が既に記憶部２０２に記憶されているか否かを判定し、記憶部２０２に記憶されていない場合はエフェクト対応テーブル２２３から任意のパターンのエフェクト対応テーブル２２３を使用してフォトムービーを生成する。 The method of generating a photo movie is the same as in the first embodiment. That is, as shown in FIG. 10, the control unit 201 (photo movie generation unit 25) determines whether the movie configuration information 224 corresponding to the song ID specified from the user terminal 3 is already stored in the storage unit 202, and if it is not stored, generates a photo movie using an arbitrary pattern from the effect correspondence table 223.

また、ユーザ端末３から指定された楽曲ＩＤに対応するムービー構成情報２２４が既に記憶部２０２に記憶されている場合は、制御部２０１は、エフェクト対応テーブル２２３から未使用のパターンのエフェクト対応テーブル２２３を読み出し（ステップＳ３０４）、読み出したパターンのエフェクト対応テーブル２２３に従ってフォトムービーを生成する。 If the movie configuration information 224 corresponding to the song ID specified from the user terminal 3 is already stored in the storage unit 202, the control unit 201 reads an unused pattern from the effect correspondence table 223 (step S304) and generates a photo movie according to the read pattern of the effect correspondence table 223.

制御部２０１（フォトムービー生成部２５）はフォトムービーを生成すると、生成したフォトムービーのムービー構成情報２２４を楽曲ＩＤと紐づけて記憶部２０２に記憶する。ムービー構成情報２２４は、図１２に示すように、楽曲ＩＤと紐づけて、楽曲構成、時間、適用エフェクト、使用画像枚数、使用画像ファイル名が格納される。また、サーバ２は生成したフォトムービーをユーザ端末３に送信する（ステップＳ４１２）。 When the control unit 201 (photo movie generation unit 25) generates a photo movie, it stores the movie configuration information 224 of the generated photo movie in the storage unit 202 in association with the song ID. As shown in FIG. 12, the movie structure information 224 stores the song structure, time, applied effects, number of used images, and used image file name in association with the song ID. Further, the server 2 transmits the generated photo movie to the user terminal 3 (step S412).

ユーザ端末３はサーバ２からフォトムービーを受信すると（ステップＳ４１３）、再生処理を行う。受信したフォトムービーは音楽無しフォトムービーであるため、制御部３０１（再生処理部３５）は、ステップＳ４０１で指定した楽曲の楽曲ＩＤの音源データ４４を音楽配信サーバ４に要求する（ステップＳ４１４）。 When the user terminal 3 receives the photo movie from the server 2 (step S413), it performs playback processing. Since the received photo movie is a photo movie without music, the control unit 301 (playback processing unit 35) requests the music distribution server 4 for the sound source data 44 of the song ID of the song specified in step S401 (step S414).

音楽配信サーバ４は、ユーザ端末３から楽曲ＩＤの音源データ４４の要求を受信すると（ステップＳ４１５）、要求された楽曲ＩＤの音源データ４４を楽曲データベース４０から読み出し、要求元のユーザ端末３に送信する（ステップＳ４１６）。 When the music distribution server 4 receives a request for the sound source data 44 of the song ID from the user terminal 3 (step S415), the music distribution server 4 reads the sound source data 44 of the requested song ID from the song database 40 and transmits it to the user terminal 3 that made the request. (Step S416).

ユーザ端末３は、音楽配信サーバ４から音源データ４４を受信すると（ステップＳ４１７）、ユーザ端末３の制御部３０１（再生処理部３５）は、ステップＳ４１３で受信した音楽無しフォトムービーと音源データ４４とを同期させて再生する（ステップＳ４１８）。これにより、楽曲構成の変化に応じて画面が変化するフォトムービーを再生できる。 When the user terminal 3 receives the sound source data 44 from the music distribution server 4 (step S417), the control unit 301 (playback processing unit 35) of the user terminal 3 combines the sound source data 44 with the photo movie without music received in step S413. are synchronized and reproduced (step S418). With this, it is possible to play a photo movie whose screen changes according to changes in the music composition.

なお、上述の例は、音楽無しフォトムービーを生成する場合の手順について説明したが、ステップＳ４０６でサーバ２に音楽配信サーバ４に対して音響特徴量データ４５の送信を要求する際に、音源データ４４の送信も要求し、該当の楽曲の音源データ４４を取得すれば、音楽付きフォトムービーを生成することも可能である。サーバ２により音楽付きフォトムービーが生成された場合、ユーザ端末３は音楽付きフォトムービーを受信し、制御部３０１（再生処理部３５）により受信したフォトムービーを復号し再生することで音楽及び映像が同時に再生される。 The above example describes the procedure for generating a photo movie without music. However, if the server 2, when requesting the music distribution server 4 to transmit the acoustic feature data 45 in step S406, also requests transmission of the sound source data 44 and acquires the sound source data 44 of the song, it can also generate a photo movie with music. When the server 2 generates a photo movie with music, the user terminal 3 receives it, and the control unit 301 (playback processing unit 35) decodes and plays the received photo movie, so that the music and video are played back simultaneously.
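On the user terminal side, the two cases (photo movie with music, photo movie without music) differ only in where the audio comes from. The following sketch shows that branch under assumed names: the dictionary keys `song_id`, `video`, and `audio` and the callable `fetch_sound_source` are hypothetical and stand in for the received photo movie and for the request to the music distribution server 4 (steps S414 to S417).

```python
def prepare_playback(movie, fetch_sound_source):
    """Return (video, audio) ready for synchronized playback.

    movie: dict with hypothetical keys "song_id", "video", and "audio";
           "audio" is None for a photo movie without music.
    fetch_sound_source: callable taking a song ID and returning the sound
           source data (stand-in for the request to the music distribution
           server 4).
    """
    audio = movie.get("audio")
    if audio is None:
        # Photo movie without music: obtain the sound source separately.
        audio = fetch_sound_source(movie["song_id"])
    return movie["video"], audio
```

A photo movie with music never triggers the fetch, which matches the text: only the music-less case requires the terminal to obtain the sound source data itself.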

以上のように、第２の実施形態では、ユーザ端末３が音源データを持たず、サーバ２が音楽配信サーバ４から音響特徴量データ４５を取得してフォトムービーを生成するフォトムービー生成システム１Ａについて説明した。第２の実施形態のフォトムービー生成システム１Ａによれば、サーバ２は音響解析処理を実施せず、既に解析済みの音響特徴量データ４５を音楽配信サーバ４から取得してフォトムービーの生成に使用するため、サーバ２の処理負担が軽減される。また、ユーザ端末３とサーバ２との間で音源データ４４の送受信を行わないため、通信量が少なくなり、通信に要する時間が短縮される。 As described above, in the second embodiment, a photo movie generation system 1A has been described in which the user terminal 3 does not have sound source data, and the server 2 acquires acoustic feature data 45 from the music distribution server 4 to generate a photo movie. According to the photo movie generation system 1A of the second embodiment, the server 2 does not perform acoustic analysis processing, but acquires already analyzed acoustic feature data 45 from the music distribution server 4 and uses it to generate a photo movie, thereby reducing the processing burden on the server 2. Also, since sound source data 44 is not sent or received between the user terminal 3 and the server 2, the amount of communication is reduced, and the time required for communication is shortened.

Preferred embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to these examples. For example, a configuration may be adopted in which the music distribution server 4 has neither the acoustic analysis unit 42 nor the acoustic feature data 45, and instead the server 2 includes the acoustic analysis unit 42, performs acoustic analysis processing on songs received from the music distribution server 4, and accumulates and stores the acoustic feature data 45 of each song.
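The variation in which the server analyzes each received song once and accumulates the resulting feature data could look roughly like the sketch below. The class `AcousticFeatureStore` and the placeholder `toy_analyze` function are assumptions made for illustration; the actual analysis method is not specified in this passage.

```python
from typing import Callable, Dict


class AcousticFeatureStore:
    """Hypothetical server-side store that analyzes each song at most once."""

    def __init__(self, analyze: Callable[[bytes], dict]):
        self._analyze = analyze
        self._features: Dict[str, dict] = {}
        self.analysis_runs = 0  # counts how often the analyzer actually ran

    def get(self, song_id: str, sound_source: bytes) -> dict:
        # Analyze only on first sight of a song ID, then reuse the stored result.
        if song_id not in self._features:
            self._features[song_id] = self._analyze(sound_source)
            self.analysis_runs += 1
        return self._features[song_id]


def toy_analyze(sound_source: bytes) -> dict:
    # Placeholder only: a real analyzer would estimate sections, tempo, beats, etc.
    return {"length_bytes": len(sound_source)}


feature_store = AcousticFeatureStore(toy_analyze)
feature_store.get("song-001", b"abcd")
feature_store.get("song-001", b"abcd")  # served from the store, no re-analysis
print(feature_store.analysis_runs)
```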

In the embodiments described above, the server 2 generates a photo movie whose display changes mainly according to changes in the song structure, but the present invention is not limited to this. A photo movie may be generated whose display changes according to changes other than song structure, for example changes in the tempo of the song or changes in the beat (meter). It is clear that those skilled in the art can arrive at various other changes or modifications within the scope of the technical idea disclosed in this application, and these are naturally understood to fall within the technical scope of the present invention.
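As one way to picture a display that follows tempo changes, the sketch below derives image switch times from tempo segments. The function `image_switch_times`, the `(start, end, bpm)` segment format, and the rule of switching every fixed number of beats are assumptions chosen for illustration, not a method stated in the text.

```python
def image_switch_times(tempo_segments, beats_per_image=4):
    """Return the times (seconds) at which the displayed image changes.

    tempo_segments: list of (start_sec, end_sec, bpm) tuples, a hypothetical
    simplification of acoustic feature data. Beat counting restarts at the
    beginning of each segment.
    """
    times = []
    for start, end, bpm in tempo_segments:
        step = 60.0 / bpm * beats_per_image  # seconds per image at this tempo
        n = 0
        while start + n * step < end:
            times.append(round(start + n * step, 3))
            n += 1
    return times


# A song that slows from 120 BPM to 80 BPM after 8 seconds.
segments = [(0.0, 8.0, 120.0), (8.0, 14.0, 80.0)]
print(image_switch_times(segments))
```

With these inputs the images change every 2 seconds in the faster segment and every 3 seconds in the slower one, so the pacing of the visuals tracks the tempo.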

1 ... Photo movie generation system
2 ... Server
3 ... User terminal
4 ... Music distribution server
5 ... Network
21 ... Sound source data acquisition unit
22 ... Acoustic analysis unit
23 ... Image acquisition unit
24 ... Image material generation unit
25 ... Photo movie generation unit
26 ... Acoustic feature data acquisition unit
31 ... Song designation unit
32 ... Image designation unit
33 ... Transmission unit
34 ... Reception unit
35 ... Playback processing unit
36 ... Sound source data acquisition unit
40 ... Song database
41 ... Transmission unit
42 ... Acoustic analysis unit
44 ... Sound source data
45 ... Acoustic feature data
61, 62, 63 ... Photo movie
221 ... Effect template
222 ... General image material
223 ... Effect correspondence table
224 ... Movie configuration information
201, 301 ... Control unit
202, 302 ... Storage unit
203, 303 ... Communication unit
304 ... Input unit
305 ... Display unit
306 ... Peripheral device I/F unit
307 ... Audio processing unit

Claims

A photo movie generation system in which a server and a user terminal are communicatively connected via a network, wherein
the user terminal comprises:
music designation means for accepting a designation of a song to be used in a photo movie and transmitting, to the server, the sound source data of the designated song or a song ID that is identification information of the song;
image designation means for accepting designations of a plurality of images to be used as image materials for the photo movie and transmitting the designated image data or designation information of the images to the server; and
playback means for receiving and playing the photo movie transmitted from the server,
the server comprises:
acoustic feature data acquisition means for acquiring acoustic feature data of the song designated at the user terminal;
image acquisition means for acquiring images to be used in the photo movie;
photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data, and transmitting the generated photo movie to the user terminal; and
storage means for storing in advance an effect template in which at least some of the effects are associated with a number of used images of two or more, and an effect correspondence table, which is a table that associates the acoustic feature data with the effect template suited to that acoustic feature data and in which a plurality of patterns of effect application examples are defined,
the storage means of the server stores movie configuration information, which is information constituting the photo movie and holds the number of used images obtained from the effect template and the applied effects obtained from the effect correspondence table, in association with the song ID, and
the photo movie generation means refers to the movie configuration information stored in association with the designated song ID, reads an arbitrary pattern from the effect correspondence table if the movie configuration information is not stored in the storage means, reads an unused pattern from the effect correspondence table if the movie configuration information is already stored in the storage means, and generates the photo movie according to the effect correspondence table of the read pattern.
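For illustration only, the pattern selection recited above (an arbitrary pattern on first generation for a song ID, an unused pattern afterwards) might be sketched as follows. The table contents, the names `EFFECT_TEMPLATES`, `EFFECT_PATTERNS`, `select_pattern`, and `generate_movie_config`, the choice to record a pattern key inside the movie configuration information, and the error raised when every pattern has been used are all assumptions; the claim does not specify them.

```python
import random

# Hypothetical effect templates: effect name -> number of used images.
EFFECT_TEMPLATES = {"fade": 1, "split_two": 2, "collage_four": 4}

# Hypothetical effect correspondence table with several patterns, each mapping
# a song section label (from the acoustic feature data) to an effect template.
EFFECT_PATTERNS = {
    "A": {"verse": "fade", "chorus": "collage_four"},
    "B": {"verse": "split_two", "chorus": "fade"},
    "C": {"verse": "fade", "chorus": "split_two"},
}


def select_pattern(song_id, config_store, rng=random):
    """Pick any pattern for a new song ID, otherwise one not used before."""
    used = {cfg["pattern"] for cfg in config_store.get(song_id, [])}
    if not used:
        return rng.choice(sorted(EFFECT_PATTERNS))
    unused = sorted(set(EFFECT_PATTERNS) - used)
    if not unused:
        # Assumption: behavior once all patterns are used is not specified.
        raise LookupError("every pattern has already been used for " + song_id)
    return unused[0]


def generate_movie_config(song_id, sections, config_store, rng=random):
    """Build movie configuration information and store it under the song ID."""
    pattern = select_pattern(song_id, config_store, rng)
    effects = [EFFECT_PATTERNS[pattern][s] for s in sections]
    config = {
        "pattern": pattern,
        "applied_effects": effects,
        "used_image_counts": [EFFECT_TEMPLATES[e] for e in effects],
    }
    config_store.setdefault(song_id, []).append(config)
    return config


config_store = {}
first = generate_movie_config("song-001", ["verse", "chorus"], config_store)
second = generate_movie_config("song-001", ["verse", "chorus"], config_store)
print(first["pattern"], second["pattern"])
```

Because the stored configuration is consulted before each generation, two movies requested for the same song ID come out with different effect patterns.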

2. The photo movie generation system according to claim 1, wherein the storage means of the server stores the acoustic feature data of the song in association with the song ID, and the acoustic feature data acquisition means acquires the acoustic feature data stored in the storage means.

3. The photo movie generation system according to claim 1, wherein the acoustic feature data acquisition means of the server acquires the acoustic feature data of the song designated at the user terminal from a music distribution server connected to the user terminal via a network.

4. The photo movie generation system according to claim 1, wherein the server further comprises acoustic analysis means for performing acoustic analysis processing on the song to obtain acoustic feature data, and the acoustic feature data acquisition means acquires the acoustic feature data obtained by the acoustic analysis means.

5. The photo movie generation system according to claim 4, wherein the server acquires the sound source data of the song from a music distribution server connected to the server via a network.

6. The photo movie generation system according to claim 4, wherein the server acquires the sound source data of the song from the user terminal.

7. The photo movie generation system according to any one of claims 1 to 6, wherein the photo movie generation means of the server generates the photo movie without adding a sound source and outputs the photo movie in association with the song ID of the song, and the playback means of the user terminal acquires the sound source data of the song corresponding to the song ID associated with the photo movie, and plays the acquired sound source data and the photo movie in synchronization.

8. The photo movie generation system according to any one of claims 1 to 7, wherein the server further comprises image material generation means for generating an image material from the image acquired by the image acquisition means, and the photo movie generation means generates the photo movie using the image material generated by the image material generation means.

9. The photo movie generation system according to claim 1, wherein the user terminal further comprises image sending means for sending images to be used in the photo movie to the server, and the image acquisition means of the server receives the images transmitted from the user terminal.

A photo movie generation device comprising:
image acquisition means for acquiring images;
acoustic feature data acquisition means for acquiring acoustic feature data of a song;
photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data; and
storage means for storing in advance an effect template in which at least some of the effects are associated with a number of used images of two or more, and an effect correspondence table, which is a table that associates the acoustic feature data with the effect template suited to that acoustic feature data and in which a plurality of patterns of effect application examples are defined,
wherein the storage means stores movie configuration information, which is information constituting the photo movie and holds the number of used images obtained from the effect template and the applied effects obtained from the effect correspondence table, in association with a song ID that is identification information of the song, and
the photo movie generation means refers to the movie configuration information stored in association with a designated song ID, reads an arbitrary pattern from the effect correspondence table if the movie configuration information is not stored in the storage means, reads an unused pattern from the effect correspondence table if the movie configuration information is already stored in the storage means, and generates the photo movie according to the effect correspondence table of the read pattern.

A photo movie generation device comprising:
image acquisition means for acquiring images;
sound source data acquisition means for acquiring sound source data of a song;
acoustic analysis means for analyzing the sound source data to obtain acoustic feature data;
photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data; and
storage means for storing in advance an effect template in which at least some of the effects are associated with a number of used images of two or more, and an effect correspondence table, which is a table that associates the acoustic feature data with the effect template suited to that acoustic feature data and in which a plurality of patterns of effect application examples are defined,
wherein the storage means stores movie configuration information, which is information constituting the photo movie and holds the number of used images obtained from the effect template and the applied effects obtained from the effect correspondence table, in association with a song ID that is identification information of the song, and
the photo movie generation means refers to the movie configuration information stored in association with a designated song ID, reads an arbitrary pattern from the effect correspondence table if the movie configuration information is not stored in the storage means, reads an unused pattern from the effect correspondence table if the movie configuration information is already stored in the storage means, and generates the photo movie according to the effect correspondence table of the read pattern.

A user terminal that can be communicatively connected, via a network, to a server that is the photo movie generation device according to claim 10 or 11, the user terminal comprising:
music designation means for accepting a designation of a song and transmitting, to the server, the sound source data of the designated song or a song ID that is identification information of the song;
image designation means for accepting designations of a plurality of images and transmitting the designated image data or designation information of the images to the server; and
playback means for receiving and playing a photo movie, transmitted from the server, that displays a plurality of images and whose display changes according to changes in the music.

A photo movie generation method in a photo movie generation system in which a server and a user terminal are communicatively connected via a network, the method comprising:
a step in which the server stores in advance an effect template in which at least some of the effects are associated with a number of used images of two or more, and an effect correspondence table, which is a table that associates acoustic feature data with the effect template suited to that acoustic feature data and in which a plurality of patterns of effect application examples are defined;
a step in which the user terminal accepts a designation of a song to be used in a photo movie and transmits, to the server, the sound source data of the designated song or a song ID that is identification information of the song;
a step in which the user terminal accepts designations of a plurality of images to be used as image materials for the photo movie and transmits the designated image data or designation information of the images to the server;
a step in which the server acquires the images;
a step in which the server acquires acoustic feature data of the song designated at the user terminal;
a step in which the server uses the images to generate a photo movie whose display changes according to changes in the music based on the acoustic feature data, and transmits the photo movie to the user terminal; and
a step in which the user terminal receives and plays the photo movie transmitted from the server,
wherein, in the step in which the server generates the photo movie and transmits it to the user terminal,
movie configuration information for the generated photo movie, which is information constituting the photo movie and holds the number of used images obtained from the effect template and the applied effects obtained from the effect correspondence table, is stored in the server in association with the song ID, and
when a photo movie is generated, the movie configuration information stored in association with the designated song ID is referred to, an arbitrary pattern is read from the effect correspondence table if the movie configuration information is not stored in the server, an unused pattern is read from the effect correspondence table if the movie configuration information is already stored in the server, and the photo movie is generated according to the effect correspondence table of the read pattern.

A program for causing a computer to function as:
image acquisition means for acquiring images;
acoustic feature data acquisition means for acquiring acoustic feature data of a song;
photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data; and
storage means for storing in advance an effect template in which at least some of the effects are associated with a number of used images of two or more, and an effect correspondence table, which is a table that associates the acoustic feature data with the effect template suited to that acoustic feature data and in which a plurality of patterns of effect application examples are defined,
wherein the storage means stores movie configuration information, which is information constituting the photo movie and holds the number of used images obtained from the effect template and the applied effects obtained from the effect correspondence table, in association with a song ID that is identification information of the song, and
the photo movie generation means refers to the movie configuration information stored in association with a designated song ID, reads an arbitrary pattern from the effect correspondence table if the movie configuration information is not stored in the storage means, reads an unused pattern from the effect correspondence table if the movie configuration information is already stored in the storage means, and generates the photo movie according to the effect correspondence table of the read pattern.

A program for causing a computer to function as:
image acquisition means for acquiring images;
sound source data acquisition means for acquiring sound source data of a song;
acoustic analysis means for analyzing the sound source data to obtain acoustic feature data;
photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data; and
storage means for storing in advance an effect template in which at least some of the effects are associated with a number of used images of two or more, and an effect correspondence table, which is a table that associates the acoustic feature data with the effect template suited to that acoustic feature data and in which a plurality of patterns of effect application examples are defined,
wherein the storage means stores movie configuration information, which is information constituting the photo movie and holds the number of used images obtained from the effect template and the applied effects obtained from the effect correspondence table, in association with a song ID that is identification information of the song, and
the photo movie generation means refers to the movie configuration information stored in association with a designated song ID, reads an arbitrary pattern from the effect correspondence table if the movie configuration information is not stored in the storage means, reads an unused pattern from the effect correspondence table if the movie configuration information is already stored in the storage means, and generates the photo movie according to the effect correspondence table of the read pattern.

A program for causing a computer that can be communicatively connected, via a network, to a server that executes the program according to claim 14 or 15, to function as:
music designation means for accepting a designation of a song and transmitting, to the server, the sound source data of the designated song or a song ID that is identification information of the song;
image designation means for accepting designations of a plurality of images and transmitting the designated image data or designation information of the images to the server; and
playback means for receiving and playing a photo movie, transmitted from the server, that displays a plurality of images and whose display changes according to changes in the music.