JP2021157007A

JP2021157007A - Photo movie generation system, photo movie generation device, user terminal, photo movie generation method, and program

Info

Publication number: JP2021157007A
Application number: JP2020055808A
Authority: JP
Inventors: 健秀岸本; Takehide Kishimoto; 健二朗村田; Kenjiro Murata
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2021-10-07
Anticipated expiration: 2040-03-26
Also published as: JP7456232B2

Abstract

To provide a photo movie generation system and the like that let a user play back a favorite piece of music together with a photo movie and enjoy watching and listening to it. SOLUTION: In a photo movie generation system 1, a server 2 includes an acoustic analysis unit 22, an image acquisition unit 23, an image material generation unit 24, a photo movie generation unit 25, and the like. A user terminal 3 includes a music designation unit 31, an image designation unit 32, a transmitting unit 33, a receiving unit 34, and the like. When a piece of music and an image are designated on the user terminal 3 and the sound source data (or a music ID) and the image data are transmitted to the server 2, the server 2 acquires acoustic feature data 45 of the sound source data, generates a photo movie whose screen changes according to changes in the structure of the music that can be detected from the acoustic feature data 45, and transmits the photo movie to the user terminal 3. The user terminal 3 plays back the photo movie in synchronization with the sound source data. SELECTED DRAWING: Figure 4

Description

The present invention relates to a photo movie generation system and the like that generate and provide photo movies.

Some cloud photo storage services offer a feature that automatically creates a photo movie from photos stored in cloud storage (Non-Patent Document 1). For example, in the cloud photo storage service of Non-Patent Document 1, when the user selects the photos or videos to be made into a movie from the images saved in cloud storage and then selects a desired theme from several themes prepared in advance, the movie is generated automatically on the service side.

A method of adding music that matches the impression of a captured video has also been reported (Non-Patent Document 2). Non-Patent Document 2 proposes a method that extracts feature values from the captured video (video features), extracts feature values from music (music features), calculates the relationship between the video and the impressions of melody and rhythm, and generates music that matches the user's impression.

Meanwhile, Patent Document 1 describes a sound reproduction device comprising an expression acquisition unit that acquires an acoustic expression representing an impression or timing in arbitrary acoustic data, an effect imparting unit that applies a visual effect or the like corresponding to the acoustic expression to an image or the like, and a display that reproduces the acoustic data together with the image or the like to which the effect has been applied. This makes it possible to acquire the acoustic expression of acoustic content such as commercially available music data and to apply visual effects or the like according to that acoustic expression.

Patent Document 1: Japanese Unexamined Patent Publication No. 2013-114088

Non-Patent Document 1: Google (registered trademark) Photos, "Create movie", [online], [retrieved March 23, 2020], Internet <URL: https://photos.google.com/movies/create>
Non-Patent Document 2: Yurina Shimizu and 4 others, "Automatic selection of video BGM materials based on impression estimation from video features", NICOGRAPH 2016, pp. 177-184

However, although the movie creation function of Non-Patent Document 1 described above lets the user select the music (BGM) to be added to the photo movie, the selectable music is limited to default categories (dramatic, electronic, rockin', upbeat, and so on) or to songs in a music app on the user terminal that are not protected by DRM (Digital Rights Management). The method proposed in Non-Patent Document 2 generates music that matches the impression of a video. Consequently, neither approach allows music obtained from a commercially available CD or a music distribution service to be used as the BGM of a photo movie. The method of Patent Document 1, by contrast, extracts an acoustic expression from commercially available music data and applies visual effects or the like according to that expression, but because the visual effect assigned to each expression is fixed, the user may grow bored after viewing the result several times.

The present invention has been made in view of the above problems, and its object is to provide a photo movie generation system and the like that enable a user's favorite music to be played back with a photo movie so that the user can enjoy watching and listening to the music.

A first invention for solving the above problem is a photo movie generation system in which a server and a user terminal are communicably connected via a network. The user terminal comprises: music designation means for accepting designation of a piece of music to be used for a photo movie and transmitting to the server either the sound source data of the designated music or a music ID, which is identification information of the music; and playback means for receiving and playing back the photo movie transmitted from the server. The server comprises: acoustic feature data acquisition means for acquiring acoustic feature data of the music designated on the user terminal; image acquisition means for acquiring images to be used for the photo movie; and photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data, and transmitting the photo movie to the user terminal.

According to the first invention, the server acquires the acoustic feature data of the music designated on the user terminal and the images to be used for the photo movie, generates from the acquired images a photo movie whose display changes according to changes in the music based on the acoustic feature data, and transmits it to the user terminal. The user terminal receives the photo movie and plays it back. Because a photo movie whose display changes with the music can thus be generated for the music the user designated and provided to the user terminal, the user can watch and listen to a favorite piece of music accompanied by a photo movie, and can therefore enjoy listening to music more.

In the photo movie generation system of the first invention, the server comprises storage means that stores movie composition information, which is the information constituting a photo movie, in association with the music ID. The photo movie generation means refers to the movie composition information stored in association with the designated music ID and generates a photo movie whose composition differs from the movie composition information stored in the storage means. Because a different movie rendition can thus be generated for the same music and provided to the user terminal, the user is offered varied enjoyment.
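
The behavior of storing compositions per music ID and then generating a different one can be pictured with a minimal sketch. It assumes, purely for illustration, that movie composition information is a tuple of effect names with one entry per music section; `MovieStore`, `pick_new_composition`, and the effect names are hypothetical and are not taken from the specification, which does not fix a data layout or selection strategy at this point.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical effect names; the actual effect templates are not specified here.
EFFECTS = ("fade", "zoom", "slide", "rotate")


class MovieStore:
    """Stores movie composition information keyed by music ID (assumed layout)."""

    def __init__(self):
        self._by_music_id = defaultdict(set)

    def stored(self, music_id):
        # Snapshot of every composition already generated for this music ID.
        return frozenset(self._by_music_id[music_id])

    def save(self, music_id, composition):
        self._by_music_id[music_id].add(tuple(composition))


def pick_new_composition(store, music_id, n_sections, rng=None):
    """Return a composition not yet stored for music_id, and record it.

    Raises ValueError once every possible composition has been used.
    """
    rng = rng or random.Random()
    used = store.stored(music_id)
    candidates = [c for c in itertools.product(EFFECTS, repeat=n_sections)
                  if c not in used]
    if not candidates:
        raise ValueError("all compositions already used for this music ID")
    choice = rng.choice(candidates)
    store.save(music_id, choice)
    return choice
```

Enumerating every candidate is only practical for small numbers of effects and sections; it is used here because it makes the "differs from what is stored" guarantee easy to see.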

Preferably, the storage means of the server also stores the acoustic feature data of the music in association with the music ID, and the acoustic feature data acquisition means acquires the acoustic feature data stored in the storage means. Acoustic analysis can then be skipped for music that has already been analyzed, so the photo movie can be provided in a short time. Alternatively, the acoustic feature data acquisition means of the server may acquire the acoustic feature data of the music designated on the user terminal from a music distribution server communicably connected via the network. The server can then obtain the acoustic feature data without performing acoustic analysis itself, which reduces the load on the server.
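
The idea of skipping analysis for music that has already been analyzed amounts to a cache keyed by music ID. The sketch below is one possible shape under stated assumptions: `FeatureCache` and `toy_analyze` are hypothetical names, acoustic feature data is modelled as a plain list of floats, and the placeholder analysis is not a real audio feature extractor.

```python
from typing import Callable, Dict, List

# Acoustic feature data is modelled as a list of floats for illustration only.
Features = List[float]


class FeatureCache:
    """Returns stored acoustic feature data when available; analyses only on a miss."""

    def __init__(self, analyze: Callable[[bytes], Features]):
        self._analyze = analyze          # stand-in for the server's acoustic analysis
        self._store: Dict[str, Features] = {}
        self.analysis_runs = 0           # counts how often analysis was actually needed

    def get(self, music_id: str, sound_source: bytes) -> Features:
        if music_id not in self._store:
            self._store[music_id] = self._analyze(sound_source)
            self.analysis_runs += 1
        return self._store[music_id]


def toy_analyze(sound_source: bytes) -> Features:
    """Placeholder analysis: mean byte value per 4-byte chunk (not a real audio feature)."""
    chunks = [sound_source[i:i + 4] for i in range(0, len(sound_source), 4)]
    return [sum(c) / len(c) for c in chunks if c]
```

A second request for the same music ID returns the stored data without calling the analysis function again, which is the time saving the text describes.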

The server may further comprise acoustic analysis means that performs acoustic analysis of the music to obtain acoustic feature data, and the acoustic feature data acquisition means may acquire the acoustic feature data obtained by the acoustic analysis means. In this case, the server may acquire the sound source data of the music from a music distribution server communicably connected via the network, or from the user terminal. This makes it possible to generate a photo movie that matches the music the user wants.

Preferably, the photo movie generation means of the server generates the photo movie without adding a sound source and outputs it in association with the music ID of the music, and the playback means of the user terminal acquires the sound source data of the music corresponding to the music ID associated with the photo movie and plays back the acquired sound source data in synchronization with the photo movie. This yields a photo movie with a smaller file size than a photo movie with embedded music, saving the storage capacity needed to store photo movies.
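
How a terminal might pair a silent photo movie with locally held audio can be sketched as follows. This is an illustration under assumptions, not the specification's playback method: the movie is modelled as a list of timed scenes, the audio position is treated as the master clock, and `SilentPhotoMovie`, `resolve_audio`, and `scene_at` are hypothetical names.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SilentPhotoMovie:
    music_id: str                       # links the movie to its music
    scenes: List[Tuple[float, str]]     # (start time in seconds, image name), sorted by time


def resolve_audio(movie: SilentPhotoMovie, library: Dict[str, bytes]) -> bytes:
    """Look up the terminal's sound source data by the movie's music ID."""
    return library[movie.music_id]


def scene_at(movie: SilentPhotoMovie, audio_pos: float) -> str:
    """Pick the scene for the current audio position, using the audio clock as master."""
    starts = [start for start, _ in movie.scenes]
    # Index of the last scene whose start time is <= audio_pos (clamped to the first scene).
    index = max(bisect_right(starts, audio_pos) - 1, 0)
    return movie.scenes[index][1]
```

Driving the picture from the audio position means that pausing or seeking the music keeps the visuals aligned without any audio track inside the movie file.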

Preferably, the server further comprises image material generation means that generates image materials from the images acquired by the image acquisition means, and the photo movie generation means generates the photo movie using the image materials generated by the image material generation means.
Preferably, the user terminal further comprises image transmission means that transmits the images to be used for the photo movie to the server, and the image acquisition means of the server receives the images transmitted from the user terminal. A photo movie can thus be generated using the images designated by the user and the image materials generated from those images.

A second invention is a photo movie generation device comprising: image acquisition means for acquiring images; acoustic feature data acquisition means for acquiring acoustic feature data of music; and photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data.

The second invention makes it possible to generate a photo movie whose display changes according to changes in the music, based on the acoustic feature data of the music.

A third invention is a photo movie generation device comprising: image acquisition means for acquiring images; sound source data acquisition means for acquiring sound source data of music; acoustic analysis means for analyzing the sound source data to acquire acoustic feature data; and photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data.

The third invention makes it possible to perform acoustic analysis on the acquired sound source data to obtain acoustic feature data, and to generate a photo movie whose display changes according to changes in the music based on that acoustic feature data.

A fourth invention is a user terminal communicably connectable to a server via a network, comprising: music designation means for accepting designation of a piece of music and transmitting to the server either the sound source data of the designated music or a music ID, which is identification information of the music; and playback means for receiving and playing back a photo movie, transmitted from the server, whose display changes according to changes in the music.

With the fourth invention, when a piece of music is designated on the user terminal and the sound source data of the music, or the music ID identifying it, is transmitted to the server, the user terminal can receive from the server a photo movie whose display changes according to changes in the music and play it back.

A fifth invention is a photo movie generation method in a photo movie generation system in which a server and a user terminal are communicably connected via a network, the method comprising: a step in which the user terminal accepts designation of a piece of music to be used for a photo movie and transmits to the server either the sound source data of the designated music or a music ID, which is identification information of the music; a step in which the server acquires images; a step in which the server acquires acoustic feature data of the music designated on the user terminal; a step in which the server generates, using the images, a photo movie whose display changes according to changes in the music based on the acoustic feature data, and transmits the photo movie to the user terminal; and a step in which the user terminal receives and plays back the photo movie transmitted from the server.

According to the fifth invention, the server acquires the acoustic feature data of the music designated on the user terminal and the images to be used for the photo movie, generates from the acquired images a photo movie whose display changes according to changes in the music based on the acoustic feature data, and transmits it to the user terminal. The user terminal receives the photo movie and plays it back. Because a photo movie whose display changes with the music can thus be generated for the music the user designated and provided to the user terminal, the user can watch and listen to a favorite piece of music accompanied by a photo movie, and can therefore enjoy listening to music more.

A sixth invention is a program for causing a computer to function as: image acquisition means for acquiring images; acoustic feature data acquisition means for acquiring acoustic feature data of music; and photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data.

The sixth invention allows a computer to function as the photo movie generation device of the second invention.

A seventh invention is a program for causing a computer to function as: image acquisition means for acquiring images; sound source data acquisition means for acquiring sound source data of music; acoustic analysis means for analyzing the sound source data to acquire acoustic feature data; and photo movie generation means for generating, using the images acquired by the image acquisition means, a photo movie whose display changes according to changes in the music based on the acoustic feature data.

The seventh invention allows a computer to function as the photo movie generation device of the third invention.

An eighth invention is a program for causing a computer communicably connectable to a server via a network to function as: music designation means for accepting designation of a piece of music and transmitting to the server either the sound source data of the designated music or a music ID, which is identification information of the music; and playback means for receiving and playing back a photo movie, transmitted from the server, whose display changes according to changes in the music.

The eighth invention allows a computer to function as the user terminal of the first and fourth inventions.

The present invention makes it possible to provide a photo movie generation system and the like that enable a user's favorite music to be played back with a photo movie so that the user can enjoy watching and listening to the music.

FIG. 1: Diagram showing the overall configuration of the photo movie generation system 1.
FIG. 2: Diagram showing the hardware configuration of the server 2.
FIG. 3: Diagram showing the hardware configuration of the user terminal 3.
FIG. 4: Diagram showing the functional configuration of the photo movie generation system 1.
FIG. 5: Diagram showing an example data structure of the acoustic feature data 45.
FIG. 6: Flowchart showing the flow of processing executed by the photo movie generation system 1.
FIG. 7: Flowchart showing the flow of the acoustic analysis processing.
FIG. 8: Diagram showing an example of the effect template 221.
FIG. 9: Diagram showing an example of the effect correspondence table 223.
FIG. 10: Flowchart showing the flow of the photo movie generation processing.
FIG. 11: Diagram showing examples of photo movies 61, 62, and 63.
FIG. 12: Diagram showing an example of the movie composition information 224.
FIG. 13: Diagram showing the overall configuration of the photo movie generation system 1A.
FIG. 14: Diagram showing the hardware configuration of the music distribution server 4.
FIG. 15: Diagram showing the functional configuration of the photo movie generation system 1A.
FIG. 16: Diagram showing an example data structure of the sound source data 44.
FIG. 17: Flowchart showing the flow of processing executed by the photo movie generation system 1A.

Preferred embodiments of the present invention are described in detail below with reference to the drawings.

[First Embodiment]
FIG. 1 is a diagram showing the overall configuration of the photo movie generation system 1 according to the first embodiment of the present invention. As shown in FIG. 1, in the photo movie generation system 1 according to the present invention, a user terminal 3 and a server 2 are communicably connected via a network 5. The server 2 is a computer (photo movie generation device) that generates a photo movie in response to a request from the user terminal 3.

The user terminal 3 is an electronic device used by the user and carries a dedicated application program for using the photo movie generation system 1 (hereinafter referred to as the photo movie app). Alternatively, the user terminal 3 may carry a browser capable of viewing a website that the server 2 has set up on the network 5, and may order and receive photo movies by exchanging data with the server 2 through that website. The user terminal 3 is, for example, a smartphone, tablet, PC, music player, or game console.

FIG. 2 is a diagram showing the configuration of the server 2. As shown in the figure, the server 2 can be realized by a computer in which, for example, a control unit 201, a storage unit 202, a communication unit 203, and the like are connected by a bus 204 or the like. The configuration is not limited to this, however, and various other configurations may be adopted as appropriate.

The control unit 201 is composed of a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. The CPU loads programs stored in the storage unit 202, the ROM, a recording medium, or the like into a work memory area on the RAM and executes them, thereby driving and controlling each unit connected via the bus 204. The ROM permanently holds the computer's boot program, programs such as the BIOS, data, and the like. The RAM temporarily holds loaded programs and data and provides a work area that the control unit 201 uses to perform various kinds of processing. By reading and executing these programs, the control unit 201 functions as each means of the server 2.

The storage unit 202 is a storage device such as a hard disk drive, a solid state drive, or a flash memory. The storage unit 202 stores the programs executed by the control unit 201, the data needed to run them, the operating system, and the like. The program code is read by the control unit 201 as needed, transferred to the RAM, and read and executed by the CPU. As shown in FIG. 4, the storage unit 202 of the server 2 also stores the data needed to generate photo movies, such as the effect template 221, the general image material 222, the effect correspondence table 223, and the movie composition information 224. These data are described in detail later.

The communication unit 203 has a communication control device, communication ports, and the like, and controls communication with the network 5 and other networks. The network 5 includes a LAN (Local Area Network), a WAN (Wide Area Network) providing connectivity over a wider area, public communication lines such as the Internet, base stations, and the like. The network 5 may be wired or wireless. The server 2 is communicably connected to the user terminal 3 via the network 5 and can send and receive various kinds of data.
The bus 204 is a path that mediates the exchange of control signals, data signals, and the like between the devices.

FIG. 3 is a diagram showing the configuration of the user terminal 3. As shown in the figure, the user terminal 3 can be realized by a computer or the like in which, for example, a control unit 301, a storage unit 302, a communication unit 303, a display unit 304, an input unit 305, a peripheral device I/F unit 306, an audio output unit 307, and the like are connected by a bus 308 or the like. The configuration is not limited to this, however, and various other configurations may be adopted as appropriate. The control unit 301, the storage unit 302, and the communication unit 303 are configured in the same way as the control unit 201, the storage unit 202, and the communication unit 203 of the server 2.

The display unit 304 is composed of a display device such as a liquid crystal panel or CRT monitor and a logic circuit (such as a video adapter) that performs display processing in cooperation with the display device, and it displays on the display device the display information input under the control of the control unit 301. The input unit 305 and the display unit 304 may be combined as a touch panel display in which an input device such as a touch panel is integrated with the display screen.

The input unit 305 is an input device such as a touch panel, a keyboard, a pointing device such as a mouse, or a numeric keypad, and outputs the input data to the control unit 301.

The peripheral device I/F (interface) unit 306 is a port for connecting peripheral devices; data is exchanged with peripheral devices through it. The peripheral device I/F unit 306 is implemented with USB or the like and usually provides a plurality of peripheral device interfaces. The connection to peripheral devices may be wired or wireless.

The audio output unit 307 outputs the audio data input from the control unit 301 through a speaker.

The storage unit 302 of the user terminal 3 stores an application program (the photo movie app) for carrying out the processing described later (FIG. 6), and the control unit 301 of the user terminal 3 executes that processing in accordance with this application program.

Next, the functional configuration of the photo movie generation system 1 is described with reference to FIG. 4. As shown in FIG. 4, in the photo movie generation system 1, the server 2 includes, in addition to the storage unit 202 described above, a sound source data acquisition unit 21, an acoustic analysis unit 22, an image acquisition unit 23, an image material generation unit 24, a photo movie generation unit 25, and the like. The user terminal 3 includes, in addition to the storage unit 302, display unit 304, and audio output unit 307 described above, a music designation unit 31, an image designation unit 32, a transmission unit 33, a reception unit 34, and the like.

ユーザ端末３の楽曲指定部３１は、ユーザから楽曲の指定を受け付ける。第１の実施形態において楽曲指定部３１は、ユーザ端末３の記憶部３０２に記憶されている音源データ、或いはＣＤ等から読み取った音源データの中からユーザの操作によって楽曲を指定する。 The music designation unit 31 of the user terminal 3 receives a music designation from the user. In the first embodiment, the music designation unit 31 designates a music by the user's operation from the sound source data stored in the storage unit 302 of the user terminal 3 or the sound source data read from a CD or the like.

画像指定部３２は、ユーザからフォトムービーの画像素材として使用する画像の指定を受け付ける。指定する画像は、ユーザ端末３の記憶部３０２に記憶されている画像でもよいし、サーバ２の記憶部３０２に記憶されている一般画像素材２２２から指定してもよい。サーバ２の記憶部３０２に記憶されている一般画像素材２２２を指定する場合、例えば、サーバ２はユーザ端末３に対して一般画像素材２２２を選択するための画像選択画面を送信し、ユーザによる選択を受け付ける。 The image designation unit 32 receives from the user the designation of an image to be used as the image material of the photo movie. The designated image may be an image stored in the storage unit 302 of the user terminal 3, or may be designated from the general image material 222 stored in the storage unit 302 of the server 2. When the general image material 222 stored in the storage unit 302 of the server 2 is designated, for example, the server 2 transmits an image selection screen for selecting the general image material 222 to the user terminal 3 and accepts the user's selection.

送信部３３は、楽曲指定部３１により指定された楽曲の音源データや画像指定部３２により指定された画像データを記憶部３０２から取得してサーバ２に送信する。指定画像がサーバ２の一般画像素材２２２である場合は、画像の指定情報（画像の識別情報等）を送信する。 The transmission unit 33 acquires the sound source data of the music designated by the music designation unit 31 and the image data designated by the image designation unit 32 from the storage unit 302 and transmits them to the server 2. When the designated image is the general image material 222 of the server 2, the designation information of the image (image identification information, etc.) is transmitted instead.

記憶部３０２には、楽曲の音源データが楽曲の識別情報である楽曲ＩＤと紐づけて記憶される。楽曲の音源データはＣＤから読み込んだ音源データ、或いは音楽配信サービスのサーバからダウンロードした音源データ等である。また、記憶部３０２には、画像データが記憶される。画像データは、ユーザがカメラで撮影した画像やスキャナで読み取った画像、またはネットワーク５を介してダウンロードした画像等である。 In the storage unit 302, the sound source data of the music is stored in association with the music ID which is the identification information of the music. The sound source data of the music is the sound source data read from the CD, the sound source data downloaded from the server of the music distribution service, or the like. In addition, image data is stored in the storage unit 302. The image data is an image taken by a user with a camera, an image read by a scanner, an image downloaded via a network 5, or the like.

楽曲ＩＤは、例えば、アルバム名＋アーティスト名＋曲名＋再生時間等の複合情報を使用する。音楽ＣＤデータベース（ＣＤＤＢ；Compact Disc DataBase）では、アルバム名、アーティスト名、及び曲名がＤｉｓｃＩＤと紐づけて管理されている。ユーザ端末３に搭載されている音楽アプリは、音楽ＣＤのＴＯＣ（Table of Contents）という領域に記録されている情報からＤｉｓｃＩＤを算出し、ネットワーク５を介して音楽ＣＤデータベースにアクセスして、ＤｉｓｃＩＤに対応するアルバム名、アーティスト名、及び曲名を得ることができる。本実施形態では、このようにして得たアルバム名、アーティスト名、及び曲名と、楽曲の再生時間等とを複合した情報を楽曲ＩＤとして使用するものとする。 For the music ID, for example, composite information such as album name + artist name + song name + playback time is used. In the music CD database (CDDB; Compact Disc DataBase), the album name, artist name, and song name are managed in association with a Disc ID. The music application installed on the user terminal 3 calculates the Disc ID from the information recorded in the area of the music CD called the TOC (Table of Contents), accesses the music CD database via the network 5, and can obtain the album name, artist name, and song name corresponding to the Disc ID. In the present embodiment, information combining the album name, artist name, and song name obtained in this way with the playback time of the song and the like is used as the music ID.
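
The composite music ID described above can be sketched as follows. This is a minimal illustration, not the format used in the patent: the `TrackInfo` type, the `make_music_id` function, the `|` separator, and the whitespace and case normalization are all assumptions introduced here so that the same track yields the same key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackInfo:
    """Metadata obtained from a CD database lookup (hypothetical type)."""
    album: str
    artist: str
    title: str
    duration_sec: int


def make_music_id(info: TrackInfo) -> str:
    """Combine album, artist, title and playback time into one key.

    Collapsing whitespace and case-folding are assumptions made so that
    small metadata differences do not produce different IDs.
    """
    parts = [info.album, info.artist, info.title]
    normalised = [" ".join(p.split()).casefold() for p in parts]
    return "|".join(normalised + [str(info.duration_sec)])


example_id = make_music_id(TrackInfo(" My  Album", "Artist", "Song", 245))
```

Including the playback time helps distinguish different recordings that share the same album, artist, and title strings.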

サーバ２の音源データ取得部２１は、ユーザ端末３から送信された音源データ及び楽曲ＩＤを取得する。 The sound source data acquisition unit 21 of the server 2 acquires the sound source data and the music ID transmitted from the user terminal 3.

音響解析部２２は、音源データ取得部２１が取得した音源データについて音響解析処理を行い、音響特徴量データ４５を取得する。 The acoustic analysis unit 22 performs an acoustic analysis process on the sound source data acquired by the sound source data acquisition unit 21, and acquires the acoustic feature amount data 45.

ここで、音響解析部２２による音響解析処理及び音響特徴量データ４５について説明する。図５は音響特徴量データ４５の一例を示す図である。音響解析処理では、楽曲構成（Ａメロ、Ｂメロ、サビ等）、楽曲のテンポ（ＢＰＭ；Beats Per Minute）、ビート（拍子）、特定の楽器の発音タイミング等の特徴量が解析される。また、これらに加え、和音進行（音楽の持つ曲調の流れ）、音量変化（聴感上の音量変化）、周波数重心（音楽の盛り上がり度合い）、音程パワー分離（音楽の持つ複雑さ）等の特徴量について解析してもよい。 Here, the acoustic analysis processing by the acoustic analysis unit 22 and the acoustic feature amount data 45 will be described. FIG. 5 is a diagram showing an example of acoustic feature amount data 45. In the acoustic analysis process, feature quantities such as music composition (A melody, B melody, chorus, etc.), music tempo (BPM; Beats Per Minute), beat (beat), and sounding timing of a specific musical instrument are analyzed. In addition to these, features such as chord progression (flow of music tone), volume change (audible volume change), frequency center of gravity (degree of excitement of music), pitch power separation (complexity of music), etc. May be analyzed.
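
As a rough sketch of how the acoustic feature amount data 45 might be held in memory, the following uses hypothetical field names (`bpm`, `beats_per_bar`, `sections`, `onsets`); the actual layout of FIG. 5 is not reproduced here. The section times and bass drum timings are borrowed from the FIG. 12 example for illustration, and the tempo value is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    label: str    # e.g. "intro", "A melody", "B melody", "chorus"
    start: float  # seconds from the beginning of the music
    end: float


@dataclass
class AcousticFeatures:
    bpm: float                 # tempo (Beats Per Minute)
    beats_per_bar: int         # beat (meter)
    sections: list[Section]    # music composition
    onsets: dict[str, list[float]] = field(default_factory=dict)  # per-instrument timings

    def boundaries(self) -> list[float]:
        """Times at which the music composition changes."""
        return [s.start for s in self.sections[1:]]


features = AcousticFeatures(
    bpm=120.0,  # illustrative value only
    beats_per_bar=4,
    sections=[
        Section("intro", 0, 15),
        Section("A melody", 15, 50),
        Section("B melody", 50, 70),
        Section("chorus", 70, 120),
    ],
    onsets={"bass drum": [70, 110, 111, 112, 113]},
)
```

The `boundaries()` helper returns the times at which a screen change would be triggered when the movie follows the music composition.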

音響解析処理は、公知の手法（例えば、株式会社ＣＲＩ・ミドルウェア、“超高速・高精度楽曲解析ミドルウェアBEATWIZ（登録商標）”、[online]、[令和２年３月１５日検索]、インターネット、< URL ：https://www.cri-mw.co.jp/product/amusement/beatwiz/index.html>、または、堀内直明、他５名共著、"Song Surfing：類似フレーズで音楽ライブラリを散策する音楽再生システム"、PIONEER R&D（Vol.17,No.2/2007）等）を利用して行うことができる。なお、図５に示す音響特徴量データ４５は、一例であり、これに限定されない。音響解析部２２は、図５に示す音響特徴量データ４５の各項目以外の特徴量についても解析してもよい。 The acoustic analysis process can be performed using known methods (for example, CRI Middleware Co., Ltd., "Ultra-high-speed, high-precision music analysis middleware BEATWIZ (registered trademark)", [online], [retrieved March 15, 2020], Internet, <URL: https://www.cri-mw.co.jp/product/amusement/beatwiz/index.html>, or Naoaki Horiuchi and five co-authors, "Song Surfing: A music playback system for strolling through a music library by similar phrases", PIONEER R&D (Vol.17, No.2/2007), etc.). The acoustic feature amount data 45 shown in FIG. 5 is an example, and the present invention is not limited to this. The acoustic analysis unit 22 may also analyze feature amounts other than the items of the acoustic feature amount data 45 shown in FIG. 5.

音響解析部２２は、音響解析処理の結果、取得した音響特徴量データ４５をフォトムービー生成部２５に出力するとともに、記憶部２０２に楽曲ＩＤと紐づけて記憶する。なお、音響解析部２２は、一度音響解析処理を実施した楽曲については、記憶部２０２に記憶されている音響特徴量データ４５を取得するのみとする。これにより、音響解析処理に要するサーバ２の負荷を削減でき、かつ処理時間を低減できる。 The acoustic analysis unit 22 outputs the acoustic feature amount data 45 acquired as a result of the acoustic analysis process to the photo movie generation unit 25, and stores it in the storage unit 202 in association with the music ID. The acoustic analysis unit 22 only acquires the acoustic feature amount data 45 stored in the storage unit 202 for the music that has been subjected to the acoustic analysis processing once. As a result, the load on the server 2 required for the acoustic analysis processing can be reduced, and the processing time can be reduced.
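
The reuse of stored analysis results can be illustrated with a small cache keyed by music ID. This is a sketch under assumptions: `AnalysisCache` and `fake_analyze` are hypothetical names, an in-memory dictionary stands in for the storage unit 202, and the analysis function is a stub rather than a real analyzer.

```python
class AnalysisCache:
    """Runs the analysis once per music ID and reuses the stored result."""

    def __init__(self, analyze):
        self._analyze = analyze  # callable: sound source data -> feature data
        self._store = {}         # stands in for the storage unit 202

    def get(self, music_id, source_data):
        # Analyze only when no result is stored for this music ID yet.
        if music_id not in self._store:
            self._store[music_id] = self._analyze(source_data)
        return self._store[music_id]


calls = []


def fake_analyze(data):
    """Stub analyzer that records how often it is invoked."""
    calls.append(data)
    return {"bpm": 120}


cache = AnalysisCache(fake_analyze)
first = cache.get("album|artist|song|245", b"pcm-bytes")
second = cache.get("album|artist|song|245", b"pcm-bytes")
```

The second request returns the stored result without invoking the analyzer again, which is the load reduction the paragraph describes.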

画像取得部２３は、ユーザ端末３から送信された画像データ、または一般画像素材２２２の指定情報を取得する。一般画像素材２２２の指定情報を取得した場合は、指定情報に従って記憶部２０２から一般画像素材２２２を取得する。 The image acquisition unit 23 acquires the image data transmitted from the user terminal 3 or the designated information of the general image material 222. When the designated information of the general image material 222 is acquired, the general image material 222 is acquired from the storage unit 202 according to the designated information.

画像素材生成部２４は、画像取得部２３により取得した画像データから画像素材を生成する。画像素材生成部２４は、例えば、写真等の画像データから人物、動物、物品等の対象物を検出し、検出した対象物を切り出してそれぞれ個々の画像素材とする。画像素材生成部２４は、生成した画像素材をフォトムービー生成部２５に出力する。 The image material generation unit 24 generates an image material from the image data acquired by the image acquisition unit 23. The image material generation unit 24 detects an object such as a person, an animal, or an article from image data such as a photograph, cuts out the detected object, and uses each as an individual image material. The image material generation unit 24 outputs the generated image material to the photo movie generation unit 25.

フォトムービー生成部２５は、画像取得部２３により取得した画像または画像素材生成部２４により生成した画像素材を使用し、音響解析部２２から取得した音響特徴量データ４５に基づき楽曲の変化に応じて表示が変化するフォトムービーを生成する。フォトムービー生成部２５は、記憶部２０２に記憶されているエフェクト対応テーブル２２３及びエフェクトテンプレート２２１を参照して楽曲の変化に応じて表示が変化するフォトムービーを生成する。楽曲の変化は、音響特徴量データ４５の「楽曲構成」や「特定の楽器の発音タイミング」等から検出される。フォトムービー生成処理の詳細については、後述する。 The photo movie generation unit 25 uses the image acquired by the image acquisition unit 23 or the image material generated by the image material generation unit 24, and responds to changes in the music based on the acoustic feature amount data 45 acquired from the acoustic analysis unit 22. Generate a photo movie whose display changes. The photo movie generation unit 25 refers to the effect correspondence table 223 and the effect template 221 stored in the storage unit 202 to generate a photo movie whose display changes according to a change in the music. Changes in the music are detected from the "music composition" of the acoustic feature data 45, the "pronunciation timing of a specific musical instrument", and the like. The details of the photo movie generation process will be described later.

なお、フォトムービー生成部２５が生成するフォトムービーは、音楽付きフォトムービーまたは音楽無しフォトムービーである。音楽付きフォトムービーはユーザ端末３から指定された楽曲の音源データをＢＧＭとして付加したフォトムービーである。すなわち、映像データと音声データとが含まれるフォトムービーである。音楽無しフォトムービーは、楽曲の音源（ＢＧＭ）を付加せずに映像データのみのフォトムービーである。音楽無しフォトムービーは再生時に楽曲の音源データと同期して再生される。 The photo movie generated by the photo movie generation unit 25 is a photo movie with music or a photo movie without music. The photo movie with music is a photo movie in which the sound source data of the music designated from the user terminal 3 is added as BGM. That is, it is a photo movie including video data and audio data. A photo movie without music is a photo movie containing only video data without adding a sound source (BGM) of music. The photo movie without music is played in synchronization with the sound source data of the music at the time of playback.

フォトムービー生成部２５は、フォトムービーを生成すると、データ（映像データ）をmpeg4方式等で圧縮符号化し、ユーザ端末３において再生可能なデータ形式に変換する。そして、フォトムービーの構成を表す情報であるムービー構成情報２２４を楽曲ＩＤと紐づけて記憶部２０２に記憶するとともに、フォトムービーのデータを楽曲ＩＤと紐づけてユーザ端末３に送信する。 When the photo movie is generated, the photo movie generation unit 25 compresses and encodes the data (video data) by the mpeg4 method or the like, and converts the data (video data) into a data format that can be reproduced by the user terminal 3. Then, the movie configuration information 224, which is information representing the configuration of the photo movie, is associated with the music ID and stored in the storage unit 202, and the photo movie data is associated with the music ID and transmitted to the user terminal 3.

ユーザ端末３の受信部３４は、サーバ２から送信されたフォトムービーを受信し、再生処理部３５へ送る。 The receiving unit 34 of the user terminal 3 receives the photo movie transmitted from the server 2 and sends it to the playback processing unit 35.

再生処理部３５は、フォトムービーを再生する。音楽無しフォトムービーを受信した場合は、再生処理部３５は、フォトムービーに紐づけられた楽曲ＩＤに対応する楽曲の音源データを記憶部３０２から取得し、取得した音源データとフォトムービー（音楽無しフォトムービー）とを同期して再生する。再生処理部３５は、フォトムービーを復号して符号化前の映像データとし、表示部３０４に表示するとともに、音源データを同期再生して音声出力部３０７から出力する。 The reproduction processing unit 35 reproduces a photo movie. When the photo movie without music is received, the playback processing unit 35 acquires the sound source data of the music corresponding to the music ID associated with the photo movie from the storage unit 302, and the acquired sound source data and the photo movie (no music). Play in sync with the photo movie). The reproduction processing unit 35 decodes the photo movie into video data before encoding, displays it on the display unit 304, and synchronously reproduces the sound source data and outputs it from the audio output unit 307.

音楽付きフォトムービーを受信した場合は、再生処理部３５は、受信したフォトムービーのデータを復号して符号化前の映像データ及び音声データとし、映像を表示部３０４に表示するとともに、音声（音源）を音声出力部３０７から出力する。なお、フォトムービーの再生方法は、ストリーミング再生でもダウンロード再生でもよい。 When the photo movie with music is received, the playback processing unit 35 decodes the received photo movie data into video data and audio data before encoding, displays the video on the display unit 304, and audio (sound source). ) Is output from the audio output unit 307. The method of playing the photo movie may be streaming playback or download playback.

次に、図６を参照して、フォトムービー生成システム１が実行する処理の流れを説明する。以下の説明では、ユーザ端末３の記憶部３０２には、フォトムービー生成システム１を利用するためのアプリであるフォトムービーアプリがインストールされ、ユーザ端末３の制御部３０１がこのフォトムービーアプリを読み出して実行する手順について説明する。 Next, the flow of processing executed by the photo movie generation system 1 will be described with reference to FIG. 6. The following description assumes that a photo movie application, which is an application for using the photo movie generation system 1, is installed in the storage unit 302 of the user terminal 3, and describes the procedure in which the control unit 301 of the user terminal 3 reads out and executes this photo movie application.

ユーザ端末３において、フォトムービーアプリを起動すると、ユーザ端末３の制御部３０１は、フォトムービーに使用する楽曲の指定（ステップＳ１０１）、及び画像の指定（ステップＳ１０２）を受け付ける。ステップＳ１０１〜ステップＳ１０２において、制御部３０１は、楽曲や画像を選択入力するための入力画面を表示してもよい。 When the photo movie application is started on the user terminal 3, the control unit 301 of the user terminal 3 accepts the designation of the music to be used for the photo movie (step S101) and the designation of the image (step S102). In steps S101 to S102, the control unit 301 may display an input screen for selecting and inputting music or images.

第１の実施形態では、指定できる楽曲は、ユーザ端末３の記憶部３０２に記憶されている楽曲（音源データ）またはＣＤドライブにより読み取り可能な楽曲の音源データとする。また、指定できる画像は、ユーザ端末３の記憶部３０２に記憶されている画像、及びサーバ２に記憶されている一般画像素材２２２とする。 In the first embodiment, the music that can be specified is the music (sound source data) stored in the storage unit 302 of the user terminal 3 or the sound source data of the music that can be read by the CD drive. Further, the images that can be specified are the image stored in the storage unit 302 of the user terminal 3 and the general image material 222 stored in the server 2.

楽曲及び画像が指定されると、制御部３０１は、指定された音源データ及び画像データを記憶部３０２から取得し（ステップＳ１０３）、サーバ２に送信する（ステップＳ１０４）。なお、一般画像素材２２２が指定された場合は、制御部３０１は、指定された一般画像素材２２２の識別情報をサーバ２に送信する。 When the music and the image are designated, the control unit 301 acquires the designated sound source data and the image data from the storage unit 302 (step S103) and transmits the designated sound source data and the image data to the server 2 (step S104). When the general image material 222 is specified, the control unit 301 transmits the identification information of the designated general image material 222 to the server 2.

サーバ２は、音源データ及び画像データ（一般画像素材２２２が指定された場合は、一般画像素材２２２の識別情報）を受信すると（ステップＳ１０５）、制御部２０１（音響解析部２２）は、受信した音源データについて音響解析処理を実行し、楽曲の音響特徴量データ４５を取得する（ステップＳ１０６）。 When the server 2 receives the sound source data and the image data (or, when the general image material 222 is designated, the identification information of the general image material 222) (step S105), the control unit 201 (acoustic analysis unit 22) executes the acoustic analysis process on the received sound source data and acquires the acoustic feature amount data 45 of the music (step S106).

図７は、音響解析処理の流れを示すフローチャートである。図７に示すように、制御部２０１（音響解析部２２）は楽曲の音源データを取得し（ステップＳ２０１）、音響解析処理を実行する（ステップＳ２０２）。音響解析処理では、楽曲のテンポ（ＢＰＭ；Beats Per Minute）、ビート（拍子）、楽曲構成（Ａメロ、Ｂメロ、サビ等）、特定の楽器の発音タイミング等が解析される。 FIG. 7 is a flowchart showing the flow of the acoustic analysis process. As shown in FIG. 7, the control unit 201 (acoustic analysis unit 22) acquires the sound source data of the music (step S201) and executes the acoustic analysis process (step S202). In the acoustic analysis process, the tempo (BPM; Beats Per Minute), beat (beat), music composition (A melody, B melody, chorus, etc.) of a musical piece, the sounding timing of a specific musical instrument, and the like are analyzed.

制御部２０１は、音響解析結果を音響特徴量データ４５として、楽曲ＩＤと紐づけて記憶部２０２に記憶する（ステップＳ２０３）。音響特徴量データ４５を楽曲ＩＤと紐づけてサーバ２に記憶しておくことにより、一度解析した楽曲について、２回目以降は解析が不要となる。 The control unit 201 stores the acoustic analysis result as the acoustic feature amount data 45 in the storage unit 202 in association with the music ID (step S203). By associating the acoustic feature amount data 45 with the music ID and storing it in the server 2, it is not necessary to analyze the music once analyzed from the second time onward.

また、サーバ２の制御部２０１（画像素材生成部２４）は、ステップＳ１０５で受信した画像データについて必要に応じて画像素材生成処理を実行し、画像データから画像素材を切り出す（ステップＳ１０７）。なお、受信した画像データをそのまま画像素材として使用してもよい。 Further, the control unit 201 (image material generation unit 24) of the server 2 executes an image material generation process as necessary for the image data received in step S105, and cuts out the image material from the image data (step S107). The received image data may be used as it is as an image material.

サーバ２の制御部２０１（フォトムービー生成部２５）は、画像素材と所定のエフェクトテンプレート２２１を使用し、音響特徴量データ４５（例えば、「楽曲構成」の変化）に応じて画面が変化するフォトムービーを生成する（ステップＳ１０８）。 The control unit 201 (photo movie generation unit 25) of the server 2 uses the image material and the predetermined effect template 221 and changes the screen according to the acoustic feature amount data 45 (for example, a change in the “music composition”). Generate a movie (step S108).

ここで、エフェクトテンプレート２２１及びエフェクト対応テーブル２２３について説明する。 Here, the effect template 221 and the effect correspondence table 223 will be described.
エフェクトテンプレート２２１は、画像に所定の加工（エフェクト）を加えるためのプログラムであり、図８に示すように、複数のエフェクトテンプレート２２１がエフェクトＮｏ．及び使用画像枚数と対応付けて記憶されている。 The effect template 221 is a program for applying predetermined processing (an effect) to an image, and as shown in FIG. 8, a plurality of effect templates 221 are stored in association with an effect No. and the number of images used.

エフェクトテンプレート２２１は、例えば、複数の表示画像を順に切り替える「画像切替」、画像をモノクロからカラーに変化させる「モノクロ→カラー」、画像の透過率を変化させる「透過率変化」、画像の表示位置を画面内で移動させる「画面移動」、画像を回転させる「回転」、画像を拡大させる「拡大」、画像を縮小させる「縮小」、表示画面内における画像の表示割合を変化させる「表示割合変化」、画像を出現させる「出現」、画像を拡大させながら移動させる「拡大移動」、画像を拡大させながら回転させる「拡大回転」、画像を縮小させながら移動させる「縮小移動」、画像を縮小させながら回転させる「縮小回転」、小画像を徐々に増加させていく「小画像増加」、複数の画像を一画面内に表示する「複数画像一画面表示」、一画面内に複数表示させた画像をスライドさせる「複数画像一画面表示＋スライド」等がある。 The effect templates 221 include, for example, "image switching", which switches a plurality of display images in order; "monochrome → color", which changes an image from monochrome to color; "transmittance change", which changes the transmittance of an image; "screen movement", which moves the display position of an image within the screen; "rotation", which rotates an image; "enlargement", which enlarges an image; "reduction", which reduces an image; "display ratio change", which changes the proportion of the display screen occupied by an image; "appearance", which makes an image appear; "enlarging movement", which moves an image while enlarging it; "enlarging rotation", which rotates an image while enlarging it; "reducing movement", which moves an image while reducing it; "reducing rotation", which rotates an image while reducing it; "small image increase", which gradually increases the number of small images; "multiple-image single-screen display", which displays a plurality of images on one screen; and "multiple-image single-screen display + slide", which slides the images displayed together on one screen.

なお、図８に示すエフェクトテンプレート２２１は一例であり、他のエフェクトを含むものとしてもよい。また、各エフェクトについて移動量や変化の速さ等のパラメータを複数パターン用意してもよい。例えば、エフェクト「拡大」について、拡大率によって「拡大１」、「拡大２」、「拡大３」、…を用意したり、エフェクト「透過率変化」について、変化の速さによって「透過率変化１」、「透過率変化２」…を用意したりする。 The effect templates 221 shown in FIG. 8 are examples, and other effects may be included. Further, a plurality of parameter patterns, such as the amount of movement and the speed of change, may be prepared for each effect. For example, for the effect "enlargement", "enlargement 1", "enlargement 2", "enlargement 3", ... may be prepared according to the enlargement ratio, and for the effect "transmittance change", "transmittance change 1", "transmittance change 2", ... may be prepared according to the speed of change.

制御部２０１（フォトムービー生成部２５）は、フォトムービーを生成するにあたり、どのエフェクトを使用するかを、エフェクト対応テーブル２２３を参照して決定する。図９はエフェクト対応テーブル２２３の例を示す図である。エフェクト対応テーブル２２３は、楽曲構成要素（イントロ、Ａメロ、Ｂメロ、Ｃメロ、ソロ、サビ等）とそれに適したエフェクトテンプレート２２１とを対応づけたテーブルであり、図９に示すように、複数のパターンが定義されている。 When generating a photo movie, the control unit 201 (photo movie generation unit 25) determines which effects to use by referring to the effect correspondence table 223. FIG. 9 is a diagram showing an example of the effect correspondence table 223. The effect correspondence table 223 is a table that associates music components (intro, A melody, B melody, C melody, solo, chorus, etc.) with the effect templates 221 suited to them, and as shown in FIG. 9, a plurality of patterns are defined.

図９の例では、パターン（１）の場合、制御部２０１（フォトムービー生成部２５）は、楽曲の「イントロ」でエフェクト「モノクロ→カラー」を適用し、楽曲の「Ａメロ」でエフェクト「移動３」を適用し、楽曲の「Ｂメロ」でエフェクト「小画像増加」を適用し、楽曲の「Ｃメロ」で「複数画像一画面表示１」を適用し、楽曲の「ソロ」でエフェクト「拡大１」を適用し、楽曲の「サビ」でエフェクト「画像切替１」を適用する。 In the example of FIG. 9, in the case of pattern (1), the control unit 201 (photo movie generation unit 25) applies the effect "monochrome → color" in the "intro" of the music, the effect "movement 3" in the "A melody", the effect "small image increase" in the "B melody", the effect "multiple-image single-screen display 1" in the "C melody", the effect "enlargement 1" in the "solo", and the effect "image switching 1" in the "chorus".

また、図９のパターン（２）の場合、制御部２０１（フォトムービー生成部２５）は、楽曲の「イントロ」でエフェクト「画像切替１」を適用し、楽曲の「Ａメロ」でエフェクト「縮小回転１」を適用し、楽曲の「Ｂメロ」でエフェクト「拡大回転１」を適用し、「Ｃメロ」でエフェクト「画面移動」を適用し、楽曲の「ソロ」でエフェクト「拡大２」を適用し、楽曲の「サビ」でエフェクト「画像切替２」を適用する。このように、複数のパターンのエフェクト適用例がエフェクト対応テーブル２２３に定義されている。 Further, in the case of pattern (2) of FIG. 9, the control unit 201 (photo movie generation unit 25) applies the effect "image switching 1" in the "intro" of the music, the effect "reducing rotation 1" in the "A melody", the effect "enlarging rotation 1" in the "B melody", the effect "screen movement" in the "C melody", the effect "enlargement 2" in the "solo", and the effect "image switching 2" in the "chorus". In this way, effect application examples for a plurality of patterns are defined in the effect correspondence table 223.
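
The two patterns described for FIG. 9 can be written down as a simple lookup table. The sketch below is only an illustration: the dictionary layout, the English labels, and the `effects_for` helper are assumptions, while the section-to-effect pairs are the ones given in the text.

```python
# Effect correspondence table: pattern number -> {music component: effect name}.
EFFECT_TABLE = {
    1: {
        "intro": "monochrome->color",
        "A melody": "movement 3",
        "B melody": "small image increase",
        "C melody": "multiple-image single-screen display 1",
        "solo": "enlargement 1",
        "chorus": "image switching 1",
    },
    2: {
        "intro": "image switching 1",
        "A melody": "reducing rotation 1",
        "B melody": "enlarging rotation 1",
        "C melody": "screen movement",
        "solo": "enlargement 2",
        "chorus": "image switching 2",
    },
}


def effects_for(pattern_no, section_labels):
    """Return the effect applied to each section, in playback order."""
    pattern = EFFECT_TABLE[pattern_no]
    return [pattern[label] for label in section_labels]


song_structure = ["intro", "A melody", "B melody", "chorus"]
pattern1_effects = effects_for(1, song_structure)
pattern2_effects = effects_for(2, song_structure)
```

Because both patterns cover the same set of music components, switching the pattern number changes the look of the movie without touching the analysis result.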

ステップＳ１０８において、制御部２０１（フォトムービー生成部２５）は、エフェクト対応テーブル２２３のどのパターンでフォトムービーを生成するかを、例えば図１０に示す処理により決定する。 In step S108, the control unit 201 (photo movie generation unit 25) determines which pattern of the effect correspondence table 223 is used to generate the photo movie, for example, by the process shown in FIG. 10.

図１０に示すように、サーバ２の制御部２０１（フォトムービー生成部２５）は、まず、楽曲ＩＤに対応するムービー構成情報２２４が記憶部２０２に記憶されているか否かを判定する（ステップＳ３０１）。ムービー構成情報２２４は、フォトムービーに使用する画像データや画像の切替タイミング、画像に適用するエフェクト等のフォトムービーを構成するための情報が格納されたデータである（図１２参照）。サーバ２で生成済みのフォトムービーについては、楽曲ＩＤと対応づけてムービー構成情報２２４が記憶部２０２に記憶されている。 As shown in FIG. 10, the control unit 201 (photo movie generation unit 25) of the server 2 first determines whether or not the movie configuration information 224 corresponding to the music ID is stored in the storage unit 202 (step S301). ). The movie configuration information 224 is data in which information for constructing a photo movie, such as image data used for the photo movie, image switching timing, and effects applied to the image, is stored (see FIG. 12). For the photo movie generated by the server 2, the movie configuration information 224 is stored in the storage unit 202 in association with the music ID.

ステップＳ３０１において、ユーザ端末３から送信された音源データの楽曲ＩＤに対応するムービー構成情報２２４が記憶部２０２に記憶されていない場合は（ステップＳ３０１；Ｎｏ）、その楽曲についてはフォトムービーを作成した履歴が無いため、サーバ２の制御部２０１は、エフェクト対応テーブル２２３から任意のパターンを読み出し（ステップＳ３０２）、読み出したパターンのエフェクト対応テーブル２２３に従ってフォトムービーを生成する（ステップＳ３０３）。 In step S301, if the movie configuration information 224 corresponding to the music ID of the sound source data transmitted from the user terminal 3 is not stored in the storage unit 202 (step S301; No), a photo movie is created for the music. Since there is no history, the control unit 201 of the server 2 reads an arbitrary pattern from the effect correspondence table 223 (step S302), and generates a photo movie according to the effect correspondence table 223 of the read pattern (step S303).

一方、ステップＳ３０１において、ユーザ端末３から送信された音源データの楽曲ＩＤに対応するムービー構成情報２２４が既に記憶部２０２に記憶されている場合は（ステップＳ３０１；Ｙｅｓ）、制御部２０１は、エフェクト対応テーブル２２３から未使用のパターンを読み出し（ステップＳ３０４）、読み出したパターンのエフェクト対応テーブル２２３に従ってフォトムービーを生成する（ステップＳ３０３）。このように、フォトムービーを作成したことがある楽曲については、２回目以降は別のパターンのフォトムービーが作成される。このため、同じ楽曲に対し異なるムービー表現を生成することができ、ユーザに多様な楽しみを提供できる。 On the other hand, in step S301, when the movie configuration information 224 corresponding to the music ID of the sound source data transmitted from the user terminal 3 is already stored in the storage unit 202 (step S301; Yes), the control unit 201 has an effect. An unused pattern is read from the corresponding table 223 (step S304), and a photo movie is generated according to the effect corresponding table 223 of the read pattern (step S303). In this way, for music that has created a photo movie, a photo movie with a different pattern is created from the second time onward. Therefore, different movie expressions can be generated for the same music, and various enjoyments can be provided to the user.
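
The branch between steps S302 and S304 can be sketched as below. The function name `choose_pattern`, the `history` mapping (music ID to the set of pattern numbers already used), and the use of a random choice for "arbitrary" are assumptions. The text does not say what happens once every pattern has been used for a piece of music, so the fallback in that case is also an assumption.

```python
import random


def choose_pattern(music_id, history, all_patterns, rng=random):
    """Pick a pattern number for this music, avoiding ones already used."""
    used = history.get(music_id, set())
    if not used:
        # Step S301: No -> step S302, read out an arbitrary pattern.
        return rng.choice(sorted(all_patterns))
    unused = sorted(set(all_patterns) - used)
    if not unused:
        # Assumption: when every pattern has been used, start over.
        return rng.choice(sorted(all_patterns))
    # Step S301: Yes -> step S304, read out an unused pattern.
    return rng.choice(unused)


history = {"album|artist|song|245": {1}}
second_time = choose_pattern("album|artist|song|245", history, [1, 2])
first_time = choose_pattern("another|artist|song|180", history, [1, 2])
```

In a fuller implementation the history would be derived from the stored movie configuration information 224 rather than kept as a separate mapping.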

図１１は、楽曲（音源データ４４）について生成される様々なフォトムービー６１、６２、６３の例を示す図である。図１１（ａ）のフォトムービー６１は、画像取得部２３により取得した画像７１、７２、７３、…をそのまま使用して生成したものである。楽曲（音源データ４４）の楽曲構成が変化するタイミング（ＡメロからＢメロに変わるタイミングｔ１、Ｂメロからサビに変わるタイミングｔ２等）で表示する画像が変更される。 FIG. 11 is a diagram showing examples of various photo movies 61, 62, and 63 generated for a piece of music (sound source data 44). The photo movie 61 of FIG. 11(a) is generated by using the images 71, 72, 73, ... acquired by the image acquisition unit 23 as they are. The displayed image is changed at the timings when the music composition of the music (sound source data 44) changes (timing t1 at which the A melody changes to the B melody, timing t2 at which the B melody changes to the chorus, etc.).

また、図１１（ｂ）のフォトムービー６２は、画像取得部２３により取得した画像７１、７２、７３、…にエフェクトテンプレート２２１を適用したものである。楽曲（音源データ４４）の楽曲構成が変化するタイミング（ＡメロからＢメロに変わるタイミングｔ１、Ｂメロからサビに変わるタイミングｔ２）で画像が変更されるとともに、各区間（ｔ０〜ｔ１、ｔ１〜ｔ２、ｔ２〜）で所定のエフェクトが画像に施される。 Further, the photo movie 62 of FIG. 11(b) is obtained by applying the effect templates 221 to the images 71, 72, 73, ... acquired by the image acquisition unit 23. The image is changed at the timings when the music composition of the music (sound source data 44) changes (timing t1 at which the A melody changes to the B melody, timing t2 at which the B melody changes to the chorus), and a predetermined effect is applied to the image in each section (t0 to t1, t1 to t2, and t2 onward).

図１１（ｂ）の例では、「Ａメロ」の区間ｔ０〜ｔ１において画像７１に対しエフェクト「モノクロ→カラー」が適用され、モノクロの画像７１ａからカラーの画像７１ｂへと徐々に変更される。次に、ＡメロからＢメロへ変わる時刻ｔ１で画像が切り替えるとともにエフェクトも「縮小回転１」に変更される。区間ｔ１〜ｔ２では、背景に画像７１、前景に画像７２を使用し、前景画像７２が画像７２ａの状態から画像７２ｂの状態へ縮小回転するアニメーションが生成される。次に、Ｂメロからサビへ変わる時刻ｔ２で画像が切り替えられるとともにエフェクトも「縮小回転２」に変更される。区間ｔ２〜は、背景に画像７１、前景に画像７３を使用し、前景画像７３が画像７３ａの状態から画像７３ｂの状態へ縮小回転するアニメーションが生成される。 In the example of FIG. 11(b), the effect "monochrome → color" is applied to the image 71 in the section t0 to t1 of the "A melody", and the monochrome image 71a gradually changes to the color image 71b. Next, at time t1, when the A melody changes to the B melody, the image is switched and the effect is changed to "reducing rotation 1". In the section t1 to t2, the image 71 is used as the background and the image 72 as the foreground, and an animation is generated in which the foreground image 72 shrinks and rotates from the state of image 72a to the state of image 72b. Next, at time t2, when the B melody changes to the chorus, the image is switched and the effect is changed to "reducing rotation 2". In the section from t2 onward, the image 71 is used as the background and the image 73 as the foreground, and an animation is generated in which the foreground image 73 shrinks and rotates from the state of image 73a to the state of image 73b.

また、図１１（ｃ）のフォトムービー６３は、画像取得部２３により取得した画像７４、７６、７９、…を背景画像とし、前景に画像素材７５、７７、８１、８２（画像素材生成部２４により生成した画像素材）を使用して生成したものである。楽曲（音源データ４４）の楽曲構成が変化するタイミング（ＡメロからＢメロに変わるタイミングｔ１、Ｂメロからサビに変わるタイミングｔ２）で背景画像が変更されるとともに、各区間（ｔ０〜ｔ１、ｔ１〜ｔ２、ｔ２〜）で前景画像に所定のエフェクトが施される。 Further, the photo movie 63 of FIG. 11(c) is generated by using the images 74, 76, 79, ... acquired by the image acquisition unit 23 as background images and the image materials 75, 77, 81, and 82 (image materials generated by the image material generation unit 24) in the foreground. The background image is changed at the timings when the music composition of the music (sound source data 44) changes (timing t1 at which the A melody changes to the B melody, timing t2 at which the B melody changes to the chorus), and a predetermined effect is applied to the foreground image in each section (t0 to t1, t1 to t2, and t2 onward).

図１１（ｃ）の例では、Ａメロの区間ｔ０〜ｔ１において画像７４が背景として表示され、前景の画像素材７５に対しエフェクト「画面移動」が適用されて画像７５ａ→７５ｂ→７５ｃに移動する。次に、ＡメロからＢメロへ変わる時刻ｔ１で背景画像が画像７６に切り替えられるとともに前景の画像素材７７に対し、エフェクト「交互表示」が適用されて、画像７７ａと画像７７ｂが交互に表示されるアニメーションが生成される。次に、Ｂメロからサビへ変わる時刻ｔ２で背景画像が画像７９に切り替えられるとともに前景の画像も画像素材８１、８２に変更される。区間ｔ２〜は、前景の画像素材８１、８２に対し、エフェクト「出現」が適用されて、バスドラムの発音タイミングに合わせて画像素材８１、８２が順に出現するように表示される。 In the example of FIG. 11(c), the image 74 is displayed as the background in the section t0 to t1 of the A melody, and the effect "screen movement" is applied to the foreground image material 75 so that it moves through the positions 75a → 75b → 75c. Next, at time t1, when the A melody changes to the B melody, the background image is switched to the image 76, the effect "alternate display" is applied to the foreground image material 77, and an animation is generated in which the images 77a and 77b are displayed alternately. Next, at time t2, when the B melody changes to the chorus, the background image is switched to the image 79 and the foreground is changed to the image materials 81 and 82. In the section from t2 onward, the effect "appearance" is applied to the foreground image materials 81 and 82, which are displayed so as to appear in order in time with the sounding timing of the bass drum.

制御部２０１（フォトムービー生成部２５）はフォトムービーを生成すると、生成したフォトムービーのムービー構成情報２２４を楽曲ＩＤと紐づけて記憶部２０２に記憶する。ムービー構成情報２２４は、図１２に示すように、楽曲ＩＤと紐づけて、楽曲構成、時間、適用エフェクト、使用画像枚数、使用画像ファイル名が格納される。 When the control unit 201 (photo movie generation unit 25) generates a photo movie, it stores the movie configuration information 224 of the generated photo movie in the storage unit 202 in association with the music ID. As shown in FIG. 12, the movie configuration information 224 stores the music composition, time, applied effect, number of images used, and file names of the images used, in association with the music ID.

図１２の例は、楽曲ＩＤ「＊＊＊＊＊（アルバム名＋アーティスト名＋曲名＋再生時間等）」の楽曲をＢＧＭとするフォトムービーのムービー構成情報２２４である。楽曲構成「イントロ」の区間の時間は「０−１５」であり、「モノクロ→カラー１」のエフェクトが適用される。このエフェクトでの使用画像枚数は「１」枚であり、使用する画像のファイル名は「image15」である。 An example of FIG. 12 is movie configuration information 224 of a photo movie in which the music of the music ID “***** (album name + artist name + song name + playback time, etc.)” is used as BGM. The time of the section of the music composition "intro" is "0-15", and the effect of "monochrome → color 1" is applied. The number of images used in this effect is "1", and the file name of the image used is "image15".

また、楽曲構成「Ａメロ」の区間の時間は「１５−５０」であり、「拡大回転３」のエフェクトが適用される。このエフェクトでの使用画像枚数は「２」枚であり、使用する画像のファイル名は、前景は「image8」（拡大回転する画像）、背景は「image2」である。 Further, the time of the section of the music composition "A melody" is "15-50", and the effect of "enlarged rotation 3" is applied. The number of images used in this effect is "2", and the file name of the image used is "image8" (enlarged and rotated image) in the foreground and "image2" in the background.

また、楽曲構成「Ｂメロ」の区間の時間は「５０−７０」であり、「縮小回転２」のエフェクトが適用される。このエフェクトでの使用画像枚数は「２」枚であり、使用する画像のファイル名は、前景は「image8」（縮小回転する画像）、背景は「image17」である。 Further, the time of the section of the music composition "B melody" is "50-70", and the effect of "reduction rotation 2" is applied. The number of images used in this effect is "2", and the file name of the image used is "image8" (reduced and rotated image) in the foreground and "image17" in the background.

また、楽曲構成「サビ」の区間の時間は「７０−１２０」であり、「画像切替５」のエフェクトが適用される。このエフェクトでの使用画像枚数は「５」枚であり、使用する画像のファイル名は、「image10」（最初に表示される画像）、「image11」（２番目に表示される画像）、「image12」（３番目に表示される画像）、「image13」（４番目に表示される画像）、「image14」（５番目に表示される画像）である。 Further, the time of the section of the music composition "chorus" is "70-120", and the effect "image switching 5" is applied. The number of images used in this effect is "5", and the file names of the images used are "image10" (the image displayed first), "image11" (the image displayed second), "image12" (the image displayed third), "image13" (the image displayed fourth), and "image14" (the image displayed fifth).

また、楽曲構成「バスドラムタイミング」の時間（時刻）は「７０、１１０、１１１、１１２、１１３」であり、「出現５」のエフェクトが適用される。このエフェクトでの使用画像枚数は「６」枚であり、使用する画像のファイル名は、背景は「image15」、前景は「人物１」（最初のバスドラムタイミング（７０）において出現する画像）、「人物２」（２番目のバスドラムタイミング（１１０）において出現する画像）、「人物３」（３番目のバスドラムタイミング（１１１）において出現する画像）、「人物４」（４番目のバスドラムタイミング（１１２）において出現する画像）、「人物５」（５番目のバスドラムタイミング（１１３）において出現する画像）である。 Further, the times of the music composition "bass drum timing" are "70, 110, 111, 112, 113", and the "appearance 5" effect is applied. The number of images used in this effect is "6". The file name of the background image is "image15", and the foreground images are "person 1" (the image that appears at the first bass drum timing (70)), "person 2" (the image that appears at the second bass drum timing (110)), "person 3" (the image that appears at the third bass drum timing (111)), "person 4" (the image that appears at the fourth bass drum timing (112)), and "person 5" (the image that appears at the fifth bass drum timing (113)).
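The "appearance 5" example can be illustrated with a minimal Python sketch. The names `BASS_DRUM_TIMINGS`, `FOREGROUND_IMAGES`, and `visible_foreground` are hypothetical, and the sketch assumes that an image stays on screen once it has appeared, which the description does not state.

```python
from bisect import bisect_right
from typing import List

# Values taken from the FIG. 12 example; the time unit is not stated in the text.
BASS_DRUM_TIMINGS = [70, 110, 111, 112, 113]
FOREGROUND_IMAGES = ["person 1", "person 2", "person 3", "person 4", "person 5"]
BACKGROUND_IMAGE = "image15"


def visible_foreground(t: float) -> List[str]:
    """Return the foreground images that have appeared by time t.

    Assumption (not in the source): an image remains visible after it appears.
    """
    # bisect_right counts how many bass drum timings are <= t.
    count = bisect_right(BASS_DRUM_TIMINGS, t)
    return FOREGROUND_IMAGES[:count]
```

Under these assumptions, `visible_foreground(110)` yields the first two person images, since two bass drum timings have occurred by then.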

ムービー構成情報２２４の「楽曲構成」と「時間」の項目は、楽曲の音響特徴量データ４５に対応している。「適用エフェクト」の項目は、図９に示すエフェクト対応テーブル２２３から取得され、「使用画像枚数」の項目は、図８に示すエフェクトテンプレート２２１から取得される。どのエフェクト対応テーブル２２３を使用するかは、制御部２０１（フォトムービー生成部２５）が図１０の処理を実行することにより決定される。また、「使用画像ファイル名」の項目は、画像取得部２３により取得した画像や画像素材生成部２４により生成した画像から、制御部２０１（フォトムービー生成部２５）が任意に決定する。 The "music composition" and "time" items of the movie configuration information 224 correspond to the acoustic feature amount data 45 of the music. The "applied effect" item is acquired from the effect correspondence table 223 shown in FIG. 9, and the "number of images used" item is acquired from the effect template 221 shown in FIG. 8. Which effect correspondence table 223 is used is determined by the control unit 201 (photo movie generation unit 25) executing the process of FIG. 10. Further, the "image file name to be used" item is arbitrarily determined by the control unit 201 (photo movie generation unit 25) from the images acquired by the image acquisition unit 23 and the images generated by the image material generation unit 24.
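As an illustration only, the section entries of the movie configuration information 224 in the FIG. 12 example could be held in a structure like the following Python sketch. The class `SectionEntry`, the list `MOVIE_CONFIG`, and the function `section_at` are hypothetical names. Treating each time range as half-open is an assumption, and the time unit is not stated in the description. The label "chorus" here renders the Japanese section name サビ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SectionEntry:
    composition: str    # "music composition" item
    start: int          # start of the "time" item
    end: int            # end of the "time" item (treated as exclusive here)
    effect: str         # "applied effect" item
    images: List[str]   # "image file name to be used" item


# Section rows from the FIG. 12 example.
MOVIE_CONFIG = [
    SectionEntry("intro", 0, 15, "monochrome -> color 1", ["image15"]),
    SectionEntry("A melody", 15, 50, "enlarged rotation 3", ["image8", "image2"]),
    SectionEntry("B melody", 50, 70, "reduction rotation 2", ["image8", "image17"]),
    SectionEntry("chorus", 70, 120, "image switching 5",
                 ["image10", "image11", "image12", "image13", "image14"]),
]


def section_at(t: float) -> Optional[SectionEntry]:
    """Return the section whose time range contains t, or None if there is none."""
    for entry in MOVIE_CONFIG:
        if entry.start <= t < entry.end:
            return entry
    return None
```

In this sketch the "number of images used" item is simply `len(entry.images)`, so it is not stored separately.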

図６の説明に戻る。 Returning to the description of FIG. 6.
ステップＳ１０８の処理によりフォトムービーが生成されると、サーバ２は生成したフォトムービーをユーザ端末３に送信する（ステップＳ１０９）。 When the photo movie is generated by the process of step S108, the server 2 transmits the generated photo movie to the user terminal 3 (step S109).

ユーザ端末３はサーバ２から送信されるフォトムービーを受信し（ステップＳ１１０）、フォトムービーを再生する（ステップＳ１１１）。受信したフォトムービーが音楽付きフォトムービーである場合は、制御部３０１（再生処理部３５）は、受信したフォトムービーを復号し再生することで音楽及び映像が同期再生される。 The user terminal 3 receives the photo movie transmitted from the server 2 (step S110) and reproduces the photo movie (step S111). When the received photo movie is a photo movie with music, the control unit 301 (reproduction processing unit 35) decodes and reproduces the received photo movie to synchronize the music and the video.

また、受信したフォトムービーが音楽無しフォトムービーである場合は、制御部３０１（再生処理部３５）は、指定した楽曲を記憶部３０２から読み出し、フォトムービーと同期させて再生する。ユーザ端末３の制御部３０１（再生処理部３５）は、フォトムービーと音源データとを同期再生することで、楽曲構成の変化に応じて画面が変化するフォトムービーが再生される。 When the received photo movie is a photo movie without music, the control unit 301 (reproduction processing unit 35) reads the designated music from the storage unit 302 and reproduces the music in synchronization with the photo movie. The control unit 301 (reproduction processing unit 35) of the user terminal 3 synchronously reproduces the photo movie and the sound source data, so that the photo movie whose screen changes according to the change in the music composition is reproduced.
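The two playback cases can be sketched as follows. This is a minimal Python illustration, not the actual playback processing unit 35: `PhotoMovie`, `resolve_audio`, and the dictionary `local_store` (standing in for the storage unit 302) are hypothetical, and real decoding and audio/video synchronization are omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PhotoMovie:
    music_id: str
    video: bytes
    audio: Optional[bytes] = None  # None models a photo movie without music


def resolve_audio(movie: PhotoMovie, local_store: Dict[str, bytes]) -> bytes:
    """Pick the audio to play in synchronization with the movie's video."""
    if movie.audio is not None:
        # Photo movie with music: the audio is already part of the movie.
        return movie.audio
    # Photo movie without music: read the designated music from local storage.
    return local_store[movie.music_id]
```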

以上説明したように、第１の実施形態のフォトムービー生成システム１は、ユーザ端末３において楽曲及び画像を指定してサーバ２に音源データや画像データを送信すると、サーバ２側で音源データの音響解析処理を行い、音響特徴量データ４５に基づいてフォトムービーを生成する。ユーザ端末３は、サーバ２で生成したフォトムービーを受信し、音楽無しフォトムービーの場合は、音源データと同期再生する。 As described above, in the photo movie generation system 1 of the first embodiment, when music and images are designated on the user terminal 3 and the sound source data and image data are transmitted to the server 2, the server 2 performs acoustic analysis processing on the sound source data and generates a photo movie based on the acoustic feature amount data 45. The user terminal 3 receives the photo movie generated by the server 2 and, in the case of a photo movie without music, plays it back in synchronization with the sound source data.

このように、フォトムービー生成システム１は、音響特徴量データ４５から検出できる楽曲の変化に応じて画面が変化するフォトムービーを生成してユーザに提供できる。また、サーバ２は、以前にフォトムービーを生成した楽曲については、ムービー構成情報２２４を蓄積記憶しておき、別のパターンのフォトムービーを生成するため、ユーザに多様な楽しみを提供できる。また、サーバ２は、以前に取得した音源データについては、音響特徴量データ４５を蓄積記憶しておくため、音響解析処理を省略して短時間でフォトムービーを生成できる。また、音楽無しフォトムービーを生成してユーザ端末３側で音源データと同期再生させることにより、音楽付きフォトムービーよりもファイルサイズを小さくでき、保存容量や通信容量を低減できる。 As described above, the photo movie generation system 1 can generate a photo movie whose screen changes according to the change of the music that can be detected from the acoustic feature amount data 45 and provide the user with the photo movie. Further, the server 2 stores and stores the movie configuration information 224 for the music for which the photo movie has been generated before, and generates a photo movie of another pattern, so that the user can be provided with various enjoyments. Further, since the server 2 stores and stores the acoustic feature amount data 45 for the previously acquired sound source data, the acoustic analysis process can be omitted and the photo movie can be generated in a short time. Further, by generating a photo movie without music and playing it in synchronization with the sound source data on the user terminal 3, the file size can be made smaller than that of the photo movie with music, and the storage capacity and communication capacity can be reduced.

［第２の実施形態］ [Second Embodiment]
次に、第２の実施形態のフォトムービー生成システム１Ａについて説明する。第２の実施形態では、ユーザ端末３が音源データを持たない場合のシステム構成について説明する。 Next, the photo movie generation system 1A of the second embodiment will be described. In the second embodiment, the system configuration when the user terminal 3 does not have the sound source data will be described.

図１３は本発明の第２の実施形態に係るフォトムービー生成システム１Ａの全体構成を示す図である。図に示すように、フォトムービー生成システム１Ａは、ユーザ端末３及びサーバ２がネットワーク５を介して通信接続される。また、ユーザ端末３及びサーバ２は、楽曲データベース４０を備えた音楽配信サーバ４とネットワーク５を介して通信接続可能となっている。サーバ２及びユーザ端末３のハードウェア構成は第１の実施形態と同様である。第１の実施形態と同一の各部は同一の符号を付し、重複する説明を省略する。 FIG. 13 is a diagram showing the overall configuration of the photo movie generation system 1A according to the second embodiment of the present invention. As shown in the figure, in the photo movie generation system 1A, the user terminal 3 and the server 2 are communicated and connected via the network 5. Further, the user terminal 3 and the server 2 can be communicated and connected to the music distribution server 4 provided with the music database 40 via the network 5. The hardware configuration of the server 2 and the user terminal 3 is the same as that of the first embodiment. The same parts as those in the first embodiment are designated by the same reference numerals, and duplicate description will be omitted.

図１４は、音楽配信サーバ４の構成を示す図である。図に示すように、音楽配信サーバ４は、例えば制御部４０１、記憶部４０２、通信部４０３等をバス４０４等により接続して構成したコンピュータにより実現できる。但しこれに限ることなく、適宜様々な構成をとることができる。制御部４０１、記憶部４０２、通信部４０３の構成は、サーバ２の制御部２０１、記憶部２０２、通信部２０３の構成と同様である。音楽配信サーバ４の楽曲データベース４０には、楽曲の音源データ４４及び音響特徴量データ４５が記憶される。 FIG. 14 is a diagram showing the configuration of the music distribution server 4. As shown in the figure, the music distribution server 4 can be realized by, for example, a computer configured by connecting a control unit 401, a storage unit 402, a communication unit 403, and the like by a bus 404 or the like. However, the present invention is not limited to this, and various configurations can be taken as appropriate. The configuration of the control unit 401, the storage unit 402, and the communication unit 403 is the same as the configuration of the control unit 201, the storage unit 202, and the communication unit 203 of the server 2. The music database 40 of the music distribution server 4 stores music sound source data 44 and acoustic feature data 45.

次に、図１５を参照して第２の実施形態のフォトムービー生成システム１Ａの機能構成について説明する。図に示すように、フォトムービー生成システム１Ａにおいて、サーバ２は、記憶部２０２、音響特徴量データ取得部２６、画像取得部２３、画像素材生成部２４、及びフォトムービー生成部２５等を備える。ユーザ端末３は、記憶部３０２、表示部３０４、音声出力部３０７の他、楽曲指定部３１、画像指定部３２、送信部３３、受信部３４、音源データ取得部３６等を備える。音楽配信サーバ４は、楽曲データベース４０、送信部４１、及び音響解析部４２を備える。 Next, the functional configuration of the photo movie generation system 1A of the second embodiment will be described with reference to FIG. 15. As shown in the figure, in the photo movie generation system 1A, the server 2 includes a storage unit 202, an acoustic feature data acquisition unit 26, an image acquisition unit 23, an image material generation unit 24, a photo movie generation unit 25, and the like. The user terminal 3 includes a storage unit 302, a display unit 304, an audio output unit 307, a music designation unit 31, an image designation unit 32, a transmission unit 33, a reception unit 34, a sound source data acquisition unit 36, and the like. The music distribution server 4 includes a music database 40, a transmission unit 41, and an acoustic analysis unit 42.

第２の実施形態のフォトムービー生成システム１Ａにおいて、第１の実施形態のフォトムービー生成システム１と異なる点は、サーバ２に音源データ取得部２１及び音響解析部２２を設けず、音響特徴量データ取得部２６を設けた点、ユーザ端末３の記憶部３０２に音源データを記憶せず音源データ取得部３６を設けた点、及び楽曲データベース４０及び音響解析部４２を備えた音楽配信サーバ４を備えた点である。 The photo movie generation system 1A of the second embodiment differs from the photo movie generation system 1 of the first embodiment in three points: the server 2 is provided with the acoustic feature data acquisition unit 26 instead of the sound source data acquisition unit 21 and the acoustic analysis unit 22; the user terminal 3 does not store sound source data in the storage unit 302 and is provided with the sound source data acquisition unit 36; and the system includes the music distribution server 4 having the music database 40 and the acoustic analysis unit 42.

音楽配信サーバ４は、様々な楽曲の音源データ４４や音響特徴量データ４５、楽曲歌詞情報、楽曲書誌情報等を楽曲ＩＤと紐づけて楽曲データベース４０に記憶している。音源データ４４は、図１６に示すように、楽曲ＩＤに紐づけられて記憶されている。 The music distribution server 4 stores the sound source data 44 of various pieces of music, the acoustic feature amount data 45, lyrics information, bibliographic information, and the like in the music database 40 in association with the music ID. As shown in FIG. 16, the sound source data 44 is stored in association with the music ID.

音楽配信サーバ４の音響解析部４２は、楽曲データベース４０に記憶されている楽曲について音響解析処理を実施し、解析結果として音響特徴量データ４５を求め、楽曲ＩＤと紐づけて楽曲データベース４０に記憶する。音響特徴量データ４５は、図５に示す音響特徴量データ４５と同様とする。 The acoustic analysis unit 42 of the music distribution server 4 performs acoustic analysis processing on the music stored in the music database 40, obtains the acoustic feature amount data 45 as the analysis result, and stores it in the music database 40 in association with the music ID. The acoustic feature amount data 45 is the same as the acoustic feature amount data 45 shown in FIG. 5.

音楽配信サーバ４の送信部４１は、ユーザ端末３からの要求に応じて音源データ４４をユーザ端末３に送信する。また、サーバ２からの要求に応じて音源データ４４や音響特徴量データ４５をサーバ２に送信する。 The transmission unit 41 of the music distribution server 4 transmits the sound source data 44 to the user terminal 3 in response to a request from the user terminal 3. Further, the sound source data 44 and the acoustic feature amount data 45 are transmitted to the server 2 in response to the request from the server 2.

サーバ２の音響特徴量データ取得部２６は、ユーザ端末３の楽曲指定部３１により指定された楽曲の楽曲ＩＤを受信すると、楽曲ＩＤに対応する楽曲の音響特徴量データ４５を音楽配信サーバ４に要求し、取得する。 When the acoustic feature data acquisition unit 26 of the server 2 receives the music ID of the music designated by the music designation unit 31 of the user terminal 3, it requests the acoustic feature amount data 45 of the music corresponding to the music ID from the music distribution server 4 and acquires it.

ユーザ端末３の音源データ取得部３６は、楽曲指定部３１により指定された楽曲の音源データ４４を音楽配信サーバ４に要求し、取得する。音楽無しフォトムービーがサーバ２から送信された場合、ユーザ端末３の再生処理部３５は、音源データ取得部３６によって音楽配信サーバ４から音源データを取得し、受信した音楽無しフォトムービーと同期再生する。 The sound source data acquisition unit 36 of the user terminal 3 requests the sound source data 44 of the music designated by the music designation unit 31 from the music distribution server 4 and acquires it. When a photo movie without music is transmitted from the server 2, the playback processing unit 35 of the user terminal 3 acquires the sound source data from the music distribution server 4 through the sound source data acquisition unit 36 and plays it back in synchronization with the received photo movie without music.

次に、図１７を参照して、第２の実施形態のフォトムービー生成システム１Ａにおける処理の流れを説明する。 Next, the flow of processing in the photo movie generation system 1A of the second embodiment will be described with reference to FIG. 17.
ユーザ端末３において、フォトムービーアプリを起動すると、ユーザ端末３の制御部３０１は、フォトムービーに使用する楽曲の指定（ステップＳ４０１）、及び画像の指定（ステップＳ４０２）を受け付ける。ステップＳ４０１〜ステップＳ４０２において、制御部３０１は、楽曲や画像を選択入力するための入力画面を表示してもよい。 When the photo movie application is started on the user terminal 3, the control unit 301 of the user terminal 3 accepts the designation of the music to be used for the photo movie (step S401) and the designation of the image (step S402). In steps S401 to S402, the control unit 301 may display an input screen for selecting and inputting music or images.

第２の実施形態では、指定できる楽曲は、音楽配信サーバ４の楽曲データベース４０に記憶されている楽曲（音源データ４４）とする。また、指定できる画像は、第１の実施形態と同様に、ユーザ端末３の記憶部３０２に記憶されている画像、及びサーバ２に記憶されている一般画像素材２２２とする。 In the second embodiment, the music that can be specified is the music (sound source data 44) stored in the music database 40 of the music distribution server 4. Further, the images that can be specified are the image stored in the storage unit 302 of the user terminal 3 and the general image material 222 stored in the server 2 as in the first embodiment.

楽曲及び画像が指定されると、制御部３０１は、指定された画像データを記憶部３０２から取得し（ステップＳ４０３）、指定された楽曲の楽曲ＩＤとともにサーバ２に送信する（ステップＳ４０４）。なお、一般画像素材２２２が指定された場合は、制御部３０１は、指定された一般画像素材２２２の識別情報をサーバ２に送信する。 When the music and the image are designated, the control unit 301 acquires the designated image data from the storage unit 302 (step S403) and transmits it to the server 2 together with the music ID of the designated music (step S404). When the general image material 222 is specified, the control unit 301 transmits the identification information of the designated general image material 222 to the server 2.

サーバ２は、楽曲ＩＤ及び画像データ（一般画像素材２２２が指定された場合は、一般画像素材２２２の識別情報）を受信すると（ステップＳ４０５）、制御部２０１（音響特徴量データ取得部２６）は、受信した楽曲ＩＤに対応する音響特徴量データ４５を音楽配信サーバ４に要求する（ステップＳ４０６）。 When the server 2 receives the music ID and the image data (or, when the general image material 222 is specified, the identification information of the general image material 222) (step S405), the control unit 201 (acoustic feature data acquisition unit 26) requests the acoustic feature amount data 45 corresponding to the received music ID from the music distribution server 4 (step S406).

音楽配信サーバ４はサーバ２から楽曲ＩＤに対応する音響特徴量データ４５の要求を受信すると（ステップＳ４０７）、要求された楽曲ＩＤに対応する音響特徴量データ４５を楽曲データベース４０から読み出し、サーバ２に送信する（ステップＳ４０８）。サーバ２は楽曲ＩＤに対応する音響特徴量データ４５を受信する（ステップＳ４０９）。 When the music distribution server 4 receives the request for the acoustic feature amount data 45 corresponding to the music ID from the server 2 (step S407), it reads the acoustic feature amount data 45 corresponding to the requested music ID from the music database 40 and transmits it to the server 2 (step S408). The server 2 receives the acoustic feature amount data 45 corresponding to the music ID (step S409).

なお、ステップＳ４０８において、音楽配信サーバ４の楽曲データベース４０に該当する音響特徴量データ４５が記憶されていない場合、音楽配信サーバ４は要求された楽曲ＩＤの楽曲について音響解析部４２により音響解析処理を実行し、音響特徴量データ４５を得る。音楽配信サーバ４は音響特徴量データ４５を楽曲ＩＤと紐づけて楽曲データベース４０に記憶するとともに、要求元のサーバ２に送信する。 In step S408, when the corresponding acoustic feature amount data 45 is not stored in the music database 40 of the music distribution server 4, the music distribution server 4 has the acoustic analysis unit 42 perform acoustic analysis processing on the music of the requested music ID and obtains the acoustic feature amount data 45. The music distribution server 4 stores the acoustic feature amount data 45 in the music database 40 in association with the music ID and transmits it to the requesting server 2.
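The fallback in step S408 amounts to "look up, and analyze only on a miss". A minimal Python sketch follows, under assumptions: the class name, the `analyze` callable standing in for the acoustic analysis unit 42, and the `analysis_runs` counter are all hypothetical and appear nowhere in the description.

```python
from typing import Callable, Dict


class MusicDistributionServerSketch:
    """Toy stand-in for the music distribution server 4 and its music database 40."""

    def __init__(self, sound_sources: Dict[str, bytes],
                 analyze: Callable[[bytes], dict]) -> None:
        self.sound_sources = sound_sources   # music ID -> sound source data 44
        self.features: Dict[str, dict] = {}  # music ID -> acoustic feature data 45
        self._analyze = analyze              # hypothetical acoustic analysis routine
        self.analysis_runs = 0               # counter used only for illustration

    def get_features(self, music_id: str) -> dict:
        """Return stored feature data, analyzing the sound source only on a miss."""
        if music_id not in self.features:
            self.analysis_runs += 1
            # Store the result in association with the music ID for later requests.
            self.features[music_id] = self._analyze(self.sound_sources[music_id])
        return self.features[music_id]
```

With this shape, a second request for the same music ID returns the stored data without running the analysis again.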

また、サーバ２の制御部２０１（画像素材生成部２４）は、ステップＳ４０５で受信した画像データについて必要に応じて画像素材生成処理を実行し、画像データから画像素材を切り出す（ステップＳ４１０）。なお、受信した画像データをそのまま画像素材として使用してもよい。 Further, the control unit 201 (image material generation unit 24) of the server 2 executes an image material generation process as necessary for the image data received in step S405, and cuts out the image material from the image data (step S410). The received image data may be used as it is as an image material.

サーバ２の制御部２０１（フォトムービー生成部２５）は、画像素材と所定のエフェクトテンプレート２２１を使用し、音響特徴量データ４５（例えば、「楽曲構成」の変化）に応じて画面が変化するフォトムービー（音楽無しフォトムービー）を生成する（ステップＳ４１１）。 The control unit 201 (photo movie generation unit 25) of the server 2 uses the image material and a predetermined effect template 221 to generate a photo movie (a photo movie without music) whose screen changes according to the acoustic feature amount data 45 (for example, changes in the "music composition") (step S411).

フォトムービーの生成方法は第１の実施形態と同様である。すなわち、制御部２０１（フォトムービー生成部２５）は、図１０に示すように、ユーザ端末３から指定された楽曲ＩＤに対応するムービー構成情報２２４が既に記憶部２０２に記憶されているか否かを判定し、記憶部２０２に記憶されていない場合はエフェクト対応テーブル２２３から任意のパターンのエフェクト対応テーブル２２３を使用してフォトムービーを生成する。 The method of generating the photo movie is the same as that of the first embodiment. That is, as shown in FIG. 10, the control unit 201 (photo movie generation unit 25) determines whether or not the movie configuration information 224 corresponding to the music ID designated from the user terminal 3 is already stored in the storage unit 202. If it is determined and not stored in the storage unit 202, a photo movie is generated from the effect correspondence table 223 using the effect correspondence table 223 of an arbitrary pattern.

また、ユーザ端末３から指定された楽曲ＩＤに対応するムービー構成情報２２４が既に記憶部２０２に記憶されている場合は、制御部２０１は、エフェクト対応テーブル２２３から未使用のパターンのエフェクト対応テーブル２２３を読み出し（ステップＳ３０４）、読み出したパターンのエフェクト対応テーブル２２３に従ってフォトムービーを生成する。 Further, when the movie configuration information 224 corresponding to the music ID designated from the user terminal 3 is already stored in the storage unit 202, the control unit 201 reads an effect correspondence table 223 of an unused pattern from among the effect correspondence tables 223 (step S304) and generates a photo movie according to the effect correspondence table 223 of the read pattern.
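The pattern selection described here can be sketched in Python as follows. The function `choose_pattern` and the `used` mapping are hypothetical. The description does not say what happens once every pattern has been used for a music ID, so the fallback to any pattern is an assumption made only to keep the sketch total.

```python
import random
from typing import Dict, List, Optional, Set


def choose_pattern(music_id: str,
                   all_patterns: List[str],
                   used: Dict[str, Set[str]],
                   rng: Optional[random.Random] = None) -> str:
    """Pick an effect correspondence table pattern not yet used for this music ID."""
    rng = rng or random.Random()
    previously_used = used.get(music_id, set())
    # No stored history means every pattern is a candidate (the "arbitrary" case).
    candidates = [p for p in all_patterns if p not in previously_used]
    if not candidates:
        # Assumption: when all patterns are exhausted, any pattern may be reused.
        candidates = list(all_patterns)
    choice = rng.choice(candidates)
    used.setdefault(music_id, set()).add(choice)
    return choice
```

Calling this repeatedly for the same music ID yields a different pattern each time until the available patterns run out.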

制御部２０１（フォトムービー生成部２５）はフォトムービーを生成すると、生成したフォトムービーのムービー構成情報２２４を楽曲ＩＤと紐づけて記憶部２０２に記憶する。ムービー構成情報２２４は、図１２に示すように、楽曲ＩＤと紐づけて、楽曲構成、時間、適用エフェクト、使用画像枚数、使用画像ファイル名が格納される。また、サーバ２は生成したフォトムービーをユーザ端末３に送信する（ステップＳ４１２）。 When the control unit 201 (photo movie generation unit 25) generates a photo movie, the control unit 201 (photo movie generation unit 25) stores the generated photo movie movie configuration information 224 in the storage unit 202 in association with the music ID. As shown in FIG. 12, the movie configuration information 224 stores the music composition, time, applied effect, number of images used, and image file name used in association with the music ID. Further, the server 2 transmits the generated photo movie to the user terminal 3 (step S412).

ユーザ端末３はサーバ２からフォトムービーを受信すると（ステップＳ４１３）、再生処理を行う。受信したフォトムービーは音楽無しフォトムービーであるため、制御部３０１（再生処理部３５）は、ステップＳ４０１で指定した楽曲の楽曲ＩＤの音源データ４４を音楽配信サーバ４に要求する（ステップＳ４１４）。 When the user terminal 3 receives the photo movie from the server 2 (step S413), the user terminal 3 performs a playback process. Since the received photo movie is a photo movie without music, the control unit 301 (reproduction processing unit 35) requests the music distribution server 4 for the sound source data 44 of the music ID of the music specified in step S401 (step S414).

音楽配信サーバ４は、ユーザ端末３から楽曲ＩＤの音源データ４４の要求を受信すると（ステップＳ４１５）、要求された楽曲ＩＤの音源データ４４を楽曲データベース４０から読み出し、要求元のユーザ端末３に送信する（ステップＳ４１６）。 When the music distribution server 4 receives the request for the sound source data 44 of the music ID from the user terminal 3 (step S415), the music distribution server 4 reads the sound source data 44 of the requested music ID from the music database 40 and transmits it to the requesting user terminal 3. (Step S416).

ユーザ端末３は、音楽配信サーバ４から音源データ４４を受信すると（ステップＳ４１７）、ユーザ端末３の制御部３０１（再生処理部３５）は、ステップＳ４１３で受信した音楽無しフォトムービーと音源データ４４とを同期させて再生する（ステップＳ４１８）。これにより、楽曲構成の変化に応じて画面が変化するフォトムービーを再生できる。 When the user terminal 3 receives the sound source data 44 from the music distribution server 4 (step S417), the control unit 301 (reproduction processing unit 35) of the user terminal 3 plays back the photo movie without music received in step S413 in synchronization with the sound source data 44 (step S418). This makes it possible to play a photo movie whose screen changes according to changes in the music composition.
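The exchange among the three parties in steps S404 to S418 can be summarized with a toy Python sketch. All class and function names are hypothetical, network transport is replaced by direct method calls, and the "one image per section" scene assignment is a placeholder for the effect-based generation, which the sketch does not attempt to reproduce.

```python
from typing import Dict, List, Tuple

Section = Tuple[str, int, int]  # (music composition, start, end)


class MusicServerSketch:
    """Stands in for the music distribution server 4."""

    def __init__(self, db: Dict[str, dict]) -> None:
        self.db = db  # music ID -> {"audio": bytes, "features": [Section, ...]}

    def features(self, music_id: str) -> List[Section]:  # steps S407-S408
        return self.db[music_id]["features"]

    def audio(self, music_id: str) -> bytes:             # steps S415-S416
        return self.db[music_id]["audio"]


class MovieServerSketch:
    """Stands in for the server 2."""

    def __init__(self, music_server: MusicServerSketch) -> None:
        self.music_server = music_server

    def generate(self, music_id: str, images: List[str]) -> dict:
        sections = self.music_server.features(music_id)  # steps S406, S409
        # Placeholder for step S411: assign one image per section, cycling.
        scenes = [(sec, images[i % len(images)]) for i, sec in enumerate(sections)]
        return {"music_id": music_id, "scenes": scenes}  # movie without music


def user_terminal_flow(music_id: str, images: List[str],
                       movie_server: MovieServerSketch,
                       music_server: MusicServerSketch) -> Tuple[dict, bytes]:
    movie = movie_server.generate(music_id, images)  # steps S404, S412-S413
    audio = music_server.audio(music_id)             # steps S414, S417
    return movie, audio                              # inputs to playback (S418)
```

Note that the sound source data never passes between the user terminal and the movie server in this flow; only the music ID and the images do.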

なお、上述の例は、音楽無しフォトムービーを生成する場合の手順について説明したが、ステップＳ４０６でサーバ２に音楽配信サーバ４に対して音響特徴量データ４５の送信を要求する際に、音源データ４４の送信も要求し、該当の楽曲の音源データ４４を取得すれば、音楽付きフォトムービーを生成することも可能である。サーバ２により音楽付きフォトムービーが生成された場合、ユーザ端末３は音楽付きフォトムービーを受信し、制御部３０１（再生処理部３５）により受信したフォトムービーを復号し再生することで音楽及び映像が同時に再生される。 The above example describes the procedure for generating a photo movie without music. However, if the server 2, when requesting the music distribution server 4 to transmit the acoustic feature amount data 45 in step S406, also requests transmission of the sound source data 44 and acquires the sound source data 44 of the corresponding music, it can also generate a photo movie with music. When the server 2 generates a photo movie with music, the user terminal 3 receives it, and the control unit 301 (reproduction processing unit 35) decodes and plays the received photo movie so that the music and video are played back simultaneously.

以上のように、第２の実施形態では、ユーザ端末３が音源データを持たず、サーバ２が音楽配信サーバ４から音響特徴量データ４５を取得してフォトムービーを生成するフォトムービー生成システム１Ａについて説明した。第２の実施形態のフォトムービー生成システム１Ａによれば、サーバ２は音響解析処理を実施せず、既に解析済みの音響特徴量データ４５を音楽配信サーバ４から取得してフォトムービーの生成に使用するため、サーバ２の処理負担が軽減される。また、ユーザ端末３とサーバ２との間で音源データ４４の送受信を行わないため、通信量が少なくなり、通信に要する時間が短縮される。 As described above, in the second embodiment, the photo movie generation system 1A in which the user terminal 3 does not have sound source data and the server 2 acquires the acoustic feature amount data 45 from the music distribution server 4 to generate a photo movie. explained. According to the photo movie generation system 1A of the second embodiment, the server 2 does not perform the acoustic analysis process, but acquires the already analyzed acoustic feature data 45 from the music distribution server 4 and uses it for generating the photo movie. Therefore, the processing load on the server 2 is reduced. Further, since the sound source data 44 is not transmitted / received between the user terminal 3 and the server 2, the amount of communication is reduced and the time required for communication is shortened.

以上、添付図面を参照して、本発明の好適な実施形態について説明したが、本発明は係る例に限定されない。例えば、音楽配信サーバ４に音響解析部４２や音響特徴量データ４５を持たず、サーバ２が音響解析部４２を備え、音楽配信サーバ４から受信した楽曲について音響解析処理を実施し、各楽曲の音響特徴量データ４５を蓄積記憶しておく構成としてもよい。 Although preferred embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to such examples. For example, the music distribution server 4 may have neither the acoustic analysis unit 42 nor the acoustic feature amount data 45; instead, the server 2 may include the acoustic analysis unit 42, perform acoustic analysis processing on the music received from the music distribution server 4, and accumulate and store the acoustic feature amount data 45 of each piece of music.

また、上述の実施形態では、サーバ２は、主に楽曲構成の変化に応じて表示が変化するフォトムービーを生成する例を示したが、本発明はこれに限定されず、楽曲構成以外の変化、例えば、楽曲のテンポの変化、ビート（拍子）の変化等に応じて表示が変化するフォトムービーを生成してもよい。その他、当業者であれば、本願で開示した技術的思想の範疇内において、各種の変更例または修正例に想到し得ることは明らかであり、それらについても当然に本発明の技術的範囲に属するものと了解される。 Further, the above-described embodiments show examples in which the server 2 generates a photo movie whose display changes mainly according to changes in the music composition. However, the present invention is not limited to this, and the server 2 may generate a photo movie whose display changes according to changes other than the music composition, for example, changes in the tempo or the beat (meter) of the music. In addition, it is clear that a person skilled in the art could conceive of various changes or modifications within the scope of the technical idea disclosed in the present application, and these are naturally understood to belong to the technical scope of the present invention.

１・・・・・・・・・・フォトムービー生成システム
２・・・・・・・・・・サーバ
３・・・・・・・・・・ユーザ端末
４・・・・・・・・・・音楽配信サーバ
５・・・・・・・・・・ネットワーク
２１・・・・・・・・・音源データ取得部
２２・・・・・・・・・音響解析部
２３・・・・・・・・・画像取得部
２４・・・・・・・・・画像素材生成部
２５・・・・・・・・・フォトムービー生成部
２６・・・・・・・・・音響特徴量データ取得部
３１・・・・・・・・・楽曲指定部
３２・・・・・・・・・画像指定部
３３・・・・・・・・・送信部
３４・・・・・・・・・受信部
３５・・・・・・・・・再生処理部
３６・・・・・・・・・音源データ取得部
４０・・・・・・・・・楽曲データベース
４１・・・・・・・・・送信部
４２・・・・・・・・・音響解析部
４４・・・・・・・・・音源データ
４５・・・・・・・・・音響特徴量データ
６１、６２、６３・・・フォトムービー
２２１・・・・・・・・エフェクトテンプレート
２２２・・・・・・・・一般画像素材
２２３・・・・・・・・エフェクト対応テーブル
２２４・・・・・・・・ムービー構成情報
２０１、３０１・・・・制御部
２０２、３０２・・・・記憶部
２０３、３０３・・・・通信部
３０４・・・・・・・・入力部
３０５・・・・・・・・表示部
３０６・・・・・・・・周辺機器Ｉ／Ｆ部
３０７・・・・・・・・音声処理部

1 ... Photo movie generation system
2 ... Server
3 ... User terminal
4 ... Music distribution server
5 ... Network
21 ... Sound source data acquisition unit
22 ... Acoustic analysis unit
23 ... Image acquisition unit
24 ... Image material generation unit
25 ... Photo movie generation unit
26 ... Acoustic feature data acquisition unit
31 ... Music designation unit
32 ... Image designation unit
33 ... Transmission unit
34 ... Reception unit
35 ... Playback processing unit
36 ... Sound source data acquisition unit
40 ... Music database
41 ... Transmission unit
42 ... Acoustic analysis unit
44 ... Sound source data
45 ... Acoustic feature data
61, 62, 63 ... Photo movie
221 ... Effect template
222 ... General image material
223 ... Effect correspondence table
224 ... Movie configuration information
201, 301 ... Control unit
202, 302 ... Storage unit
203, 303 ... Communication unit
304 ... Input unit
305 ... Display unit
306 ... Peripheral device I/F unit
307 ... Audio processing unit

Claims

A photo movie generation system in which a server and a user terminal are connected via a network.
The user terminal is
A music designation means that accepts the designation of the music to be used for the photo movie and transmits the sound source data of the designated music or the music ID that is the identification information of the music to the server.
A playback means for receiving and playing back a photo movie transmitted from the server is provided.
The server
An acoustic feature data acquisition means for acquiring acoustic feature data of the music designated on the user terminal, and
An image acquisition means for acquiring an image used for the photo movie, and
A photo movie generation means that uses an image acquired by the image acquisition means to generate a photo movie whose display changes according to a change in music based on the acoustic feature amount data, and transmits the photo movie to the user terminal.
A photo movie generation system characterized by being equipped with.

The server
A storage means for storing movie composition information, which is information constituting the photo movie, in association with the music ID is provided.
The photo movie generation means is characterized in that it refers to the movie configuration information stored in association with the designated music ID and generates a photo movie having a configuration different from the movie configuration information stored in the storage means. The photo movie generation system according to claim 1.

The storage means of the server stores the acoustic feature amount data of the music in association with the music ID.
The photo movie generation device according to claim 2, wherein the acoustic feature data acquisition means acquires acoustic feature data stored in the storage means.

The acoustic feature data acquisition means of the server is
The photo movie generation system according to claim 1 or 2, wherein the acoustic feature data of the music designated by the user terminal is acquired from a music distribution server communicatively connected via a network.

The server further includes an acoustic analysis means for obtaining acoustic feature data by performing acoustic analysis processing of music.
The photo movie generation system according to claim 1 or 2, wherein the acoustic feature data acquisition means acquires acoustic feature data obtained by the acoustic analysis means.

The photo movie generation system according to claim 5, wherein the server acquires sound source data of the music from a music distribution server communicatively connected via a network.

The photo movie generation system according to claim 5, wherein the server acquires sound source data of the music from the user terminal.

The photo movie generation means of the server generates the photo movie without adding a sound source, and outputs the photo movie in association with the music ID of the music.
The photo movie generation system according to any one of claims 1 to 7, wherein the playback means of the user terminal acquires sound source data of the music corresponding to the music ID associated with the photo movie, and plays back the acquired sound source data in synchronization with the photo movie.

The server
An image material generation means for generating an image material from an image acquired by the image acquisition means is further provided.
The photo movie generation system according to any one of claims 1 to 8, wherein the photo movie generation means generates the photo movie by using the image material generated by the image material generation means.

The user terminal further includes an image transmission means for transmitting an image used for the photo movie to the server.
The photo movie generation system according to any one of claims 1 to 9, wherein the image acquisition means of the server receives an image transmitted from the user terminal.

Image acquisition means to acquire images and
Acoustic feature data acquisition means for acquiring acoustic feature data of music,
A photo movie generation means that uses an image acquired by the image acquisition means to generate a photo movie whose display changes according to a change in music based on the acoustic feature amount data.
A photo movie generator characterized by being equipped with.

Image acquisition means to acquire images and
Sound source data acquisition means for acquiring music sound source data,
An acoustic analysis means that analyzes the sound source data and acquires acoustic feature data,
A photo movie generation means that uses an image acquired by the image acquisition means to generate a photo movie whose display changes according to a change in music based on the acoustic feature amount data.
A photo movie generator characterized by being equipped with.

A user terminal that can communicate with the server via a network.
A music designation means that accepts the designation of a music and transmits the sound source data of the designated music or the music ID that is the identification information of the music to the server.
A playback means for receiving and playing a photo movie whose display changes according to a change in the music transmitted from the server.
A user terminal characterized by being provided with.

This is a photo movie generation method in a photo movie generation system in which a server and a user terminal are communicated and connected via a network.
A step in which the user terminal accepts the designation of the music to be used for the photo movie and transmits the sound source data of the designated music or the music ID which is the identification information of the music to the server.
A step in which the server acquires an image,
The step that the server acquires the acoustic feature amount data of the music specified in the user terminal,
A step in which the server uses the image to generate a photo movie whose display changes according to a change in music based on the acoustic feature amount data, and transmits the photo movie to the user terminal.
A step in which the user terminal receives and plays back a photo movie transmitted from the server.
A photo movie generation method characterized by including.

Computer,
Image acquisition means to acquire images,
Acoustic feature data acquisition means for acquiring acoustic feature data of music,
A photo movie generation means that uses an image acquired by the image acquisition means to generate a photo movie whose display changes according to a change in music based on the acoustic feature amount data.
A program to function as.

Computer,
Image acquisition means to acquire images,
Sound source data acquisition means for acquiring music sound source data,
An acoustic analysis means that analyzes the sound source data and acquires acoustic feature data.
A photo movie generation means that uses an image acquired by the image acquisition means to generate a photo movie whose display changes according to a change in music based on the acoustic feature amount data.
A program to function as.

A computer that can communicate with the server via a network
A music designation means that accepts the designation of a music and transmits the sound source data of the designated music or the music ID that is the identification information of the music to the server.
A playback means for receiving and playing a photo movie whose display changes according to a change in the music transmitted from the server.
A program to function as.