JP2022047036A

JP2022047036A - Moving image creation device, moving image creation system, and moving image creation program

Info

Publication number: JP2022047036A
Application number: JP2020152737A
Authority: JP
Inventors: Naoko Miyazaki (宮崎奈緒子)
Original assignee: Sega Toys Co Ltd
Current assignee: Sega Toys Co Ltd
Priority date: 2020-09-11
Filing date: 2020-09-11
Publication date: 2022-03-24
Anticipated expiration: 2040-09-11
Also published as: JP7024027B1

Abstract

To provide a moving image creation device, a moving image creation system, and a moving image creation program that support the creation, simply and with little effort, of moving images in which an avatar moves and sings.

SOLUTION: A moving image creation device for creating a moving image including an avatar includes: basic information acquisition means for acquiring basic information about the avatar; schedule information generation means for generating at least part of schedule information in which a plurality of action instructions for causing the avatar to perform predetermined actions are arranged in accordance with the passage of time; voice information acquisition means for acquiring voice information; and moving image data generation means for generating moving image data based on the basic information, the schedule information, and the voice information. As the schedule information generation means, schedule information automatic generation means is provided that automatically generates at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice.

SELECTED DRAWING: Figure 1

Description

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. Published on a website. Publication date: July 6, 2020 (Reiwa 2). Address: https://www.youtube.com/watch?v=gv6s3xzZY7s

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. Distributed on a website. Distribution date: July 7, 2020 (Reiwa 2). Address: https://apps.apple.com/jp/app/id1513533856

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. Published on a website. Publication date: July 7, 2020 (Reiwa 2). Address: https://www.copiperoid.com/

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. Published on a website. Publication date: July 15, 2020 (Reiwa 2). Address: https://www.youtube.com/watch?v=vkYy_xSvj1M

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. Published on a website. Publication date: July 21, 2020 (Reiwa 2). Address: https://sega.jp/product/copiperoid/

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. Published on a website. Publication date: July 21, 2020 (Reiwa 2). Address: https://www.segatoys.co.jp/

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. On July 21, 2020 (Reiwa 2), Sega Toys Co., Ltd. sent to various media outlets, by e-mail and fax, a news release document describing the "moving image creation device, moving image creation system, and moving image creation program" invented by Naoko Miyazaki.

Application filed for the exception under Article 30, Paragraph 2 of the Patent Act. Published on a website. Publication date: July 26, 2020 (Reiwa 2). Address: https://twitter.com/SEGA_OFFICIAL/

The present invention relates to a moving image creation device, a moving image creation system, and a moving image creation program for creating a moving image in which an avatar is displayed.

Moving image creation systems such as the one described in Patent Document 1 are conventionally known. Such a system supports the creation of a moving image with audio in which an avatar sings, by setting voice information and avatar information to which motion information has been added, and then synthesizing these pieces of information.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2011-133882

In Patent Document 1, a video with audio in which the avatar sings can be created smoothly by inputting, from a terminal device, avatar information about a character, video information, melody information, and lyrics information entered by the user. However, more motion information must be input to make the avatar move in a more human-like manner, so improving the quality of the moving image takes considerable time and effort.

An object of the present invention is to provide a moving image creation device, a moving image creation system, and a moving image creation program that support the creation, simply and with little effort, of a moving image in which an avatar moves and sings.

To solve the above problem, the moving image creation device of the present invention is a moving image creation device that creates a moving image including an avatar, comprising: basic information acquisition means for acquiring basic information about the avatar from its own storage unit or from another computer via a network; schedule information generation means for generating at least part of schedule information in which a plurality of action instructions for causing the avatar to perform predetermined actions are arranged in correspondence with the passage of time; voice information acquisition means for acquiring voice information from its own recording means or its own storage unit, or from another computer via a network; and moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data containing video with audio in which the avatar performs a plurality of actions as time passes. As the schedule information generation means, the device is provided with schedule information automatic generation means that automatically generates at least part of the schedule information so that the avatar's mouth moves in synchronization with the voice.

The device is further characterized in that recording means for recording the user's voice is provided, and the user's voice recorded by the recording means is used as the voice information acquired by the voice information acquisition means.

The device is further characterized in that synthetic voice generation means is provided that can convert the user's voice information recorded by the recording means into a synthetic voice.

The device is further characterized in that music information acquisition means is provided for acquiring music information used for moving image creation from its own storage unit or from another computer via a network, and the voice information acquisition means is configured to acquire the voice information from the music information acquired by the music information acquisition means.

The music information includes lyrics information and voice information of a piece of music. Lyrics information changing means is provided for editing the lyrics information of the piece of music selected by the voice information acquisition means, and the schedule information automatic generation means is configured so that, when the lyrics information has been edited by the lyrics information changing means, the movement of the avatar's mouth is changed to match the changed lyrics.
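
As a rough illustration of regenerating mouth movement after a lyrics edit, the sketch below rebuilds a list of timed mouth instructions from the lyrics text. Everything in it is an assumption made for illustration: the patent text does not say how lyrics are mapped to mouth shapes, and the romanized input, the vowel-to-mouth-shape table, the even spacing over the phrase, and the names `VOWEL_TO_MOUTH` and `mouth_schedule_from_lyrics` are all hypothetical.

```python
# Hypothetical mapping from romanized vowels to mouth-shape instructions.
VOWEL_TO_MOUTH = {
    "a": "mouth_a",
    "i": "mouth_i",
    "u": "mouth_u",
    "e": "mouth_e",
    "o": "mouth_o",
}


def mouth_schedule_from_lyrics(lyrics_romaji, start_ms, end_ms):
    """Return [(time_ms, mouth_instruction), ...] for one phrase.

    Assumption: one mouth instruction per vowel, spread evenly over
    the interval [start_ms, end_ms).
    """
    vowels = [c for c in lyrics_romaji.lower() if c in VOWEL_TO_MOUTH]
    if not vowels:
        return []
    step = (end_ms - start_ms) // len(vowels)
    return [
        (start_ms + i * step, VOWEL_TO_MOUTH[v])
        for i, v in enumerate(vowels)
    ]


# When the lyrics are edited, the mouth schedule is simply regenerated.
original = mouth_schedule_from_lyrics("sakura", 0, 3000)
edited = mouth_schedule_from_lyrics("himawari", 0, 3000)
```

In this sketch the edited lyrics replace the old mouth instructions wholesale; a real implementation would presumably align them with the sung audio rather than spacing them evenly.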

The voice information acquisition means is characterized in that it is configured to be able to acquire voice-type information, that is, information on the type of voice in which the avatar speaks, from its own storage unit or from another computer via a network.

The schedule information generated by the schedule information generation means is characterized by including position information, in which a plurality of pieces of information on the avatar's left-right position within the screen are arranged in correspondence with the passage of time.

The schedule information generated by the schedule information generation means is characterized by including scale information, in which a plurality of pieces of information on the size at which the avatar is displayed are arranged in correspondence with the passage of time.

The avatar is configured to have three-dimensional information, and the schedule information generated by the schedule information generation means is characterized by including rotational position information, in which a plurality of pieces of information on the orientation in which the avatar is displayed are arranged in correspondence with the passage of time.
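
One way to picture schedule information that carries position, scale, and rotational position over time is as three independent time-ordered tracks. The following is a minimal sketch under stated assumptions: the class names `Track` and `AvatarSchedule`, the millisecond time unit, the default values, and the "hold the most recent value" lookup rule are all hypothetical, since the text only says that the values are arranged in correspondence with the passage of time.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Track:
    """Values arranged along the time axis, kept sorted by time."""
    times_ms: list = field(default_factory=list)
    values: list = field(default_factory=list)

    def add(self, time_ms, value):
        # Insert while keeping both lists ordered by time.
        i = bisect_right(self.times_ms, time_ms)
        self.times_ms.insert(i, time_ms)
        self.values.insert(i, value)

    def value_at(self, time_ms, default):
        # Assumption: the most recent entry at or before time_ms applies.
        i = bisect_right(self.times_ms, time_ms)
        return self.values[i - 1] if i else default


@dataclass
class AvatarSchedule:
    position_x: Track = field(default_factory=Track)    # left-right position
    scale: Track = field(default_factory=Track)         # display size
    rotation_deg: Track = field(default_factory=Track)  # display orientation

    def state_at(self, time_ms):
        return {
            "position_x": self.position_x.value_at(time_ms, 0.0),
            "scale": self.scale.value_at(time_ms, 1.0),
            "rotation_deg": self.rotation_deg.value_at(time_ms, 0.0),
        }


schedule = AvatarSchedule()
schedule.position_x.add(0, -0.5)
schedule.position_x.add(2000, 0.5)
schedule.scale.add(1000, 1.5)
schedule.rotation_deg.add(3000, 90.0)
```

Calling `schedule.state_at(t)` for each frame time would give a renderer the avatar's placement at that moment; smooth interpolation between entries is a possible refinement that the text neither requires nor excludes.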

A moving image creation system according to an embodiment of the present invention comprises: a storage unit; basic information acquisition means for acquiring basic information about an avatar from the storage unit; schedule information generation means for generating at least part of schedule information in which a plurality of action instructions for causing the avatar to perform predetermined actions are arranged in correspondence with the passage of time; voice information acquisition means for acquiring voice information from its own recording means or from the storage unit; and moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data containing video with audio in which the avatar performs a plurality of actions as time passes. As the schedule information generation means, the system is provided with schedule information automatic generation means that automatically generates at least part of the schedule information so that the avatar's mouth moves in synchronization with the voice.

A moving image creation program according to an embodiment of the present invention causes a computer to execute: a basic information acquisition process of acquiring basic information about an avatar from its own storage unit or from another computer via a network; a schedule information generation process of generating at least part of schedule information in which a plurality of action instructions for causing the avatar to perform predetermined actions are arranged in correspondence with the passage of time; a voice information acquisition process of acquiring voice information from its own recording means or its own storage unit, or from another computer via a network; a moving image data generation process of generating, based on the basic information, the schedule information, and the voice information, moving image data containing video with audio in which the avatar performs a plurality of actions as time passes; and, as the schedule information generation process, a schedule information automatic generation process of automatically generating at least part of the schedule information so that the avatar's mouth moves in synchronization with the voice.

Because schedule information automatic generation means is provided, as the schedule information generation means, to automatically generate at least part of the schedule information so that the avatar's mouth moves in synchronization with the voice, a moving image in which the avatar moves in a more human-like manner can be created simply and with little effort.

Because the display position, scale, and rotational position of the avatar can be set as the schedule information that the schedule information generation means sets in correspondence with the passage of time in the moving image, the avatar can be moved more dynamically within the moving image with simple operations.

FIG. 1 is a block diagram showing an example of the configuration of the moving image creation processing device of the present invention.
FIG. 2 is a block diagram showing an example of a system configuration corresponding to an embodiment of the present invention.
FIG. 3 is a flowchart showing an example of moving image creation processing corresponding to at least one embodiment of the present invention.
FIG. 4 is a diagram showing an example of a setting operation screen for acquiring basic information.
FIG. 5 is a diagram showing an example of a setting operation screen for setting schedule information.
FIG. 6 is a diagram showing an example of an avatar display editing screen.
FIG. 7 is a flowchart showing an example of moving image creation processing corresponding to at least one embodiment of another example.
FIG. 8 is a diagram showing an example of an operation screen for acquiring music information by the music information acquisition means.
FIG. 9 is a diagram showing an example of a setting operation screen for setting schedule information in the other example.

A moving image creation processing device according to an embodiment of the present invention is described below. FIG. 1 is a block diagram showing an example of the configuration of the moving image creation processing device of the present invention. As shown in FIG. 1, the moving image creation processing device (moving image creation device) 10 includes background information acquisition means 11, basic information acquisition means 12, voice information acquisition means 13, schedule information manual generation means 14, schedule information automatic generation means 15, moving image data generation means 16, and a storage unit 17.

FIG. 2 is a block diagram showing an example of a system configuration corresponding to an embodiment of the present invention. In the illustrated example, the terminal devices 30₁ to 30ₙ used by users are configured to be connectable to a server device 20 via a communication network 40, and some of the functions of the moving image creation processing device 10 are consolidated on the server device 20 side. Alternatively, the moving image creation processing device 10 may be implemented on a single terminal device 30₁ and used offline.

The server device 20 is managed by a system administrator and is constituted by an information processing device such as a WWW server. The server device 20 includes a control unit that executes various processes, a storage medium that stores various information, a communication unit, and the other components generally found in a computer, and it has various functions for providing information on the various processes to a plurality of terminal devices.

The system configuration is not limited to the above example. A single terminal device functioning as the moving image creation processing device 10 may be shared by a plurality of users, or the system may include a plurality of server devices.

Each of the terminal devices 30₁ to 30ₙ is an information processing terminal, such as a smartphone or tablet, whose input means and output means are a touch-operable touch panel. Each terminal connects to the communication network 40 and includes the hardware and software needed to execute various processes by communicating with the server device 20. The terminal devices 30₁ to 30ₙ may also be configured to communicate directly with one another without going through the server device 20.

The background information acquisition means 11 has a function of acquiring image data or moving image data to be used as the background of the moving image being created. This data may be acquired from the storage unit of the terminal device 30₁ to 30ₙ used by the user, or from another computer via the communication network 40. It may also be captured on the spot with photographing means (not shown), such as a camera, provided in the terminal device 30₁ to 30ₙ used by the user.

The basic information acquisition means 12 has a function of acquiring basic information on the appearance of the avatar to be displayed in the moving image being created. In this example, the basic information acquisition means 12 presents the user with multiple designs for each component of the avatar (hairstyle, costume, and color) as basic information, and the user selects any design for each component, thereby creating an original avatar that matches the user's preferences. In this example, a three-dimensional avatar is generated. A specific example of the operation for setting the basic information that makes up the avatar is described later.

The voice information acquisition means 13 has a function of acquiring the voice information to be uttered by the avatar displayed in the moving image being created. In this example, the voice information acquisition means 13 has recording means for recording the user's voice with a microphone or the like provided in the terminal device 30₁ to 30ₙ used by the user, and synthetic voice generation means for editing the user's own recorded voice information into a synthetic voice. It is configured so that either the user's voice information recorded by the recording means, or that voice information edited into a synthetic voice, can be acquired as the voice information used to create the moving image.

The schedule information manual generation means 14 has a function of generating schedule information in which the following are arranged and displayed in correspondence with the passage of time of the moving image being created: motion information that causes the avatar displayed in the moving image to perform predetermined actions, caption (telop) information displayed in the moving image, effect information, the voice information, and music information on the music (background music and the like) played in the moving image. The schedule information can be entered and edited by user operation. An example of the operation for setting the schedule information is described later.
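
The manually edited schedule information described above can be pictured as a timeline of typed entries. The sketch below is only an illustration under assumptions: the names `Entry` and `Timeline`, the millisecond unit, and the start/end representation of each entry are hypothetical and are not taken from the patent text; only the five kinds of information come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    kind: str       # "motion", "telop", "effect", "voice" or "music"
    start_ms: int   # when the entry begins on the moving image's time axis
    end_ms: int     # when the entry ends (exclusive)
    payload: str    # e.g. a motion name, caption text, or audio file name


class Timeline:
    """Schedule information that the user can add to and edit."""

    def __init__(self):
        self.entries = []

    def add(self, entry):
        self.entries.append(entry)
        self.entries.sort(key=lambda e: e.start_ms)

    def remove(self, entry):
        self.entries.remove(entry)

    def active_at(self, time_ms):
        """Entries whose time span covers the given moment."""
        return [e for e in self.entries if e.start_ms <= time_ms < e.end_ms]


timeline = Timeline()
wave = Entry("motion", 0, 1500, "wave_hand")
caption = Entry("telop", 500, 3000, "Happy birthday!")
bgm = Entry("music", 0, 10000, "bgm_01")
for e in (caption, bgm, wave):
    timeline.add(e)
```

Here `timeline.active_at(1000)` returns the motion, the caption, and the background music that overlap at the one-second mark, which is the kind of lookup a renderer or an editing screen would need.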

The schedule information automatic generation means 15 has a function of automatically generating motion information (schedule information) that causes the avatar displayed in the moving image to perform predetermined actions, based on the schedule information acquired by the schedule information manual generation means 14. Specifically, the schedule information automatic generation means 15 includes: lip-sync means (a lip-sync function) that automatically moves the mouth of the avatar's face in time with the voice information consisting of the user's voice acquired by the recording means; blinking means (a blinking function) that automatically makes the eyes of the avatar's face blink at predetermined timings based on the voice information, the effect information, and the like; and synthetic voice setting means (a synthetic voice setting function) that changes the user's own voice acquired by the recording means into a specific voice tone (synthetic voice).
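
The lip-sync function can be illustrated with a very simple loudness-based rule. This is a sketch under stated assumptions, not the method used by the device: the text does not say how mouth movement is derived from the voice, and the function name `lip_sync_instructions`, the frame length, the threshold, and the two-state open/closed mouth model are all hypothetical.

```python
def lip_sync_instructions(samples, sample_rate, frame_ms=100, threshold=0.1):
    """Derive mouth instructions from recorded audio samples.

    samples: sequence of floats in [-1.0, 1.0] (mono audio).
    Returns [(time_sec, "mouth_open" | "mouth_closed"), ...], with one
    entry each time the mouth state changes.
    Assumption: the mouth is open whenever the mean absolute amplitude
    of a frame reaches the threshold.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    instructions = []
    state = "mouth_closed"  # the avatar starts with its mouth closed
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        level = sum(abs(s) for s in frame) / len(frame)
        new_state = "mouth_open" if level >= threshold else "mouth_closed"
        if new_state != state:
            instructions.append((start / sample_rate, new_state))
            state = new_state
    return instructions


# 0.1 s of silence, 0.1 s of sound, 0.1 s of silence at 100 samples/s.
audio = [0.0] * 10 + [0.5] * 10 + [0.0] * 10
mouth_track = lip_sync_instructions(audio, sample_rate=100)
```

The resulting list is itself a small piece of schedule information (timed action instructions), so it could be merged with the manually entered entries before rendering.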

The moving image data generation means 16 has a function of generating a moving image in which the avatar sings or delivers a message, by synthesizing (rendering) all of the background information, basic information, voice information, schedule information, and other settings made for creating a moving image including the avatar, and of saving the result in the storage unit 17.

The storage unit 17 has a function of storing the information required for the processing of each means (each unit) of the moving image creation processing device 10, and of storing the various information produced by that processing.

Next, the flow of the moving image creation processing in the moving image creation processing device 10 corresponding to the embodiment of the present invention is described with reference to FIG. 3. FIG. 3 is a flowchart showing an example of moving image creation processing corresponding to at least one embodiment of the present invention. In the illustrated example, when execution of a moving image creation project is started, the processing proceeds to step S101.

In step S101, the background information acquisition means 11 acquires the image data or moving image data to be used as the background of the moving image (executes the background information acquisition process), and the processing then proceeds to step S102. In step S102, the basic information acquisition means 12 acquires the basic information about the avatar displayed in the moving image (executes the basic information acquisition process), and the processing then proceeds to step S103.

In step S103, the schedule information manual generation means 14 and the schedule information automatic generation means 15 generate the schedule information (execute the schedule information generation process), and the processing then proceeds to step S104. In step S104, the various information set by the background information acquisition means 11, the basic information acquisition means 12, and the schedule information generation means 14 and 15 is synthesized (rendered) to generate a message moving image featuring the avatar (the moving image generation process is executed), after which the processing ends.

The schedule information generation process comprises a schedule information manual generation process, which provides an environment for editing schedule information by manual operation, and a schedule information automatic generation process, which automatically generates other schedule information based on the schedule information that has been input. The processes that make up the above flow may be performed in any order, as long as no inconsistency arises in the processing.
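
The flow of steps S101 to S104 can be sketched as a small pipeline. The code below is an illustrative outline under assumptions: the function names, the `(time_ms, instruction)` tuple format, and the stub renderer that returns a dictionary are all hypothetical stand-ins, and no real rendering takes place.

```python
def generate_schedule(manual_entries, auto_generators):
    """S103: combine manually entered schedule information with entries
    that automatic generators derive from it."""
    schedule = list(manual_entries)
    for generate in auto_generators:
        schedule.extend(generate(manual_entries))
    # Stable sort: entries at the same time keep manual-before-automatic order.
    return sorted(schedule, key=lambda entry: entry[0])


def create_moving_image(get_background, get_basic_info,
                        manual_entries, auto_generators, render):
    background = get_background()                                   # S101
    basic_info = get_basic_info()                                   # S102
    schedule = generate_schedule(manual_entries, auto_generators)   # S103
    return render(background, basic_info, schedule)                 # S104


# Hypothetical stand-ins for the real means of the device.
def auto_lip_sync(entries):
    # Open the mouth whenever a voice entry starts.
    return [(t, "mouth_open") for t, name in entries if name.startswith("voice:")]


def stub_render(background, basic_info, schedule):
    return {"background": background, "avatar": basic_info, "schedule": schedule}


movie = create_moving_image(
    get_background=lambda: "beach.png",
    get_basic_info=lambda: {"hairstyle": "hair_02", "color": "blue"},
    manual_entries=[(0, "wave_hand"), (500, "voice:hello")],
    auto_generators=[auto_lip_sync],
    render=stub_render,
)
```

Passing each step in as a callable reflects the remark that the steps may be reordered where no inconsistency arises: S101 and S102 do not depend on each other, whereas S104 needs the results of all the earlier steps.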

Next, examples of how the information needed to create a moving image is acquired are described with reference to FIGS. 4 to 6. FIG. 4 is a diagram showing an example of a setting operation screen for acquiring basic information. The basic information acquisition means 12 provides the user's terminal device (touch panel) 30₁ to 30ₙ with an operating environment for setting basic information about the avatar (hereinafter, the basic information input screen 20; see FIG. 4), and is configured so that the process of acquiring the basic information about the avatar can be executed through the basic information input screen 20.

In the illustrated example, the basic information input screen 20 displayed on the user's touch panel simultaneously shows: an avatar display unit 21 that displays the avatar generated from the basic information that has been set; an element switching operation unit 22 for switching among the components (basic information) of the avatar whose appearance can be changed; a design selection operation unit 23 in which the designs (basic information) available for the selected component are displayed side by side; and a save operation unit 24 for saving the basic information of the selected avatar (see FIG. 4).

The element switching operation unit 22 consists of a band-shaped display area extending in the left-right direction and tab-shaped switching operation units 22A, 22B, and 22C arranged side by side along it. In the illustrated example, a switching operation unit 22A for hairstyle, a switching operation unit 22B for costume, and a switching operation unit 22C for the avatar's color are arranged side by side in the left-right direction in the display area. When the user selects any one of the switching operation units 22A, 22B, and 22C by a tap operation or the like, information on the selected switching operation unit is displayed in the design selection operation unit 23 (see FIG. 4).

Specifically, when the switching operation unit 22A for hairstyle is selected, multiple pieces of basic information (designs) on hairstyles are displayed side by side in the design selection operation unit 23; when the switching operation unit 22B for costume is selected, multiple pieces of basic information (designs) on costumes are displayed side by side; and when the switching operation unit 22C for the avatar's color is selected, multiple pieces of basic information (color information) on the avatar's main color are displayed side by side.

The element switching operation unit 22 may also be configured so that sliding (flicking) the band-shaped display area in the left-right direction scrolls the switching operation units shown in it. This allows more kinds of switching operation units to be placed in the display area. With this configuration, the basic information (element switching operation units) for which designs can be selected may be made more fine-grained: in addition to the above examples, it may include the avatar's species and gender, as well as the facial outline, eyes, nose, mouth, physique, and so on. The kinds of selectable switching operation units (basic information) are not limited to these.

In the design change operation unit 23, window-shaped design display units 25 are arranged side by side, each showing a design that can be chosen for the avatar component selected with the element switching operation unit 22. The design display unit 25A selected by the user's touch or similar operation has its outer frame highlighted (thickened or made to blink, for example) so that the user can see at a glance which design display unit is selected (see FIG. 4).

The design display units 25 may include design display units (designs) 25B that are made unselectable by graying them out, marking them with a lock indication 26, or the like. These may be configured to be unlocked and made usable on condition that the server device recognizes that a predetermined fee has been paid (see FIG. 4).

The avatar display unit 21 displays an avatar that reflects all of the designs (basic information) selected for each avatar component with the element switching operation unit 22 and the design change operation unit 23. When the selection in the design display units 25 is changed, the design of the changed part is reflected immediately in the displayed avatar, so the user can efficiently create an avatar to their liking.

With the above configuration, the basic information acquisition means 12 acquires the basic information about the avatar, which determines the appearance of the avatar shown in the moving image to be created.

Next, an example of how the schedule information manual generation means generates schedule information is described with reference to FIG. 5. FIG. 5 shows an example of a setting operation screen for entering schedule information manually. The schedule information manual generation means 14 provides the user's terminal device (touch panel) 301 ... 30n with an operating environment for setting the schedule information (hereinafter, schedule information input screen 30) (see FIG. 5) and is configured to acquire schedule information through the schedule information input screen 30.

In the illustrated example, the schedule information input screen 30 on the user's touch panel simultaneously displays: a horizontal strip-shaped timeline display unit 32; a vertical display reference line 33 whose display position is fixed at the horizontal center of the timeline display unit 32; an edit screen display unit 31 that shows the image at the playback position designated by the display reference line 33 and the timeline display unit 32; and schedule editing tracks 34 arranged on the display reference line 33 (see FIG. 5).

In the timeline display unit 32, playback times of the moving image are written in order from left to right along the strip (every 10 seconds in the illustrated example). Sliding (flicking) the timeline display unit 32 or the displayed portion of the schedule editing tracks 34 left or right scrolls the playback times shown in the timeline display unit 32.

At this time, the edit screen display unit 31 shows the state of the moving image at the playback time displayed at the horizontal center of the timeline display unit 32 (00:10 in the illustrated example), in other words at the playback time on the timeline display unit 32 indicated by the display reference line 33.
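
The relationship between the scrolled timeline and the fixed display reference line 33 can be illustrated with a minimal Python sketch. The specification does not give a formula; the names `scroll_offset_px` and `pixels_per_second` and the clamping to the clip length are assumptions made for illustration only.

```python
def time_at_reference_line(scroll_offset_px: float,
                           pixels_per_second: float,
                           duration_s: float) -> float:
    """Return the playback time (seconds) under the fixed reference line.

    Assumption: scroll_offset_px is how far the timeline strip has been
    slid to the left, with 0 meaning time 0 sits on the reference line.
    """
    if pixels_per_second <= 0:
        raise ValueError("pixels_per_second must be positive")
    t = scroll_offset_px / pixels_per_second
    # Keep the result inside the clip so the preview never shows an invalid time.
    return min(max(t, 0.0), duration_s)


def format_mmss(t: float) -> str:
    """Format seconds as MM:SS, the style of label shown on the timeline."""
    total = int(t)
    return f"{total // 60:02d}:{total % 60:02d}"
```

Under these assumptions, a strip drawn at 30 pixels per second and slid 300 pixels to the left places 10 seconds on the reference line, which corresponds to the 00:10 position of the illustrated example.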

In the illustrated example, the schedule editing tracks 34 are displayed stacked vertically and comprise: a telop editing track 36 for editing captions (telops) displayed in the moving image; a background editing track 37 for editing the image data or moving image data used as the background of the moving image; an avatar editing track 38 for editing operation instructions and the like that cause the avatar to perform predetermined operations; an effect editing track 39 for editing staging effects in the moving image; a voice editing track (voice information acquisition means) 41 for acquiring and editing the voice information attached to the moving image; and a music editing track 42 for adding background music (BGM) to the moving image. In each editing track, an edit block 40 indicating that schedule information has been entered is displayed over the playback range in which schedule information is set (see FIG. 5).

The schedule editing tracks 34 slide left and right together with the playback-time labels of the timeline display unit 32 in response to the user's slide operation or the like, and the display reference line 33, fixed at the horizontal center, passes through the middle of each editing track. Each editing track is thus configured so that schedule information can be added or edited at the playback position indicated by the timeline display unit 32 and the display reference line 33.

Specifically, in the example shown in FIG. 5, the telop editing track 36 has no edit block 40 at 10 seconds from the start of playback (no schedule information is set there), so an add button 43 for adding telop schedule information is placed at the right end of the telop editing track 36 (see FIG. 5). Likewise, the background editing track 37 has an edit block 40 at 10 seconds from the start of playback (schedule information is set there), so an edit button 44 for editing (changing, deleting, etc.) the background schedule information is placed at the right end of the background editing track (see FIG. 5).
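
The choice between the add button 43 and the edit button 44 can be sketched as a lookup of whether an edit block covers the current playback position. This is only an illustrative sketch: the `EditBlock` class, the half-open interval convention, and the function names are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EditBlock:
    """A range on one editing track where schedule information is set."""
    start_s: float
    end_s: float


def block_at(blocks: List[EditBlock], t: float) -> Optional[EditBlock]:
    """Return the block covering time t, treating ranges as [start, end)."""
    for block in blocks:
        if block.start_s <= t < block.end_s:
            return block
    return None


def button_for_track(blocks: List[EditBlock], t: float) -> str:
    """Show 'edit' when a block exists at the reference line, else 'add'."""
    return "edit" if block_at(blocks, t) is not None else "add"
```

For the situation of FIG. 5, a telop track with no blocks yields `"add"` at 10 seconds, while a background track whose block spans 0 to 20 seconds yields `"edit"`.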

Each edit block 40 has, at its end, an editing range operation unit 40a that lets the horizontal length of the edit block be changed by a slide operation. By grabbing the editing range operation unit 40a with a long press, tap, click, or the like and sliding it (a drag operation), the user can easily change the range that the edit block 40 covers on its editing track. Because no numerical input is needed to specify the range of each edit block, editing becomes easier and smoother.
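
A drag on the editing range operation unit 40a could be translated into a new block range roughly as follows. This is a sketch under stated assumptions: the pixel-to-seconds scale, the minimum block length `MIN_LENGTH_S`, and the clamping to the clip duration are hypothetical details that the specification does not describe.

```python
from dataclasses import dataclass

MIN_LENGTH_S = 0.1  # hypothetical lower bound so a block cannot collapse


@dataclass(frozen=True)
class EditBlock:
    start_s: float
    end_s: float


def drag_handle(block: EditBlock, edge: str, delta_px: float,
                pixels_per_second: float, duration_s: float) -> EditBlock:
    """Return a new block after dragging its 'start' or 'end' handle.

    delta_px is positive for a drag to the right and negative to the left.
    """
    delta_s = delta_px / pixels_per_second
    if edge == "start":
        new_start = min(max(block.start_s + delta_s, 0.0),
                        block.end_s - MIN_LENGTH_S)
        return EditBlock(new_start, block.end_s)
    if edge == "end":
        new_end = max(min(block.end_s + delta_s, duration_s),
                      block.start_s + MIN_LENGTH_S)
        return EditBlock(block.start_s, new_end)
    raise ValueError("edge must be 'start' or 'end'")
```

Returning a new immutable block rather than mutating the old one keeps an undo history simple, which is a design choice of this sketch rather than something the source requires.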

In the voice editing track 41, the voice uttered by the user can be recorded by a recording means provided in the user's terminal device or the like and acquired as the voice information. In other words, the recorded user's voice can be used as a message, a song, or the like spoken by the avatar in the moving image. The acquired voice information can also be converted by the synthetic voice generation means into a synthetic voice so that the speaker can no longer be identified.

When the voice information to be played in the moving image is acquired through the voice editing track 41, the schedule information automatic generation means 15 executes a lip-sync process that automatically generates motion information for the avatar's mouth based on the acquired voice information. The lip-sync process may, for example, prepare in advance the avatar's mouth shapes for pronouncing each vowel (a, i, u, e, o), extract from the acquired voice information the timing at which each vowel is uttered, and combine the two so that the avatar's mouth takes the appropriate shape at the appropriate time. The method of moving the avatar's mouth in the lip-sync process is not limited to this.
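
The vowel-based lip-sync example above can be sketched as a mapping from timed vowel events to mouth-shape keyframes. The sketch assumes the vowel timings have already been extracted from the audio by some other step (that step is not shown and is not specified in the source); the shape names and the `(start, end, vowel)` event format are hypothetical.

```python
from typing import List, Tuple

# Hypothetical identifiers for mouth shapes prepared in advance, one per vowel.
MOUTH_SHAPES = {"a": "mouth_a", "i": "mouth_i", "u": "mouth_u",
                "e": "mouth_e", "o": "mouth_o"}
CLOSED = "mouth_closed"


def lip_sync_keyframes(
    vowel_events: List[Tuple[float, float, str]]
) -> List[Tuple[float, str]]:
    """Turn (start_s, end_s, vowel) events into (time_s, shape) keyframes.

    The mouth opens to the vowel's shape at start_s and closes at end_s.
    When the next vowel begins exactly where the previous one ended, the
    intermediate 'closed' keyframe is dropped so the mouth moves directly
    from one vowel shape to the next.
    """
    keyframes: List[Tuple[float, str]] = []
    for start, end, vowel in sorted(vowel_events):
        shape = MOUTH_SHAPES.get(vowel, CLOSED)
        if keyframes and keyframes[-1] == (start, CLOSED):
            keyframes.pop()
        keyframes.append((start, shape))
        keyframes.append((end, CLOSED))
    return keyframes
```

The resulting keyframe list is one possible form of the mouth motion information that would be stored as part of the schedule information.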

When the add button 43 or the edit button 44 of the avatar editing track 38 is pressed, an avatar display editing screen 50 for editing schedule information about how the avatar is displayed appears on the user's terminal device (touch panel) (see FIG. 6). This is described in detail below.

FIG. 6 shows an example of the avatar display editing screen. As illustrated, the avatar display editing screen 50 displays an edit screen display unit 51, which shows the state of the moving image at the playback position being edited, and an avatar operation element switching operation unit 52, which presents the components of the avatar's operation as tabs arranged from left to right.

In the illustrated example, the avatar operation element switching display unit 52 includes a pose switching unit 52A for setting information (schedule information) about the avatar's pose, a facial expression switching unit 52B for setting information (schedule information) about the avatar's facial expression, and an avatar display position switching unit 52C for setting information (schedule information) about the avatar's display position in the moving image.

When the tab of the avatar display position switching unit 52C is selected, information about the avatar's display position within the edit block 40 being edited can be edited with a set of slide bars (slider controls). Specifically, as shown in FIG. 6, the slider controls are a position information setting slider 57 for the avatar's horizontal position in the moving image, a scale information setting slider 58 for the avatar's size in the moving image, and a rotation information setting slider 59 for the rotational position of the three-dimensionally rendered avatar, arranged one above another (see FIG. 6).

These slider controls let the avatar's position, size, and orientation in the created moving image be set freely, so the avatar can be moved more dynamically. For example, by having the avatar walk toward the back of the screen while its scale is reduced, the avatar can be made to appear to be in a three-dimensional space.
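
The "walking into the depth of the screen while shrinking" effect can be approximated by blending between two slider settings over time. The specification only says the sliders set position, scale, and rotation; the linear interpolation, the `AvatarTransform` class, and the parameter names below are assumptions used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvatarTransform:
    x: float             # horizontal position (slider 57)
    scale: float         # displayed size (slider 58)
    rotation_deg: float  # rotation of the 3D avatar (slider 59)


def lerp(a: float, b: float, u: float) -> float:
    """Linear interpolation between a and b for u in [0, 1]."""
    return a + (b - a) * u


def transform_at(start: AvatarTransform, end: AvatarTransform,
                 t0: float, t1: float, t: float) -> AvatarTransform:
    """Blend the slider values set at times t0 and t1 for a time t."""
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    u = min(max((t - t0) / (t1 - t0), 0.0), 1.0)
    return AvatarTransform(
        lerp(start.x, end.x, u),
        lerp(start.scale, end.scale, u),
        lerp(start.rotation_deg, end.rotation_deg, u),
    )
```

For instance, blending from scale 1.0 to scale 0.5 over ten seconds gives scale 0.75 at the five-second mark, so the avatar appears to recede gradually.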

Next, another embodiment of the moving image creation processing device is described with reference to FIGS. 7 to 9, focusing on the differences from the example above. In this embodiment, the voice information acquisition means 13 includes, in place of the recording means described above, a music information acquisition means 71 that acquires data for the song the avatar sings in the moving image (music information) either from the storage unit 17 of the user's terminal device 301 ... 30n or from another computer via a network. In place of the synthetic voice generation means described above, a voice information acquisition means 72 is provided that acquires the type of voice (synthetic voice) with which the avatar sings the song.

The music information acquired by the music information acquisition means 71 includes the voice information for the avatar singing the song, and the lyrics information for the song can be acquired together with it.

Accordingly, the schedule information automatic generation means 15 can automatically set motion information for the avatar's mouth so that the mouth moves in the moving image in accordance with the music information acquired by the music information acquisition means 71, specifically the lyrics information it contains. In other words, once the song to be sung by the avatar is selected through the music information acquisition means 71, there is no need to enter mouth-motion information separately. The avatar in the moving image can therefore be made to move in a more human-like way with little effort.

FIG. 7 is a flowchart showing an example of moving image creation processing corresponding to at least one form of this other embodiment. In the illustrated example, when execution of a moving image creation project is started, the processing proceeds to step S201.

In step S201, the voice information acquisition means 13 (music information acquisition means 71 and voice information acquisition means 72) acquires the music information and the voice information for the song the avatar is to sing in the moving image, and the processing then proceeds to step S202.

In step S202, the background information acquisition means acquires the image data or moving image data to be used as the background of the moving image to be created (background information acquisition process), and the processing then proceeds to step S203. In step S203, the basic information acquisition means acquires the basic information about the avatar to be displayed in the moving image to be created (basic information acquisition process), and the processing then proceeds to step S204.

In step S204, the schedule information generation means generates the schedule information (schedule information generation process), and the processing then proceeds to step S205. In step S205, a message moving image featuring the avatar is generated (moving image generation process) based on the information set by the background acquisition means, the basic information acquisition means, and the schedule information generation means, after which the processing ends.
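
The order of steps S201 to S205 can be summarized as a small pipeline. This is only a sketch of the control flow: the function name, the callable parameters, and the decision to pass the acquired audio into the schedule and rendering steps are assumptions, and each callable stands in for the corresponding means described in the text.

```python
from typing import Any, Callable


def create_moving_image(
    acquire_music_and_voice: Callable[[], Any],
    acquire_background: Callable[[], Any],
    acquire_avatar_basics: Callable[[], Any],
    generate_schedule: Callable[[Any, Any], Any],
    render: Callable[[Any, Any, Any, Any], Any],
) -> Any:
    """Run the five steps of the flowchart in order and return the result."""
    audio = acquire_music_and_voice()            # S201: song and voice type
    background = acquire_background()            # S202: background image/video
    basics = acquire_avatar_basics()             # S203: avatar appearance
    schedule = generate_schedule(audio, basics)  # S204: schedule information
    return render(background, basics, schedule, audio)  # S205: final video
```

Passing the steps in as callables keeps the ordering visible while leaving each acquisition or generation method open, which matches the way the text describes them only as means.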

That is, in the voice information acquisition means 13 of this other embodiment, the music information acquisition means 71 acquires the music information before the schedule information is edited with the schedule information manual generation means 14, so that the voice information the avatar sings in the moving image and the lyrics information the avatar utters are acquired together. With this configuration, acquiring the avatar's basic information and the music information for the song at the outset greatly reduces the effort of entering the schedule information.

FIG. 8 shows an example of an operation screen for acquiring music information with the music information acquisition means. The voice information acquisition means 13 provides the user's terminal device (touch panel) with an operating environment for entering the music information and the voice information (hereinafter, music information setting screen 70) (see FIG. 8) and is configured to acquire the music information and the voice information through the music information setting screen 70.

In the illustrated example, the music information setting screen 70 has radio buttons for selecting one of several listed songs and radio buttons for selecting one of several listed voice types, arranged one above the other. The controls are not limited to radio buttons, provided that one item is chosen from multiple options.

The voice information acquisition means 13 may also be configured so that the selectable kinds of music information and voice information (synthetic voices) can be increased on condition that the server device 20 recognizes that a predetermined fee has been paid.

FIG. 9 shows an example of a setting operation screen for setting schedule information in this other embodiment. In the illustrated example, as in the example above, the schedule information input screen 30 on the user's touch panel simultaneously displays the edit screen display unit 31, the timeline display unit 32, the display reference line 33, and the schedule editing tracks 34 (see FIG. 9).

The schedule editing tracks 34 are displayed stacked vertically and comprise the telop editing track 36, a lyrics editing track 81 for editing the lyrics information acquired from the music information, a dance editing track 82 for specifying dance motions for the avatar, the avatar editing track 38, the effect editing track 39, and the background editing track 37.

The lyrics editing track 81 is pre-filled with the lyrics information contained in the music information acquired by the music information acquisition means, and the user can edit part of the lyrics to create a parody version of the song.

Accordingly, when the lyrics information is edited in the lyrics editing track 81, the schedule information automatic generation means 15 executes a synthetic voice generation process that automatically generates (edits) a synthetic voice so that the voice information contained in the acquired music information corresponds to the edited lyrics information.

Furthermore, when the synthetic voice making up the voice information has been edited as a result of the lyrics information being edited, the schedule information automatic generation means 15 can execute a lip-sync process that automatically corrects the avatar's mouth motion in the moving image to match the edited parody lyrics.

That is, when the lyrics information is edited in the lyrics editing track 81, the schedule information automatic generation means 15 automatically edits part of the schedule information, specifically the voice information the avatar sings in the moving image and the motion information for the avatar's mouth, according to the edited lyrics. A moving image in which the avatar sings the user's chosen song with the user's own parody lyrics can thus be generated more smoothly and with higher quality.
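
How mouth motion might be re-derived after a lyrics edit can be shown with a deliberately simplified sketch. It assumes the lyrics are available in romanized form and that vowels are spread evenly over the block's time range; neither assumption comes from the specification, which leaves the actual method open, and the function names are hypothetical.

```python
from typing import List, Tuple

VOWELS = "aiueo"


def vowels_in(romaji: str) -> List[str]:
    """Extract the vowel sequence from romanized lyrics."""
    return [c for c in romaji.lower() if c in VOWELS]


def resync_mouth(lyrics_romaji: str, start_s: float,
                 end_s: float) -> List[Tuple[float, str]]:
    """Recompute (time_s, vowel) mouth cues for an edited lyric line.

    Assumption: vowels are distributed evenly across [start_s, end_s).
    A real system would take timings from the regenerated synthetic voice.
    """
    vowels = vowels_in(lyrics_romaji)
    if not vowels:
        return []
    step = (end_s - start_s) / len(vowels)
    return [(start_s + i * step, v) for i, v in enumerate(vowels)]
```

Calling `resync_mouth` again whenever a line in the lyrics editing track changes replaces the old cues, so the mouth motion follows the edited lyrics without separate manual input.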

The dance editing track 82 takes over the role that the avatar editing track 38 had in specifying the avatar's pose: instead of poses, it allows dance motions matched to the song to be specified.

12 Basic information acquisition means
13 Voice information acquisition means
14 Schedule information manual generation means (schedule information generation means)
15 Schedule information automatic generation means (schedule information generation means)
16 Moving image data generation means
17 Storage unit
40 Communication network
71 Music information acquisition means
72 Voice information acquisition means (voice information acquisition means)

To solve the above problem, the moving image creation device of the present invention is a moving image creation device that creates a moving image including an avatar, comprising: basic information acquisition means for acquiring basic information about the avatar from its own storage unit or from another computer via a network; schedule information generation means for generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time; voice information acquisition means for acquiring voice information from its own recording means or its own storage unit, or from another computer via a network; and moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data containing video with audio in which the avatar performs a plurality of operations with the passage of time. As the schedule information generation means, schedule information automatic generation means is provided that automatically generates at least part of the schedule information so that the avatar's mouth moves in synchronization with the voice. Music information acquisition means is provided that acquires music information used for creating the moving image from its own storage unit or from another computer via a network. The voice information acquisition means is configured to acquire the voice information from the music information acquired by the music information acquisition means, and the schedule information automatic generation means is configured to automatically set motion information for the avatar's mouth based on the music information acquired by the music information acquisition means. The schedule information generated by the schedule information generation means includes information about a pose, or information about a dance motion, of the avatar singing the song acquired by the music information acquisition means.

The moving image creation system according to one embodiment of the present invention comprises: a storage unit; basic information acquisition means for acquiring basic information about the avatar from the storage unit; schedule information generation means for generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time; voice information acquisition means for acquiring voice information from its own recording means or the storage unit; and moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data containing video with audio in which the avatar performs a plurality of operations with the passage of time. As the schedule information generation means, schedule information automatic generation means is provided that automatically generates at least part of the schedule information so that the avatar's mouth moves in synchronization with the voice. Music information acquisition means is provided that acquires music information used for creating the moving image from its own storage unit or from another computer via a network. The voice information acquisition means is configured to acquire the voice information from the music information acquired by the music information acquisition means, and the schedule information automatic generation means is configured to automatically set motion information for the avatar's mouth based on the music information acquired by the music information acquisition means. The schedule information generated by the schedule information generation means includes information about a pose, or information about a dance motion, of the avatar singing the song acquired by the music information acquisition means.

The moving image creation program according to one embodiment of the present invention causes a computer to execute: a basic information acquisition process of acquiring basic information about the avatar from its own storage unit or from another computer via a network; a schedule information generation process of generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time; a voice information acquisition process of acquiring voice information from its own recording means or its own storage unit, or from another computer via a network; a moving image data generation process of generating, based on the basic information, the schedule information, and the voice information, moving image data containing video with audio in which the avatar performs a plurality of operations with the passage of time; as the schedule information generation process, a schedule information automatic generation process of automatically generating at least part of the schedule information so that the avatar's mouth moves in synchronization with the voice; a music information acquisition process of acquiring music information used for creating the moving image from its own storage unit or from another computer via a network; as the voice information acquisition process, a process of acquiring the voice information from the music information acquired by the music information acquisition process; as the schedule information automatic generation process, a process of automatically setting motion information for the avatar's mouth based on the music information acquired by the music information acquisition process; and, as the schedule information generation means, a process of generating information about a pose, or information about a dance motion, of the avatar singing the song acquired by the music information acquisition means.

Hereinafter, a moving image creation processing device according to an embodiment of the present invention will be described. FIG. 1 is a block diagram showing an example of the configuration of the moving image creation processing device of the present invention. As shown in FIG. 1, the moving image creation processing device (moving image creation device) 10 includes background information acquisition means 11, basic information acquisition means 12, voice information acquisition means 13, schedule information manual generation means 14, schedule information automatic generation means 15, moving image data generation means 16, and a storage unit 17.

The schedule information automatic generation means 15 has a function of automatically generating, based on the schedule information acquired by the schedule information manual generation means 14, operation information (schedule information) that causes the avatar displayed in the moving image to perform predetermined operations. Specifically, the schedule information automatic generation means 15 includes: lip-sync means (function) that automatically moves the mouth of the avatar's face in time with the voice information consisting of the user's voice acquired by the recording means; blinking means (function) that automatically makes the eyes of the avatar's face blink at predetermined timings based on the voice information, effect information, and the like; and synthetic voice setting means (function) that changes the user's own voice acquired by the recording means into a specific voice (synthetic voice).
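As an illustration only, the lip-sync and blinking functions described above could be sketched as follows. The specification does not state how mouth timing is derived from the voice information; the amplitude threshold, the fixed blink interval, and all names (`Instruction`, `lip_sync`, `blinks`) are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    time: float   # seconds from the start of the moving image
    target: str   # "mouth" or "eyes"
    action: str   # "open", "close", or "blink"


def lip_sync(samples, sample_rate, frame_sec=0.1, threshold=0.1):
    """Emit a mouth open/close instruction whenever loudness crosses the threshold.

    Assumption: samples are floats in [-1.0, 1.0]; loudness is the peak
    absolute value within each fixed-length frame.
    """
    frame_len = max(1, round(sample_rate * frame_sec))
    schedule = []
    mouth_open = False
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        loud = max(abs(s) for s in frame) >= threshold
        if loud != mouth_open:
            mouth_open = loud
            action = "open" if loud else "close"
            schedule.append(Instruction(start / sample_rate, "mouth", action))
    return schedule


def blinks(duration_sec, interval_sec=3.0):
    """Emit a blink at every multiple of interval_sec up to the duration (assumed rule)."""
    count = int(duration_sec // interval_sec)
    return [Instruction(i * interval_sec, "eyes", "blink") for i in range(1, count + 1)]
```

The two lists can simply be concatenated and sorted by `time` to form the automatically generated part of the schedule information.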

In step S103, the schedule information manual generation means 14 and the schedule information automatic generation means 15 generate the schedule information (execute the schedule information generation process), and the process then proceeds to step S104. In step S104, the various information set by the background information acquisition means 11, the basic information acquisition means 12, and the schedule information generation means 14 and 15 is composited (rendering process) to generate a message video featuring the avatar (execute the moving image generation process), after which the process ends.

The schedule information generation process comprises a schedule information manual generation process, which provides an environment for manually editing schedule information, and a schedule information automatic generation process, which automatically generates other schedule information based on the schedule information that has been input. Incidentally, the various processes making up the above processing flow may be executed in any order, provided that no inconsistency arises in the processing content.
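The relationship between steps S103 and S104, and between the manual and automatic generation processes, might be sketched as below. This is a minimal sketch, not the device's actual implementation: the tuple layout `(time, track, action)`, the function names, and the rule used by `auto_blink` are all hypothetical, and the "rendering" step only bundles its inputs rather than producing video frames.

```python
def step_s103(manual_schedule, auto_generators):
    """Schedule generation: manual entries plus entries derived from them."""
    schedule = list(manual_schedule)
    for generate in auto_generators:
        schedule.extend(generate(manual_schedule))
    # Order all instructions by their time so they play back in sequence.
    return sorted(schedule, key=lambda entry: entry[0])


def step_s104(background, basic_info, schedule):
    """Stand-in for rendering: bundle everything a renderer would need."""
    return {"background": background, "avatar": basic_info, "timeline": schedule}


def auto_blink(manual_schedule):
    # Hypothetical rule: blink half a second after every manual entry.
    return [(t + 0.5, "eyes", "blink") for t, _, _ in manual_schedule]


manual = [(0.0, "avatar", "wave"), (2.0, "avatar", "bow")]
video = step_s104("beach.png", {"hair": "short"}, step_s103(manual, [auto_blink]))
```

Passing the automatic generators as a list keeps the manual input as the single source from which the other schedule entries are derived.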

The element switching operation unit 22 consists of a strip-shaped display area extending in the left-right direction and tab-shaped switching operation units 22A, 22B, and 22C arranged side by side along it. In the illustrated example, a switching operation unit 22A for hairstyles, a switching operation unit 22B for costumes, and a switching operation unit 22C for avatar-related colors are arranged side by side in the left-right direction in the display area. When the user selects any one of the switching operation units 22A, 22B, and 22C by a tap operation or the like, information relating to the selected switching operation unit 22 is displayed in the design selection operation unit 23 (see FIG. 4).

Incidentally, the design display units 25 may include design display units (designs) 25B that are made unselectable by a grayed-out display, a locked display 26, or the like, and may be configured so that a previously unselectable design display unit is unlocked and made usable on condition that the server device has recognized payment of a predetermined charge (see FIG. 4).
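The locked-design behavior described above can be illustrated with a small sketch. The `Design` class, the `locked` flag, and the boolean `server_confirmed_payment` are assumptions for illustration; the specification does not describe how the server device communicates payment recognition.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    locked: bool = False  # True corresponds to a grayed-out or locked display


def selectable(designs):
    """Names of the designs the user may currently choose."""
    return [d.name for d in designs if not d.locked]


def unlock_if_paid(design, server_confirmed_payment):
    """Unlock only when the server has recognized the payment."""
    if server_confirmed_payment:
        design.locked = False
    return design
```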

The schedule editing tracks 34 are configured to slide left and right as a unit together with the playback-time notation of the timeline display unit 32 in response to a slide operation or the like by the user, and the display reference line 33, which is fixed at the horizontal center, is positioned at the left-right middle of each editing track. Accordingly, each editing track is configured so that schedule information can be added or edited at the playback position indicated by the timeline display unit 32 and the display reference line 33.
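The geometry of a sliding timeline with a fixed, centered reference line can be sketched as follows. The linear pixels-per-second scale and the names `playback_time` and `screen_x` are assumptions for this sketch; the specification only states that the tracks slide together and that the reference line stays at the horizontal center.

```python
def playback_time(scroll_px, px_per_sec):
    """Time under the fixed reference line after sliding the tracks by scroll_px."""
    return max(0.0, scroll_px / px_per_sec)


def screen_x(event_time, current_time, view_width, px_per_sec):
    """Horizontal screen position of an event; the reference line sits at view_width / 2."""
    return view_width / 2 + (event_time - current_time) * px_per_sec
```

For example, with a 400-pixel-wide view at 50 pixels per second, sliding the tracks by 150 pixels places the 3-second mark under the reference line at x = 200, and an event at 5 seconds appears 100 pixels to its right.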

In order to solve the above problems, the moving image creation device of the present invention is a moving image creation device that creates a moving image including an avatar, comprising: basic information acquisition means for acquiring basic information about the avatar from its own storage unit or from another computer via a network; schedule information generation means for generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time; voice information acquisition means for acquiring voice information from its own storage unit or from another computer via a network; and moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data including video with sound in which the avatar performs a plurality of operations with the passage of time; wherein, as the schedule information generation means, schedule information automatic generation means is provided that automatically generates at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice; music information acquisition means is provided that acquires music information used for moving image creation from its own storage unit or from another computer via a network; the voice information acquisition means is configured to acquire voice information from the music information acquired by the music information acquisition means; the schedule information automatic generation means is configured to automatically set motion information of the avatar's mouth based on the music information acquired by the music information acquisition means; the schedule information generated by the schedule information generation means includes information about a pose of the avatar singing the song acquired by the music information acquisition means, or information about a dance motion; the music information includes lyrics information and voice information of the song; lyrics information changing means is provided that edits the lyrics information of the song selected by the voice information acquisition means; and the schedule information automatic generation means is configured to change the movement of the avatar's mouth to match the changed lyrics when the lyrics information has been edited by the lyrics information changing means.
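To illustrate how mouth movement might be regenerated when the lyrics are edited, the following sketch derives one mouth shape per vowel from romanized lyrics and spreads the shapes evenly over a phrase. This is purely an assumption for illustration: the specification does not say how lyrics are mapped to mouth motion, and the vowel-based rule, the even spacing, and the name `mouth_schedule` are hypothetical.

```python
VOWELS = "aiueo"


def mouth_schedule(lyrics, start, end):
    """Spread one mouth shape per vowel evenly over the interval [start, end)."""
    vowels = [c for c in lyrics.lower() if c in VOWELS]
    if not vowels:
        return []
    step = (end - start) / len(vowels)
    return [(start + i * step, v) for i, v in enumerate(vowels)]


original = mouth_schedule("sakura", 0.0, 3.0)
# After the user edits the lyrics, the mouth schedule is simply regenerated.
edited = mouth_schedule("sakura saku", 0.0, 3.0)
```

Regenerating the whole phrase on each edit keeps the mouth schedule consistent with the lyrics without tracking which characters changed.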

Further, a moving image creation system according to an embodiment of the present invention comprises: a storage unit; basic information acquisition means for acquiring basic information about the avatar from the storage unit; schedule information generation means for generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time; voice information acquisition means for acquiring voice information from the storage unit; and moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data including video with sound in which the avatar performs a plurality of operations with the passage of time; wherein, as the schedule information generation means, schedule information automatic generation means is provided that automatically generates at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice; music information acquisition means is provided that acquires music information used for moving image creation from its own storage unit or from another computer via a network; the voice information acquisition means is configured to acquire voice information from the music information acquired by the music information acquisition means; the schedule information automatic generation means is configured to automatically set motion information of the avatar's mouth based on the music information acquired by the music information acquisition means; the schedule information generated by the schedule information generation means includes information about a pose of the avatar singing the song acquired by the music information acquisition means, or information about a dance motion; the music information includes lyrics information and voice information of the song; lyrics information changing means is provided that edits the lyrics information of the song selected by the voice information acquisition means; and the schedule information automatic generation means is configured to change the movement of the avatar's mouth to match the changed lyrics when the lyrics information has been edited by the lyrics information changing means.

Further, a moving image creation program according to an embodiment of the present invention causes a computer to execute: a basic information acquisition process of acquiring basic information about the avatar from its own storage unit or from another computer via a network; a schedule information generation process of generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time; a voice information acquisition process of acquiring voice information from its own storage unit or from another computer via a network; a moving image data generation process of generating, based on the basic information, the schedule information, and the voice information, moving image data including video with sound in which the avatar performs a plurality of operations with the passage of time; as the schedule information generation process, a schedule information automatic generation process of automatically generating at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice; a music information acquisition process of acquiring music information used for moving image creation from its own storage unit or from another computer via a network; as the voice information acquisition process, a process of acquiring voice information from the music information acquired by the music information acquisition process; as the schedule information automatic generation process, a process of automatically setting motion information of the avatar's mouth based on the music information acquired by the music information acquisition process; as the schedule information generation means, a process of generating information about a pose of the avatar singing the song acquired by the music information acquisition means, or information about a dance motion; a process of acquiring lyrics information of the song from the music information; a lyrics information changing process of editing the lyrics information of the song selected by the voice information acquisition means; and, as the schedule information automatic generation process, a process of changing the movement of the avatar's mouth to match the changed lyrics when the lyrics information has been edited by the lyrics information changing means.

In the illustrated example, the schedule editing tracks 34 are configured to display, arranged vertically: a telop editing track 36 for editing captions (telops) displayed in the moving image; a background editing track 37 for editing the image data or video data that forms the background of the moving image; an avatar editing track 38 for editing operation instructions and the like that cause the avatar to perform predetermined operations; an effect editing track 39 for editing effects in the moving image; a voice editing track (voice information acquisition means) 41 for acquiring and editing voice information attached to the moving image; and a music editing track 42 for adding background music (BGM) to the moving image. Each editing track 34 is also configured so that an edit block 40, indicating that schedule information has already been input, is displayed over any playback range in which schedule information has been input (see FIG. 5).
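One possible data model for the editing tracks and edit blocks described above is sketched here. The class names, the half-open time ranges, and the rule that blocks on one track may not overlap are assumptions for this sketch and are not stated in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class EditBlock:
    start: float   # seconds, inclusive
    end: float     # seconds, exclusive
    payload: str   # e.g. caption text, motion name, or audio file name


@dataclass
class Track:
    name: str
    blocks: list = field(default_factory=list)

    def add(self, start, end, payload):
        """Add an edit block, rejecting ranges that overlap an existing one (assumed rule)."""
        if any(start < b.end and b.start < end for b in self.blocks):
            raise ValueError("overlaps an existing edit block")
        self.blocks.append(EditBlock(start, end, payload))
        self.blocks.sort(key=lambda b: b.start)

    def block_at(self, t):
        """Return the edit block covering playback time t, or None."""
        for b in self.blocks:
            if b.start <= t < b.end:
                return b
        return None


# The six tracks named in the illustrated example.
tracks = [Track(n) for n in ("telop", "background", "avatar", "effect", "voice", "music")]
```

Calling `block_at` with the playback position under the reference line would tell an editor which tracks already hold schedule information at that point.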

Further, a moving image creation program according to an embodiment of the present invention causes a computer to execute: a basic information acquisition process of acquiring basic information about the avatar from its own storage unit or from another computer via a network; a schedule information generation process of generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time; a voice information acquisition process of acquiring voice information from its own storage unit or from another computer via a network; a moving image data generation process of generating, based on the basic information, the schedule information, and the voice information, moving image data including video with sound in which the avatar performs a plurality of operations with the passage of time; as the schedule information generation process, a schedule information automatic generation process of automatically generating at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice; a music information acquisition process of acquiring music information used for moving image creation from its own storage unit or from another computer via a network; as the voice information acquisition process, a process of acquiring voice information from the music information acquired by the music information acquisition process; as the schedule information automatic generation process, a process of automatically setting motion information of the avatar's mouth based on the music information acquired by the music information acquisition process; as the schedule information generation process, a process of generating information about a pose of the avatar singing the song acquired by the music information acquisition process, or information about a dance motion; a process of acquiring lyrics information of the song from the music information; a lyrics information changing process of editing the lyrics information of the song selected by the voice information acquisition process; and, as the schedule information automatic generation process, a process of changing the movement of the avatar's mouth to match the changed lyrics when the lyrics information has been edited by the lyrics information changing process.

Claims

A moving image creation device that creates a moving image including an avatar, comprising:
basic information acquisition means for acquiring basic information about the avatar from its own storage unit or from another computer via a network;
schedule information generation means for generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time;
voice information acquisition means for acquiring voice information from its own recording means or its own storage unit, or from another computer via a network; and
moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data including video with sound in which the avatar performs a plurality of operations with the passage of time,
wherein, as the schedule information generation means, schedule information automatic generation means is provided that automatically generates at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice.

The moving image creation device according to claim 1, further comprising recording means for recording a user's voice,
wherein the user's voice recorded by the recording means is used as the voice information acquired by the voice information acquisition means.

The moving image creation device according to claim 2, further comprising a synthetic voice generating means capable of changing the voice information of the user recorded by the recording means into synthetic voice.

The moving image creation device according to claim 1, further comprising music information acquisition means for acquiring music information used for moving image creation from its own storage unit or from another computer via a network,
wherein the voice information acquisition means is configured to acquire voice information from the music information acquired by the music information acquisition means.

The moving image creation device according to claim 4, wherein the music information includes lyrics information and voice information of the song,
lyrics information changing means for editing the lyrics information of the song selected by the voice information acquisition means is provided, and
the schedule information automatic generation means is configured to change the movement of the avatar's mouth to match the changed lyrics when the lyrics information has been edited by the lyrics information changing means.

The moving image creation device according to claim 5, wherein the voice information acquisition means is configured to be able to acquire voice information regarding the type of voice uttered by the avatar from its own storage unit or from another computer via a network.

The moving image creation device according to any one of claims 1 to 6, wherein the schedule information generated by the schedule information generation means includes position information in which a plurality of pieces of information regarding the left-right position of the avatar within the display screen are arranged in accordance with the passage of time.

The moving image creation device according to any one of claims 1 to 7, wherein the schedule information generated by the schedule information generation means includes scale information in which a plurality of pieces of information regarding the display size of the avatar are arranged in accordance with the passage of time.

The moving image creation device according to any one of claims 1 to 8, wherein the avatar is configured to have three-dimensional information, and
the schedule information generated by the schedule information generation means includes rotation position information in which a plurality of pieces of information regarding the orientation in which the avatar is displayed are arranged in accordance with the passage of time.

A moving image creation system that creates a moving image including an avatar, comprising:
a storage unit;
basic information acquisition means for acquiring basic information about the avatar from the storage unit;
schedule information generation means for generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time;
voice information acquisition means for acquiring voice information from its own recording means or the storage unit; and
moving image data generation means for generating, based on the basic information, the schedule information, and the voice information, moving image data including video with sound in which the avatar performs a plurality of operations with the passage of time,
wherein, as the schedule information generation means, schedule information automatic generation means is provided that automatically generates at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice.

A moving image creation program for creating a moving image including an avatar,
the program causing a computer to execute:
a basic information acquisition process of acquiring basic information about the avatar from its own storage unit or from another computer via a network;
a schedule information generation process of generating at least part of schedule information in which a plurality of operation instructions for causing the avatar to perform predetermined operations are arranged in accordance with the passage of time;
a voice information acquisition process of acquiring voice information from its own recording means or its own storage unit, or from another computer via a network;
a moving image data generation process of generating, based on the basic information, the schedule information, and the voice information, moving image data including video with sound in which the avatar performs a plurality of operations with the passage of time; and
as the schedule information generation process, a schedule information automatic generation process of automatically generating at least part of the schedule information so that the mouth of the avatar moves in synchronization with the voice.