JP2005333309A

JP2005333309A - Apparatus, method, system, and program for information processing

Info

Publication number: JP2005333309A
Application number: JP2004148724A
Authority: JP
Inventors: Satoru Tokuhisa; 悟徳久; Takumi Kotabe; 巧小田部; Kanu Suguro; 冠宇勝呂; Soukai Okubo; 創介大久保; Masahiko Inakage; 正彦稲蔭
Original assignee: Individual
Current assignee: Individual
Priority date: 2004-05-19
Filing date: 2004-05-19
Publication date: 2005-12-02

Abstract

PROBLEM TO BE SOLVED: To create content comprising video and audio based on a user's performance. SOLUTION: A movement detector 62 detects the user's movement from video captured by a camera 42 using various kinds of processing, for example motion vector detection or tracking of the light of an LED held or worn by the user. A foot sensor input detector 66 receives a signal indicating an operation input from a foot sensor 43. A video effect imparting processor 64 applies a predetermined effect to the video based on the detection result of the user's movement supplied from the movement detector 62 and on the signal, supplied from the foot sensor input detector 66, indicating the user's operation input to the foot sensor 43. An effect sound generator 65 adds an effect sound to the audio data of an accompaniment generated by an accompaniment formation unit 67. The invention can be applied to a self-movie creation device. COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an information processing apparatus, an information processing method, an information processing system, and a program, and in particular to an information processing apparatus, information processing method, information processing system, and program capable of applying effects to captured video and generating audio data based on information about a user's motion that is input while the video is being captured.

One form of entertainment built around virtual experience is LBE (Location Based Entertainment). LBE is a highly immersive, audience-participation form of entertainment made possible by advances in, for example, recent high-density recording technology; virtual reality technology, which lets a user perceive a computer-generated world as if it were real; and mixed reality technology, a collective term for technologies that blend virtual reality with the real world so that a person perceives the result as a continuous three-dimensional space with little sense of incongruity. In amusement parks, for example, LBE includes video experiences on huge screens that give viewers a strong sense of immersion, and experience-based content in which audience members can take on the role of a character in a story.

In addition, with recent advances in computing technology, there are technologies that can process audio data and video data in substantially real time (see, for example, Non-Patent Document 1).
Non-Patent Document 1: Shu Matsuda, "DIPS: Real-time video processing objects for MAX," Information Processing Society of Japan, MUS36-9, 2000.

However, because its stories are predetermined and its interfaces are simplified so that anyone of any age can take part, LBE can hardly be called entertainment that can be enjoyed over and over again.

Content production by processing video data and audio data has become possible for many people thanks to the development of personal computers and the like, but depending on the technique involved it often demands highly specialized knowledge, such as software programming. For example, creating content through sophisticated editing of video and audio requires very advanced knowledge and skills. Conversely, when content can be created by anyone through a simple interface, there is limited room for the user to put originality into it, and the simpler the user interface, the simpler and less sophisticated the resulting content becomes.

Furthermore, in so-called conventional interactive art, in which content is created by measuring a user's gestures and mapping digital information based on the measurement results, special operation input methods are often used, and many works can be handled only by users with deep knowledge of video and music, so usability can hardly be said to have been considered. Nor can conventional interactive art be said to have considered, in its means of expression, the relationship between body movement and processing content or the relationship between audio processing and video processing.

Users want entertainment that they can enjoy actively, including through self-expression, and not only passively. To address these issues, there has been a demand for a content creation device that preserves the sense of immersion characteristic of conventional LBE content while letting the user feel creative and expressive, that can be enjoyed repeatedly even with a simple user interface, and that allows original content to be created easily, taking into account the relationship between body movement and processing content and the relationship between audio processing and video processing.

The present invention has been made in view of this situation. Using a simple user interface that users without special knowledge or skills can readily operate, it detects the user's movement (performance), creates sound and video, and applies effects to them, thereby making it possible to generate original content that has undergone sophisticated image processing and audio processing.

An information processing apparatus according to the present invention comprises: imaging means for capturing first video data; motion information acquisition means for acquiring information about a user's motion; operation input means for acquiring the user's operation input; audio generation means for generating first audio data; video processing means for generating second video data constituting content by applying predetermined first processing to the first video data captured by the imaging means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means; audio processing means for generating second audio data constituting the content by applying predetermined second processing to the first audio data generated by the audio generation means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means; display control means for controlling display of the second video data generated by the video processing means; and audio output control means for controlling output of the second audio data generated by the audio processing means.
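The data flow described above can be illustrated with a minimal Python sketch. It is not the implementation disclosed in this application: the scalar `motion` value, the `pressed` sensor set, and the toy effects (brightness gain, inversion, volume scaling, muting) are assumptions chosen only to show that the same motion information and operation input drive both the video processing and the audio processing.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class UserInput:
    """Hypothetical container for the two kinds of input named in the text."""
    motion: float            # assumed motion magnitude in [0.0, 1.0]
    pressed: FrozenSet[int]  # assumed indices of the sensors being operated


def process_video(frame: List[int], user: UserInput) -> List[int]:
    """Toy 'first processing': brighten with motion, invert if sensor 0 is pressed."""
    gain = 1.0 + user.motion
    out = [min(255, int(pixel * gain)) for pixel in frame]
    if 0 in user.pressed:
        out = [255 - pixel for pixel in out]
    return out


def process_audio(samples: List[float], user: UserInput) -> List[float]:
    """Toy 'second processing': scale volume with motion, mute if sensor 1 is pressed."""
    if 1 in user.pressed:
        return [0.0 for _ in samples]
    volume = 0.5 + 0.5 * user.motion
    return [sample * volume for sample in samples]


def make_content(frame: List[int], accompaniment: List[float],
                 user: UserInput) -> Tuple[List[int], List[float]]:
    """Return the (second video data, second audio data) pair that forms the content."""
    return process_video(frame, user), process_audio(accompaniment, user)
```

Passing one `UserInput` object to both functions mirrors the text's requirement that the video and audio processing be based on the same motion information and operation input.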

The information processing apparatus may have a space in which the user can move freely, separated from the outside by a housing, formed of a ceiling surface and wall surfaces having an opening through which the user can enter and exit, and by a floor portion installed at the bottom of the housing. A plurality of the operation input means may be installed on the upper surface of the floor portion and may receive operation inputs by being stepped on by the user's feet.

The information processing apparatus may be a portable information processing apparatus that can be held by the user.

The information processing apparatus may further comprise content output control means for controlling output, to another information processing apparatus, of the content composed of the second video data generated by the video processing means and the second audio data generated by the audio processing means.

The information processing apparatus may further comprise illumination control means for controlling illumination based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means.

The first processing applied to the first video data by the video processing means and the second processing applied to the first audio data by the audio processing means, both in response to a user's operation input entered through the operation input means, may be processing related to each other.
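One way to read "related" processing is that a single operation input selects a video effect and an audio effect from the same pairing table. The following Python sketch assumes that reading; the sensor numbers and effect names are hypothetical and do not come from this application.

```python
from typing import Dict, NamedTuple, Optional


class EffectPair(NamedTuple):
    """A video effect and the audio effect that is triggered together with it."""
    video_effect: str
    audio_effect: str


# Hypothetical pairing table: sensor number -> paired effects.
EFFECT_PAIRS: Dict[int, EffectPair] = {
    1: EffectPair("mosaic", "bit_crush"),
    2: EffectPair("afterimage", "echo"),
    3: EffectPair("slow_motion", "pitch_down"),
}


def effects_for(sensor_id: int) -> Optional[EffectPair]:
    """Return the paired effects for one operation input, or None if unmapped."""
    return EFFECT_PAIRS.get(sensor_id)
```

Keeping both effect names in one record makes it difficult to change the video side of a pairing without also reviewing the audio side.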

An information processing method according to the present invention comprises: an audio generation step of generating first audio data; an imaging control step of controlling capture of first video data; a motion information acquisition step of acquiring information about a user's motion; an operation input acquisition step of acquiring the user's operation input; a video processing step of generating second video data constituting content by applying predetermined first processing to the first video data whose capture is controlled in the imaging control step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step; and an audio processing step of generating second audio data constituting the content by applying predetermined second processing to the first audio data generated in the audio generation step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step.

A program according to the present invention causes a computer to execute processing comprising: an audio generation step of generating first audio data; an imaging control step of controlling capture of first video data; a motion information acquisition step of acquiring information about a user's motion; an operation input acquisition step of acquiring the user's operation input; a video processing step of generating second video data constituting content by applying predetermined first processing to the first video data whose capture is controlled in the imaging control step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step; and an audio processing step of generating second audio data constituting the content by applying predetermined second processing to the first audio data generated in the audio generation step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step.

In the information processing apparatus, information processing method, and program of the present invention, first audio data is generated, first video data is captured, information about a user's motion is acquired, and the user's operation input is acquired. Based on the information about the user's motion and the user's operation input, predetermined first processing is applied to the first video data to generate second video data constituting content, and predetermined second processing is applied to the first audio data to generate second audio data constituting the content.

An information processing system according to the present invention comprises a first information processing apparatus that creates content and a second information processing apparatus that receives the content from the first information processing apparatus and distributes it to a user. The first information processing apparatus comprises: imaging means for capturing first video data; motion information acquisition means for acquiring information about the user's motion; operation input means for acquiring the user's operation input; audio generation means for generating first audio data; video processing means for generating second video data constituting the content by applying predetermined first processing to the first video data captured by the imaging means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means; audio processing means for generating second audio data constituting the content by applying predetermined second processing to the first audio data generated by the audio generation means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means; display control means for controlling display of the second video data generated by the video processing means; audio output control means for controlling output of the second audio data generated by the audio processing means; and transmission means for transmitting, to the second information processing apparatus, the content composed of the second video data generated by the video processing means and the second audio data generated by the audio processing means. The second information processing apparatus comprises: reception means for receiving the content transmitted by the transmission means of the first information processing apparatus; storage means for storing the content received by the reception means; and distribution control means for controlling distribution of the content stored in the storage means to the user when a content distribution request is received from the user.
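The role of the second information processing apparatus (receive, store, distribute on request) can be sketched in Python as follows. This is a simplified, in-memory illustration under stated assumptions: the class name `ContentServer`, the string content identifiers, and the use of a dictionary in place of real storage and networking are all hypothetical.

```python
from typing import Dict, Optional


class ContentServer:
    """Hypothetical in-memory stand-in for the second information processing apparatus."""

    def __init__(self) -> None:
        # Stands in for the storage means; a real server would use persistent storage.
        self._store: Dict[str, bytes] = {}

    def receive(self, content_id: str, content: bytes) -> None:
        """Reception and storage: keep content sent by the content creation side."""
        self._store[content_id] = content

    def deliver(self, content_id: str) -> Optional[bytes]:
        """Distribution control: return stored content on request, or None if unknown."""
        return self._store.get(content_id)
```

A request for an identifier that was never received returns `None` in this sketch; how the actual system handles such requests is not stated in the text.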

The second information processing apparatus may further comprise conversion means for converting the data format of the content according to the type of device to which the content, whose distribution is controlled by the distribution control means, is to be distributed.
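Selecting a data format by destination device type can be sketched as a table lookup. In the Python sketch below, the device type names, the format labels, and the default are hypothetical placeholders; the text does not specify which formats are used, and real transcoding is deliberately left out.

```python
from typing import Dict

# Hypothetical device type -> format table; the source does not name any formats.
FORMAT_BY_DEVICE: Dict[str, str] = {
    "mobile_phone": "3gp",
    "pda": "mp4",
    "pc": "mpeg2",
}
DEFAULT_FORMAT = "mp4"  # assumed fallback for unknown device types


def target_format(device_type: str) -> str:
    """Pick the format for a destination device, falling back to the default."""
    return FORMAT_BY_DEVICE.get(device_type, DEFAULT_FORMAT)


def convert(content: dict, device_type: str) -> dict:
    """Return a copy of the content record tagged with the target format.

    Only the metadata changes here; actual re-encoding of video and audio
    is outside the scope of this sketch.
    """
    converted = dict(content)
    converted["format"] = target_format(device_type)
    return converted
```

Returning a copy leaves the stored original untouched, so the same content can be converted again for a different device.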

The second information processing apparatus may further comprise billing processing means for executing billing processing incurred by distribution of the content whose distribution is controlled by the distribution control means.

In the information processing system of the present invention, in the first information processing apparatus, first audio data is generated, first video data is captured, information about a user's motion is acquired, and the user's operation input is acquired; based on the information about the user's motion and the user's operation input, predetermined first processing is applied to the first video data to generate second video data constituting content, and predetermined second processing is applied to the first audio data to generate second audio data constituting the content. In the second information processing apparatus, the content transmitted from the first information processing apparatus is received and stored, and when a content distribution request is received from a user, distribution of the stored content to the user is controlled.

According to the present invention, content composed of video and audio can be created. In particular, because the user's performance is captured and used as the basis for creating video and audio and for applying effects to them, content that is original and has undergone sophisticated image processing and audio processing can be generated through a simple interface.

According to another aspect of the present invention, content composed of video and audio can be created and distributed to a user. In particular, the user's performance is captured, video and audio are created and given effects on that basis, and the resulting content is transmitted to the second information processing apparatus and then distributed to the user. Original content that has undergone sophisticated image processing and audio processing can therefore be generated through a simple interface, converted into data suited to the destination device before being distributed to the user, and subjected to billing processing associated with its distribution.

Embodiments of the present invention are described below. The correspondence between the inventions described in this specification and the embodiments is illustrated as follows. This description is intended to confirm that embodiments supporting the inventions described in this specification are described in this specification. Accordingly, even if an embodiment is described in the embodiments of the invention but is not listed here as corresponding to an invention, that does not mean the embodiment does not correspond to that invention. Conversely, even if an embodiment is listed here as corresponding to an invention, that does not mean the embodiment does not correspond to inventions other than that invention.

Furthermore, this description does not represent all of the inventions described in this specification. In other words, it does not deny the existence of inventions that are described in this specification but not claimed in this application, that is, inventions that may in the future be filed as divisional applications or appear or be added through amendment.

The information processing apparatus according to claim 1 (for example, the content creation device 2 shown in FIGS. 3 and 9) comprises: imaging means (for example, the camera 42 in FIGS. 2 and 3, or the camera 301 in FIG. 9) for capturing first video data; motion information acquisition means (for example, the motion detection unit 62 in FIG. 3 or FIG. 9) for acquiring information about a user's motion; operation input means (for example, the foot sensors 43-1 to 43-8 in FIGS. 2 and 3, or the operation input unit 302 in FIG. 9) for acquiring the user's operation input; audio generation means (for example, the accompaniment generation unit 67 in FIG. 3 or FIG. 9) for generating first audio data (for example, an accompaniment); video processing means (for example, the video effect applying processing unit 64 in FIG. 3 or FIG. 9) for generating second video data constituting content (for example, a self-movie) by applying predetermined first processing (for example, addition of an effect) to the first video data captured by the imaging means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means; audio processing means (for example, the sound effect generation unit 65 in FIG. 3 or FIG. 9) for generating second audio data constituting the content by applying predetermined second processing (for example, addition of sound effects or effects) to the first audio data generated by the audio generation means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means; display control means (for example, the video display control unit 69 in FIG. 3 or FIG. 9) for controlling display of the second video data generated by the video processing means; and audio output control means (for example, the audio output control unit 70 in FIG. 3 or FIG. 9) for controlling output of the second audio data generated by the audio processing means.

The information processing apparatus may have a space in which the user can move freely, separated from the outside by a housing (for example, the housing 21 in FIG. 2), formed of a ceiling surface and wall surfaces having an opening through which the user can enter and exit, and by a floor portion (for example, the floor 22 in FIG. 2) installed at the bottom of the housing. A plurality of the operation input means (for example, the foot sensors 43-1 to 43-8 in FIGS. 2 and 3) may be installed on the upper surface of the floor portion and may receive operation inputs by being stepped on by the user's feet.

The information processing apparatus may be a portable information processing apparatus that can be held by the user (for example, the mobile phone 4, the PDA 5, a digital still camera, or a digital video camera).

The information processing apparatus may further comprise content output control means (for example, the network interface 71 in FIG. 3 or FIG. 9) for controlling output of the content, composed of the second video data generated by the video processing means and the second audio data generated by the audio processing means, to another information processing apparatus (for example, the content distribution server 3, or the mobile phone 4 or the PDA 5 owned by the user).

The information processing apparatus may further comprise illumination control means (for example, the illumination control unit 63 in FIG. 3) for controlling illumination based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input entered through the operation input means.

The information processing method according to claim 7 is an information processing method for an information processing apparatus (for example, the content creation device 2 shown in FIGS. 3 and 9) that creates content (for example, a self-movie), and comprises: an audio generation step (for example, the processing of step S4 in FIG. 7 or step S64 in FIG. 10) of generating first audio data (for example, an accompaniment); an imaging control step (for example, the processing of step S5 in FIG. 7 or step S65 in FIG. 10) of controlling capture of first video data; a motion information acquisition step (for example, the processing of step S6 in FIG. 7 or step S66 in FIG. 10) of acquiring information about a user's motion; an operation input acquisition step (for example, the processing of step S8 in FIG. 7 or step S67 in FIG. 10) of acquiring the user's operation input; a video processing step (for example, the processing of step S9 in FIG. 7 or step S68 in FIG. 10) of generating second video data constituting the content by applying predetermined first processing (for example, addition of an effect) to the first video data whose capture is controlled in the imaging control step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step; and an audio processing step (for example, the processing of step S10 in FIG. 7 or step S69 in FIG. 10) of generating second audio data constituting the content by applying predetermined second processing (for example, addition of sound effects or effects) to the first audio data generated in the audio generation step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step.
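The six steps named above can be sketched as a simple Python driver that calls them in the order of the FIG. 7 step numbers cited in the text (S4, S5, S6, S8, S9, S10). This is only an illustration: the step names, the shared `state` dictionary, and the assumption that the steps run strictly one after another in a single cycle are hypothetical, since the flowcharts themselves are not reproduced here.

```python
from typing import Callable, Dict, List

# Order follows the FIG. 7 step numbers cited in the text; the names are hypothetical.
STEP_ORDER: List[str] = [
    "generate_audio",           # S4: audio generation step
    "capture_video",            # S5: imaging control step
    "acquire_motion",           # S6: motion information acquisition step
    "acquire_operation_input",  # S8: operation input acquisition step
    "process_video",            # S9: video processing step
    "process_audio",            # S10: audio processing step
]

Step = Callable[[dict], None]


def run_cycle(steps: Dict[str, Step]) -> dict:
    """Run every step once, in order, sharing one mutable state dictionary."""
    missing = [name for name in STEP_ORDER if name not in steps]
    if missing:
        raise KeyError(f"missing steps: {missing}")
    state: dict = {}
    for name in STEP_ORDER:
        steps[name](state)
    return state
```

Each step reads from and writes to `state`, so the two processing steps can see the motion information and operation input gathered earlier in the same cycle.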

In the program according to claim 8 as well, the embodiments (given only as examples) to which the steps correspond are the same as for the information processing method according to claim 7.

請求項９に記載の情報処理システムは、コンテンツ（例えば、セルフムービー）を作成する第１の情報処理装置（例えば、図３および図９に記載のコンテンツ作成装置２）と、第１の情報処理装置からコンテンツを受信し、コンテンツをユーザに配信する第２の情報処理装置（例えば、図６に記載のコンテンツ配信サーバ３）とで構成される情報処理システムであって、第１の情報処理装置は、第１の映像データを撮像する撮像手段（例えば、図２、図３のカメラ４２、または、図９のカメラ３０１）と、ユーザの動作に関する情報を取得する動作情報取得手段（例えば、図３または図９の動き検出部６２）と、ユーザの操作入力を取得する操作入力手段（例えば、図２、図３のフットセンサ４３−１乃至４３−８、または、図９の操作入力部３０２）と、第１の音声データ（例えば、伴奏）を生成する音声生成手段（例えば、図３または図９の伴奏生成部６７）と、動作情報取得手段により取得されたユーザの動作に関する情報、および、操作入力手段により入力されたユーザの操作入力を基に、撮像手段により撮像された第１の映像データに所定の第１の処理（例えば、エフェクトの付加）を施して、コンテンツを構成する第２の映像データを生成する映像処理手段（例えば、図３または図９の映像効果付与処理部６４）と、動作情報取得手段により取得されたユーザの動作に関する情報、および、操作入力手段により入力されたユーザの操作入力を基に、音声生成手段により生成された第１の音声データに、所定の第２の処理（例えば、効果音やエフェクトの付加）を施して、コンテンツを構成する第２の音声データを生成する音声処理手段（例えば、図３または図９の効果音生成部６５）と、映像処理手段により生成された第２の映像データの表示を制御する表示制御手段（例えば、図３または図９の映像表示制御部６９）と、音声処理手段により生成された第２の音声データの出力を制御する音声出力制御手段（例えば、図３または図９の音声出力制御部７０）と、映像処理手段により生成された第２の映像データ、および、音声処理手段により生成された第２の音声データで構成されるコンテンツを、第２の情報処理装置へ送信する送信手段（例えば、図３または図９のネットワークインターフェース７１）とを備え、第２の情報処理装置は、第１の情報処理装置の送信手段により送信されたコンテンツを受信する受信手段（例えば、図６のネットワークインターフェース２２０）と、受信手段により受信されたコンテンツを記憶する記憶手段（例えば、図６のＨＤＤ２１８）と、ユーザからコンテンツの配信要求を受けた場合、記憶手段により記憶されたコンテンツのユーザへの配信を制御する配信制御手段（例えば、図６のＣＰＵ２１１）とを備えることを特徴とする。 The information processing system according to claim 9 comprises a first information processing apparatus (for example, the content creation apparatus 2 shown in FIGS. 3 and 9) that creates content (for example, a self-movie), and a second information processing apparatus (for example, the content distribution server 3 shown in FIG. 6) that receives the content from the first information processing apparatus and distributes it to a user. The first information processing apparatus includes: imaging means for capturing first video data (for example, the camera 42 in FIGS. 2 and 3, or the camera 301 in FIG. 9); motion information acquisition means for acquiring information about the user's motion (for example, the motion detection unit 62 in FIG. 3 or FIG. 9); operation input means for acquiring the user's operation input (for example, the foot sensors 43-1 to 43-8 in FIGS. 2 and 3, or the operation input unit 302 in FIG. 9); audio generation means for generating first audio data (for example, an accompaniment) (for example, the accompaniment generation unit 67 in FIG. 3 or FIG. 9); video processing means (for example, the video effect imparting processing unit 64 in FIG. 3 or FIG. 9) for applying a predetermined first process (for example, addition of an effect) to the first video data captured by the imaging means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input received by the operation input means, thereby generating second video data constituting the content; audio processing means (for example, the sound effect generating unit 65 in FIG. 3 or FIG. 9) for applying a predetermined second process (for example, addition of sound effects or effects) to the first audio data generated by the audio generation means, based on the same motion information and operation input, thereby generating second audio data constituting the content; display control means (for example, the video display control unit 69 in FIG. 3 or FIG. 9) for controlling display of the second video data generated by the video processing means; audio output control means (for example, the audio output control unit 70 in FIG. 3 or FIG. 9) for controlling output of the second audio data generated by the audio processing means; and transmission means (for example, the network interface 71 in FIG. 3 or FIG. 9) for transmitting the content, composed of the second video data generated by the video processing means and the second audio data generated by the audio processing means, to the second information processing apparatus. The second information processing apparatus includes: receiving means (for example, the network interface 220 in FIG. 6) for receiving the content transmitted by the transmission means of the first information processing apparatus; storage means (for example, the HDD 218 in FIG. 6) for storing the content received by the receiving means; and distribution control means (for example, the CPU 211 in FIG. 6) for controlling distribution of the content stored in the storage means to the user when a content distribution request is received from the user.

第２の情報処理装置は、配信制御手段により配信が制御されるコンテンツの配信先の機器の種類に対応して、コンテンツのデータ形式を変換する変換手段（例えば、図６の画像処理部２４１および音声処理部２４２）を更に備えることができる。 The second information processing apparatus may further include conversion means (for example, the image processing unit 241 and the audio processing unit 242 in FIG. 6) for converting the data format of the content according to the type of the destination device to which the content is distributed under the control of the distribution control means.

第２の情報処理装置は、配信制御手段により配信が制御されるコンテンツの配信に対して発生する課金処理を実行する課金処理手段（例えば、図６の課金処理部２４３）を更に備えることができる。 The second information processing apparatus may further include billing processing means (for example, the billing processing unit 243 in FIG. 6) for executing the billing process incurred by distribution of the content whose distribution is controlled by the distribution control means.

以下、図を参照して、本発明の実施の形態について説明する。 Hereinafter, embodiments of the present invention will be described with reference to the drawings.

図１は、本発明を適用したセルフムービー配信システムのシステム構成を示す図である。 FIG. 1 is a diagram showing a system configuration of a self-movie distribution system to which the present invention is applied.

例えば、インターネットや電話回線網などを含む広域ネットワークであるネットワーク１には、セルフムービー作成装置２−１乃至２−ｎ（ただし、ｎは正の整数）、セルフムービー配信サーバ３、携帯型電話機４、ＰＤＡ（Personal Digital Assistant）５、パーソナルコンピュータ６、および、ホームサーバ７が接続され、ホームサーバ７には、テレビジョン受像機８が接続されている。 For example, self-movie creating apparatuses 2-1 to 2-n (where n is a positive integer), a self-movie distribution server 3, a mobile phone 4, a PDA (Personal Digital Assistant) 5, a personal computer 6, and a home server 7 are connected to the network 1, which is a wide area network including the Internet, a telephone network, and the like, and a television receiver 8 is connected to the home server 7.

セルフムービー作成装置２−１乃至２−ｎは、後述する処理により、ユーザの映像を撮像するとともに、ユーザのパフォーマンス（動きや装置内の各種センサなどへの操作入力）を取得し、取得されたフォーマンスを基に、音声データを生成したり、撮像された映像に所定の効果を施したり、必要に応じて、照明を制御したりすることにより、ユーザに配布可能なコンテンツであるセルフムービーを作成し、作成したセルフムービーを、ネットワーク１を介して、セルフムービー作成サーバ３に送信する。 Through the process described later, the self-movie creating apparatuses 2-1 to 2-n capture video of the user and acquire the user's performance (movement and operation inputs to the various sensors in the apparatus). Based on the acquired performance, they generate audio data, apply predetermined effects to the captured video, and control the lighting as necessary, thereby creating a self-movie, which is content that can be distributed to the user, and transmit the created self-movie to the self-movie distribution server 3 via the network 1.

セルフムービー配信サーバ３は、セルフムービー作成装置２−１乃至２−ｎから送信されたセルフムービーを保存し、ユーザによってセルフムービーの配信（ダウンロード）が要求されたとき、セルフムービーを、ネットワーク１を介して、携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６、および、ホームサーバ７に配信し、配信されたセルフムービーに対する課金処理を実行する。 The self-movie distribution server 3 stores the self-movies transmitted from the self-movie creating apparatuses 2-1 to 2-n. When a user requests distribution (download) of a self-movie, the server distributes it via the network 1 to the mobile phone 4, the PDA 5, the personal computer 6, and the home server 7, and executes the billing process for the distributed self-movie.

携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６、または、ホームサーバ７は、それぞれ、セルフムービ作成装置２−１乃至２−ｎのうちのいずれかにおいて、セルフムービーを作成したユーザが、セルフムービ配信サーバ３に、ネットワーク１を介してアクセスし、セルフムービーのダウンロードを要求する場合に利用されるものである。 The mobile phone 4, the PDA 5, the personal computer 6, and the home server 7 are each used when a user who has created a self-movie on any of the self-movie creating apparatuses 2-1 to 2-n accesses the self-movie distribution server 3 via the network 1 and requests download of the self-movie.

ユーザは、例えば、携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６、または、ホームサーバ７などを用いて、セルフムービー配信サーバ３にネットワーク１を介してアクセスし、セルフムービー作成装置２−１乃至２−ｎのいずれかにおいて作成されたセルフムービーのダウンロードを要求する。そして、ユーザは、ダウンロードされたセルフムービーを、携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６、または、ホームサーバ７に接続されたテレビジョン受像機８を用いて、視聴することができ、更に、携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６などを用いて、例えば、電子メールや無線通信を利用して、友人が保有する同様の装置（図示しない携帯型電話機、ＰＤＡ、または、パーソナルコンピュータ）などと、セルフムービーを授受することも可能である。 The user accesses the self-movie distribution server 3 via the network 1 using, for example, the mobile phone 4, the PDA 5, the personal computer 6, or the home server 7, and the self-movie creation devices 2-1 to 2-2. Request download of self-movie created in any of n. Then, the user can view the downloaded self-movie using the mobile phone 4, the PDA 5, the personal computer 6, or the television receiver 8 connected to the home server 7, and further the mobile phone. Using a portable telephone 4, a PDA 5, a personal computer 6 and the like, for example, using a device (such as a portable telephone, a PDA, or a personal computer, not shown) that is owned by a friend using e-mail or wireless communication, Self-movies can also be exchanged.

以下、セルフムービー作成装置２−１乃至２−ｎを個々に区別する必要がない場合、単にセルフムービー作成装置２と総称する。 Hereinafter, when it is not necessary to individually distinguish the self-movie creating apparatuses 2-1 to 2-n, they are simply referred to as the self-movie creating apparatus 2.

図２は、セルフムービー作成装置２の第１の実施の形態における外観構成について説明するための図である。 FIG. 2 is a diagram for explaining an external configuration of the self-movie creating apparatus 2 according to the first embodiment.

セルフムービー作成装置２は、ユーザが内部に入るための入り口となる開口部を有する筒型の壁面と天井面からなる筺体２１と、円形の床２２によって、ユーザがセルフムービーを作成するために移動可能な空間（例えば、身長１８０ｃｍの人が、水平に真横に手を広げたり、ジャンプすることが可能な、直径１８００ｍｍ、高さ３０００ｍｍの円筒状の空間）を構成するようになされている。 The self-movie creating apparatus 2 has a housing 21, consisting of a ceiling and a cylindrical wall with an opening that serves as the entrance for the user, and a circular floor 22. Together these form a space in which the user can move to create a self-movie (for example, a cylindrical space 1800 mm in diameter and 3000 mm in height, in which a person 180 cm tall can spread their arms horizontally or jump).

筺体２１の開口部正面に、ユーザが操作入力を行うための、キー、ボタン、または、タッチパットなどが備えられた操作入力部４１が設けられている。また、操作入力部４１には、赤外線などの無線による情報を受信可能な受信部５１が設けられている。ユーザは、操作入力部４１のキー、ボタン、または、タッチパットなどを用いて操作入力を行う以外に、例えば、赤外線通信やBluetoothなどによる無線通信が可能な携帯型電話機４やＰＤＡ５を用いて、所望の情報をセルフムービー作成装置２に供給することが可能である。例えば、ユーザは、セルフムービーの作成開始（以下、プレイ開始と称するものとする）を指令する前に、携帯型電話機４などを用いて、所望のテキストを入力し、赤外線通信などの無線通信を用いて、セルフムービー作成装置２に予め送信しておくことができる。そして、セルフムービー作成装置２は、ユーザからプレイ開始の指令を受けて、後述する処理により、受信したテキストを、撮像された映像に重畳させて表示させるようにすることができる。 An operation input unit 41 provided with keys, buttons, touch pads, or the like for a user to perform operation input is provided in front of the opening of the housing 21. In addition, the operation input unit 41 is provided with a receiving unit 51 that can receive wireless information such as infrared rays. In addition to performing operation input using the keys, buttons, touch pad, or the like of the operation input unit 41, the user uses, for example, the portable phone 4 or the PDA 5 capable of wireless communication by infrared communication or Bluetooth, It is possible to supply desired information to the self-movie creating apparatus 2. For example, before instructing the start of self-movie creation (hereinafter referred to as play start), the user inputs a desired text using the mobile phone 4 or the like, and performs wireless communication such as infrared communication. And can be transmitted to the self-movie creating apparatus 2 in advance. Then, the self-movie creating apparatus 2 can receive the play start command from the user and can display the received text superimposed on the captured image by a process described later.

そして、操作入力部４１の鉛直面上の上方には、ユーザを撮像するカメラ４２が設けられている。１４０ｃｍ乃至１８０ｃｍの身長の人が、カメラ４２により撮像される映像のフレームに入りきるように、カメラ４２の中心位置の高さを１４００ｍｍ程度にすると好適である。カメラ４２により撮像された映像は、セルフムービーの素材となるのみならず、ユーザの動きの検出に用いることも可能である。検出されたユーザの動きの検出結果は、後述する処理により、映像や音声にエフェクト（効果）をかけたり、音声に効果音を付与する処理のパラメータとして用いられる。 A camera 42 that images the user is provided above the vertical plane of the operation input unit 41. The height of the center position of the camera 42 is preferably about 1400 mm so that a person with a height of 140 cm to 180 cm can enter the frame of the image captured by the camera 42. The video imaged by the camera 42 can be used not only as a material for a self-movie but also for detecting a user's movement. The detection result of the detected user movement is used as a parameter of a process for applying an effect (effect) to video or audio or adding a sound effect to the audio by a process described later.

カメラ４２により撮像された映像を用いてユーザの動きを検出するには、例えば、ユーザにＬＥＤライトを持った、または、装着した状態で動作してもらい、撮像された映像データを解析して、ＬＥＤライトの移動を追跡するようにしても良いし、ユーザには、特に何も装着することなく動作してもらい、撮像された映像を画像処理して動きベクトルを得、その動きベクトルの大きさを求めるようにしたり、例えば、ユーザの手の先や頭の頂点などの所定の位置を検出して、その移動量を追跡するようにしてもよい。 In order to detect a user's movement using the video imaged by the camera 42, for example, the user operates with an LED light or wearing it, analyzes the captured video data, The movement of the LED light may be tracked, or the user may operate without wearing anything, and the captured image is subjected to image processing to obtain a motion vector, and the magnitude of the motion vector For example, a predetermined position such as the tip of the user's hand or the apex of the head may be detected and the amount of movement may be tracked.

また、ユーザの動きを検出するためには、カメラ４２によって撮像された映像データを用いる代わりに、例えば、筺体２１の内部に複数の赤外線センサやフォトセンサを設けることにより、そのセンサ入力に基づいて、ユーザの体の動きを検出するようにしてもよいし、ユーザにユーザ自身の動きを直接的に検出できるようなセンサを装着させて、そのセンサ入力に基づいて、ユーザの動きを検出するようにしてもよい。ユーザの動きを検出するために、筺体２１の内部に複数の赤外線センサなどが設けられている場合、ユーザが移動可能な空間が円柱状であると、複数のセンサと、空間の中央付近に存在すると仮定されるユーザの距離が均等になるため、ユーザの動きの検出に好適である。 To detect the user's movement, instead of using the video data captured by the camera 42, a plurality of infrared sensors or photosensors may be provided inside the housing 21 and the movement of the user's body detected from their input. Alternatively, the user may wear a sensor that directly detects the user's own movement, and the movement may be detected from that sensor's input. When a plurality of infrared sensors or the like are provided inside the housing 21 to detect the user's movement, a cylindrical space is well suited to motion detection, because the distances between the sensors and the user, who is assumed to be near the center of the space, are then equal.

なお、ユーザの動きを検出するために、撮像された映像から動きベクトルを検出したり、筺体２１の内部に設けられた赤外線センサなどのセンサ入力を利用するようにした場合、ユーザは、まったく制約なく動作することが可能であるが、制約がないことが、逆に、ユーザの動きを萎縮させてしまう場合がある。一方、ユーザにＬＥＤライトを持った、または、装着した状態で動作してもらうようにした場合、ユーザは、ＬＥＤライトをカメラ４２に向かって振り回すという動作を意識し、自分自身の動作と映像や音声に施されるエフェクト、または、音声に付加される効果音との関連を認識しやすいという利点がある。 In addition, in order to detect a user's motion, when a motion vector is detected from the captured image or a sensor input such as an infrared sensor provided in the housing 21 is used, the user has no restrictions. However, there is a case where the movement of the user is shrunken if there is no restriction. On the other hand, when the user operates with the LED light attached or attached, the user is conscious of the operation of swinging the LED light toward the camera 42, and the user's own operation and video, There is an advantage that it is easy to recognize the relationship between the effect applied to the sound or the sound effect added to the sound.

すなわち、ユーザにＬＥＤライトを持った、または、装着した状態で動作してもらうようにした場合、ユーザは、ただ自由にダンスをするよりも、「光の筆で絵を描く、音を奏でる」というイメージを持つことができるので、全く制約のない動作を許される場合よりもシンプルかつ効果的な動きを誘発される。更に、カメラ４２により撮像された映像はセルフムービーの素材となるので、撮像される映像の中にＬＥＤライトの光が存在することにより、作成される映像が、変化に富んだものとなるという効果も発生する。 In other words, when the user has an LED light or is put on the device to operate, the user can “draw a picture with a light brush, play a sound” rather than just dancing freely. It is possible to have a simple image that is simpler and more effective than when it is allowed to operate without any restrictions. Furthermore, since the video imaged by the camera 42 is a material for a self-movie, the created video image is rich in change due to the presence of LED light in the video image. Also occurs.

なお、ＬＥＤは、一定の光量を全方位的に放射するため、取得された画像中における位置が検出しやすいという利点もある。なお、撮像された映像において、ＬＥＤライトの位置の追跡の対象となる情報として、ＬＥＤの色相（ＲＧＢ）よりも、ＬＥＤから発される輝度情報に重点を置くようにした場合、ＬＥＤライトの色が変更されても、検出方法を変更することなく、位置を検出することが可能となるので、例えば、複数の色のＬＥＤライトを用意し、ユーザが所望する色のＬＥＤライトを把持、または、装着することができるようにしてもよい。 An LED also has the advantage that, because it emits a constant amount of light in all directions, its position in the acquired image is easy to detect. If, in the captured video, tracking of the LED light relies on the luminance of the light emitted by the LED rather than on its hue (RGB), the position can be detected without changing the detection method even when the color of the LED light changes. LED lights of several colors may therefore be prepared so that the user can hold or wear one of the desired color.

床２２には、フットセンサ４３−１乃至４３−８が設けられている。フットセンサ４３−１乃至４３−８が、ユーザにより踏まれる、すなわち、操作入力を受けることにより、撮像された映像や作成される音声に所定のエフェクトがかかったり、作成される音声に所定の効果音が付与される。例えば、フットセンサ４３−１乃至４３−８のそれぞれに、全て異なるエフェクトや効果音を割り当てるようにしても良いし、フットセンサ４３−１乃至４３−８のうちの一部には、同一のエフェクトまたは同一の効果音が割り当てられるようにしても良い。また、フットセンサ４３−１乃至４３−８に割り当てられるエフェクトまたは効果音の種類が、ユーザのセルフムービーの作成処理（以下、プレイと称するものとする）ごとにランダムに変更されるようにすることにより、ユーザにフットセンサ４３−１乃至４３−８とエフェクトまたは効果音との対応関係を容易に覚えることができないようにして、何度もプレイしているユーザであっても、意外性のあるセルフムービーが作成されるようにしても良い。または、フットセンサ４３−１乃至４３−８に割り当てられるエフェクトまたは効果音の種類を、ユーザが設定可能なようにしても良い。そして、上述したように、カメラ４２により撮像された映像などにより検出されたユーザの動きの検出結果に基づいて、撮像された映像、および、作成される音声にかかるエフェクトの強度や効果音のパラメータなどが制御される。エフェクトおよび効果音の詳細な例については後述する。 The floor 22 is provided with foot sensors 43-1 to 43-8. When the foot sensors 43-1 to 43-8 are stepped on by the user, that is, when an operation input is received, a predetermined effect is applied to the captured video or the created audio, or a predetermined effect is applied to the created audio. Sound is given. For example, different effects and sound effects may be assigned to each of the foot sensors 43-1 to 43-8, or the same effect may be assigned to some of the foot sensors 43-1 to 43-8. Alternatively, the same sound effect may be assigned. Also, the type of effect or sound effect assigned to the foot sensors 43-1 to 43-8 is randomly changed for each user's self-movie creation process (hereinafter referred to as play). Therefore, even if the user is playing many times in such a way that the user cannot easily learn the correspondence between the foot sensors 43-1 to 43-8 and the effects or sound effects, it is surprising. A self-movie may be created. Alternatively, the user may be able to set the type of effect or sound effect assigned to the foot sensors 43-1 to 43-8. Then, as described above, based on the detection result of the user's motion detected by the video captured by the camera 42, the effect intensity and the sound effect parameters related to the captured video and the created audio Etc. are controlled. 
Detailed examples of effects and sound effects will be described later.
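The per-play random reassignment of effects to the foot sensors 43-1 to 43-8 described above can be sketched as follows. This is a minimal illustration only: the effect names, the `assign_effects` function, and the use of a seeded generator are assumptions made for the example and do not appear in the specification.

```python
import random

# Identifiers for the eight foot sensors 43-1 to 43-8.
SENSORS = [f"43-{i}" for i in range(1, 9)]

# Hypothetical effect names; the specification does not name them this way.
EFFECTS = ["brightness_contrast", "jigsaw", "layers", "split",
           "kaleidoscope", "ghost", "effect_7", "effect_8"]


def assign_effects(rng):
    """Return a new sensor-to-effect mapping for one play.

    Shuffling a copy of the effect list gives every sensor a different
    effect while making the correspondence hard to memorize across plays.
    """
    shuffled = EFFECTS[:]
    rng.shuffle(shuffled)
    return dict(zip(SENSORS, shuffled))
```

Calling `assign_effects(random.Random())` at the start of each play yields a fresh mapping; a fixed, user-chosen mapping would simply bypass the shuffle.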

図２においては、フットセンサ４３−１乃至４３−８の８つのフットセンサが備えられているものとして説明したが、フットセンサの数は、８つ以外のいかなる数であっても良いことは言うまでもなく、また、筺体２１内部にタッチパッド等を用意して、フットセンサに代わって操作入力を受けることができるようにしても良い。 In FIG. 2, it has been described that eight foot sensors 43-1 to 43-8 are provided, but it goes without saying that the number of foot sensors may be any number other than eight. Alternatively, a touch pad or the like may be prepared inside the housing 21 so that an operation input can be received instead of the foot sensor.

また、ユーザが自分自身のパフォーマンスを確認できるように、ユーザの全身を映し出すことができる鏡を、筺体２１内部（ただし、カメラ４２による撮像処理や、フットセンサその他の操作を妨害しない位置）に設置するようにしてもよい。 In addition, a mirror capable of projecting the user's whole body is installed inside the housing 21 (however, a position that does not interfere with the imaging process by the camera 42, the foot sensor, and other operations) so that the user can check his / her own performance. You may make it do.

また、筺体２１の開口部正面には、モニタ４４が設けられている。モニタ４４には、カメラ４２によって撮像された映像に対して、フットセンサ４３−１乃至４３−８への操作入力およびユーザの動きの検出結果によってエフェクトが施されて作成された映像が表示されるようになされている。 A monitor 44 is provided in front of the opening of the housing 21. The monitor 44 displays an image created by applying an effect to the image captured by the camera 42 based on the operation input to the foot sensors 43-1 to 43-8 and the detection result of the user's movement. It is made like that.

また、筺体２１の内部には、フットセンサ４３−１乃至４３−８への操作入力およびユーザの動きの検出結果によってエフェクトが施されたリ、効果音が付加された音声を出力するスピーカ４５−１乃至４５−６が設けられている。スピーカ４５−１乃至４５−６は、ドルビーデジタルの最新方式で音声を再生することができるようになされ、２０Hz乃至２０kHzの帯域のセンタースピーカであるスピーカ４５−１、フロントの左右スピーカであるスピーカ４５−２およびスピーカ４５−３、リアの左右スピーカであるスピーカ４５−４およびスピーカ４５−５、および、１２０Hz以下の低域専用のサブウーファーであるスピーカ４５−６で構成されている。なお、図２においては、スピーカ４５−１乃至４５−６の６つのスピーカにより音声を出力することができるものとして説明するが、スピーカの数は、６つ以外のいかなる数であっても良く、更に、再生方式も、ドルビーデジタル以外のいかなる再生方式であっても良いことは言うまでもない。 Inside the housing 21, speakers 45-1 to 45-6 are provided to output audio to which effects have been applied or sound effects added according to the operation inputs to the foot sensors 43-1 to 43-8 and the detection result of the user's movement. The speakers 45-1 to 45-6 can reproduce audio in the latest Dolby Digital format and consist of the speaker 45-1, a center speaker covering the 20 Hz to 20 kHz band; the speakers 45-2 and 45-3, the front left and right speakers; the speakers 45-4 and 45-5, the rear left and right speakers; and the speaker 45-6, a subwoofer dedicated to the low range of 120 Hz and below. Although FIG. 2 is described as outputting audio through the six speakers 45-1 to 45-6, it goes without saying that the number of speakers may be any number other than six and that any reproduction format other than Dolby Digital may be used.

また、筺体２１の内部の上面（天井面付近）に、照明４７−１乃至４７−５が備えられている。照明４７−１乃至４７−５は、一定の明るさで筺体２１内部を照明するようになされていても良いし、フットセンサ４３−１乃至４３−８への操作入力およびユーザの動きの検出結果によって、照明の色が変更されたり、点滅されたりなどの制御が行われるようにしても良い。また、ユーザの動きを検出するために、ユーザにＬＥＤライトを持たせて、その移動を検出するようにした場合、照明４７−１乃至４７−５による照明の明るさは、撮像された映像において、ＬＥＤライトの輝度の検出を妨げない程度に制御する必要がある。 Also, illuminations 47-1 to 47-5 are provided on the upper surface (near the ceiling surface) inside the housing 21. The illuminations 47-1 to 47-5 may be configured to illuminate the inside of the housing 21 with a constant brightness, and the operation input to the foot sensors 43-1 to 43-8 and the detection result of the user's movement Thus, control such as changing the color of the illumination or blinking may be performed. In addition, in order to detect the movement of the user, when the user is provided with an LED light and the movement thereof is detected, the brightness of the illumination by the illuminations 47-1 to 47-5 is determined in the captured image. Therefore, it is necessary to control the LED light so as not to interfere with the detection of luminance.

次に、図３は、図２を用いて説明した、第１の実施の形態のセルフムービー作成装置２における内部構成を示すブロック図である。 Next, FIG. 3 is a block diagram showing the internal configuration of the self-movie creating apparatus 2 according to the first embodiment described with reference to FIG.

セルフムービー作成装置２は、ユーザのパフォーマンスを取得するパフォーマンスセクション、取得されたパフォーマンスを利用して、所定のエフェクトが付与された映像データおよびエフェクトまたは効果音が付与された音声データを生成するリアルタイムエフェクトセクション、および、所定のエフェクトが付与された映像データおよび音声データをレンダリングするレンダリングセクションの３つのセクションに大きく分けられている。 The self-movie creating apparatus 2 uses a performance section for acquiring a user's performance, real-time effects for generating video data to which a predetermined effect is applied and audio data to which an effect or a sound effect is applied, using the acquired performance. The section is roughly divided into three sections: a rendering section for rendering video data and audio data to which a predetermined effect is applied.

ここでは、ユーザの動きは、カメラ４２により撮像された映像を基に、例えば、動きベクトル検出や、ユーザが把持または装着したＬＥＤライトの移動の追跡などによって検出されるものとして説明する。 Here, a description will be given assuming that the user's movement is detected by, for example, motion vector detection or tracking of movement of the LED light held or attached by the user based on the video imaged by the camera 42.

操作入力部４１は、ユーザから、プレイ開始が指令されたとき、プレイ開始を示す制御信号を、画像処理部６１、映像効果付与処理部６４、フットセンサ入力検出部６６、および、伴奏生成部６７に供給する。また、操作入力部４１は、必要に応じて、作成されたセルフムービーを固有に区別するために必要な情報（例えば、ユーザＩＤなど）の入力を受けることができる。 The operation input unit 41 outputs a control signal indicating the start of play to the image processing unit 61, the video effect applying processing unit 64, the foot sensor input detection unit 66, and the accompaniment generation unit 67 when play start is instructed by the user. To supply. In addition, the operation input unit 41 can receive input of information (for example, a user ID) necessary for uniquely distinguishing the created self-movie as necessary.

受信部５１は、ユーザが保有する携帯型電話機４やＰＤＡ５から送信された、セルフムービーに表示させるテキストデータを受信し、映像効果付与処理部６４に供給する。 The receiving unit 51 receives text data to be displayed on the self-movie transmitted from the mobile phone 4 or PDA 5 owned by the user and supplies the text data to the video effect applying processing unit 64.

カメラ４２により撮像された映像は、画像処理部６１に供給される。画像処理部６１は、プレイ開始を示す制御信号の供給を受けた場合、供給された映像に対して、例えば、Ａ／Ｄ変換、ガンマ補正、ホワイトバランス調整などの各種信号処理（画像補正処理）を施して、処理が施された映像データを、動き検出部６２および映像効果付与処理部６４に供給する。 The video imaged by the camera 42 is supplied to the image processing unit 61. When receiving a control signal indicating the start of play, the image processing unit 61 performs various signal processing (image correction processing) such as A / D conversion, gamma correction, and white balance adjustment on the supplied video. Then, the processed video data is supplied to the motion detector 62 and the video effect applying processor 64.

動き検出部６２は、例えば、動きベクトル検出や、ユーザが把持または装着したＬＥＤライトの移動の追跡などによって、撮像された映像から、ユーザの動きを検出し、検出結果を、照明制御部６３、映像効果付与処理部６４、および、効果音生成部６５に供給する。 The motion detection unit 62 detects the user's movement from the captured video by, for example, motion vector detection or tracking the movement of an LED light held or worn by the user, and supplies the detection result to the illumination control unit 63, the video effect imparting processing unit 64, and the sound effect generating unit 65.

例えば、ユーザが把持または装着したＬＥＤライトの移動の追跡を行う場合、動き検出部６２は、撮像された映像の中心点を座標（０,０）とするＸＹ平面において、所定の閾値以上の輝度信号強度を有する座標の位置を基に、ＬＥＤライトの位置を認識するようにし、所定の単位時間におけるＬＥＤライトの位置の変化量を、ユーザの動き量として検出するようにすることができる。 For example, when tracking the movement of the LED light held or attached by the user, the motion detection unit 62 has a luminance equal to or higher than a predetermined threshold on the XY plane with the center point of the captured image as coordinates (0, 0). The position of the LED light can be recognized based on the position of the coordinates having the signal intensity, and the change amount of the position of the LED light in a predetermined unit time can be detected as the amount of movement of the user.
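The LED tracking just described can be illustrated with a short sketch. It assumes, purely for the example, that a frame is a list of rows of 8-bit luminance values, that the LED position is taken as the centroid of all pixels at or above the threshold, and that the amount of movement is the Euclidean distance between positions in successive unit times; the function names are hypothetical.

```python
import math


def led_position(frame, threshold):
    """Locate the LED as the centroid of pixels with luminance >= threshold.

    Coordinates are expressed on an XY plane whose origin (0, 0) is the
    center of the image, with y increasing upward. Returns None when no
    pixel reaches the threshold.
    """
    height, width = len(frame), len(frame[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    xs, ys = [], []
    for row_idx, row in enumerate(frame):
        for col_idx, value in enumerate(row):
            if value >= threshold:
                xs.append(col_idx - cx)
                ys.append(cy - row_idx)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def motion_amount(prev_pos, cur_pos):
    """Distance moved by the LED between two successive unit times."""
    if prev_pos is None or cur_pos is None:
        return 0.0
    return math.hypot(cur_pos[0] - prev_pos[0], cur_pos[1] - prev_pos[1])
```

Because only luminance is compared against the threshold, the same routine would apply unchanged to LED lights of different colors, provided each is bright enough in the captured image.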

また、フットセンサ入力検出部６６は、プレイ開始を示す制御信号の供給を受けた場合、フットセンサ４３−１乃至４３−８から、操作入力を示す信号の供給を受け、フットセンサ４３−１乃至４３−８のいずれが操作入力を受けたかを示す信号を、照明制御部６３、映像効果付与処理部６４、および、効果音生成部６５に供給する。 In addition, when receiving a control signal indicating the start of play, the foot sensor input detection unit 66 receives a signal indicating an operation input from the foot sensors 43-1 to 43-8, and receives the foot sensor 43-1 to 43-8. A signal indicating which one of the operation inputs 43-8 has received the operation input is supplied to the illumination control unit 63, the video effect applying processing unit 64, and the sound effect generating unit 65.

伴奏生成部６７は、プレイ開始を示す制御信号の供給を受けた場合、例えば、リズム・ベース・メロディといった３トラックから構成されている所定のコード進行のダンスミュージックを、８小節を基本にループさせて、伴奏を生成し、効果音生成部６５に供給する。また、伴奏生成部６７は、効果音の生成や、音声データへのエフェクトの付加に必要な、例えば、現在再生中の伴奏のコード、現在のテンション値、または、現在再生中のセクションなどを示す情報を、効果音生成部６５に供給する。 When the accompaniment generator 67 receives a control signal indicating the start of play, the accompaniment generator 67 loops a predetermined chord progression dance music composed of three tracks such as rhythm, bass, and melody based on 8 bars. Then, an accompaniment is generated and supplied to the sound effect generator 65. The accompaniment generation unit 67 indicates, for example, an accompaniment chord currently being played, a current tension value, a currently playing section, or the like necessary for generating sound effects and adding effects to audio data. Information is supplied to the sound effect generator 65.

音楽を生成するにあたり、まったくランダムに音程を決めてしまっては、不協和音が発生してしまうので、コード進行のルールに基づいて、伴奏を生成しなければならない。伴奏生成部６７は、所定のコード進行のルールに基づいて、例えば、ランダム端数をマルコフ連鎖により制御するStochastic Markov モデルを利用し、毎回違った楽曲を生成（譜面生成）することができる。 In generating music, if the pitch is determined at random, a dissonant sound will be generated, so an accompaniment must be generated based on the chord progression rule. The accompaniment generation unit 67 can generate a different piece of music (music score generation) each time using a Stochastic Markov model in which a random fraction is controlled by a Markov chain based on a predetermined chord progression rule.
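As a rough illustration of rule-constrained random generation, the following sketch produces a chord sequence with a plain first-order Markov chain. It is not the Stochastic Markov model of the accompaniment generation unit 67, whose details are not given here; the transition table, chord names, and probabilities are all invented for the example.

```python
import random

# Hypothetical transition table: for each chord, the chords that may
# follow it and their probabilities. Restricting the table to allowed
# progressions is what keeps the random output free of dissonance.
TRANSITIONS = {
    "C":  [("F", 0.4), ("G", 0.4), ("Am", 0.2)],
    "F":  [("G", 0.5), ("C", 0.3), ("Dm", 0.2)],
    "G":  [("C", 0.6), ("Am", 0.4)],
    "Am": [("F", 0.5), ("Dm", 0.5)],
    "Dm": [("G", 1.0)],
}


def generate_progression(bars, rng, start="C"):
    """Generate one chord per bar by walking the transition table."""
    chords = [start]
    while len(chords) < bars:
        choices, weights = zip(*TRANSITIONS[chords[-1]])
        chords.append(rng.choices(choices, weights=weights, k=1)[0])
    return chords
```

With `bars=8` this yields one 8-bar loop; a different generator state gives a different piece each play while every transition still obeys the table.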

照明制御部６３は、動き検出部６２から供給されたユーザの動きの検出結果、および、フットセンサ入力検出部６６から供給されたユーザのフットセンサ４３−１乃至４３−８への操作入力を示す信号を基に、照明４７−１乃至４７−５を制御する。ユーザの動きの検出結果、および、ユーザのフットセンサ４３−１乃至４３−８への操作入力と、照明４７−１乃至４７−５の制御との関連付けの例としては、例えば、検出された動きの大きさ、または、検出された動きの大きさの変化量に合わせて、照明の照度、または、点滅速度などを変えるようにしても良いし、操作されたフットセンサの位置に対応して点灯させる照明の種類を変更するなどしてもよい。 The illumination control unit 63 indicates the detection result of the user's motion supplied from the motion detection unit 62 and the operation input to the user's foot sensors 43-1 to 43-8 supplied from the foot sensor input detection unit 66. Based on the signal, the lights 47-1 to 47-5 are controlled. As an example of the association between the detection result of the user's motion and the operation input to the user's foot sensors 43-1 to 43-8 and the control of the lights 47-1 to 47-5, for example, the detected motion Depending on the amount of movement or the amount of change in the detected movement, the illumination intensity or blinking speed may be changed, or it will light up according to the position of the operated foot sensor. You may change the kind of illumination to make.

映像効果付与処理部６４は、動き検出部６２から供給されたユーザの動きの検出結果、および、フットセンサ入力検出部６６から供給されたユーザのフットセンサ４３−１乃至４３−８への操作入力を示す信号を基に、画像処理部６１から供給された映像データに、必要に応じて、受信部５１から供給されたテキストデータを重畳し、所定のエフェクトを施して、その結果得られる映像データを、セルフムービー記憶部６８に供給する。 Based on the detection result of the user's movement supplied from the motion detection unit 62 and the signal indicating the user's operation input to the foot sensors 43-1 to 43-8 supplied from the foot sensor input detection unit 66, the video effect imparting processing unit 64 superimposes, as necessary, the text data supplied from the receiving unit 51 on the video data supplied from the image processing unit 61, applies a predetermined effect, and supplies the resulting video data to the self-movie storage unit 68.

図４および図５を用いて、映像に対するエフェクトの付与について説明する。 With reference to FIG. 4 and FIG. 5, the application of the effect to the video will be described.

図４に、撮像された映像のうちの１フレームである画像９１と、単独のエフェクトが付与された画像１０１乃至１０８とを示す。 FIG. 4 shows an image 91 that is one frame of the captured video and images 101 to 108 to which a single effect is applied.

ここでは、フットセンサ４３−１乃至４３−８にはそれぞれに独立したエフェクトが対応付けられており、エフェクトの強度は、動き検出部６２により検出される、ユーザの動き量により決定されるようになされているものとして説明する。また、動き検出部６２により検出されるユーザの動き量に対して、エフェクトの強度がリニアに変化されるようにしても良いし、動き検出部６２により検出されるユーザの動き量を所定の量子化ステップで量子化して、エフェクトの強度に関連つけるようにしても良い。なお、動き量の検出値に対して、エフェクトの強度がリニアに変化されるようにするよりも、動き量の検出値を粗く量子化して、エフェクトの強度に関連付けたほうが、ユーザにとっては、動きとエフェクト強度との関連性の認識がしやすいという利点がある。 Here, independent effects are associated with each of the foot sensors 43-1 to 43-8, and the intensity of the effect is determined by the amount of motion of the user detected by the motion detector 62. It will be described as being done. Further, the intensity of the effect may be linearly changed with respect to the user's motion amount detected by the motion detection unit 62, or the user's motion amount detected by the motion detection unit 62 may be set to a predetermined quantum. The quantization step may be used to quantize and relate to the effect intensity. For users, it is better to roughly quantize the motion amount detection value and associate it with the effect strength than to change the effect strength linearly with respect to the motion amount detection value. There is an advantage that it is easy to recognize the relationship between and the effect strength.
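The difference between the linear mapping and the coarsely quantized mapping can be sketched as follows. The maximum motion amount of 100 and the four quantization levels are assumed values chosen only for illustration.

```python
def linear_intensity(motion, max_motion=100.0):
    """Map the motion amount linearly to an effect intensity in [0.0, 1.0]."""
    return min(max(motion / max_motion, 0.0), 1.0)


def quantized_intensity(motion, max_motion=100.0, levels=4):
    """Map the motion amount to one of `levels` evenly spaced intensities.

    Nearby motion amounts fall into the same step, so the effect changes
    in a few clearly distinguishable jumps rather than continuously.
    """
    clipped = min(max(motion / max_motion, 0.0), 1.0)
    step = min(int(clipped * levels), levels - 1)
    return step / (levels - 1)
```

For example, motion amounts of 26 and 30 give different linear intensities but the same quantized intensity, which is the property that makes the link between movement and effect easier for the user to perceive.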

例えば、フットセンサ４３−１が操作されたとき、映像効果付与処理部６４は、撮像された画像９１の明度およびコントラストを変換して、画像１０１を生成する。このとき、検出されたユーザの動きの量と、明度、コントラストの値が連動されるようになされ、例えば、検出された動きの量が大きいほど、画像内の明度があがり、コントラストの度合いが上昇するようになされている。 For example, when the foot sensor 43-1 is operated, the video effect provision processing unit 64 converts the brightness and contrast of the captured image 91 to generate the image 101. At this time, the detected amount of user movement is linked to the brightness and contrast values. For example, the greater the detected amount of motion, the higher the brightness in the image and the higher the degree of contrast. It is made to do.
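A brightness and contrast conversion driven by the motion amount might look like the sketch below. The gain range, the maximum offset of 64, and the use of a single 8-bit luminance channel are assumptions for the example; the specification states only that both values rise with the detected motion.

```python
def adjust_pixel(value, strength):
    """Raise brightness and contrast of one 8-bit luminance value.

    strength is in [0.0, 1.0]; 0.0 leaves the pixel unchanged.
    """
    contrast = 1.0 + strength       # assumed gain around mid-gray, 1.0 to 2.0
    brightness = 64.0 * strength    # assumed additive offset, 0 to 64
    result = (value - 128.0) * contrast + 128.0 + brightness
    return int(min(max(round(result), 0), 255))


def adjust_frame(frame, strength):
    """Apply adjust_pixel to every value of a frame given as rows of values."""
    return [[adjust_pixel(v, strength) for v in row] for row in frame]
```

Here `strength` would be the effect intensity derived from the detected motion amount, so larger movements produce a brighter, higher-contrast image.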

また、フットセンサ４３−２が操作されたとき、映像効果付与処理部６４は、撮像された画像９１を複数のセルに分割し、それぞれのセルをリマッピングすることにより、ジグソーパズルのような効果が与えられた画像１０２を生成する。このとき、検出されたユーザの動きのうち、横方向の動きが、セルのリマッピングの変化量に対応付けられ、縦方向の動きが、セルの位置変化のスピードに対応付けられるようになされ、例えば、検出された動きの量が大きいほど、セルの位置変化のスピード、変化量が増すようになされている。 When the foot sensor 43-2 is operated, the video effect imparting processing unit 64 divides the captured image 91 into a plurality of cells, and remaps each cell, thereby providing an effect like a jigsaw puzzle. A given image 102 is generated. At this time, among the detected user movements, the horizontal movement is associated with the amount of change in cell remapping, and the vertical movement is associated with the speed of cell position change. For example, the greater the amount of motion detected, the greater the speed and amount of change in cell position.
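The cell remapping behind the jigsaw-like effect can be sketched as a permutation of cell indices applied to the frame. In this hypothetical example the number of random swaps stands in for the "amount of change" of the remapping; how horizontal and vertical motion are converted to that number and to the update rate is left open, as it is not detailed in this passage.

```python
import random


def remap_cells(cell_count, swaps, rng):
    """Return a permutation of cell indices after `swaps` random exchanges."""
    order = list(range(cell_count))
    for _ in range(swaps):
        i, j = rng.sample(range(cell_count), 2)
        order[i], order[j] = order[j], order[i]
    return order


def shuffle_frame(frame, rows, cols, order):
    """Rebuild the frame so that output cell k shows source cell order[k].

    Assumes the frame height and width are divisible by rows and cols.
    """
    cell_h, cell_w = len(frame) // rows, len(frame[0]) // cols
    out = [row[:] for row in frame]
    for k, src in enumerate(order):
        dst_r, dst_c = divmod(k, cols)
        src_r, src_c = divmod(src, cols)
        for dy in range(cell_h):
            for dx in range(cell_w):
                out[dst_r * cell_h + dy][dst_c * cell_w + dx] = \
                    frame[src_r * cell_h + dy][src_c * cell_w + dx]
    return out
```

With zero swaps the frame is returned unchanged, so the effect fades out naturally when the user stops moving.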

フットセンサ４３−３が操作されたとき、映像効果付与処理部６４は、撮像された画像９１を所定枚数（図４においては４枚）のレイヤーに分割し、分割されたレイヤーを移動させることにより、分身効果が与えられた画像１０３を生成する。このとき、検出されたユーザの動きの量とそれぞれのレイヤーの移動量が連動されるようになされ、例えば、検出された動きの量が大きいほど、レイヤーの移動量が大きくなるようになされている。 When the foot sensor 43-3 is operated, the video effect imparting processing unit 64 divides the captured image 91 into a predetermined number of layers (four in FIG. 4) and moves the divided layers. Then, the image 103 to which the alternation effect is given is generated. At this time, the detected amount of movement of the user and the amount of movement of each layer are linked, for example, the amount of movement of the layer increases as the amount of detected movement increases. .

フットセンサ４３−４が操作されたとき、映像効果付与処理部６４は、撮像された画像９１を所定数の画像に分割して、画像１０４を生成する。このとき、検出されたユーザの動きの量と画像の分割数（分割数はべき乗の値とされる）が連動され、例えば、検出された動きの量が大きいほど、画像の分割数が増えるようになされている。 When the foot sensor 43-4 is operated, the video effect imparting processing unit 64 divides the captured image 91 into a predetermined number of images to generate an image 104. At this time, the detected amount of user movement is linked to the number of divisions of the image (the number of divisions is a value obtained by exponentiation); for example, the greater the detected amount of movement, the greater the number of divisions.
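The link between movement and division count can be sketched as follows. This is only a minimal illustration: the text says the division count is a power value but gives neither the base nor the range, so the base of 2, the normalized motion range of 0 to 1, and the names `division_count` and `max_exponent` are all assumptions made here.

```python
def division_count(motion: float, max_exponent: int = 4) -> int:
    """Map a motion amount to a division count that is a power of two.

    Assumptions (not from the source): motion is normalized to [0, 1],
    the base is 2, and the largest exponent is max_exponent.
    """
    clamped = min(max(motion, 0.0), 1.0)   # keep motion inside [0, 1]
    exponent = round(clamped * max_exponent)
    return 2 ** exponent                   # larger motion, more divisions


if __name__ == "__main__":
    for m in (0.0, 0.5, 1.0):
        print(m, division_count(m))
```

Restricting the count to powers of a fixed base keeps every division an even subdivision of the previous one, which is one plausible reason for the constraint.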

また、フットセンサ４３−５が操作されたとき、映像効果付与処理部６４は、撮像された画像９１を所定のスケールに縮小し、所定の角度に回転させて、万華鏡のような効果が与えられた画像１０５を生成する。このとき、検出されたユーザの動きのうち、横方向の動きが、回転の角度に対応付けられ、縦方向の動きが、スケールの値に割り付けられるようになされ、例えば、検出された動きの量が大きいほど、スケールが小さくなり、回転のスピードが上がるようになされている。 In addition, when the foot sensor 43-5 is operated, the video effect imparting processing unit 64 reduces the captured image 91 to a predetermined scale and rotates it by a predetermined angle, thereby generating an image 105 with a kaleidoscope-like effect. At this time, of the detected user movement, the horizontal movement is associated with the rotation angle and the vertical movement is assigned to the scale value; for example, the greater the detected amount of movement, the smaller the scale and the faster the rotation.

フットセンサ４３−６が操作されたとき、映像効果付与処理部６４は、撮像された画像９１の画像データのうちの一部に、所定の遅延処理を施して、いわゆる、ゴーストを発生させることにより、画像１０６を生成する。このとき、検出されたユーザの動きの量と、遅延時間の長さが連動され、例えば、検出された動きの量が大きいほど、遅延時間が長くなり、ゴーストがはっきりと観察できる画像となる。 When the foot sensor 43-6 is operated, the video effect imparting processing unit 64 performs a predetermined delay process on a part of the image data of the captured image 91 to generate a so-called ghost. The image 106 is generated. At this time, the detected amount of movement of the user and the length of the delay time are linked. For example, the larger the amount of detected motion, the longer the delay time, and an image in which the ghost can be clearly observed.

フットセンサ４３−７が操作されたとき、映像効果付与処理部６４は、撮像された画像９１の画像データに、サイン波を乗算することにより、画像を歪曲させ、画像１０７を生成する。このとき、検出されたユーザの動きの量と、乗算されるサイン波の振幅の大きさが連動され、例えば、検出された動きの量が大きいほど、サイン波の振幅が大きくなるようになされ、歪曲量が大きくなるようになされる。 When the foot sensor 43-7 is operated, the video effect imparting processing unit 64 distorts the image by multiplying the image data of the captured image 91 by a sine wave, and generates the image 107. At this time, the detected amount of movement of the user and the magnitude of the amplitude of the sine wave to be multiplied are linked, for example, the larger the amount of detected motion, the larger the amplitude of the sine wave, The amount of distortion is increased.

そして、フットセンサ４３−８が操作されたとき、映像効果付与処理部６４は、撮像された画像９１のエッジ（輪郭）を検出し、その色相を変化させて、画像１０８生成する。このとき、検出されたユーザの動きの量と、色相の変化量が連動され、例えば、出された動きの量が大きいほど、色相の変化量が大きくなるようになされ、エッジのゆがみが増すようになされている。 Then, when the foot sensor 43-8 is operated, the video effect imparting processing unit 64 detects the edges (contours) of the captured image 91 and changes their hue to generate an image 108. At this time, the detected amount of user movement is linked to the amount of hue change; for example, the greater the detected amount of movement, the greater the hue change and the greater the distortion of the edges.

また、フットセンサ４３−１乃至４３−８のうち、複数のフットセンサが同時に操作された場合、映像効果付与処理部６４は、並行処理を行い、重複してエフェクトをかけて、図５に示されるような画像を生成させることができる。 In addition, when two or more of the foot sensors 43-1 to 43-8 are operated simultaneously, the video effect imparting processing unit 64 can perform parallel processing and apply the effects on top of one another to generate images such as those shown in FIG. 5.

例えば、図５Ａは、図４を用いて説明した画像１０４を生成するエフェクトと対応付けられているフットセンサ４３−４と画像１０５を生成するエフェクトと対応付けられているフットセンサ４３−５とが同時に操作されたときに生成される画像１１１である。画像１１１は、撮像された画像９１に対して、検出されたユーザの動きの量のうちの横方向の動きが回転の角度に対応付けられて、縦方向の動きがスケールの値に割り付けられて、更に、検出されたユーザの動きに基づいた分割数に画像が分割されることにより生成されるものである。 For example, FIG. 5A shows an image 111 generated when the foot sensor 43-4, which is associated with the effect that generates the image 104 described with reference to FIG. 4, and the foot sensor 43-5, which is associated with the effect that generates the image 105, are operated simultaneously. The image 111 is generated from the captured image 91 by associating the horizontal component of the detected user movement with the rotation angle, assigning the vertical component to the scale value, and further dividing the image into a number of divisions based on the detected user movement.

また、図５Ｂは、図４を用いて説明した画像１０２を生成するエフェクトと対応付けられているフットセンサ４３−２と画像１０４を生成するエフェクトと対応付けられているフットセンサ４３−４とが同時に操作されたときに生成される画像１１２である。画像１１２は、撮像された画像９１に対して、検出されたユーザの動きのうち、横方向の動きが、セルのリマッピングの変化量に対応付けられ、縦方向の動きが、セルの位置変化のスピードに対応付けられ、更に、検出されたユーザの動きに基づいた分割数に画像が分割されることにより生成されるものである。 FIG. 5B shows an image 112 generated when the foot sensor 43-2, which is associated with the effect that generates the image 102 described with reference to FIG. 4, and the foot sensor 43-4, which is associated with the effect that generates the image 104, are operated simultaneously. The image 112 is generated from the captured image 91 by associating the horizontal component of the detected user movement with the amount of change in cell remapping, associating the vertical component with the speed at which the cell positions change, and further dividing the image into a number of divisions based on the detected user movement.

なお、図４および図５においては、説明のため、ＬＥＤライトを装着したユーザの手がカメラ４２近傍正面にある場合の映像を例として説明したが、プレイ中に撮像される映像には、通常、ユーザの頭部、上半身などを含む、ユーザの身体の多くの部分が含まれることは言うまでもない。 Note that, for the sake of explanation, FIGS. 4 and 5 use as an example video in which the hand of the user wearing the LED light is directly in front of and close to the camera 42. Needless to say, the video captured during play normally includes many parts of the user's body, including the head and upper body.

図３の説明に戻る。 Returning to the description of FIG. 3.

効果音生成部６５は、動き検出部６２から供給されたユーザの動きの検出結果、および、フットセンサ入力検出部６６から供給されたユーザのフットセンサ４３−１乃至４３−８への操作入力を示す信号を基に、伴奏生成部６７から供給された伴奏の音声データに、所定のエフェクトを施したり、例えば、楽器の音などを含む効果音を付与して、その結果得られる音声データを、セルフムービー記憶部６８に供給する。効果音生成部６５は、伴奏生成部６７から、現在再生中の伴奏のコード、現在のテンション変数、または、現在再生中のセクションなどを示す情報を得て、これらを基に、効果音を生成したり、エフェクトを施す処理を実行する。 Based on the detection result of the user's movement supplied from the motion detection unit 62 and the signal indicating the user's operation input to the foot sensors 43-1 to 43-8 supplied from the foot sensor input detection unit 66, the sound effect generation unit 65 applies a predetermined effect to the accompaniment audio data supplied from the accompaniment generation unit 67, or adds a sound effect including, for example, the sound of a musical instrument, and supplies the resulting audio data to the self-movie storage unit 68. The sound effect generation unit 65 obtains from the accompaniment generation unit 67 information indicating the chord of the accompaniment currently being played, the current tension variable, the section currently being played, or the like, and generates sound effects or applies effects based on this information.

ここでは、効果音生成に関して、フットセンサ４３−１乃至４３−８にはそれぞれに独立した効果音の付加が対応付けられており、効果音のパラメータなどは、動き検出部６２により検出されるユーザの動き量により決定されるようになされているものとして説明する。また、動き検出部６２により検出されるユーザの動き量に対して、各種パラメータがリニアに変化されるようにしても良いし、動き検出部６２により検出されるユーザの動き量を所定の量子化ステップで量子化して、各種パラメータに関連つけるようにしても良い。 Here, regarding sound effect generation, the description assumes that each of the foot sensors 43-1 to 43-8 is associated with the addition of an independent sound effect, and that the parameters of each sound effect are determined by the amount of user movement detected by the motion detection unit 62. The various parameters may be changed linearly with respect to the amount of user movement detected by the motion detection unit 62, or the detected amount of user movement may be quantized with a predetermined quantization step and then associated with the various parameters.
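The two mapping options described here, linear and quantized, can be contrasted in a short sketch. The normalized motion range of 0 to 1, the parameter range arguments, and the function names are illustrative assumptions, not values taken from the specification.

```python
def clamp01(x: float) -> float:
    """Clamp a value into the assumed normalized range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def linear_param(motion: float, lo: float, hi: float) -> float:
    """Parameter that changes linearly with the motion amount."""
    return lo + (hi - lo) * clamp01(motion)


def quantized_param(motion: float, lo: float, hi: float, steps: int) -> float:
    """Parameter that takes one of `steps` evenly spaced values (steps >= 2).

    The motion amount is first quantized to an integer level and the level
    is then mapped onto the parameter range.
    """
    level = min(int(clamp01(motion) * steps), steps - 1)
    return lo + (hi - lo) * level / (steps - 1)


if __name__ == "__main__":
    for m in (0.0, 0.3, 0.5, 1.0):
        print(m, linear_param(m, 0, 10), quantized_param(m, 0, 10, 5))
```

With quantization, small jitters in the detected movement do not change the parameter, which can make the resulting sound steadier than the linear mapping.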

例えば、フットセンサ４３−１が操作されたとき、効果音生成部６５は、供給された伴奏にノイズを付加する。そのとき、検出されたユーザの動き量が、ホワイトノイズのバンドパスフィルタの周波数に対応付けられ、例えば、ユーザの単位時間あたりの動きの検出量が増加するとフィルタのカットオフ周波数が上昇し、動きの検出量が減少すると、フィルタのカットオフ周波数は、ゆるやかに初期値に戻るようになされている。 For example, when the foot sensor 43-1 is operated, the sound effect generating unit 65 adds noise to the supplied accompaniment. At that time, the detected amount of movement of the user is associated with the frequency of the bandpass filter of white noise. For example, when the amount of detected movement of the user per unit time increases, the cutoff frequency of the filter increases, and the movement When the detection amount decreases, the cutoff frequency of the filter gradually returns to the initial value.
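The behavior of the cutoff frequency (rising with movement, then drifting back to its initial value) could be modeled roughly as below. This is a sketch under stated assumptions: the class name `CutoffFollower`, the frequency limits, and the per-update release factor are hypothetical, and the actual filter design is not specified in the text.

```python
class CutoffFollower:
    """Track a filter cutoff that rises with motion and decays slowly.

    All numeric defaults are illustrative assumptions.
    """

    def __init__(self, initial_hz: float = 500.0, max_hz: float = 8000.0,
                 release: float = 0.05) -> None:
        self.initial_hz = initial_hz
        self.max_hz = max_hz
        self.release = release          # fraction of the gap closed per update
        self.cutoff_hz = initial_hz

    def update(self, motion: float) -> float:
        """Advance one time step with motion normalized to [0, 1]."""
        m = min(max(motion, 0.0), 1.0)
        target = self.initial_hz + (self.max_hz - self.initial_hz) * m
        if target > self.cutoff_hz:
            self.cutoff_hz = target     # more motion: cutoff rises at once
        else:
            # less motion: move only part of the way back toward the target
            self.cutoff_hz += (target - self.cutoff_hz) * self.release
        return self.cutoff_hz


if __name__ == "__main__":
    follower = CutoffFollower()
    print(follower.update(1.0))         # jumps to the maximum
    for _ in range(3):
        print(follower.update(0.0))     # decays gradually toward 500 Hz
```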

また、例えば、フットセンサ４３−２が操作されたとき、効果音生成部６５は、供給された伴奏に所定のサンプル音声を付加する。そのとき、検出されたユーザの横方向の動きの大きさが、サンプル中の再生位置に対応付けられ、その動きの早さが、サンプルの再生速度に対応付けられる。したがって、ユーザは、ＬＥＤライトを把持または装着した手を左右に振ることによって、サンプルをレコードのように、前後に再生させることができ、更に、手の左右の動作のスピードによって再生スピードをコントロールすることができる。なお、付与されるサンプルは予め内部に保存されているサンプル集から、ランダムでロードされるようになされている。 Further, for example, when the foot sensor 43-2 is operated, the sound effect generating unit 65 adds a predetermined sample sound to the supplied accompaniment. At that time, the detected magnitude of the horizontal movement of the user is associated with the reproduction position in the sample, and the speed of the movement is associated with the reproduction speed of the sample. Therefore, the user can play the sample back and forth like a record by shaking the hand holding or wearing the LED light to the left and right, and further control the playback speed according to the speed of the left and right movements of the hand. be able to. The given sample is randomly loaded from a sample collection stored in advance.

例えば、フットセンサ４３−３が操作されたとき、効果音生成部６５は、所定のステップ数（例えば、１６ステップ）のパターンを基に、検出されたユーザの横方向の動きの大きさに基づいて、現在出力されている伴奏のコードで規定される音階のうちのいずれかの音程を選んで伴奏に付加して出力する。したがって、ユーザは、ＬＥＤライトを把持または装着した手を左右に振ることによって、常にコード進行に乗った（伴奏に対して不協和音とならない）メロディーを仮想的に演奏する（仮想的に楽器を演奏する）ことができる。 For example, when the foot sensor 43-3 is operated, the sound effect generation unit 65 uses a pattern with a predetermined number of steps (for example, 16 steps) and, based on the detected magnitude of the user's horizontal movement, selects one of the pitches in the scale defined by the chord of the accompaniment currently being output, adds it to the accompaniment, and outputs the result. Therefore, by waving the hand holding or wearing the LED light left and right, the user can virtually play a melody that always follows the chord progression (and is never dissonant with the accompaniment), that is, virtually play an instrument.
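Selecting a pitch from the current chord's scale by horizontal movement might look like the following sketch. Representing the scale as a list of MIDI note numbers, normalizing the horizontal movement to 0 to 1, and the name `pick_pitch` are assumptions for illustration; the 16-step pattern mentioned in the text is not modeled here.

```python
from typing import Sequence


def pick_pitch(lateral: float, scale: Sequence[int]) -> int:
    """Choose one pitch of the given scale from a horizontal movement amount.

    lateral is assumed to be normalized to [0, 1]; scale is assumed to be a
    non-empty ascending list of MIDI note numbers allowed by the current chord.
    """
    m = min(max(lateral, 0.0), 1.0)
    index = min(int(m * len(scale)), len(scale) - 1)
    return scale[index]


if __name__ == "__main__":
    # Hypothetical scale for a C major chord: C, D, E, G, A above middle C.
    c_scale = [60, 62, 64, 67, 69]
    for x in (0.0, 0.5, 1.0):
        print(x, pick_pitch(x, c_scale))
```

Because every candidate comes from the scale of the current chord, any movement yields a note that fits the accompaniment, which matches the stated aim of avoiding dissonance.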

例えば、フットセンサ４３−４が操作されたとき、効果音生成部６５は、検出されたユーザの横方向の動きの大きさに基づいて、現在出力されている伴奏のコードで規定される音階のうちのいずれかの音程を選び、検出されたユーザの縦方向の動きの大きさに基づいて、音色（ＦＭ量）を選択して、伴奏に付加して出力する。したがって、ユーザは、ＬＥＤライトを把持または装着した手を左右に振ることによって、常にコード進行に乗った（伴奏に対して不協和音とならない）メロディーを、手の上下動によって変化される音色で、仮想的に演奏する（仮想的に楽器を演奏する）ことができる。 For example, when the foot sensor 43-4 is operated, the sound effect generation unit 65 selects one of the pitches in the scale defined by the chord of the accompaniment currently being output based on the detected magnitude of the user's horizontal movement, selects a timbre (FM amount) based on the detected magnitude of the user's vertical movement, adds the result to the accompaniment, and outputs it. Therefore, by waving the hand holding or wearing the LED light left and right, the user can virtually play a melody that always follows the chord progression (and is never dissonant with the accompaniment), with a timbre that changes as the hand moves up and down.

また、例えば、フットセンサ４３−５が操作されたとき、効果音生成部６５は、フィルタがかかったホワイトノイズを常に変化するパターンで出力させることにより、レコードをスクラッチしたような音を伴奏に付加して出力する。検出されたユーザの横方向の動きの大きさによって、フィルタのカットオフがコントロールされるようになされている。 Also, for example, when the foot sensor 43-5 is operated, the sound effect generating unit 65 adds a sound like a scratched record to the accompaniment by outputting the filtered white noise in a constantly changing pattern. And output. The cutoff of the filter is controlled according to the detected magnitude of the lateral movement of the user.

例えば、フットセンサ４３−６が操作されたとき、効果音生成部６５は、所定のステップ数（例えば、１６ステップ）のパターンを基に、検出されたユーザの横方向の動きの大きさに基づいて、現在出力されている伴奏のコードで規定される音階のうちのいずれかの音程を選び、検出されたユーザの縦方向の動きの大きさに基づいて、時間軸に対する音の出現密度を変更させて、伴奏に付加して出力する。したがって、ユーザは、ＬＥＤライトを把持または装着した手を左右に振ることによって、常にコード進行に乗った（伴奏に対して不協和音とならない）メロディーを、手の上下動によって変化される演奏密度で、演奏する（仮想的に楽器を演奏する）ことができる。 For example, when the foot sensor 43-6 is operated, the sound effect generating unit 65 is based on the detected magnitude of the lateral movement of the user based on a pattern having a predetermined number of steps (for example, 16 steps). Select one of the scales defined by the currently output accompaniment chord, and change the sound appearance density with respect to the time axis based on the detected vertical movement of the user And add it to the accompaniment for output. Therefore, the user can always play the chord progression (does not become dissonant with the accompaniment) by swinging the hand holding or wearing the LED light left and right, with a performance density that is changed by the vertical movement of the hand, You can play (virtually play a musical instrument).

例えば、フットセンサ４３−７が操作されたとき、効果音生成部６５は、供給された伴奏に歌声の断片（一言二言の人間の声）のサンプル音声を付加する。そのとき、検出されたユーザの横方向の動きの大きさに基づいて、現在出力されている伴奏のコードで規定される音階のうちのいずれかの音程が選択される。したがって、したがって、ユーザは、ＬＥＤライトを把持または装着した手を左右に振ることによって、常にコード進行に乗った（伴奏に対して不協和音とならない）メロディーの歌声を付加することができる。なお、付与される歌声のサンプルは予め内部に保存されているサンプル集から、ランダムでロードされるようになされている。 For example, when the foot sensor 43-7 is operated, the sound effect generation unit 65 adds a sample of a singing-voice fragment (a word or two of a human voice) to the supplied accompaniment. At that time, one of the pitches in the scale defined by the chord of the accompaniment currently being output is selected based on the detected magnitude of the user's horizontal movement. Therefore, by waving the hand holding or wearing the LED light left and right, the user can add a singing voice whose melody always follows the chord progression (and is never dissonant with the accompaniment). Note that the singing-voice sample to be added is loaded at random from a sample collection stored internally in advance.

例えば、フットセンサ４３−８が操作されたとき、効果音生成部６５は、伴奏生成部６７から供給された伴奏の現在のテンション変数を、検出されたユーザの動きの大きさに基づいて変更する。テンション変数により、伴奏を構成する複数の楽器の音域の広さがコントロールされるので、例えば、ユーザの単位時間あたりの動きの検出量が増加すると、伴奏の音域が広がって、生成される音が明るくなり、動きの検出量が減少すると、伴奏の音域が狭くなり、生成される音が落ち着いた雰囲気のものになるようになされている。 For example, when the foot sensor 43-8 is operated, the sound effect generation unit 65 changes the current tension variable of the accompaniment supplied from the accompaniment generation unit 67 based on the detected magnitude of the user's movement. The tension variable controls the width of the pitch range of the instruments that make up the accompaniment. Thus, for example, when the detected amount of user movement per unit time increases, the pitch range of the accompaniment widens and the generated sound becomes brighter, and when the detected amount of movement decreases, the pitch range narrows and the generated sound takes on a calmer mood.

なお、ここでは、自動作成された伴奏に、ユーザによるフットセンサ４３−１乃至４３−８の操作およびユーザの動き量の検出値に基づいて作成される音を加えることにより、ユーザが、そのパフォーマンスによって、伴奏に合わせて所定の効果音を付与したり、仮想的な演奏を行ったりすることができるようになされているが、例えば、自動作成された伴奏、または、ランダムに選択された、予め記憶されているサンプリング音源に対して、ユーザによるフットセンサ４３−１乃至４３−８の操作およびユーザの動き量の検出値に基づいて、各種エフェクト処理を行うようにしてもよい。 In addition, here, by adding sound created based on the user's operation of the foot sensors 43-1 to 43-8 and the detected value of the amount of movement of the user to the automatically created accompaniment, the user can perform the performance. Can give a predetermined sound effect according to the accompaniment or perform a virtual performance. For example, an automatically created accompaniment or a randomly selected pre- Various effect processes may be performed on the stored sampling sound source based on the operation of the foot sensors 43-1 to 43-8 by the user and the detected value of the amount of movement of the user.

また、画像処理における場合と同様にして、同時に複数のフットセンサが操作されたとき、それぞれのフットセンサの操作入力により付与される効果音などが同時に出力されるものとしても良い。 Similarly to the case of image processing, when a plurality of foot sensors are operated at the same time, sound effects given by operation inputs of the respective foot sensors may be output simultaneously.

また、所定のフットセンサを操作した場合に実行される、撮像された映像へのエフェクトの付与と、伴奏に付与される効果音などとは、関連性があるようにしても良い。例えば、上述した例によると、フットセンサ４３−５が操作されたとき、映像効果付与処理部６４は、撮像された画像９１を所定のスケールに縮小し、所定の角度に回転させて、万華鏡のような効果が与えられた画像１０５を生成し、効果音生成部６５は、フィルタがかかったホワイトノイズを常に変化するパターンで出力させることにより、レコードをスクラッチしたような音を伴奏に付加して出力する。すなわち、ユーザは、フットセンサ４３−５を操作したとき、出力される映像および音声から「回転」を感じることができる。 In addition, the effect applied to the captured video and the sound effect or the like added to the accompaniment when a given foot sensor is operated may be related to each other. For example, in the example described above, when the foot sensor 43-5 is operated, the video effect imparting processing unit 64 reduces the captured image 91 to a predetermined scale and rotates it by a predetermined angle to generate the image 105 with a kaleidoscope-like effect, while the sound effect generation unit 65 outputs filtered white noise in a constantly changing pattern, thereby adding a sound like a record being scratched to the accompaniment. That is, when the user operates the foot sensor 43-5, the user can sense "rotation" from both the output video and the output audio.

また、フットセンサ４３−２が操作されたとき、映像効果付与処理部６４は、撮像された画像９１を複数のセルに分割し、それぞれのセルをリマッピングすることにより、ジグソーパズルのような効果が与えられた画像１０２を生成し、効果音生成部６５は、供給された伴奏に所定のサンプル音声を付加するようになされている。このとき、映像効果付与処理部６４は、検出されたユーザの動きのうち、横方向の動きを、セルのリマッピングの変化量に対応付け、縦方向の動きを、セルの位置変化のスピードに対応付けるので、例えば、検出された動きの量が大きいほど、セルの位置変化のスピード、変化量が増すようになされている。そして、効果音生成部６５は、検出されたユーザの横方向の動きの大きさを、サンプル中の再生位置に対応付け、動きの早さを、サンプルの再生速度に対応付けるようになされている。すなわち、ユーザは、フットセンサ４３−２を操作したとき、動きの大きさとその速さによって、出力される映像および音声の「単位時間当たりの変化量」を操作していることを感じ、映像と音声の変化のスピード感を楽しむことができる。 When the foot sensor 43-2 is operated, the video effect imparting processing unit 64 divides the captured image 91 into a plurality of cells, and remaps each cell, thereby providing an effect like a jigsaw puzzle. The given image 102 is generated, and the sound effect generator 65 adds a predetermined sample sound to the supplied accompaniment. At this time, the video effect imparting processing unit 64 associates the horizontal movement of the detected user movement with the amount of change in cell remapping, and the vertical movement is used as the cell position change speed. Therefore, for example, the greater the amount of motion detected, the greater the speed and amount of change in cell position. The sound effect generation unit 65 associates the detected magnitude of the horizontal movement of the user with the reproduction position in the sample, and associates the speed of movement with the reproduction speed of the sample. That is, when the user operates the foot sensor 43-2, the user feels that he / she is operating the “change amount per unit time” of the output video and audio according to the magnitude and speed of the movement. You can enjoy the speed of change in audio.

このようにして、撮像された映像へのエフェクトの付与と、伴奏に付与される効果音などとで、関連性（処理の関連性のみならず、ユーザに与える感覚的な関連性であっても良い）を有するものを、同一のフットセンサの操作に対応付けるようにすることにより、ユーザは、容易に「ユーザのイメージ」に合致したコンテンツを作成することができる。 In this way, by associating with the operation of the same foot sensor a video effect and a sound effect or the like that are related to each other (the relationship may be not only one of processing but also a sensory relationship perceived by the user), the user can easily create content that matches what the user has in mind.

次に、セルフムービー記憶部６８は、映像効果付与処理部６４から供給されたエフェクト付与済みの映像データと、効果音生成部６５から供給された効果音付与済みの音声データとをバッファリングして、エフェクト付与済みの映像データを映像表示制御部６９に供給し、効果音付与済みの音声データを音声出力制御部７０に出力するとともに、エフェクト付与済みの映像データと効果音付与済みの音声データとにより構成されるセルフムービーを所定のフォーマットでエンコードし、ネットワークインターフェース７１に供給する。 Next, the self-movie storage unit 68 buffers the video data with effects applied, supplied from the video effect imparting processing unit 64, and the audio data with sound effects added, supplied from the sound effect generation unit 65. It supplies the video data to the video display control unit 69 and outputs the audio data to the audio output control unit 70, and also encodes the self-movie composed of this video data and audio data in a predetermined format and supplies it to the network interface 71.

映像表示制御部６９は、セルフムービー記憶部６８から供給される映像データのモニタ４４への表示を制御する。音声出力制御部７０は、セルフムービー記憶部６８から供給される音声データの、スピーカ４５−１乃至４５−６からの音声出力を制御する。 The video display control unit 69 controls display of video data supplied from the self-movie storage unit 68 on the monitor 44. The audio output control unit 70 controls the audio output from the speakers 45-1 to 45-6 of the audio data supplied from the self-movie storage unit 68.

ネットワークインターフェース７１は、エンコードされた映像と音声から構成されるセルフムービーと、セルフムービーを固有に区別可能な情報（例えば、操作入力部４１により入力されたユーザＩＤなどの情報を用いるようにしても良いし、または、現在時刻および自分自身の機器ＩＤなどを基に、セルフムービーＩＤを生成するようにしてもよい）を、ネットワーク１を介して、セルフムービー配信サーバ３に送信し、ユーザがセルフムービー配信サーバ３にセルフムービーのダウンロード要求を行う場合にアクセスするウェブページのＵＲＬなどの情報をユーザが保有する携帯型電話機４などに送信する。 The network interface 71 transmits, via the network 1 to the self-movie distribution server 3, the self-movie composed of the encoded video and audio together with information that uniquely identifies the self-movie (for example, information such as a user ID input through the operation input unit 41 may be used, or a self-movie ID may be generated based on the current time, the device's own device ID, and the like). It also transmits, to the mobile phone 4 or the like held by the user, information such as the URL of the web page that the user accesses when requesting a download of the self-movie from the self-movie distribution server 3.
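One way to derive a self-movie ID from the current time and the device ID is sketched below. The ID layout, the use of UTC, and the function name `make_self_movie_id` are assumptions for illustration only; the specification does not define a format.

```python
from datetime import datetime, timezone


def make_self_movie_id(device_id: str, now: datetime) -> str:
    """Build an identifier from a device ID and a timestamp.

    The "<device>-<UTC timestamp with microseconds>" layout is a hypothetical
    choice. `now` must be a timezone-aware datetime.
    """
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S%f")
    return f"{device_id}-{stamp}"


if __name__ == "__main__":
    print(make_self_movie_id("booth01", datetime.now(timezone.utc)))
```

An ID of this kind is unique only as long as one device does not produce two self-movies within the same microsecond, which holds trivially for recordings that each last tens of seconds.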

また、セルフムービー作成装置２においては、ユーザが、自分の動作と、映像、音声、および照明などに与えられるエフェクトとの関係をある程度理解することができるように、ユーザからプレイ開始が指令されてから、例えば、３０秒間や１分間など、所定の時間を、ユーザの動作の検出結果およびフットセンサ４３−１乃至４３−８への操作入力を基に、撮像された映像にエフェクトが施されてモニタ４４に映像が表示され、生成された伴奏に効果音が付与されてスピーカ４５−１乃至４５−６から音声が出力されるが、セルフムービーとしては記憶されない「練習モード」とし、その後、更に、３０秒間や１分間など、所定の時間を、エフェクトが施された映像データが表示され、効果音が付与された音声データが音声出力されるとともに、セルフムービーとして記憶され、ネットワーク１を介して、セルフムービー作成サーバ３に供給される「セルフムービーモード」とすることができる。 In addition, so that the user can understand to some extent the relationship between his or her own actions and the effects applied to the video, audio, lighting, and so on, the self-movie creating apparatus 2 can provide a "practice mode" for a predetermined time, such as 30 seconds or 1 minute, after the user instructs the start of play. In this mode, effects are applied to the captured video based on the detection result of the user's movement and the operation input to the foot sensors 43-1 to 43-8 and the video is displayed on the monitor 44, and sound effects are added to the generated accompaniment and the audio is output from the speakers 45-1 to 45-6, but nothing is stored as a self-movie. This can then be followed by a "self-movie mode" for a further predetermined time, such as 30 seconds or 1 minute, in which the video data with effects is displayed and the audio data with sound effects is output, and in which both are also stored as a self-movie and supplied to the self-movie creation server 3 via the network 1.
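The two-phase timing (practice first, then recording) reduces to a simple function of elapsed time, sketched below. The 30-second defaults follow the example durations in the text; the function name `mode_at`, the mode labels, and the "finished" state after recording are assumptions.

```python
def mode_at(elapsed_s: float, practice_s: float = 30.0,
            record_s: float = 30.0) -> str:
    """Return the mode that applies `elapsed_s` seconds after play starts.

    "practice": effects are shown and heard but nothing is stored.
    "self_movie": output is shown, heard, and also stored.
    "finished": hypothetical label for the time after recording ends.
    """
    if elapsed_s < 0:
        raise ValueError("elapsed_s must be non-negative")
    if elapsed_s < practice_s:
        return "practice"
    if elapsed_s < practice_s + record_s:
        return "self_movie"
    return "finished"


if __name__ == "__main__":
    for t in (0, 29.9, 30, 59.9, 60):
        print(t, mode_at(t))
```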

ここで、「練習モード」として設定される時間および「セルフムービーモード」として設定される時間は、それぞれ、適宜設定可能な時間であることは言うまでもない。例えば、「練習モード」として設定される時間が長ければ、ユーザは、自分の動作と、映像、音声、および照明などに与えられるエフェクトとの関係をよりよく理解することができ、「練習モード」として設定される時間が短ければ、セルフムービー作成における意外性が増すという効果が発生する。 Here, it is needless to say that the time set as the “practice mode” and the time set as the “self-movie mode” can be set as appropriate. For example, if the time set as the “practice mode” is long, the user can better understand the relationship between his / her actions and effects given to video, audio, lighting, and the like. If the time set as is short, there is an effect that the unexpectedness in the self-movie creation increases.

また、以上の説明においては、作成されたセルフムービーは、ネットワーク１を介して、セルフムービー配信サーバ３に送信されるものとして説明したが、作成されたセルフムービーを、ユーザが保有する携帯型電話機４またはＰＤＡ５などに、例えば、Bluetoothなどの無線通信を利用して、または、ネットワーク１を介して、電子メールなどのシステムを利用して、セルフムービー配信サーバ３を介することなく、直接、ユーザに供給されるようにしても良い。 In the above description, the created self-movie is transmitted to the self-movie distribution server 3 via the network 1. However, the created self-movie may instead be supplied directly to the user, without going through the self-movie distribution server 3, by sending it to the mobile phone 4, the PDA 5, or the like held by the user, for example by wireless communication such as Bluetooth or by a system such as e-mail via the network 1.

図６は、セルフムービー配信サーバ３の構成を示すブロック図である。 FIG. 6 is a block diagram showing a configuration of the self-movie distribution server 3.

セルフムービー配信サーバ３のＣＰＵ（Central Processing Unit）２１１は、セルフムービー配信サーバ３の動作の全体を制御する。また、ＣＰＵ２１１は、内部バス２１３および入出力インターフェース２１２を介して、マウス２３１やキーボード２３２などからなる入力部２１４から、セルフムービー配信サーバ３の管理者による操作入力が入力されると、それに対応してＲＯＭ（Read Only Memory）２１５に格納されているプログラムをＲＡＭ（Random Access Memory）２１６にロードして実行する。あるいはまた、ＣＰＵ２１１は、ＨＤＤ２１８にインストールされたプログラムをＲＡＭ２１６にロードして実行し、ディスプレイ２３３やスピーカ２３４などの出力部２１７に実行結果を出力させる。更に、ＣＰＵ２１１は、ネットワークインターフェース２２０を制御して、外部と通信し、データの授受を実行する。 A CPU (Central Processing Unit) 211 of the self-movie distribution server 3 controls the overall operation of the self-movie distribution server 3. Further, the CPU 211 responds to an operation input by the administrator of the self-movie distribution server 3 from the input unit 214 including the mouse 231 and the keyboard 232 via the internal bus 213 and the input / output interface 212. Then, a program stored in a ROM (Read Only Memory) 215 is loaded into a RAM (Random Access Memory) 216 and executed. Alternatively, the CPU 211 loads a program installed in the HDD 218 to the RAM 216 and executes it, and causes the output unit 217 such as the display 233 and the speaker 234 to output the execution result. Further, the CPU 211 controls the network interface 220, communicates with the outside, and executes data exchange.

また、ＨＤＤ２１８は、セルフムービー作成装置２からネットワークを介して送信されたセルフムービーを、セルフムービー固有の情報と対応つけて記憶し、更に、必要に応じて、認証処理または課金処理に必要となるユーザの個人情報などを記憶する。 The HDD 218 stores the self-movie transmitted from the self-movie creation device 2 via the network in association with the information specific to the self-movie, and is further required for authentication processing or billing processing as necessary. The personal information of the user is stored.

また、ＣＰＵ２１１は、内部バス２１３および入出力インターフェース２１２を介して、必要に応じてドライブ２１９と接続され、ドライブ２１９に必要に応じて装着された磁気ディスク２２１、光ディスク２２２、光磁気ディスク２２３、または半導体メモリ２２４と情報を授受することができるようになされている。 The CPU 211 is connected to the drive 219 as necessary via the internal bus 213 and the input / output interface 212, and the magnetic disk 221, the optical disk 222, the magneto-optical disk 223, or the like mounted on the drive 219 as necessary. Information can be exchanged with the semiconductor memory 224.

更に、ＣＰＵ２１１は、内部バス２１３を介して、画像処理部２４１、音声処理部２４２、課金処理部２４３、および、認証処理部２４４と接続されている。 Further, the CPU 211 is connected to the image processing unit 241, the audio processing unit 242, the billing processing unit 243, and the authentication processing unit 244 via the internal bus 213.

画像処理部２４１は、ＣＰＵ２１１の制御に基づいて、ユーザからダウンロードが要求されているセルフムービーの映像データをＨＤＤ２１８から読み出して、セルフムービーの配信先となる機器（例えば、携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６、または、ホームサーバ７）の種類に応じて、映像フォーマットや解像度などを変換する。 Based on the control of the CPU 211, the image processing unit 241 reads out video data of a self-movie that is requested to be downloaded by the user from the HDD 218, and a device (for example, the mobile phone 4, the PDA 5, Depending on the type of personal computer 6 or home server 7), the video format, resolution, etc. are converted.

音声処理部２４２は、ＣＰＵ２１１の制御に基づいて、ユーザからダウンロードが要求されているセルフムービーの音声データをＨＤＤ２１８から読み出して、セルフムービーの配信先となる機器（例えば、携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６、または、ホームサーバ７）の種類に応じて、音声フォーマットやチャンネル数などを変換する。 Based on the control of the CPU 211, the audio processing unit 242 reads out the audio data of the self-movie requested to be downloaded from the user from the HDD 218, and the device (for example, the mobile phone 4, PDA 5, Depending on the type of the personal computer 6 or the home server 7), the audio format, the number of channels, and the like are converted.

例えば、セルフムービー作成装置２から送信されたセルフムービーのフォーマットが、３ＧＰＰ（3rd Generation Partnership Project）など、携帯型電話機４、ＰＤＡ５，パーソナルコンピュータ６、その他の機器において、共通に視聴可能なフォーマットである場合、セルフムービー配信サーバ３は、ユーザに要求されたセルフムービーをＨＤＤ２１８から読み出して、データの変換を行うことなく送信するようにしても良い。しかしながら、セルフムービーの配信先となる機器によって、処理可能なデータのフォーマット形式、表示可能な映像の解像度、音声データを出力するスピーカシステム（例えば、モノラル、ステレオ、５．１chなど）などが異なるので、セルフムービー作成装置２から送信されたセルフムービーのフォーマットが、送信先の機器におけるセルフムービーの視聴に適さないものである場合、画像処理部２４１および音声処理部２４２は、ＣＰＵ２１１の制御に基づいて、セルフムービー作成装置２から送信されたセルフムービーを、出力先の機器において視聴可能な形式に変換する処理を実行する。 For example, the format of the self-movie transmitted from the self-movie creating apparatus 2 is a format that can be commonly viewed on the mobile phone 4, the PDA 5, the personal computer 6, and other devices such as 3GPP (3rd Generation Partnership Project). In this case, the self-movie distribution server 3 may read the self-movie requested by the user from the HDD 218 and transmit it without performing data conversion. However, the format of data that can be processed, the resolution of video that can be displayed, the speaker system that outputs audio data (for example, monaural, stereo, 5.1ch, etc.) differ depending on the device to which the self-movie is distributed. When the format of the self-movie transmitted from the self-movie creating apparatus 2 is not suitable for viewing the self-movie on the destination device, the image processing unit 241 and the audio processing unit 242 are based on the control of the CPU 211. Then, a process of converting the self-movie transmitted from the self-movie creating apparatus 2 into a format that can be viewed on the output destination device is executed.
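The decision of whether to convert before delivery can be sketched as a capability check. The dictionary keys, the example capability values, and the name `needs_conversion` are hypothetical; the specification only states that format, resolution, and speaker configuration may differ between devices.

```python
def needs_conversion(movie: dict, device: dict) -> bool:
    """Return True if the stored self-movie does not suit the target device.

    Hypothetical keys: movie has "format", "height", "channels";
    device has "formats", "max_height", "max_channels".
    """
    return (
        movie["format"] not in device["formats"]      # container or codec
        or movie["height"] > device["max_height"]     # display resolution
        or movie["channels"] > device["max_channels"] # speaker system
    )


if __name__ == "__main__":
    stored = {"format": "3gpp", "height": 144, "channels": 1}
    phone = {"formats": {"3gpp"}, "max_height": 240, "max_channels": 2}
    print(needs_conversion(stored, phone))
```

When the check returns False, the stored file can be sent as-is, which corresponds to the case in the text where a commonly playable format such as 3GPP is delivered without conversion.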

課金処理部２４３は、ＣＰＵ２１１の制御に基づいて、必要に応じて、ＨＤＤ２１８に記憶されているユーザの個人情報を参照したり、セルフムービーを要求するユーザから送信された課金に必要な情報を基にして、他の金融機関（例えば、クレジットカードや電子マネーシステムのサービス提供者）のサーバにアクセスすることなどにより、課金処理を実行する。 Based on the control of the CPU 211, the charging processing unit 243 refers to the user's personal information stored in the HDD 218 as necessary, or based on information necessary for charging transmitted from the user requesting the self-movie. Then, the accounting process is executed by accessing a server of another financial institution (for example, a credit card or electronic money system service provider).

認証処理部２４４は、ＣＰＵ２１１の制御に基づいて、必要に応じて、ＨＤＤ２１８に記憶されているユーザの個人情報を参照し、セルフムービーを要求するユーザから送信された認証に必要な情報を用いて、認証処理を実行する。 Based on the control of the CPU 211, the authentication processing unit 244 refers to the personal information of the user stored in the HDD 218 as necessary, and uses information necessary for authentication transmitted from the user requesting the self-movie. Execute the authentication process.

次に、図７のフローチャートを参照して、図３を用いて説明したセルフムービー作成装置２が実行する、セルフムービー作成処理１について説明する。 Next, the self-movie creating process 1 executed by the self-movie creating apparatus 2 described with reference to FIG. 3 will be described with reference to the flowchart of FIG.

ここでは、ユーザからプレイ開始が指令されてから、例えば、３０秒間や１分間など、所定の時間においては、エフェクトが施された映像データおよび効果音が付与された音声データが、モニタ４４に表示され、スピーカ４５−１乃至４５−６から出力されるが、セルフムービーとしては記憶されない、「練習モード」とし、その後、３０秒間や１分間など、所定の時間において、エフェクトが施された映像データおよび効果音が付与された音声データが、モニタ４４に表示され、スピーカ４５−１乃至４５−６から出力されるとともに、セルフムービーとして記憶され、ネットワーク１を介して、セルフムービー作成サーバ３に供給される、「セルフムービーモード」となるものとして説明する。 Here, for example, for a predetermined time such as 30 seconds or 1 minute after the start of play is instructed by the user, the video data with the effect and the audio data with the sound effect are displayed on the monitor 44. Is output from the speakers 45-1 to 45-6, but is not stored as a self-movie, and is set to “practice mode”, and then video data on which an effect has been applied for a predetermined time such as 30 seconds or 1 minute. And the sound data to which the sound effect is added are displayed on the monitor 44, output from the speakers 45-1 to 45-6, stored as a self-movie, and supplied to the self-movie creation server 3 via the network 1. In the following description, the “self movie mode” is set.

ステップＳ１において、受信部５１は、例えば、携帯電話機４やＰＤＡ５などの、ユーザが有する携帯型の機器から、赤外線通信などにより、テキストデータの入力を受けたか否かを判断する。 In step S 1, the receiving unit 51 determines whether or not text data is input from a portable device possessed by the user, such as the mobile phone 4 or the PDA 5, by infrared communication or the like.

ステップＳ１において、テキストデータの入力を受けたと判断された場合、ステップＳ２において、受信部５１は、入力されたテキストデータを映像効果付与処理部６４に供給する。映像効果付与処理部６４は、入力されたテキストデータを映像に重畳可能な、例えば、ビットマップなどの画像データに変換し、内部に有する図示しないページバッファなどに一時保存する。 If it is determined in step S1 that text data has been input, in step S2, the receiving unit 51 supplies the input text data to the video effect applying processing unit 64. The video effect imparting processing unit 64 converts the input text data into image data such as a bitmap that can be superimposed on the video, and temporarily stores it in a page buffer (not shown) or the like included therein.

If it is determined in step S1 that no text data has been input, or after the processing of step S2 ends, in step S3 the operation input unit 41 determines whether an operation input for starting play has been received from the user. If it is determined in step S3 that no play start operation input has been received, the processing of step S3 is repeated until such an input is determined to have been received.

If it is determined in step S3 that an operation input for starting play has been received, the operation input unit 41 supplies a control signal indicating the start of play to the image processing unit 61, the video effect applying processing unit 64, the foot sensor input detection unit 66, and the accompaniment generation unit 67. In step S4, the accompaniment generation unit 67 then creates audio data corresponding to the accompaniment on which the self-movie's audio data is based. The accompaniment generation unit 67 may also generate the accompaniment so that, for example, the transition from the "practice mode" to the "self-movie mode" is signaled by a change in the accompaniment.

In step S5, the image processing unit 61 starts acquiring the video captured by the camera 42, applies various kinds of signal processing such as A/D conversion, gamma correction, and white balance adjustment, and supplies the result to the motion detection unit 62 and the video effect applying processing unit 64.

In step S6, based on the captured video, the motion detection unit 62 detects the user's movement, for example by tracking the movement of an LED light held or worn by the user or by detecting the magnitude of the motion vectors of the video data, and supplies the detection result to the illumination control unit 63, the video effect applying processing unit 64, and the sound effect generation unit 65.
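As one possible illustration of the LED-tracking approach in step S6, the sketch below locates the LED as the centroid of bright pixels in a grayscale frame and measures how far it moved between two frames. The source does not specify the tracking algorithm; the brightness threshold, the frame representation (rows of 0 to 255 values), and the function names are all assumptions.

```python
import math

Frame = list[list[int]]  # grayscale image as rows of 0-255 values (assumed format)


def led_position(frame: Frame, threshold: int = 200):
    """Return the centroid (x, y) of pixels at or above threshold, or None."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # no LED visible in this frame
    return (sum_x / count, sum_y / count)


def motion_magnitude(prev: Frame, curr: Frame, threshold: int = 200) -> float:
    """Distance in pixels the LED centroid moved between two frames."""
    p = led_position(prev, threshold)
    c = led_position(curr, threshold)
    if p is None or c is None:
        return 0.0  # treat a lost LED as no detectable motion
    return math.hypot(c[0] - p[0], c[1] - p[1])
```

The returned magnitude is the kind of value that could then be passed on to the illumination, video effect, and sound effect stages.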

In step S7, the illumination control unit 63 controls the illumination based on the detected user movement, for example by changing the illuminance or blinking speed of the illumination in accordance with the magnitude of the detected movement or the amount of change in that magnitude.
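The mapping from motion magnitude to illumination in step S7 could look something like the following. This is a sketch under assumed parameters: the linear mapping, the maximum magnitude, and the blink-rate range are illustrative choices, not values given in the source.

```python
def lighting_for_motion(magnitude: float,
                        max_magnitude: float = 50.0,
                        min_blink_hz: float = 0.5,
                        max_blink_hz: float = 8.0) -> tuple[float, float]:
    """Map a motion magnitude to (illuminance level in 0..1, blink rate in Hz)."""
    # Clamp the normalized magnitude so extreme motion saturates the lighting.
    level = min(max(magnitude / max_magnitude, 0.0), 1.0)
    blink_hz = min_blink_hz + level * (max_blink_hz - min_blink_hz)
    return level, blink_hz
```

Larger movements give brighter, faster-blinking lighting; passing the frame-to-frame change in magnitude instead of the magnitude itself would implement the other variant the text mentions.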

ステップＳ８において、フットセンサ入力検出部６６は、フットセンサ４３−１乃至４３−８から供給される信号を基に、いずれかのフットセンサが操作されたか否かを判断する。 In step S8, the foot sensor input detection unit 66 determines whether any of the foot sensors has been operated based on the signals supplied from the foot sensors 43-1 to 43-8.

If it is determined in step S8 that one of the foot sensors has been operated, the foot sensor input detection unit 66 supplies a signal indicating which foot sensor was operated to the illumination control unit 63, the video effect applying processing unit 64, and the sound effect generation unit 65. In step S9, based on that signal and as described with reference to FIGS. 4 and 5, the video effect applying processing unit 64 applies the video effect corresponding to the operated foot sensor, in accordance with the detected user movement, to the captured video supplied from the image processing unit 61 (with the text data supplied from the receiving unit 51 superimposed as necessary), and supplies the result to the self-movie storage unit 68.

In step S10, based on the signal supplied from the foot sensor input detection unit 66, the sound effect generation unit 65 uses the audio effect corresponding to the operated foot sensor to add sound effects as described above, in accordance with the detected user movement, to the audio data corresponding to the accompaniment supplied from the accompaniment generation unit 67. It thereby generates the audio data of the self-movie and supplies it to the self-movie storage unit 68.

In step S11, the illumination control unit 63 controls the illumination based on the signal supplied from the foot sensor input detection unit 66, for example by changing the type of illumination that is turned on according to the operated foot sensor.
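Steps S8 to S11 amount to a dispatch: one signal identifying the operated foot sensor selects a video effect, a sound effect, and a lighting type at the same time. A minimal sketch of such a lookup follows; the effect names in `BINDINGS` are placeholders invented for illustration and are not the effects of FIGS. 4 and 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorBinding:
    video_effect: str  # effect applied to the captured video (step S9)
    sound_effect: str  # effect added to the accompaniment (step S10)
    light: str         # type of illumination turned on (step S11)


# Hypothetical assignments; a real table would cover sensors 1 to 8.
BINDINGS = {
    1: SensorBinding("mosaic", "drum", "spot"),
    2: SensorBinding("afterimage", "scratch", "strobe"),
}


def dispatch(sensor_id: int, motion: float):
    """Return the effect requests for one operated foot sensor, or None."""
    binding = BINDINGS.get(sensor_id)
    if binding is None:
        return None  # unknown sensor: nothing to apply
    return {
        # The detected motion scales the video and audio effects.
        "video": (binding.video_effect, motion),
        "audio": (binding.sound_effect, motion),
        "light": binding.light,
    }
```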

Although steps S6 to S11 have been described here as processing steps executed one after another, these processes are executed in parallel and in real time.

In step S12, the self-movie storage unit 68 determines whether the predetermined time has already elapsed since the start of play. If it is determined in step S12 that the predetermined time has not elapsed, the self-movie storage unit 68 supplies the video data to the video display control unit 69 for display on the monitor 44 and supplies the audio data to the audio output control unit 70 for output from the speakers 45-1 to 45-6, but does not store them internally. The process then returns to step S6, and the subsequent processing is repeated.

If it is determined in step S12 that the predetermined time has elapsed since the start of play, in step S13 the self-movie storage unit 68 supplies the video data to the video display control unit 69 for display on the monitor 44, supplies the audio data to the audio output control unit 70 for output from the speakers 45-1 to 45-6, and also stores the video data and audio data. That is, it records the video and audio of the self-movie for transmission to the self-movie distribution server 3.

ステップＳ１４において、セルフムービー作成装置２の各部は、プレイ終了時刻となったか否かを判断する。ステップＳ１４において、プレイ終了時刻となっていないと判断された場合、処理は、ステップＳ６に戻り、それ以降の処理が繰り返される。 In step S14, each part of the self-movie creating apparatus 2 determines whether or not the play end time has come. If it is determined in step S14 that the play end time has not been reached, the process returns to step S6, and the subsequent processes are repeated.

If it is determined in step S14 that the play end time has been reached, in step S15 the self-movie storage unit 68 supplies the recorded video data and audio data, that is, the generated self-movie, to the network interface 71. The network interface 71 transmits the supplied self-movie, together with information that uniquely identifies it, to the self-movie distribution server 3 via the network 1, and the process ends.

Through this processing, the user's performance is detected in two ways, from operation inputs to the foot sensors 43-1 to 43-8 and from the amount of motion detected in the captured video. Based on the detected performance, effects are applied to the captured video and sound effects are added to the generated accompaniment, so an original self-movie with sophisticated image and audio processing can be generated. The user can express themselves in creating the self-movie and can enjoy it actively.

なお、セルフムービーを固有に区別可能な情報は、ユーザにより入力される情報であっても、セルフムービー作成装置２が、例えば、年月日時刻情報と機器ＩＤなどから作成する情報であっても良く、セルフムービー作成装置２が作成する場合は、ユーザに通知して、ダウンロードのときに利用することができるようにしても良い。 Note that the information that can uniquely distinguish the self-movie may be information input by the user, or information that the self-movie creation device 2 creates from the date / time information and the device ID, for example. In the case where the self-movie creating apparatus 2 creates, the user may be notified so that it can be used when downloading.
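For the case where the apparatus itself creates the identifying information from date and time information and a device ID, one simple scheme is sketched below. The format string and the function name `make_movie_id` are assumptions; the source only says that such information may be combined.

```python
from datetime import datetime


def make_movie_id(device_id: str, recorded_at: datetime) -> str:
    """Build an identifier from the device ID and the recording timestamp.

    Unique only as long as one device never finishes two self-movies
    within the same second (an assumption of this sketch).
    """
    return f"{device_id}-{recorded_at:%Y%m%d%H%M%S}"
```

The resulting string is short enough to show to the user, who could then enter it when requesting a download.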

Next, the processing executed by the self-movie distribution server 3 described with reference to FIG. 6 will be described with reference to the flowchart of FIG. 8.

In step S31, the CPU 211 determines whether the network interface 220 has received recorded video data and audio data, that is, a created self-movie, from any of the self-movie creating apparatuses 2.

If it is determined in step S31 that a self-movie has been received, in step S32 the CPU 211 stores the self-movie in the self-movie storage area of the HDD 218 together with the information needed to distinguish individual self-movies, such as a user ID or the information created by the self-movie creating apparatus 2.

If it is determined in step S31 that no self-movie has been received, or after the processing of step S32 ends, in step S33 the CPU 211 determines whether the network interface 220 has received a signal indicating a self-movie download request from any of the self-movie creating apparatuses 2. The signal indicating a self-movie download request includes, for example, information needed to authenticate the user, information about the download destination device, and information needed for charging. If it is determined in step S33 that no self-movie download request has been received, the process returns to step S31, and the subsequent processing is repeated.

ステップＳ３３において、セルフムービーのダウンロード要求を受けたと判断された場合、ステップＳ３４において、ＣＰＵ２１１は、認証処理部２４４を制御して、認証処理を実行させる。 If it is determined in step S33 that a self-movie download request has been received, in step S34, the CPU 211 controls the authentication processing unit 244 to execute authentication processing.

ステップＳ３５において、ＣＰＵ２１１は、認証処理部２４４から供給される、認証処理の結果を基に、セルフムービーのダウンロードを要求したユーザは、正しいユーザであると認証されたか否かを判断する。 In step S 35, the CPU 211 determines whether the user who requested the download of the self-movie is authenticated as a correct user based on the result of the authentication process supplied from the authentication processing unit 244.

ステップＳ３５において、正しいユーザであると認証されたと判断された場合、ステップＳ３６において、ＣＰＵ２１１は、ＨＤＤ２１８から、ユーザがダウンロードを要求した対応するセルフムービーを検索して読み出す。 If it is determined in step S35 that the user has been authenticated, in step S36, the CPU 211 searches the HDD 218 for a corresponding self-movie that the user requested to download, and reads it out.

In step S37, the CPU 211 determines whether data conversion is necessary for the destination device, based on the data format of the read self-movie and on information such as whether the destination device is the mobile phone 4, the PDA 5, the personal computer 6, or the home server 7. Specifically, when the destination is a small device such as the mobile phone 4 or the PDA 5, audio can be reproduced only in stereo even if the audio data is 5.1-channel Dolby Digital, and the video quality the device's monitor can display, such as image size and resolution, is lower than on, for example, the personal computer 6 or the television receiver 8. Even when the self-movie is transmitted to the personal computer 6 or the home server 7, the video display and audio output capabilities differ from device to device. The CPU 211 therefore determines whether data conversion is necessary based on the information about the download destination device included in the signal indicating the self-movie download request.
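The decision in step S37 can be illustrated by comparing the stored format with what the destination device can reproduce. The sketch below is deliberately simplified: `MediaFormat` and its three fields are hypothetical, 5.1-channel audio is modeled as 6 channels, and aspect ratio and codec differences are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaFormat:
    width: int           # video width in pixels
    height: int          # video height in pixels
    audio_channels: int  # 2 for stereo, 6 for 5.1ch


def target_format(source: MediaFormat, device: MediaFormat) -> MediaFormat:
    """Limit each property of the source to what the device supports."""
    return MediaFormat(
        width=min(source.width, device.width),
        height=min(source.height, device.height),
        audio_channels=min(source.audio_channels, device.audio_channels),
    )


def needs_conversion(source: MediaFormat, device: MediaFormat) -> bool:
    """Conversion is needed only if some property must be reduced."""
    return target_format(source, device) != source
```

For example, a 640x480 self-movie with 5.1ch audio would need conversion for a device limited to 176x144 and stereo, but not for a device that supports at least the stored format.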

If it is determined in step S37 that data conversion is necessary, in step S38 the CPU 211 controls the image processing unit 241 and the audio processing unit 242 to convert the self-movie into a data format suited to the destination device.

ステップＳ３７において、データの変換は必要ではないと判断された場合、または、ステップＳ３８の処理の終了後、ステップＳ３９において、ＣＰＵ２１１は、課金処理部２４３を制御して、課金処理を実行させる。 If it is determined in step S37 that data conversion is not necessary, or after the process of step S38 is completed, in step S39, the CPU 211 controls the charging processing unit 243 to execute the charging process.

ステップＳ４０において、ＣＰＵ２１１は、ネットワークインターフェース２２０を制御して、ネットワーク１を介して、セルフムービーのダウンロード要求元の機器に、セルフムービーを送信し、処理が終了される。 In step S 40, the CPU 211 controls the network interface 220 to transmit the self-movie to the self-movie download request source device via the network 1, and the process ends.

ステップＳ３５において、正しいユーザであると認証されなかったと判断された場合、ステップＳ４１において、ＣＰＵ２１１は、ネットワークインターフェース２２０を制御して、ネットワーク１を介して、セルフムービーのダウンロード要求元の機器に、エラーメッセージを送信して、処理が終了される。 If it is determined in step S35 that the user is not authenticated as the correct user, in step S41, the CPU 211 controls the network interface 220 to send an error message to the self-movie download request source device via the network 1. The message is sent and the process is terminated.
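The download branch of the flowchart (steps S34 to S41) can be summarized in a single function. This is a sketch, not the server's actual interface: the request keys and the `authenticate`, `convert`, and `charge` callables are hypothetical stand-ins for the authentication processing unit 244, the image and audio processing units 241 and 242, and the charging processing unit 243.

```python
def handle_download(request, movies, authenticate, convert, charge):
    """Process one download request; return ("ok", movie) or ("error", message)."""
    # S34-S35: authenticate the requesting user.
    if not authenticate(request["user_id"], request["credential"]):
        # S41: send an error message instead of the movie.
        return ("error", "authentication failed")
    # S36: look up the requested self-movie. The source does not describe a
    # missing-movie case, so this sketch simply lets KeyError propagate.
    movie = movies[request["movie_id"]]
    # S37-S38: convert for the destination device; convert() is expected to
    # return its input unchanged when no conversion is needed.
    movie = convert(movie, request["device"])
    # S39: charge the user before transmission, as in the described order.
    charge(request["user_id"], request["billing"])
    # S40: hand the movie back for transmission to the requester.
    return ("ok", movie)
```

Note that, following the described order, charging happens only after authentication succeeds and before the self-movie is sent.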

このような処理により、セルフムービー作成装置２から送信されたセルフムービーが、ユーザに配信され、課金処理が実行される。 Through such processing, the self-movie transmitted from the self-movie creating apparatus 2 is distributed to the user, and billing processing is executed.

なお、図８を用いて説明した処理では、セルフムービーのダウンロード要求ごとに、認証処理および課金処理を行うものとして説明したが、認証処理および課金処理は、省略されたり、例えば、月や年毎に行われるようにしても良い。 In the process described with reference to FIG. 8, it has been described that the authentication process and the charging process are performed for each self-movie download request. However, the authentication process and the charging process may be omitted, for example, monthly or yearly. It may be made to be performed.

In the above description, as described with reference to FIGS. 2 and 3, the self-movie creating apparatus 2 is a large apparatus in which the housing 21 and the floor 22 form a space large enough for the user to move freely inside, and video effects and sound effects are associated with operation inputs to the foot sensors 43-1 to 43-8 provided on the floor 22. However, any apparatus that can capture video, receive operation inputs from the user, process image and audio data, and output video and audio can be used as the self-movie creating apparatus 2, for example the mobile phone 4, the PDA 5, the personal computer 6, a digital still camera, or a digital video camera.

FIG. 9 is a functional block diagram for explaining a configuration example of a second embodiment of the self-movie creating apparatus 2. As long as it can realize the functions shown in the functional block diagram of FIG. 9 in hardware or software, the self-movie creating apparatus 2 may be any apparatus, not only the mobile phone 4, PDA 5, personal computer 6, digital still camera, or digital video camera mentioned above, and may also be a device built specifically for creating self-movies (an original device).

なお、図３における場合と対応する部分には同一の符号を付してあり、その説明は適宜省略する。 Note that portions corresponding to those in FIG. 3 are denoted by the same reference numerals, and description thereof will be omitted as appropriate.

That is, in the self-movie creating apparatus 2 of FIG. 9, an operation input unit 302, such as the buttons, keys, or dials of the mobile phone 4, PDA 5, personal computer 6, digital still camera, or digital video camera, receives the user's operation inputs in place of the operation input unit 41 and the foot sensors 43-1 to 43-8. A camera 301 of such a device captures video in place of the camera 42, a monitor 303 of such a device displays the video with effects applied in place of the monitor 44, and speakers 304-1 and 304-2 output the audio with sound effects added in place of the speakers 45-1 to 45-6. Apart from these substitutions and the omission of the foot sensor input detection unit 66, the illumination control unit 63, and the illuminations 47-1 to 47-5, the configuration is basically the same as that described with reference to FIG. 3.

In addition to text data input and the play start command, during play the buttons, keys, dials, and so on of the operation input unit 302 are each associated with a predetermined video effect and a type of sound effect added to the accompaniment (or an effect applied to the accompaniment or to a sample sound source), in the same way as the foot sensors 43-1 to 43-8 described with reference to FIGS. 2 and 3. When several buttons or keys are operated at the same time, the effects or audio data associated with them may be applied simultaneously (combined and processed together).
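Combining the effects of several simultaneously pressed keys can be modeled as function composition over a frame. The sketch below assumes grayscale frames as nested lists and uses two toy effects, `invert` and `mirror`, purely for illustration; the key assignments and the choice to apply effects in sorted key order are assumptions, since the source does not say how combined effects are ordered.

```python
from functools import reduce
from typing import Callable

Frame = list[list[int]]  # grayscale rows of 0-255 values (assumed format)
Effect = Callable[[Frame], Frame]


def invert(frame: Frame) -> Frame:
    """Toy effect: invert brightness."""
    return [[255 - v for v in row] for row in frame]


def mirror(frame: Frame) -> Frame:
    """Toy effect: flip each row left to right."""
    return [row[::-1] for row in frame]


# Hypothetical key-to-effect assignments.
KEY_EFFECTS: dict[str, Effect] = {"1": invert, "2": mirror}


def apply_pressed(frame: Frame, pressed: set[str]) -> Frame:
    """Apply the effects of all pressed keys, in sorted key order."""
    effects = [KEY_EFFECTS[k] for k in sorted(pressed) if k in KEY_EFFECTS]
    return reduce(lambda f, effect: effect(f), effects, frame)
```

Keys without an assigned effect are ignored, and an empty set of pressed keys returns the frame unchanged.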

For example, when the self-movie creating apparatus 2 of FIG. 9 is a small device that accepts operation inputs while held in the user's hand, such as the mobile phone 4 or the PDA 5, the user can create a self-movie during play by capturing themselves or another person with the device's camera 301 while operating the buttons, keys, or dials of the operation input unit 302, applying effects to the captured video and adding sound effects and the like to the automatically generated accompaniment.

そして、作成されたセルフムービーは、「練習モード」および「セルフムービーモード」において、モニタ３０３に表示され、スピーカ３０４−１および３０４−２から音声出力されるとともに、「セルフムービーモード」において、記憶されて、ネットワークインターフェース７１から、ネットワーク１を介して、セルフムービー配信サーバ３に送信される。 The created self-movie is displayed on the monitor 303 in the “practice mode” and the “self-movie mode”, and is output from the speakers 304-1 and 304-2 as well as stored in the “self-movie mode”. Then, it is transmitted from the network interface 71 to the self-movie distribution server 3 via the network 1.

Next, self-movie creation process 2, executed by the self-movie creating apparatus 2 described with reference to FIG. 9, will be described with reference to the flowchart of FIG. 10.

ステップＳ６１乃至ステップＳ６６において、図７を用いて説明したステップＳ１乃至ステップＳ６と基本的に同様の処理が実行される。 In steps S61 to S66, basically the same processing as that in steps S1 to S6 described with reference to FIG. 7 is executed.

すなわち、操作入力部３０２からテキストデータの入力を受けた場合、入力されたテキストデータが映像効果付与処理部６４に供給され、映像に重畳可能な、例えば、ビットマップなどの映像データに変換されて、内部に有する図示しないページバッファなどに一時保存される。そして、プレイ開始の操作入力を受けて、セルフムービーの音声データの基となる伴奏に対応する音声データが作成され、撮像された映像の取得が開始され、撮像された映像を基に、ユーザの動きが検出される。 That is, when text data is input from the operation input unit 302, the input text data is supplied to the video effect applying processing unit 64 and converted into video data such as a bitmap that can be superimposed on the video. Are temporarily stored in a page buffer (not shown) or the like. Then, in response to a play start operation input, audio data corresponding to the accompaniment that is the basis of the audio data of the self-movie is created, acquisition of the captured video is started, and based on the captured video, the user's Motion is detected.

ステップＳ６７において、映像効果付与処理部６４および効果音生成部６５は、操作入力部３０２から供給される信号を基に、所定の映像のエフェクト、および、音声データの伴奏への効果音の付加に対応付けられているボタンやキーなどのうちのいずれかが操作されたか否かを判断する。 In step S67, the video effect imparting processing unit 64 and the sound effect generating unit 65 are used to add a predetermined video effect and a sound effect to the accompaniment of the audio data based on the signal supplied from the operation input unit 302. It is determined whether any of the associated buttons or keys has been operated.

If it is determined in step S67 that a button or key has been operated, in step S68, based on the signal supplied from the operation input unit 302 and as described with reference to FIGS. 4 and 5, the video effect applying processing unit 64 applies the video effect corresponding to the operated button or key, in accordance with the detected user movement, to the captured video supplied from the image processing unit 61 (with the text data supplied from the operation input unit 302 superimposed as necessary), and supplies the result to the self-movie storage unit 68.

In step S69, based on the signal supplied from the operation input unit 302, the sound effect generation unit 65 uses the audio effect corresponding to the operated button or key to add sound effects as described above, or to apply audio effects, in accordance with the detected user movement, to the audio data corresponding to the accompaniment supplied from the accompaniment generation unit 67. It thereby generates the audio data of the self-movie and supplies it to the self-movie storage unit 68.

Although steps S66 to S69 have been described here as processing steps executed one after another, these processes are executed in parallel and in real time.

そして、ステップＳ７０乃至ステップＳ７３において、図７を用いて説明したステップＳ１２乃至ステップＳ１５と基本的に同様の処理が実行されて、処理が終了される。 In steps S70 to S73, basically the same processing as that in steps S12 to S15 described with reference to FIG. 7 is executed, and the processing ends.

すなわち、プレイ開始からすでに所定の時間が経過しているか否かが判断され、所定の時間が経過していないと判断された場合、作成された映像がモニタ３０３に表示され、作成された音声データがスピーカ３０４−１および３０４−２から出力されるが、内部に記憶はされないで、処理は、ステップＳ６６に戻り、それ以降の処理が繰り返される。そして、プレイ開始から所定の時間が経過したと判断された場合、同様にして、作成された映像がモニタ３０３に表示され、作成された音声データがスピーカ３０４−１および３０４−２から出力されるとともに、作成されたセルフムービーが記憶される。そして、プレイ終了時刻であるか否かが判断され、プレイ終了時刻ではないと判断された場合、処理は、ステップＳ６６に戻り、それ以降の処理が繰り返され、プレイ終了時刻となったと判断された場合、生成されて記憶されたセルフムービーが、ネットワーク１を介して、セルフムービー配信サーバ３に送信され、処理が終了される。 That is, it is determined whether or not a predetermined time has elapsed since the start of play, and when it is determined that the predetermined time has not elapsed, the created video is displayed on the monitor 303 and the created audio data Are output from the speakers 304-1 and 304-2, but are not stored internally, and the process returns to step S66, and the subsequent processes are repeated. If it is determined that a predetermined time has elapsed since the start of play, similarly, the created video is displayed on the monitor 303, and the created audio data is output from the speakers 304-1 and 304-2. At the same time, the created self-movie is stored. Then, it is determined whether or not it is the play end time, and if it is determined that it is not the play end time, the process returns to step S66, and the subsequent processes are repeated to determine that the play end time is reached. In this case, the self-movie generated and stored is transmitted to the self-movie distribution server 3 via the network 1, and the process is terminated.

このような処理により、ユーザのパフォーマンスを、ボタンやキーへの操作入力と、撮像された映像から検出される動き量との２種類の方法で検出し、検出されたユーザのパフォーマンスを基に、撮像された映像にエフェクトを施し、作成された伴奏に効果音を付与することができるようにしたので、特別な知識や技術を持たないユーザであっても、オリジナリティのある、高度な画像処理や音声処理が施されたセルフムービーを、携帯型電話機４、ＰＤＡ５、パーソナルコンピュータ６、ディジタルスチルカメラ、または、ディジタルビデオカメラなどの身近な機器を用いて作成することができる。 Through such processing, the user's performance is detected by two types of methods, that is, an operation input to a button or a key and a motion amount detected from the captured video, and based on the detected user's performance, Since the effect was applied to the captured video and sound effects can be added to the created accompaniment, even users who do not have special knowledge or technology can use original, advanced image processing and A self-movie that has been subjected to sound processing can be created using familiar devices such as the mobile phone 4, the PDA 5, the personal computer 6, a digital still camera, or a digital video camera.

更に、例えば、クラブやパーティー会場などのオープンなスペースで、その会場全体を、セルフムービー作成装置として機能させるようにすることもできる。すなわち、会場内のいずれかの場所で映像を撮像し、その映像の動きを検出するとともに、所定の映像のエフェクトと、伴奏への効果音の付加（または、音声データへのエフェクト）とに対応付けられている、例えば、センサ、パッド、フットセンサ、または、ボタンやキーなどの操作入力部を複数設けるようにする。そして、上述した場合と同様の処理により、映像の動きおよび操作入力部への操作入力を基に、撮像された映像にエフェクトを施したり、音声データを生成するようにして、オープンなスペース全体で視聴可能なように、生成された映像を大画面のモニタに表示し、生成された音声データをスピーカから出力することも可能である。このとき、第１の実施の形態における場合と同様に、部屋全体の照明を制御することも可能であり、また、撮像されるユーザも、１人でなくても良く、複数のカメラを用意して、撮像ポイントを複数用意するようにすることもできる。 Furthermore, for example, in an open space such as a club or a party venue, the entire venue can be made to function as a self-movie creating apparatus. In other words, it captures video at any location in the venue, detects the motion of the video, and supports the effects of a given video and the addition of sound effects to accompaniment (or effects on audio data) For example, a plurality of operation input units such as sensors, pads, foot sensors, buttons or keys are provided. Then, through the same processing as described above, an effect is applied to the captured image and audio data is generated based on the motion of the image and the operation input to the operation input unit, and the entire open space is used. It is also possible to display the generated video on a large screen monitor so that it can be viewed, and to output the generated audio data from a speaker. At this time, as in the case of the first embodiment, it is also possible to control the illumination of the entire room, and the number of users to be imaged is not limited to one, and a plurality of cameras are prepared. Thus, a plurality of imaging points can be prepared.

That is, the present invention is applicable beyond preparing a space enclosed by the housing 21, as in the first embodiment, or using a small device that the user can hold, as in the second embodiment. In other words, by applying the present invention to any information processing apparatus that can capture video and detect the user's performance, the user's performance can be linked to the effects applied to the captured video and the generated audio, and an original self-movie with sophisticated image and audio processing can be created.

また、操作入力の自由度が高いため、初めてセルフムービーを作成するユーザは、単純な動作および操作入力を行っても、それなりに高度な効果が付与されたセルフムービーを作成することができるとともに、何度もプレイしたユーザは、自分の意図した高度な効果が付与されたセルフムービーを作成することができる。すなわち、ユーザは、セルフムービーの作成において、自己表現することができ、能動的に楽しむことができるので、シンプルなユーザインターフェースであっても、繰り返し楽しむことができる。 In addition, since the degree of freedom of operation input is high, a user who creates a self-movie for the first time can create a self-movie with a high degree of effect even if simple operation and operation input are performed, A user who has played many times can create a self-movie with a high-level effect intended by the user. In other words, the user can express themselves and actively enjoy the creation of a self-movie, so that even a simple user interface can be enjoyed repeatedly.

The series of processes described above can also be executed by software. The software is installed from a recording medium onto a computer in which the program constituting the software is built into dedicated hardware, or onto, for example, a general-purpose personal computer that can execute various functions when various programs are installed.

As shown in FIG. 6, this recording medium is constituted by package media that are distributed separately from the computer in order to provide the program to the user and on which the program is recorded, such as a magnetic disk 221 (including a flexible disk), an optical disk 222 (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), a magneto-optical disk 223 (including an MD (Mini-Disc) (trademark)), or a semiconductor memory 224.

In this specification, the steps describing the program recorded on the recording medium include not only processes performed in time series in the described order but also processes executed in parallel or individually, which need not be processed in time series.

In this specification, a system refers to an entire apparatus composed of a plurality of devices.

FIG. 1 is a diagram showing the system configuration of the self-movie distribution system.
FIG. 2 is a diagram showing the external configuration of the first embodiment of the self-movie creation apparatus.
FIG. 3 is a block diagram showing the internal configuration of the self-movie creation apparatus of the first embodiment.
FIG. 4 is a diagram for explaining the effects applied to video.
FIG. 5 is a diagram for explaining the effects applied to video.
FIG. 6 is a block diagram showing the configuration of the self-movie distribution server.
FIG. 7 is a flowchart for explaining self-movie creation processing 1.
FIG. 8 is a flowchart for explaining the processing of the self-movie distribution server.
FIG. 9 is a block diagram showing the internal configuration of the self-movie creation apparatus of the second embodiment.
FIG. 10 is a flowchart for explaining self-movie creation processing 2.

Explanation of symbols

1 Network
2 Self-movie creation apparatus
3 Self-movie distribution server
41 Operation input unit
42 Camera
43 Foot sensor
44 Monitor
45 Speaker
47 Illumination
62 Motion detection unit
63 Illumination control unit
64 Video effect imparting processing unit
65 Sound effect generation unit
66 Foot sensor input detection unit
67 Accompaniment creation unit
68 Self-movie storage unit
211 CPU
241 Image processing unit
242 Audio processing unit
243 Billing processing unit
244 Authentication processing unit

Claims

An information processing apparatus that creates content, comprising:
Imaging means for capturing first video data;
Motion information acquisition means for acquiring information about a user's motion;
Operation input means for acquiring the user's operation input;
Audio generation means for generating first audio data;
Video processing means for performing predetermined first processing on the first video data captured by the imaging means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input acquired by the operation input means, to generate second video data constituting the content;
Audio processing means for performing predetermined second processing on the first audio data generated by the audio generation means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input acquired by the operation input means, to generate second audio data constituting the content;
Display control means for controlling display of the second video data generated by the video processing means; and
Audio output control means for controlling output of the second audio data generated by the audio processing means.

The information processing apparatus according to claim 1, further comprising:
A housing composed of a ceiling surface and a wall surface having an opening through which the user can enter and exit; and
A floor portion installed at a lower part of the housing, the housing and the floor portion defining a space that is separated from the outside and is large enough for the user to move in,
Wherein a plurality of the operation input means are installed on an upper surface of the floor portion and receive operation input by being stepped on by the user's foot.

The information processing apparatus according to claim 1, wherein the information processing apparatus is a portable information processing apparatus that can be held by the user.

The information processing apparatus according to claim 1, further comprising content output control means for controlling output, to another information processing apparatus, of the content composed of the second video data generated by the video processing means and the second audio data generated by the audio processing means.

The information processing apparatus according to claim 1, further comprising illumination control means for controlling illumination based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input acquired by the operation input means.

The information processing apparatus according to claim 1, wherein the first processing applied to the first video data by the video processing means and the second processing applied to the first audio data by the audio processing means, in response to the user's operation input acquired by the operation input means, are related to each other.

An information processing method for an information processing apparatus that creates content, the method comprising:
An audio generation step of generating first audio data;
An imaging control step of controlling imaging of first video data;
A motion information acquisition step of acquiring information about a user's motion;
An operation input acquisition step of acquiring the user's operation input;
A video processing step of performing predetermined first processing on the first video data whose imaging is controlled in the imaging control step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step, to generate second video data constituting the content; and
An audio processing step of performing predetermined second processing on the first audio data generated in the audio generation step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step, to generate second audio data constituting the content.

A program for causing a computer to execute content creation processing, the processing comprising:
An audio generation step of generating first audio data;
An imaging control step of controlling imaging of first video data;
A motion information acquisition step of acquiring information about a user's motion;
An operation input acquisition step of acquiring the user's operation input;
A video processing step of performing predetermined first processing on the first video data whose imaging is controlled in the imaging control step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step, to generate second video data constituting the content; and
An audio processing step of performing predetermined second processing on the first audio data generated in the audio generation step, based on the information about the user's motion acquired in the motion information acquisition step and the user's operation input acquired in the operation input acquisition step, to generate second audio data constituting the content.

An information processing system comprising:
A first information processing apparatus that creates content; and
A second information processing apparatus that receives the content from the first information processing apparatus and distributes the content to a user,
Wherein the first information processing apparatus includes:
Imaging means for capturing first video data;
Motion information acquisition means for acquiring information about the user's motion;
Operation input means for acquiring the user's operation input;
Audio generation means for generating first audio data;
Video processing means for performing predetermined first processing on the first video data captured by the imaging means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input acquired by the operation input means, to generate second video data constituting the content;
Audio processing means for performing predetermined second processing on the first audio data generated by the audio generation means, based on the information about the user's motion acquired by the motion information acquisition means and the user's operation input acquired by the operation input means, to generate second audio data constituting the content;
Display control means for controlling display of the second video data generated by the video processing means;
Audio output control means for controlling output of the second audio data generated by the audio processing means; and
Transmission means for transmitting, to the second information processing apparatus, the content composed of the second video data generated by the video processing means and the second audio data generated by the audio processing means, and
Wherein the second information processing apparatus includes:
Receiving means for receiving the content transmitted by the transmission means of the first information processing apparatus;
Storage means for storing the content received by the receiving means; and
Distribution control means for controlling distribution of the content stored in the storage means to the user when a distribution request for the content is received from the user.

The information processing system according to claim 9, wherein the second information processing apparatus further includes conversion means for converting the data format of the content, whose distribution is controlled by the distribution control means, according to the type of the device to which the content is distributed.

The information processing system according to claim 9, wherein the second information processing apparatus further includes billing processing means for executing billing processing that arises for the distribution of the content whose distribution is controlled by the distribution control means.
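
As a rough illustration only, the storage, distribution control, format conversion, and billing recited for the second information processing apparatus might be arranged as in the sketch below. The class `DistributionServer`, the table `FORMAT_BY_DEVICE`, the format labels, and the flat price are hypothetical placeholders introduced here, and the actual data conversion is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical mapping from destination device type to a delivery format.
FORMAT_BY_DEVICE: Dict[str, str] = {"mobile_phone": "3gp", "pc": "mpeg"}
DEFAULT_FORMAT = "mpeg"  # assumed fallback for unknown device types


@dataclass
class DistributionServer:
    """Stores received content and distributes it on request."""
    price_per_delivery: int = 100  # assumed flat charge per distribution
    store: Dict[str, bytes] = field(default_factory=dict)
    charges: List[Tuple[str, int]] = field(default_factory=list)

    def receive(self, content_id: str, content: bytes) -> None:
        """Receiving and storage: keep content sent by a creation apparatus."""
        self.store[content_id] = content

    def distribute(self, user_id: str, content_id: str,
                   device_type: str) -> Tuple[str, bytes]:
        """Handle a distribution request from a user.

        Picks a format for the destination device, records a charge, and
        returns (format, content). Raises KeyError for unknown content.
        Real format conversion of the data is omitted in this sketch.
        """
        content = self.store[content_id]
        fmt = FORMAT_BY_DEVICE.get(device_type, DEFAULT_FORMAT)
        self.charges.append((user_id, self.price_per_delivery))
        return fmt, content
```

For example, after `receive("m1", data)`, a call to `distribute("u1", "m1", "mobile_phone")` returns the stored data tagged with the placeholder mobile format and appends one charge for user `u1`.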