JP2023169373A - Information processing device, moving image synthesis method and moving image synthesis program

Info

Publication number: JP2023169373A
Application number: JP2023163626A
Authority: JP
Inventors: 浩司小野里; Koji Onozato; 佑輔後藤; Yusuke Goto; 智愛鹿野; Chie KANO; 陽樹佐藤; Haruki Sato; 幸次朗村上; Kojiro Murakami
Original assignee: Mixi Inc
Current assignee: Mixi Inc
Priority date: 2018-07-25
Filing date: 2023-09-26
Publication date: 2023-11-29
Also published as: JP7364956B2; JP7148788B2; JP2022176206A; JP2020017325A

Abstract

To provide an information processing device, a moving image synthesis method, and a moving image synthesis program capable of easily creating a desired moving image for a user when the user records a moving image using a camera. SOLUTION: A portable terminal that records images captured by a camera as a moving image determines a synthesis start position at which to start synthesizing a previously recorded first singing moving image and a second singing moving image to be recorded later, then records the second singing moving image with the camera, and finally synthesizes the first singing moving image and the second singing moving image at the synthesis start position. Therefore, when a singing user records a moving image with the camera, the portable terminal can easily create the moving image the user wants. SELECTED DRAWING: Figure 10

Description

本発明は、情報処理装置、動画合成方法及び動画合成プログラムに関するものである。 The present invention relates to an information processing device, a video composition method, and a video composition program.

近年、コンピュータネットワークを介して携帯端末等の情報処理装置に配信されるウェブサイト（Ｗｅｂページ，Ｗｅｂサービス）やオンラインゲーム、アプリケーションソフトウェア（以下「アプリ」という。）等のオンラインサービスが広く普及している。 In recent years, online services such as websites (web pages, web services), online games, and application software (hereinafter referred to as "apps") that are distributed to information processing devices such as mobile terminals via computer networks have become widespread.

オンラインサービスの一つに、ユーザ自身が撮影した動画を投稿し、他のユーザによる当該動画の視聴を可能とする動画投稿サイトがある。この動画投稿サイトには、ユーザ自身による歌唱動画等のパフォーマンス動画の投稿も行われている。このため、ユーザはより良い自身の歌唱動画を投稿するために、投稿する歌唱動画の編集を行う場合がある。 One of the online services is a video posting site that allows users to post videos shot by themselves and allow other users to view the videos. This video posting site also allows users to post their own performance videos, such as singing videos. Therefore, in order to post better singing videos of themselves, users may edit the singing videos they post.

ここで、特許文献１には、カラオケ楽曲の演奏を中止した場合であっても、再度、当該カラオケ楽曲を最初から歌い直すことなく最終的な歌唱音声の録音データを得ることを目的とした演奏中止対応カラオケ録音システムが開示されている。 Patent Document 1 discloses a karaoke recording system that handles interrupted performances. Its aim is to obtain the final recorded data of the singing voice even when the performance of a karaoke song has been stopped, without the singer having to sing the song again from the beginning.

この演奏中止対応カラオケ録音システムは、演奏が中止されたカラオケ楽曲の再演奏を指示する再演奏指示手段と、録音手段の機能により歌唱音声の録音が行われている任意のカラオケ楽曲について演奏が中止された場合に、少なくとも、当該カラオケ楽曲の楽曲と、録音済演奏範囲データと、中途録音データと、を紐付けして記録する録音情報記録手段と、カラオケ楽曲の再演奏が指示された場合には、録音済演奏範囲データに基づき、録音済演奏範囲については歌唱音声の録音を行わないように制御するものである。 This karaoke recording system includes re-performance instruction means for instructing that a karaoke song whose performance was stopped be played again, and recording information recording means that, when the performance of any karaoke song whose singing voice is being recorded by the recording means is stopped, records at least the karaoke song, recorded performance range data, and partial recording data in association with one another. When re-performance of the karaoke song is instructed, the system uses the recorded performance range data to control recording so that the singing voice is not recorded for the range that has already been recorded.

特開２０１０－２３７３８９号公報 Japanese Patent Application Publication No. 2010-237389

特許文献１に開示されているシステムでは、録音が途中で中止された時点以降の録音が追加で行われるものであり、ユーザは、追加録音を行うカラオケ楽曲のタイミングを自身で決定できない。このため、ユーザは、歌唱を失敗しても、失敗した部分の録音をやり直すことができない。さらに、特許文献１に開示されている演奏中止対応カラオケ録音システムでは、歌唱音声の録音を行うものの歌唱動画等のパフォーマンス動画を撮影するものではない。 In the system disclosed in Patent Document 1, additional recording is performed after the point in time when recording is stopped midway, and the user cannot decide by himself the timing of additional recording of karaoke music. Therefore, even if the user fails to sing, he or she cannot re-record the failed part. Further, although the karaoke recording system for performance cancellation disclosed in Patent Document 1 records singing voices, it does not record performance videos such as singing videos.

本発明は、このような事情に鑑みてなされたものであって、ユーザがカメラを用いて動画の録画を行う場合に、ユーザが望ましいと感じる動画を容易に作成できる、情報処理装置、動画合成方法及び動画合成プログラムを提供することを目的とする。 The present invention has been made in view of these circumstances, and its object is to provide an information processing device, a video composition method, and a video composition program with which a user recording a video with a camera can easily create a video that the user finds desirable.

上記課題を解決するために、本発明の情報処理装置、動画合成方法及び動画合成プログラムは以下の手段を採用する。 In order to solve the above problems, an information processing device, a video compositing method, and a video compositing program of the present invention employ the following means.

上記課題を解決するため、本発明の一態様である「情報処理装置」は、カメラで撮影した画像を動画として録画する情報処理装置であって、先に録画した第１動画に対して、後に録画する第２動画の合成を開始する合成開始位置を決定する決定手段と、前記カメラで前記第２動画を録画する録画制御手段と、前記第１動画の前記合成開始位置で前記第１動画と前記第２動画とを合成する動画合成手段と、を備える。 To solve the above problem, an "information processing device" according to one aspect of the present invention is an information processing device that records images captured by a camera as a video, and includes: determining means for determining a composition start position at which composition of a second video, to be recorded later, starts with respect to a previously recorded first video; recording control means for recording the second video with the camera; and video composition means for compositing the first video and the second video at the composition start position of the first video.

上記課題を解決するため、本発明の一態様である「動画合成方法」は、カメラで第１動画を録画する第１工程と、前記第１動画に対して、後に録画する第２動画の合成を開始する合成開始位置を決定する第２工程と、前記カメラで前記第２動画を録画する第３工程と、前記第１動画の前記合成開始位置で前記第１動画と前記第２動画とを合成する第４工程と、を有する。 To solve the above problem, a "video composition method" according to one aspect of the present invention includes: a first step of recording a first video with a camera; a second step of determining a composition start position at which composition of a second video, to be recorded later, starts with respect to the first video; a third step of recording the second video with the camera; and a fourth step of compositing the first video and the second video at the composition start position of the first video.
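
The four steps above can be illustrated with a minimal sketch. It assumes that each video is a simple list of frames and that the composition start position is a frame index; the function name `composite_at` and the rule that frames of the first video from the start position onward are discarded are assumptions made for illustration, not details stated in this publication.

```python
def composite_at(first, second, start):
    """Keep `first` up to `start`, then continue with `second`.

    `first` and `second` are sequences of frames; `start` is the
    composition start position expressed as a frame index (assumption).
    """
    if not 0 <= start <= len(first):
        raise ValueError("start must lie inside the first video")
    # Assumption: frames of the first video from `start` onward are dropped.
    return list(first[:start]) + list(second)


first_take = ["a0", "a1", "a2", "a3", "a4"]  # step 1: first video
start_position = 2                           # step 2: chosen start position
retake = ["b2", "b3", "b4"]                  # step 3: second video
result = composite_at(first_take, retake, start_position)  # step 4
print(result)
```

In this sketch the second video simply replaces the tail of the first, which corresponds to re-recording a failed part and everything after it.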

上記課題を解決するため、本発明の一態様である「動画合成プログラム」は、カメラで撮影した画像を動画として録画する情報処理装置が備えるコンピュータを、先に録画した第１動画に対して、後に録画する第２動画の合成を開始する合成開始位置を決定する決定手段と、前記カメラで前記第２動画を録画する録画制御手段と、前記第１動画の前記合成開始位置で前記第１動画と前記第２動画とを合成する動画合成手段と、して機能させる。 To solve the above problem, a "video composition program" according to one aspect of the present invention causes a computer included in an information processing device that records images captured by a camera as a video to function as: determining means for determining a composition start position at which composition of a second video, to be recorded later, starts with respect to a previously recorded first video; recording control means for recording the second video with the camera; and video composition means for compositing the first video and the second video at the composition start position of the first video.

上記「情報処理装置」には、以下に例示するように、種々の技術的限定を加えてもよい。また、同趣旨の技術的限定を、「動画合成方法」が実行する処理ステップや「動画合成プログラム」の機能に加えてもよい。 Various technical limitations may be added to the above-mentioned "information processing device" as exemplified below. Furthermore, technical limitations to the same effect may be added to the processing steps executed by the "video composition method" and the functions of the "video composition program."

前記第１動画の前記合成開始位置における画像である合成開始位置画像を、前記第２動画の録画開始時に画面に表示する画像表示制御手段を備える。 An image display control means is provided for displaying a synthesis start position image, which is an image at the synthesis start position of the first moving image, on a screen when recording of the second moving image is started.

前記画像表示制御手段は、前記第２動画を録画するために前記カメラで撮影されている画像に前記合成開始位置画像を重畳して前記画面に表示する。 The image display control means superimposes the synthesis start position image on an image taken by the camera in order to record the second moving image, and displays the superimposed composition start position image on the screen.
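
Superimposing the composition start position image on the live camera image can be pictured as a per-pixel blend. The following is a minimal sketch, assuming grayscale frames stored as nested lists of integers and a fixed blend ratio; the function name `overlay` and the default `alpha` of 0.5 are hypothetical and are not taken from this publication.

```python
def overlay(preview, ghost, alpha=0.5):
    """Blend a still image (`ghost`) over a live preview frame.

    Both frames are lists of rows of grayscale values. `alpha` is the
    weight of the still image; 0.5 is an assumed default.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    same_rows = len(preview) == len(ghost)
    if not same_rows or any(len(p) != len(g) for p, g in zip(preview, ghost)):
        raise ValueError("frames must have the same size")
    return [
        [round((1.0 - alpha) * p + alpha * g) for p, g in zip(prow, grow)]
        for prow, grow in zip(preview, ghost)
    ]


live = [[0, 100], [255, 40]]     # frame currently captured by the camera
still = [[200, 100], [55, 40]]   # composition start position image
blended = overlay(live, still)
print(blended)
```

A semi-transparent overlay of this kind lets the user line up their pose with the frame at which the first video will be joined.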

前記動画合成手段は、前記第１動画と前記第２動画との合成部分を目立たせないための画像処理を行う。 The video composition means performs image processing to make a composite portion of the first video and the second video less noticeable.

前記動画合成手段は、前記合成開始位置から所定期間の前記第１動画と前記第２動画とを重畳させて、前記第１動画と前記第２動画とを合成する。 The moving image synthesizing means synthesizes the first moving image and the second moving image by superimposing the first moving image and the second moving image for a predetermined period from the synthesis start position.
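
One way to picture superimposing the two videos for a predetermined period is a linear cross-fade. The sketch below assumes frames are single brightness values and that the weight of the second video rises linearly over the overlap; the linear weighting and the name `crossfade_composite` are assumptions, since the text above only states that the videos are superimposed for a predetermined period.

```python
def crossfade_composite(first, second, start, overlap):
    """Join two clips at `start`, blending them over `overlap` frames.

    Frame i of `second` is assumed to correspond to frame `start + i`
    of `first`. Frames are plain numbers to keep the sketch small.
    """
    if not 0 <= start <= len(first):
        raise ValueError("start must lie inside the first clip")
    # The overlap cannot be longer than what either clip provides.
    overlap = max(0, min(overlap, len(first) - start, len(second)))
    out = list(first[:start])
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the second clip, rising toward 1
        out.append((1.0 - w) * first[start + i] + w * second[i])
    out.extend(second[overlap:])
    return out


dark = [0.0] * 6     # first video (all-dark frames)
bright = [1.0] * 4   # second video (all-bright frames)
faded = crossfade_composite(dark, bright, start=2, overlap=3)
print(faded)
```

With an overlap of zero the function reduces to a hard cut at the composition start position.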

前記第１動画及び前記第２動画は、楽曲が再生されながら録画される動画である。 The first video and the second video are videos that are recorded while the music is being played.

前記録画制御手段は、前記合成開始位置の前から前記楽曲の再生を開始し、再生した楽曲が前記合成開始位置に対応するタイミングに達すると前記第２動画の録画を開始する。 The recording control means starts playing the music before the composition start position, and starts recording the second moving image when the played music reaches a timing corresponding to the composition start position.

前記録画制御手段は、前記第２動画の録画開始前から前記楽曲の再生を開始すると共に、前記第２動画の録画開始のカウントダウンを行う。 The recording control means starts playing the music before starting recording of the second moving image, and performs a countdown to start recording of the second moving image.
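
The pre-roll playback and the countdown described in the two preceding paragraphs can be sketched as follows. The pre-roll length, the whole-second countdown, and the function names `playback_start` and `countdown_label` are assumptions for illustration; the publication does not specify them here.

```python
import math


def playback_start(start_s, preroll_s):
    """Time in the song at which playback begins before a retake."""
    return max(0.0, start_s - preroll_s)


def countdown_label(position_s, start_s):
    """Whole seconds left until recording starts, or None once it has started."""
    remaining = start_s - position_s
    if remaining <= 0:
        return None  # playback has reached the composition start position
    return math.ceil(remaining)


start = 30.0                       # composition start position, in seconds
begin = playback_start(start, 5.0)
print(begin, countdown_label(begin, start), countdown_label(start, start))
```

Recording of the second video would begin at the first playback position for which `countdown_label` returns `None`.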

前記合成開始位置として設定可能な位置は、前記楽曲の予め定められた位置である。 The position that can be set as the synthesis start position is a predetermined position of the music piece.
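
If only predetermined positions of the song can be chosen, a user's selection has to be mapped onto one of them. A minimal sketch, assuming the allowed positions are given in seconds and that the nearest one is chosen (with the earlier position winning a tie); both the snapping rule and the name `snap_to_allowed` are hypothetical.

```python
def snap_to_allowed(requested_s, allowed_s):
    """Return the allowed position closest to the requested time."""
    if not allowed_s:
        raise ValueError("at least one allowed position is required")
    # Sort key: distance first, then the position itself so ties go earlier.
    return min(allowed_s, key=lambda p: (abs(p - requested_s), p))


phrase_starts = [0.0, 12.5, 27.0, 41.5]  # hypothetical predetermined positions
print(snap_to_allowed(20.0, phrase_starts))
```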

本発明によれば、ユーザがカメラを用いて動画の録画を行う場合に、ユーザが望ましいと感じる動画を容易に作成できる、という効果を有する。 According to the present invention, when a user records a moving image using a camera, the user can easily create a moving image that the user finds desirable.

本発明の実施形態に係るカラオケシステムの構成図である。 FIG. 1 is a configuration diagram of a karaoke system according to an embodiment of the present invention.
本発明の実施形態に係るサーバの電気的構成を示すブロック図である。 FIG. 2 is a block diagram showing the electrical configuration of a server according to an embodiment of the present invention.
本発明の実施形態に係る携帯端末の電気的構成を示すブロック図である。 FIG. 3 is a block diagram showing the electrical configuration of a mobile terminal according to an embodiment of the present invention.
本発明の実施形態に係る歌唱動画を撮影する場合における携帯端末の画面表示を示す図である。 FIG. 4 is a diagram showing a screen display of a mobile terminal when shooting a singing video according to an embodiment of the present invention.
本発明の実施形態に係る歌唱動画を視聴する場合における携帯端末の画面表示を示す図である。 FIG. 5 is a diagram showing a screen display of a mobile terminal when viewing a singing video according to an embodiment of the present invention.
本発明の実施形態に係る動画合成機能を示す模式図である。 FIG. 6 is a schematic diagram showing a video composition function according to an embodiment of the present invention.
本発明の実施形態に係る合成開始位置を決定する場合における携帯端末の画面表示を示す図である。 FIG. 7 is a diagram showing a screen display of a mobile terminal when determining a composition start position according to an embodiment of the present invention.
本発明の実施形態に係る合成開始位置画像を示す図である。 FIG. 8 is a diagram showing a composition start position image according to an embodiment of the present invention.
本発明の実施形態に係る第２歌唱動画を撮影開始する場合における携帯端末の画面表示を示す図である。 FIG. 9 is a diagram showing a screen display of a mobile terminal when starting to shoot a second singing video according to an embodiment of the present invention.
本発明の実施形態に係る合成開始位置画像の表示タイミング及び第１歌唱動画と第２歌唱動画の重畳合成を示す模式図である。 FIG. 10 is a schematic diagram showing the display timing of the composition start position image and the superimposed composition of the first singing video and the second singing video according to an embodiment of the present invention.
本発明の実施形態に係る合成エフェクトを示す図である。 FIG. 11 is a diagram showing a composition effect according to an embodiment of the present invention.
本発明の実施形態に係る動画合成機能に関する機能ブロック図である。 FIG. 12 is a functional block diagram of the video composition function according to an embodiment of the present invention.
本発明の実施形態に係る動画合成処理の流れを示すフローチャートである。 FIG. 13 is a flowchart showing the flow of video composition processing according to an embodiment of the present invention.

以下に、本発明に係る情報処理装置、動画合成方法及び動画合成プログラムの一実施形態について、図面を参照して説明する。 An embodiment of an information processing device, a video composition method, and a video composition program according to the present invention will be described below with reference to the drawings.

本実施形態ではパフォーマーが自身のパフォーマンスを携帯端末を用いて動画として記録してサーバへ送信することで動画投稿サイトにアップロードする。動画投稿サイトにアップロードされた動画（投稿動画）は、携帯端末を介して視聴可能とされる。なお、本実施形態では、パフォーマンスを歌唱とし、パフォーマーを歌唱ユーザとし、動画投稿サイトにアップロードされる動画を歌唱動画とする。また、動画投稿サイトにアップロードされた歌唱動画を視聴するユーザを視聴ユーザという。 In this embodiment, a performer records his or her performance as a video using a mobile terminal and uploads it to a video posting site by transmitting it to a server. Videos uploaded to video posting sites (posted videos) can be viewed via mobile terminals. Note that in this embodiment, the performance is singing, the performer is a singing user, and the video uploaded to the video posting site is a singing video. Furthermore, a user who views a singing video uploaded to a video posting site is referred to as a viewing user.

［１．カラオケシステムの構成］
図１は、本実施形態に係るカラオケシステム１の概略構成図である。カラオケシステム１は、通信回線２、複数の携帯端末３（携帯端末３Ａ，３Ｂ）、及びサーバ４を含んで構成される。 [1. Karaoke system configuration]
FIG. 1 is a schematic configuration diagram of a karaoke system 1 according to this embodiment. The karaoke system 1 includes a communication line 2, a plurality of mobile terminals 3 (mobile terminals 3A, 3B), and a server 4.

通信回線２は、コンピュータネットワークを形成するものであり、例えば、電気事業者によって提供される広域通信回線である。 The communication line 2 forms a computer network, and is, for example, a wide area communication line provided by an electric utility company.

携帯端末３は、例えば、スマートフォンやタブレット端末、ノートパソコン等の情報処理端末であり、オンラインサービスをユーザが利用するために用いられる。携帯端末３は、画像を表示するタッチパネルディスプレイ３ａ、音を出力するスピーカー３ｂ、音が入力されるマイクロフォン３ｃ、被写体を撮影するカメラ３ｄ、及びイヤホン（不図示）が接続されるイヤホン端子３e等を備える。なお、ここでいう撮影とは、カメラ３ｄが機能し、録画の有無にかかわりなく被写体がタッチパネルディスプレイ３ａに表示されている状態である。タッチパネルディスプレイ３ａは、例えばＬＣＤ（Liquid Crystal Display）及びタッチセンサを備える。ＬＣＤは、各種画像を表示し、タッチセンサは、指、スタイラス、又はペン等の指示体を用いて行われる各種入力操作を受け付ける。以下の説明ではタッチパネルディスプレイ３ａを画面３ａともいう。 The mobile terminal 3 is, for example, an information processing terminal such as a smartphone, a tablet terminal, or a notebook computer, and is used by a user to access online services. The mobile terminal 3 includes a touch panel display 3a that displays images, a speaker 3b that outputs sound, a microphone 3c that receives sound, a camera 3d that shoots a subject, an earphone terminal 3e to which earphones (not shown) are connected, and the like. Note that "shooting" here refers to the state in which the camera 3d is operating and the subject is displayed on the touch panel display 3a, regardless of whether recording is taking place. The touch panel display 3a includes, for example, an LCD (Liquid Crystal Display) and a touch sensor. The LCD displays various images, and the touch sensor accepts various input operations performed with a pointing object such as a finger, stylus, or pen. In the following description, the touch panel display 3a is also referred to as the screen 3a.

サーバ４は、通信回線２を介して、携帯端末３へオンラインサービスを提供する情報処理装置である。なお、図１の例では、歌唱ユーザは、携帯端末３Ａから自身の歌唱動画（歌唱動画データ）をサーバ４へ送信することで当該歌唱動画を動画投稿サイトにアップロードする。そして、視聴ユーザは、携帯端末３Ｂを用いて動画投稿サイトへアクセスし、当該歌唱動画を視聴する。なお、歌唱ユーザは、携帯端末３Ａを用いて動画投稿サイトへアクセスすることで、自身がアップロードした歌唱動画を視聴することも可能である。また、携帯端末３Ｂのユーザが歌唱ユーザとなり、歌唱動画を動画投稿サイトにアップロードすることも可能である。 The server 4 is an information processing device that provides online services to the mobile terminals 3 via the communication line 2. In the example of FIG. 1, the singing user uploads a singing video to the video posting site by transmitting the singing video (singing video data) from the mobile terminal 3A to the server 4. The viewing user then accesses the video posting site using the mobile terminal 3B and views the singing video. The singing user can also view the singing video that he or she uploaded by accessing the video posting site using the mobile terminal 3A. Likewise, the user of the mobile terminal 3B can act as a singing user and upload a singing video to the video posting site.

［２．サーバの構成］
図２は、本実施形態に係るサーバ４の電気的構成を示すブロック図である。 [2. Server configuration]
FIG. 2 is a block diagram showing the electrical configuration of the server 4 according to this embodiment.

本実施形態に係るサーバ４は、サーバ４全体の動作を司る主制御部であるＣＰＵ（Central Processing Unit）２０、各種プログラム及び各種データ等が予め記憶されたＲＯＭ（Read Only Memory）２２、ＣＰＵ２０による各種プログラムの実行時のワークエリア等として用いられるＲＡＭ（Random Access Memory）２４、各種プログラム及び各種データを記憶する記憶手段としてのＨＤＤ（Hard Disk Drive）２６を備えている。 The server 4 according to this embodiment includes a CPU (Central Processing Unit) 20, which is the main control unit governing the operation of the entire server 4; a ROM (Read Only Memory) 22 in which various programs, various data, and the like are stored in advance; a RAM (Random Access Memory) 24 used as a work area and the like when the CPU 20 executes various programs; and an HDD (Hard Disk Drive) 26 serving as storage means for storing various programs and various data.

ＨＤＤ２６は、携帯端末３Ａから送信された歌唱動画データ、すなわち動画投稿サイトにアップロードされた歌唱動画データや、歌唱ユーザが歌唱可能な楽曲を示す楽曲データ等を記憶する。なお、記憶手段は、ＨＤＤ２６に限らず、例えば、フラッシュメモリ等の半導体メモリ等の他の記憶媒体であってもよい。 The HDD 26 stores singing video data transmitted from the mobile terminal 3A, that is, singing video data uploaded to a video posting site, music data indicating songs that can be sung by the singing user, and the like. Note that the storage means is not limited to the HDD 26, but may be another storage medium such as a semiconductor memory such as a flash memory.

さらに、サーバ４は、キーボード及びマウス等で構成されて各種操作の入力を受け付ける操作入力部２８、各種画像を表示する例えば液晶ディスプレイ装置等のモニタ３０、通信回線２を介して携帯端末３等の他の情報処理装置等と接続され、他の情報処理装置等との間で各種データの送受信を行う外部インタフェース３２を備えている。 The server 4 further includes an operation input unit 28 that consists of a keyboard, a mouse, and the like and accepts input of various operations; a monitor 30, such as a liquid crystal display device, that displays various images; and an external interface 32 that is connected via the communication line 2 to other information processing devices such as the mobile terminals 3 and exchanges various data with them.

これらＣＰＵ２０、ＲＯＭ２２、ＲＡＭ２４、ＨＤＤ２６、操作入力部２８、モニタ３０、及び外部インタフェース３２は、システムバス３４を介して相互に電気的に接続されている。従って、ＣＰＵ２０は、ＲＯＭ２２、ＲＡＭ２４、及びＨＤＤ２６へのアクセス、操作入力部２８に対する操作状態の把握、モニタ３０に対する画像の表示、並びに外部インタフェース３２を介した他の情報処理装置等との各種データの送受信等を行なうことができる。 These CPU 20, ROM 22, RAM 24, HDD 26, operation input section 28, monitor 30, and external interface 32 are electrically connected to each other via a system bus 34. Therefore, the CPU 20 accesses the ROM 22, RAM 24, and HDD 26, grasps the operation status of the operation input unit 28, displays images on the monitor 30, and exchanges various data with other information processing devices etc. via the external interface 32. It is possible to send and receive data.

［３．携帯端末の電気的構成］
図３は、携帯端末３の電気的構成を示す機能ブロック図である。 [3. Electrical configuration of mobile terminal]
FIG. 3 is a functional block diagram showing the electrical configuration of the mobile terminal 3.

携帯端末３は、図１に示される構成に加え、主制御部４０、主記憶部４２、補助記憶部４４、通信部４６、及び操作ボタン４８を備える。 In addition to the configuration shown in FIG. 1, the mobile terminal 3 includes a main control section 40, a main storage section 42, an auxiliary storage section 44, a communication section 46, and an operation button 48.

主制御部４０は、例えば、ＣＰＵ、マイクロプロセッサ、ＤＳＰ（Digital Signal Processor）等であり、携帯端末３の全体の動作を制御する。 The main control unit 40 is, for example, a CPU, a microprocessor, a DSP (Digital Signal Processor), etc., and controls the overall operation of the mobile terminal 3.

主記憶部４２は、例えば、ＲＡＭやＤＲＡＭ（Dynamic Random Access Memory）等で構成されており、主制御部４０による各種プログラムに基づく処理の実行時のワークエリア等として用いられる。 The main storage unit 42 is composed of, for example, RAM or DRAM (Dynamic Random Access Memory), and is used as a work area when the main control unit 40 executes processing based on various programs.

補助記憶部４４は、例えば、フラッシュメモリ等の不揮発性メモリであり、画像等の各種データ及び主制御部４０の処理に利用されるプログラム等を保存する。補助記憶部４４に記憶されるプログラムは、例えば、携帯端末３の基本的な機能を実現するためのＯＳ（Operating System）、各種ハードウェアを制御するためのドライバ、電子メールやウェブブラウジング、その他各種機能を実現するためのプログラム等である。また、補助記憶部４４には、歌唱動画の撮影や投稿及び動画投稿サイトを視聴するためのアプリケーション（以下「動画投稿視聴アプリ」という。）が予め記憶されている。 The auxiliary storage unit 44 is, for example, a nonvolatile memory such as a flash memory, and stores various data such as images as well as programs used in processing by the main control unit 40. The programs stored in the auxiliary storage unit 44 include, for example, an OS (Operating System) that implements the basic functions of the mobile terminal 3, drivers for controlling various hardware, and programs that implement e-mail, web browsing, and various other functions. The auxiliary storage unit 44 also stores in advance an application for shooting and posting singing videos and for viewing the video posting site (hereinafter referred to as the "video posting and viewing application").

通信部４６は、例えばＮＩＣ（Network Interface Controller）であり、通信回線２に接続する機能を有する。なお、通信部４６は、ＮＩＣに代えて又はＮＩＣと共に、無線ＬＡＮ（Local Area Network）に接続する機能、無線ＷＡＮ（Wide Area Network）に接続する機能、例えばBluetooth（登録商標）等の近距離の無線通信、及び赤外線通信等を可能とする機能を有してもよい。 The communication unit 46 is, for example, a NIC (Network Interface Controller) and has a function of connecting to the communication line 2. Instead of or together with the NIC, the communication unit 46 may have a function of connecting to a wireless LAN (Local Area Network), a function of connecting to a wireless WAN (Wide Area Network), and functions that enable short-range wireless communication such as Bluetooth (registered trademark), infrared communication, and the like.

操作ボタン４８は、携帯端末３の側面に設けられ、携帯端末３を起動又は停止させるための電源ボタンやスピーカー３ｂが出力する音のボリューム調整ボタン等である。 The operation button 48 is provided on the side surface of the mobile terminal 3, and is a power button for starting or stopping the mobile terminal 3, a volume adjustment button for the sound output from the speaker 3b, etc.

これら主制御部４０、主記憶部４２、補助記憶部４４、通信部４６、操作ボタン４８、タッチパネルディスプレイ３ａ、スピーカー３ｂ、マイクロフォン３ｃ、カメラ３ｄ、及びイヤホン端子３ｅは、システムバス４９を介して相互に電気的に接続されている。従って、主制御部４０は、主記憶部４２及び補助記憶部４４へのアクセス、タッチパネルディスプレイ３ａに対する画像の表示、ユーザによるタッチパネルディスプレイ３ａや操作ボタン４８に対する操作状態の把握、マイクロフォン３ｃへの音の入力、スピーカー３ｂ又はイヤホン端子３ｅに接続されたイヤホンからの音の出力、カメラ３ｄに対する制御、及び通信部４６を介した各種通信網や他の情報処理装置へのアクセス等を行える。 The main control unit 40, main storage unit 42, auxiliary storage unit 44, communication unit 46, operation buttons 48, touch panel display 3a, speaker 3b, microphone 3c, camera 3d, and earphone terminal 3e are electrically connected to one another via a system bus 49. The main control unit 40 can therefore access the main storage unit 42 and the auxiliary storage unit 44, display images on the touch panel display 3a, detect the user's operations on the touch panel display 3a and the operation buttons 48, receive sound input through the microphone 3c, output sound from the speaker 3b or from earphones connected to the earphone terminal 3e, control the camera 3d, and access various communication networks and other information processing devices via the communication unit 46.

［４．歌唱ユーザによる歌唱動画の撮影］
歌唱ユーザが、携帯端末３Ａを用いて歌唱動画を撮影する場合について説明する。 [4. Shooting of singing videos by singing users]
A case where a singing user photographs a singing video using the mobile terminal 3A will be described.

歌唱ユーザは、歌唱動画を撮影する場合、携帯端末３Ａに動画投稿視聴アプリを起動させる。動画投稿視聴アプリが起動すると、携帯端末３Ａは複数の楽曲データを記憶したサーバ４にアクセスする。そして、歌唱ユーザは、動画投稿視聴アプリから自身で歌唱するための楽曲を任意に選択し、サーバ４から楽曲データを携帯端末３Ａへダウンロードする。そして、歌唱ユーザは、動画投稿視聴アプリを用いて任意のタイミングで楽曲を再生して歌唱を行う。動画投稿視聴アプリは、楽曲の再生を開始すると共に、カメラ３ｄによって動画の撮影を開始する。すなわち、歌唱動画は、携帯端末３Ａから楽曲が再生されながら携帯端末３Ａによって撮影された動画である。 When shooting a singing video, the singing user activates a video posting viewing application on the mobile terminal 3A. When the video posting and viewing application is started, the mobile terminal 3A accesses the server 4 that stores a plurality of pieces of music data. Then, the singing user arbitrarily selects a song for himself to sing from the video posting/viewing app, and downloads the song data from the server 4 to the mobile terminal 3A. Then, the singing user plays the song at an arbitrary timing using the video posting and viewing application and sings. The video posting/viewing application starts playing the music and starts shooting a video with the camera 3d. That is, the singing video is a video shot by the mobile terminal 3A while a song is being played back from the mobile terminal 3A.

なお、楽曲データには、歌詞データも関連付けられており、楽曲データがサーバ４から携帯端末３Ａにダウンロードされる場合には関連付けられている歌詞データも携帯端末３Ａにダウンロードされる。なお、以下の説明において、楽曲データには歌詞データも含まれるものとする。 Note that lyrics data is also associated with the music data, and when the music data is downloaded from the server 4 to the mobile terminal 3A, the associated lyrics data is also downloaded to the mobile terminal 3A. In the following description, it is assumed that the music data also includes lyrics data.

図４は、歌唱動画を撮影する場合における携帯端末３Ａの画面３ａにおける表示状態（以下「画面表示」という。）の一例である。 FIG. 4 is an example of a display state (hereinafter referred to as "screen display") on the screen 3a of the mobile terminal 3A when shooting a singing video.

図４に示されるように画面３ａは、歌詞表示領域５０Ａ及び撮影画像表示領域５０Ｂに分けられる。歌詞表示領域５０Ａは、ユーザが歌唱する楽曲の歌詞を示す歌詞画像５２、楽曲の音程を示す音程画像５４、及び撮影の進行度合いを示す進行バー５６を含む。 As shown in FIG. 4, the screen 3a is divided into a lyrics display area 50A and a photographed image display area 50B. The lyrics display area 50A includes a lyrics image 52 showing the lyrics of the song sung by the user, a pitch image 54 showing the pitch of the song, and a progress bar 56 showing the progress of shooting.

歌詞画像５２及び音程画像５４は、楽曲の進行に応じて更新される。本実施形態では、一例として、歌詞画像５２及び音程画像５４は数フレーズずつ更新して歌詞表示領域５０Ａに表示される。なお、歌詞画像５２と音程画像５４の更新タイミングは同じであってもよいし、異なってもよい。 The lyrics image 52 and the pitch image 54 are updated according to the progress of the song. In this embodiment, as an example, the lyrics image 52 and the pitch image 54 are updated several phrases at a time and displayed in the lyrics display area 50A. Note that the update timings of the lyrics image 52 and the pitch image 54 may be the same or different.

歌詞画像５２は、一例として、歌詞を複数段（図４の例では２段）で表示し、歌唱ユーザが現在歌唱すべき歌詞を把握可能なように、上段の歌詞の色が楽曲の進行に合わせて左端から右端へ変化する。上段の歌詞の色の変化が右端に達すると、下段の歌詞が上昇して上段に表示されると共に新たな歌詞が下段に表示され、楽曲の進行に合わせて再び上段の歌詞の色が左端から右端へ変化する。 As an example, the lyrics image 52 displays the lyrics in multiple rows (two rows in the example of FIG. 4), and the color of the upper-row lyrics changes from the left end to the right end as the song progresses so that the singing user can tell which lyrics to sing now. When the color change reaches the right end of the upper row, the lower-row lyrics move up to the upper row, new lyrics appear in the lower row, and the color of the upper-row lyrics again changes from the left end to the right end as the song progresses.
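
The left-to-right color change of the upper-row lyrics can be sketched as splitting the line into an already-colored part and a not-yet-colored part. The sketch assumes the color advances linearly in time across the characters of the line, which is a simplification; the function name `highlight_split` and the per-line start and end times are hypothetical.

```python
def highlight_split(line, line_start_s, line_end_s, position_s):
    """Split a lyrics line into (colored, uncolored) parts for a playback time."""
    if line_end_s <= line_start_s:
        raise ValueError("line_end_s must be later than line_start_s")
    fraction = (position_s - line_start_s) / (line_end_s - line_start_s)
    fraction = min(1.0, max(0.0, fraction))  # clamp before and after the line
    cut = int(len(line) * fraction)
    return line[:cut], line[cut:]


sung, unsung = highlight_split("abcdefghij", 10.0, 20.0, 15.0)
print(sung, unsung)
```

When the colored part covers the whole line, the display would move the lower row up and start again from the left end.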

音程画像５４は、一例として、複数の音程バー５４Ａが音程の強弱に合わせて左右方向に階段状に表示される。そして、歌唱ユーザが現在歌唱すべき歌詞の音程を把握可能なように、楽曲の進行に合わせて音程バー５４Ａの色が左端から右端へ変化すると共にポインタ５４Ｂが左端から右端へ移動する。音程バー５４Ａの色の変化及びポインタ５４Ｂが右端に達すると、次の音程を示す音程画像５４が更新表示される。 In the pitch image 54, for example, a plurality of pitch bars 54A are displayed in a step-like manner in the left and right direction according to the strength of the pitch. Then, the color of the pitch bar 54A changes from the left end to the right end and the pointer 54B moves from the left end to the right end as the song progresses so that the singing user can grasp the pitch of the lyrics to be sung. When the color of the pitch bar 54A changes and the pointer 54B reaches the right end, the pitch image 54 indicating the next pitch is updated and displayed.

進行バー５６は、一例として、左端から右端までの長さが楽曲全体の長さを示す。楽曲の再生が開始すると楽曲の再生位置を示すポインタ５６Ａが左端から右端へ移動し、ポインタ５６Ａが右端に達すると楽曲の終了となる。なお、ポインタ５６Ａが通過した進行バー５６は、進行前の位置に比べて太く表示される。 As an example, the length of the progress bar 56 from its left end to its right end represents the length of the entire song. When playback of the song starts, the pointer 56A indicating the playback position moves from the left end toward the right end, and the song ends when the pointer 56A reaches the right end. The portion of the progress bar 56 that the pointer 56A has already passed is displayed thicker than the portion it has not yet reached.

歌唱動画の録画は、歌唱ユーザが楽曲を選択した後、画面３ａに表示される録画開始ボタン（不図示）をクリックしてから所定時間後（例えば１０秒後）に開始される。また、動画の録画開始と終了は、楽曲の開始と終了に一致してもよいが、これに限らず、楽曲の開始所定時間前（例えば５秒前）から動画の録画が開始してもよいし、楽曲の終了所定時間後（例えば５秒後）に動画の録画が終了してもよい。 Recording of the singing video starts a predetermined time (for example, 10 seconds) after the singing user selects a song and clicks the recording start button (not shown) displayed on the screen 3a. The start and end of video recording may coincide with the start and end of the song, but this is not a limitation: recording may start a predetermined time (for example, 5 seconds) before the song starts, and may end a predetermined time (for example, 5 seconds) after the song ends.

歌唱ユーザは、イヤホンをイヤホン端子３ｅに接続して再生される楽曲をイヤホンを用いて聴き、当該楽曲に合わせて歌唱する。携帯端末３Ａは、カメラ３ｄによって歌唱ユーザを撮影すると共に、マイクロフォン３ｃによって歌唱ユーザの歌唱を録音する。すなわち、マイクロフォン３ｃは再生される楽曲の音は取得しない。そして、携帯端末３Ａはマイクロフォン３ｃで取得した歌唱ユーザの歌声を録音し、歌唱データとする。 The singing user connects earphones to the earphone terminal 3e, listens to the reproduced music using the earphones, and sings along with the music. The mobile terminal 3A photographs the singing user with the camera 3d and records the singing user's singing with the microphone 3c. That is, the microphone 3c does not acquire the sound of the music being played. Then, the mobile terminal 3A records the singing user's singing voice acquired by the microphone 3c, and uses it as singing data.

なお、歌唱データは、フィルタリング処理によって人間の声の周波数帯域を抽出したものとされてもよい。このフィルタリング処理によって、歌唱ユーザの周辺環境に起因する雑音が歌唱データから取り除かれることになるので、録音される歌唱ユーザの歌声がより鮮明となる。 Note that the singing data may be data obtained by extracting the frequency band of a human voice through filtering processing. By this filtering process, noise caused by the surrounding environment of the singing user is removed from the singing data, so that the recorded singing voice of the singing user becomes clearer.
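
The filtering idea can be illustrated with a very simple band-pass filter. The sketch below chains a first-order high-pass and a first-order low-pass stage; the filter type, the 100 Hz to 4000 Hz band, and the names `band_pass`, `tone`, and `rms` are all assumptions chosen for illustration, since the text above only says that the frequency band of the human voice is extracted.

```python
import math


def band_pass(samples, sample_rate, low_hz=100.0, high_hz=4000.0):
    """Attenuate content below `low_hz` and above `high_hz` (first-order stages)."""
    dt = 1.0 / sample_rate
    rc_high_pass = 1.0 / (2.0 * math.pi * low_hz)
    rc_low_pass = 1.0 / (2.0 * math.pi * high_hz)
    a = rc_high_pass / (rc_high_pass + dt)  # high-pass coefficient
    b = dt / (rc_low_pass + dt)             # low-pass coefficient
    out = []
    hp_prev = x_prev = lp_prev = 0.0
    for x in samples:
        hp = a * (hp_prev + x - x_prev)     # removes slow drift and hum
        lp = lp_prev + b * (hp - lp_prev)   # removes high-frequency hiss
        out.append(lp)
        hp_prev, x_prev, lp_prev = hp, x, lp
    return out


def tone(freq_hz, sample_rate=44100, seconds=0.5):
    """Generate a unit-amplitude sine wave for testing."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


voice_level = rms(band_pass(tone(1000.0), 44100))   # inside the assumed band
hum_level = rms(band_pass(tone(20.0), 44100))       # below the band
hiss_level = rms(band_pass(tone(15000.0), 44100))   # above the band
print(voice_level > hum_level, voice_level > hiss_level)
```

A tone inside the assumed band passes almost unchanged, while tones well below or above it come out noticeably weaker. A practical implementation would likely use steeper filters.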

そして、動画投稿視聴アプリは、楽曲データ及び歌唱データに録画データを組み合わせることで、サーバ４へ送信可能な歌唱動画データとする。なお、ユーザは、一例として、歌唱動画データをサーバ４へ送信するタイミング、すなわち、動画投稿サイトへアップロードするタイミングとして下記の２種類のうち一つを選択できる。 The video posting and viewing application combines the music data and the singing data with the recorded data to create singing video data that can be transmitted to the server 4. In addition, a user can select one of the following two types as the timing to transmit singing video data to the server 4, ie, the timing to upload to a video posting site, as an example.

一つは、歌唱ユーザが歌唱しながらリアルタイムで歌唱動画データを動画投稿サイトへアップロードするライブ配信である。ライブ配信では、視聴ユーザは歌唱ユーザによる歌唱をリアルタイムで視聴することになる。もう一つは、楽曲の歌唱が完了した後に、歌唱ユーザが任意のタイミングで動画投稿サイトへ歌唱動画データをアップロードする非ライブ配信である。 One is live distribution, in which a singing user uploads singing video data to a video posting site in real time while singing. In live distribution, viewing users will watch singing by singing users in real time. The other type is non-live distribution in which a singing user uploads singing video data to a video posting site at an arbitrary timing after the singing of a song is completed.

歌唱ユーザは、ライブ配信を行う場合には歌唱動画の録画前にライブ配信を行うための設定を行い、動画の録画開始と共に歌唱動画データが動画投稿サイトへアップロードされるようにする。なお、ライブ配信の場合には、歌唱動画データは携帯端末３Ａに記憶されることなく、動画投稿サイトへアップロードされてもよい。 When performing live distribution, the singing user makes settings for performing live distribution before recording the singing video, so that the singing video data is uploaded to the video posting site at the same time as the recording of the video starts. In addition, in the case of live distribution, the singing video data may be uploaded to the video posting site without being stored in the mobile terminal 3A.

なお、ライブ配信を行う場合の設定として、視聴ユーザがライブ配信で当該歌唱動画を視聴可能とする第１ライブ配信設定、ライブ配信後でも視聴ユーザが当該歌唱動画を視聴可能とする第２ライブ配信設定の何れか歌唱ユーザが設定可能とされる。すなわち、第１ライブ配信設定では、ライブ配信が終了するとサーバ４から歌唱動画データが削除され、視聴ユーザはライブ配信の終了後にライブ配信された歌唱動画の視聴ができない。一方、第２ライブ配信では、ライブ配信が終了してもサーバ４が当該歌唱動画データを記憶し続けるので、視聴ユーザはライブ配信の終了後でも非ライブ配信として当該歌唱動画の視聴ができる。 For live distribution, the singing user can choose between two settings: a first live distribution setting, under which viewing users can watch the singing video during the live distribution, and a second live distribution setting, under which viewing users can also watch the singing video after the live distribution. That is, under the first live distribution setting, the singing video data is deleted from the server 4 when the live distribution ends, so viewing users cannot watch the live-distributed singing video afterwards. Under the second live distribution setting, the server 4 continues to store the singing video data after the live distribution ends, so viewing users can still watch the singing video as a non-live distribution.

なお、非ライブ配信を行う場合には、歌唱動画データは携帯端末３Ａに一旦記憶され、歌唱ユーザが動画投稿視聴アプリを操作することで任意のタイミングで動画投稿サイトへ歌唱動画をアップロードする。 Note that when performing non-live distribution, the singing video data is temporarily stored in the mobile terminal 3A, and the singing user uploads the singing video to the video posting site at any timing by operating the video posting and viewing application.

［５．視聴ユーザによる歌唱動画の視聴］ [5. Viewing of singing videos by viewing users]
視聴ユーザが、携帯端末３Ｂを用いて歌唱動画を視聴する場合について説明する。 A case where a viewing user views a singing video using the mobile terminal 3B will be described.

視聴ユーザは、歌唱動画を視聴する場合、携帯端末３Ｂに動画投稿視聴アプリを起動させる。動画投稿視聴アプリが起動すると、携帯端末３Ｂは複数の歌唱動画データを記憶したサーバ４、すなわち動画投稿サイトにアクセスする。そして、視聴ユーザは、動画投稿視聴アプリを介して視聴したい歌唱動画を選択して画面３ａに表示させる。なお、サーバ４による携帯端末３Ｂへの歌唱動画の配信手法は一例として、ストリーミング配信である。 When viewing a singing video, the viewing user activates a video posting viewing application on the mobile terminal 3B. When the video posting/viewing application is started, the mobile terminal 3B accesses the server 4 storing a plurality of singing video data, that is, the video posting site. Then, the viewing user selects a singing video that he or she wants to view via the video posting and viewing application and displays it on the screen 3a. In addition, the method of distributing the singing video to the mobile terminal 3B by the server 4 is, for example, streaming distribution.

図５は、歌唱動画を視聴する場合における携帯端末３Ｂの画面表示の一例であり、ライブ配信が行われている場合の画面表示を示している。 FIG. 5 is an example of a screen display of the mobile terminal 3B when viewing a singing video, and shows a screen display when live distribution is being performed.

画面３ａには、歌唱動画が表示されると共に、歌唱ユーザ表示領域５０Ｃ、歌詞表示領域５０Ｄ、コメント入力表示領域５０Ｅが設けられる。歌唱ユーザ表示領域５０Ｃ、歌詞表示領域５０Ｄ、コメント入力表示領域５０Ｅは、歌唱動画に重畳して表示されてもよい。 The screen 3a displays a singing video, and is also provided with a singing user display area 50C, a lyrics display area 50D, and a comment input display area 50E. The singing user display area 50C, the lyrics display area 50D, and the comment input display area 50E may be displayed superimposed on the singing video.

歌唱ユーザ表示領域５０Ｃには、歌唱動画を投稿した歌唱ユーザのユーザ名、ライブ配信であるか否かの表示、歌唱している楽曲の名称が表示される。 The singing user display area 50C displays the user name of the singing user who posted the singing video, an indication of whether the streaming is live, and the name of the song being sung.

歌詞表示領域５０Ｄには、歌唱動画の歌詞が表示される。なお、表示される歌詞は、一例として、複数フレーズずつであり、楽曲の進行に合わせて歌詞の色が左端から右端へ変化する。なお、歌詞表示領域５０Ｄは、一例として、歌詞を複数段で表示してもよい。この場合、上段の歌詞の色の変化が右端に達すると、下段の歌詞が上昇して上段に表示されると共に新たな歌詞が下段に表示され、楽曲の進行に合わせて再び上段の歌詞の色が左端から右端へ変化する。 The lyrics of the singing video are displayed in the lyrics display area 50D. As an example, the lyrics are displayed several phrases at a time, and the color of the lyrics changes from the left end to the right end as the song progresses. The lyrics display area 50D may also display the lyrics in multiple rows. In that case, when the color change of the upper row reaches the right end, the lower row moves up into the upper row, new lyrics appear in the lower row, and the color of the upper row again changes from the left end to the right end as the song progresses.

コメント入力表示領域５０Ｅには、コメントの入力欄が表示されると共に、歌唱動画を視聴している視聴ユーザのコメントがユーザ名と共に表示される。なお、一例として、視聴ユーザからのコメントが入力される毎にコメント入力表示領域５０Ｅの最上段に当該コメントが追加表示され、それまでに表示されていたコメントは下方に繰り下がる。そして、コメントがコメント入力表示領域５０Ｅに表示しきれなくなった場合には、コメント表示領域の右側にスクロールバー（不図示）が表示され、当該スクロールバーを視聴ユーザが操作することで、それまで画面３ａに表示されなかったコメントが表示される。 The comment input display area 50E displays a comment input field together with the comments of viewing users who are watching the singing video, each shown with its user name. As an example, each time a viewing user enters a comment, that comment is added at the top of the comment input display area 50E, and the comments already displayed move down. When the comments no longer fit in the comment input display area 50E, a scroll bar (not shown) is displayed on the right side of the comment display area, and the viewing user can operate the scroll bar to display comments that were not previously shown on the screen 3a.

さらに、画面３ａには、視聴ユーザが各種操作を行うための操作アイコン５８Ａ～５８Ｄが表示される。 Further, on the screen 3a, operation icons 58A to 58D are displayed for the viewing user to perform various operations.

操作アイコン５８Ａは、視聴ユーザが視聴している歌唱動画に視聴ユーザが共感等した場合にクリックされるアイコンであり、当該歌唱動画に対する操作アイコン５８Ａのクリックの総数が操作アイコン５８Ａの上方に表示される。 The operation icon 58A is an icon that the viewing user clicks to express, for example, empathy with the singing video being viewed; the total number of clicks on the operation icon 58A for that singing video is displayed above the operation icon 58A.

操作アイコン５８Ｂは、画面３ａに表示されている歌唱動画をライブ配信している歌唱ユーザに対して視聴ユーザが対戦（以下「対戦歌唱」という。）を申し込む場合にクリックされるアイコンである。対戦歌唱は、異なる歌唱ユーザによる複数の歌唱動画（第１歌唱動画、第２歌唱動画）を視聴ユーザの携帯端末３Ｂの画面３ａに同時に表示し、歌唱動画が同じ楽曲を交互に歌唱するものである。すなわち、操作アイコン５８Ｂをクリックした視聴ユーザは、対戦歌唱を行う歌唱ユーザとなる。 The operation icon 58B is an icon that the viewing user clicks to challenge the singing user who is live-distributing the singing video displayed on the screen 3a to a competition (hereinafter "competitive singing"). In competitive singing, a plurality of singing videos by different singing users (a first singing video and a second singing video) are displayed simultaneously on the screen 3a of the viewing user's mobile terminal 3B, and the singing users in those videos take turns singing the same song. That is, a viewing user who clicks the operation icon 58B becomes a singing user who takes part in competitive singing.

操作アイコン５８Ｃは、動画投稿視聴アプリに対する各種設定を視聴ユーザが行う場合にクリックされるアイコンである。 The operation icon 58C is an icon that is clicked when the viewing user makes various settings for the video posting viewing application.

操作アイコン５８Ｄは、画面３ａに表示されている歌唱動画に対して装飾画像を重畳させる場合に視聴ユーザによってクリックされるアイコンである。なお、本実施形態に係る装飾画像は、その種類により金額が決められており、視聴ユーザが課金により購入可能とされている。そして、視聴ユーザは、操作アイコン５８Ｄをクリックすることで、自身が視聴している歌唱動画に対して装飾画像を重畳させる。装飾画像が重畳された歌唱動画の歌唱ユーザは、重畳された装飾画像に応じた金銭を動画投稿サイトの運営者から受け取る。すなわち、視聴ユーザによる歌唱動画への装飾画像の重畳（表示指示）は、歌唱ユーザに対する、いわゆる投げ銭に相当する。 The operation icon 58D is an icon that is clicked by the viewing user when a decorative image is to be superimposed on the singing video displayed on the screen 3a. Note that the price of the decorative image according to this embodiment is determined depending on its type, and the viewing user can purchase it by paying a fee. The viewing user then superimposes the decorative image on the singing video that he or she is viewing by clicking the operation icon 58D. A singing user of a singing video on which a decorative image is superimposed receives money from the operator of the video posting site according to the superimposed decorative image. That is, the superimposition (display instruction) of a decorative image on a singing video by a viewing user corresponds to what is called a tip to the singing user.

［６．動画合成機能］ [6. Video synthesis function]
本実施形態に係る動画投稿視聴アプリは、複数の歌唱動画を合成する動画合成機能を有する。動画合成機能は、先に録画した第１歌唱動画に対して、後に録画した第２歌唱動画を合成する機能である。なお、第１歌唱動画と第２歌唱動画で歌唱される楽曲は同じ楽曲である。 The video posting and viewing application according to this embodiment has a video synthesis function that combines a plurality of singing videos. The video synthesis function combines a first singing video recorded earlier with a second singing video recorded later. Note that the same song is sung in the first singing video and the second singing video.

［６－１．合成開始位置］ [6-1. Synthesis start position]
図６は、本実施形態に係る動画合成機能の内容を示す模式図である。本実施形態に係る動画合成機能は、先に録画した第１歌唱動画に対して、後に録画する第２歌唱動画の合成を開始する合成開始位置を決定し（図６（Ａ））、合成開始位置を決定した後にカメラ３ｄで第２歌唱動画を録画し（図６（Ｂ））、第１歌唱動画の合成開始位置で第１歌唱動画と第２歌唱動画とを合成する（図６（Ｃ））。なお、合成開始位置や再生位置等で用いられる位置とは、例えば歌唱動画や楽曲の先頭からの経過時間などで表される。 FIG. 6 is a schematic diagram showing the video synthesis function according to this embodiment. The video synthesis function determines a synthesis start position in the previously recorded first singing video at which synthesis of a second singing video, to be recorded later, begins (FIG. 6(A)); records the second singing video with the camera 3d after the synthesis start position has been determined (FIG. 6(B)); and combines the first singing video and the second singing video at the synthesis start position of the first singing video (FIG. 6(C)). A position, as in the synthesis start position or a playback position, is expressed as, for example, the elapsed time from the beginning of the singing video or the song.

このような動画合成機能によって歌唱ユーザは、歌唱動画（第１歌唱動画）を録画したものの、その出来栄えが望ましくないと感じた場合に、望ましくないと感じた動画部分よりも前の位置を合成開始位置とし、合成開始位置から歌唱動画（第２歌唱動画）を撮影し直す。そして、第１歌唱動画の合成開始位置で第１歌唱動画と第２歌唱動画とを合成することで、歌唱ユーザは、自身が望ましいと感じる歌唱動画を容易に作成可能となる。 With this video synthesis function, a singing user who has recorded a singing video (the first singing video) but is dissatisfied with part of it sets the synthesis start position to a point before the unsatisfactory portion and re-records the singing video (the second singing video) from the synthesis start position. By combining the first singing video and the second singing video at the synthesis start position of the first singing video, the singing user can easily create a singing video that he or she finds satisfactory.

本実施形態に係る動画合成機能は、上述のように、第１歌唱動画に対する合成開始位置を決定した後に、第２歌唱動画の録画を開始する。このため、動画合成機能は、第１歌唱動画における合成開始位置に対応する楽曲の再生位置から第２歌唱動画の録画を開始する。従って、歌唱ユーザは、第２歌唱動画の録画において楽曲の歌唱を最初から行う必要はなく合成開始位置から行えばよいため、より簡易に歌唱動画の合成を行える。 As described above, the video synthesis function according to the present embodiment starts recording the second singing video after determining the synthesis start position for the first singing video. Therefore, the video synthesis function starts recording the second singing video from the playback position of the song corresponding to the synthesis start position in the first singing video. Therefore, the singing user does not have to sing the song from the beginning when recording the second singing video, but only needs to start singing from the synthesis start position, and can more easily synthesize the singing video.

図７は、合成開始位置を決定する場合に歌唱ユーザが操作する携帯端末３Ａの画面３ａに表示される画像であり、画面３ａは再生制御領域５０Ｆ及び再生動画表示領域５０Ｇに分けられる。再生動画表示領域５０Ｇは、再生対象となる第１歌唱動画が表示される。再生制御領域５０Ｆは、第１歌唱動画の楽曲の歌詞を示す歌詞画像５２、楽曲の音程を示す音程画像５４、及び動画の再生位置を選択するためのスライドバー６０が表示される。 FIG. 7 is an image displayed on the screen 3a of the mobile terminal 3A operated by the singing user when determining the synthesis start position, and the screen 3a is divided into a playback control area 50F and a playback video display area 50G. In the playback video display area 50G, the first singing video to be played is displayed. In the playback control area 50F, a lyrics image 52 showing the lyrics of the music of the first singing video, a pitch image 54 showing the pitch of the music, and a slide bar 60 for selecting the playback position of the video are displayed.

スライドバー６０は、左端が第１歌唱動画の開始を示し、右端が第１歌唱動画の終了を示している。歌唱ユーザが、ポインタ６０Ａを左右に移動させると、当該移動に伴い画面３ａに表示される第１歌唱動画の再生位置が変化し、第１歌唱動画は一時停止の状態で画面３ａに表示される。また、ポインタ６０Ａの移動に伴い、歌詞画像５２及び音程画像５４も更新される。 The left end of the slide bar 60 indicates the start of the first singing video, and the right end indicates its end. When the singing user moves the pointer 60A left or right, the playback position of the first singing video displayed on the screen 3a changes accordingly, and the first singing video is displayed on the screen 3a in a paused state. The lyrics image 52 and the pitch image 54 are also updated as the pointer 60A moves.

そして、歌唱ユーザによって決定ボタン６２がクリックされると、再生動画表示領域５０Ｇに表示されている第１歌唱動画の再生位置が合成開始位置として決定される。 Then, when the singing user clicks the determination button 62, the playback position of the first singing video displayed in the playback video display area 50G is determined as the synthesis start position.

なお、合成開始位置として設定可能な位置は、楽曲の予め定められた位置とされてもよい。この予め定められた位置とは、例えば、歌詞のフレーズとフレーズとの境等、歌詞の途中からの歌い出しが容易な位置であり、複数個所が設定されている。換言すると、フレーズの途中等、途中からでは歌い出しが難しい位置は合成開始位置として歌唱ユーザは選択できない。なお、スライドバー６０や歌詞画像５２に、合成開始位置として設定可能な複数の位置が歌唱ユーザに認識できるように表示されてもよい。 Note that the position that can be set as the synthesis start position may be a predetermined position of the song. This predetermined position is, for example, a position where it is easy to start singing from the middle of the lyrics, such as a boundary between phrases of the lyrics, and a plurality of positions are set. In other words, a singing user cannot select a position where it is difficult to start singing from the middle, such as in the middle of a phrase, as a synthesis start position. Note that a plurality of positions that can be set as the synthesis start position may be displayed on the slide bar 60 or the lyrics image 52 so that the singing user can recognize them.
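The restriction of the synthesis start position to predetermined positions can be illustrated with a minimal sketch. It assumes that the allowed positions are held as a sorted list of elapsed times in seconds and that a slider position is snapped back to the nearest allowed position at or before it; the function name, the list values, and the snap-backward rule are all hypothetical and are not stated in the source.

```python
from bisect import bisect_right


def snap_to_allowed_start(requested_s, allowed_starts_s):
    """Return the latest allowed start position at or before requested_s.

    allowed_starts_s: ascending list of elapsed times (seconds) at which
    starting to sing is considered easy, e.g. phrase boundaries.
    If the request precedes every allowed position, the first one is used.
    """
    if not allowed_starts_s:
        raise ValueError("no allowed start positions configured")
    i = bisect_right(allowed_starts_s, requested_s)
    return allowed_starts_s[max(i - 1, 0)]


# Hypothetical phrase boundaries for one song, in seconds.
phrase_boundaries = [0.0, 12.5, 27.0, 41.2]
chosen = snap_to_allowed_start(30.0, phrase_boundaries)
```

Snapping backward rather than to the nearest position keeps the chosen point before the portion the singing user wants to redo.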

また、図７の例に限らず、画面３ａには第１歌唱動画を再生（一時停止解除）するためのボタンや、再生を停止するためのボタン、早送り又は早戻しするためのボタン、スロー再生するためのボタン等が表示されてもよい。 The screen 3a is not limited to the example in FIG. 7 and may also display buttons for playing the first singing video (releasing the pause), stopping playback, fast-forwarding or rewinding, slow playback, and the like.

［６－２．第２歌唱動画の録画］ [6-2. Recording of the second singing video]
動画合成機能は、合成開始位置が決定されると第２歌唱動画の録画を行う。 Once the synthesis start position has been determined, the video synthesis function records the second singing video.

本実施形態に係る動画合成機能は、図８に示されるように、第１歌唱動画の合成開始位置における画像である合成開始位置画像６４（静止画像）を、第２歌唱動画の録画開始時に画面３ａに表示するガイド機能を有する。合成開始位置画像６４は、歌唱ユーザが画面３ａに表示されている画像が合成開始位置画像６４であると認識できるように、一例として薄く表示される。そして、動画合成機能は、図９に示されるように、第２歌唱動画の録画のためにカメラ３ｄで撮影されている画像（以下「現在撮影画像」という。）に合成開始位置画像６４を重畳して画面３ａに表示する。 As shown in FIG. 8, the video synthesis function according to this embodiment has a guide function that displays a synthesis start position image 64 (a still image), which is the image of the first singing video at the synthesis start position, on the screen 3a when recording of the second singing video starts. As an example, the synthesis start position image 64 is displayed faintly so that the singing user can recognize that the image displayed on the screen 3a is the synthesis start position image 64. Then, as shown in FIG. 9, the video synthesis function superimposes the synthesis start position image 64 on the image being captured by the camera 3d for recording the second singing video (hereinafter the "currently captured image") and displays the result on the screen 3a.

このように、ガイド機能は、合成開始位置画像６４を画面３ａに表示することで第２歌唱動画の録画開始時に、歌唱ユーザに第１歌唱動画の合成開始位置における画像を必然的に確認させることとなる。従って、歌唱ユーザは、第２歌唱動画の録画開始時において、第１歌唱動画における合成開始位置と同様の画像構成やポーズ、表情とすることが可能となる。その結果、第１歌唱動画と第２歌唱動画との合成において違和感の小さい合成が可能となる。さらに、図９に示されるように、合成開始位置画像６４と現在撮影画像とを重畳することで、歌唱ユーザは、第２歌唱動画の録画開始時の画像を合成開始位置画像６４と同様とすることがより簡易にできる。 In this way, by displaying the synthesis start position image 64 on the screen 3a, the guide function ensures that the singing user sees the image of the first singing video at the synthesis start position when recording of the second singing video starts. The singing user can therefore begin the second singing video with the same framing, pose, and facial expression as at the synthesis start position in the first singing video. As a result, the first singing video and the second singing video can be combined in a way that looks less unnatural. Furthermore, as shown in FIG. 9, superimposing the synthesis start position image 64 on the currently captured image makes it easier for the singing user to match the image at the start of recording of the second singing video to the synthesis start position image 64.

ガイド機能は、第１歌唱動画の合成開始位置における歌唱ユーザの画像と、第２歌唱動画の録画開始時における歌唱ユーザの画像との一致率などを検出する検出機能を備えてもよい。ガイド機能は、第１歌唱動画の合成開始位置における歌唱ユーザの画像と、第２歌唱動画の録画開始時における歌唱ユーザの画像との一致率が所定の閾値よりも低い場合にユーザに通知をする機能を備えてもよい。 The guide function may include a detection function that detects a matching rate between the image of the singing user at the synthesis start position of the first singing video and the image of the singing user at the time of starting recording of the second singing video. The guide function notifies the user when the matching rate between the image of the singing user at the synthesis start position of the first singing video and the image of the singing user at the start of recording of the second singing video is lower than a predetermined threshold. It may also have a function.
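The source does not define how the matching rate is calculated. The following is a minimal sketch under stated assumptions: each image is a flat sequence of 8-bit grayscale pixel values of equal length, the matching rate is one minus the normalized mean absolute difference, and the threshold of 0.8 is a placeholder. The function names and all of these choices are hypothetical; a real implementation might compare poses or face regions instead.

```python
def match_rate(reference_pixels, live_pixels):
    """Return a similarity score in [0.0, 1.0] for two grayscale images.

    Assumed metric (not from the source): 1 - mean absolute difference / 255.
    """
    if len(reference_pixels) != len(live_pixels) or not reference_pixels:
        raise ValueError("images must be non-empty and the same size")
    total_diff = sum(abs(a - b) for a, b in zip(reference_pixels, live_pixels))
    return 1.0 - total_diff / (255 * len(reference_pixels))


def should_notify(reference_pixels, live_pixels, threshold=0.8):
    """True when the matching rate falls below the (hypothetical) threshold."""
    return match_rate(reference_pixels, live_pixels) < threshold
```

Here `reference_pixels` would come from the synthesis start position image 64 and `live_pixels` from the currently captured image at the start of recording.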

なお、図８，９の例では、画面３ａにおける再生動画表示領域５０Ｇ全体に合成開始位置画像６４が表示されているが、これに限らず、合成開始位置画像６４はウィンドウ表示されて再生動画表示領域５０Ｇの一部に重畳されてもよい。そして、ウィンドウ表示されている合成開始位置画像６４と現在撮影画像とが重畳して表示されてもよい。 In the examples of FIGS. 8 and 9, the synthesis start position image 64 is displayed over the entire playback video display area 50G of the screen 3a, but this is not a limitation; the synthesis start position image 64 may instead be displayed in a window superimposed on part of the playback video display area 50G. The windowed synthesis start position image 64 and the currently captured image may then be displayed superimposed on each other.

次に図１０を参照して、第２歌唱動画の録画について説明する。図１０は、合成開始位置画像６４の画面３ａへの表示タイミング、及び詳細を後述する第１歌唱動画と第２歌唱動画の重畳合成を示す模式図であり、斜線（ハッチング）で示されるバーは第２歌唱動画を録画する場合に、画面３ａに表示される画像を示す。 Next, with reference to FIG. 10, recording of the second singing video will be described. FIG. 10 is a schematic diagram showing the display timing of the synthesis start position image 64 on the screen 3a and superimposition synthesis of the first singing video and the second singing video, details of which will be described later. Bars indicated by diagonal lines (hatching) are An image displayed on the screen 3a when recording the second singing video is shown.

図１０に示されるように合成開始位置画像６４は、第２歌唱動画の録画開始前から静止画として画面３ａに表示され、第２歌唱動画の録画開始と共に画面３ａへの表示が停止される。 As shown in FIG. 10, the synthesis start position image 64 is displayed as a still image on the screen 3a before the recording of the second singing video starts, and the display on the screen 3a is stopped when the recording of the second singing video starts.

そして、動画合成機能は、合成開始位置の前から楽曲の再生を開始し、再生した楽曲が合成開始位置に対応するタイミングに達すると第２歌唱動画の録画を開始する。本実施形態に係る動画合成機能は、一例として、合成開始位置の所定期間（例えば３０秒）前から楽曲の再生を開始する。これにより、歌唱ユーザは、余裕を持って第２歌唱動画の歌い出しと録画開始のタイミングとを合わせて第２歌唱動画の録画が可能となる。また、動画合成機能は、第２歌唱動画の録画開始前から楽曲の再生を開始すると共に、第２歌唱動画の録画開始のカウントダウン表示も行う（図９参照）。これにより、歌唱ユーザは、第２歌唱動画の歌い出しのタイミングを明確に認識できる。 Then, the video composition function starts playing the music from before the composition start position, and starts recording the second singing video when the played music reaches the timing corresponding to the composition start position. As an example, the video synthesis function according to the present embodiment starts playing music a predetermined period (for example, 30 seconds) before the synthesis start position. Thereby, the singing user can record the second singing video by matching the timing of the singing of the second singing video and the recording start timing with sufficient time. Further, the video synthesis function starts playing the music before the start of recording of the second singing video, and also displays a countdown to the start of recording of the second singing video (see FIG. 9). Thereby, the singing user can clearly recognize the timing at which the second singing video starts singing.
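The timing described above can be sketched as a small calculation. The 30-second lead time is the example value from the text; clamping playback to the beginning of the song when the synthesis start position lies less than 30 seconds in is an assumption, as are the function and parameter names.

```python
def plan_retake(start_s, pre_roll_s=30.0):
    """Return (playback_from_s, countdown_s) for recording the second video.

    start_s:    synthesis start position, seconds from the start of the song.
    pre_roll_s: how long before start_s the song playback begins.
    Playback is clamped to 0.0 (assumption) if start_s < pre_roll_s.
    Recording begins when playback reaches start_s, i.e. after countdown_s.
    """
    if start_s < 0 or pre_roll_s < 0:
        raise ValueError("positions and durations must be non-negative")
    playback_from_s = max(0.0, start_s - pre_roll_s)
    countdown_s = start_s - playback_from_s
    return playback_from_s, countdown_s


playback_from, countdown = plan_retake(75.0)
```

For a synthesis start position 75 seconds into the song, playback starts at 45 seconds and the countdown shown to the singing user runs for 30 seconds.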

さらに、図１０に示されるように、合成開始位置画像６４の表示が停止されて第２歌唱動画の録画が開始された後に、ガイド機能として、合成開始位置から所定期間だけ第１歌唱動画を画面３ａに重畳表示させてもよい。これにより、歌唱ユーザは、第１歌唱動画に合わせた画像構成やポーズ、表情としながら、第２歌唱動画の録画が可能となるため、より違和感の小さい第１歌唱動画と第２歌唱動画との合成が可能となる。なお、この所定期間は、一例として、詳細を後述する重畳合成期間であり、例えば５秒であるが、時間を基準単位とするのではなくフレーム数を基準に重畳合成期間が設定されてもよい。 Furthermore, as shown in FIG. 10, after the display of the synthesis start position image 64 has stopped and recording of the second singing video has started, the guide function may superimpose the first singing video on the screen 3a for a predetermined period from the synthesis start position. This allows the singing user to record the second singing video while matching the framing, pose, and facial expression of the first singing video, so the two videos can be combined with even less unnaturalness. As an example, this predetermined period is the superimposition synthesis period described in detail later and is, for example, 5 seconds; the superimposition synthesis period may instead be specified as a number of frames rather than as a length of time.

ここで、図１０を参照して、第２歌唱動画の録画の全体の流れを説明する。 Here, with reference to FIG. 10, the overall flow of recording the second singing video will be described.

歌唱ユーザの操作によって合成開始位置が決定されると、合成開始位置画像６４が画面３ａに表示されると共にカメラ３ｄによる撮影が開始され、合成開始位置画像６４と現在撮影画像とが重畳されて画面３ａに表示される。このとき、第１歌唱動画の楽曲と同じ楽曲データが、サーバ４から携帯端末３Ａへダウンロードされる。そして、歌唱ユーザが、第２歌唱動画の録画開始指示を携帯端末３Ａに入力すると、楽曲の再生と共に録画のカウントダウンが開始される。 When the synthesis start position is determined by the singing user's operation, the synthesis start position image 64 is displayed on the screen 3a and shooting with the camera 3d starts, so that the synthesis start position image 64 and the currently captured image are displayed superimposed on the screen 3a. At this time, the same music data as that of the first singing video is downloaded from the server 4 to the mobile terminal 3A. Then, when the singing user inputs an instruction to start recording the second singing video into the mobile terminal 3A, playback of the song and the recording countdown both start.

録画カウントダウンが終了すると、第２歌唱動画の録画が開始されると共に合成開始位置画像６４の表示が停止される。そして、合成開始位置から所定期間（重畳合成期間）における第１歌唱動画が現在撮影画像に重畳して画面３ａに表示される。なお、録画される第２歌唱動画には、重畳されて表示される第１歌唱動画は含まれない。所定期間（重畳合成期間）が経過すると、重畳されている第１歌唱動画の表示は停止され、楽曲が終了するまで第２歌唱動画の録画が継続される。 When the recording countdown ends, recording of the second singing video is started and the display of the composition start position image 64 is stopped. Then, the first singing video for a predetermined period (superimposition synthesis period) from the synthesis start position is displayed on the screen 3a while being superimposed on the currently captured image. Note that the second singing video to be recorded does not include the first singing video that is displayed in a superimposed manner. When a predetermined period (superimposition synthesis period) has elapsed, the display of the superimposed first singing video is stopped, and recording of the second singing video is continued until the music ends.
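The sequence of guide displays over the song timeline (the hatched bar in FIG. 10) can be summarized in a short sketch. The function names and the string labels are hypothetical; the 5-second default is the example superimposition synthesis period given in the text.

```python
def guide_overlay(t_s, start_s, overlap_s=5.0):
    """Return which guide is overlaid on the currently captured image.

    t_s:       current song position in seconds.
    start_s:   synthesis start position in seconds.
    overlap_s: superimposition synthesis period in seconds.
    """
    if t_s < start_s:
        return "start_position_image_64"  # still image, before recording
    if t_s < start_s + overlap_s:
        return "first_singing_video"      # shown on screen, not recorded
    return "none"                         # camera image only


def is_recording(t_s, start_s):
    """Recording of the second singing video runs from start_s onward."""
    return t_s >= start_s
```

The overlay affects only what is shown on the screen 3a; as the text notes, the recorded second singing video does not contain the superimposed first singing video.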

［６－３．第１歌唱動画と第２歌唱動画との合成］ [6-3. Combining the first singing video and the second singing video]
動画合成機能は、第２歌唱動画の録画が終了すると、第１歌唱動画の合成開始位置で第１歌唱動画と録画した第２歌唱動画とを合成する。なお、以下の説明では、第１歌唱動画と第２歌唱動画を合成して得られた動画を合成歌唱動画という。 When recording of the second singing video ends, the video synthesis function combines the first singing video and the recorded second singing video at the synthesis start position of the first singing video. In the following description, the video obtained by combining the first singing video and the second singing video is referred to as a composite singing video.
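In its simplest form, this combination can be pictured as a splice of two frame sequences. The sketch below assumes that each video is a list of frames, that the first frame of the second singing video corresponds to the synthesis start position, and that the position in seconds is converted to a frame index with a known frame rate; the names and the 30 fps value are hypothetical.

```python
def start_frame_index(start_s, fps):
    """Convert a synthesis start position in seconds to a frame index."""
    return round(start_s * fps)


def splice(first_frames, second_frames, start_index):
    """Keep the first video up to start_index, then continue with the second."""
    if not 0 <= start_index <= len(first_frames):
        raise ValueError("start_index lies outside the first video")
    return first_frames[:start_index] + second_frames


# Hypothetical example: a 4-frame first video re-recorded from frame 2.
composite = splice(["A0", "A1", "A2", "A3"], ["B2", "B3"], 2)
```

The frames of the first singing video from the synthesis start position onward are discarded and replaced by the second singing video.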

動画合成機能は、第１歌唱動画と第２歌唱動画との合成部分を目立たせないための画像処理（以下「エフェクト処理」という。）を行ってもよい。エフェクト処理は、例えば、合成部分の前後所定期間（例えば前後５秒ずつ、又は前後５フレームずつ）である。図１１の例では、エフェクト処理として複数の所定画像（星画像）をランダムにちりばめているが、これに限らず、モザイク処理やぼかし処理等、他の画像処理が行われてもよい。 The video synthesis function may perform image processing (hereinafter "effect processing") to make the composite portion of the first singing video and the second singing video less noticeable. The effect processing is applied, for example, over a predetermined period before and after the composite portion (for example, 5 seconds or 5 frames on each side). In the example of FIG. 11, a plurality of predetermined images (star images) are scattered at random as the effect processing, but this is not a limitation, and other image processing such as mosaic processing or blurring may be performed.

このエフェクト処理により、第１歌唱動画と第２歌唱動画との境が不明瞭になるため、より違和感の小さい第１歌唱動画と第２歌唱動画との合成が可能となる。 This effect processing makes the boundary between the first singing video and the second singing video unclear, so it is possible to synthesize the first singing video and the second singing video with less discomfort.

また、エフェクト処理として、合成部分の画像全体を瞬間的に所定色に置き換えてもよい。所定色とは、例えば白色や黒色であり、例えば所定色を白色（又は黒色）とすることで意図的に白飛び（又は黒飛び）を画面に生じさせることで、第１歌唱動画と第２歌唱動画との境が不明瞭になる。 As another form of effect processing, the entire image of the composite portion may be momentarily replaced with a predetermined color. The predetermined color is, for example, white or black; setting it to white (or black) deliberately produces blown-out highlights (or crushed blacks) on the screen, which obscures the boundary between the first singing video and the second singing video.

また、動画合成機能は、図１０に示されるように、合成開始位置から所定期間（上述した重畳合成期間）の第１歌唱動画と第２歌唱動画とを重畳させて、第１歌唱動画と第２歌唱動画とを合成（以下「重畳合成」という。）してもよい。この重畳合成は、いわゆるクロスディゾルブやオーバーラップといわれる画像処理であり、例えば、重畳させる第１歌唱動画を時間と共にフェードアウトさせる一方で、第２歌唱動画をフェードインさせてもよい。この重畳合成により、第１歌唱動画と第２歌唱動画との境が不明瞭になるため、より違和感の小さい第１歌唱動画と第２歌唱動画との合成が可能となる。 In addition, as shown in FIG. 10, the video synthesis function superimposes the first singing video and the second singing video for a predetermined period (the above-mentioned superimposition synthesis period) from the synthesis start position, so that the first singing video and the second singing video are The two singing videos may be synthesized (hereinafter referred to as "superimposed synthesis"). This superimposition synthesis is image processing called cross-dissolve or overlap, and for example, while the first singing video to be superimposed is faded out over time, the second singing video may be faded in. This superimposition synthesis makes the boundary between the first singing video and the second singing video unclear, so it is possible to synthesize the first singing video and the second singing video with less discomfort.
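A cross-dissolve of this kind can be sketched as a per-frame weighted blend. The sketch assumes that each frame is a list of numeric pixel values, that the first frame of the second singing video corresponds to the synthesis start position, and that the weights change linearly over the superimposition synthesis period; the linear ramp and all names are assumptions, since the source says only that one video fades out while the other fades in.

```python
def cross_dissolve(first_frames, second_frames, start_index, overlap):
    """Splice two videos with a linear cross-dissolve of `overlap` frames.

    first_frames[i] is frame i of the song; second_frames[k] is frame
    start_index + k of the song. The overlap is shortened if either video
    has fewer frames available.
    """
    if not 0 <= start_index <= len(first_frames) or overlap < 0:
        raise ValueError("invalid start_index or overlap")
    overlap = min(overlap, len(second_frames), len(first_frames) - start_index)
    out = list(first_frames[:start_index])
    for k in range(overlap):
        w = (k + 1) / (overlap + 1)  # weight of the second video, rising
        a = first_frames[start_index + k]
        b = second_frames[k]
        out.append([(1 - w) * pa + w * pb for pa, pb in zip(a, b)])
    out.extend(second_frames[overlap:])
    return out


# Hypothetical one-pixel frames: first video is dark, second is bright.
first = [[0.0], [0.0], [0.0], [0.0]]
second = [[100.0], [100.0], [100.0]]
blended = cross_dissolve(first, second, 1, 3)
```

In this example the three overlapping frames receive weights of 0.25, 0.5, and 0.75 for the second singing video, so the picture moves gradually from the first video to the second.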

なお、本実施形態に係る動画合成機能は、録音されている歌声に関しては重畳合成を行わないが、これに限らず、歌声に関しても重畳合成を行ってもよい。 Note that although the video synthesis function according to the present embodiment does not perform superimposition synthesis on recorded singing voices, the present invention is not limited to this, and may perform superposition synthesis on singing voices as well.

［７．動画合成機能の機能ブロック］ [7. Functional blocks of the video synthesis function]
図１２は、本実施形態に係る動画合成機能に関する機能ブロック図である。携帯端末３が備える主制御部４０は、画像表示制御部７０、録画制御部７２、合成開始位置決定部７４、及び動画合成部７６を備える。主制御部４０が備える各機能によって実行される処理は、補助記憶部４４に記憶されているプログラムによって実現される。 FIG. 12 is a functional block diagram of the video synthesis function according to this embodiment. The main control unit 40 of the mobile terminal 3 includes an image display control unit 70, a recording control unit 72, a synthesis start position determination unit 74, and a video synthesis unit 76. The processing executed by each function of the main control unit 40 is implemented by a program stored in the auxiliary storage unit 44.

画像表示制御部７０は、画面３ａに対する画像の表示を制御するものであり、例えば、動画投稿視聴アプリが起動されてサーバ４から配信された動画を画面３ａに表示させたり、カメラ３ｄで撮影した画像を画面３ａに表示させる。なお、本実施形態に係る画像表示制御部７０は、第２歌唱動画を撮影する場合に、合成開始位置画像６４や、合成開始位置から所定期間の第１歌唱動画をカメラで撮影されている画像に重畳して画面３ａに表示させる。 The image display control unit 70 controls the display of images on the screen 3a; for example, when the video posting and viewing application is started, it displays a video distributed from the server 4 on the screen 3a, or displays an image captured by the camera 3d on the screen 3a. When the second singing video is being shot, the image display control unit 70 according to this embodiment superimposes the synthesis start position image 64, or the first singing video for a predetermined period from the synthesis start position, on the image being captured by the camera and displays the result on the screen 3a.

録画制御部７２は、カメラ３ｄで撮影された画像を録画する。なお、本実施形態に係る録画制御部７２は、動画投稿視聴アプリを起動することで携帯端末３Ａから楽曲を再生し、再生した楽曲の音と共に第１歌唱動画又は第２歌唱動画を録画する。さらに、録画制御部７２は、第２歌唱動画を録画する場合、合成開始位置の前から楽曲の再生を開始して第２歌唱動画の録画開始のカウントダウンを行い、再生した楽曲が合成開始位置に対応するタイミングに達すると第２歌唱動画の録画を開始する。 The recording control unit 72 records images taken by the camera 3d. Note that the recording control unit 72 according to the present embodiment plays a song from the mobile terminal 3A by starting the video posting and viewing application, and records the first singing video or the second singing video together with the sound of the played song. Furthermore, when recording the second singing video, the recording control unit 72 starts playing the music from before the synthesis start position, counts down the start of recording of the second singing video, and the played music reaches the synthesis start position. When the corresponding timing is reached, recording of the second singing video is started.

合成開始位置決定部７４は、歌唱ユーザによる携帯端末３Ａへの操作に基づいて、先に録画した第１歌唱動画に対して、後に録画する第２歌唱動画の合成を開始する合成開始位置を決定する。 Based on the singing user's operation of the mobile terminal 3A, the synthesis start position determination unit 74 determines the synthesis start position in the previously recorded first singing video at which synthesis of the second singing video, to be recorded later, begins.

動画合成部７６は、第１歌唱動画の合成開始位置で第１歌唱動画と第２歌唱動画とを合成して合成歌唱動画とする。また、本実施形態に係る動画合成部７６は、第１歌唱動画と第２歌唱動画との合成部分を目立たせないための画像処理を行ったり、合成開始位置から所定期間（重畳合成期間）の第１歌唱動画と第２歌唱動画とを重畳させて、第１歌唱動画と第２歌唱動画とを合成する。 The video synthesis unit 76 combines the first singing video and the second singing video at the synthesis start position of the first singing video to produce a composite singing video. The video synthesis unit 76 according to this embodiment also performs image processing to make the composite portion of the first singing video and the second singing video less noticeable, or combines the two videos by superimposing the first singing video and the second singing video for a predetermined period (the superimposition synthesis period) from the synthesis start position.

［８．動画合成処理のフローチャート］ [8. Flowchart of video synthesis processing]
図１３は、携帯端末３が備える主制御部４０によって実行される動画合成処理の流れを示すフローチャートである。動画合成処理を実行するためのプログラム（動画投稿視聴アプリ）は補助記憶部４４の所定領域に予め記憶されている。 FIG. 13 is a flowchart showing the flow of the video synthesis processing executed by the main control unit 40 of the mobile terminal 3. A program for executing the video synthesis processing (the video posting and viewing application) is stored in advance in a predetermined area of the auxiliary storage unit 44.

まず、ステップＳ１００では、合成開始位置決定部７４が歌唱ユーザによる第１歌唱動画の選択を受け付ける。第１歌唱動画は、携帯端末３Ａの補助記憶部４４に記憶されている。なお、歌唱ユーザは、自身がアップロードした歌唱動画を動画投稿サイトからダウンロードし、それを第１歌唱動画としてもよい。 First, in step S100, the synthesis start position determination unit 74 receives selection of the first singing video by the singing user. The first singing video is stored in the auxiliary storage unit 44 of the mobile terminal 3A. Note that the singing user may download the singing video that he/she has uploaded from the video posting site and may use it as the first singing video.

In the next step S102, the image display control unit 70 displays the first singing video on the screen 3a as shown in FIG. 7, and the synthesis start position determining unit 74 determines the synthesis start position based on the singing user's operation on the screen 3a.

In the next step S104, the recording control unit 72 determines whether an instruction to start shooting the second singing video has been input. If so, the process proceeds to step S106. If not, the process waits until the instruction to start shooting the second singing video is input.

In step S106, the image display control unit 70 displays the synthesis start position image 64 and the image currently being captured by the camera 3d (current captured image) on the screen 3a.

In the next step S108, the recording control unit 72 determines whether an instruction to start recording the second singing video has been input. If so, the process proceeds to step S110. If not, the process waits until the instruction to start recording the second singing video is input.

In step S110, the recording control unit 72 starts playing the song from a point before the synthesis start position, starts recording the second singing video when playback reaches the timing corresponding to the synthesis start position, and stops recording the second singing video when the song ends. The recorded second singing video is stored in the auxiliary storage unit 44.
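
The timing in step S110 can be sketched as a small calculation. The description states only that playback starts "before" the synthesis start position, so the lead-in length `pre_roll_s` and its default of 5 seconds are assumptions, as are the names `RecordingPlan` and `plan_recording`.

```python
from dataclasses import dataclass


@dataclass
class RecordingPlan:
    playback_from_s: float       # song position at which playback begins
    wait_before_record_s: float  # playback time that elapses before recording starts
    record_duration_s: float     # recording runs from the start position to the end of the song


def plan_recording(start_pos_s: float, song_length_s: float,
                   pre_roll_s: float = 5.0) -> RecordingPlan:
    """Derive the S110 timings from the synthesis start position.

    `pre_roll_s` is a hypothetical parameter: how far before the synthesis
    start position playback begins. It is clamped so that playback never
    starts before the beginning of the song.
    """
    if not 0.0 <= start_pos_s <= song_length_s:
        raise ValueError("synthesis start position must lie within the song")
    if pre_roll_s < 0.0:
        raise ValueError("pre-roll must not be negative")
    playback_from = max(0.0, start_pos_s - pre_roll_s)
    return RecordingPlan(
        playback_from_s=playback_from,
        wait_before_record_s=start_pos_s - playback_from,
        record_duration_s=song_length_s - start_pos_s,
    )
```

For a 180-second song with the synthesis start position at 30 seconds, this plan starts playback at 25 seconds, begins recording 5 seconds later, and records for 150 seconds.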

In step S112, the video synthesis unit 76 combines the first singing video and the second singing video at the synthesis start position of the first singing video to generate a composite singing video. Superimposed synthesis may be used to combine the first and second singing videos. The composite singing video is stored in the auxiliary storage unit 44.

In the next step S114, the video synthesis unit 76 applies effect processing to the composite singing video, and the video synthesis process ends. The type of effect processing applied is set in advance by the singing user.
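
The order of steps S100 to S114, including the two waiting states at S104 and S108, can be summarized in a short sketch. The event names and the `run_flow` function are hypothetical; the sketch models only which steps complete for a given sequence of user inputs, not the display, recording, or synthesis work itself.

```python
from typing import Iterable, List, Optional, Tuple

# Each step of FIG. 13 paired with the user input it waits for, if any.
# The event names are assumptions made for illustration.
STEPS: List[Tuple[str, Optional[str]]] = [
    ("S100", "select_first_video"),  # choose the first singing video
    ("S102", "set_start_position"),  # decide the synthesis start position
    ("S104", "start_shooting"),      # wait for the shooting start instruction
    ("S106", None),                  # show start position image and camera image
    ("S108", "start_recording"),     # wait for the recording start instruction
    ("S110", None),                  # play the song and record the second video
    ("S112", None),                  # combine the first and second videos
    ("S114", None),                  # apply effect processing
]


def run_flow(events: Iterable[str]) -> List[str]:
    """Return the steps completed for a given stream of user events."""
    stream = iter(events)
    done: List[str] = []
    for step, needed in STEPS:
        if needed is not None:
            # Waiting state: ignore unrelated events until the needed one arrives.
            for event in stream:
                if event == needed:
                    break
            else:
                return done  # input ended while still waiting
        done.append(step)
    return done
```

A stream that stops after the start position is set leaves the flow waiting at S104, which mirrors the waiting state described for a negative determination.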

The singing user then uploads the composite singing video generated in this way to the video posting site via the video posting and viewing app, making it available for viewing users to watch. If the first singing video has already been uploaded to the video posting site, the upload may replace the first singing video with the composite singing video. If the singing user is not satisfied with the quality of the generated composite singing video, the video synthesis process shown in FIG. 13 is performed again from the beginning to generate the composite singing video again.

As described above, the mobile terminal 3A, which records images captured by the camera 3d as a video, determines the synthesis start position at which synthesis of the second singing video, to be recorded later, begins within the previously recorded first singing video. It then records the second singing video with the camera 3d and combines the first singing video and the second singing video at the synthesis start position. Therefore, when the singing user records a video using the camera 3d, the mobile terminal 3A can easily create a video that the singing user finds desirable.

[9. Other embodiments]
Although the present invention has been described above using the above embodiment, the technical scope of the present invention is not limited to the scope described in that embodiment. Various changes or improvements can be made to the above embodiment without departing from the gist of the invention, and forms with such changes or improvements are also included in the technical scope of the present invention. The above embodiments may also be combined as appropriate.

For example, the above embodiment describes a form in which the performance is singing and the video uploaded to the video posting site is a singing video, but the present invention is not limited to this. The performance may be something other than singing; for example, the performance may be a dance, and the video uploaded to the video posting site may be a dance video.

The above embodiment also describes a form in which the video synthesis function is executed by the mobile terminal 3A, but the present invention is not limited to this. For example, part or all of the video synthesis function may be executed by the server 4. Even when part or all of the video synthesis function is executed by the server 4, the video is still shot and recorded by the camera 3d of the mobile terminal 3A.

The video synthesis function may also treat a composite singing video, generated by combining the first singing video and the second singing video, as a new first singing video, and combine a new second singing video with that new first singing video to generate a new composite singing video.
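
This repeated synthesis amounts to folding a series of retakes into the current result. The sketch below assumes a plain cut at each synthesis start position (no superimposition) and represents frames as labels; `splice` and `composite_takes` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def splice(first: Sequence[str], second: Sequence[str], start: int) -> List[str]:
    """Keep `first` up to frame index `start`, then continue with `second`."""
    if not 0 <= start <= len(first):
        raise ValueError("start must lie within the first video")
    return list(first[:start]) + list(second)


def composite_takes(base: Sequence[str],
                    retakes: Sequence[Tuple[int, Sequence[str]]]) -> List[str]:
    """Apply each (start, take) pair in order.

    After every step the composite becomes the new "first video" for the
    next retake, as in the variation described above.
    """
    current = list(base)
    for start, take in retakes:
        current = splice(current, take, start)
    return current
```

For example, starting from frames `a0` to `a3`, a retake from index 2 followed by a retake from index 3 keeps `a0` and `a1`, the first retake's frame at index 2, and the second retake from index 3 onward.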

The flow of the video synthesis process described in the above embodiment is also only an example; unnecessary steps may be deleted, new steps may be added, and the processing order may be changed without departing from the gist of the present invention.

3 Mobile terminal (information processing device)
3a Screen
3d Camera
64 Synthesis start position image
70 Image display control unit (image display control means)
72 Recording control unit (recording control means)
74 Synthesis start position determining unit (determining means)
76 Video synthesis unit (video synthesis means)

Claims

1. An information processing device that records images captured by a camera as a video, the device comprising:
determining means for determining a synthesis start position at which to start synthesizing a second video, to be recorded later, with a first video recorded earlier;
recording control means for recording the second video with the camera; and
video synthesis means for synthesizing the first video and the second video at the synthesis start position of the first video.

2. The information processing device according to claim 1, further comprising image display control means for displaying, on a screen, a synthesis start position image, which is an image of the first video at the synthesis start position, when recording of the second video is started.

3. The information processing device according to claim 2, wherein the image display control means superimposes the synthesis start position image on the image captured by the camera for recording the second video, and displays the superimposed image on the screen.

4. The information processing device according to claim 1, wherein the video synthesis means performs image processing to make the synthesized portion of the first video and the second video less noticeable.

5. The information processing device according to any one of the items from claim 1, wherein the video synthesis means synthesizes the first video and the second video by superimposing the first video and the second video for a predetermined period from the synthesis start position.

6. The information processing device according to any one of claims 1 to 5, wherein the first video and the second video are videos recorded while a song is being played.

7. The information processing device, wherein the recording control means starts playing the song from before the synthesis start position, and starts recording the second video when the played song reaches a timing corresponding to the synthesis start position.

8. The information processing device according to claim 6, wherein the recording control means starts playing the song before starting to record the second video, and performs a countdown to the start of recording of the second video.

9. The information processing device according to any one of claims 6 to 8, wherein the positions that can be set as the synthesis start position are predetermined positions in the song.

10. A video synthesis method comprising:
a first step of recording a first video with a camera;
a second step of determining a synthesis start position at which to start synthesizing a second video, to be recorded later, with the first video;
a third step of recording the second video with the camera; and
a fourth step of synthesizing the first video and the second video at the synthesis start position of the first video.

11. A video synthesis program for causing a computer included in an information processing device that records images captured by a camera as a video to function as:
determining means for determining a synthesis start position at which to start synthesizing a second video, to be recorded later, with a first video recorded earlier;
recording control means for recording the second video with the camera; and
video synthesis means for synthesizing the first video and the second video at the synthesis start position of the first video.