JP5550593B2

JP5550593B2 - Karaoke equipment

Info

Publication number: JP5550593B2
Application number: JP2011073274A
Authority: JP
Inventors: 麻美川▲崎▼
Original assignee: Xing Inc
Current assignee: Xing Inc
Priority date: 2011-03-29
Filing date: 2011-03-29
Publication date: 2014-07-16
Anticipated expiration: 2031-03-29
Also published as: JP2012208281A

Description

The present invention relates to a karaoke apparatus that outputs a performance song selected from among a large number of performance songs, and in particular to an improvement in the editing of video shot during a karaoke performance.

Music playback apparatuses that output a performance song selected from among a large number of performance songs are known. One example is the karaoke apparatus used in karaoke boxes and similar venues. Such a karaoke apparatus outputs the music information of a karaoke song selected from a large number of karaoke songs stored in advance in a storage device and, in synchronization with that output, displays video containing the lyric information of the song on a screen, so that users can enjoy a karaoke performance of the desired song.

The present applicant has devised and put into practical use a technology in which the karaoke apparatus is incorporated as a communication terminal into a video information distribution system on the Internet, and in which video information captured by an imaging device and audio information input through an audio input device during output of a performance song by the karaoke apparatus are accepted as posts and made available for viewing. Techniques have also been proposed for extracting a predetermined region from video captured by an imaging device. One example is the contour extraction method described in Patent Document 1. According to this technique, hierarchical edge data are created for a selected region and used to identify a tracking start point and perform contour tracking, thereby extracting contour data; the contour data are then smoothed by correction processing based on an active contour model. This is said to make it possible to extract a smooth contour line that matches the original image pattern even in the presence of noise or uneven shading.

Patent Document 1: JP 2003-216959 A

The video captured while the karaoke apparatus outputs a performance song (the performance video) shows not only the singer but also the room in which the performance takes place and other users who are listening to the performance in the same room (or waiting for their turn). As video to be posted to the video information distribution system, however, it is sometimes desirable to capture the singer alone. The conventional technique cannot identify the singer in the video, so it has been difficult to extract and cut out the singer's image from video shot during a karaoke performance. In other words, a karaoke apparatus that can easily cut out and edit the singer's image from video shot during a karaoke performance has not yet been developed.

The present invention has been made against this background, and its object is to provide a karaoke apparatus that can easily cut out and edit the singer's image from video shot during a karaoke performance.

To achieve this object, the gist of the first invention is a karaoke apparatus that outputs a performance song selected from among a large number of performance songs and amplifies and outputs audio input through a microphone, the karaoke apparatus comprising: an imaging device that captures video of a predetermined range referenced to the karaoke apparatus while the karaoke apparatus outputs a performance song; microphone position detection means for detecting the position corresponding to the microphone in the video captured by the imaging device; singer region determination means for determining, based on the position corresponding to the microphone detected by the microphone position detection means, the region corresponding to the singer in the video captured by the imaging device; and video composition control means for cutting out, from the video captured by the imaging device, the video of the region corresponding to the singer determined by the singer region determination means and compositing it onto another video, wherein the singer region determination means determines the region corresponding to the singer in the video captured by the imaging device based on the position corresponding to the microphone detected by the microphone position detection means and on information on the singer's height.

Thus, according to the first invention, the karaoke apparatus comprises: an imaging device that captures video of a predetermined range referenced to the karaoke apparatus while the karaoke apparatus outputs a performance song; microphone position detection means for detecting the position corresponding to the microphone in the video captured by the imaging device; singer region determination means for determining, based on the position corresponding to the microphone detected by the microphone position detection means, the region corresponding to the singer in the video captured by the imaging device; and video composition control means for cutting out, from the video captured by the imaging device, the video of the region corresponding to the singer determined by the singer region determination means and compositing it onto another video, and the singer region determination means determines the region corresponding to the singer based on the position corresponding to the microphone detected by the microphone position detection means and on information on the singer's height. Consequently, the region corresponding to the singer can be suitably identified from the position of the microphone the singer is holding, and the singer's image can be cut out by extracting that region. In other words, a karaoke apparatus can be provided that can easily cut out and edit the singer's image from video shot during a karaoke performance.

To achieve the above object, the gist of the second invention is a karaoke apparatus that outputs a performance song selected from among a large number of performance songs and amplifies and outputs audio input through a microphone, the karaoke apparatus comprising: an imaging device that captures video of a predetermined range referenced to the karaoke apparatus while the karaoke apparatus outputs a performance song; microphone position detection means for detecting the position corresponding to the microphone in the video captured by the imaging device; singer region determination means for determining, based on the position corresponding to the microphone detected by the microphone position detection means, the region corresponding to the singer in the video captured by the imaging device; video composition control means for cutting out, from the video captured by the imaging device, the video of the region corresponding to the singer determined by the singer region determination means and compositing it onto another video; and singer height calculation means for calculating the singer's height from information on the singer's gender and information on the singer's year of birth, wherein the singer region determination means determines the region corresponding to the singer in the video captured by the imaging device based on the position corresponding to the microphone detected by the microphone position detection means and on the singer's height calculated by the singer height calculation means. In this way, the region corresponding to a user in video shot during a karaoke performance can be determined in a simple and practical manner.
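As a non-limiting illustration, the singer height calculation means could be sketched in Python as a table lookup keyed by gender and age. The table values, the age brackets, and the names `AVERAGE_HEIGHT_CM` and `estimate_height_cm` are assumptions introduced here for illustration; the specification states only that the height is calculated from gender and birth-year information.

```python
# Hypothetical average heights in cm per gender, as (maximum age, height) brackets.
# The numbers are placeholders for illustration, not values from the specification.
AVERAGE_HEIGHT_CM = {
    "female": [(6, 110.0), (12, 140.0), (17, 155.0), (200, 158.0)],
    "male": [(6, 111.0), (12, 141.0), (17, 168.0), (200, 171.0)],
}


def estimate_height_cm(gender: str, birth_year: int, current_year: int) -> float:
    """Estimate a singer's height from gender and birth year via table lookup."""
    if gender not in AVERAGE_HEIGHT_CM:
        raise ValueError(f"unknown gender key: {gender!r}")
    age = current_year - birth_year
    if age < 0:
        raise ValueError("birth_year is later than current_year")
    # Return the height of the first bracket whose maximum age covers the singer.
    for max_age, height in AVERAGE_HEIGHT_CM[gender]:
        if age <= max_age:
            return height
    # Ages beyond the last bracket fall back to the adult value.
    return AVERAGE_HEIGHT_CM[gender][-1][1]
```

In such a sketch the gender and birth year would come from whatever user profile the apparatus holds; how that profile is obtained is outside the scope of this illustration.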

In the first or second invention, the singer region determination means preferably determines an elliptical region corresponding to the singer in the video captured by the imaging device, and determines the major-axis dimension of the ellipse based on information on the singer's height. In this way, the region corresponding to a user in video shot during a karaoke performance can be determined in a simple and practical manner.
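As a non-limiting illustration of this determination, the following Python sketch derives an elliptical region from the microphone position and the singer's height. The pixel scale `px_per_cm`, the ratio `mic_height_ratio` (how far up the body the microphone is assumed to be held), and the width-to-height ratio `aspect` are assumptions introduced here for illustration; only the use of the height to set the major axis is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    cx: float          # centre x in pixels
    cy: float          # centre y in pixels (image y grows downward)
    semi_minor: float  # horizontal semi-axis in pixels
    semi_major: float  # vertical semi-axis in pixels

    def contains(self, x: float, y: float) -> bool:
        """Return True if pixel (x, y) lies inside or on the ellipse."""
        dx = (x - self.cx) / self.semi_minor
        dy = (y - self.cy) / self.semi_major
        return dx * dx + dy * dy <= 1.0


def singer_region(mic_x: float, mic_y: float, height_cm: float,
                  px_per_cm: float, mic_height_ratio: float = 0.85,
                  aspect: float = 0.4) -> Ellipse:
    """Place a vertical ellipse around the singer, given the microphone position."""
    major = height_cm * px_per_cm  # full major-axis length in pixels
    # Assumption: the microphone is held at mic_height_ratio of the body
    # height above the feet, so the top of the head is slightly above it.
    top_y = mic_y - (1.0 - mic_height_ratio) * major
    return Ellipse(cx=mic_x, cy=top_y + major / 2.0,
                   semi_minor=aspect * major / 2.0, semi_major=major / 2.0)
```

For example, `singer_region(320, 100, 160.0, 2.0)` yields an ellipse whose major axis is 320 pixels long and which contains the microphone position itself.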

Preferably, the microphone position detection means detects the position corresponding to the microphone by performing luminance-based image analysis on the video captured by the imaging device. In this way, the region corresponding to the singer can be identified in a practical manner from the position of the microphone the singer is holding.
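As a non-limiting illustration, luminance-based detection could be sketched in Python as the centroid of all pixels brighter than a threshold. This assumes that the microphone shows up in the frame as a bright spot and that a frame is a list of rows of `(r, g, b)` tuples; the threshold value and the function names are hypothetical. The luma weights are the standard ITU-R BT.601 coefficients.

```python
def luminance(rgb):
    """Approximate luma of an (r, g, b) pixel using BT.601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def find_bright_spot(frame, threshold=240.0):
    """Return the (x, y) centroid of pixels at or above the luminance
    threshold, or None if no pixel is bright enough."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if luminance(pixel) >= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

A practical implementation would also have to reject other bright objects in the room, such as the display or lighting; that filtering is omitted here.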

Preferably, the microphone inputs audio information to the karaoke apparatus via a wireless signal, a position detection device is provided that detects the position of the microphone based on the wireless signal from the microphone, and the microphone position detection means detects the position corresponding to the microphone in the video captured by the imaging device based on the detection result of the position detection device. In this way, the region corresponding to the singer can be identified in a practical manner from the position of the microphone the singer is holding.

Preferably, the video composition control means displays the composite video, obtained by compositing the video of the region corresponding to the singer onto the other video, as the karaoke background video while the karaoke apparatus outputs a performance song. In this way, the composite video containing the singer's image can be used as the background video of the karaoke performance.
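As a non-limiting illustration, the cut-out and composition step could be sketched in Python as a per-pixel selection between the camera frame and the background frame. Frames are assumed to be equally sized lists of rows of pixel values, and the `inside` predicate (for example, a test against the determined singer region) is a hypothetical parameter introduced for illustration.

```python
def composite(foreground, background, inside):
    """Return a new frame that takes the foreground pixel wherever
    inside(x, y) is true and the background pixel everywhere else."""
    if len(foreground) != len(background) or any(
        len(f_row) != len(b_row) for f_row, b_row in zip(foreground, background)
    ):
        raise ValueError("foreground and background must have the same size")
    return [
        [fg if inside(x, y) else bg
         for x, (fg, bg) in enumerate(zip(f_row, b_row))]
        for y, (f_row, b_row) in enumerate(zip(foreground, background))
    ]
```

Applying this frame by frame, with the region recomputed as the microphone moves, would give a composite video of the kind described above.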

FIG. 1 is a schematic diagram illustrating a karaoke system to which the karaoke apparatus of the present invention is suitably applied.
FIG. 2 is a block diagram illustrating the configuration of a karaoke apparatus that is one embodiment of the present invention.
FIG. 3 is a block diagram illustrating the configuration of the server device provided in the karaoke system of FIG. 1.
FIG. 4 is a diagram illustrating how video of a predetermined range is captured by a digital camera during a karaoke performance by the karaoke apparatus of FIG. 2.
FIG. 5 is a functional block diagram illustrating the main control functions provided in the CPU of the karaoke apparatus of FIG. 2.
FIG. 6 is a diagram showing a performance video, which is an example of video captured by the digital camera while the karaoke apparatus of FIG. 2 outputs a performance song.
FIG. 7 is a diagram illustrating an example of the control that determines the region corresponding to the singer based on the position corresponding to the microphone in the performance video shown in FIG. 6.
FIG. 8 is a diagram showing the video of the region corresponding to the singer, determined as shown in FIG. 7, cut out of the performance video shown in FIG. 6.
FIG. 9 is a diagram showing, as an example of a background onto which the video shown in FIG. 8 is composited, a background video in which the avatars of users taking part in the karaoke session together perform as a band.
FIG. 10 is a diagram illustrating a composite video in which the video of the region corresponding to the singer shown in FIG. 8 is composited onto the front layer of the background video shown in FIG. 9.
FIG. 11 is a diagram illustrating a karaoke performance screen in which the lyric character video of the performance song is composited onto the composite video shown in FIG. 10.
FIG. 12 is a diagram showing an example of the screen displayed for selecting the background video onto which the video of the region corresponding to the singer shown in FIG. 8 is composited.
FIG. 13 is a flowchart illustrating the main part of the singer video cut-out/composition control performed by the CPU of the karaoke apparatus of FIG. 2.

Preferred embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a schematic diagram illustrating a karaoke system 10 to which the karaoke apparatus of the present invention is suitably applied. As shown in FIG. 1, in the karaoke system 10, one or more karaoke apparatuses 16a, 16b, 16c, ... (hereinafter simply referred to as karaoke apparatus 16 unless they need to be distinguished), each being one embodiment of the present invention, are installed in each of a plurality of private rooms 14a, 14b, 14c, ... (hereinafter simply referred to as private room 14 unless they need to be distinguished) in a store 12 such as a karaoke box, snack bar, or inn; FIG. 1 shows one apparatus per room. These karaoke apparatuses 16 are connected via a router 28 to a communication line 18 such as a public telephone line, and can exchange information over the communication line 18 with a server device (center device) 20 of a karaoke service provider that is likewise connected to the communication line 18. The server device 20 is a server that performs basic control of the storage and input/output management of digital content such as karaoke information (music data), background video information, and between-song information. It periodically distributes content to the karaoke apparatuses 16 over the communication line 18 and transmits predetermined function control programs in response to requests from the karaoke apparatuses 16. The karaoke system 10 also includes a plurality of electronic songbook devices 22a, 22b, 22c, ... (hereinafter simply referred to as electronic songbook device 22 unless they need to be distinguished). When the karaoke apparatus 16 is used, one to several electronic songbook devices 22 are lent to each user (group) and are used in each private room 14 as remote control devices for the karaoke apparatus 16, as described later. A LAN 24 interconnecting the karaoke apparatuses 16 is laid in the store 12, and input from the electronic songbook device 22 to the karaoke apparatus 16 is performed by LAN communication or the like via a predetermined access point 26 and the LAN 24.

FIG. 2 is a block diagram illustrating the configuration of the karaoke apparatus 16. As shown in FIG. 2, the karaoke apparatus 16 includes: a video display device 30 such as a CRT (cathode-ray tube) or TFT (thin-film-transistor liquid crystal) display; a video output control unit 32 such as a video board (graphics board); a video information decoder 34; a video mixer 36; a synthesizer 38 serving as a sound source; a microphone 40 serving as an audio input device; an amplifier mixer 42; a speaker 44; an operation panel 46; an input/output interface 48 that processes input signals from the operation panel 46 and the like; a CPU 50 serving as a central processing unit; a ROM 52 serving as read-only memory; a RAM 54 serving as random-access read/write memory; a hard disk 56 serving as a storage device; a modem 58; a LAN port 60; a remote control receiving unit 62 for receiving remote control signals from input devices such as the electronic songbook device 22 and a remote control device 64; a digital camera 66 serving as an imaging device; and an infrared light source position detection device 68.

The video output control unit 32 functions as a character video output device that outputs character video (telops) such as the lyric character video generated by the CPU 50, and is also a display control device that controls the various video displays of the video display device 30. The video information decoder 34 is a background video playback device that plays back (decodes) a predetermined background video, based on background video information stored in the background database 88 (see FIG. 5) of the hard disk 56 or the like, while the user sings with reference to the lyrics. The background video information is, for example, data in MPEG (Moving Picture Experts Group) format, and the background video played back by the video information decoder 34 from the MPEG data is sent to the video mixer 36. The video mixer 36 is a video composition device that composites the character video generated by the CPU 50 and output from the video output control unit 32 with the background video played back by the video information decoder 34, and displays the result on the video display device 30.

The synthesizer 38 is a sound source that generates music signals, such as instrument performance signals, based on the performance information of a karaoke song read from the hard disk 56 and sent to it. The performance information is, for example, data in MIDI (Musical Instrument Digital Interface) format, and the music signal generated by the synthesizer 38 from the MIDI data is converted into an analog signal and sent to the amplifier mixer 42. The microphone 40 is an electroacoustic transducer, that is, an audio input device, that converts the mechanical vibration produced by sound waves into audio information in the form of an electric signal, and transmits the input audio information to the karaoke apparatus 16 via a wireless signal such as an infrared signal. In the amplifier mixer 42, the music signal sent from the synthesizer 38 is mixed with the singing voice of the user (performer) input from the microphone 40 via the remote control receiving unit 62, and the mixed signal is electrically amplified and output from the speaker 44.
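As a non-limiting illustration, the mixing and amplification performed by the amplifier mixer 42 could be modeled digitally in Python as follows. The embodiment describes an analog signal path; the representation of the two signals as equal-length lists of samples normalized to the range -1.0 to 1.0, the `gain` parameter, and the clipping step are assumptions introduced here for illustration.

```python
def mix_and_amplify(music, voice, gain=1.0):
    """Sum two equal-length sample sequences, apply a gain, and clip
    each result to the range [-1.0, 1.0]."""
    if len(music) != len(voice):
        raise ValueError("music and voice must have the same number of samples")
    mixed = []
    for m, v in zip(music, voice):
        sample = (m + v) * gain
        # Clip to keep the output within the normalized sample range.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed
```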

The operation panel 46 is an input device provided with operation buttons (switches) or knobs with which the user of the karaoke apparatus 16 selects the karaoke song to be sung, adjusts the key of the song, adjusts the volume balance between the accompaniment and the singing, and makes other adjustments such as echo, volume, and tone. The karaoke apparatus 16 is also provided with a remote control device 64 that functions as an input device for remotely executing some of the functions of the operation panel 46, and the remote control receiving unit 62 receives the remote control signal (infrared signal) transmitted from the remote control device 64 and supplies it to the CPU 50. The association (pairing) process between the karaoke apparatus 16 and the electronic songbook device 22 is also performed via the remote control receiving unit 62, and the electronic songbook device 22 associated with the karaoke apparatus 16 in this way likewise functions as an input device. In the following description of this embodiment, input devices such as the remote control device 64 provided with the karaoke apparatus 16 and the electronic songbook device 22 that has undergone the association process are treated as constituting part of the karaoke apparatus 16.

The CPU 50 is a so-called microcomputer that processes and controls electronic information according to predetermined programs stored in advance in the ROM 52 while using the temporary storage function of the RAM 54. When a karaoke song is selected with the electronic songbook device 22, the remote control device 64, or the like, the CPU 50 performs basic control such as: registering the selected karaoke song in a reserved-song table provided in the RAM 54; reading the performance information, lyric information, and the like of the selected karaoke song from the hard disk 56 into the RAM 54 in the performance order of the reserved-song table; sending the performance information from the RAM 54 to the synthesizer 38 as the performance of the karaoke song progresses; generating lyric character video from the lyric information and sending it to the video output control unit 32; generating song-title character video at the time of song selection and sending it to the video output control unit 32; controlling the video information decoder 34 to play back a predetermined background video; outputting between-song information such as new-release information, song selection rankings, and store advertisements while no karaoke performance is in progress, that is, between songs; and controlling information communication with the server device 20 over the communication line 18. In addition to this basic control, the CPU 50 executes various kinds of control such as the singer video cut-out/composition control of this embodiment, described later.
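As a non-limiting illustration, the reserved-song table handled by the CPU 50 could be modeled in Python as a first-in, first-out queue. The class name `ReservedSongTable`, its methods, and the use of string song identifiers are assumptions introduced here for illustration; the specification states only that selected songs are registered in the table and played in its performance order.

```python
from collections import deque


class ReservedSongTable:
    """First-in, first-out table of reserved karaoke songs."""

    def __init__(self):
        self._queue = deque()

    def reserve(self, song_id: str) -> int:
        """Register a selected song and return its position in the queue."""
        self._queue.append(song_id)
        return len(self._queue)

    def next_song(self):
        """Remove and return the next song to perform, or None if empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self) -> int:
        return len(self._queue)
```

When `next_song()` returns `None`, the apparatus would be between songs, which is the state in which the between-song information is output.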

The modem 58 is a device for connecting the karaoke apparatus 16 to the communication line 18, such as a public telephone line. It converts the digital signal output from the CPU 50 into an analog signal and sends it out on the communication line 18, and converts the analog signal transmitted over the communication line 18 into a digital signal and supplies it to the CPU 50. A configuration is also conceivable in which one of the karaoke apparatuses 16 provided in the store 12 has the function of the router 28 and is connected to the communication line 18 as a master commander. In that case, the modem 58 is required in the karaoke apparatus 16 that functions as the master commander, but need not be provided in the other karaoke apparatuses 16, which communicate with the server device 20 through the master commander.

The LAN port 60 is a connector for connecting the karaoke apparatus 16 via the LAN 24 to other equipment such as other karaoke apparatuses 16 and the electronic songbook devices 22. Connected through the LAN 24 in this way, the karaoke apparatus 16 can send information to and receive information from such equipment. For example, the karaoke apparatus 16 and the electronic songbook device 22 exchange information with each other by radio: a song selection input from the electronic songbook device 22 received via the access point 26 is accepted and stored in the reserved-song table provided in the RAM 54, and predetermined information is transmitted from the karaoke apparatus 16 to the electronic songbook device 22 via the access point 26.

The digital camera 66 is a so-called digital video camera that includes a lens and an image sensor such as a CCD (charge-coupled device), detects with the CCD the image entering through the lens, and acquires that image as electronic information (video data). It can capture at least moving images (video with motion that changes over time), and may also be configured to capture still images (still photographs) as needed. The video information captured by the digital camera 66 is supplied to the CPU 50 and the like via an interface such as a video terminal (not shown), and is stored in the RAM 54 or the like as a video file in, for example, AVI (Audio Video Interleave), MPEG (Moving Picture Experts Group), or FLV (Flash Video) format. The digital camera 66 need not be provided as part of the karaoke apparatus 16. For example, a configuration is also conceivable in which video captured by a separate video camera fixed at a predetermined position in the private room 14, or by an imaging device built into a mobile phone owned by a user, is input to the karaoke apparatus 16 via a predetermined interface.

The infrared light source position detection device 68 detects the position of the infrared light source in a device that emits an infrared signal, that is, the position of the output portion of that infrared signal. For example, as shown in FIG. 2 and elsewhere, the microphone 40 has an infrared light source 40s at the end of its rod-shaped (elongated) handle opposite to the end on which the audio input unit (microphone main body) is provided, and the sound input through the microphone 40 is transmitted wirelessly from the infrared light source 40s to the karaoke device 16 as an infrared signal. The infrared light source position detection device 68 detects, for example, the position of the infrared light source 40s in a coordinate system predetermined for each private room 14 in which a karaoke device 16 is installed, and thereby detects the relative position of the microphone 40 within the private room 14. That is, in this embodiment, the infrared light source position detection device 68 functions as a position detection device that detects the position of the microphone 40 based on the wireless signal from the microphone 40.

The hard disk 56 holds various databases, including a music database that stores a large number of music data items (karaoke data) for outputting karaoke performance songs, and a background database 88 (see FIG. 5) that stores a plurality of background data items used for the video composition control of this embodiment described later. Among the plurality of karaoke devices 16 installed in a store 12 such as a karaoke box, a predetermined karaoke device 16, for example the karaoke device 16a, is connected to the communication line 18 via the modem 58. New music data, background data, and the like are distributed as needed from the server device 20 over the communication line 18 and stored in the background database 88 and other databases on the hard disk 56, so that the karaoke devices 16 can always play new songs and so that the video composition control described later is always performed with new background data. In addition, the karaoke device 16a that has acquired information from the server device 20 communicates with the other karaoke devices 16 via the LAN 24, so that the information stored on the hard disk 56 of each karaoke device 16 is shared and the contents of the background database 88 and the like are kept equivalent.

The music database stores a large number (for example, tens of thousands) of music data items (karaoke data), each corresponding to a performance song that the karaoke device 16 can output. Each music data item includes performance information with which the synthesizer 38 generates the performance sound of predetermined instruments, lyrics information for generating a lyrics text video (lyrics telop), and lyrics color-change information for changing the color of that lyrics text video progressively as the performance advances. Each item is identified by a song selection number unique to the performance song, which serves as a content ID.

The music data stored in the music database preferably has a plurality of sections defined in advance in the MIDI data or the like that serves as the performance information. These sections are, for example, performance sections defined in the meta information of the MIDI data, and are defined for each predetermined span of performance time, for example as intro (Intro), A melody (Amelo), B melody (Bmelo), C melody (Cmelo), fill (Fill), chorus (sabi), interlude, and meter change. Here, the chorus (sabi) corresponds to the part of the performance song in which the most memorable and climactic phrase is placed, and is also referred to as a bridge or the like.

The background database 88 stores a plurality of background data items from which the video information decoder 34 reproduces a predetermined background image (still image) or background video (moving picture). Each background data item is either a video file (moving picture data) in, for example, AVI (Audio Video Interleave), MPEG (Moving Picture Experts Group), or FLV (Flash Video) format, or an image file (still image data) in, for example, JPEG (Joint Photographic Experts Group), GIF (Graphics Interchange Format), or PNG (Portable Network Graphics) format, and is identified by identification information unique to that item.

FIG. 3 is a block diagram illustrating the configuration of the server device 20. As shown in FIG. 3, the server device 20 is a so-called von Neumann computer in which a CPU 70, a central processing unit, performs signal processing according to a program stored in advance in a ROM 72, a read-only memory, while using the temporary storage function of a RAM 74, a random-access read/write memory. In addition to basic control such as content distribution control for music data and the like in response to distribution requests from the karaoke devices 16, the server device 20 executes various controls relating to the karaoke system 10 of this embodiment, such as control for managing and operating a social network service for users of the karaoke system 10. This social network service is, for example, a membership-based community website that provides services such as information browsing only among members who have registered in advance. In the following description, the social network service is abbreviated as SNS.

The server device 20 includes a video display device 76 such as a CRT or TFT display controlled by a video board 78, an input device 80 such as a keyboard connected via an interface 82, and a modem 84 for connecting the CPU 70 to the communication line 18. Connected to the communication line 18 through the modem 84, the server device 20 can exchange information with the plurality of karaoke devices 16 connected to that line. The server device 20 is also provided with various databases, including a music database (not shown) that stores a large number of the karaoke data items to be distributed to the karaoke devices 16, and an SNS database 86 that stores information relating to the SNS.

The SNS database 86 is a storage device that stores, for each user of the karaoke system 10, information on that user's karaoke performances with the karaoke devices 16 in association with the user's identification information (user ID). The information stored for each user, in association with that user's user ID, includes for example:
a visit history, that is, information on the stores 12 in which the karaoke devices 16 the user has used in the past are installed;
information on the performance songs the user has registered as signature songs for karaoke performance on the karaoke device 16 (songs set up so that they can be selected with a simple operation);
information on the performance songs in the user's selection history (the history of songs the user has selected in the past on the karaoke devices 16);
information on the results of performance evaluations the user has received in past karaoke performances on the karaoke devices 16;
information such as the video data the user has captured with the digital camera 66 during karaoke performances on the karaoke devices 16; and
information on other users whom the user has registered as friends.

Preferably, the SNS database 86 also stores, for each user and in association with the user's user ID, attribute information such as the user's name (nickname), date of birth, age (actual age), gender, e-mail address, region, blood type, zodiac sign, a question and answer for use when the password is forgotten, the password used for login authentication to the SNS, information on the user's avatar (a humanoid image that represents the user online), and the user's song age. The song age is virtual age information indicating the age group (how many years old) to which the user's taste in performance songs corresponds. It is a value determined from the performance songs that the user has selected (performed) in the past on the karaoke devices 16; preferably, it is derived, using those songs as the basis of calculation, from the viewpoint of what age the user's karaoke performance tendency corresponds to.

Preferably, a song age (a virtual song age of the performance song) is set as attribute information in the music data (the data stored in the music database) of each performance song that can be played on the karaoke devices 16. The song age of a user is, for example, the average of the song ages stored for all the performance songs that the user has selected in the past on the karaoke devices 16 (not limited to the karaoke devices 16 of a particular store, but any of the plurality of karaoke devices 16 usable in the karaoke system 10). The song age stored for each music data item is in turn calculated from the song age or actual age of the users who have selected that performance song in the past on those karaoke devices 16; for example, it is the average of the song ages or actual ages stored for all users who have selected that song. The song ages of users and performance songs are managed centrally by the server device 20 and are updated every time a karaoke performance takes place on a karaoke device 16. Accordingly, when a user selects a song that is often sung by a younger generation, that user's song age decreases, whereas when the user selects a song that is often sung by an older generation, that user's song age increases.
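By way of non-limiting illustration, the averaging described above can be sketched as follows. The function names and the sample ages are hypothetical and are not taken from the embodiment; the sketch assumes the simple arithmetic mean that the text gives as an example.

```python
from statistics import mean


def user_song_age(selected_song_ages):
    """Average the song ages of every song the user has selected so far."""
    if not selected_song_ages:
        return None  # no selection history yet (assumed behaviour)
    return mean(selected_song_ages)


def song_song_age(selector_ages):
    """Average the song ages (or actual ages) of every user who selected the song."""
    if not selector_ages:
        return None
    return mean(selector_ages)


# Hypothetical history: song ages of the songs one user has selected.
history = [22, 25, 31]
age_before = user_song_age(history)

# Selecting a song popular with an older generation raises the user's song age.
history.append(58)
age_after = user_song_age(history)
```

In this sketch the value is recomputed from the full history on every update; a running sum and count would give the same result without storing every selection.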

The CPU 70 of the server device 20 performs information registration control for the SNS in the karaoke system 10. Specifically, in response to input operations on the electronic quick-reference book device 22 or other input devices of the karaoke device 16, it performs overall management control of the SNS in the karaoke system 10: registering a new user in the SNS database 86, changing (updating) the registered contents stored in the SNS database 86, registering a plurality of users stored in the SNS database 86 as friends in mutual association, storing the evaluation results produced by the karaoke device 16 for each user, storing for each user the video data captured by the digital camera 66 during karaoke performances on the karaoke device 16, and updating the song ages of users and performance songs described above.

When the CPU 70 of the server device 20 receives a distribution request for predetermined video data from a karaoke device 16, an electronic quick-reference book device 22, or a communication terminal device such as a home personal computer or mobile phone (not shown), it distributes the requested video data stored in the SNS database 86 to the requesting communication terminal device over the communication line 18. The video data distributed in this way is displayed on the display unit of the communication terminal device (for example, the video display device 30 of the karaoke device 16) by application software installed on that device.

FIG. 4 illustrates how, during a karaoke performance on the karaoke device 16, the digital camera 66 captures video of a predetermined range defined relative to the karaoke device 16; the private room 14 in which the karaoke device 16 is installed is indicated by a broken line. As shown in FIG. 4, the digital camera 66 is preferably fixed in position relative to the karaoke device 16 or the private room 14 in which it is installed, and is configured to capture video of a predetermined range in the private room 14 relative to the karaoke device 16. That is, the camera angle of the digital camera 66 is fixed, so that the same range within the private room 14 is always captured. The camera angle is preferably set in advance so that the image of a singer 90 who performs while watching the screen of the video display device 30 of the karaoke device 16 falls within the frame. When the digital camera 66 captures video while the karaoke device 16 outputs a performance song, the captured video (performance video) therefore includes an image corresponding to the singer 90 and an image corresponding to the microphone 40 held by the singer 90. The camera angle may be changeable by a predetermined setting operation performed by, for example, a clerk of the store 12.

FIG. 5 is a functional block diagram illustrating the main part of the control functions provided in the CPU 50 of the karaoke device 16. Some or all of the control means shown in FIG. 5 may instead be provided in the CPU or the like of the electronic quick-reference book device 22. The microphone position detection means 92 shown in FIG. 5 detects the position corresponding to the microphone 40 in the video captured by the digital camera 66, which serves as the imaging device. That is, it detects the relative position, within the whole of the video captured by the digital camera 66, of the image (partial image) corresponding to the microphone 40.

FIG. 6 shows a performance video 100, an example of the video captured by the digital camera 66 while the karaoke device 16 outputs a performance song. As shown in FIG. 6, when the performance video 100 captured during a karaoke performance includes an image 102 corresponding to the singer, it will in most cases also include an image 104 corresponding to the microphone 40 held by that singer. The microphone position detection means 92 detects the position of the image 104 corresponding to the microphone 40 as a partial image within the performance video 100 captured by the digital camera 66; for example, it detects the relative position of the image 104 within the whole of the performance video 100 shown in FIG. 6. The detection may target the position corresponding to the overall shape of the elongated microphone 40, or the position of a part of the microphone 40 such as the infrared light source 40s or the audio input unit (microphone main body). It may also consist of control such as detecting the coordinates of the centroid of the image 104 relative to the whole of the performance video 100. The performance video 100 captured by the digital camera 66 is preferably a moving picture that changes over time; for convenience, FIG. 6 shows a single frame (the image at a given moment) of that video. When the performance video 100 is a moving picture, the position of the image 104 corresponding to the microphone 40 can move as the video changes. It is therefore preferable to perform the position detection at short predetermined intervals, for example every 0.1 seconds, so that the position of the image 104 is detected continuously.

The microphone position detection means 92 preferably detects the position corresponding to the microphone 40 by performing luminance-based image analysis on the video captured by the digital camera 66. The microphone 40 provided with the karaoke device 16 generally has a coloring based on a single color such as black or silver (metallic), and has a characteristic elongated (handle-like) shape with a partially spherical audio input unit (microphone main body) at one end. It can therefore be detected relatively easily by well-known luminance-based image analysis. In FIG. 6, the position 104a of the image 104 corresponding to the microphone 40, as detected by the microphone position detection means 92 through such image analysis, is indicated by a broken line.
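The embodiment does not specify a particular image analysis algorithm. Purely as an illustrative sketch, and assuming a black microphone against a brighter scene, a luminance threshold followed by a centroid calculation could locate the microphone image in a grayscale frame. The function name, the threshold value, and the toy frame below are all hypothetical; a practical detector would also have to check the elongated shape to reject other dark objects.

```python
def detect_dark_region_centroid(frame, threshold=40):
    """Return the (x, y) centroid of pixels darker than `threshold`, or None.

    `frame` is a list of rows of luminance values in the range 0 to 255.
    """
    xs, ys = [], []
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value < threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing dark enough was found in this frame
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Toy 6 x 5 frame: bright background with a dark, vertically elongated blob.
frame = [[200] * 6 for _ in range(5)]
for y in (1, 2, 3):
    frame[y][4] = 10

position = detect_dark_region_centroid(frame)
```

For a moving picture, the same function would be called on a frame sampled at each detection interval (for example every 0.1 seconds) so that the detected position follows the microphone.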

Alternatively, the microphone position detection means 92 preferably detects the position corresponding to the microphone 40 in the video captured by the digital camera 66 based on the detection result of the infrared light source position detection device 68. As described above, the camera angle (captured range) of the digital camera 66 is fixed, so once the relative position of the microphone 40 (infrared light source 40s) within the private room 14 has been detected, the relative position of the image 104 corresponding to the microphone 40 within the performance video 100 captured by the digital camera 66 can be determined. In this mode, it is preferable to determine experimentally in advance the correspondence between the camera angle (captured range) of the digital camera 66 and the detection range of the infrared light source position detection device 68, and to store that correspondence in the RAM 54 or the like.
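The form of the stored correspondence is not specified in the embodiment. As a minimal sketch, assuming the correspondence can be approximated by an independent linear scaling of each axis and that the detector's coordinates are already oriented like the image, the conversion from a detected room position to a pixel position could look as follows. All names and numerical values are hypothetical.

```python
def make_room_to_image_mapper(room_range, image_size):
    """Build a function mapping detector coordinates to pixel coordinates.

    room_range: ((x_min, x_max), (y_min, y_max)) covered by the fixed camera
                view, in the infrared detector's coordinate system.
    image_size: (width, height) of the captured frame in pixels.
    """
    (x_min, x_max), (y_min, y_max) = room_range
    width, height = image_size

    def to_pixel(room_x, room_y):
        # Linear interpolation along each axis (an assumed calibration model).
        u = (room_x - x_min) / (x_max - x_min) * (width - 1)
        v = (room_y - y_min) / (y_max - y_min) * (height - 1)
        return (round(u), round(v))

    return to_pixel


# Hypothetical calibration: a 4.0 x 3.0 detection range seen as 641 x 481 pixels.
to_pixel = make_room_to_image_mapper(((0.0, 4.0), (0.0, 3.0)), (641, 481))
mic_pixel = to_pixel(1.0, 1.5)
```

Because the camera angle is fixed, the mapper only needs to be built once from the stored calibration and can then be reused for every detection.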

The singer height calculation means 94 shown in FIG. 5 calculates the height of the singer from information on the singer's gender and year of birth. For example, the reserved song table provided in the RAM 54 stores, for the song selection number of each reserved song, the user ID of the user who reserved that song. Around the time each song is performed, the singer height calculation means 94 downloads over the communication line 18 the gender and the date of birth or age (actual age) of the relevant user (singer) stored in the SNS database 86, and calculates the singer's height from average heights predetermined for each year of birth and each gender. For example, if the singer's gender is "male", the date of birth is "June 18, 2001", and the average height of males born in 2001 is defined as "168 cm", the singer's height is calculated as "168 cm". Instead of the date of birth or age (actual age), the height may be calculated from the user's song age stored in the SNS database 86. For a guest user who is not registered in the SNS database 86, a predetermined default value such as "170 cm" is used as the singer's height.
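The lookup described above can be sketched as follows, for illustration only. The table contents, the dictionary layout, and the function name are hypothetical; only the 2001 male value of 168 cm and the 170 cm guest default come from the example in the text. Falling back to the default when the table has no entry for a registered user is an additional assumption of this sketch.

```python
DEFAULT_HEIGHT_CM = 170  # default used for guest users not registered in the SNS

# Hypothetical average-height table keyed by (gender, birth year).
AVERAGE_HEIGHT_CM = {
    ("male", 2001): 168,
}


def estimate_height_cm(user):
    """Return the estimated height for `user`, or the default for a guest.

    `user` is a dict with 'gender' and 'birth_year' keys, or None for a guest.
    """
    if user is None:
        return DEFAULT_HEIGHT_CM
    key = (user["gender"], user["birth_year"])
    # Assumed fallback: use the default when no table entry exists.
    return AVERAGE_HEIGHT_CM.get(key, DEFAULT_HEIGHT_CM)


registered = estimate_height_cm({"gender": "male", "birth_year": 2001})
guest = estimate_height_cm(None)
```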

The singer area determination means 96 determines the area corresponding to the singer in the video captured by the digital camera 66, based on the position corresponding to the microphone 40 detected by the microphone position detection means 92. As described above with reference to FIG. 6, when the relative position of the image 104 corresponding to the microphone 40 within the performance video 100 has been detected, the image 102 corresponding to the singer holding that microphone 40 can be expected to lie near or around it. The singer area determination means 96 therefore preferably determines a predetermined range that includes the position corresponding to the microphone 40, as detected by the microphone position detection means 92, to be the area of the performance video 100 in which the singer appears.

FIG. 7 illustrates an example of the control for determining the area corresponding to the singer from the position corresponding to the microphone 40 in the performance video 100 shown in FIG. 6. As shown in FIG. 7, the singer area determination means 96 preferably determines, in the performance video 100 captured by the digital camera 66, an elliptical area that includes the position corresponding to the microphone 40 detected by the microphone position detection means 92 to be the area 106 corresponding to the singer. For example, it sets an ellipse whose major axis is vertical with respect to the screen (performance video 100), and determines the area 106 so that the detected position of the microphone 40 lies on the major axis of the ellipse at a predetermined position above its center. As shown in FIG. 7, the area 106 determined by the singer area determination means 96 does not have to coincide with the image 102 corresponding to the singer; it is sufficient that it include at least part of that image. The area 106 may also contain images other than the image 102 corresponding to the singer, and the proportion (area) occupied by such other images may even be the larger.

The singer area determination means 96 preferably determines the area corresponding to the singer in the video captured by the digital camera 66 based on both the position corresponding to the microphone 40 detected by the microphone position detection means 92 and information on the singer's height, for example the height calculated by the singer height calculation means 94. For example, in the mode shown in FIG. 7 in which an elliptical area 106 corresponding to the singer is determined, the major-axis dimension (vertical dimension) of the ellipse is determined from the singer's height; that is, the taller the singer, the longer the major axis of the corresponding ellipse is set. As a result, the area corresponding to a relatively tall singer has a relatively long major axis and is vertically long, while the area corresponding to a relatively short singer has a relatively short major axis and is vertically short, so that a range suited to each singer's height is determined as the area corresponding to that singer.
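As an illustrative sketch of this determination, the following code places a vertical ellipse so that the detected microphone position lies on its major axis above the center, and scales only the major axis with the singer's height. The scale factor `px_per_cm`, the fixed semi-minor axis, the offset ratio, and the function names are hypothetical parameters chosen for the example; the embodiment does not give concrete values.

```python
def singer_ellipse(mic_xy, height_cm, px_per_cm=2.0, semi_minor_px=80.0,
                   mic_offset=0.25):
    """Return (cx, cy, semi_minor, semi_major) of the singer area in pixels.

    The microphone lies on the vertical major axis, above the center by
    `mic_offset` times the semi-major axis (image y grows downward).
    """
    mic_x, mic_y = mic_xy
    semi_major = height_cm * px_per_cm / 2   # taller singer -> longer major axis
    cx = mic_x                               # microphone is on the major axis
    cy = mic_y + mic_offset * semi_major     # center sits below the microphone
    return (cx, cy, semi_minor_px, semi_major)


def in_region(region, x, y):
    """Return True if pixel (x, y) lies inside or on the ellipse."""
    cx, cy, a, b = region
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0


tall = singer_ellipse((320, 200), 180)
short = singer_ellipse((320, 200), 150)
```

With these hypothetical parameters, a 180 cm singer yields a semi-major axis of 180 pixels and a 150 cm singer yields 150 pixels, while the microphone position itself always falls inside the area.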

The video composition control means 98 shown in FIG. 5 cuts out, from the video captured by the digital camera 66, the video of the area corresponding to the singer as determined by the singer area determination means 96, and composites it onto another video. For example, it cuts out of the performance video 100 the video corresponding to the area 106 determined by the singer area determination means 96, and composites it onto one of the background data items (moving picture or still image) stored in the background database 88. When the video captured by the digital camera 66 is a moving picture, the video of the area 106 cut out by the video composition control means 98 is likewise a moving picture. The background data stored in the background database 88, on the other hand, may be a moving picture or a still image. Possible modes therefore include compositing the moving picture of the area 106 onto a layer in front of a moving-picture background, and compositing it onto a layer in front of a still-image background.
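The cut-out and compositing step can be illustrated with the following minimal sketch, which copies the pixels inside the singer area from a captured frame onto a copy of a background frame of the same size. The function name and the toy frames are hypothetical, and the rectangular region test is used only to keep the example small; an elliptical test would be passed in the same way.

```python
def composite(frame, background, region_test):
    """Overlay the pixels of `frame` selected by `region_test` on `background`.

    `frame` and `background` are equally sized lists of rows of pixels.
    `region_test(x, y)` returns True for pixels inside the singer area.
    The background itself is left unmodified.
    """
    out = [row[:] for row in background]
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if region_test(x, y):
                out[y][x] = pixel  # front layer: the cut-out singer area
    return out


# Toy 4 x 3 frames: "F" marks camera pixels, "B" marks background pixels.
frame = [["F"] * 4 for _ in range(3)]
background = [["B"] * 4 for _ in range(3)]
result = composite(frame, background, lambda x, y: x in (1, 2) and y >= 1)
```

For a moving picture, the same operation would be repeated for each captured frame, using the current background frame (or the single still image) and the singer area determined for that moment.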

FIG. 8 shows the video 108 of the area 106 corresponding to the singer, determined as shown in FIG. 7, cut out of the performance video 100 shown in FIG. 6. FIG. 9 shows, as an example of a background onto which the video 108 of FIG. 8 is composited, a background video 110 in which avatars 112a, 112b, and 112c (hereinafter simply "avatars 112" unless otherwise distinguished) of the users performing karaoke together in the private room 14 (users who have logged in to the SNS on the same karaoke device 16) perform as a band. The video composition control means 98 cuts out (extracts) the video 108 of the area 106 determined by the singer area determination means 96, as shown in FIG. 7, from the video captured by the digital camera 66 and composites (pastes) it onto the front layer of a background video 110 such as that shown in FIG. 9. FIG. 10 illustrates a composite video 114 in which the video 108 of FIG. 8 has been composited onto the front layer of the background video 110 of FIG. 9. The composite video 114 is preferably displayed on the video display device 30 via the video output control unit 32, the video information decoder 34, and so on, but it may instead be displayed on the touch panel display of the electronic songbook device 22. As noted above, when the performance video 100 is a moving image that changes over time, the video 108 of the area corresponding to the singer is also a moving image. Because the microphone position detection means 92 detects the position of the microphone 40 in real time and the area 106 is determined from that result, the video 108 of an area that always contains the singer's image 102 can be cut out and composited, producing a composite video 114 that gives the impression that the singer is singing inside the background video 110.
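The cut-out and paste step can be sketched per frame as a masked copy. This is an illustration under stated assumptions, not the patented implementation: frames are modelled as equally sized nested lists of pixel values, the area is an axis-aligned ellipse, and the function name `composite_singer` is hypothetical.

```python
def composite_singer(frame, background, cx, cy, semi_minor, semi_major):
    """Copy the pixels of `frame` that fall inside the ellipse onto a copy
    of `background`, leaving every other background pixel untouched.

    Assumes `frame` and `background` have identical dimensions.
    """
    out = [row[:] for row in background]  # background is the rear layer
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            dx = (x - cx) / semi_minor
            dy = (y - cy) / semi_major
            if dx * dx + dy * dy <= 1.0:  # inside the singer area
                out[y][x] = pixel         # singer is the front layer
    return out
```

For a moving image the same function would be applied to each captured frame, with the ellipse centre updated from the latest microphone position.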

The video composition control means 98 preferably displays the composite video 114, in which the video 108 of the area corresponding to the singer is composited onto the background video 110 serving as the other video, as the karaoke background video while the karaoke device 16 outputs the performance song. That is, as shown in FIG. 11, the video 108 as shown in FIG. 8 is cut out and composited onto the front layer of the background video 110 as shown in FIG. 9, and the lyric text video 116 of the performance song is composited onto a layer further in front of the video 108 and displayed on the video display device 30. The user can thus enjoy karaoke while watching a karaoke video whose background is the composite video 114, which gives the impression that the singer is singing inside the background video 110.

The video composition control means 98 also preferably allows the other video, onto which the video 108 of the area corresponding to the singer cut out of the video captured by the digital camera 66 is composited, to be selected from among the plurality of background data items stored in the background database 88 in response to a user's input operation. FIG. 12 shows an example of a screen displayed on the video display device 30 or on the touch panel display of the electronic songbook device 22 for selecting that other video. As shown in FIG. 12, the video composition control means 98 preferably displays thumbnail videos (reduced videos) 118a, 118b, 118c, and 118d of the respective background data items stored in the background database 88 on the video display device 30 or on the touch panel display of the electronic songbook device 22, and has the user select one with the electronic songbook device 22, the remote control device 64, or the like. The thumbnail video 118a shown in FIG. 12 corresponds to the background video 110 shown in FIG. 9; when the thumbnail video 118a is selected, the video 108 of the area corresponding to the singer is composited onto the background video 110 as shown in FIG. 9.

The video composition control means 98 also preferably allows a plurality of other videos to be selected as targets onto which the video 108 of the area corresponding to the singer cut out of the video captured by the digital camera 66 is composited. That is, a plurality of background data items can be selected, in response to the user's input operation, from the background data stored in the background database 88 as targets for compositing the video 108 as the background of the karaoke video output while the karaoke device 16 outputs the performance song. Preferably, in that case, background data can be selected for each section of the performance song: separate background data is selected for each section of the target song, such as the A melody, the B melody, and the chorus, and the video 108 is composited onto that background data. In this mode, when the composite video 114 is displayed as the karaoke background video while the performance song is output, the background data forming the base (rearmost-layer video) of the composite video 114 changes according to the section of the song. A mode is also conceivable in which a plurality of background data items are selected without regard to song sections and the background data forming the base of the composite video 114 is changed at predetermined intervals as the song progresses.
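Both selection modes described above, per song section and at fixed intervals, reduce to a lookup on the elapsed playback time. The sketch below assumes sections are given as `(start_s, end_s, background)` tuples; that representation and the function names are hypothetical, since the patent does not specify how section boundaries are stored.

```python
def background_for_section(elapsed_s, sections, default):
    """Return the background assigned to the song section that contains
    `elapsed_s`, or `default` if no section matches."""
    for start_s, end_s, background in sections:
        if start_s <= elapsed_s < end_s:
            return background
    return default


def background_by_interval(elapsed_s, backgrounds, interval_s):
    """Cycle through `backgrounds`, switching every `interval_s` seconds,
    regardless of song sections."""
    index = int(elapsed_s // interval_s) % len(backgrounds)
    return backgrounds[index]
```

Either function would be called once per frame to choose the rearmost layer before the singer's video is composited in front of it.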

Preferably, the composite video 114 produced by the video composition control means 98 is transmitted to the server device 20 via the communication line 18 and stored in the SNS database 86 in association with the singer's user ID. More preferably, when the karaoke performance that uses the composite video 114 as its background video ends, the singer (user) can choose whether to upload the composite video 114, and the composite video 114 is stored in the SNS database 86 when a selection input to upload is made. In this mode, it is preferable to upload the composite video 114 as shown in FIG. 10, without the lyric text video 116 (that is, the video in which the video 108 of the area corresponding to the singer is composited onto the background video 110). Alternatively, the video 108 of the area corresponding to the singer as shown in FIG. 8 may be stored in the SNS database 86 so that it can be composited onto arbitrary background data when the corresponding moving image is distributed.

FIG. 13 is a flowchart explaining the main part of the singer video cut-out/composition control performed by the CPU 50 of the karaoke device 16; the routine is executed repeatedly at a predetermined cycle.

First, in step S1 (hereinafter "step" is omitted), after the plurality of background data items stored in the background database 88 have been displayed selectably on the video display device 30 or the like, it is determined whether any of them has been selected. If the determination in S1 is negative, S1 is repeated and the routine waits. If it is affirmative, it is determined in S2 whether a karaoke performance by the karaoke device 16 is starting, for example because a given performance song (reserved song) in the reserved-song table in the RAM 54 has reached its turn. If the determination in S2 is negative, S2 is repeated and the routine waits. If it is affirmative, in S3 the singer's gender and date of birth or age stored in the SNS database 86 are downloaded via the communication line 18, and the singer's height is calculated from average heights by birth year that are predetermined for each gender. Then, in S4, the song data of the performance song about to start is read from the song database.
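The height calculation in S3 amounts to a table lookup keyed by gender and birth year. The sketch below is only illustrative: the table values are placeholders invented for this example, not real statistics, and the nearest-birth-year rule and the names `PLACEHOLDER_AVERAGE_HEIGHT_CM` and `estimate_height_cm` are assumptions, since the patent says only that the averages are predetermined for each gender and birth year.

```python
# Placeholder values for illustration only; these are NOT real statistics.
PLACEHOLDER_AVERAGE_HEIGHT_CM = {
    "female": {1970: 150.0, 1990: 155.0},
    "male": {1970: 165.0, 1990: 170.0},
}


def estimate_height_cm(gender, birth_year, table=PLACEHOLDER_AVERAGE_HEIGHT_CM):
    """Return the tabulated average height for the given gender, using the
    tabulated birth year closest to `birth_year` (an assumed rule)."""
    by_year = table[gender]
    nearest_year = min(by_year, key=lambda year: abs(year - birth_year))
    return by_year[nearest_year]
```

When only the singer's age is available, the birth year could first be derived by subtracting the age from the current year.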

Next, in S5, karaoke performance output based on the song data read in S4 starts, and in parallel the digital camera 66 starts capturing video. In S6, the position 104a corresponding to the microphone 40 in the video captured by the digital camera 66 is detected based on image analysis, the detection result for the infrared light source 40s, or the like. In S7, the area 106 corresponding to the singer in the captured video is determined based on the singer's height calculated in S3 and the position 104a detected in S6. In S8, the video 108 of the area determined in S7 is cut out (extracted) from the video captured by the digital camera 66.

Next, in S9, the composite video 114, in which the video 108 cut out in S8 is composited onto the front layer of the background data selected in S1, is displayed on the video display device 30 or the like. The lyric text video 116 of the song being output is also composited onto the layer in front of the video 108 in the composite video 114 and displayed. In S10, it is determined whether the karaoke performance has ended. If the determination in S10 is negative, the processing from S6 onward is executed again. If it is affirmative, it is determined in S11 whether to upload to the server device 20 the composite video 114 that was displayed on the video display device 30 or the like during the performance. If the determination in S11 is negative, the routine ends. If it is affirmative, the composite video 114 displayed during the performance, or the video 108 of the area corresponding to the singer, is uploaded to the server device 20 as a video file in a predetermined format and stored in the SNS database 86 in association with the singer's user ID, after which the routine ends. In this control, S6 corresponds to the operation of the microphone position detection means 92, S3 to that of the singer height calculation means 94, S7 to that of the singer area determination means 96, and S8 and S9 to that of the video composition control means 98.
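The repeating part of the flowchart (S6 through S10) and the final upload decision (S11) can be outlined as follows. This is a structural sketch only: the detection, area determination, cut-out, composition, and upload steps are passed in as callables, and all function and parameter names are hypothetical rather than taken from the patent.

```python
def run_performance(frames, detect_mic, determine_region, cut_out, compose):
    """Process captured frames until the song ends and return the
    composited frames. The end of `frames` stands in for the S10 check."""
    composites = []
    for frame in frames:
        mic_position = detect_mic(frame)          # S6: locate microphone
        region = determine_region(mic_position)   # S7: singer area
        singer_clip = cut_out(frame, region)      # S8: cut out the area
        composites.append(compose(singer_clip))   # S9: paste on background
    return composites


def finish_performance(composites, upload_requested, upload):
    """S11: upload the composited video only if the user asked for it.
    Returns True when an upload was performed."""
    if upload_requested:
        upload(composites)
    return upload_requested
```

Because the microphone is located again for every frame, the cut-out area follows the singer as they move during the song.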

As described above, this embodiment comprises: the digital camera 66, serving as an imaging device that captures video of a predetermined range referenced to the karaoke device 16 while the karaoke device 16 outputs a performance song; the microphone position detection means 92 (S6), which detects the position 104a corresponding to the microphone 40 in the performance video 100 captured by the digital camera 66; the singer area determination means 96 (S7), which determines the area 106 corresponding to the singer in the performance video 100 based on the position 104a detected by the microphone position detection means 92; and the video composition control means 98 (S8 and S9), which cuts out, from the video captured by the digital camera 66, the video 108 of the area determined by the singer area determination means 96 and composites it onto the background video 110 serving as the other video. The area 106 corresponding to the singer can therefore be suitably identified from the position of the microphone 40 held by the singer, and the video 108 of that area can be cut out by extracting it. In other words, the embodiment provides a karaoke device 16 with which the singer's image can be easily cut out of the video captured during a karaoke performance and edited.

In addition, because the singer area determination means 96 determines the area 106 corresponding to the singer in the performance video 100 captured by the digital camera 66 based on the position 104a corresponding to the microphone 40 detected by the microphone position detection means 92 and on information about the singer's height, the area corresponding to the user in the video captured during a karaoke performance can be determined in a simple and practical manner.

The embodiment also includes the singer height calculation means 94 (S3), which calculates the singer's height from information about the singer's gender and birth year, and the singer area determination means 96 determines the area 106 corresponding to the singer in the performance video 100 captured by the digital camera 66 based on the position 104a corresponding to the microphone 40 detected by the microphone position detection means 92 and on the height calculated by the singer height calculation means 94. The area corresponding to the user in the video captured during a karaoke performance can therefore be determined in a simple and practical manner.

In addition, because the singer area determination means 96 determines an elliptical area 106 corresponding to the singer in the performance video 100 captured by the digital camera 66 and determines the major-axis dimension of the ellipse from information about the singer's height, the area corresponding to the user in the video captured during a karaoke performance can be determined in a simple and practical manner.

In addition, because the microphone position detection means 92 detects the position 104a corresponding to the microphone 40 by performing luminance-based image analysis on the performance video 100 captured by the digital camera 66, the area 106 corresponding to the singer can be identified in a practical manner from the position of the microphone 40 held by the singer.
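One simple form of luminance-based analysis is to take the centroid of all pixels brighter than a threshold. The sketch below assumes the microphone appears as the brightest region of the frame and that the frame is available as a nested list of 0 to 255 luminance values; the threshold value and the function name `detect_bright_spot` are hypothetical, and the patent does not prescribe this particular method.

```python
def detect_bright_spot(luma, threshold=240):
    """Return the (x, y) centroid of pixels whose luminance is at least
    `threshold`, or None when no pixel is bright enough."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(luma):
        for x, value in enumerate(row):
            if value >= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # microphone not visible in this frame
    return (sum_x / count, sum_y / count)
```

Returning `None` when nothing exceeds the threshold lets the caller decide how to handle frames in which the microphone cannot be found.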

In addition, the microphone 40 inputs voice information to the karaoke device 16 via a wireless signal, the embodiment includes the infrared light source position detection device 68, which detects the position of the microphone 40 based on the wireless signal from the microphone 40, and the microphone position detection means 92 detects the position 104a corresponding to the microphone 40 in the performance video 100 captured by the digital camera 66 based on the detection result of the infrared light source position detection device 68. The area 106 corresponding to the singer can therefore be identified in a practical manner from the position of the microphone 40 held by the singer.

In addition, because the video composition control means 98 displays the composite video 114, in which the video 108 of the area corresponding to the singer is composited onto the background video 110, as the karaoke background video while the karaoke device 16 outputs the performance song, the composite video 114 containing the singer's image can be used as the background video of the karaoke performance.

A preferred embodiment of the present invention has been described in detail above with reference to the drawings, but the present invention is not limited to this embodiment and may be implemented in other modes.

For example, in the embodiment described above, the microphone position detection means 92, the singer height calculation means 94, the singer area determination means 96, and the video composition control means 98 are all provided functionally in the CPU 50 of the karaoke device 16, but the present invention is not limited to this; some or all of these control functions may be provided functionally in the CPU of the electronic songbook device 22. The substantive processing of the microphone position detection means 92, the singer height calculation means 94, and the singer area determination means 96 may also be executed by the CPU 70 of the server device 20. Various implementations are conceivable depending on the design of the karaoke system 10.

In the embodiment described above, the singer area determination means 96 determines an elliptical area 106 corresponding to the singer in the performance video 100 captured by the digital camera 66, but the area need not be elliptical; it may be a rectangle, a rectangle with rounded corners, a circle, or some other irregular shape. In any of these modes, it is preferable that the vertical length of the determined area, relative to the video (display screen) captured by the digital camera 66, is set according to the singer's height.

In the embodiment described above, the singer area determination means 96 determines the area 106 corresponding to the singer based on the height calculated by the singer height calculation means 94, but the height need not be calculated; a numerical value for the height may instead be entered directly via the electronic songbook device 22 or the like. In that mode, the singer height calculation means 94 need not be provided. The size of the area corresponding to the singer may also itself be selectable; for example, a mode is conceivable in which one of the area sizes S, M, and L can be selected and entered with the electronic songbook device 22 or the like. Furthermore, the user may be asked to choose between an area for upper-body capture and an area for whole-body capture, and the size of the area may be changed according to that choice.

The embodiment described above assumes the existence of the SNS database 86, but the present invention is also suitably applied to a non-networked karaoke device that is not connected to the communication line 18. In that mode, the composite video 114 produced by the video composition control means 98 may be stored on the hard disk 56 of the karaoke device 16 so that it can be played back, or it may be handled for that day only, for example by storing it in the RAM 54 and erasing it once the day's karaoke performances are finished.

Although not specifically mentioned in the embodiment described above, when position detection is difficult, for example because the microphone 40 is temporarily placed on a table or switched off during an interlude in the karaoke performance, the microphone position detection means 92 may perform control such as keeping the last position at which detection was possible as the position of the microphone 40. That is, the karaoke device of the present invention also performs the various supplementary controls required in practice.
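Holding the last detectable position can be sketched as a small stateful wrapper around whatever detector is used. The class name `MicTracker` and the convention that a failed detection is reported as `None` are assumptions made for this example.

```python
class MicTracker:
    """Remember the most recent successful microphone detection and fall
    back to it while detection fails."""

    def __init__(self):
        self.last_position = None  # no detection has succeeded yet

    def update(self, detected):
        """Feed the latest detection result (an (x, y) tuple or None) and
        return the position to use for the current frame."""
        if detected is not None:
            self.last_position = detected
        return self.last_position
```

Until the first successful detection, `update` still returns `None`, so the caller needs a separate policy for that initial case.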

Although further examples are not given individually, the present invention may be implemented with various modifications within a scope that does not depart from its gist.

16: karaoke device, 40: microphone, 66: digital camera (imaging device), 68: infrared light source position detection device, 90: singer, 92: microphone position detection means, 94: singer height calculation means, 96: singer area determination means, 98: video composition control means, 104a: position corresponding to the microphone, 106: area corresponding to the singer, 108: video of the area corresponding to the singer, 110: background video (other video)

Claims

A karaoke apparatus that outputs a performance song selected from a large number of performance songs and amplifies and outputs sound input through a microphone, the karaoke apparatus comprising:
an imaging device that captures video of a predetermined range referenced to the karaoke apparatus while the karaoke apparatus outputs the performance song;
microphone position detecting means for detecting a position corresponding to the microphone included in the video captured by the imaging device;
singer area determination means for determining an area corresponding to a singer included in the video captured by the imaging device, based on the position corresponding to the microphone detected by the microphone position detecting means; and
video composition control means for cutting out, from the video captured by the imaging device, the video of the area corresponding to the singer determined by the singer area determination means and compositing it onto another video,
wherein the singer area determination means determines the area corresponding to the singer included in the video captured by the imaging device based on the position corresponding to the microphone detected by the microphone position detecting means and on information about the height of the singer.

A karaoke apparatus that outputs a performance song selected from a large number of performance songs and amplifies and outputs sound input through a microphone, the karaoke apparatus comprising:
an imaging device that captures video of a predetermined range referenced to the karaoke apparatus while the karaoke apparatus outputs the performance song;
microphone position detecting means for detecting a position corresponding to the microphone included in the video captured by the imaging device;
singer area determination means for determining an area corresponding to a singer included in the video captured by the imaging device, based on the position corresponding to the microphone detected by the microphone position detecting means;
video composition control means for cutting out, from the video captured by the imaging device, the video of the area corresponding to the singer determined by the singer area determination means and compositing it onto another video; and
singer height calculating means for calculating the height of the singer from information about the gender of the singer and information about the birth year of the singer,
wherein the singer area determination means determines the area corresponding to the singer included in the video captured by the imaging device based on the position corresponding to the microphone detected by the microphone position detecting means and on the height of the singer calculated by the singer height calculating means.

The karaoke apparatus according to claim 1 or 2, wherein the singer area determination means determines an elliptical area corresponding to the singer included in the video captured by the imaging device and determines the major-axis dimension of the ellipse based on information about the height of the singer.

The karaoke apparatus according to any one of claims 1 to 3, wherein the microphone position detecting means detects the position corresponding to the microphone by performing luminance-based image analysis on the video captured by the imaging device.

The karaoke apparatus according to any one of claims 1 to 3, wherein:
the microphone inputs voice information to the karaoke apparatus via a wireless signal;
the karaoke apparatus comprises a position detecting device that detects the position of the microphone based on the wireless signal from the microphone; and
the microphone position detecting means detects the position corresponding to the microphone included in the video captured by the imaging device based on the detection result of the position detecting device.