JP4257300B2

JP4257300B2 - Karaoke terminal device

Info

Publication number: JP4257300B2
Application number: JP2005030976A
Authority: JP
Inventors: 久雄伊東
Original assignee: アイピックス株式会社
Priority date: 2005-02-07
Filing date: 2005-02-07
Publication date: 2009-04-22
Anticipated expiration: 2025-02-07
Also published as: JP2006215497A

Description

The present invention relates to a karaoke terminal device that outputs a composite image as a moving image on the spot, without being restricted by the background.

Conventionally, karaoke terminal devices are known that, while a user is singing, repeatedly display predetermined, pre-recorded images together with the lyrics in a fixed cycle.

There are also karaoke terminal devices that use a technique called chroma key to composite the singer with an arbitrary background and display the result. With chroma key, a specific color (blue is used most often, though green is also used) is keyed out of the video frame to create a mask image, and that mask image can then be composited with an arbitrary background.
JP 2004-334049 A
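The chroma key technique described above can be sketched as follows. This is a minimal illustration, not code from the cited publication: the key color, the tolerance, and the function names `is_key_color` and `chroma_key` are assumptions, and images are represented as lists of rows of (R, G, B) tuples.

```python
# Minimal chroma key sketch. Images are lists of rows of (R, G, B) tuples.
KEY_COLOR = (0, 0, 255)  # assumed chroma-key blue
TOLERANCE = 60           # assumed per-channel tolerance


def is_key_color(pixel, key=KEY_COLOR, tol=TOLERANCE):
    """Return True when every channel of the pixel is within tol of the key color."""
    return all(abs(c - k) <= tol for c, k in zip(pixel, key))


def chroma_key(frame, background):
    """Replace key-colored pixels of frame with the matching background pixels."""
    return [
        [bg if is_key_color(fg) else fg for fg, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


frame = [[(0, 0, 255), (200, 50, 50)]]
background = [[(1, 2, 3), (4, 5, 6)]]
result = chroma_key(frame, background)
```

Only pixels close to the key color are replaced, which is why the singer must stand in front of a backdrop of that color.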

However, karaoke terminal devices that display pre-recorded backgrounds in a fixed cycle cannot be differentiated from one another by function, whatever the model, and all follow the same pattern. Moreover, when the device is used for a long time or many times over, the same images are displayed repeatedly and the user becomes bored.

A method of compositing and displaying an image of the user singing with a background, as in the chroma key technique described above, is also known. However, because chroma key is used to extract the image, the background is restricted to a limited set of colors such as chroma key blue. As a result, the singer has to sing in front of, for example, a solid chroma-key-blue backdrop, which dampens the user's enjoyment. Furthermore, even when a frame image with a predetermined transparent area is composited on top of the captured image, the background of the captured image appears unchanged in the composite image, so the quality of the composite image is low.

The present invention has been made in view of the problems described above. Its object is to provide a karaoke terminal device that can photograph a singer regardless of the background, composite the singer with an arbitrary background image as a moving image, and output the result on the spot.

To output a composite image regardless of the background being photographed, it is necessary to recognize the moving object, that is, the singer, create a mask image having a transparent region, and composite it with an arbitrary background image. Methods for identifying the portion to be extracted from an arbitrary moving image include methods that use motion vectors, methods that use an active contour model, and methods that track changes in feature points on a contour.

The method using motion vectors divides the input image into blocks of a predetermined size and computes, as a "motion vector", the displacement to the position of a similar block in the current image. This method requires a very large amount of computation and cannot be used in a karaoke device, which must respond immediately. Moreover, even if the blocks are made finer, the result still has only block-level resolution, and contour data cannot be obtained accurately at the pixel level.
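To make the cost of the motion vector approach concrete, the following is a minimal sketch of exhaustive block matching for a single block. It is an illustration under assumptions, not a method taken from this document: frames are single-channel (grayscale) lists of rows, similarity is measured by the sum of absolute differences, and the names `block_cost` and `motion_vector` are hypothetical.

```python
def block_cost(prev, curr, py, px, cy, cx, size):
    """Sum of absolute differences between a block in prev and a block in curr."""
    return sum(
        abs(prev[py + dy][px + dx] - curr[cy + dy][cx + dx])
        for dy in range(size)
        for dx in range(size)
    )


def motion_vector(prev, curr, y, x, size, search):
    """Return (dy, dx) from the block at (y, x) in prev to its best match in curr."""
    height, width = len(curr), len(curr[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ny, nx = y + dy, x + dx
            # Skip candidate positions that fall outside the current frame.
            if 0 <= ny <= height - size and 0 <= nx <= width - size:
                cost = block_cost(prev, curr, y, x, ny, nx, size)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]


def blank(height, width):
    return [[0] * width for _ in range(height)]


prev = blank(6, 6)
curr = blank(6, 6)
for dy in range(2):
    for dx in range(2):
        prev[1 + dy][1 + dx] = 9  # bright 2x2 block at (1, 1)
        curr[2 + dy][3 + dx] = 9  # same block moved to (2, 3)

vector = motion_vector(prev, curr, 1, 1, size=2, search=2)
```

Each block requires one cost evaluation per candidate displacement, and each evaluation touches every pixel of the block, which illustrates why the computation grows quickly while the result remains a single vector per block.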

In the method using an active contour model, the contour is deformed so as to minimize the sum of the internal energy, determined by the shape of the contour, the image energy, determined by the properties of the image, and the external energy, applied from outside; the contour of the object in the image is obtained when the energy reaches its minimum. However, the contour detected by an active contour model changes with the parameters that balance these energies, so it is very difficult to choose parameters that detect the intended contour, and the contour cannot be extracted well when the object is concave. This method is therefore also hard to initialize, and it is difficult to use in a karaoke terminal device, which must respond immediately and must not restrict its subject to the singer alone.

To solve the problems described above, the present invention provides the following karaoke terminal device.

(1) A karaoke terminal device having accompaniment sound generating means for generating accompaniment information including at least an accompaniment sound, and karaoke information control means for controlling information related to karaoke, the device comprising: an audio information storage area for storing the singer's voice and the accompaniment information; a reference image storage area for storing a reference image; a pre-combination image storage area for storing a pre-combination image; an extraction target portion storage area for storing an extraction target portion; and a background image storage area for storing pre-recorded background images selectable by the singer, wherein the karaoke information control means includes: audio information collecting means for collecting the singer's voice and the accompaniment information and storing them in the audio information storage area; reference image collecting means for collecting a reference image and storing it in the reference image storage area; pre-combination image collecting means for collecting a pre-combination image and storing it in the pre-combination image storage area; extraction target portion recognizing means for converting the color of each pixel of the reference image stored in the reference image storage area and of the pre-combination image stored in the pre-combination image storage area into numerical values, comparing them pixel by pixel, and recognizing the pixels that differ as the extraction target portion; target portion extracting means for extracting the extraction target portion recognized by the extraction target portion recognizing means and storing it in the target portion storage area; combining means for superimposing the extracted image stored in the target portion storage area on the background image stored in the background image storage area; and output means for outputting the composite image produced by the combining means together with the audio information stored in the audio information storage portion.

According to the invention described in (1), the reference image is stored in the reference image storage area; the extraction portion recognizing means compares each frame of the collected pre-combination images with the reference image and extracts the portions that differ in color to create a mask image; and the combining means can composite it with a background image chosen freely by the user. A composite image that does not depend on the real background can therefore be output.
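The per-pixel comparison and the subsequent compositing can be sketched as follows. This is a minimal illustration under assumptions: the document only says that pixels whose numerical color values differ are recognized, so the difference measure (sum of absolute channel differences), the threshold, and the names `color_difference`, `build_mask`, and `composite` are hypothetical. Images are lists of rows of (R, G, B) tuples.

```python
THRESHOLD = 30  # assumed: minimum color difference that counts as "different"


def color_difference(p, q):
    """Sum of absolute per-channel differences between two (R, G, B) pixels."""
    return sum(abs(a - b) for a, b in zip(p, q))


def build_mask(reference, frame, threshold=THRESHOLD):
    """True where the frame differs from the reference, i.e. the extraction target."""
    return [
        [color_difference(r, f) > threshold for r, f in zip(ref_row, frame_row)]
        for ref_row, frame_row in zip(reference, frame)
    ]


def composite(frame, mask, background):
    """Lay the extracted pixels of frame over the selected background image."""
    return [
        [f if m else b for f, m, b in zip(frame_row, mask_row, bg_row)]
        for frame_row, mask_row, bg_row in zip(frame, mask, background)
    ]


reference = [[(10, 10, 10), (10, 10, 10)]]    # empty room
frame = [[(12, 9, 10), (200, 150, 120)]]      # singer occupies the second pixel
background = [[(0, 0, 255), (0, 255, 0)]]     # background chosen by the singer

mask = build_mask(reference, frame)
output = composite(frame, mask, background)
```

A nonzero threshold is used here so that small sensor noise in unchanged pixels is not mistaken for the singer; a threshold of zero would correspond to a strict equality test.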

(2) The karaoke terminal device according to (1), wherein the reference image collecting means and the pre-combination image collecting means are constituted by a compound-eye camera.

(3) The karaoke terminal device according to (2), wherein, when collecting a series of moving images, the reference image collecting means first adjusts the orientation of each camera of the compound-eye camera on the basis of a reference point marked in advance somewhere in the background, after which one camera of the compound-eye camera collects a pre-combination image at a certain point in time within the collection period of the series of moving images and stores it in the pre-combination image storage area, and the other camera collects a reference image at that same point in time and stores it in the reference image storage area.

According to the invention described in (2) or (3), images are collected with a compound-eye camera whose individual cameras have been aimed at the reference point in advance, so that when the collected images are compared, portions whose colors do not match appear. Those portions can be recognized as new portions absent from the reference image. The singer can thus be extracted without initial setup and without complex computation.

(4) The karaoke terminal device according to (1), wherein the reference image is an image collected in advance, before the series of moving images is collected, and the pre-combination image is an image collected at a certain point in time within the collection period of the series of moving images.

According to the invention of (4), pixels that differ in color tone between the pre-combination image and the reference image collected in advance are recognized as the extraction target portion, so the difference from the reference image becomes the extraction target portion. As a result, the mask image can be created by a simple comparison program without complex computation. Unlike chroma key, a composite image can be output without restricting the background, so the karaoke user's enjoyment is not dampened. In addition, because the pre-combination image is compared with a reference image collected in advance, the extraction target portion can be identified even when it hardly moves.

(5) The karaoke terminal device according to (4), wherein the karaoke information control means outputs control signal information that controls the lighting during a predetermined period within the moving image collection period; the reference image collecting means collects reference images in advance, before the series of moving images is collected, and stores them in the reference image storage area together with the control signal information; the pre-combination image collecting means collects a pre-combination image at a certain point in time within the predetermined period and stores it in the pre-combination image storage area together with the control signal information; and the extraction portion recognizing means synchronizes the reference image and the pre-combination image on the basis of the control signal information and compares them.

According to the invention of (5), when the reference images are collected they are stored in the reference image storage area, together with the lighting control information, as a series of images covering a fixed period. When the pre-combination images are collected, their time axis is aligned frame by frame with the series of reference images, and the moving object is recognized by comparing frames at the same position on the time axis. A composite image that copes with changes in lighting can thus be provided.
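One way to picture the synchronization is to key each stored reference frame by the step of the lighting control signal under which it was captured, and to compare every live frame with the reference recorded at the same step. The sketch below assumes this keyed-lookup design; the integer step index, the threshold, and the names `record_references` and `mask_for_frame` are hypothetical and are not specified in the document.

```python
def record_references(frames_with_signal):
    """Store reference frames keyed by the lighting control step they were shot under."""
    return {step: frame for step, frame in frames_with_signal}


def mask_for_frame(references, step, frame, threshold=30):
    """Compare a live frame with the reference captured at the same lighting step."""
    reference = references[step]
    return [
        [
            sum(abs(a - b) for a, b in zip(r, f)) > threshold
            for r, f in zip(ref_row, frame_row)
        ]
        for ref_row, frame_row in zip(reference, frame)
    ]


# Step 0: lights dimmed. Step 1: lights bright. One-pixel frames for brevity.
references = record_references([
    (0, [[(10, 10, 10)]]),
    (1, [[(200, 200, 200)]]),
])

live = [[(205, 198, 200)]]  # empty room captured while the lights are bright
synced = mask_for_frame(references, 1, live)
unsynced = mask_for_frame(references, 0, live)
```

With the matching step, the bright empty room is correctly treated as background; compared against the wrong reference, the lighting change alone would be mistaken for a moving object.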

(6) The karaoke terminal device according to (1), wherein the reference image is an image collected one frame before a certain point in time within the collection period of the series of moving images, and the pre-combination image is an image collected at that point in time.

According to the invention of (6), the reference image is the frame immediately preceding the pre-combination image at any given point in time, so the extraction target portion can be identified even when the environment of the karaoke room changes greatly.
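Using the preceding frame as the reference amounts to frame differencing, which can be sketched as a generator over the frame sequence. The threshold and the name `moving_masks` are assumptions made for illustration; frames are lists of rows of (R, G, B) tuples.

```python
def moving_masks(frames, threshold=30):
    """Yield, for each frame after the first, a mask of pixels that changed
    relative to the immediately preceding frame."""
    previous = None
    for frame in frames:
        if previous is not None:
            yield [
                [
                    sum(abs(a - b) for a, b in zip(p, c)) > threshold
                    for p, c in zip(prev_row, curr_row)
                ]
                for prev_row, curr_row in zip(previous, frame)
            ]
        previous = frame


frames = [
    [[(0, 0, 0), (0, 0, 0)]],
    [[(0, 0, 0), (255, 255, 255)]],  # something appears in the second pixel
    [[(0, 0, 0), (255, 255, 255)]],  # and then stays still
]
masks = list(moving_masks(frames))
```

Because the reference is refreshed every frame, gradual changes in the room are absorbed automatically. The trade-off visible in the last mask is that a subject who stays still produces no difference, which is the case the fixed reference image of (4) handles.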

(7) The karaoke terminal device according to any one of (1) to (6), wherein, in recognizing the extraction target portion, the extraction target portion recognizing means captures the contour of the extraction target portion and tracks its movement, and the target portion extracting means extracts the pixels enclosed by the contour as the extraction target portion.

According to the invention of (7), the extraction target portion is recognized on the basis of contour data alone, which further reduces the amount of computation and lightens the load on the device. As a result, the composite image can be output easily.

(8) A karaoke terminal device having accompaniment sound generating means for generating accompaniment information including at least an accompaniment sound, and karaoke information control means for controlling information related to karaoke, the device comprising: an audio information storage area for storing the singer's voice and the accompaniment information; and a background image storage area for storing pre-recorded background images selectable by the singer, wherein the karaoke information control means includes: audio information collecting means for collecting the singer's voice and the accompaniment information and storing them in the audio information storage area; image collecting means for collecting a series of moving images and storing them in the pre-combination image storage area; combining means for superimposing the background image stored in the background image storage area on the series of moving images; and output means for outputting the composite image produced by the combining means together with the audio information stored in the audio information storage portion, and wherein a transparent portion for superimposing the background image on top of the extracted image is set in the background image in advance, and the transparency at the edge of the transparent portion changes gradually.

According to the invention described in (8), a transparent portion is set in the background image in advance, so the edge is blended softly while the amount of computation for compositing is reduced, and high-quality compositing can be performed more easily.
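The gradual change in transparency at the edge can be illustrated with ordinary alpha blending. In the sketch below the background image carries a per-pixel opacity map prepared in advance (1.0 is fully opaque background, 0.0 is fully transparent), and the values ramp between the two near the edge of the transparent portion. The opacity map representation and the names `blend_pixel` and `overlay_background` are assumptions for illustration.

```python
def blend_pixel(bg, fg, alpha):
    """Mix a background pixel over a camera pixel; alpha is the background opacity."""
    return tuple(round(alpha * b + (1 - alpha) * f) for b, f in zip(bg, fg))


def overlay_background(frame, background, alpha_map):
    """Superimpose the background image on the camera frame using its opacity map."""
    return [
        [blend_pixel(b, f, a) for f, b, a in zip(frame_row, bg_row, alpha_row)]
        for frame_row, bg_row, alpha_row in zip(frame, background, alpha_map)
    ]


frame = [[(100, 100, 100)] * 3]     # camera image
background = [[(0, 0, 200)] * 3]    # background image with a transparent portion
alpha_map = [[1.0, 0.5, 0.0]]       # opaque, edge ramp, fully transparent

blended = overlay_background(frame, background, alpha_map)
```

Because the opacity map is fixed in advance, no per-frame recognition of the singer is needed; the only per-frame work is the blend itself.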

(9) The karaoke terminal device according to any one of (1) to (7), wherein the pre-combination image includes, around the output target pixel region needed for image output, pixels corresponding to a shake compensation region, and the extraction portion recognizing means, at the certain point in time, converts the color of each pixel of the pre-combination image stored in the pre-combination image storage area into a numerical value, compares it with the reference image stored in the reference image storage area while shifting the reference image up, down, left, and right one pixel at a time, shifts the pre-combination image by the shift amount at which the sum, over all pixels of the output target pixel region, of the differences in the color values is smallest, and stores the shifted image in the pre-combination image storage area.

According to the invention described in (9), the pre-combination image is captured with pixels corresponding to a shake compensation region around the output target region, so that when shake occurs, for example because of vibration, the image can be shifted by the corresponding amount and stored as the pre-combination image. As a result, compositing is unaffected by shake and a higher-quality image can be provided.
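The shift search can be sketched as follows. The sketch makes several simplifying assumptions that are not in the document: pixels are single numerical values rather than colors, the reference image has exactly the size of the output target region, and instead of shifting the reference image the code slides an output-sized window over the larger pre-combination image, which is equivalent for finding the best offset. The names `crop`, `best_shift`, and `stabilize` are hypothetical.

```python
def crop(image, top, left, height, width):
    """Return the height x width window of image whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def best_shift(reference, frame, margin):
    """Find the (dy, dx) within the margin that minimizes the total difference."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-margin, margin + 1):
        for dx in range(-margin, margin + 1):
            window = crop(frame, margin + dy, margin + dx, height, width)
            cost = sum(
                abs(r - p)
                for ref_row, win_row in zip(reference, window)
                for r, p in zip(ref_row, win_row)
            )
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def stabilize(reference, frame, margin):
    """Return the output-sized window of frame that best lines up with the reference."""
    dy, dx = best_shift(reference, frame, margin)
    return crop(frame, margin + dy, margin + dx, len(reference), len(reference[0]))


reference = [[1, 2],
             [3, 4]]
# Captured with a 1-pixel margin; the camera was jolted, so the scene sits one row low.
frame = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 1, 2, 0],
         [0, 3, 4, 0]]

shift = best_shift(reference, frame, margin=1)
aligned = stabilize(reference, frame, margin=1)
```

The margin bounds the search to (2 x margin + 1) squared candidate offsets, so the cost stays small compared with per-block motion estimation.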

(10) The karaoke terminal device according to any one of (1) to (9), wherein the karaoke terminal device can communicate with a server on a network, and the output means transmits the composite image data to the server in response to an operation by the singer.

According to the invention described in (10), transmitting the composite image data to a server on the network makes it possible to publish the singing information, that is, the actual sung performance, together with the composite image beyond the confines of the karaoke room. This gives many people the chance to hear the performance and motivates the singer, which increases the singer's enjoyment.

According to the present invention, a karaoke terminal device can immediately provide an image composited with an arbitrary background, without being restricted to a specific background, which increases the singer's enjoyment.

[Configuration of karaoke terminal device]
Preferred embodiments of the present invention are described below with reference to the drawings.

As shown in the block diagram of FIG. 1, the karaoke terminal device comprises accompaniment sound generating means 10 and karaoke information control means 20.

The accompaniment sound generating means 10 has a function of displaying the image included in the accompaniment information on the display unit and generating the accompaniment sound included in the accompaniment information, according to which existing accompaniment information the singer has selected. The existing accompaniment information includes an accompaniment sound. The accompaniment information need only include at least an accompaniment sound.

As shown in FIG. 1, the karaoke information control means 20 includes at least pre-combination image collecting means 21, reference image collecting means 22, extraction target portion recognizing means 25, target portion extracting means 26, combining means 40, output means 50, and audio information collecting means 32. The karaoke information control means 20 extracts part of the collected series of moving images, composites it with a background selected in advance, and outputs the result.

The pre-combination image collecting means 21 shown in FIG. 1 collects pre-combination images. It is constituted by the video camera 232A, but the number of cameras is not limited to one. The collected pre-combination images are stored in the pre-combination image storage area.

The reference image collecting means 22 shown in FIG. 1 collects reference images. It is constituted by the video camera 232A, and the video camera 232A that constitutes the pre-combination image collecting means 21 may serve both purposes. In that case, the reference image must be collected before the pre-combination images are collected. The collected reference image is stored in the reference image storage area.

The extraction target portion recognizing means 25 shown in FIG. 1 recognizes, when compositing is performed, the portion to be extracted from the pre-combination image. As explained later with reference to FIGS. 9 to 14, recognition of the extraction target portion is not limited to a single method.

The target portion extracting means 26 shown in FIG. 1 extracts from the pre-combination image the extraction target portion recognized by the extraction target portion recognizing means 25, creates a mask image, and stores it in the target portion extraction area so that it can be composited with the background image.

The combining means 40 shown in FIG. 1 composites the mask image stored in the target portion storage area 27 with the background image selected in advance.

The output means 50 shown in FIG. 1 outputs the audio information together with the composite image created by the combining means 40. The output may go to the display that constitutes the display unit 210 and to the mixing amplifier 218, or it may be transmitted to a server on the network.

[Electrical configuration of karaoke terminal device]
The electrical configuration of the karaoke terminal device is described with reference to FIG. 2.

As shown in FIG. 2, in the karaoke terminal device 20, the following are connected to the data bus: a CPU 202 serving as the control unit; a memory 204; a communication interface (hereinafter, communication I/F 206); a storage unit 208 composed of a RAID (Redundant Arrays of Inexpensive Disks) or the like; a display unit 210 composed of a liquid crystal display panel, a CRT, or the like; an operation unit 212 consisting of a remote control, a keyboard, or the like; a synthesizer 216 for generating sound from audio data; a mixing amplifier 218 for editing and amplifying audio; and an AV data processing unit 220 for editing images and audio.

The mixing amplifier 218, which serves as control means, is connected to a microphone that picks up the singer's voice, a speaker 214 that produces sound, and so on. Here, a singer is a person who sings to the karaoke accompaniment information. A user is not only a singer but anyone who uses the karaoke terminal device in the karaoke room, even without singing.

In response to commands supplied from the CPU 202, the mixing amplifier 218 controls the connected microphone and speaker 214, collects the singer's voice input from the microphone, collects the users' voices input from the sound-collecting microphone, plays the collected audio through the speaker 214, and supplies the collected audio to the AV data processing unit 220. The CPU 202 therefore causes the mixing amplifier 218 to perform its various functions by sending it the corresponding commands.

Video cameras 232A, 232B, and so on are connected to the AV data processing unit 220, which serves as control means. The video camera 232A that captures the moving image of the singer is not limited to a single unit.

In response to commands supplied from the CPU 202, the AV data processing unit 220 collects the images of users and of the singer supplied from the video cameras 232A, 232B, and so on, stores the audio supplied from the mixing amplifier 218, edits the collected images and the supplied audio, and edits images using special composite images stored in advance. The AV data processing unit 220 corresponds to part of the collecting means and also to part of the editing means that edits the collected image information. The CPU 202 therefore causes the AV data processing unit 220 to perform its various functions by sending it the corresponding commands.

The CPU 202 executes various kinds of processing in accordance with the programs stored in the storage unit 208. By reading and executing those programs, the CPU 202 cooperates with the hardware described above to realize the various means described later.

The storage unit 208 stores accompaniment information including accompaniment sounds and images, programs with which the CPU 202 controls the operation of the karaoke information control server, and so on. The accompaniment information contains the karaoke accompaniment sound and the images displayed on the display unit 210 in synchronization with that sound, and is associated with song selection information that includes a song selection number. The specific programs are described later.

In this embodiment the storage unit 208 is used as the medium that stores the programs and the like, but the present invention is not limited to this. Any computer-readable storage medium may be used; for example, the programs may be recorded on a storage medium such as a ROM, a CD-ROM, or a DVD. The programs need not be recorded in advance and may instead be loaded into the memory 204 or the like after power-on. Each program may also be recorded on a separate storage medium.

The memory 204 serves as a temporary storage area for the CPU 202 and stores various flags and variable values. Specific examples of the data stored in the memory 204 are as follows.

The memory 204 stores, among other things, an accompaniment-in-progress flag for determining whether accompaniment is being generated, song selection information indicating the selected song, a singing record flag for determining whether to collect the singer's image and voice, and an operation flag for determining whether there is an abnormality in the operating state of the karaoke terminal device. The song selection information comprises the song selection currently being executed and the song selections waiting to be executed.
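The flags and song selection information listed above can be pictured as a small state record. The sketch below is purely illustrative: the class name `TerminalState`, its field names, the `start_next` helper, and the song numbers are hypothetical and do not come from the document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TerminalState:
    """Hypothetical layout of the working data held in the memory 204."""
    accompaniment_playing: bool = False       # accompaniment-in-progress flag
    record_singing: bool = False              # singing record flag
    operating_normally: bool = True           # operation flag
    current_selection: Optional[str] = None   # song selection being executed
    queued_selections: List[str] = field(default_factory=list)  # waiting selections

    def start_next(self) -> Optional[str]:
        """Promote the next waiting selection, if any, and update the flag."""
        if self.queued_selections:
            self.current_selection = self.queued_selections.pop(0)
            self.accompaniment_playing = True
        else:
            self.current_selection = None
            self.accompaniment_playing = False
        return self.current_selection


state = TerminalState(queued_selections=["1234-56", "7890-12"])
started = state.start_next()
```

Keeping the executing selection separate from the waiting ones mirrors the distinction the description draws between the two kinds of song selection information.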

尚、本実施形態においては、ＣＰＵ２０２の一時記憶領域としてメモリ２０４を用いているが、本発明はこれに限らず、読み書き可能な記憶媒体であればよい。 In this embodiment, the memory 204 is used as a temporary storage area of the CPU 202. However, the present invention is not limited to this, and any readable / writable storage medium may be used.

図３に示されるカラオケ端末装置２０の正面には、表示部２１０が備えられている。この表示部２１０は、複数の表示領域２１０ａ、２１０ｂ、２１０ｃ、２１０ｄ、を有している。メイン表示領域である表示領域２１０ａには、後述するように、選択された画像が表示される。表示領域２１０ｂ、表示領域２１０ｃには、伴奏情報に含まれる画像が表示される。表示領域２１０ｄには、歌唱者の画像が表示される。この歌唱者の画像は、カラオケの伴奏情報に対する歌唱者の画像であり、後述する歌唱情報に含まれている。また、この歌唱情報は、歌唱者の画像及び音声、伴奏情報を含むものである。 A display unit 210 is provided on the front of the karaoke terminal device 20 shown in FIG. 3. The display unit 210 has a plurality of display areas 210a, 210b, 210c, and 210d. As will be described later, the selected image is displayed in the display area 210a, which is the main display area. Images included in the accompaniment information are displayed in the display areas 210b and 210c. An image of the singer is displayed in the display area 210d. This image shows the singer performing to the karaoke accompaniment information and is included in the singing information described later. The singing information includes the singer's image and sound as well as the accompaniment information.

尚、本実施形態においては、表示部２１０を１つの表示装置として構成したが、本発明はこれに限らず、複数の表示装置として構成してもよい。 In the present embodiment, the display unit 210 is configured as a single display device, but the present invention is not limited to this, and may be configured as a plurality of display devices.

表示部２１０の下方には、カラオケ演奏装置１０２と、画像合成装置１０４と、が備えられている。 Below the display unit 210, a karaoke performance device 102 and an image composition device 104 are provided.

画像合成装置１０４は、特殊合成画像を生成する機能、集音マイクから周辺の音声を収集する機能、を有する。画像合成装置１０４は、収集手段の一部に相当する。また、画像合成装置１０４は、収集された歌唱情報を編集する編集手段の一部に相当する。 The image synthesizing device 104 has a function of generating a special synthesized image and a function of collecting peripheral sounds from a sound collecting microphone. The image composition device 104 corresponds to a part of the collection unit. The image composition device 104 corresponds to a part of editing means for editing the collected singing information.

画像合成装置１０４の側方には、メイン画面切換スイッチ１０６が備えられている。このメイン画面切換スイッチ１０６は、メイン表示領域である表示領域２１０ａに表示させる画面を操作に応じて切り換える機能を有する。 A main screen changeover switch 106 is provided on the side of the image composition device 104. The main screen changeover switch 106 has a function of switching a screen to be displayed in the display area 210a, which is a main display area, according to an operation.

また、このカラオケ端末装置２０には、リモコン１０８、無線キーボード１１０が備えられている。リモコン１０８、無線キーボード１１０の操作に応じて、各種の機能が実行される。リモコン１０８、無線キーボード１１０は、後述する操作部２１２（図３参照）の一部に相当する。 The karaoke terminal device 20 includes a remote control 108 and a wireless keyboard 110. Various functions are executed in accordance with operations of the remote control 108 and the wireless keyboard 110. The remote control 108 and the wireless keyboard 110 correspond to a part of an operation unit 212 (see FIG. 3) described later.

尚、本実施形態においては、カラオケ演奏装置１０２と、画像合成装置１０４と、を備える構成としたが、本発明はこれに限らず、カラオケ演奏装置１０２と、画像合成装置１０４等の各種の装置を省略してもよく、この場合には、それらの有している機能を他の装置に備える構成とすることが好適である。もちろん、このカラオケ演奏装置１０２は、一体であっても別体であってもよい。 In this embodiment, the karaoke performance device 102 and the image composition device 104 are both provided. However, the present invention is not limited to this, and devices such as the karaoke performance device 102 and the image composition device 104 may be omitted; in that case, it is preferable that their functions be provided in other devices. Of course, the karaoke performance device 102 may be integrated or separate.

[カラオケ情報制御手段の動作]
上述したように構成されたカラオケ情報制御手段２０の動作について図４を用いて説明する。 [Operation of karaoke information control means]
The operation of the karaoke information control means 20 configured as described above will be described with reference to FIG. 4.

図４に示すように、ステップＳ１１においては、選曲処理を実行する。この処理において、ＣＰＵ２０２は、各種の入力操作等に応じて、選曲に応じた伴奏を発生させるべく、伴奏を発生するための選曲情報の処理を行う。詳しくは、図５を用いて後述する。この処理が終了した場合には、ステップＳ１２に処理を移す。 As shown in FIG. 4, in step S11, a music selection process is executed. In this process, the CPU 202 processes the music selection information in response to various input operations so that the accompaniment corresponding to the selected song can be generated. Details will be described later with reference to FIG. 5. When this process ends, the process moves to step S12.

ステップＳ１２において、ＣＰＵ２０２は、合成モード選択処理を実行する。この処理において、ＣＰＵ２０２は、合成処理を行うか否かの判別を行い、合成処理を行うと判別した場合は、基準画像を収集し、基準画像保存領域に保存する。さらに、利用者が選択した背景画像データを背景画像保存領域２８に渡す。詳しくは、図６を用いて後述する。この処理が終了した場合には、ステップＳ１３に処理を移す。 In step S12, the CPU 202 executes a synthesis mode selection process. In this process, the CPU 202 determines whether or not to perform the synthesis process. If it determines that the synthesis process is to be performed, the CPU 202 collects the reference image and stores it in the reference image storage area. It also passes the background image data selected by the user to the background image storage area 28. Details will be described later with reference to FIG. 6. When this process ends, the process moves to step S13.

ステップＳ１３において、ＣＰＵ２０２は伴奏発生処理を実行する。この処理において、ＣＰＵ２０２は、伴奏データをＡＶデータ処理部２２０に渡す。ＡＶデータ処理部は、受け取った伴奏データに基づいて、スピーカ２１４を通じて伴奏を発生する。この処理が終了した場合には、ステップＳ１４に処理を移す。 In step S13, the CPU 202 executes accompaniment generation processing. In this process, the CPU 202 passes the accompaniment data to the AV data processing unit 220. The AV data processing unit generates an accompaniment through the speaker 214 based on the received accompaniment data. If this process ends, the process moves to a step S14.

ステップＳ１４において、ＣＰＵ２０２は、撮影開始処理を実行する。この処理において、ＣＰＵ２０２は、ステップＳ１２で基準画像を収集済みである場合は、合成前画像を収集して合成前画像保存領域２３に保存する。また、ステップＳ１２において、基準画像を収集していない場合は、基準画像及び合成前画像を収集する。詳しくは図７から図８で後述する。この処理が終了した場合には、ステップＳ１５に処理を移す。 In step S14, the CPU 202 executes shooting start processing. In this process, if the reference image has already been collected in step S12, the CPU 202 collects the pre-combination image and stores it in the pre-combination image storage area 23. If the reference image was not collected in step S12, the CPU 202 collects both the reference image and the pre-combination image. Details will be described later with reference to FIGS. 7 and 8. When this process ends, the process moves to step S15.

ステップＳ１５において、ＣＰＵ２０２は、抽出対象部分認識処理を行う。この処理において、ＣＰＵ２０２は、合成前画像保存領域２３から合成前画像を読み出し、背景画像保存領域２８に保存された背景に合成する部分を抽出する。この、抽出対象部分認識処理は、いくつかのバリエーションがあり、詳しくは、後述の図９から図１４において後述する。この処理が終了した場合には、ステップＳ１６に処理を移す。 In step S15, the CPU 202 performs extraction target part recognition processing. In this process, the CPU 202 reads the pre-combination image from the pre-combination image storage area 23 and identifies the portion to be combined with the background stored in the background image storage area 28. This extraction target part recognition processing has several variations, which will be described later with reference to FIGS. 9 to 14. When this process ends, the process moves to step S16.

ステップＳ１６において、ＣＰＵ２０２は、対象部分抽出処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１５の抽出対象部分認識処理において抽出された抽出対象部分を合成前画像から抽出する処理を行う。ＣＰＵ２０２は、抽出対象部分についてのマスク画像を作成し、合成前画像と重ね合わせ、抽出対象部分のみを合成前画像から抜き出す。そしてＣＰＵ２０２は、合成前画像から抜き出した抽出対象部分を対象部分保存領域２７に保存する。この処理が終了した場合には、ステップＳ１７に処理を移す。 In step S16, the CPU 202 performs target portion extraction processing. In this process, the CPU 202 performs a process of extracting the extraction target part extracted in the extraction target part recognition process in step S15 from the pre-combination image. The CPU 202 creates a mask image for the extraction target part, superimposes it with the pre-combination image, and extracts only the extraction target part from the pre-combination image. Then, the CPU 202 stores the extraction target portion extracted from the pre-combination image in the target portion storage area 27. If this process ends, the process moves to a step S17.

ステップＳ１７において、ＣＰＵ２０２は、合成処理を行う。この処理において、ＣＰＵ２０２は、対象部分保存領域２７から抽出画像を読み込み、かつ背景画像保存領域２８から、事前に選択済みの背景画像と重ねあわせて合成画像を作成する。この処理が終了した場合には、ステップＳ１８に処理を移す。 In step S17, the CPU 202 performs a composition process. In this process, the CPU 202 reads the extracted image from the target part storage area 27 and creates a composite image by superimposing the previously selected background image from the background image storage area 28. If this process ends, the process moves to a step S18.
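Steps S16 and S17 can be pictured with a small sketch. It is only an illustration under assumptions not stated in the text: images are nested lists of RGB tuples, the mask holds 1 for the extraction target and 0 elsewhere, and the helper names `extract_target` and `composite` are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[int]]         # 1 = extraction target, 0 = everything else


def extract_target(pre_image: Image, mask: Mask) -> List[List[Optional[Pixel]]]:
    """Step S16 analogue: keep masked pixels, mark the rest as transparent (None)."""
    return [
        [px if m else None for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(pre_image, mask)
    ]


def composite(extracted: List[List[Optional[Pixel]]], background: Image) -> Image:
    """Step S17 analogue: lay the extracted part over the selected background."""
    return [
        [bg if fg is None else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(extracted, background)
    ]


singer = (200, 150, 120)
wall = (90, 90, 90)
beach = (0, 120, 255)

pre_image = [[wall, singer, wall],
             [wall, singer, singer]]
mask = [[0, 1, 0],
        [0, 1, 1]]
background = [[beach] * 3 for _ in range(2)]

result = composite(extract_target(pre_image, mask), background)
```

In this toy input the wall pixels of the room are replaced by the beach background, while the singer pixels are carried over unchanged.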

ステップＳ１８において、ＣＰＵ２０２は、出力処理を行う。この処理において、ＣＰＵ２０２は、音声情報保存領域３４に保存された音声情報と合成画像を同期させて、合成画像を表示部２１０に、音声情報をＡＶデータ処理装置に渡す。すなわち、表示部２１０に合成画像を表示するためのデータを供給し、ミキシングアンプ２１８に音声情報を供給する。また、図６のステップＳ６２で合成処理を行わないと選択した場合には、ＣＰＵ２０２は、合成処理を行わない伴奏用デモ画面のデータを供給する。これにより、表示部２１０は合成画像又は伴奏用デモ画面を表示し、スピーカ２１４は音声情報を再生することとなる。この処理が終了した場合には、ステップＳ１９に処理を移す。 In step S18, the CPU 202 performs output processing. In this process, the CPU 202 synchronizes the audio information stored in the audio information storage area 34 and the synthesized image, and passes the synthesized image to the display unit 210 and the audio information to the AV data processing device. That is, data for displaying a composite image is supplied to the display unit 210 and audio information is supplied to the mixing amplifier 218. If it is selected in step S62 in FIG. 6 that the synthesis process is not performed, the CPU 202 supplies accompaniment demonstration screen data for which the synthesis process is not performed. As a result, the display unit 210 displays a composite image or an accompaniment demonstration screen, and the speaker 214 reproduces audio information. If this process ends, the process moves to a step S19.

ステップＳ１９において、ＣＰＵ２０２は、伴奏が終了するか否かを判断する。この処理において、ＣＰＵ２０２は、伴奏終了と判断した場合には、ステップＳ２０に処理を移す。また、伴奏終了で無いと判断した場合には、ステップＳ１４に処理を移す。すなわち、ＣＰＵ２０２は、撮影開始処理から始まる画像合成処理を繰り返すことになる。 In step S19, the CPU 202 determines whether or not the accompaniment ends. In this process, if the CPU 202 determines that the accompaniment has ended, it moves the process to step S20. If it is determined that the accompaniment has not ended, the process proceeds to step S14. That is, the CPU 202 repeats the image composition process starting from the shooting start process.

ステップＳ２０において、ＣＰＵ２０２は、伴奏終了処理を行う。この処理において、ＣＰＵ２０２は、ＡＶデータ処理部に伴奏データを渡すのを終了する。これにより、スピーカ２１４は音声情報の再生を終了する。また、ＣＰＵ２０２は、音声情報の収集も終了する。この処理が終了した場合には、ステップＳ２１に処理を移す。 In step S20, the CPU 202 performs accompaniment end processing. In this process, the CPU 202 finishes passing the accompaniment data to the AV data processing unit. Thereby, the speaker 214 ends the reproduction of the audio information. In addition, the CPU 202 ends the collection of audio information. If this process ends, the process moves to a step S21.

ステップＳ２１において、ＣＰＵ２０２は、撮影終了処理を行う。この処理において、ＣＰＵ２０２は、合成前画像及び基準画像の収集を終了する。また、ＣＰＵ２０２は、合成画像を表示するためのデータを供給するのを終了する。この処理が終了した場合には、ステップＳ１１に処理を移す。すなわち、伴奏発生準備に戻り、これまでの処理を繰り返すことになる。 In step S21, the CPU 202 performs a photographing end process. In this process, the CPU 202 ends the collection of the pre-combination image and the reference image. In addition, the CPU 202 ends supplying data for displaying the composite image. If this process ends, the process moves to a step S11. That is, returning to the accompaniment generation preparation, the processing so far is repeated.
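The control flow of FIG. 4 for a single song can be summarized as a trace of step labels. This is a minimal sketch: the function name `run_song` and the parameter `loops_until_end` (how many times step S19 is reached before the accompaniment ends) are assumptions made for illustration, and the actual work of each step is omitted.

```python
from typing import List


def run_song(loops_until_end: int) -> List[str]:
    """Return the order in which the steps of FIG. 4 run for one song."""
    trace = ["S11", "S12", "S13"]          # select song, select mode, start accompaniment
    loops = 0
    while True:
        # S14 capture, S15 recognize, S16 extract, S17 composite, S18 output
        trace += ["S14", "S15", "S16", "S17", "S18", "S19"]
        loops += 1
        if loops >= loops_until_end:       # S19: accompaniment has ended
            break
    trace += ["S20", "S21"]                # end accompaniment, end shooting
    return trace


trace = run_song(2)
```

After S21 the flow returns to S11, so a caller would simply invoke `run_song` again for the next selection.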

[選曲処理]
図４のステップＳ１１において実行されるサブルーチンについて、図５を用いて説明する。 [Music selection process]
The subroutine executed in step S11 in FIG. 4 will be described with reference to FIG. 5.

ステップＳ３１においては、曲番号の入力操作の有無を判別する。この処理において、ＣＰＵ２０２は、リモコン１０８、無線キーボード１１０等の操作部２１２の操作に応じて、曲番号入力操作があったか否かを判断することとなる。ＣＰＵ２０２は、曲番号入力操作があったと判別した場合には、ステップＳ３２に処理を移す。一方、ＣＰＵ２０２は、曲番号入力操作があったとは判別しなかった場合には、ステップＳ３３に処理を移す。 In step S31, it is determined whether or not a music number input operation has been performed. In this process, the CPU 202 determines whether or not a song number input operation has been performed in accordance with the operation of the operation unit 212 such as the remote control 108 or the wireless keyboard 110. If the CPU 202 determines that a song number input operation has been performed, the CPU 202 moves the process to step S32. On the other hand, if the CPU 202 does not determine that the music number input operation has been performed, the process proceeds to step S33.

ステップＳ３１の処理により曲番号入力操作があったと判別された場合には、選曲情報記憶処理を実行する（ステップＳ３２）。この処理において、ＣＰＵ２０２は、入力操作に応じた選曲情報を実行待機中の選曲情報として記憶する。この処理が終了した場合には、ステップＳ３３に処理を移す。 If it is determined in the process of step S31 that a music number input operation has been performed, a music selection information storage process is executed (step S32). In this process, the CPU 202 stores the music selection information corresponding to the input operation as the music selection information on standby. If this process ends, the process moves to a step S33.

ステップＳ３３においては、伴奏中であるか否かの判断を行う。この処理において、ＣＰＵ２０２は、伴奏発生手段が伴奏を発生させているか否かにより、伴奏中であるか否かを判断する。ＣＰＵ２０２は、伴奏中であると判別した場合には、本サブルーチンを終了する。一方、ＣＰＵ２０２は、伴奏中であるとは判別しなかった場合には、ステップＳ３４に処理を移す。 In step S33, it is determined whether or not accompaniment is in progress. In this process, the CPU 202 determines whether accompaniment is in progress based on whether the accompaniment generating means is generating an accompaniment. If the CPU 202 determines that accompaniment is in progress, it ends this subroutine. Otherwise, the CPU 202 moves the process to step S34.

ステップＳ３３の処理により伴奏終了中であると判別された場合には、選曲情報があるか否かの判断を行う（ステップＳ３４）。この処理において、ＣＰＵ２０２は、実行待機中である選曲情報を読み出し、実行待機中である選曲情報が全て空情報であるか否かを判定することにより、選曲情報があるか否かを判断することとなる。ＣＰＵ２０２は、選曲情報があると判別した場合には、本サブルーチンを終了する。一方、ＣＰＵ２０２は、選曲情報があるとは判別しなかった場合には、ステップＳ３５に処理を移す。 If it is determined in step S33 that accompaniment is not in progress, it is determined whether there is music selection information (step S34). In this process, the CPU 202 reads the music selection information that is waiting for execution and determines whether there is music selection information by checking whether all of the waiting music selection information is empty. If the CPU 202 determines that there is music selection information, it ends this subroutine. On the other hand, if the CPU 202 does not determine that there is music selection information, it moves the process to step S35.

ステップＳ３４の処理により選曲情報がないと判別された場合には、選曲情報に基づく選曲処理を実行する（ステップＳ３５）。この処理において、ＣＰＵ２０２は、実行順序に従って、実行待機中である選曲情報の一つを実行中である選曲情報として記憶する。この処理が終了した場合には、本サブルーチンを終了する。 If it is determined in the process of step S34 that there is no music selection information, a music selection process based on the music selection information is executed (step S35). In this process, the CPU 202 stores one piece of music selection information that is waiting to be executed as music selection information that is being executed according to the execution order. When this process is finished, this subroutine is finished.
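The music selection subroutine behaves like a small queue of waiting selections. The sketch below is one possible reading, not the patented implementation: the branch wording around steps S34 and S35 is ambiguous, so it assumes that a waiting selection is promoted to "being executed" only when nothing is playing and the queue is not empty. The class `SelectionState` and the function `song_selection` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class SelectionState:
    playing: bool = False                                # accompaniment-in-progress flag
    current: Optional[str] = None                        # selection being executed
    waiting: Deque[str] = field(default_factory=deque)   # selections waiting for execution


def song_selection(state: SelectionState, entered_number: Optional[str]) -> None:
    # S31/S32: store a newly entered song number as a waiting selection.
    if entered_number is not None:
        state.waiting.append(entered_number)
    # S33: while accompaniment is in progress there is nothing more to do.
    if state.playing:
        return
    # S34 (assumed reading): nothing waiting means nothing to promote.
    if not state.waiting:
        return
    # S35: promote the oldest waiting selection to the one being executed.
    state.current = state.waiting.popleft()


state = SelectionState()
song_selection(state, "1234-01")   # nothing playing: promoted immediately
state.playing = True
song_selection(state, "5678-02")   # playing: stays in the waiting queue
```

A first-in, first-out queue is used here because the text says selections are taken "according to the execution order"; any other ordering rule would need a different container.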

[合成モード選択処理]
図４のステップＳ１２において実行されるサブルーチンについて、図６を用いて説明する。 [Composite mode selection process]
The subroutine executed in step S12 in FIG. 4 will be described with reference to FIG. 6.

ステップＳ６１において、ＣＰＵ２０２は、合成モード選択画面表示処理を行う。この処理において、ＣＰＵ２０２は、表示部２１０に、利用者に合成モードを選択させる画面のデータを供給する。この処理が終了した場合には、ステップＳ６２に処理を移す。 In step S61, the CPU 202 performs a synthesis mode selection screen display process. In this process, the CPU 202 supplies screen data to the display unit 210 that allows the user to select a synthesis mode. If this process ends, the process moves to a step S62.

ステップＳ６２において、ＣＰＵ２０２は、合成処理を行うか否かを判別する。この処理において、ＣＰＵ２０２は、ステップＳ６１でリモコン１０８や無線キーボード１１０から入力されたデータにより、合成処理を行うか否かを判別する。ＣＰＵ２０２は、合成処理を行うと判別した場合には、ステップＳ６３に処理を移す。また、合成処理を行わないと判別した場合には、本サブルーチンを終了する。 In step S62, the CPU 202 determines whether or not to perform a synthesis process. In this process, the CPU 202 determines whether or not to perform the synthesis process based on the data input from the remote control 108 or the wireless keyboard 110 in step S61. If the CPU 202 determines that the composition process is to be performed, the process proceeds to step S63. If it is determined that the synthesis process is not performed, this subroutine is terminated.

ステップＳ６３において、ＣＰＵ２０２は、基準画像を事前に撮影するか否かを判別する。この処理において、ＣＰＵ２０２は、ビデオカメラ２３２Ａ、２３２Ｂの構成情報や、後に行う抽出対象部分認識処理において、どのようなバリエーションを選択したかにより、基準画像を事前に撮影するか否かを判別する。基準画像を事前に撮影すると判別した場合には、ステップＳ６４にその処理を移す。また、基準画像を事前に撮影すると判別しなかった場合には、本サブルーチンを終了する。 In step S63, the CPU 202 determines whether to capture a reference image in advance. In this process, the CPU 202 determines whether or not to capture a reference image in advance depending on the configuration information of the video cameras 232A and 232B and what variation is selected in the extraction target partial recognition process to be performed later. If it is determined that the reference image is to be taken in advance, the process proceeds to step S64. If it is not determined that the reference image is captured in advance, this subroutine is terminated.

ステップＳ６４において、ＣＰＵ２０２は、基準画像収集処理を行う。この処理において、ＣＰＵ２０２は基準画像収集手段２２により、基準画像を収集する。この処理が終了した場合には、ステップＳ６５に処理を移す。 In step S64, the CPU 202 performs a reference image collection process. In this processing, the CPU 202 collects the reference image by the reference image collection unit 22. If this process ends, the process moves to a step S65.

ステップＳ６５において、ＣＰＵ２０２は、基準画像保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ６４で収集した基準画像を、基準画像保存領域２４に保存する。この処理が終了した場合には、ステップＳ６６に処理を移す。 In step S65, the CPU 202 performs a reference image storage process. In this process, the CPU 202 stores the reference image collected in step S64 in the reference image storage area 24. If this process ends, the process moves to a step S66.

ステップＳ６６において、ＣＰＵ２０２は、背景画像指定処理を行う。この処理において、ＣＰＵ２０２は、任意の背景画像を記憶部２０８から読み出す。この背景画像は、利用者により、選択できてもよい。 In step S66, the CPU 202 performs background image designation processing. In this process, the CPU 202 reads an arbitrary background image from the storage unit 208. This background image may be selected by the user.

ステップＳ６７において、ＣＰＵ２０２は、基準画像をセットする。この処理においてＣＰＵ２０２は、基準画像保存領域２４に保存した基準画像をメモリ２０４に読み込む。この処理が終了した場合には、ステップＳ６８に処理を移す。 In step S67, the CPU 202 sets a reference image. In this process, the CPU 202 reads the reference image stored in the reference image storage area 24 into the memory 204. If this process ends, the process moves to a step S68.

ステップＳ６８において、ＣＰＵ２０２は、基準画像セットフラグを有効の値として保存する。この処理が終了した場合には、本サブルーチンは終了する。 In step S68, the CPU 202 stores the reference image set flag as a valid value. When this process ends, this subroutine ends.
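The branching of steps S61 to S68 can be condensed into a single function. This is a sketch under stated assumptions: the user's choice and the advance-capture decision arrive as booleans, the camera is represented by a callable, and the dictionary keys are illustrative names rather than terms from the text.

```python
from typing import Callable, Dict


def select_composition_mode(
    wants_composition: bool,
    capture_in_advance: bool,
    capture_reference: Callable[[], str],
    background_name: str,
) -> Dict[str, object]:
    state: Dict[str, object] = {
        "compose": wants_composition,
        "reference": None,
        "background": None,
        "reference_set": False,
    }
    if not wants_composition:       # S62: no composition, leave the subroutine
        return state
    if not capture_in_advance:      # S63: reference image is captured later instead
        return state
    state["reference"] = capture_reference()   # S64/S65: collect and store the reference
    state["background"] = background_name      # S66: designate the background image
    state["reference_set"] = True              # S67/S68: load reference, set the flag
    return state


mode = select_composition_mode(True, True, lambda: "empty_room.png", "beach.png")
```

The `reference_set` entry plays the role of the reference image set flag that step S71 checks later.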

[撮影開始処理]
図４のステップＳ１４において実行されるサブルーチンについて、図７から図８を用いて説明する。撮影開始処理については、いくつかのバリエーションがあり、状況に応じてカラオケ端末装置の構成にふさわしい処理を選択できる。 [Shooting start processing]
The subroutine executed in step S14 in FIG. 4 will be described with reference to FIGS. 7 and 8. There are several variations of the shooting start process, and a process suited to the configuration of the karaoke terminal device can be selected according to the situation.

[撮影開始処理（背景差分）]
図７のステップＳ７１において、ＣＰＵ２０２は、基準画像セットフラグが有効であるか否かの判断を行う。この処理において、ＣＰＵ２０２は、基準画像セットフラグを読み出し、有効であるか否かを判断する。ＣＰＵ２０２は、基準画像セットフラグが有効であると判別した場合には、ステップＳ７２に処理を移す。一方、ＣＰＵ２０２は、基準画像セットフラグが有効であると判別しなかった場合には、図４のステップＳ１２に処理を移す。 [Shooting start processing (background difference)]
In step S71 in FIG. 7, the CPU 202 determines whether or not the reference image set flag is valid. In this process, the CPU 202 reads the reference image set flag and determines whether it is valid. If the CPU 202 determines that the reference image set flag is valid, it moves the process to step S72. On the other hand, if the CPU 202 does not determine that the reference image set flag is valid, it moves the process to step S12 in FIG. 4.

ステップＳ７２において、ＣＰＵ２０２は、合成前画像収集処理を行う。この処理において、ＣＰＵ２０２は合成前画像収集手段２１により、合成前画像を収集する。この処理が終了した場合には、ステップＳ７３に処理を移す。 In step S72, the CPU 202 performs pre-combination image collection processing. In this process, the CPU 202 collects the pre-combine image by the pre-combine image collection means 21. If this process ends, the process moves to a step S73.

ステップＳ７３において、ＣＰＵ２０２は、合成前画像保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ７２で収集した合成前画像を、合成前画像保存領域２３に保存する。この処理が終了した場合には、本サブルーチンを終了する。また、背景差分を使用するときとなっているが、これは、事前に基準画像を収集し、保存する必要のある他の方法に使用してもよい。 In step S73, the CPU 202 performs pre-combination image storage processing. In this process, the CPU 202 stores the pre-combination image collected in step S72 in the pre-combination image storage area 23. When this process is finished, this subroutine ends. Although this processing is described for the case where background subtraction is used, it may also be used with any other method that requires the reference image to be collected and stored in advance.

[撮影開始処理（フレーム間差分）]
図８のステップＳ８１において、ＣＰＵ２０２は、基準画像収集処理を行う。この処理において、ＣＰＵ２０２は基準画像収集手段２２により、基準画像を収集する。この処理が終了した場合には、ステップＳ８２に処理を移す。 [Shooting start processing (difference between frames)]
In step S81 in FIG. 8, the CPU 202 performs a reference image collection process. In this processing, the CPU 202 collects the reference image by the reference image collection unit 22. If this process ends, the process moves to a step S82.

ステップＳ８２において、ＣＰＵ２０２は、基準画像保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ８１で収集した基準画像を、基準画像保存領域２４に保存する。この処理が終了した場合には、ステップＳ８３に処理を移す。 In step S82, the CPU 202 performs a reference image storage process. In this process, the CPU 202 stores the reference image collected in step S81 in the reference image storage area 24. If this process ends, the process moves to a step S83.

ステップＳ８３において、ＣＰＵ２０２は、合成前画像収集処理を行う。この処理において、ＣＰＵ２０２は合成前画像収集手段２１により、合成前画像を収集する。この処理が終了した場合には、ステップＳ８４に処理を移す。 In step S83, the CPU 202 performs pre-combination image collection processing. In this process, the CPU 202 collects the pre-combine image by the pre-combine image collection means 21. If this process ends, the process moves to a step S84.

ステップＳ８４において、ＣＰＵ２０２は、合成前画像保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ８３で収集した合成前画像を、合成前画像保存領域２３に保存する。この処理が終了した場合には、本サブルーチンを終了する。また、フレーム間差分を使用するときとなっているが、これは、事前に基準画像を収集する必要が無く、基準画像と同時に収集し、保存する他の方法に使用してもよい。 In step S84, the CPU 202 performs pre-combination image storage processing. In this process, the CPU 202 stores the pre-combination image collected in step S83 in the pre-combination image storage area 23. When this process is finished, this subroutine ends. Although this processing is described for the case where the inter-frame difference is used, it may also be used with any other method that does not need the reference image in advance and instead collects and stores it together with the pre-combination image.
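The two shooting start variants differ only in when the reference image is taken. The following sketch contrasts them; frames are represented by strings, the camera by a Python iterator, and both function names are hypothetical.

```python
from typing import Iterator, List, Tuple


def capture_background_difference(
    reference: str, camera: Iterator[str], loops: int
) -> List[Tuple[str, str]]:
    """FIG. 7 analogue: the reference was stored earlier (S64/S65); each pass
    captures only a pre-combination image (S72/S73)."""
    return [(reference, next(camera)) for _ in range(loops)]


def capture_frame_difference(camera: Iterator[str], loops: int) -> List[Tuple[str, str]]:
    """FIG. 8 analogue: each pass captures a fresh reference (S81/S82) and then
    a pre-combination image (S83/S84)."""
    pairs = []
    for _ in range(loops):
        reference = next(camera)
        pre_image = next(camera)
        pairs.append((reference, pre_image))
    return pairs


fixed = capture_background_difference("empty_room", iter(["f0", "f1"]), loops=2)
moving = capture_frame_difference(iter(["f0", "f1", "f2", "f3"]), loops=2)
```

With a fixed reference every frame is compared against the same empty scene, whereas the frame-difference variant compares each frame against the one captured just before it.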

[抽出対象部分認識処理]
図４のステップＳ１５において実行されるサブルーチンについて、図９から図１４を用いて説明する。抽出対象部分認識処理については、いくつかのバリエーションがあり、状況に応じてカラオケ端末装置の構成にふさわしい処理を選択できる。 [Extraction target partial recognition processing]
The subroutine executed in step S15 in FIG. 4 will be described with reference to FIGS. 9 to 14. There are several variations of the extraction target partial recognition process, and a process suited to the configuration of the karaoke terminal device can be selected according to the situation.

[抽出対象部分認識処理１]
図９を用いて、抽出対象部分認識処理のバリエーションの１つ目を説明する。 [Extraction target partial recognition process 1]
The first variation of the extraction target partial recognition process will be described with reference to FIG. 9.

ステップＳ１１１において、ＣＰＵ２０２は、基準画像保存領域２４から基準画像を読み出し、各画素の数値化を行う。そして、ＣＰＵ２０２は、基準画像数値を記憶する。この処理が終了した場合には、ステップＳ１１２に処理を移す。 In step S111, the CPU 202 reads a reference image from the reference image storage area 24 and digitizes each pixel. Then, the CPU 202 stores the reference image numerical value. If this process ends, the process moves to a step S112.

ステップＳ１１２において、ＣＰＵ２０２は、合成前画像保存領域２３から合成前画像を読み出し、各画素の数値化を行う。そして、ＣＰＵ２０２は、合成前画像数値を記憶する。この処理が終了した場合には、ステップＳ１１３に処理を移す。 In step S112, the CPU 202 reads the pre-combine image from the pre-combine image storage area 23, and digitizes each pixel. Then, the CPU 202 stores the pre-combination image numerical value. If this process ends, the process moves to a step S113.

ステップＳ１１３において、ＣＰＵ２０２は、差分処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１１２で求めた合成前画像数値から、ステップＳ１１１で求めた基準画像数値の差分を計算する。この処理が終了した場合には、ステップＳ１１４に処理を移す。 In step S113, the CPU 202 performs a difference process. In this process, the CPU 202 calculates the difference between the reference image values obtained in step S111 from the pre-combination image values obtained in step S112. If this process ends, the process moves to a step S114.

ステップＳ１１４において、ＣＰＵ２０２は、差分値の保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１１３で求めた差分値を記憶する。この処理が終了した場合には、本サブルーチンを終了する。 In step S114, the CPU 202 performs a difference value storage process. In this process, the CPU 202 stores the difference value obtained in step S113. When this process is finished, this subroutine is finished.

上述のステップＳ１１１からステップＳ１１４までの処理により、基準画像と合成前画像との差分のある部分を抽出対象部分として認識する。 By the processing from step S111 to step S114 described above, a part having a difference between the reference image and the pre-combination image is recognized as an extraction target part.
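A minimal sketch of steps S111 to S114 follows. It assumes that "digitizing each pixel" yields one brightness value per pixel and that any non-zero difference marks the extraction target; real camera images would need a tolerance for noise. The function names are illustrative.

```python
from typing import List

GrayImage = List[List[int]]   # one brightness value per pixel (assumed representation)


def difference(reference: GrayImage, pre_image: GrayImage) -> GrayImage:
    """S113 analogue: per-pixel pre-combination value minus reference value."""
    return [
        [p - r for p, r in zip(pre_row, ref_row)]
        for pre_row, ref_row in zip(pre_image, reference)
    ]


def target_mask(diff: GrayImage) -> List[List[int]]:
    """Pixels whose difference is non-zero are treated as the extraction target."""
    return [[1 if d != 0 else 0 for d in row] for row in diff]


reference = [[10, 10, 10],
             [10, 10, 10]]
pre_image = [[10, 80, 10],
             [10, 85, 5]]

diff = difference(reference, pre_image)
mask = target_mask(diff)
```

The resulting mask is what the target portion extraction process of step S16 would overlay on the pre-combination image.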

[抽出対象部分認識処理２]
図１０を用いて、抽出対象部分認識処理のバリエーションの２つ目を説明する。この抽出対象部分認識処理の２つ目については、ＣＰＵ２０２は特に、複眼カメラにより動画像を収集する場合に、選択する。また、この処理を選択する場合には、複眼カメラの一方のカメラから収集する画像を基準画像として保存するため、基準画像を予め収集し、保存する必要はない。 [Extraction target partial recognition process 2]
A second variation of the extraction target partial recognition process will be described with reference to FIG. 10. The CPU 202 selects this second variation in particular when moving images are collected by a compound-eye camera. When this process is selected, the image collected from one camera of the compound-eye camera is stored as the reference image, so it is not necessary to collect and store the reference image in advance.

また、複眼カメラを使用する場合は、複眼カメラのいずれか一方のカメラ２４１ａを撮影対象２４３に焦点を合わせてセットする（図１１）。その際マーク２４５が、必ず画面のいずれかに撮影されるようにする必要がある（画像２５７）。また、他方のカメラ２４１ｂは、カメラ２４１ａの左右どちらかに設置し、その際、マーク２４５の位置が、カメラ２４１ａで撮影したときと同じ位置に来るように焦点をあわせる（画像２５９）。これにより、カメラ２４１ａで撮影した画像２５７とカメラ２４１ｂで撮影した画像２５９の背景が、画像２５７と画像２５９を重ね合わせたとき一致するようになる。その一方で撮影対象２４３のみが、画像２５７と画像２５９を重ね合わせたときにずれて表示される。 When a compound-eye camera is used, one camera 241a of the compound-eye camera is set so that it is focused on the subject 243 (FIG. 11). The mark 245 must always appear somewhere in the frame (image 257). The other camera 241b is installed to the left or right of the camera 241a and is focused so that the mark 245 appears at the same position as in the image taken by the camera 241a (image 259). As a result, the backgrounds of the image 257 taken by the camera 241a and the image 259 taken by the camera 241b coincide when the two images are superimposed, while only the subject 243 appears displaced.

ステップＳ１２１において、ＣＰＵ２０２は、基準画像数値情報保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１２１において取得した基準画像の各画素の数値化を行う。そして、ＣＰＵ２０２は、基準画像数値を記憶する。この処理が終了した場合には、ステップＳ１２２に処理を移す。 In step S121, the CPU 202 performs reference image numerical information storage processing. In this process, the CPU 202 digitizes each pixel of the reference image acquired in step S121. Then, the CPU 202 stores the reference image numerical value. If this process ends, the process moves to a step S122.

ステップＳ１２２において、ＣＰＵ２０２は、合成前画像の数値情報保存処理を行う。この処理において、ＣＰＵ２０２は、合成前画像の各画素の数値化を行う。そして、ＣＰＵ２０２は、合成前画像数値として記憶する。この処理が終了した場合には、ステップＳ１２３に処理を移す。 In step S122, the CPU 202 performs numerical information storage processing for the pre-combine image. In this process, the CPU 202 digitizes each pixel of the pre-combination image. Then, the CPU 202 stores the pre-combination image numerical value. If this process ends, the process moves to a step S123.

ステップＳ１２３において、ＣＰＵ２０２は、差分処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１２２で求めた合成前画像数値から、ステップＳ１２１で求めた基準画像数値の差分を計算する。この処理が終了した場合には、ステップＳ１２４に処理を移す。 In step S123, the CPU 202 performs a difference process. In this process, the CPU 202 calculates the difference between the pre-combination image values obtained in step S122 and the reference image values obtained in step S121. When this process ends, the process moves to step S124.

ステップＳ１２４において、ＣＰＵ２０２は、シフト値補正処理を行う。この処理において、ＣＰＵ２０２は、予め複眼カメラ２４１ａ、２４１ｂの視差をシフト値として記憶しており、そのシフト値を用いてステップＳ１２３で求めた画素の差分値に対し、視差を補正する。この処理を行うことにより、視差のずれのない抽出対象領域を認識できる。この処理が終了した場合には、ステップＳ１２５に処理を移す。 In step S124, the CPU 202 performs shift value correction processing. The CPU 202 stores the parallax between the cameras 241a and 241b of the compound-eye camera in advance as a shift value, and uses that shift value to correct the parallax in the pixel difference values obtained in step S123. This makes it possible to recognize an extraction target area free of parallax displacement. When this process ends, the process moves to step S125.

ステップＳ１２５において、ＣＰＵ２０２は、補正済み差分値保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１２４で求めた補正済み差分値を記憶する。この処理が終了した場合には、本サブルーチンを終了する。 In step S125, the CPU 202 performs a corrected difference value storage process. In this process, the CPU 202 stores the corrected difference value obtained in step S124. When this process is finished, this subroutine ends.

[抽出対象部分認識処理３]
図１２を用いて、抽出対象部分認識処理のバリエーションの３つ目を説明する。 [Extraction target partial recognition process 3]
A third variation of the extraction target partial recognition process will be described with reference to FIG. 12.

ステップＳ１３１において、ＣＰＵ２０２は、基準画像保存領域２４から基準画像を読み出し、各画素の数値化を行う。そして、ＣＰＵ２０２は、基準画像数値を記憶する。この処理が終了した場合には、ステップＳ１３２に処理を移す。 In step S131, the CPU 202 reads the reference image from the reference image storage area 24 and digitizes each pixel. Then, the CPU 202 stores the reference image numerical value. If this process ends, the process moves to a step S132.

ステップＳ１３２において、ＣＰＵ２０２は、合成前画像の数値情報保存処理を行う。この処理において、ＣＰＵ２０２は、合成前画像の各画素の数値化を行う。そして、ＣＰＵ２０２は、合成前画像数値を記憶する。この処理が終了した場合には、ステップＳ１３３に処理を移す。 In step S132, the CPU 202 performs numerical information storage processing for the pre-combine image. In this process, the CPU 202 digitizes each pixel of the pre-combination image. Then, the CPU 202 stores the pre-combination image numerical value. If this process ends, the process moves to a step S133.

ステップＳ１３３において、ＣＰＵ２０２は、差分処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１３２で求めた合成前画像数値から、ステップＳ１３１で求めた基準画像数値の差分を計算する。この処理が終了した場合には、ステップＳ１３４に処理を移す。 In step S133, the CPU 202 performs difference processing. In this process, the CPU 202 calculates the difference between the reference image values obtained in step S131 from the pre-combination image values obtained in step S132. If this process ends, the process moves to a step S134.

ステップＳ１３４において、ＣＰＵ２０２は、差分値の保存処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１３３で求めた差分値を記憶する。この処理が終了した場合には、ステップＳ１３５に処理を移す。 In step S134, the CPU 202 performs a difference value storage process. In this process, the CPU 202 stores the difference value obtained in step S133. If this process ends, the process moves to a step S135.

ステップＳ１３５において、ＣＰＵ２０２は、二値化処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１３４で保存した差分値について、予め設定したしきい値を使用して二値化処理を行う。この処理が終了した場合には、ステップＳ１３６に処理を移す。 In step S135, the CPU 202 performs binarization processing. In this processing, the CPU 202 performs binarization processing using a preset threshold value for the difference value stored in step S134. If this process ends, the process moves to a step S136.

ステップＳ１３６において、ＣＰＵ２０２は、細線化処理を行う。この処理において、ＣＰＵ２０２は、ステップＳ１３５で二値化処理をした画像の画素値の０と１の境界線を認識し、細線化処理を行う。そして、ＣＰＵ２０２は、細線化処理の結果、抽出対象部分の輪郭を抽出する。この処理が終了した場合には、本サブルーチンを終了する。 In step S136, the CPU 202 performs a thinning process. In this process, the CPU 202 recognizes the boundary line between the pixel values 0 and 1 of the image subjected to the binarization process in step S135 and performs the thinning process. Then, the CPU 202 extracts the contour of the extraction target portion as a result of the thinning process. When this process is finished, this subroutine is finished.
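Steps S135 and S136 can be sketched as thresholding followed by boundary detection. Note the substitution: the text names a thinning process, whereas this sketch uses a simple 4-neighbour boundary test to find the border between 0 and 1 pixels, which is an assumption standing in for whatever thinning algorithm is actually intended. The threshold value and function names are also illustrative.

```python
from typing import List

Grid = List[List[int]]


def binarize(diff: Grid, threshold: int) -> Grid:
    """S135 analogue: 1 where the absolute difference exceeds the threshold, else 0."""
    return [[1 if abs(d) > threshold else 0 for d in row] for row in diff]


def contour(binary: Grid) -> Grid:
    """Mark foreground pixels that touch a background pixel or the image edge."""
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or binary[ny][nx] == 0:
                    out[y][x] = 1
                    break
    return out


diff = [
    [0, 2, 1, 0, 0],
    [1, 60, 70, 65, 0],
    [0, 55, 90, 60, 2],
    [0, 58, 62, 61, 0],
    [0, 0, 3, 0, 0],
]
binary = binarize(diff, threshold=10)
edges = contour(binary)
```

Thresholding suppresses the small differences caused by noise, so only the 3 by 3 block of large differences survives, and the boundary test keeps its outer ring while dropping the interior pixel.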

[抽出対象部分認識処理４]
図１３を用いて、抽出対象部分認識処理のバリエーションの４つ目を説明する。 [Extraction target partial recognition process 4]
A fourth variation of the extraction target partial recognition process will be described with reference to FIG. 13.

ステップＳ１４１において、ＣＰＵ２０２は、基準画像保存領域２４から制御信号情報を読み出す。そして、ＣＰＵ２０２は、制御信号情報と基準画像と同期する。 In step S141, the CPU 202 reads control signal information from the reference image storage area 24. Then, the CPU 202 synchronizes the control signal information and the reference image.

ステップＳ１４２において、ＣＰＵ２０２は、基準画像の数値情報保存処理を行う。この処理において、ＣＰＵ２０２は、基準画像保存領域２４から、基準画像を読み出し、基準画像の各画素の数値化を行う。そして、ＣＰＵ２０２は、基準画像数値を記憶する。この処理が終了した場合には、ステップＳ１４３に処理を移す。 In step S142, the CPU 202 performs numerical information storage processing for the reference image. In this process, the CPU 202 reads the reference image from the reference image storage area 24 and digitizes each pixel of the reference image. Then, the CPU 202 stores the reference image values. When this process ends, the process moves to step S143.

In step S143, the CPU 202 performs a numerical information storage process for the pre-combination image. In this process, the CPU 202 reads the pre-combination image from the pre-combination image storage area 23 and converts each pixel of the pre-combination image into a numerical value. The CPU 202 then stores the pre-combination image values. When this process ends, the processing proceeds to step S144.

In step S144, the CPU 202 performs a difference process. In this process, the CPU 202 calculates the difference by subtracting the reference image values obtained in step S142 from the pre-combination image values obtained in step S143. When this process ends, the processing proceeds to step S145.

In step S145, the CPU 202 performs a difference value storage process. In this process, the CPU 202 stores the difference values obtained in step S144. When this process ends, this subroutine ends.
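The flow of steps S141 to S145 can be sketched as follows. This is a minimal illustration under assumptions that are not in the text: each reference frame is assumed to be stored with the time of its illumination control signal, the reference frame whose time is closest to that of the pre-combination image is taken as the synchronized frame, and images are 2D lists of grayscale integers. The names `pick_reference` and `pixel_difference` are hypothetical.

```python
def pick_reference(reference_frames, control_time):
    """Pick the reference image whose control signal time is closest.

    reference_frames is a list of (time, image) pairs. Nearest-time matching
    is an assumption of this sketch; the text only says the frames are
    synchronized using the control signal information.
    """
    return min(reference_frames, key=lambda f: abs(f[0] - control_time))[1]


def pixel_difference(pre_image, ref_image):
    """Subtract the reference image values from the pre-combination values."""
    return [[p - r for p, r in zip(pre_row, ref_row)]
            for pre_row, ref_row in zip(pre_image, ref_image)]


# Hypothetical data: two reference frames captured under different lighting.
reference_frames = [
    (0, [[10, 10], [10, 10]]),   # illumination state at time 0
    (1, [[20, 20], [20, 20]]),   # illumination state at time 1
]
pre_time = 1
pre_image = [[20, 95], [20, 20]]  # one pixel is covered by the singer

ref_image = pick_reference(reference_frames, pre_time)
diff = pixel_difference(pre_image, ref_image)
```

Matching frames by illumination state keeps a lighting change from appearing as a difference, so only the pixel covered by the singer yields a nonzero value in this example.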

[Extraction target part recognition process 5]
A fifth variation of the extraction target part recognition process will be described with reference to FIG. 14.

In step S151, the CPU 202 performs a numerical information storage process for the pre-combination image. In this process, the CPU 202 reads the pre-combination image from the pre-combination image storage area 23 and converts each pixel of the pre-combination image into a numerical value. The CPU 202 then stores the pre-combination image values. When this process ends, the processing proceeds to step S152.

In step S152, the CPU 202 performs a shift comparison process for the reference image. In this process, the CPU 202 reads the reference image from the reference image storage area 24 and shifts it up, down, left, and right one pixel at a time within the shake compensation area. Each pixel of the reference image is converted into a numerical value beforehand, and the CPU 202 finds the position at which the sum of the pixel differences between the reference image and the entire output target area is smallest. The CPU 202 then determines the distance and direction from the original reference image to that position and stores them as the shift value. When this process ends, the processing proceeds to step S153.

In step S153, the CPU 202 performs a shift process for the pre-combination image. In this process, the CPU 202 corrects the position of the pre-combination image using the shift value obtained in step S152. When this process ends, the processing proceeds to step S154.

In step S154, the CPU 202 stores the pre-combination image whose position has been corrected by the shift value in the pre-combination image storage area. When this process ends, this subroutine ends.

Note that after this fifth extraction target part recognition process is performed, another extraction target part recognition process may additionally be performed.
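The shift comparison and correction of steps S152 to S154 can be sketched as an exhaustive search for the shift that minimizes the sum of absolute pixel differences. The code below is only an illustration under assumptions: images are 2D lists of grayscale integers, the shake compensation area is modelled as a margin of `max_shift` pixels, and the sum is taken over the interior region that stays valid for every candidate shift. The names `best_shift`, `shift_image`, and `max_shift` are hypothetical.

```python
def best_shift(ref, img, max_shift):
    """Return (dy, dx) minimizing the sum of absolute differences.

    A shift (dy, dx) means img[y][x] is compared with ref[y - dy][x - dx].
    Only the interior region is compared, so every candidate shift is
    evaluated over the same number of pixels. Requires the image to be
    larger than 2 * max_shift in both directions.
    """
    h, w = len(ref), len(ref[0])
    m = max_shift
    best = None
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            total = 0
            for y in range(m, h - m):
                for x in range(m, w - m):
                    total += abs(ref[y - dy][x - dx] - img[y][x])
            if best is None or total < best[0]:
                best = (total, dy, dx)
    return best[1], best[2]


def shift_image(img, dy, dx, fill=0):
    """Undo a displacement of (dy, dx); uncovered pixels get the fill value."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


# Hypothetical data: the camera image is the reference moved down by one row.
ref = [[y * 10 + x for x in range(6)] for y in range(6)]
img = [[0] * 6] + [row[:] for row in ref[:5]]

dy, dx = best_shift(ref, img, max_shift=1)
corrected = shift_image(img, dy, dx)
```

Restricting the comparison to the interior region is a design choice of this sketch: it avoids favouring large shifts merely because fewer pixels overlap.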

Embodiments of the present invention have been described above, but they are merely specific examples and do not limit the present invention. In addition, the effects described in the embodiments are merely the most preferable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiments.

FIG. 1 is a block diagram showing the configuration of a karaoke terminal device according to an embodiment of the present invention.
FIG. 2 is a diagram showing the hardware configuration of the karaoke terminal device according to the embodiment.
FIG. 3 is a screen image of the display unit 210 of the karaoke terminal device according to the embodiment.
FIGS. 4 to 10 are flowcharts showing operations executed by the karaoke terminal device according to the embodiment.
FIG. 11 is a conceptual diagram of the karaoke terminal device according to the embodiment collecting a moving image with a compound-eye camera.
FIGS. 12 to 14 are flowcharts showing operations executed by the karaoke terminal device according to the embodiment.

Explanation of symbols

1 Karaoke terminal device
10 Accompaniment sound generation means
20 Karaoke information control means
21 Pre-combination image collection means
22 Reference image collection means
25 Extraction target part recognition means
26 Target part extraction means
32 Voice information collection means
40 Synthesis means
50 Output means

Claims

1. A karaoke apparatus, being a karaoke terminal device having:
accompaniment sound generating means for generating accompaniment information including at least an accompaniment sound;
karaoke information control means for processing images taken by a video camera and controlling information related to karaoke; and
output means for outputting a voice and an image of a singer,
the karaoke apparatus comprising:
a voice information storage area for storing the voice of the singer and the accompaniment information;
a reference image storage area for storing a series of reference images captured in advance, together with first control signal information that is control information of the illumination output during the collection time of the series of reference images;
a pre-combination image storage area for storing a pre-combination image collected at a certain point in time while the output means is outputting the voice and the image, together with second control signal information that is control information of the illumination output at that point in time;
an extraction target part storage area for storing the extraction target part extracted from the pre-combination image; and
a background image storage area for storing a background image selected by the singer to be synthesized with the extraction target part,
wherein the karaoke information control means includes:
voice information collecting means for storing the voice and the accompaniment information in the voice information storage area;
extraction target part recognition means for synchronizing frames on the same time axis of the pre-combination image stored in the pre-combination image storage area and the series of reference images stored in the reference image storage area, using the time of the first control signal information and the time of the second control signal information, and calculating the difference of their pixel values;
target part extraction means for storing the difference portion as the extraction target part in the extraction target part storage area; and
synthesizing means for synthesizing the extraction target part and the background image,
and wherein a synthesized image synthesized by the synthesizing means is output by the output means.

2. The karaoke apparatus according to claim 1, wherein the karaoke information control means synchronizes the accompaniment information, the voice, and the image synthesized by the synthesizing means, and displays the image on the output means.

3. The karaoke apparatus according to claim 1, wherein the output means includes a communication interface and connects to a server to transmit the accompaniment information, the voice, and the image synthesized by the synthesizing means.

4. The karaoke apparatus according to any one of claims 1 to 3, further comprising an operation unit operable by the singer, wherein the karaoke information control means determines whether or not to perform the synthesizing process by the synthesizing means based on an input from the operation unit.