JP2020086113A - Karaoke system and karaoke device

Info

Publication number: JP2020086113A
Application number: JP2018219997A
Authority: JP
Inventors: 美香谷水; Mika Tanimizu
Original assignee: Daiichikosho Co Ltd
Current assignee: Daiichikosho Co Ltd
Priority date: 2018-11-26
Filing date: 2018-11-26
Publication date: 2020-06-04
Anticipated expiration: 2038-11-26
Also published as: JP7117228B2

Abstract

To provide a karaoke system that allows a user whose native language differs from that of the user performing karaoke singing to enjoy a song in time with the karaoke performance. SOLUTION: A karaoke system comprises: an extraction section that extracts singing voice data from the singing voice obtained when a user performs karaoke singing in a first language along with the karaoke performance of a prescribed song; a generation section that generates a synthesized singing voice signal on the basis of the extracted singing voice data, second lyric data of the prescribed song read out in time with the karaoke performance, and a voice quality parameter obtained in accordance with user identification information of the user performing karaoke singing; and a transmission processing section that transmits the generated synthesized singing voice signal to a portable terminal owned by an audience member. SELECTED DRAWING: Figure 5

Description

本発明はカラオケシステム、及びカラオケ装置に関する。 The present invention relates to a karaoke system and a karaoke device.

母国語が異なる利用者同士でカラオケ装置を利用することがある。このような場合に、たとえば、日本人の利用者が日本語でカラオケ歌唱を行うと、母国語が異なる外国人は歌詞の内容を理解することができない。 Users with different native languages sometimes use a karaoke device together. In such a case, for example, when a Japanese user sings karaoke in Japanese, a foreign user with a different native language cannot understand the content of the lyrics.

そこで、特許文献１には、利用者の母国語に応じた翻訳歌詞をサブモニタに表示する技術が開示されている。 Therefore, Patent Document 1 discloses a technique of displaying translated lyrics according to the user's native language on the sub monitor.

特開２００７−１２１５１４号公報 JP 2007-121514 A

特許文献１のように翻訳歌詞（たとえば英語）をサブモニタに表示したとしても、実際のカラオケ歌唱は翻訳前の歌詞（たとえば日本語）に基づいて行われる。従って、母国語が異なる利用者は、カラオケ演奏に合わせて翻訳歌詞を参照したとしても、歌を楽しむことができない。 Even if the translated lyrics (for example, English) are displayed on the sub monitor as in Patent Document 1, the actual karaoke song is performed based on the lyrics (for example, Japanese) before translation. Therefore, a user whose native language is different cannot enjoy the song even if the translated lyrics are referred to in time with the karaoke performance.

本発明の目的は、カラオケ歌唱を行う利用者と母国語が異なる利用者であっても、カラオケ演奏に合わせて歌を楽しむことが可能なカラオケシステムを提供することにある。 An object of the present invention is to provide a karaoke system that allows even a user whose native language differs from that of the user singing karaoke to enjoy the song in time with the karaoke performance.

上記目的を達成するための一の発明は、サーバ装置とカラオケ装置とが通信可能に接続されたカラオケシステムであって、前記サーバ装置は、利用者の声質を示す声質パラメータを、当該利用者を識別するための利用者識別情報と関連付けて記憶する利用者情報記憶部を有し、前記カラオケ装置は、カラオケ演奏を行うための演奏データ、第１の言語による第１の歌詞データ、及び当該第１の言語とは異なる第２の言語による第２の歌詞データを、楽曲毎に記憶する楽曲データ記憶部と、前記カラオケ装置の利用者が所有する携帯端末を介して行われたログイン操作に応じて、当該利用者の利用者識別情報を取得するログイン処理部と、ある楽曲の前記演奏データに基づくカラオケ演奏に合わせて利用者が前記第１の言語でカラオケ歌唱を行った際に得られる歌唱音声から、音高データ及び音量データを含む歌唱音声データを抽出する抽出部と、抽出した前記歌唱音声データ、前記カラオケ演奏に合わせて読み出した前記ある楽曲の第２の歌詞データ、及び前記カラオケ歌唱を行う利用者の利用者識別情報に応じて前記サーバ装置から取得した前記声質パラメータに基づいて、合成歌唱音声信号を作成する作成部と、前記ログイン操作を行った複数の利用者のうち、前記カラオケ歌唱を行う利用者以外の利用者である聴衆が所有する携帯端末に対し、作成した前記合成歌唱音声信号を送信する送信処理部と、を有するカラオケシステムである。
One invention for achieving the above object is a karaoke system in which a server device and a karaoke device are communicably connected. The server device has a user information storage unit that stores a voice quality parameter indicating a user's voice quality in association with user identification information for identifying that user. The karaoke device has: a music data storage unit that stores, for each song, performance data for performing a karaoke performance, first lyrics data in a first language, and second lyrics data in a second language different from the first language; a login processing unit that acquires a user's user identification information in response to a login operation performed via a mobile terminal owned by that user of the karaoke device; an extraction unit that extracts singing voice data, including pitch data and volume data, from the singing voice obtained when a user sings karaoke in the first language along with a karaoke performance based on the performance data of a certain song; a creation unit that creates a synthetic singing voice signal based on the extracted singing voice data, the second lyrics data of that song read out in time with the karaoke performance, and the voice quality parameter acquired from the server device according to the user identification information of the user singing karaoke; and a transmission processing unit that transmits the created synthetic singing voice signal to mobile terminals owned by the audience, that is, those of the plural users who performed the login operation other than the user singing karaoke.
本発明の他の特徴については、後述する明細書及び図面の記載により明らかにする。 Other features of the present invention will become clear from the description and drawings below.

本発明によれば、カラオケ歌唱を行う利用者と母国語が異なる利用者であっても、カラオケ演奏に合わせて歌を楽しむことができる。 According to the present invention, even a user whose native language is different from the user who sings a karaoke song can enjoy the song in tune with the karaoke performance.

第１実施形態に係るカラオケシステムの概略を示す図である。 FIG. 1 is a diagram showing an outline of the karaoke system according to the first embodiment.
第１実施形態に係るサーバ装置のハードウェア構成例を示す図である。 FIG. 2 is a diagram showing a hardware configuration example of the server device according to the first embodiment.
第１実施形態に係るカラオケ装置のハードウェア構成例を示す図である。 FIG. 3 is a diagram showing a hardware configuration example of the karaoke device according to the first embodiment.
第１実施形態に係るカラオケ装置のソフトウェア構成例を示す図である。 FIG. 4 is a diagram showing a software configuration example of the karaoke device according to the first embodiment.
第１実施形態に係るカラオケシステムの処理を示すフローチャートである。 FIG. 5 is a flowchart showing the processing of the karaoke system according to the first embodiment.

＜第１実施形態＞ <First Embodiment>
図１〜図５を参照して、第１実施形態に係るカラオケシステムについて説明する。 The karaoke system according to the first embodiment will be described with reference to FIGS. 1 to 5.

＝＝カラオケシステム＝＝ == Karaoke system ==
図１に示すように、本実施形態に係るカラオケシステム１は、サーバ装置Ｓ及びカラオケ装置Ｋを備える。サーバ装置Ｓとカラオケ装置Ｋとは、ネットワークＮを介して通信可能となっている。ネットワークＮは、たとえば公衆電話回線網やインターネット回線等の伝送路である。また、カラオケ装置Ｋは、利用者が所有する携帯端末と通信可能となっている。 As shown in FIG. 1, the karaoke system 1 according to the present embodiment includes a server device S and a karaoke device K. The server device S and the karaoke device K can communicate with each other via the network N. The network N is a transmission line such as a public telephone line network or an internet line. Further, the karaoke device K can communicate with a mobile terminal owned by the user.

利用者Ｕ１〜利用者Ｕ３は、カラオケ装置Ｋを一緒に利用する利用者である。以下、利用者Ｕ１及び利用者Ｕ２は日本人（母国語：日本語）であり、利用者Ｕ３は米国人（母国語：英語）であるとして説明する。 The users U1 to U3 are users who use the karaoke device K together. Hereinafter, it is assumed that the users U1 and U2 are Japanese (native language: Japanese) and the user U3 is an American (native language: English).

各利用者は、携帯端末（携帯端末Ｍ１〜携帯端末Ｍ３）を所有している。携帯端末は、一般的なスマートフォンやタブレット端末である。携帯端末には、カラオケ装置Ｋを利用する際に使用する専用アプリケーションソフトウェア（以下、「カラオケアプリ」）がインストールされている。カラオケアプリは、サーバ装置Ｓや、サーバ装置Ｓが提供するＷｅｂサイトからダウンロードすることで入手できる。 Each user owns a mobile terminal (mobile terminal M1 to mobile terminal M3). The mobile terminal is a general smartphone or tablet terminal. Dedicated application software (hereinafter, "karaoke application") used when using the karaoke apparatus K is installed in the mobile terminal. The karaoke application can be obtained by downloading it from the server device S or a website provided by the server device S.

携帯端末とカラオケ装置Ｋは、ペアリングされることで互いに通信可能となる。ペアリングは公知の手法を利用することができる。ペアリングは、たとえば、カラオケ装置Ｋが設置されたカラオケルームへの入室後、カラオケアプリを起動させた場合に実行される。ペアリングされた後、利用者は、カラオケアプリを利用して、たとえばカラオケ装置Ｋに対してログイン操作を行ったり、カラオケ装置Ｋに付属するリモコン装置と同様の操作（楽曲の予約等）を行うことができる。 The mobile terminal and the karaoke device K become able to communicate with each other by being paired. A known method can be used for pairing. Pairing is executed, for example, when the karaoke application is launched after the user enters the karaoke room in which the karaoke device K is installed. After pairing, the user can use the karaoke application to, for example, perform a login operation on the karaoke device K or perform the same operations as with the remote control device attached to the karaoke device K (such as reserving music).

＝＝サーバ装置＝＝ == Server device ==
サーバ装置Ｓは、カラオケ装置Ｋに関する各種情報を管理するコンピュータである。図２はサーバ装置Ｓのハードウェア構成例を示す図である。サーバ装置Ｓは、記憶部１０、通信部２０、及び制御部３０を備える。各構成はインターフェース（図示なし）を介してバスＢに接続されている。 The server device S is a computer that manages various information about the karaoke device K. FIG. 2 is a diagram illustrating a hardware configuration example of the server device S. The server device S includes a storage unit 10, a communication unit 20, and a control unit 30. Each component is connected to the bus B via an interface (not shown).

記憶部１０は、各種のデータを記憶する大容量の記憶装置である。本実施形態に係る記憶部１０の記憶領域の一部は、利用者情報記憶部１０ａとして機能する。 The storage unit 10 is a large-capacity storage device that stores various data. A part of the storage area of the storage unit 10 according to the present embodiment functions as the user information storage unit 10a.

利用者情報記憶部１０ａは、声質パラメータを利用者識別情報と関連付けて記憶する。 The user information storage unit 10a stores the voice quality parameter in association with the user identification information.

声質パラメータは、利用者の声質を示すパラメータである。利用者の声質は、声の周波数特性に基づいて決定される。声質パラメータは、たとえば利用者の歌唱音声から公知の技術（特開２０１４−０４８４７２号公報等）を用いて抽出することができる。利用者識別情報は、利用者を識別するための利用者ＩＤのような、各利用者に固有の情報である。 The voice quality parameter is a parameter indicating the voice quality of the user. The voice quality of the user is determined based on the frequency characteristic of the voice. The voice quality parameter can be extracted from, for example, the singing voice of the user by using a known technique (Japanese Patent Laid-Open No. 2014-048472, etc.). The user identification information is information unique to each user, such as a user ID for identifying the user.

声質パラメータは、カラオケ装置Ｋを利用する前に予め取得しておくことができる。たとえば、過去にあるカラオケ装置を利用して利用者Ｕ１がカラオケ歌唱を行った場合、当該あるカラオケ装置は、利用者Ｕ１の歌唱音声から声質パラメータを抽出し、利用者Ｕ１の利用者識別情報（利用者ＩＤ）と関連付けてサーバ装置Ｓに送信する。利用者情報記憶部１０ａは、利用者Ｕ１の声質パラメータを利用者ＩＤと関連付けて記憶する。 The voice quality parameter can be acquired in advance, before the karaoke device K is used. For example, when the user U1 sang karaoke in the past using a certain karaoke device, that karaoke device extracts a voice quality parameter from the singing voice of the user U1 and transmits it to the server device S in association with the user identification information (user ID) of the user U1. The user information storage unit 10a stores the voice quality parameter of the user U1 in association with the user ID.
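The association between a user ID and a voice quality parameter described above can be sketched as a simple keyed store. This is only an illustration: the class names `VoiceQuality` and `UserInfoStore`, and the choice to represent voice quality as per-band gains, are assumptions and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VoiceQuality:
    # Hypothetical representation: gains (dB) of a coarse spectral envelope.
    band_gains_db: List[float]
    # Optional attribute information such as age or sex.
    attributes: Dict[str, str] = field(default_factory=dict)


class UserInfoStore:
    """Keeps one voice quality parameter per user ID (illustrative only)."""

    def __init__(self) -> None:
        self._by_user: Dict[str, VoiceQuality] = {}

    def register(self, user_id: str, voice_quality: VoiceQuality) -> None:
        # Called when a karaoke device uploads a parameter it extracted earlier.
        self._by_user[user_id] = voice_quality

    def lookup(self, user_id: str) -> Optional[VoiceQuality]:
        # Returns None when no parameter has been stored for this user.
        return self._by_user.get(user_id)
```

A karaoke device would call `register` after an earlier session and `lookup` when the same user sings again.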

なお、声質パラメータには、年齢や性別に関する属性情報が含まれていてもよい。 The voice quality parameter may include attribute information regarding age and sex.

通信部２０は、サーバ装置Ｓとカラオケ装置Ｋとを接続するためのインターフェースを提供する。制御部３０は、サーバ装置Ｓにおける各種の制御を行う。制御部３０は、ＣＰＵおよびメモリ（いずれも図示なし）を備える。ＣＰＵは、メモリに記憶されたプログラムを実行することにより各種の機能を実現する。 The communication unit 20 provides an interface for connecting the server device S and the karaoke device K. The control unit 30 performs various controls in the server device S. The control unit 30 includes a CPU and a memory (neither is shown). The CPU realizes various functions by executing the programs stored in the memory.

＝＝カラオケ装置＝＝ == Karaoke device ==
［ハードウェア構成］ [Hardware configuration]
カラオケ装置Ｋは、楽曲のカラオケ演奏、及び利用者がカラオケ歌唱を行うための装置である。 The karaoke device K is a device for performing a karaoke performance of music and for a user to sing a karaoke song.

図３に示すように、カラオケ装置Ｋは、カラオケ本体４０、スピーカ５０、表示装置６０、マイク７０、及びリモコン装置８０を備える。 As shown in FIG. 3, the karaoke device K includes a karaoke body 40, a speaker 50, a display device 60, a microphone 70, and a remote control device 80.

スピーカ５０はカラオケ本体４０からの放音信号に基づいて放音するための構成である。表示装置６０はカラオケ本体４０からの信号に基づいて映像や画像を画面に表示するための構成である。マイク７０は利用者の歌唱音声（マイク７０への入力音声）をアナログの音声信号に変換してカラオケ本体４０に入力するための構成である。リモコン装置８０は、カラオケ本体４０に対する各種操作をおこなうための装置である。利用者はリモコン装置８０を用いてカラオケ歌唱を希望する楽曲の検索や選曲（予約）等を行うことができる。リモコン装置８０の表示画面には各種操作の指示入力を行うためのアイコン等が表示される。 The speaker 50 is configured to emit sound based on a sound emission signal from the karaoke body 40. The display device 60 is configured to display a video or image on the screen based on a signal from the karaoke body 40. The microphone 70 is configured to convert a user's singing voice (voice input to the microphone 70) into an analog voice signal and input the analog voice signal to the karaoke body 40. The remote control device 80 is a device for performing various operations on the karaoke body 40. The user can use the remote control device 80 to search for a song desired to be sung by karaoke, select a song (reserve), and the like. Icons and the like for inputting instructions for various operations are displayed on the display screen of the remote controller 80.

カラオケ本体４０は、選曲された楽曲のカラオケ演奏制御、歌詞等の表示制御、マイク７０を通じて入力された音声信号の処理といった、カラオケ歌唱に関する各種の制御を行う。図３に示すように、カラオケ本体４０は、制御部４１、通信部４２、記憶部４３、音響処理部４４、表示処理部４５、及び操作部４６を備える。各構成はインターフェース（図示なし）を介してバスＢに接続されている。 The karaoke body 40 performs various controls related to karaoke singing, such as karaoke performance control of a selected song, display control of lyrics, etc., and processing of a voice signal input through the microphone 70. As shown in FIG. 3, the karaoke body 40 includes a control unit 41, a communication unit 42, a storage unit 43, a sound processing unit 44, a display processing unit 45, and an operation unit 46. Each component is connected to the bus B via an interface (not shown).

制御部４１は、ＣＰＵおよびメモリ（いずれも図示なし）を備える。ＣＰＵは、メモリに記憶された動作プログラムを実行することにより各種の制御機能を実現する。メモリは、ＣＰＵに実行されるプログラムを記憶したり、プログラムの実行時に各種情報を一時的に記憶したりする記憶装置である。 The control unit 41 includes a CPU and a memory (neither is shown). The CPU realizes various control functions by executing the operation program stored in the memory. The memory is a storage device that stores a program executed by the CPU and temporarily stores various information when the program is executed.

通信部４２は、ルーター（図示なし）を介してカラオケ本体４０を通信回線に接続するためのインターフェースを提供する。 The communication unit 42 provides an interface for connecting the karaoke body 40 to a communication line via a router (not shown).

記憶部４３は、各種のデータを記憶する大容量の記憶装置であり、たとえばハードディスクドライブなどである。 The storage unit 43 is a large-capacity storage device that stores various types of data, and is, for example, a hard disk drive.

音響処理部４４は、制御部４１の制御に基づき、楽曲に対するカラオケ演奏の制御およびマイク７０を通じて入力された歌唱音声信号の処理を行う。表示処理部４５は、制御部４１の制御に基づき、表示装置６０やリモコン装置８０における各種表示に関する処理を行う。たとえば、表示処理部４５は、カラオケ演奏時における背景映像に歌詞テロップや各種アイコンが重ねられた映像を表示装置６０に表示させる。或いは、表示処理部４５は、リモコン装置８０の表示画面に操作入力用の各種アイコンを表示させる。 Under the control of the control unit 41, the sound processing unit 44 controls the karaoke performance for the music and processes the singing voice signal input through the microphone 70. The display processing unit 45 performs processing related to various displays on the display device 60 and the remote control device 80 under the control of the control unit 41. For example, the display processing unit 45 causes the display device 60 to display an image in which lyrics telops and various icons are superimposed on the background image during karaoke performance. Alternatively, the display processing unit 45 displays various icons for operation input on the display screen of the remote controller 80.

操作部４６は、パネルスイッチおよびリモコン受信回路などからなり、利用者によるカラオケ装置Ｋのパネルスイッチあるいはリモコン装置８０の操作に応じて選曲信号、演奏中止信号などの操作信号を制御部４１に対して出力する。制御部４１は、操作部４６からの操作信号を検出し、対応する処理を実行する。 The operation unit 46 includes a panel switch, a remote control receiving circuit, and the like, and outputs operation signals such as a music selection signal and a performance stop signal to the control unit 41 in response to the user operating the panel switch of the karaoke device K or the remote control device 80. The control unit 41 detects an operation signal from the operation unit 46 and executes the corresponding process.

［ソフトウェア構成］ [Software configuration]
図４はカラオケ本体４０のソフトウェア構成例を示す図である。カラオケ本体４０は、楽曲データ記憶部１００、ログイン処理部２００、抽出部３００、作成部４００、及び送信処理部５００を備える。楽曲データ記憶部１００は、記憶部４３の記憶領域の一部として構成される。ログイン処理部２００、抽出部３００、作成部４００、及び送信処理部５００は、制御部４１のＣＰＵがメモリに記憶されるプログラムを実行することにより実現される。 FIG. 4 is a diagram showing a software configuration example of the karaoke body 40. The karaoke body 40 includes a music data storage unit 100, a login processing unit 200, an extraction unit 300, a creation unit 400, and a transmission processing unit 500. The music data storage unit 100 is configured as a part of the storage area of the storage unit 43. The login processing unit 200, the extraction unit 300, the creation unit 400, and the transmission processing unit 500 are realized by the CPU of the control unit 41 executing a program stored in the memory.

（楽曲データ記憶部） (Music data storage unit)
楽曲データ記憶部１００は、カラオケ演奏を行うための演奏データ、第１の言語による第１の歌詞データ、及び当該第１の言語とは異なる第２の言語による第２の歌詞データを、楽曲毎に記憶する。各楽曲には、個々の楽曲を特定するための楽曲ＩＤが付与されている。 The music data storage unit 100 stores, for each song, performance data for performing a karaoke performance, first lyrics data in a first language, and second lyrics data in a second language different from the first language. Each song is assigned a music ID for identifying that song.

演奏データはカラオケ演奏音の元となるデータである。音響処理部４４は、演奏データに基づいてカラオケ演奏を行う。 The performance data is the source of the karaoke performance sound. The sound processing unit 44 performs a karaoke performance based on the performance data.

第１の歌詞データ及び第２の歌詞データは、楽曲に対応する歌詞テロップを表示装置６０等に表示させるためのデータである。第１の歌詞データは、第１の言語により構成されている。第２の歌詞データは、第１の言語とは異なる第２の言語により構成されている。たとえば、第１の言語が「日本語」の場合、第２の言語は日本語以外の言語（たとえば「英語」）である。第１の歌詞データ及び第２の歌詞データが示す歌詞は、それぞれの言語に応じて、カラオケ演奏と合うように分節されている。 The first lyrics data and the second lyrics data are data for displaying the lyrics telop corresponding to the music on the display device 60 or the like. The first lyrics data is composed in the first language. The second lyrics data is composed in a second language different from the first language. For example, when the first language is "Japanese", the second language is a language other than Japanese (for example, "English"). The lyrics indicated by the first lyric data and the second lyric data are segmented so as to match the karaoke performance according to each language.

（ログイン処理部） (Login processing unit)
ログイン処理部２００は、カラオケ装置Ｋを利用する利用者が所有する携帯端末からのログイン操作に応じて、当該利用者の利用者識別情報を取得する。 The login processing unit 200 acquires the user identification information of a user of the karaoke device K in response to a login operation from the mobile terminal owned by that user.

利用者は、カラオケ装置Ｋの利用にあたって、最初にログイン操作（自己の利用者ＩＤやパスワードの入力）を行なう。具体的に、利用者は、自己の携帯端末でカラオケアプリを起動させてペアリングが完了した後、利用者ＩＤ及びパスワードを入力する。携帯端末は、入力された利用者ＩＤ及びパスワードを、携帯端末を識別するための情報（端末ＩＤ）と併せてカラオケ装置Ｋに送信する。ログイン処理部２００は、利用者ＩＤ等を取得し、当該利用者ＩＤ等に基づいて利用者がカラオケ装置Ｋにログインする処理を行う。ログイン処理を行うことにより、カラオケ装置Ｋは、現在の利用者（及び利用者が所有する携帯端末）を特定することができる。複数の利用者でカラオケ装置Ｋを利用する場合、利用者毎にログイン操作を行う。 When using the karaoke apparatus K, the user first performs a login operation (inputting his or her own user ID and password). Specifically, the user activates the karaoke application on his/her mobile terminal and completes the pairing, and then inputs the user ID and the password. The mobile terminal transmits the input user ID and password together with the information (terminal ID) for identifying the mobile terminal to the karaoke device K. The login processing unit 200 acquires a user ID and the like, and performs processing for the user to log in to the karaoke device K based on the user ID and the like. By performing the login process, the karaoke device K can identify the current user (and the mobile terminal owned by the user). When a plurality of users use the karaoke apparatus K, a login operation is performed for each user.
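The login flow above (user ID and password sent together with a terminal ID, after which the device knows the current users and their terminals) can be sketched as follows. The names `LoginRequest` and `LoginProcessor`, and the injected `verify` callback that stands in for whatever credential check the real system performs, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class LoginRequest:
    user_id: str
    password: str
    terminal_id: str  # identifies the mobile terminal that sent the request


class LoginProcessor:
    """Tracks which users are logged in and from which terminal (sketch)."""

    def __init__(self, verify: Callable[[str, str], bool]) -> None:
        # verify(user_id, password) is a placeholder for the real check.
        self._verify = verify
        self.sessions: Dict[str, str] = {}  # user_id -> terminal_id

    def login(self, request: LoginRequest) -> bool:
        if not self._verify(request.user_id, request.password):
            return False
        # Remember the terminal so later steps can address this user.
        self.sessions[request.user_id] = request.terminal_id
        return True
```

Each user in the room logs in separately, so `sessions` ends up holding one entry per user.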

ここで、図１の例において、利用者Ｕ１〜利用者Ｕ３が自己の携帯端末を介してカラオケ装置Ｋにログインした後、利用者Ｕ１が携帯端末Ｍ１を介して楽曲Ｘの予約を行うとする。また、楽曲Ｘには、日本人アーティストが歌唱するオリジナルバージョン（日本語）と、米国人アーティストが歌唱するカバーバージョン（英語）があるとする。この場合、楽曲データ記憶部１００には、楽曲Ｘの演奏データ、日本語による歌詞データ、及び英語による歌詞データが記憶されている。 Here, in the example of FIG. 1, suppose that after the users U1 to U3 log in to the karaoke device K via their own mobile terminals, the user U1 reserves the music X via the mobile terminal M1. Suppose also that the music X has an original version (Japanese) sung by a Japanese artist and a cover version (English) sung by an American artist. In this case, the music data storage unit 100 stores the performance data of the music X, the lyrics data in Japanese, and the lyrics data in English.

利用者Ｕ１は、楽曲Ｘの予約にあたり、オリジナルバージョンとカバーバージョンのいずれのカラオケ歌唱を行うかを選択する。利用者Ｕ１は日本人であるため、オリジナルバージョンを選択したとする。この場合、第１の言語は「日本語」であり、第２の言語は「英語」となる。携帯端末Ｍ１は、楽曲Ｘの楽曲ＩＤ、利用者が選択したバージョンを示すバージョン情報、及び利用者Ｕ１の利用者ＩＤをカラオケ装置Ｋに送信する。 When making a reservation for the music piece X, the user U1 selects which of the original version and the cover version the karaoke song is to be performed. Since the user U1 is Japanese, it is assumed that the original version is selected. In this case, the first language is "Japanese" and the second language is "English". The mobile terminal M1 transmits the music ID of the music X, version information indicating the version selected by the user, and the user ID of the user U1 to the karaoke device K.

制御部４１は、受信した楽曲ＩＤに基づいて楽曲データ記憶部１００から楽曲Ｘの演奏データを読み出し、音響処理部４４を制御して、楽曲Ｘのカラオケ演奏を行う。また、制御部４１は、受信したバージョン情報に基づいて楽曲データ記憶部１００から楽曲Ｘの日本語による歌詞データを読み出し、表示処理部４５を制御して、当該歌詞データに基づく歌詞テロップ（日本語の歌詞テロップ）を表示装置６０やリモコン装置８０に表示する。この例において、日本語による歌詞データは「第１の歌詞データ」に相当する。 The control unit 41 reads the performance data of the music X from the music data storage unit 100 based on the received music ID and controls the sound processing unit 44 to perform the karaoke performance of the music X. The control unit 41 also reads the Japanese lyrics data of the music X from the music data storage unit 100 based on the received version information and controls the display processing unit 45 to display a lyrics telop based on that lyrics data (a Japanese lyrics telop) on the display device 60 and the remote control device 80. In this example, the lyrics data in Japanese corresponds to the "first lyrics data".
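How the selected version determines which stored lyrics act as the first and second lyrics data can be sketched as below. The `Song` record, the language codes, and the rule of taking the first other stored language as the second language are illustrative assumptions; the specification only describes a two-language example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Song:
    music_id: str
    performance_data: bytes
    # Language code -> lyric lines already segmented to fit the performance.
    lyrics: Dict[str, List[str]]


def select_lyrics(song: Song, sung_language: str) -> Tuple[List[str], List[str]]:
    """Return (first lyrics data, second lyrics data) for the chosen version."""
    first = song.lyrics[sung_language]  # shown to the singer as the telop
    others = [lang for lang in song.lyrics if lang != sung_language]
    if not others:
        raise ValueError("no lyrics stored in a second language")
    second = song.lyrics[others[0]]     # used later for the synthetic voice
    return first, second
```

With the example in the text, reserving the original version means `sung_language` is Japanese, and the English lyrics become the second lyrics data.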

（抽出部） (Extraction unit)
抽出部３００は、ある楽曲の演奏データに基づくカラオケ演奏に合わせて利用者が第１の言語でカラオケ歌唱を行った際に得られる歌唱音声から、音高データ及び音量データを含む歌唱音声データを抽出する。 The extraction unit 300 extracts singing voice data, including pitch data and volume data, from the singing voice obtained when the user sings karaoke in the first language along with the karaoke performance based on the performance data of a certain song.

音高データや音量データの抽出は、公知の手法（たとえば、特開２０１１−２１５２９２号公報）を用いることができる。 A known method (for example, Japanese Patent Laid-Open No. 2011-215292) can be used to extract the pitch data and the volume data.

上記例において楽曲Ｘの予約を行った利用者Ｕ１は、楽曲Ｘのカラオケ演奏に合わせて、日本語の歌詞テロップを参照しながら楽曲Ｘのカラオケ歌唱を行う。 In the above example, the user U1 who has reserved the music piece X performs the karaoke song of the music piece X while referring to the Japanese lyrics telop in time with the karaoke performance of the music piece X.

抽出部３００は、楽曲Ｘのカラオケ演奏に合わせて利用者Ｕ１が日本語でカラオケ歌唱を行った際に得られる歌唱音声から、音高データ及び音量データを含む歌唱音声データを抽出する。 The extraction unit 300 extracts singing voice data including pitch data and volume data from the singing voice obtained when the user U1 sings a karaoke song in Japanese in accordance with the karaoke performance of the music piece X.
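The specification defers to a known method for extracting pitch and volume. As a stand-in, the sketch below estimates volume as RMS amplitude and pitch by picking the strongest autocorrelation lag within a frame. This is a common textbook approach, not necessarily the method of the cited publication; the function names and the default search range of 80 to 1000 Hz are assumptions.

```python
import math
from typing import Optional, Sequence


def frame_volume(frame: Sequence[float]) -> float:
    """Volume of one frame as root-mean-square amplitude."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def frame_pitch(frame: Sequence[float], sample_rate: int,
                fmin: float = 80.0, fmax: float = 1000.0) -> Optional[float]:
    """Pitch in Hz from the strongest autocorrelation lag, or None if unvoiced."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    if max_lag <= min_lag or not any(frame):
        return None  # frame too short or silent
    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    if best_lag == 0:
        return None  # no positive correlation found
    return sample_rate / best_lag
```

Calling both functions on successive frames of the microphone signal would yield the pitch and volume sequences that make up the singing voice data.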

（作成部） (Creation unit)
作成部４００は、抽出した歌唱音声データ、カラオケ演奏に合わせて読み出したある楽曲の第２の歌詞データ、及びカラオケ歌唱を行う利用者の利用者識別情報に応じてサーバ装置Ｓから取得した声質パラメータに基づいて、合成歌唱音声信号を作成する。 The creation unit 400 creates a synthetic singing voice signal based on the extracted singing voice data, the second lyrics data of the song read out in time with the karaoke performance, and the voice quality parameter acquired from the server device S according to the user identification information of the user singing karaoke.

合成歌唱音声信号の作成は、公知の手法（たとえば、特開２００７−２４０５６４号公報）を用いて行うことができる。たとえば、作成部４００は、第２の歌詞データに基づいて歌唱音声信号の元となる音声波形を合成し、歌唱音声データに含まれる音高データ及び音量データを、合成した音声波形に適用して音高変化及び音量変化を付与した歌唱音声信号を生成する。作成部４００は、生成した歌唱音声信号に、声質パラメータに基づいたイコライジング処理やフォルマント・フィルタ処理などを施すことにより、合成歌唱音声信号を作成する。 The synthetic singing voice signal can be created using a known method (for example, Japanese Patent Laid-Open No. 2007-240564). For example, the creation unit 400 synthesizes a voice waveform that serves as the basis of the singing voice signal from the second lyrics data, and applies the pitch data and volume data included in the singing voice data to the synthesized voice waveform to generate a singing voice signal with pitch changes and volume changes. The creation unit 400 then creates the synthetic singing voice signal by applying equalizing processing, formant filtering, or the like based on the voice quality parameter to the generated singing voice signal.
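The step of applying pitch and volume changes can be illustrated with a deliberately simplified sketch. Here a plain sine oscillator stands in for the voice waveform synthesized from the second lyrics data, and the equalizing and formant filtering based on the voice quality parameter are omitted entirely, so this shows only how per-frame pitch and volume data could drive a waveform. The function name `render_contour` and its parameters are assumptions.

```python
import math
from typing import List, Optional, Sequence


def render_contour(pitches_hz: Sequence[Optional[float]],
                   volumes: Sequence[float],
                   sample_rate: int,
                   frame_len: int) -> List[float]:
    """Render one frame per (pitch, volume) pair; None pitch means silence."""
    samples: List[float] = []
    phase = 0.0  # carried across frames so pitch changes do not click
    for pitch, volume in zip(pitches_hz, volumes):
        for _ in range(frame_len):
            if pitch is None:
                samples.append(0.0)
            else:
                phase += 2.0 * math.pi * pitch / sample_rate
                samples.append(volume * math.sin(phase))
    return samples
```

In a real system the oscillator would be replaced by a lyric-driven singing synthesizer, with the voice quality filtering applied to its output.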

声質パラメータは利用者の声そのものを規定し、歌唱音声データは利用者の歌い方を規定する。従って、これらに基づいて作成した合成歌唱音声信号に基づく音声は、利用者がある楽曲を第２の言語でカラオケ歌唱した場合の音声と類似する。 The voice quality parameter defines the user's voice itself, and the singing voice data defines the user's singing style. Therefore, the voice based on the synthetic singing voice signal created based on these is similar to the voice when the user sings a certain song in karaoke in the second language.

上述のように利用者Ｕ１が楽曲Ｘのカラオケ歌唱を日本語で行った場合、作成部４００は、抽出部３００により抽出された歌唱音声データを取得する。また、作成部４００は、楽曲Ｘのカラオケ演奏に合わせて楽曲データ記憶部１００から楽曲Ｘの英語の歌詞データを取得する。この例において、英語の歌詞データは「第２の歌詞データ」に相当する。更に、作成部４００は、利用者Ｕ１の利用者ＩＤに基づいて、サーバ装置Ｓから利用者Ｕ１の声質パラメータを取得する。作成部４００は、取得した各データ及びパラメータに基づいて、合成歌唱音声信号を作成する。 When the user U1 sings the karaoke song of the music piece X in Japanese as described above, the creating unit 400 acquires the singing voice data extracted by the extracting unit 300. Further, the creating unit 400 acquires the English lyrics data of the song X from the song data storage unit 100 in synchronization with the karaoke performance of the song X. In this example, the English lyrics data corresponds to the "second lyrics data". Furthermore, the creation unit 400 acquires the voice quality parameter of the user U1 from the server S based on the user ID of the user U1. The creation unit 400 creates a synthetic singing voice signal based on the acquired data and parameters.

（送信処理部） (Transmission processing unit)
送信処理部５００は、ログイン操作を行った複数の利用者のうち、カラオケ歌唱を行う利用者以外の利用者である聴衆が所有する携帯端末に対し、作成した合成歌唱音声信号を送信する。 The transmission processing unit 500 transmits the created synthetic singing voice signal to the mobile terminals owned by the audience, that is, those of the plural users who performed the login operation other than the user singing karaoke.

上述のように利用者Ｕ１〜利用者Ｕ３がログイン操作を行った後、利用者Ｕ１が楽曲Ｘのカラオケ歌唱を行う場合、利用者Ｕ２及び利用者Ｕ３は聴衆にあたる。送信処理部５００は、利用者Ｕ２が所有する携帯端末Ｍ２、及び利用者Ｕ３が所有する携帯端末Ｍ３に対し、作成した合成歌唱音声信号を送信する。 When the user U1 performs the karaoke singing of the music X after the user U1 to the user U3 perform the login operation as described above, the users U2 and U3 correspond to the audience. The transmission processing unit 500 transmits the created synthesized singing voice signal to the mobile terminal M2 owned by the user U2 and the mobile terminal M3 owned by the user U3.
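Choosing the destination terminals amounts to taking every logged-in user except the singer. A minimal sketch, assuming the login step keeps a mapping from user ID to terminal ID (the function name and the mapping are illustrative):

```python
from typing import Dict, List


def audience_terminals(sessions: Dict[str, str], singer_id: str) -> List[str]:
    """Terminal IDs of all logged-in users other than the current singer."""
    return [terminal for user, terminal in sessions.items() if user != singer_id]
```

For the example in the text, a mapping of U1 to M1, U2 to M2, and U3 to M3 with U1 singing yields the terminals M2 and M3.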

合成歌唱音声信号を受信した携帯端末は、当該携帯端末が備えるスピーカを介して、合成歌唱音声信号に基づく音声を放音する。 The mobile terminal that has received the synthetic singing voice signal emits a sound based on the synthetic singing voice signal via the speaker included in the mobile terminal.

＝＝カラオケシステムにおける処理について＝＝ == Processing in the karaoke system ==
次に、図５を参照して本実施形態に係るカラオケシステム１における処理の具体例について述べる。図５は、カラオケシステム１における処理例を示すフローチャートである。この例では、利用者Ｕ１〜利用者Ｕ３がカラオケ装置Ｋを利用する。また、サーバ装置Ｓの利用者情報記憶部１０ａは、利用者の声質を示す声質パラメータを、当該利用者を識別するための利用者識別情報と関連付けて記憶しているとする。また、楽曲Ｘは、日本語のバージョンと英語のバージョンがあるとする。更に、カラオケ装置Ｋの楽曲データ記憶部１００は、楽曲Ｘについて、カラオケ演奏を行うための演奏データ、日本語による歌詞データ、及び英語による歌詞データを記憶しているとする。 Next, a specific example of processing in the karaoke system 1 according to the present embodiment will be described with reference to FIG. 5. FIG. 5 is a flowchart showing a processing example in the karaoke system 1. In this example, the users U1 to U3 use the karaoke device K. It is assumed that the user information storage unit 10a of the server device S stores the voice quality parameter indicating each user's voice quality in association with the user identification information for identifying that user. It is also assumed that the music X has a Japanese version and an English version. Further, it is assumed that the music data storage unit 100 of the karaoke device K stores, for the music X, performance data for performing a karaoke performance, lyrics data in Japanese, and lyrics data in English.

各利用者は、自己の携帯端末においてカラオケアプリを起動させ、カラオケ装置Ｋに対してログイン操作を行う。 Each user activates the karaoke application on his/her mobile terminal and performs a login operation on the karaoke device K.

ログイン処理部２００は、各携帯端末を介して行われたログイン操作に応じて、各利用者の利用者ＩＤを取得する（ログイン処理。ステップ１０）。 The login processing unit 200 acquires the user ID of each user according to the login operation performed via each mobile terminal (login processing, step 10).

その後、利用者Ｕ１が、楽曲Ｘについて日本語のバージョンでの歌唱予約を行ったとする。カラオケ装置Ｋは、楽曲Ｘの演奏データを楽曲データ記憶部１００から読み出し、楽曲Ｘのカラオケ演奏を開始する（カラオケ演奏の開始。ステップ１１）。 After that, it is assumed that the user U1 makes a song reservation for the song X in the Japanese version. The karaoke apparatus K reads the performance data of the music X from the music data storage unit 100 and starts the karaoke performance of the music X (start of karaoke performance, step 11).

利用者Ｕ１がカラオケ歌唱を開始する前（楽曲Ｘの前奏中）に、作成部４００は、利用者Ｕ１の利用者ＩＤをサーバ装置Ｓに送信する（利用者ＩＤの送信。ステップ１２）。 Before the user U1 starts singing a karaoke song (during the prelude of the music piece X), the creating unit 400 transmits the user ID of the user U1 to the server device S (transmission of the user ID, step 12).

サーバ装置Ｓは、ステップ１２で送信された利用者ＩＤに関連付けられた声質パラメータを利用者情報記憶部１０ａから読み出し、カラオケ装置Ｋに送信する（声質パラメータの送信。ステップ１３）。 The server device S reads the voice quality parameter associated with the user ID transmitted in step 12 from the user information storage section 10a and transmits it to the karaoke device K (transmission of voice quality parameter, step 13).

その後、楽曲Ｘの前奏が終了し、歌唱区間（カラオケ歌唱を行うための歌詞が付与されている区間）のカラオケ演奏が開始される。カラオケ装置Ｋは、表示装置６０に、日本語による歌詞データに基づく歌詞テロップを表示させる。利用者Ｕ１は、歌詞テロップを参照しながら、楽曲Ｘのカラオケ演奏に合わせて、日本語でカラオケ歌唱を行う（カラオケ歌唱の開始。ステップ１４）。 After that, the prelude of the music piece X is finished, and the karaoke performance of the singing section (the section where the lyrics for singing the karaoke song is given) is started. The karaoke device K causes the display device 60 to display the lyrics telop based on the lyrics data in Japanese. The user U1 performs karaoke singing in Japanese in accordance with the karaoke performance of the music piece X while referring to the lyrics telop (starting karaoke singing, step 14).

抽出部３００は、利用者Ｕ１の歌唱音声から、歌唱音声データを抽出する（歌唱音声データの抽出。ステップ１５）。 The extraction unit 300 extracts singing voice data from the singing voice of the user U1 (extraction of singing voice data, step 15).

作成部４００は、ステップ１５で抽出した歌唱音声データ、カラオケ演奏に合わせて読み出した楽曲Ｘの英語による歌詞データ、及びステップ１３で送信された利用者Ｕ１の声質パラメータに基づいて、合成歌唱音声信号を作成する（合成歌唱音声信号の作成。ステップ１６）。 The creation unit 400 creates a synthetic singing voice signal based on the singing voice data extracted in step 15, the English lyrics data of the music X read out in time with the karaoke performance, and the voice quality parameter of the user U1 transmitted in step 13 (creation of the synthetic singing voice signal, step 16).

送信処理部５００は、ステップ１０でログイン操作を行った複数の利用者のうち、利用者Ｕ１のカラオケ歌唱を聴いている利用者Ｕ２及び利用者Ｕ３が所有する携帯端末それぞれに対し、ステップ１６で作成した合成歌唱音声信号を送信する（合成歌唱音声信号の送信。ステップ１７）。各携帯端末は、合成歌唱音声信号に基づく音声（利用者Ｕ１が楽曲Ｘを英語でカラオケ歌唱した場合の音声と類似する音声）を放音する。各利用者は、楽曲Ｘのカラオケ演奏に合わせて、合成歌唱音声信号に基づく音声を聴くことができる。 The transmission processing unit 500 transmits the synthetic singing voice signal created in step 16 to each of the mobile terminals owned by the users U2 and U3, who, among the plural users who performed the login operation in step 10, are listening to the karaoke singing of the user U1 (transmission of the synthetic singing voice signal, step 17). Each mobile terminal emits sound based on the synthetic singing voice signal (sound similar to what would be heard if the user U1 sang the music X in English). Each user can listen to the sound based on the synthetic singing voice signal in time with the karaoke performance of the music X.

The karaoke device K repeats the processing of steps 15 to 17 until the karaoke singing of music piece X ends (Y in step 18).
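The repetition of steps 15 to 17 can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the `SynthFrame` record, the representation of a voice frame as a `(pitch, volume)` pair, the placeholder extractor, and the modeling of a mobile terminal as a list that receives frames. The specification does not define a data format or a synthesis method, so the "synthesis" below only bundles the three inputs of step 16.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class SynthFrame:
    """One unit of the synthetic singing voice signal (hypothetical format)."""
    syllable: str                    # syllable from the second-language lyrics
    pitch: float                     # pitch taken from the singer's voice
    volume: float                    # volume taken from the singer's voice
    voice_quality: Dict[str, float]  # singer's voice quality parameter


def extract_singing_voice_data(frame: Tuple[float, float]) -> Tuple[float, float]:
    """Step 15 placeholder: a real extractor would analyze audio samples."""
    pitch, volume = frame
    return pitch, volume


def run_singing_section(
    frames: Sequence[Tuple[float, float]],
    second_lyrics: Sequence[str],
    voice_quality: Dict[str, float],
    audience_terminals: Dict[str, List[SynthFrame]],
) -> None:
    """Repeat steps 15 to 17 until the input runs out (the step 18 check)."""
    for frame, syllable in zip(frames, second_lyrics):
        pitch, volume = extract_singing_voice_data(frame)           # step 15
        synth = SynthFrame(syllable, pitch, volume, voice_quality)  # step 16
        for received in audience_terminals.values():                # step 17
            received.append(synth)


terminals: Dict[str, List[SynthFrame]] = {"U2": [], "U3": []}  # singer U1 is not a recipient
run_singing_section(
    frames=[(60.0, 0.8), (62.0, 0.7)],
    second_lyrics=["hel", "lo"],
    voice_quality={"brightness": 0.5},
    audience_terminals=terminals,
)
print([f.syllable for f in terminals["U3"]])  # prints ['hel', 'lo']
```

Processing frame by frame, rather than after the song ends, reflects the requirement that the audience hear the synthesized voice in time with the karaoke performance.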

As described above, in the karaoke system 1 according to the present embodiment, the server device S and the karaoke device K are communicably connected. The server device S includes a user information storage unit 10a that stores a voice quality parameter indicating the voice quality of a user in association with a user ID for identifying the user. The karaoke device K includes: a music data storage unit 100 that stores, for each music piece, performance data for performing a karaoke performance, first lyrics data in a first language, and second lyrics data in a second language different from the first language; a login processing unit 200 that acquires the user ID of a user of the karaoke device K in response to a login operation from a mobile terminal owned by that user; an extraction unit 300 that extracts singing voice data including pitch data and volume data from the singing voice obtained when a user sings in the first language in time with a karaoke performance based on the performance data of a certain music piece; a creation unit 400 that creates a synthetic singing voice signal based on the extracted singing voice data, the second lyrics data of that music piece read out in accordance with the karaoke performance, and the voice quality parameter acquired from the server device S according to the user ID of the user performing the karaoke singing; and a transmission processing unit 500 that transmits the created synthetic singing voice signal to the mobile terminals owned by the audience, that is, those users among the plurality of users who performed the login operation other than the user performing the karaoke singing.

According to such a karaoke system 1, when, for example, the user U1 sings music piece X in the first language, the users U2 and U3 can listen, via their own mobile terminals, to sound resembling what would be heard if the user U1 sang music piece X in the second language. A user whose native language is the second language can therefore hear the karaoke performance of music piece X and that sound together as a song. Further, because the lyrics in the second language are emitted as sound, a user whose native language is the second language can understand the content of the lyrics by ear. That is, according to the karaoke system 1 of the present embodiment, even a user whose native language differs from that of the user performing the karaoke singing can enjoy the song in time with the karaoke performance.

In the above embodiment, the transmission processing unit 500 may transmit the second lyrics data together with the synthetic singing voice signal. In this case, the mobile terminal can display a lyrics telop based on the second lyrics data on its own display screen. The audience can therefore enjoy the karaoke singing visually as well.

Further, the music data storage unit 100 can store, for each music piece, reference data for evaluating karaoke singing in the second language. In this case, the extraction unit 300 can correct the extracted singing voice data based on the reference data and output the corrected data.

The reference data is pitch data indicating, for example, the main melody of the music piece. It is used as the standard when scoring a user's karaoke singing.

For example, in the example of the above embodiment, the pitch data extracted from the singing voice of the user U1 is based on the pitches produced when singing in Japanese. However, when the pitches of a given lyric sung in Japanese are compared with those of the same lyric sung in English, the pitches may deviate from each other when viewed syllable by syllable, even if the initial pitches match.

Therefore, for any portion of the extracted pitch data where the pitch differs between the first language and the second language, the extraction unit 300 replaces the extracted pitch with the pitch indicated by the reference data of the second language.
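The replacement rule can be illustrated with a short Python sketch. It rests on assumptions the specification does not state: that pitches are note-aligned lists of equal length, and that a "portion where the pitch differs" is detected by comparing a first-language reference with the second-language reference. The function name `correct_pitch` and both reference arguments are hypothetical.

```python
from typing import List, Sequence


def correct_pitch(
    extracted: Sequence[float],
    ref_first: Sequence[float],
    ref_second: Sequence[float],
) -> List[float]:
    """Keep the sung pitch where both languages share a reference pitch;
    otherwise substitute the second-language reference pitch."""
    if not (len(extracted) == len(ref_first) == len(ref_second)):
        raise ValueError("pitch sequences must be note-aligned and equal in length")
    corrected = []
    for sung, first, second in zip(extracted, ref_first, ref_second):
        # Assumption: a mismatch between the two references marks a portion
        # whose melody differs between the first and second language.
        corrected.append(second if first != second else sung)
    return corrected


sung_pitches = [60.2, 62.1, 64.0]  # extracted from the Japanese singing
japanese_ref = [60, 62, 64]        # hypothetical first-language reference
english_ref = [60, 62, 65]         # second-language reference data
print(correct_pitch(sung_pitches, japanese_ref, english_ref))  # prints [60.2, 62.1, 65]
```

Where the references agree, the singer's own pitch (including any deviation from the melody) is preserved, so only the language-dependent portions are overwritten.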

The creation unit 400 creates the synthetic singing voice signal using the corrected singing voice data. Sound based on such a signal is close to what would be heard if a user whose native language is the second language sang the song. Thus, for example, if the second language is English, the audience hears sound that is more natural as English.

<Second Embodiment>
Next, a karaoke system according to the second embodiment will be described. This embodiment describes an example in which the synthetic singing voice signal is transmitted only to users whose native language differs from that of the user performing the karaoke singing. Detailed description of configurations that are the same as in the first embodiment is omitted.

The user information storage unit 10a according to the present embodiment stores native language information indicating the native language of a user in association with the user identification information.

For example, the native language information of the users U1 and U2 shown in FIG. 1 is "Japanese", while the native language information of the user U3 is "English". The native language information can be stored, for example, at the same time as the voice quality data is stored.

The transmission processing unit 500 according to the present embodiment transmits the synthetic singing voice signal, based on the native language information of the audience, only to audience members whose native language is the second language.

For example, suppose that, as in the first embodiment, the user U1 sings music piece X in Japanese.

In this case, the karaoke device K transmits the user IDs of the users U2 and U3, who are the audience, to the server device S. Based on those user IDs, the server device S reads the native language information of each user from the user information storage unit 10a and transmits it to the karaoke device K.

After that, the creation unit 400 creates the synthetic singing voice signal as the user U1 sings.

When transmitting the created synthetic singing voice signal, the transmission processing unit 500 identifies the second language corresponding to the second lyrics data used to create the signal. It then compares the identified second language with the language indicated by the native language information received from the server device S, and transmits the synthetic singing voice signal only to users for whom the two languages match.

As described above, the user U1 is singing music piece X in Japanese. The transmission processing unit 500 therefore transmits the synthetic singing voice signal to the user U3 (native language: English), for whom the second language of music piece X (English) matches the language indicated by the native language information. Conversely, it does not transmit the signal to the user U2 (native language: Japanese), for whom the second language of music piece X (English) does not match.
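The comparison performed by the transmission processing unit 500 amounts to a simple filter, sketched below in Python. The function name `select_recipients`, the dictionary layout, and the language codes "ja" and "en" are assumptions chosen for illustration; the specification only requires that the second language be compared with each audience member's native language information.

```python
from typing import Dict, List


def select_recipients(
    audience_native_languages: Dict[str, str],
    second_language: str,
) -> List[str]:
    """Return the user IDs of audience members whose native language
    matches the second language of the synthetic singing voice signal."""
    return [
        user_id
        for user_id, native_language in audience_native_languages.items()
        if native_language == second_language
    ]


# Native language information received from the server device S for the
# audience (the singer U1 is not part of the audience).
audience = {"U2": "ja", "U3": "en"}
print(select_recipients(audience, second_language="en"))  # prints ['U3']
```

With the example data, only U3 is selected, which matches the behavior described for music piece X.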

As described above, the user information storage unit 10a in the karaoke system 1 according to the present embodiment stores native language information indicating the native language of a user in association with the user identification information. Further, the transmission processing unit 500 transmits the synthetic singing voice signal, based on the native language information of the audience, only to audience members whose native language is the second language.

By transmitting the synthetic singing voice signal only to users whose native language is the same as the second language in this way, those users can enjoy the song in time with the karaoke performance. Meanwhile, the karaoke system 1 can omit the processing of transmitting the synthetic singing voice signal to the mobile terminals of users whose native language is not the second language (for example, users whose native language is the first language in which the song is being sung).

<Other>
The karaoke system 1 in the above embodiments can also be configured with the karaoke device K alone.

Such a karaoke device K includes: a music data storage unit 100 that stores, for each music piece, performance data for performing a karaoke performance, first lyrics data in a first language, and second lyrics data in a second language different from the first language; a login processing unit 200 that acquires the user identification information of a user of the karaoke device K in response to a login operation performed via a mobile terminal owned by that user; an extraction unit 300 that extracts singing voice data including pitch data and volume data from the singing voice obtained when a user sings in the first language in time with a karaoke performance based on the performance data of a certain music piece; a creation unit 400 that creates a synthetic singing voice signal based on the extracted singing voice data, the second lyrics data read out in accordance with the karaoke performance, and a voice quality parameter indicating the voice quality of the user performing the karaoke singing; and a transmission processing unit 500 that transmits the created synthetic singing voice signal to the mobile terminals owned by the audience, that is, those users among the plurality of users who performed the login operation other than the user performing the karaoke singing.

In this example, the voice quality parameter can be acquired, for example, from the server device S. Alternatively, a configuration corresponding to the user information storage unit 10a of the server device S may be provided on the karaoke device K side. In this case, part of the storage area of the storage unit 43 of the karaoke device K functions as the user information storage unit 10a. As a further alternative, the voice quality parameter may be acquired each time the karaoke device K is used.
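The three acquisition routes named above can be combined in one lookup, as in the following Python sketch. The ordering (local storage first, then the server, then on-the-spot acquisition) is an assumption made for this example and is not prescribed by the specification. The names `get_voice_quality`, `local_store`, and `fetch_from_server` are hypothetical.

```python
from typing import Callable, Dict, Optional

VoiceQuality = Dict[str, float]


def get_voice_quality(
    user_id: str,
    local_store: Dict[str, VoiceQuality],
    fetch_from_server: Optional[Callable[[str], Optional[VoiceQuality]]] = None,
) -> Optional[VoiceQuality]:
    """Look up a voice quality parameter by user identification information.

    Returns None when neither source has it, which signals the caller to
    acquire the parameter on the spot for this session.
    """
    # Route 1: storage area on the karaoke device acting as unit 10a.
    if user_id in local_store:
        return local_store[user_id]
    # Route 2: the server device S, if one is connected.
    if fetch_from_server is not None:
        return fetch_from_server(user_id)
    # Route 3: nothing stored; the caller must acquire it at time of use.
    return None


server_side = {"U1": {"brightness": 0.5}}  # stand-in for server device S
print(get_voice_quality("U1", local_store={}, fetch_from_server=server_side.get))
# prints {'brightness': 0.5}
```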

In the above embodiments, two pieces of lyrics data in different languages (the first lyrics data and the second lyrics data) are stored for one music piece. However, three or more pieces of lyrics data in different languages (for example, lyrics data in Japanese, lyrics data in English, and lyrics data in Chinese) may be stored for one music piece.
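One way to hold three or more sets of lyrics per music piece is a mapping keyed by language, sketched below in Python. The `SongRecord` class, its field names, and the language codes are assumptions for illustration; the specification does not define the storage layout of the music data storage unit 100.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SongRecord:
    """Hypothetical per-song entry in the music data storage unit 100."""
    song_id: str
    performance_data: bytes
    lyrics_by_language: Dict[str, List[str]] = field(default_factory=dict)

    def candidate_second_languages(self, first_language: str) -> List[str]:
        """Languages with stored lyrics other than the language being sung."""
        return sorted(
            language
            for language in self.lyrics_by_language
            if language != first_language
        )


song_x = SongRecord(
    song_id="X",
    performance_data=b"",
    lyrics_by_language={
        "ja": ["..."],  # lyric text elided; placeholders only
        "en": ["..."],
        "zh": ["..."],
    },
)
print(song_x.candidate_second_languages("ja"))  # prints ['en', 'zh']
```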

The above embodiments are presented as examples and do not limit the scope of the invention. The above configurations can be combined as appropriate, and various omissions, replacements, and changes can be made without departing from the gist of the invention. The above embodiments and their modifications are included in the scope and gist of the invention, and likewise in the invention described in the claims and its equivalents.

1 Karaoke system
10a User information storage unit
100 Music data storage unit
200 Login processing unit
300 Extraction unit
400 Creation unit
500 Transmission processing unit
K Karaoke device
S Server device

Claims

A karaoke system in which a server device and a karaoke device are communicably connected to each other, wherein
the server device comprises
a user information storage unit that stores a voice quality parameter indicating the voice quality of a user in association with user identification information for identifying the user, and
the karaoke device comprises:
a song data storage unit that stores, for each song, performance data for performing a karaoke performance, first lyrics data in a first language, and second lyrics data in a second language different from the first language;
a login processing unit that acquires the user identification information of a user of the karaoke device according to a login operation performed via a mobile terminal owned by the user;
an extraction unit that extracts singing voice data including pitch data and volume data from a singing voice obtained when a user sings a karaoke song in the first language in accordance with a karaoke performance based on the performance data of a certain song;
a creation unit that creates a synthetic singing voice signal based on the extracted singing voice data, the second lyrics data of the certain song read out in accordance with the karaoke performance, and the voice quality parameter acquired from the server device according to the user identification information of the user who performs the karaoke singing; and
a transmission processing unit that transmits the created synthetic singing voice signal to a mobile terminal owned by an audience member who, among a plurality of users who performed the login operation, is a user other than the user who performs the karaoke singing.

The karaoke system according to claim 1, wherein the transmission processing unit transmits the second lyrics data together with the synthesized singing voice signal.

The music data storage unit stores, for each music, reference data for evaluating a karaoke song in the second language,
The karaoke system according to claim 1 or 2, wherein the extraction unit corrects the extracted singing voice data based on the reference data and outputs the corrected singing voice data.

The karaoke system according to any one of the preceding claims, wherein
the user information storage unit stores native language information indicating a native language of the user in association with the user identification information, and
the transmission processing unit transmits the synthetic singing voice signal only to an audience member whose native language is the second language, based on the native language information of the audience.

A karaoke device comprising:
a song data storage unit that stores, for each song, performance data for performing a karaoke performance, first lyrics data in a first language, and second lyrics data in a second language different from the first language;
a login processing unit that acquires the user identification information of a user of the karaoke device according to a login operation performed via a mobile terminal owned by the user;
an extraction unit that extracts singing voice data including pitch data and volume data from a singing voice obtained when a user sings a karaoke song in the first language in accordance with a karaoke performance based on the performance data of a certain song;
a creation unit that creates a synthetic singing voice signal based on the extracted singing voice data, the second lyrics data read out in accordance with the karaoke performance, and a voice quality parameter indicating the voice quality of the user performing the karaoke singing; and
a transmission processing unit that transmits the created synthetic singing voice signal to a mobile terminal owned by an audience member who, among a plurality of users who performed the login operation, is a user other than the user who performs the karaoke singing.