JP5704201B2

JP5704201B2 - Karaoke device and karaoke music processing program

Info

Publication number: JP5704201B2
Application number: JP2013178254A
Authority: JP
Inventors: 正弘光川
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2013-08-29
Filing date: 2013-08-29
Publication date: 2015-04-22
Anticipated expiration: 2033-08-29
Also published as: JP2015045822A

Description

The present invention relates to a karaoke apparatus that displays lyric text and to a karaoke music processing program.

A technique has already been proposed for sequentially replacing text written in a given language with similar-sounding English words (see, for example, Patent Document 1). In this prior art, the replacement is based on the similarity of phonetic symbols, so that, for example, the pronunciation of spoken Japanese can be conveyed fairly accurately to a person who can read English.

Patent Document 1: JP 2010-108201 A

In a karaoke apparatus, the lyric text of a karaoke song is displayed as the song is played, so the user can enjoy singing while reading the lyrics. When the language of the lyric text is the user's native language (for example, the lyric text is in Japanese and the user is Japanese), the user can easily sing along with the original song (the song as sung by the singer) by reading the displayed lyrics. In contrast, when the language of the lyric text is not the user's native language (for example, the user is a foreigner whose native language is English or who uses English in daily life), the user may be unable to read the displayed lyric text adequately or, even if able to read it, unable to pronounce it with the fluent flow of that language. This makes it difficult to sing along with the original song.

In such a case, one conceivable approach is to apply the above prior art and simply replace the lyric sentences with a string of similar-sounding English words based on the similarity of phonetic symbols. However, a karaoke song is divided into units by predetermined passages and phrases, such as the so-called A melody, B melody, and chorus. Therefore, when, for example, lyrics straddle two such passages or phrases, this simple replacement makes it difficult to sing in a manner close to the original song that reflects the flow and breaks of the music.

An object of the present invention is to provide a karaoke apparatus and a karaoke music processing program that allow a user to sing reliably in a pronunciation close to that of the original karaoke song, even when the user is a foreigner whose native language is English (or who uses English in daily life).

To achieve the above object, a karaoke apparatus according to a first invention comprises: music reproduction means capable of reproducing karaoke music data with which predetermined passage/phrase data and first lyric data are associated; music accepting means for accepting a user's selection of the karaoke music data; English word lyric generating means for extracting, from the first lyric data corresponding to the karaoke music data accepted by the music accepting means, lyric portions corresponding to English words, and generating second lyric data in which those portions are written in the alphabet; phonetic symbol assigning means for assigning a phonetic symbol to each word of the lyric text of the remaining portion of the first lyric data that was not extracted by the English word lyric generating means; similarity deriving means for comparing, for each passage or phrase corresponding to the passage/phrase data, the phonetic symbols assigned to the lyric text by the phonetic symbol assigning means with the phonetic symbols of English word candidates for replacing the lyric text, and deriving a similarity; replacement lyric generating means for selecting and arranging, for each passage or phrase and according to the result derived by the similarity deriving means, a plurality of the English word candidates so that the similarity is maximized, and generating third lyric data for replacement; lyric synthesizing means for combining the second lyric data generated by the English word lyric generating means with the third lyric data generated by the replacement lyric generating means to generate new lyric data corresponding to the karaoke music data accepted by the music accepting means; and display means for displaying the new lyric data corresponding to the karaoke music data in accordance with reproduction of the karaoke music data accepted by the music accepting means.

In a karaoke apparatus, the lyric text of a karaoke song is displayed as the song is played, so the user can enjoy singing while reading the lyrics. In the present invention, the lyrics of the karaoke song are replaced with a string of similar-sounding English words and displayed in that form, for the convenience of foreign users. Specifically, when the user selects karaoke music data, the selection is accepted by the music accepting means, and the phonetic symbol assigning means assigns a phonetic symbol to each word of the lyric text in the lyric data of that song.

Then, in order to replace the lyric text, to which a phonetic symbol has been assigned for each word as described above, with similar-sounding English words, the similarity deriving means compares the phonetic symbols of the lyric text with those of English word candidates for replacement and derives a similarity. The replacement lyric generating means then selects and arranges a plurality of the English word candidates so that the similarity is maximized, and generates replacement lyric data (third lyric data).
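The comparison and selection described above can be illustrated with a minimal sketch. The text does not specify how the similarity is computed, so the sequence-matching ratio from Python's `difflib` is used here purely as a stand-in, and the candidate table and its phoneme lists are hypothetical toy data.

```python
from difflib import SequenceMatcher

# Hypothetical toy table: English word candidates -> rough phoneme lists.
ENGLISH_CANDIDATES = {
    "I know": ["a", "i", "n", "o"],
    "Cut Bar": ["k", "a", "t", "b", "a"],
    "Sir Way": ["s", "a", "w", "e", "i"],
}


def similarity(source, candidate):
    """Return a 0..1 score for two phoneme sequences (stand-in metric)."""
    return SequenceMatcher(None, source, candidate, autojunk=False).ratio()


def best_candidate(source, candidates=ENGLISH_CANDIDATES):
    """Pick the candidate whose phonemes are most similar to the source."""
    return max(candidates, key=lambda word: similarity(source, candidates[word]))


if __name__ == "__main__":
    # Phonemes of the Japanese word "ano" in this toy notation.
    print(best_candidate(["a", "n", "o"]))
```

Any other phonetic distance could be substituted for `similarity` without changing the selection step.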

The lyrics of such a karaoke song (written in Japanese, for example) may contain words that were originally English (so-called loanwords). Such a word is easier for the user to pronounce if it is restored to its original English spelling rather than processed as described above. Therefore, in the present invention, the English word lyric generating means extracts the lyric portions corresponding to English words from the lyric data (first lyric data) of the accepted karaoke music data and generates second lyric data in which those portions are written in the alphabet. The phonetic symbol assigning means assigns phonetic symbols only to the remaining lyric text that was not extracted, the similarity deriving means derives the similarity based on those symbols, and the third lyric data for replacement is then generated.

The lyric synthesizing means then combines the second lyric data with the generated third lyric data to produce new lyric data, which is displayed by the display means.
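A rough sketch of the loanword extraction and the final synthesis follows. It assumes the lyrics are already split into word tokens; the loanword table, the token positions used as keys, and the function names are all hypothetical and only illustrate how the second and third lyric data might be kept apart and then recombined in the original order.

```python
# Hypothetical loanword table: katakana form -> original English spelling.
LOANWORDS = {"ラブ": "love", "ソング": "song"}


def split_lyrics(tokens, loanwords=LOANWORDS):
    """Separate loanwords (second lyric data) from the remaining tokens.

    Both results are keyed by token position so they can be merged later.
    """
    second = {i: loanwords[t] for i, t in enumerate(tokens) if t in loanwords}
    remaining = {i: t for i, t in enumerate(tokens) if t not in loanwords}
    return second, remaining


def merge_lyrics(second, third):
    """Combine second and third lyric data back into original token order."""
    merged = {**second, **third}
    return [merged[i] for i in sorted(merged)]


if __name__ == "__main__":
    second, remaining = split_lyrics(["あの", "ラブ", "ソング"])
    # The remaining tokens would go through the phonetic replacement step;
    # here its result (third lyric data) is simply supplied by hand.
    third = {0: "I know"}
    print(merge_lyrics(second, third))
```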

As a result, in the present invention, loanword portions are included as the original English words written in the alphabet, while the remaining portions are displayed as a string of similar-sounding English words. Thus, even a user who is a foreigner whose native language is English (or who uses English in daily life) can sing in a pronunciation close to that of the original karaoke song. In particular, in the present invention, passage/phrase data representing predetermined passages and phrases, such as the so-called A melody, B melody, and chorus, is associated with the lyric data corresponding to the karaoke music data. The derivation of the similarity by the similarity deriving means and the arrangement of the plurality of English word candidates by the replacement lyric generating means are performed for each passage or phrase. Consequently, even when lyrics straddle two passages or phrases, the replacement of those lyrics with English words is performed separately for each passage or phrase (that is, with the lyrics split in two). Compared with simply replacing the lyric sentences with a string of similar-sounding English words, this reliably achieves singing in a manner closer to the original song, reflecting the flow and breaks of the music.
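The per-phrase handling can be sketched as a grouping step that runs before any English word replacement. The representation below (syllables tagged with a phrase index) and the sample data are assumptions made for illustration only; they are not taken from the figures.

```python
from itertools import groupby


def split_by_phrase(syllables):
    """Group consecutive (syllable, phrase_index) pairs into per-phrase lists."""
    return [
        (phrase, [s for s, _ in group])
        for phrase, group in groupby(syllables, key=lambda pair: pair[1])
    ]


if __name__ == "__main__":
    # Hypothetical data: the word "kotoba" straddles phrases 1 and 2,
    # so its syllables end up in two separate groups.
    tagged = [("a", 1), ("no", 1), ("ko", 1), ("to", 2), ("ba", 2)]
    for phrase, group in split_by_phrase(tagged):
        print(phrase, group)
```

Each resulting group would then be matched against English word candidates on its own, so that no replacement word spans a phrase boundary.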

In a second invention according to the first invention, note length data corresponding to the karaoke music data is associated with the first lyric data, and the similarity deriving means, when comparing the phonetic symbols of the lyric text with those of the English word candidates, derives the similarity using a predetermined weighting that is larger the longer the note represented by the note length data and smaller the shorter the note.

When lyrics are sung, words carried on long notes are sounded for a long time and are therefore heard clearly, whereas words carried on short notes are sounded only briefly and are heard only indistinctly. In the present invention, weighting that takes this into account is applied when the similarity is derived by comparing the phonetic symbols of the lyric text with those of the English word candidates.

That is, in the present invention, note length data (for example, whole notes, quarter notes, and eighth notes) is associated with the lyric data (first lyric data) of the karaoke music data. When the similarity is derived, a phonetic symbol carried on a long note is given a larger weight (reflecting that it is clearly audible), whereas a phonetic symbol carried on a short note is given a smaller weight (reflecting that it is heard only indistinctly). This reliably achieves singing even closer to the original song as sung, for example, in Japanese by a Japanese singer.
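One way to picture the note-length weighting is as a weighted match ratio. In the sketch below the weight values, the position-by-position comparison, and the function name are all assumptions; the text only states that longer notes receive larger weights and shorter notes smaller ones.

```python
# Hypothetical weights keyed by note length in beats (longer -> heavier).
NOTE_WEIGHTS = {4.0: 4.0, 1.0: 2.0, 0.5: 1.0}


def weighted_similarity(source, candidate, note_lengths, weights=NOTE_WEIGHTS):
    """Weighted fraction of positions where the two symbol lists agree."""
    if not (len(source) == len(candidate) == len(note_lengths)):
        raise ValueError("sequences must have equal length")
    if not note_lengths:
        return 0.0
    total = sum(weights[n] for n in note_lengths)
    matched = sum(
        weights[n]
        for s, c, n in zip(source, candidate, note_lengths)
        if s == c
    )
    return matched / total


if __name__ == "__main__":
    lengths = [4.0, 0.5]  # a whole note followed by an eighth note
    # Matching on the long note counts for more than matching on the short one.
    print(weighted_similarity(["a", "no"], ["a", "to"], lengths))
    print(weighted_similarity(["a", "no"], ["e", "no"], lengths))
```

With these toy weights, a candidate that agrees only on the long note scores higher than one that agrees only on the short note, which is the behavior the weighting is meant to produce.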

To achieve the above object, a karaoke music processing program according to a third invention causes computing means provided in a karaoke apparatus, the karaoke apparatus having music reproduction means capable of reproducing karaoke music data with which predetermined passage/phrase data and first lyric data are associated and having display means, to execute: a music accepting procedure of accepting a user's selection of the karaoke music data; an English word lyric generating procedure of extracting, from the first lyric data corresponding to the karaoke music data accepted in the music accepting procedure, lyric portions corresponding to English words, and generating second lyric data in which those portions are written in the alphabet; a phonetic symbol assigning procedure of assigning a phonetic symbol to each word of the lyric text of the remaining portion of the first lyric data that was not extracted in the English word lyric generating procedure; a similarity deriving procedure of comparing, for each passage or phrase corresponding to the passage/phrase data, the phonetic symbols assigned to the lyric text in the phonetic symbol assigning procedure with the phonetic symbols of English word candidates for replacing the lyric text, and deriving a similarity; a replacement lyric generating procedure of selecting and arranging, for each passage or phrase and according to the result derived in the similarity deriving procedure, a plurality of the English word candidates so that the similarity is maximized, and generating third lyric data for replacement; a lyric synthesizing procedure of combining the second lyric data generated in the English word lyric generating procedure with the third lyric data generated in the replacement lyric generating procedure to generate new lyric data corresponding to the karaoke music data accepted in the music accepting procedure; and a display control procedure of causing the display means to display the new lyric data corresponding to the karaoke music data in accordance with reproduction of the karaoke music data accepted in the music accepting procedure.

According to the present invention, even a user who is a foreigner whose native language is English (or who uses English in daily life) can reliably sing in a pronunciation close to that of the original karaoke song.

A functional block diagram showing the overall configuration of a karaoke music reproduction system including a karaoke apparatus according to an embodiment of the present invention.
An explanatory diagram showing the data structure of karaoke music data.
An explanatory diagram showing the English word replacement method according to a comparative example.
An explanatory diagram showing the English word replacement method according to the embodiment.
A flowchart showing the processing procedure executed by the control unit.
A flowchart showing the detailed procedure of step S100.
A diagram showing an example of a weighting table.
An explanatory diagram illustrating how English word replacement processing is performed using weighting.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

<System overview>
FIG. 1 is a functional block diagram showing the overall configuration of a karaoke music reproduction system including a karaoke apparatus according to an embodiment of the present invention. The karaoke reproduction system 1 includes a karaoke apparatus main body 100, a remote controller 200, and an authentication server 300. The karaoke apparatus main body 100 and the remote controller 200 are installed in a karaoke room KR of, for example, a karaoke establishment that provides a karaoke music reproduction service.

The karaoke apparatus main body 100 and the remote controller 200 are connected via a network NW1, such as a wireless or wired local area network (LAN), so that they can exchange information. The karaoke apparatus main body 100 and the remote controller 200 are also connected to the authentication server 300 via the network NW1 and a network NW2 so that they can exchange information. The karaoke apparatus main body 100 and the remote controller 200 together constitute the karaoke apparatus recited in the claims.

<Karaoke apparatus main body>
The karaoke apparatus main body 100 is an apparatus that provides a karaoke music reproduction service using the Musical Instrument Digital Interface (MIDI; registered trademark) data, background video data, and lyric data (described in detail later) that constitute the karaoke music data. The karaoke apparatus main body 100 includes a control unit 101 and, each connected to the control unit 101, a mass storage device 103 containing an English word phonetic symbol database 103A and a Japanese word phonetic symbol database 103B (described in detail later), an operation unit 104, a microphone 105, a sound source 106, an audio control unit 107, a display unit 109, and a communication control unit 110.

The control unit 101 includes a CPU (not shown) and memory such as RAM and ROM. Using the RAM for temporary storage, the control unit 101 executes various programs stored in advance in the ROM or the mass storage device 103 (including the karaoke music processing program for executing the processing shown in FIGS. 5 and 6, described later), and thereby controls the karaoke apparatus main body 100 as a whole.

The mass storage device 103 is, for example, a hard disk drive (HDD). It stores karaoke music data (see FIG. 2, described later) consisting of MIDI data, background video data, lyric data in Japanese, for example (corresponding to the first lyric data), and the like, as well as various other information. The English word phonetic symbol database 103A (which stores many English words in association with their phonetic symbols) and the Japanese word phonetic symbol database 103B (which stores many Japanese words in association with their phonetic symbols) provided in the mass storage device 103 are used when the English word replacement processing described later is performed on the lyric data.

The operation unit 104 includes, for example, a plurality of keys and switches. Using the operation unit 104 or the operation unit 204 of the remote controller 200 described later, the user can perform various operations such as reserving a desired karaoke song.

The microphone 105 converts the user's karaoke singing voice into an audio signal and outputs it.

The sound source 106 reproduces the MIDI data read from the mass storage device 103 by the control unit 101 and outputs the result to the audio control unit 107.

The audio control unit 107 mixes and amplifies the performance sound output from the sound source 106 by reproducing the MIDI data and the audio signal input from the microphone 105, and outputs the result to the speaker 108.

The speaker 108 is connected to the audio control unit 107 and outputs as audio, that is, emits as sound, the MIDI data and the audio signal output from the audio control unit 107.

The sound source 106 and the audio control unit 107 function as the music reproduction means recited in the claims. Hereinafter, the sound source 106 and the audio control unit 107 are abbreviated, where appropriate, as "the sound source 106 etc."

The display unit 109 includes, for example, a liquid crystal display and displays various images. In particular, in synchronization with the reproduction of the MIDI data by the sound source 106 etc. (in other words, as that reproduction proceeds), the display unit 109 can display the background video data read from the mass storage device 103, telops (lyric text) corresponding to the lyric data, and the like.

The communication control unit 110 controls information communication with the remote controller 200 and the authentication server 300 over the networks NW1 and NW2.

<Remote controller>
The remote controller 200 is an operation terminal with which the user performs various operations such as reserving karaoke songs. The remote controller 200 includes a control unit 201, a storage device 203, an operation unit 204, a display unit 209, and a communication control unit 210.

The control unit 201 includes a CPU (not shown) and memory such as RAM and ROM. Using the RAM for temporary storage, the control unit 201 executes various programs stored in advance in the ROM or the storage device 203, and thereby controls the remote controller 200 as a whole.

The storage device 203 is, for example, a nonvolatile memory and stores various information.

The operation unit 204 includes, for example, a plurality of keys and switches. Using the operation unit 204 or the operation unit 104 of the karaoke apparatus main body 100, the user can perform various operations such as reserving karaoke songs.

The display unit 209 includes, for example, a liquid crystal display and presents various displays.

The communication control unit 210 controls information communication with the karaoke apparatus main body 100 and the authentication server 300 over the networks NW1 and NW2. In this example, the connection between the communication control unit 210 and the network NW1 is wireless.

<Authentication server>
The authentication server 300 includes a control unit 301, a mass storage device 303, and a communication control unit 310.

The control unit 301 includes a CPU (not shown) and memory such as RAM and ROM. Using the RAM for temporary storage, the control unit 301 executes various programs stored in advance in the ROM or the mass storage device 303, and thereby controls the authentication server 300 as a whole.

<Features of the embodiment>
As described above, the feature of the present embodiment is that English word replacement processing is performed on the lyric data.

<Need for English word replacement>
In a karaoke apparatus such as the one described above, the lyric text of a karaoke song is displayed on the display unit 109 as the song is played from the speaker 108, so the user can enjoy singing while reading the lyrics. When the language of the lyric text (Japanese in this example) is the user's native language (for example, the user is Japanese), the user can easily sing along with the original song (the song as sung by the singer) by reading the displayed Japanese lyrics.

In contrast, when the user is, for example, a foreigner whose native language is English (or who uses English in daily life), the user may be unable to read the lyric text displayed on the display unit 109 adequately or, even if able to read it, unable to pronounce it with the fluent flow of that language. This makes it difficult to sing along with the original song.

Therefore, in the present embodiment, the lyrics of the karaoke song are replaced with a string of similar-sounding English words and displayed in that form, for the convenience of foreign users. The details are described step by step below.

<Data structure>
FIG. 2 is an explanatory diagram showing the data structure of the karaoke music data in the present embodiment.

As shown in FIG. 2, the karaoke music data 20 includes song identification information (song ID) 21 such as the song title, score data 22, lyric data 23, and MIDI data 24 serving as performance data.

The song identification information 21 is placed as the header of the karaoke music data 20. As described above, the MIDI data 24 is reproduced by the sound source 106 etc. and emitted from the speaker 108 as sound, and the lyric data 23 is displayed on the display unit 109 as a lyric telop.

The score data 22 (corresponding to the passage/phrase data) includes at least note length data for the notes of the melody line, and also serves to indicate predetermined passages (for example, units such as the so-called A melody, B melody, and chorus) and phrases (one measure in the example described later). The score data 22 is typically represented in staff notation, as in FIGS. 4(a) and 4(d) described later.

In this example, the lyric data 23 and the score data 22 are contained in the same file as the karaoke music data 20, but this is not a limitation. The lyric data 23 and the score data 22 may instead be stored in files separate from the karaoke music data 20 containing the MIDI data 24 and the like, and associated with it by the same song title.
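The data structure described in this section can be pictured with a small sketch. The field names, the `Note` type, and the in-memory layout are assumptions for illustration; only the four components (song identification information 21, score data 22, lyric data 23, MIDI data 24) and the note length and passage/phrase roles of the score data come from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    """One melody note of the score data 22 (hypothetical representation)."""
    length_beats: float  # note length, e.g. 1.0 for a quarter note in 4/4
    phrase_index: int    # passage/phrase (here: measure) the note belongs to


@dataclass
class KaraokeSong:
    """Karaoke music data 20, sketched as a single record."""
    song_id: str                                      # song identification information 21 (header)
    score: List[Note] = field(default_factory=list)   # score data 22
    lyrics: str = ""                                  # lyric data 23
    midi: bytes = b""                                 # MIDI data 24


if __name__ == "__main__":
    song = KaraokeSong(
        song_id="demo",
        score=[Note(1.0, 1), Note(0.5, 2)],
        lyrics="あの言葉さえ言わなければ",
    )
    print(song.song_id, len(song.score))
```

The separate-file arrangement mentioned above would correspond to storing `score` and `lyrics` elsewhere and joining them to the MIDI data by `song_id`.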

<English word replacement of lyric data>
In the present embodiment, phonetic symbols are assigned, for example word by word, to the lyric data 23 in Japanese text (corresponding to the lyric text) contained in the karaoke music data 20, with reference to the Japanese word phonetic symbol database 103B. The Japanese text to which phonetic symbols have been assigned is then replaced with similar-sounding English words by referring to the English word phonetic symbol database 103A.

このとき、本実施形態では、上記英単語への置き換えの際、上記楽譜データを参酌することで、楽曲の流れや区切りを反映した、より原曲に近い態様での歌唱を確実に実現できるような置き換えを行うことができる。以下、その置き換え手法を、比較例を用いて、詳細に説明する。 In this embodiment, the score data is taken into account when the text is replaced with English words, so that the replacement reflects the flow and breaks of the song and reliably enables singing closer to the original. The replacement method is described in detail below, using a comparative example.

＜比較例＞ <Comparative example>
図３（ａ）〜（ｅ）により、本実施形態の比較例による英単語置換の手法について説明する。図３（ａ）に上記楽譜データ２２の一例を示すように、この例では「あのことばさえいわなければ〜」の歌詞となる４小節分のメロディーラインが音符化されて記載されており、コード進行は、コードＥｍ（第１小節）→コードＢｍ７（第２小節）→コードＣａｄｄ９（第３小節）→コードＢｍ７（第４小節）・・・の順となっている。 An English word replacement method according to a comparative example of this embodiment is described with reference to FIGS. 3(a) to 3(e). FIG. 3(a) shows an example of the score data 22. In this example, the four-measure melody line carrying the lyrics "あのことばさえいわなければ" (ano kotoba sae iwanakereba, roughly "if only I had not said those words") is written out as notes, and the chord progression is chord Em (first measure) → chord Bm7 (second measure) → chord Cadd9 (third measure) → chord Bm7 (fourth measure), and so on.

この比較例では、図３（ｂ）に示すように、上記４小節分の歌詞テキスト「あの言葉さえ言わなければ」に対し上記英単語置換を行う際に、（後述のようにフレーズや楽節等の音楽的要素を考慮することなく）単にテキストとして取り扱われ、文節によって区切られて処理される。すなわち、図３（ｃ）に示すように、「あの」「言葉」「さえ」「言わ」「なければ」の５つに区切られる。 In this comparative example, as shown in FIG. 3(b), when the English word replacement is performed on the four-measure lyrics text "あの言葉さえ言わなければ", the lyrics are handled simply as text (without considering musical elements such as phrases and passages, described later) and are split at clause boundaries. That is, as shown in FIG. 3(c), the text is divided into five parts: "あの" (ano), "言葉" (kotoba), "さえ" (sae), "言わ" (iwa), and "なければ" (nakereba).

そして、このようにして区切った各文節ごとに、発音が似ている英単語にそれぞれ置き換えられる。この例では、日本語テキスト「あの」は、英単語「Ｉｋｎｏｗ」に置き換えられ、日本語テキスト「言葉」は、英単語「ＣｕｔＢａｒ」に置き換えられ、日本語テキスト「さえ」は、英単語「ＳｉｒＷａｙ」に置き換えられ、日本語テキスト「言わ」は、英単語「ＥａｒＷｏｒｄ」に置き換えられ、日本語テキスト「なければ」は、英単語「ＮａｋｅｄＬｅｖｅｒ」に置き換えられる。 Each clause obtained in this way is replaced with similar-sounding English words. In this example, the Japanese text "あの" is replaced with the English words "I know", "言葉" with "Cut Bar", "さえ" with "Sir Way", "言わ" with "Ear Word", and "なければ" with "Naked Lever".

しかしながら、この比較例におけるこのような英単語置換の手法では、図３（ｅ）に示すように、例えば原曲の「あ」の音は比較的長い音であることから、「Ｉ」に置き換えた場合は両者の発音の違いが際だってしまい、原曲っぽく歌唱することができない。また、例えば原曲の「言葉」は２小節にまたがって発音されるが、これを「ＣｕｔＢａｒ」で置き換えると、２小節にまたがるメロディーラインの譜割りにうまく乗り切れないため、原曲っぽく歌唱することができない。また、原曲の「なければ」の「なけ」も２小節にまたがって発音されるが、これを「Ｎａｋｅｄ」で置き換えると、上記同様、２小節にまたがるメロディーラインの譜割りにうまく乗り切れず、原曲っぽく歌唱することができない。 However, this replacement method has drawbacks, as shown in FIG. 3(e). For example, the "あ" (a) in the original song is a relatively long note, so replacing it with "I" makes the difference in pronunciation stand out, and the result does not sound like the original song. Likewise, "言葉" (kotoba) in the original is sung across two measures, but "Cut Bar" does not fit the rhythmic division of a melody line spanning two measures, so again the singing does not sound like the original. The "なけ" (nake) of "なければ" is also sung across two measures, and replacing it with "Naked" fails to fit the melody line in the same way.

＜実施形態＞ <Embodiment>
上記比較例の手法に対し、本実施形態における英単語置換の手法では、フレーズや楽節等の音楽的要素が考慮された処理が実行される。すなわち、上記図３（ａ）及び図３（ｂ）と同様の図４（ａ）及び図４（ｂ）に示す４小節分の歌詞テキスト「あの言葉さえ言わなければ」に対して、フレーズや楽節等の音楽的要素（この例では小節の区切り）が考慮されて、図４（ｃ）に示すように、「あの言」「葉さえ」「言わな」「ければ」の４つ（すなわちこの例では小節数と同じ数）の日本語テキストの語句に区切られる。 In contrast to the comparative example, the English word replacement method of this embodiment takes musical elements such as phrases and passages into account. That is, for the four-measure lyrics text "あの言葉さえ言わなければ" shown in FIGS. 4(a) and 4(b), which are the same as FIGS. 3(a) and 3(b), musical elements (in this example, the measure boundaries) are considered, and as shown in FIG. 4(c) the text is divided into four Japanese text segments, the same number as the number of measures in this example: "あの言" (ano koto), "葉さえ" (ba sae), "言わな" (iwa na), and "ければ" (kereba).
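The measure-based split described above can be sketched as follows. This is a minimal illustration only: the representation of the lyrics as (kana, measure number) pairs and the function name `split_by_measure` are assumptions made for this example, not something the specification defines.

```python
from itertools import groupby

# Each entry pairs one kana syllable with the 1-based measure it is sung in.
# The measure assignment mirrors the four-way split described in the text;
# the data layout itself is a hypothetical choice for illustration.
SYLLABLES = [
    ("あ", 1), ("の", 1), ("こ", 1), ("と", 1),
    ("ば", 2), ("さ", 2), ("え", 2),
    ("い", 3), ("わ", 3), ("な", 3),
    ("け", 4), ("れ", 4), ("ば", 4),
]


def split_by_measure(syllables):
    """Group consecutive syllables that share a measure number."""
    return [
        "".join(kana for kana, _ in group)
        for _, group in groupby(syllables, key=lambda item: item[1])
    ]


segments = split_by_measure(SYLLABLES)
```

Because the grouping key is the measure number rather than a clause boundary, a word such as "ことば" is cut wherever the bar line falls, which is the behavior that distinguishes the embodiment from the comparative example.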

そして、この４つの語句それぞれごとに、発音が似ている英単語、詳細には後述のように発音記号同士が類似している英単語（英単語候補に相当）にそれぞれ置き換えられる。その際、本実施形態ではさらに、上記音符長が配慮されて、音符長が長いほど精度良く類似するような英単語が選ばれる一方、音符長が短い場合には類似性が低くても前後の単語とのリズムバランスがよいような英単語が選ばれる。 Each of these four segments is then replaced with similar-sounding English words, more specifically English words whose phonetic symbols are similar (corresponding to English word candidates), as described later. In doing so, this embodiment also takes the note length into account: the longer the note, the more closely matching an English word is chosen, whereas for a short note a word with good rhythmic balance with the neighboring words is chosen even if its similarity is lower.

例えば、この例では、日本語テキスト「あの言」において、「あ」に係わる音符が付点四分音符であって比較的長いことから、「あ」は一文字でより高い類似度となる「Ａｎｎｅ」に置換され、「こ」「と」に係わる音符は八分音符であって比較的短いので、（比較的低い類似度であっても）両者同士のリズムバランスのよい「Ｇｏ」「Ｔｏ」に置換される。すなわち、日本語テキスト「あの言」は、英単語「ＡｎｎｋｎｏｗＧｏＴｏ」に置き換えられる。 For example, in the Japanese text "あの言", the note for "あ" (a) is a dotted quarter note and relatively long, so "あ" alone is replaced with "Anne", which gives a higher similarity. The notes for "こ" (ko) and "と" (to) are eighth notes and relatively short, so they are replaced with "Go" and "To", which balance well rhythmically with each other (even though their similarity is relatively low). That is, the Japanese text "あの言" is replaced with the English words "Ann know Go To".

また、日本語テキスト「葉さえ」において、「さ」に係わる音符が八分音符であって比較的短いことから、上記同様、比較的低い類似度であっても、「ば」と組み合わせたときのリズムバランスのよい「Ｂｕｓ」の一部とされる。すなわち、日本語テキスト「葉さえ」は、英単語「ＢｕｓＡｗａｙ」に置き換えられる。 Likewise, in the Japanese text "葉さえ", the note for "さ" (sa) is an eighth note and relatively short, so, as above, it is made part of "Bus", which balances well rhythmically when combined with "ば" (ba), even though the similarity is relatively low. That is, the Japanese text "葉さえ" is replaced with the English words "Bus Away".

さらに、日本語テキスト「言わな」において、「わ」に係わる音符が付点四分音符であって比較的長いことから、「わ」は１文字でより高い類似度となる「Ｗａｒｎ」に置換され、「な」に係わる音符は四分音符であり中程度の長さであるので、類似度は中程度であっても次の単語（後述の「Ｋａｔｅ」）とのリズムバランスを重視して「Ｎｏｗ」とされる。すなわち、日本語テキスト「言わな」は、英単語「ＥａｒＷａｒｎＮｏｗ」に置き換えられる。 In the Japanese text "言わな", the note for "わ" (wa) is a dotted quarter note and relatively long, so "わ" alone is replaced with "Warn", which gives a higher similarity. The note for "な" (na) is a quarter note of medium length, so "Now" is chosen with emphasis on rhythmic balance with the following word ("Kate", described below), even though its similarity is only moderate. That is, the Japanese text "言わな" is replaced with the English words "Ear Warn Now".

さらに、日本語テキスト「ければ」において、「け」に係わる音符は四分音符であり中程度の長さであるので、類似度は中程度であっても、前の単語（上述の「Ｎｏｗ」）とのリズムバランスを重視して「Ｋａｔｅ」とされる。すなわち、日本語テキスト「ければ」は、英単語「ＫａｔｅＬｅｖｅｒ」に置き換えられる。 In the Japanese text "ければ", the note for "け" (ke) is a quarter note of medium length, so "Kate" is chosen with emphasis on rhythmic balance with the preceding word ("Now", described above), even though its similarity is only moderate. That is, the Japanese text "ければ" is replaced with the English words "Kate Lever".

以上の手法により、本実施形態では、上記比較例に比べて、楽曲の流れや区切りを反映しつつ、より原曲に近い態様での歌唱を確実に実現することができる。 According to the above method, in this embodiment, it is possible to reliably realize singing in a mode closer to the original music while reflecting the flow and break of the music as compared with the comparative example.

なお、上述の図示では説明を省略したが、上記カラオケ楽曲データ２０に含まれる、日本語テキストによる歌詞データ２３に、もとが英語である単語（いわゆる外来語）が含まれる場合がある。このような場合には、当該単語は上述のような処理をせず、本来の英語アルファベット表記に戻すほうが上記外国人ユーザにとっては発音がしやすい。したがって、本実施形態では、上述の英単語置換処理を行う前に、まず、歌詞データ２３の中から、英単語に対応した歌詞部分が抽出され、それをアルファベット化した歌詞データ（第２歌詞データに相当）が生成される（後述の図５のステップＳ１５参照）。そして、上記において抽出されずに残った日本語の歌詞データに対し、上述の英単語置換処理が行われる（後述の図５のステップＳ２５及びステップＳ１００等参照）。 Although explanation is omitted in the above-described illustration, the lyric data 23 based on the Japanese text included in the karaoke song data 20 may include words originally in English (so-called foreign words). In such a case, it is easier for the foreign user to pronounce the word if the word is not processed as described above and is returned to the original English alphabet. Therefore, in the present embodiment, before performing the above-described English word replacement process, first, the lyrics portion corresponding to the English word is extracted from the lyrics data 23, and the lyrics data (second lyrics data) is alphabetized. (Refer to step S15 in FIG. 5 described later). Then, the above-described English word replacement process is performed on the remaining Japanese lyrics data that has not been extracted (see step S25 and step S100 in FIG. 5 described later).

＜制御フロー＞ <Control flow>
次に、上記手法を実行するために、カラオケ装置本体１００の制御部１０１によって実行される制御手順を、図５により説明する。このフローは、例えばユーザが、歌唱を意図する曲の選曲番号をリモコン２００の操作部２０４やカラオケ装置本体１００の操作部１０４を介して入力することで、その選曲が制御部１０１によって受け付けられることによって、開始される。なお、この選曲を受け付ける制御部１０１の機能が、各請求項記載の楽曲受付手段として機能する。 Next, the control procedure executed by the control unit 101 of the karaoke apparatus main body 100 to carry out the above method is described with reference to FIG. 5. This flow starts when, for example, the user enters the selection number of the song to be sung via the operation unit 204 of the remote controller 200 or the operation unit 104 of the karaoke apparatus main body 100, and the control unit 101 accepts that selection. The function of the control unit 101 that accepts this song selection serves as the music receiving means recited in the claims.

図５において、まず、ステップＳ５で、制御部１０１は、上記ユーザによる選曲に対応したカラオケ楽曲データの送信を要求するリクエスト信号を、ネットワークＮＷ２を介して認証サーバ３００へ送信する。これにより、認証サーバ３００は、大容量記憶装置３０３から、上記リクエスト信号に示される選曲番号に対応するカラオケ楽曲データ２０（歌詞データ２３及び楽譜データ２２を含む）等を検索して読み出して制御装置２０に送信する。そして、制御部１０１は、認証サーバ３００から送信された上記カラオケ楽曲データ２０から、（日本語テキストからなる）上記歌詞データ２３を読み込む。その後、ステップＳ１０に移る。 In FIG. 5, first, in step S5, the control unit 101 transmits, via the network NW2 to the authentication server 300, a request signal requesting transmission of the karaoke song data corresponding to the user's song selection. In response, the authentication server 300 searches the large-capacity storage device 303 for the karaoke song data 20 (including the lyrics data 23 and the score data 22) corresponding to the selection number indicated in the request signal, reads it out, and transmits it to the control device 20. The control unit 101 then reads the lyrics data 23 (consisting of Japanese text) from the karaoke song data 20 transmitted from the authentication server 300. The process then proceeds to step S10.

ステップＳ１０では、制御部１０１は、上記ステップＳ５で読み込んだ歌詞データ２３を順次チェックし、前述したもとが英語である単語（いわゆる外来語）の歌詞が含まれているか否かを判定する。外来語が含まれていなければステップＳ１０の判定が満たされず（Ｓ１０：ＮＯ）、後述のステップＳ２０に移る。外来語が含まれていた場合にはステップＳ１０の判定が満たされ（Ｓ１０：ＹＥＳ）、ステップＳ１５に移る。 In step S10, the control unit 101 sequentially checks the lyrics data 23 read in step S5 and determines whether it contains lyrics that are words of English origin (so-called foreign words), as described above. If no foreign word is included, the determination in step S10 is not satisfied (S10: NO), and the process proceeds to step S20 described later. If a foreign word is included, the determination in step S10 is satisfied (S10: YES), and the process proceeds to step S15.

ステップＳ１５では、上記ステップＳ１０で識別された外来語の歌詞部分（例えばカタカナにより日本語表記されている）を、アルファベット化した新たな歌詞データを生成し、適宜の箇所（例えば上記ＲＡＭ）に記憶する。 In step S15, new lyrics data is generated in which the foreign-word lyrics portion identified in step S10 (written in Japanese, for example in katakana) is converted to alphabetic spelling, and this data is stored in a suitable location (for example, the RAM).
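Steps S10 and S15 can be pictured with the short sketch below. It is only an illustration under stated assumptions: foreign words are detected as runs of katakana, and the lookup table `LOANWORDS` and the function `alphabetize_loanwords` are hypothetical names invented here. The specification does not say how foreign words are recognized or spelled out, and this sketch substitutes the words in place rather than building a separate second lyrics data set.

```python
import re

# Runs of characters from the Unicode Katakana block (U+30A0 to U+30FF).
KATAKANA_RUN = re.compile(r"[\u30A0-\u30FF]+")

# Hypothetical table from katakana spelling to the original English word.
LOANWORDS = {"ラブ": "Love", "ソング": "Song"}


def alphabetize_loanwords(lyrics):
    """Return (lyrics with known foreign words spelled alphabetically, extracted pairs)."""
    extracted = []

    def restore(match):
        word = match.group(0)
        english = LOANWORDS.get(word)
        if english is None:
            return word  # unknown katakana run: leave it for later processing
        extracted.append((word, english))
        return english

    return KATAKANA_RUN.sub(restore, lyrics), extracted


text, found = alphabetize_loanwords("あのラブをうたう")
```

Katakana runs that are not in the table are left untouched, so they would fall through to the phonetic-symbol conversion of step S25 like any other Japanese text.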

そして、ステップＳ２０で、制御部１０１は、ステップＳ５で読み込んだ上記歌詞データ２３のうちの全部について、上記ステップＳ１０及びステップＳ１５の処理が終了したか否かを判定する。歌詞データ２３の全部について処理がまだ終了していなければ、ステップＳ２０の判定が満たされず（Ｓ２０：ＮＯ）、ステップＳ１０に戻って同様の処理を繰り返す。歌詞データ２３の全部について処理が終了したら、ステップＳ２０の判定が満たされ（Ｓ２０：ＹＥＳ）、ステップＳ２５に移る。なお、上記ステップＳ１０、ステップＳ１５、ステップＳ２０が、各請求項記載の英単語歌詞生成手順に相当すると共に、これらのステップを実行する制御部１０１が、各請求項記載の英単語歌詞生成手段として機能する。 In step S20, the control unit 101 determines whether the processing in steps S10 and S15 has been completed for all of the lyrics data 23 read in step S5. If the processing has not yet been completed for all of the lyrics data 23, the determination in step S20 is not satisfied (S20: NO), and the process returns to step S10 and repeats the same processing. When the processing has been completed for all of the lyrics data 23, the determination in step S20 is satisfied (S20: YES), and the process proceeds to step S25. Steps S10, S15, and S20 correspond to the English word lyrics generation procedure recited in the claims, and the control unit 101 executing these steps functions as the English word lyrics generation means recited in the claims.

ステップＳ２５では、制御部１０１は、歌詞データ２３のうち、上記ステップＳ１５でアルファベット化されなかった残りの日本語歌詞部分を、上記日本語単語発音記号データベース１０３Ｂを参照しつつ、発音記号に変換する。この発音記号の変換は、公知の適宜の手法（例えば、ジョーンズ式発音記号等）により行えば足りるので詳細な説明を省略する。 In step S25, the control unit 101 converts the remaining Japanese lyrics part of the lyrics data 23 that has not been alphabetized in step S15 into phonetic symbols while referring to the Japanese word phonetic symbol database 103B. . Since the phonetic symbols need only be converted by a known appropriate method (for example, Jones type phonetic symbols), detailed description thereof will be omitted.

そして、ステップＳ３０で、制御部１０１は、上記残りの日本語歌詞部分の全部について、上記ステップＳ２５の処理が終了したか否かを判定する。上記残りの日本語歌詞部分の全部について処理がまだ終了していなければ、ステップＳ３０の判定が満たされず（Ｓ３０：ＮＯ）、ステップＳ２５に戻って同様の処理を繰り返す。上記残りの日本語歌詞部分の全部について処理が終了したら、ステップＳ３０の判定が満たされ（Ｓ３０：ＹＥＳ）、ステップＳ１００に移る。なお、上記ステップＳ２５及びステップＳ３０が、各請求項記載の英発音記号付与手順に相当すると共に、これらのステップを実行する制御部１０１が、各請求項記載の発音記号付与手段として機能する。 In step S30, the control unit 101 determines whether or not the processing in step S25 has been completed for all the remaining Japanese lyrics. If the process has not been completed for all the remaining Japanese lyrics parts, the determination in step S30 is not satisfied (S30: NO), and the process returns to step S25 and the same process is repeated. When the process is completed for all the remaining Japanese lyrics, the determination in step S30 is satisfied (S30: YES), and the process proceeds to step S100. The steps S25 and S30 correspond to the English phonetic symbol assigning procedure described in each claim, and the control unit 101 that executes these steps functions as the phonetic symbol assigning unit described in each claim.

ステップＳ１００では、制御部１０１は、英単語置換処理を実行する。この英単語置換処理は、図４を用いて前述したように、上記楽譜データ２２に基づき、所定の楽節やフレーズごとに（前述の例では１小節ごとに）行われる。 In step S100, the control unit 101 executes English word replacement processing. As described above with reference to FIG. 4, this English word replacement process is performed for each predetermined passage or phrase (in the above example, for each measure) based on the score data 22.

＜英単語置換処理＞ <English word replacement process>
上記ステップＳ１００の英単語置換処理の詳細を図６に示す。図６において、制御部１０１は、まずステップＳ１０５で、上記英単語発音記号データベース１０３Ａを参照し、前述のようにして変換して得られた歌詞データ２３全編にわたる発音記号のうち、当該１フレーズに含まれる発音記号よりも少ない発音記号数となる、すべての英単語を抽出する。 FIG. 6 shows the details of the English word replacement process in step S100. In FIG. 6, first, in step S105, the control unit 101 refers to the English word phonetic symbol database 103A and extracts all English words that have fewer phonetic symbols than the current phrase contains, where the phrase's phonetic symbols are taken from those obtained for the whole of the lyrics data 23 by the conversion described above.

その後、ステップＳ１１０で、制御部１０１は、後述のステップＳ１２０における類似度の算出を、上記ステップＳ１０５で抽出されたすべての英単語について終了したか否かを判定する。すべての英単語について類似度算出が完了するまではステップＳ１１０の判定が満たされず（Ｓ１１０：ＮＯ）、ステップＳ１１５に移る。ステップＳ１１５では、制御部１０１は、後述のステップＳ１２０における類似度の算出を、上記１フレーズ中のすべての発音記号について終了したか否かを判定する。すべての発音記号について類似度算出が終了するまではステップＳ１１５の判定が満たされず（Ｓ１１５：ＮＯ）、ステップＳ１２０に移る。ステップＳ１２０では、上記１フレーズ中に含まれる全発音記号のうちの、適宜の数の発音記号と上記ステップＳ１０５で抽出した英単語の発音記号とを公知の手法で順次比較し、類似度を算出する。 Then, in step S110, the control unit 101 determines whether the similarity calculation in step S120, described later, has been completed for all English words extracted in step S105. Until the similarity has been calculated for all English words, the determination in step S110 is not satisfied (S110: NO), and the process proceeds to step S115. In step S115, the control unit 101 determines whether the similarity calculation in step S120 has been completed for all phonetic symbols in the phrase. Until it has, the determination in step S115 is not satisfied (S115: NO), and the process proceeds to step S120. In step S120, a suitable number of the phonetic symbols contained in the phrase are compared in sequence, by a known method, with the phonetic symbols of an English word extracted in step S105, and the similarity is calculated.

なお、上記類似度の算出の際、発音記号は異なるが発音自体は類似するもの（例えば日本語テキスト「ず」に対応する「ｄｚｗ」「ｚｗ」「ｔｈ」等）については、相互に同等で互換性があるものとして類似度の算出を行う。さらに、前述のようにして対応する音符が長い発音記号については高い類似度により置換を行い、対応する音符が長くない発音記号については低い類似度により置換が行われる。そのために、このステップＳ１２０では、制御部１０１は、楽譜データ２２に含まれる音符長データに基づき、対応する音符長が長いものほど類似度算出時の重み付けが重くされ、逆に対応する音符長が短いものほど重み付けが軽くされる。図７に、そのような重み付けテーブルの一例を表す。 When calculating the similarity, phonetic symbols that differ but whose pronunciation is similar (for example, "dzw", "zw", and "th", which correspond to the Japanese text "ず") are treated as mutually equivalent and interchangeable. In addition, as described above, phonetic symbols whose corresponding note is long are replaced on the basis of a high similarity, and phonetic symbols whose corresponding note is not long are replaced on the basis of a lower similarity. To that end, in step S120 the control unit 101 uses the note length data included in the score data 22 to weight the similarity calculation more heavily the longer the corresponding note is, and more lightly the shorter it is. FIG. 7 shows an example of such a weighting table.
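The treatment of similar-sounding phonetic symbols as interchangeable can be sketched as a lookup into equivalence groups. The group below contains only the three symbols named in the text; the names `EQUIVALENT_GROUPS`, `CANONICAL`, and `symbols_match` are assumptions for illustration, and a real table would hold many more groups.

```python
# Groups of phonetic symbols treated as interchangeable. Only the group
# given as an example in the text is listed here.
EQUIVALENT_GROUPS = [{"dzw", "zw", "th"}]

# Map every symbol in a group to one representative of that group.
CANONICAL = {
    symbol: min(group)
    for group in EQUIVALENT_GROUPS
    for symbol in group
}


def symbols_match(a, b):
    """True if two phonetic symbols are identical or belong to the same group."""
    return CANONICAL.get(a, a) == CANONICAL.get(b, b)
```

Symbols that appear in no group fall back to plain equality, so the comparison behaves as an ordinary match check until equivalences are registered.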

図７に示す例では、互いに音符長が異なる５種類の音符それぞれに対し、異なる重み値が対応づけられている。すなわち、全音符に対しては重み値１６が対応づけられ、２分音符に対しては重み値８が対応づけられ、４分音符に対しては重み値４が対応づけられ、８分音符に対しては重み値２が対応づけられ、１６分音符に対しては重み値１が対応づけられる。このような重み値を用いたときの類似度の算出の例を図８に示す。 In the example shown in FIG. 7, a different weight value is associated with each of five kinds of notes of different lengths: weight 16 for a whole note, 8 for a half note, 4 for a quarter note, 2 for an eighth note, and 1 for a sixteenth note. FIG. 8 shows an example of the similarity calculation using these weight values.

図８において、この例では、ある１フレーズ中のメロディーラインに、全音符、２分音符、１６分音符（１番目）、４分音符、８分音符、１６分音符（２番目）、及び、１６分音符（３番目）の合計７つの音符がこの順序で含まれている場合を例に取っている。そして、それぞれの音符に乗せられる日本語歌詞部分のテキストの発音記号が、上記の順に、上記全音符については「○○」（単なる説明用の表記である。以下同様）、上記２分音符については「××」、上記１６分音符（１番目）については「★★」、上記４分音符については「△△」、上記８分音符については「□□」、上記１６分音符（２番目）については「※※」、上記１６分音符（３番目）については「※※」、となっている。 The example in FIG. 8 assumes that the melody line in one phrase contains seven notes in this order: a whole note, a half note, a sixteenth note (first), a quarter note, an eighth note, a sixteenth note (second), and a sixteenth note (third). The phonetic symbols of the Japanese lyrics sung on these notes are, in the same order, "○○" for the whole note (a placeholder notation used only for explanation; the same applies below), "××" for the half note, "★★" for the first sixteenth note, "△△" for the quarter note, "□□" for the eighth note, "※※" for the second sixteenth note, and "※※" for the third sixteenth note.

そして、このような日本語歌詞部分に対して、この例では『英単語１』と『英単語２』の２つが、置換用の英単語として類似度の算出対象となっている。『英単語１』の発音記号は、上記の順に沿って、上記全音符については「××」、上記２分音符については「××」、上記１６分音符（１番目）については「△△」、上記４分音符については「△△」、上記８分音符については「□□」、上記１６分音符（２番目）については「※※」、上記１６分音符（３番目）については「※※」、となっている。すなわち、英単語１は、２分音符、４分音符、８分音符、１６分音符（２番目）、及び、１６分音符（３番目）の合計５つの音符について、発音記号が一致している。一致した場合の（重み付け前の）類似度を１、不一致の場合の（重み付け前の）類似度を０、とすると、上記重み付けを加味したこの英単語１の類似度は、
１６（全音符重み値）×０＋８（２分音符重み値）×１＋１（１６分音符重み値）×０＋４（４分音符重み値）×１＋２（８分音符重み値）×１＋１（１６分音符重み値）×１＋１（１６分音符重み値）×１
＝０＋８＋０＋４＋２＋１＋１
＝１６
となる。
For these Japanese lyrics, two candidates, "English word 1" and "English word 2", are evaluated as replacement English words in this example. The phonetic symbols of English word 1 are, in the same order, "××" for the whole note, "××" for the half note, "△△" for the first sixteenth note, "△△" for the quarter note, "□□" for the eighth note, "※※" for the second sixteenth note, and "※※" for the third sixteenth note. English word 1 therefore matches on five notes: the half note, the quarter note, the eighth note, and the second and third sixteenth notes. Taking the unweighted similarity as 1 for a match and 0 for a mismatch, the weighted similarity of English word 1 is:
16 (whole note weight) × 0 + 8 (half note weight) × 1 + 1 (sixteenth note weight) × 0 + 4 (quarter note weight) × 1 + 2 (eighth note weight) × 1 + 1 (sixteenth note weight) × 1 + 1 (sixteenth note weight) × 1
= 0 + 8 + 0 + 4 + 2 + 1 + 1
= 16

一方、『英単語２』の発音記号は、上記の順に沿って、上記全音符については「○○」、上記２分音符については「△△」、上記１６分音符（１番目）については「□□」、上記４分音符については「△△」、上記８分音符については「□□」、上記１６分音符（２番目）については「＊＊」、上記１６分音符（３番目）については「★★」、となっている。すなわち、英単語２は、全音符、４分音符、８分音符の合計３つの音符について、発音記号が一致している。したがって、上記同様、重み付けを加味したこの英単語２の類似度は、
１６（全音符重み値）×１＋８（２分音符重み値）×０＋１（１６分音符重み値）×０＋４（４分音符重み値）×１＋２（８分音符重み値）×１＋１（１６分音符重み値）×０＋１（１６分音符重み値）×０
＝１６＋０＋０＋４＋２＋０＋０
＝２２
となる。したがって、この例では、見かけ上一致した発音記号の数が少ない（３個）英単語２のほうが、見かけ上一致した発音記号の数が多い（５個）英単語１よりも、類似度が高くなる。なお、上記ステップＳ１２０が、各請求項記載の類似度導出手順に相当すると共に、このステップを実行する制御部１０１が、各請求項記載の類似度導出手段として機能する。
The phonetic symbols of English word 2 are, in the same order, "○○" for the whole note, "△△" for the half note, "□□" for the first sixteenth note, "△△" for the quarter note, "□□" for the eighth note, "＊＊" for the second sixteenth note, and "★★" for the third sixteenth note. English word 2 therefore matches on three notes: the whole note, the quarter note, and the eighth note. As before, the weighted similarity of English word 2 is:
16 (whole note weight) × 1 + 8 (half note weight) × 0 + 1 (sixteenth note weight) × 0 + 4 (quarter note weight) × 1 + 2 (eighth note weight) × 1 + 1 (sixteenth note weight) × 0 + 1 (sixteenth note weight) × 0
= 16 + 0 + 0 + 4 + 2 + 0 + 0
= 22
Therefore, in this example, English word 2, which has fewer apparently matching phonetic symbols (three), has a higher similarity than English word 1, which has more (five). Step S120 corresponds to the similarity deriving procedure recited in the claims, and the control unit 101 executing this step functions as the similarity deriving means recited in the claims.
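The FIG. 8 arithmetic can be reproduced with a few lines of Python. This is a minimal sketch: the weights and placeholder symbols are taken from the text, while the note-name strings and the function name `weighted_similarity` are labels chosen here for illustration, and an exact-match test stands in for whatever known comparison method the apparatus actually uses.

```python
# Weight per note length, as in the FIG. 7 example.
WEIGHTS = {"whole": 16, "half": 8, "quarter": 4, "eighth": 2, "sixteenth": 1}

# (note length, placeholder phonetic symbol of the Japanese lyrics), in song order.
PHRASE = [
    ("whole", "○○"), ("half", "××"), ("sixteenth", "★★"), ("quarter", "△△"),
    ("eighth", "□□"), ("sixteenth", "※※"), ("sixteenth", "※※"),
]

# Placeholder phonetic symbols of the two candidate English words.
WORD_1 = ["××", "××", "△△", "△△", "□□", "※※", "※※"]
WORD_2 = ["○○", "△△", "□□", "△△", "□□", "＊＊", "★★"]


def weighted_similarity(phrase, candidate):
    """Sum the note weights at every position where the symbols match."""
    return sum(
        WEIGHTS[note]
        for (note, symbol), cand in zip(phrase, candidate)
        if symbol == cand
    )


def match_count(phrase, candidate):
    """Unweighted number of matching positions, for comparison."""
    return sum(symbol == cand for (_, symbol), cand in zip(phrase, candidate))


score_1 = weighted_similarity(PHRASE, WORD_1)  # 16
score_2 = weighted_similarity(PHRASE, WORD_2)  # 22
```

English word 2 wins despite matching on fewer positions because its single whole-note match outweighs all five of English word 1's matches combined.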

図６に戻り、以上のようにしてステップＳ１２０が終了すると、前述のステップＳ１１５に戻り、同様の手順を繰り返す。これにより、ステップＳ１０５で抽出された全英単語のうち、１つの英単語について上記１フレーズ中に含まれる適宜の数の発音記号に対する比較及び類似度の算出がステップＳ１２０で順次行われ、上記適宜の数の発音記号すべてに対する類似度の算出が終了したら、ステップＳ１１０に戻り、上記抽出された全英単語のうち次の英単語について同様の手順が繰り返される。このような、ステップＳ１１０、ステップＳ１１５、ステップＳ１２０を含めた繰り返しによって、上記適宜の数の発音記号すべてに対し、ステップＳ１０５で抽出されたすべての英単語それぞれの類似度が算出される。この全英単語の類似度の算出が終了することでステップＳ１１０の判定が満たされ、ステップＳ１２５に移る。 Returning to FIG. 6, when step S120 ends as described above, the process returns to step S115 and the same procedure is repeated. In this way, for one of the English words extracted in step S105, the comparison with the suitable number of phonetic symbols in the phrase and the similarity calculation are carried out in sequence in step S120. Once the similarity has been calculated for all of those phonetic symbols, the process returns to step S110, and the same procedure is repeated for the next of the extracted English words. Through this repetition of steps S110, S115, and S120, the similarity of every English word extracted in step S105 is calculated for all of the suitable number of phonetic symbols. When the similarity calculation for all English words is complete, the determination in step S110 is satisfied, and the process proceeds to step S125.

ステップＳ１２５では、制御部１０１は、上記のようにして、全英単語についての算出結果に基づき、上記１フレーズ中に含まれる適宜の数の発音記号（言い換えればその発音記号に対応した日本語歌詞部分）を、最大の類似度を与える英単語に置換する。その後、ステップＳ１３０に移る。 In step S125, the control unit 101, as described above, based on the calculation results for all English words, an appropriate number of phonetic symbols included in the one phrase (in other words, Japanese lyrics corresponding to the phonetic symbols). Part) is replaced with an English word giving the maximum similarity. Thereafter, the process proceeds to step S130.

ステップＳ１３０では、制御部１０１は、上記ステップＳ１０５〜ステップＳ１２５における処理を、上記１フレーズ中に含まれるすべての発音記号に対して完了したか否かを判定する。１フレーズ中の全発音記号について上記処理が完了するまでは判定が満たされず（Ｓ１３０：ＮＯ）、ステップＳ１０５に戻って同様の手順を繰り返す。１フレーズ中の全発音記号について上記処理が完了したら判定が満たされ（Ｓ１３０：ＹＥＳ）、図５のステップＳ３５へ移行する。 In step S130, the control unit 101 determines whether or not the processing in steps S105 to S125 has been completed for all phonetic symbols included in the one phrase. The determination is not satisfied until the above processing is completed for all phonetic symbols in one phrase (S130: NO), the process returns to step S105 and the same procedure is repeated. When the above processing is completed for all phonetic symbols in one phrase, the determination is satisfied (S130: YES), and the process proceeds to step S35 in FIG.
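The loop formed by steps S105 to S130 can be sketched as a greedy pass over one phrase. Everything here is an assumption made for illustration: the toy lexicon and its single-letter phonetic symbols are invented, candidates are admitted when they are no longer than the remaining symbols (the text describes a "fewer than" filter), ties go to the first candidate, and a symbol with no fitting candidate is passed through unchanged.

```python
def replace_phrase(symbols, weights, lexicon):
    """Greedily cover a phrase's phonetic symbols with the best-scoring words."""
    words = []
    pos = 0
    while pos < len(symbols):                       # S130: until the phrase is used up
        remaining = len(symbols) - pos
        best = None                                 # (score, word, length)
        for word, word_symbols in lexicon.items():  # S110: every candidate word
            length = len(word_symbols)
            if length == 0 or length > remaining:   # S105-style length filter (assumed)
                continue
            score = sum(                            # S120: weighted match score
                weights[pos + i]
                for i in range(length)
                if symbols[pos + i] == word_symbols[i]
            )
            if best is None or score > best[0]:
                best = (score, word, length)
        if best is None:                            # nothing fits: keep the symbol itself
            words.append(symbols[pos])
            pos += 1
        else:                                       # S125: adopt the best candidate
            words.append(best[1])
            pos += best[2]
    return words


# Hypothetical lexicon: word label -> toy phonetic symbols.
TOY_LEXICON = {"W_AN": ["a", "n"], "W_NO": ["n", "o"], "W_O": ["o"]}
result = replace_phrase(["a", "n", "o"], [4, 2, 2], TOY_LEXICON)
```

With the quarter-note and eighth-note weights from FIG. 7, the two-symbol candidate scores 6 at the start of the phrase and is chosen first, after which the one-symbol candidate covers what is left.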

図５に戻り、ステップＳ３５では、制御部１０１は、上述したステップＳ１００による英単語置換処理を、上記アルファベット化されなかった残りの日本語歌詞部分を構成するすべてのフレーズ（上記の例では全小節）について終了したか否かを判定する。全フレーズについて上記英単語置換処理が完了するまでは判定が満たされず（Ｓ３５：ＮＯ）、順次フレーズを後続へと移しながら上記ステップＳ１００の処理を続行する。全フレーズについて上記英単語置換処理が完了したら判定が満たされ（Ｓ３５：ＹＥＳ）、ステップＳ４０に移る。なお、上記図６のステップＳ１２５及びステップＳ１３０と上記図５のステップＳ３５とが、各請求項記載の置き換え歌詞生成手順に相当すると共に、これらのステップを実行する制御部１０１が、各請求項記載の置き換え歌詞生成手段として機能する。 Returning to FIG. 5, in step S35, the control unit 101 performs the English word replacement process in step S100 described above for all phrases (all measures in the above example) that constitute the remaining Japanese lyrics portion that has not been alphabetized. It is determined whether or not the processing is finished. The determination is not satisfied until the above-mentioned English word replacement process is completed for all phrases (S35: NO), and the process of step S100 is continued while sequentially moving the phrase to the subsequent. When the above English word replacement process is completed for all phrases, the determination is satisfied (S35: YES), and the routine goes to Step S40. Note that steps S125 and S130 in FIG. 6 and step S35 in FIG. 5 correspond to the replacement lyrics generation procedure described in each claim, and the control unit 101 that executes these steps includes each claim. Functions as a replacement lyrics generation means.

その後、ステップＳ４０で、制御部１０１は、上記のようにしてステップＳ１００及びステップＳ３５で英単語置換処理により生成された置き換え語の歌詞（第３歌詞データに相当）と、上記ステップＳ１５で生成されていた、英単語に対応した歌詞部分をアルファベット化した歌詞データ（第２歌詞データに相当）とを、合成し、もともとの日本語の歌詞データ２３（第１歌詞データに相当）に対応する新たな歌詞データを生成する。その後、ステップＳ４５に移る。なお、このステップＳ４０が、各請求項記載の置き換え歌詞合成手順に相当すると共に、このステップを実行する制御部１０１が、各請求項記載の歌詞合成手段として機能する。 Thereafter, in step S40, the control unit 101 generates the replacement word lyrics (corresponding to the third lyric data) generated by the English word replacement process in steps S100 and S35 as described above, and is generated in step S15. The lyric data (corresponding to the second lyric data) obtained by converting the lyric portion corresponding to the English word into an alphabet is synthesized, and the new lyric data corresponding to the original Japanese lyric data 23 (corresponding to the first lyric data) Lyric data is generated. Thereafter, the process proceeds to step S45. This step S40 corresponds to the replacement lyrics composition procedure described in each claim, and the control unit 101 that executes this step functions as the lyrics composition means described in each claim.
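The merge in step S40 can be sketched as interleaving two lists of positioned segments. The (position, text) representation, the function name `merge_lyrics`, and the sample word "Love" are hypothetical; the specification does not state how segment order is tracked.

```python
def merge_lyrics(second, third):
    """Join (position, text) segments from both sources in song order."""
    ordered = sorted(second + third, key=lambda segment: segment[0])
    return " ".join(text for _, text in ordered)


# Third lyrics data: replacement words produced by the English word replacement.
THIRD = [(0, "Ann know Go To"), (2, "Bus Away")]
# Second lyrics data: an alphabetized foreign word (invented for this example).
SECOND = [(1, "Love")]

merged = merge_lyrics(SECOND, THIRD)
```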

ステップＳ４５では、制御部１０１は、表示部１０９（表示手段に相当）に制御信号を出力し、上記ステップＳ４０で合成された新たな歌詞データを、表示部１０９において表示させる。その後、このフローを終了する。なお、このステップＳ４５が、各請求項記載の表示制御手順に相当する。 In step S45, the control unit 101 outputs a control signal to the display unit 109 (corresponding to the display means) and causes the display unit 109 to display the new lyrics data synthesized in step S40. This flow then ends. Step S45 corresponds to the display control procedure recited in the claims.

以上説明したように、本実施形態のカラオケ装置では、外来語部分はそのままアルファベット化した本来の英単語として含めつつ、それ以外の部分については、音が近い英単語の羅列の形で歌詞データの表示が行われる。これにより、ユーザが英語を母国語とする（あるいは日常で使用する）外国人である場合であっても、カラオケ楽曲の原曲に近い発音態様で歌唱を行うことができる。特に、本実施形態においては、カラオケ楽曲データに対応した歌詞データ２３に、所定の楽節やフレーズを表す楽節フレーズデータを含む楽譜データ２２が対応づけられている。そして、上記類似度の導出や、上記置き換え時における英単語の配列が、上記楽節・フレーズごと（前述の例では１小節ごと）に行われる。これにより、例えば複数の上記楽節・フレーズ（前述の例では２小節）に跨るような歌詞があったとしても、当該歌詞に対する上記英単語への置き換えは、楽節・フレーズごとに分割した態様で（すなわち歌詞を２つに分けて）行われる（図４（ｃ）及び図４（ｄ）参照）。この結果、歌詞の文章を、音が似ている英単語の羅列に単純に置き換えた場合（前述の図３に示した比較例参照）に比べ、楽曲の流れや区切りを反映した、より原曲に近い態様での歌唱を確実に実現することができる。 As described above, the karaoke apparatus of this embodiment displays lyrics data in which foreign-word portions are included as the original English words in alphabetic spelling, while the remaining portions appear as a sequence of similar-sounding English words. As a result, even a user who is a foreigner whose native language is English (or who uses English in daily life) can sing with a pronunciation close to that of the original karaoke song. In particular, in this embodiment the lyrics data 23 corresponding to the karaoke song data is associated with score data 22 that includes passage/phrase data representing predetermined passages and phrases. The derivation of the similarity and the arrangement of English words at the time of replacement are carried out for each passage or phrase (for each measure in the example above). Consequently, even when lyrics span several passages or phrases (two measures in the example above), the replacement with English words is performed separately for each passage or phrase, that is, with the lyrics divided in two (see FIGS. 4(c) and 4(d)). Compared with simply replacing the lyrics text with a sequence of similar-sounding English words (see the comparative example in FIG. 3), this reflects the flow and breaks of the song and reliably enables singing closer to the original.

また、本実施形態では特に、図５のステップＳ１００における日本語歌詞部分の発音記号と英単語の発音記号との比較の際、音符長データの表す音符が長いほど大きくなり音符長データの表す音符が短いほど小さくなるような重み付け（図７参照）を用いつつ、類似度が導出される。これには以下のような意義がある。 In the present embodiment, in particular, when the phonetic symbol of the Japanese lyrics portion and the phonetic symbol of the English word are compared in step S100 in FIG. 5, the longer the note represented by the note length data, the larger the note represented by the note length data. The degree of similarity is derived using a weighting (see FIG. 7) that becomes smaller as is shorter. This has the following significance.

すなわち、歌詞が発音されて歌われるとき、長い音に乗せられた言葉は発音時間が長いので明瞭に耳に聞こえるのに対し、短い音に乗せられた言葉は発音時間が短いので不明確にしか聞こえない。本実施形態では、このことを参酌し、カラオケ楽曲データの歌詞データ２３に、全音符、四分音符、八分音符等の音符長データが対応づけられている（図８参照）。そして、上記類似度算出の際、上記乗せられる音の音符が長い発音記号については、（明確に耳に聞こえやすい性質に対応して）重み付けを大きくして類似度が導出される一方、上記乗せられる音の音符が短い発音記号については、（不明確にしか聞こえない性質に対応して）重み付けを小さくして類似度が導出される（図８参照）。これにより、例えば日本人により日本語で歌唱された原曲に対しさらに近い態様での歌唱を、確実に実現することができる。 That is, when lyrics are sung, words carried on long notes are heard clearly because they are sounded for a long time, whereas words carried on short notes are sounded only briefly and are heard indistinctly. Taking this into account, in this embodiment note length data such as whole notes, quarter notes, and eighth notes is associated with the lyrics data 23 of the karaoke song data (see FIG. 8). When the similarity is calculated, phonetic symbols carried on long notes are given a large weight (reflecting that they are heard clearly), while phonetic symbols carried on short notes are given a small weight (reflecting that they are heard only indistinctly) (see FIG. 8). This reliably enables singing that is even closer to, for example, an original song sung in Japanese by a Japanese singer.

Note that the flowcharts shown in FIGS. 5 and 6 do not limit the present invention to the procedures shown in those flows. Steps may be added or deleted, or their order changed, without departing from the gist and technical idea of the invention.

In addition to what has already been described, the methods according to the above embodiment and the modifications may be used in appropriate combination.

Although not illustrated individually, the present invention may also be implemented with various other modifications within a range not departing from its gist.

DESCRIPTION OF SYMBOLS
1 Karaoke music reproduction system
20 Karaoke music data
23 Lyrics data
101 Control unit
100 Karaoke apparatus main body
106 Sound source (music playback means)
107 Voice control unit (music playback means)
109 Display unit (display means)

Claims

A karaoke apparatus comprising:
music playback means capable of playing back karaoke music data with which predetermined phrase data and first lyrics data are associated;
music receiving means for receiving a user's selection of the karaoke music data;
English word lyric generation means for extracting lyric portions corresponding to English words from the first lyrics data corresponding to the karaoke music data received by the music receiving means, and generating second lyrics data written in alphabetic characters;
phonetic symbol assigning means for assigning a phonetic symbol, word by word, to the lyric text of the remaining portion of the first lyrics data not extracted by the English word lyric generation means;
similarity derivation means for deriving a similarity, for each phrase corresponding to the phrase data, by comparing the phonetic symbols assigned to the lyric text by the phonetic symbol assigning means with the phonetic symbols of English word candidates for replacing the lyric text;
replacement lyric generation means for generating third lyrics data for replacement by selecting and arranging, for each phrase, a plurality of the English word candidates so as to maximize the similarity in accordance with the derivation result of the similarity derivation means;
lyric synthesis means for combining the second lyrics data generated by the English word lyric generation means and the third lyrics data generated by the replacement lyric generation means to generate new lyrics data corresponding to the karaoke music data received by the music receiving means; and
display means for displaying the new lyrics data corresponding to the karaoke music data in response to playback of the karaoke music data received by the music receiving means.

The karaoke apparatus according to claim 1, wherein
the first lyrics data is associated with note length data corresponding to the karaoke music data, and
the similarity derivation means, when comparing the phonetic symbols of the lyric text with the phonetic symbols of the English word candidates, derives the similarity using a predetermined weight that becomes larger as the note represented by the note length data becomes longer and smaller as that note becomes shorter.

A karaoke music processing program that causes arithmetic means, provided in a karaoke apparatus having display means and music playback means capable of playing back karaoke music data with which predetermined phrase data and first lyrics data are associated, to execute:
a music reception procedure for receiving a user's selection of the karaoke music data;
an English word lyric generation procedure for extracting lyric portions corresponding to English words from the first lyrics data corresponding to the karaoke music data received in the music reception procedure, and generating second lyrics data;
a phonetic symbol assigning procedure for assigning a phonetic symbol, word by word, to the lyric text of the remaining portion of the first lyrics data not extracted in the English word lyric generation procedure;
a similarity derivation procedure for deriving a similarity, for each phrase corresponding to the phrase data, by comparing the phonetic symbols assigned to the lyric text in the phonetic symbol assigning procedure with the phonetic symbols of English word candidates for replacing the lyric text;
a replacement lyric generation procedure for generating third lyrics data for replacement by selecting and arranging, for each phrase, a plurality of the English word candidates so as to maximize the similarity in accordance with the derivation result of the similarity derivation procedure;
a lyric synthesis procedure for combining the second lyrics data generated in the English word lyric generation procedure and the third lyrics data generated in the replacement lyric generation procedure to generate new lyrics data corresponding to the karaoke music data received in the music reception procedure; and
a display control procedure for displaying, on the display means, the new lyrics data corresponding to the karaoke music data in response to playback of the karaoke music data received in the music reception procedure.