JP2011221064A

JP2011221064A - Karaoke system

Info

Publication number: JP2011221064A
Application number: JP2010086678A
Authority: JP
Inventors: Kaoru Uenosono; 薫上之薗
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2010-04-05
Filing date: 2010-04-05
Publication date: 2011-11-04

Abstract

PROBLEM TO BE SOLVED: To provide a karaoke system that converts the timbre of a voice and lets the user easily select the timbre type to convert to. SOLUTION: The karaoke system comprises: a karaoke main body 20 having output signal generation means that generates an output signal based on a voice signal output from a microphone 30 and on accompaniment music information; a speaker 40 that reproduces the output signal as sound; and a mobile terminal 60 having a tilt sensor that measures tilt and communication means that transmits the tilt information measured by the tilt sensor to the karaoke main body 20. The karaoke main body 20 has timbre type determination means that determines, based on the tilt information, the timbre type to which the voice signal output from the microphone 30 is converted, and timbre conversion means that converts the timbre of the voice signal output from the microphone 30 based on that timbre type to generate converted voice information.

Description

The present invention relates to a karaoke system that converts the timbre of a voice.

Karaoke systems that convert the timbre of a user's voice have been proposed in the past, as shown in Patent Document 1.

Patent Document 1: JP-A-1-20597

In the karaoke system shown in Patent Document 1, the timbre to convert to is fixed, so the user cannot easily select it.
An object of the present invention is to solve this problem and provide a karaoke system that converts the timbre of a voice and allows the user to easily select the timbre type to convert to.

The invention according to claim 1, made to solve the above problem, is a karaoke system comprising:
a microphone that generates a voice signal from a voice;
a karaoke apparatus body having accompaniment music information acquisition means for acquiring accompaniment music information, and output signal generation means for generating an output signal based on the voice signal output from the microphone and the accompaniment music information acquired by the accompaniment music information acquisition means; and
a speaker that reproduces the output signal as sound,
the karaoke system further comprising a mobile terminal having a tilt sensor that measures tilt and communication means for transmitting the tilt information measured by the tilt sensor to the karaoke apparatus body,
wherein the karaoke apparatus body further has:
timbre type determination means for determining, based on the tilt information transmitted by the mobile terminal, the timbre type to which the voice signal output from the microphone is converted; and
timbre conversion means for converting the timbre of the voice signal output from the microphone, based on the timbre type determined by the timbre type determination means, to generate converted voice information,
and wherein the output signal generation means generates the output signal by combining the converted voice information with the accompaniment music information.

The invention according to claim 2 is the invention according to claim 1, wherein the timbre type determination means determines the attitude of the mobile terminal by comparing the tilt information transmitted by the mobile terminal against threshold values, and determines the timbre type from that attitude.
This ensures that the timbre type is determined reliably.
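As an illustration of the threshold-based determination in claim 2, the following minimal Python sketch classifies one tilt reading into an attitude and maps that attitude to a timbre type. The threshold value, the use of a single roll angle, the attitude names, and the timbre names are all assumptions made for illustration; this section of the source does not specify them.

```python
from typing import Optional

# Hypothetical threshold in degrees; the source does not state a value.
TILT_THRESHOLD_DEG = 30.0

# Hypothetical mapping from attitude to timbre type (names are illustrative only).
ATTITUDE_TO_TIMBRE = {
    "upright": None,            # no timbre conversion
    "tilted_left": "trumpet",
    "tilted_right": "violin",
}


def classify_attitude(roll_deg: float) -> str:
    """Compare one tilt angle against the threshold to decide the attitude."""
    if roll_deg <= -TILT_THRESHOLD_DEG:
        return "tilted_left"
    if roll_deg >= TILT_THRESHOLD_DEG:
        return "tilted_right"
    return "upright"


def determine_timbre(roll_deg: float) -> Optional[str]:
    """Return the timbre type selected by the terminal's attitude, or None."""
    return ATTITUDE_TO_TIMBRE[classify_attitude(roll_deg)]
```

Because the decision depends only on which side of a threshold the reading falls, small tremors of the hand around the upright position do not change the selected timbre.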

The invention according to claim 3 is the invention according to claim 1 or claim 2, wherein the karaoke apparatus body further has pitch correction means for correcting the pitch of the voice signal output from the microphone based on the accompaniment music information.
Thus, even when the user sings off the pitch of the accompaniment music, converted voice information that matches the accompaniment music is generated.
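One way to picture the pitch correction of claim 3 is as computing how far the sung pitch is from a target note and deriving the shift needed to reach it. The sketch below assumes that a target MIDI note number is available from the accompaniment music information and that the sung fundamental frequency has already been estimated; it uses the common convention that MIDI note 69 is A4 at 440 Hz. The function names are hypothetical, and the source does not describe the actual correction method in this section.

```python
import math


def freq_to_midi(freq_hz: float) -> float:
    """Convert a frequency in Hz to a (fractional) MIDI note number, A4 = 69 = 440 Hz."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def pitch_shift_semitones(sung_hz: float, target_midi_note: int) -> float:
    """Semitones by which the sung pitch must be shifted to reach the target note."""
    return target_midi_note - freq_to_midi(sung_hz)


def shift_ratio(semitones: float) -> float:
    """Frequency ratio corresponding to a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)
```

For example, a voice at 220 Hz against a target of MIDI note 69 is one octave low, so the required shift is 12 semitones, which corresponds to doubling the frequency.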

The invention according to claim 4 is the invention according to any one of claims 1 to 3, wherein the karaoke apparatus body further has rhythm correction means for correcting the rhythm of the voice signal output from the microphone based on the accompaniment music information.
Thus, even when the user sings out of rhythm with the accompaniment music, converted voice information that matches the accompaniment music is generated.
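The rhythm correction of claim 4 can be thought of as moving each sung note toward the timing given by the accompaniment. The following sketch assumes that note onset times (in seconds) can be read from the accompaniment music information and that the onset of the sung note has already been detected; both the function names and this onset-snapping approach are illustrative assumptions, not the method stated in the source.

```python
from bisect import bisect_left
from typing import Sequence


def nearest_onset(sung_onset_s: float, reference_onsets_s: Sequence[float]) -> float:
    """Return the reference onset closest to the sung onset.

    reference_onsets_s must be non-empty and sorted in ascending order.
    """
    i = bisect_left(reference_onsets_s, sung_onset_s)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = reference_onsets_s[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda t: abs(t - sung_onset_s))


def rhythm_offset(sung_onset_s: float, reference_onsets_s: Sequence[float]) -> float:
    """Time shift (seconds) that moves the sung onset onto the nearest reference onset."""
    return nearest_onset(sung_onset_s, reference_onsets_s) - sung_onset_s
```

A negative offset means the note was sung late and must be moved earlier; a positive offset means it was sung early.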

The invention according to claim 5 is the invention according to any one of claims 1 to 4, wherein the mobile terminal further has an acceleration sensor that measures acceleration, transmission means transmits the acceleration information measured by the acceleration sensor to the karaoke apparatus body through the communication means, and the karaoke apparatus body further has effect adding means for adding an effect to the converted voice information generated by the timbre conversion means, based on the acceleration information transmitted from the mobile terminal.
Thus, the user can add an effect to the converted voice information by the simple operation of moving the mobile terminal.

According to the present invention, the karaoke apparatus body determines the timbre type based on the tilt information of the mobile terminal, so the user can select the type of musical sound by the simple operation of tilting the mobile terminal.

FIG. 1 is an overall view of a karaoke system showing an embodiment of the present invention.
FIG. 2 is a block diagram of the karaoke system.
FIG. 3 is a flowchart of the main processing.
FIG. 4 is a flowchart of the timbre determination processing.
FIG. 5 is a flowchart of the pitch correction processing.
FIG. 6 is a flowchart of the rhythm correction processing.
FIG. 7 is a flowchart of the effect addition processing.
FIG. 8 is an explanatory diagram of timbre determination.
FIG. 9 is an explanatory diagram of the pitch comparison processing.
FIG. 10 is an explanatory diagram of the rhythm correction processing.
FIG. 11 is an explanatory diagram of the effect addition processing.

(Outline of the present invention)
Preferred embodiments of the present invention are described below with reference to the drawings. As shown in FIG. 1, the karaoke system 100 of the present invention consists of a karaoke apparatus body 20 and, connected to the karaoke apparatus body 20, a microphone 30, a speaker 40, an image display device 50, and a mobile terminal 60. The karaoke apparatus body 20 is connected to a public communication network 900, from which it acquires “accompaniment video information”. The karaoke apparatus body 20 extracts “accompaniment music information” from the “accompaniment video information”, combines this “accompaniment music information” with the “voice signal” output from the microphone 30 to generate an “output signal”, and outputs the “output signal” to the speaker 40. The karaoke apparatus body 20 also generates an “accompaniment video signal” from the acquired “accompaniment video information” and outputs it to the image display device 50.

The user sings into the microphone 30 while listening to the accompaniment music reproduced from the speaker 40 or watching the accompaniment video displayed on the image display device 50. In the present invention, the timbre of the “voice signal” output from the microphone 30 is converted by the karaoke apparatus body 20 before the signal is output to the speaker 40. By tilting the mobile terminal 60, the user can select the timbre type to which the “voice signal” is converted. By shaking the mobile terminal 60, the user can add an effect to the “voice signal”. The karaoke system 100 that realizes these functions is described in detail below.

(Block diagram of the karaoke system)
The block diagram of the karaoke system 100 is described below with reference to FIG. 2. The karaoke apparatus body 20 has a CPU 11, a storage unit 13, a voice input interface 14, an output signal generation unit 15, an image generation unit 16, a communication unit 17, an external communication unit 18, and an operation unit 19. These components are connected to one another by a bus 9.

The CPU (Central Processing Unit) 11 performs various calculations and processing in cooperation with the storage unit 13.

The storage unit 13 consists of a RAM (Random Access Memory), which is the main storage device, and an auxiliary storage device such as a nonvolatile memory or a hard disk. The RAM is used as the working area of the CPU 11 and temporarily stores, in its address space, the programs executed by the CPU 11 and the data the CPU 11 processes.
The auxiliary storage device of the storage unit 13 stores various programs and parameters for controlling the control unit 30. The various functions are realized when these programs are executed by the CPU 11. The auxiliary storage device of the storage unit 13 stores an accompaniment video information acquisition program 13a, an accompaniment music information acquisition program 13b, an output signal generation program 13c, a timbre type determination program 13d, a timbre conversion program 13e, a pitch correction program 13f, a rhythm correction program 13g, an effect addition program 13h, and an accompaniment video generation program 13i. The storage unit 13 also has an accompaniment video information storage area 13j, an accompaniment music information storage area 13k, a tilt information storage area 13m, an acceleration information storage area 13n, a voice information storage area 13p, a timbre type storage area 13q, and a converted voice information storage area 13r.

The accompaniment video information acquisition program 13a acquires “accompaniment video information” via the external communication unit 18 and stores it in the accompaniment video information storage area 13j.
The accompaniment music information acquisition program 13b extracts “accompaniment music information” from the “accompaniment video information” stored in the accompaniment video information storage area 13j and stores it in the accompaniment music information storage area 13k.
The output signal generation program 13c combines the “accompaniment music information” stored in the accompaniment music information storage area 13k with the “converted voice information” stored in the converted voice information storage area 13r to generate “synthesized voice information”, and outputs the “synthesized voice information” to the output signal generation unit 15 so that the output signal generation unit 15 generates the “output signal”.
The timbre type determination program 13d determines the timbre type to which the “voice information” is converted, based on the “tilt information” output by the tilt sensor 64 of the mobile terminal 60.
The timbre conversion program 13e converts the timbre of the “voice information” to generate “converted voice information”.
The pitch correction program 13f corrects the deviation in pitch of the “voice information” from the “accompaniment music information”.
The rhythm correction program 13g corrects the deviation in rhythm of the “voice information” from the “accompaniment music information”.
The effect addition program 13h adds an effect to the “converted voice information” based on the “acceleration information” output by the acceleration sensor 65 of the mobile terminal 60.
The accompaniment video generation program 13i outputs to the image generation unit 16 a drawing command for generating, from the “accompaniment video information” stored in the accompaniment video information storage area 13j, the “video signal” to be output to the image display device 50.
Note that the accompaniment video information acquisition program 13a, accompaniment music information acquisition program 13b, output signal generation program 13c, timbre type determination program 13d, timbre conversion program 13e, pitch correction program 13f, rhythm correction program 13g, effect addition program 13h, and accompaniment video generation program 13i may instead be implemented as an ASIC (Application Specific Integrated Circuit).

The accompaniment video information storage area 13j stores the “accompaniment video information” acquired by the accompaniment video information acquisition program 13a.
The accompaniment music information storage area 13k stores the “accompaniment music information” acquired by the accompaniment music information acquisition program 13b. The “accompaniment music information” includes MIDI (Musical Instrument Digital Interface) data.
The tilt information storage area 13m sequentially stores the “tilt information” transmitted from the mobile terminal 60.
The acceleration information storage area 13n sequentially stores the “acceleration information” transmitted from the mobile terminal 60.
The voice information storage area 13p stores the “voice information” generated by the voice input interface 14.
The timbre type storage area 13q stores the timbre type determined by the timbre type determination program 13d.
The converted voice information storage area 13r stores the “converted voice information” generated by the timbre conversion program 13e.
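The storage areas 13j to 13r listed above can be pictured as the fields of one record. The Python sketch below is only an illustrative model: the class name, field names, and field types are assumptions, and only the correspondence between fields and storage areas comes from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# One X, Y, Z reading from the tilt sensor or the acceleration sensor.
Vector3 = Tuple[float, float, float]


@dataclass
class KaraokeStorage:
    """Illustrative model of the storage areas of storage unit 13."""
    accompaniment_video: bytes = b""                                    # area 13j
    accompaniment_music: bytes = b""                                    # area 13k (includes MIDI data)
    tilt_samples: List[Vector3] = field(default_factory=list)          # area 13m, appended in order
    acceleration_samples: List[Vector3] = field(default_factory=list)  # area 13n, appended in order
    voice: List[float] = field(default_factory=list)                   # area 13p
    timbre_type: Optional[str] = None                                   # area 13q
    converted_voice: List[float] = field(default_factory=list)         # area 13r
```

The two sample lists grow as readings arrive, matching the statement that tilt and acceleration information are stored sequentially.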

The microphone 30 is connected to the voice input interface 14. The connection between the voice input interface 14 and the microphone 30 may be either wired or wireless. The voice input interface 14 converts the physical and logical format of the “voice signal” output from the microphone 30 to generate “voice information”, and outputs the “voice information” to the bus 9.

The speaker 40 is connected to the output signal generation unit 15. The output signal generation unit 15 has a D/A converter that converts a digital signal into an analog signal and an operational amplifier that amplifies the analog signal.

The image display device 50, such as an LCD (Liquid Crystal Display), is connected to the image generation unit 16. The image generation unit 16 has a GPU (Graphics Processing Unit) and a VRAM. The GPU generates “image data” in response to drawing commands from the accompaniment video generation program 13i and stores it in the VRAM. The “image data” stored in the VRAM is output to the image display device 50 as an “image signal”.

The communication unit 17 is a device that communicates with the mobile terminal 60. The communication unit 17 is an interface for a wireless standard such as infrared communication or Bluetooth (registered trademark).

The external communication unit 18 is connected to a public communication network such as the Internet or the public telephone network. The external communication unit 18 includes communication interfaces such as LAN (Local Area Network), USB (Universal Serial Bus), IEEE 1394, RS-232, and RS-422, as well as wireless interfaces such as the wireless LAN defined in IEEE 802. The karaoke apparatus body 20 acquires the “accompaniment video information” via the external communication unit 18.

The operation unit 19 is used by the user to operate the karaoke apparatus body 20. The operation unit 19 consists of an input unit, such as a set of buttons or a touch panel, and an input interface that converts the physical and logical format of the input operation signal output from the input unit and outputs it to the bus 9. By operating the operation unit 19, the user can select the “accompaniment video information” to be played back through the speaker 40 and the image display device 50.

The mobile terminal 60 is a mobile terminal, such as a smartphone or PDA (Personal Digital Assistant), that has a tilt sensor 64 and an acceleration sensor 65.
The mobile terminal 60 has a CPU 61, a storage unit 63, the tilt sensor 64, the acceleration sensor 65, and a communication unit 66, which are connected to one another by a bus 69.
The CPU 61 performs various calculations and processing in cooperation with the storage unit 63.

The storage unit 63 consists of a RAM (Random Access Memory), which is the main storage device, and an auxiliary storage device such as a nonvolatile memory or a hard disk. The RAM is used as the working area of the CPU 61 and temporarily stores, in its address space, the programs executed by the CPU 61 and the data the CPU 61 processes.
The auxiliary storage device of the storage unit 63 stores various programs and parameters for controlling the mobile terminal 60. When the mobile terminal 60 is a smartphone, the auxiliary storage device of the storage unit 63 stores programs that implement the call function and the mail function.

The tilt sensor 64 measures the tilt of the mobile terminal 60 in the three directions X, Y, and Z. The “tilt information” measured and generated by the tilt sensor 64 is transmitted to the karaoke apparatus body 20 via the communication unit 66. The “tilt information” is transmitted to the karaoke apparatus body 20 at intervals of several milliseconds to several tens of milliseconds.

The acceleration sensor 65 detects the acceleration of the mobile terminal 60 in the three directions X, Y, and Z. The “acceleration information” detected and generated by the acceleration sensor 65 is transmitted to the karaoke apparatus body 20 via the communication unit 66. The “acceleration information” is transmitted to the karaoke apparatus body 20 at intervals of several milliseconds to several tens of milliseconds.
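Both sensors produce a three-axis reading that is sent to the karaoke apparatus body 20 at short intervals. The sketch below shows one way such a reading could be packed into a small binary message and unpacked on the receiving side. The wire format (one kind byte followed by three little-endian 32-bit floats), the kind codes, and all names are hypothetical; the source does not define a message format.

```python
import struct
from typing import NamedTuple

# Hypothetical message kinds.
KIND_TILT = 0
KIND_ACCEL = 1

# Hypothetical wire format: 1 unsigned byte + three little-endian float32 values.
_FORMAT = "<B3f"
MESSAGE_SIZE = struct.calcsize(_FORMAT)  # 13 bytes


class SensorSample(NamedTuple):
    kind: int
    x: float
    y: float
    z: float


def encode(sample: SensorSample) -> bytes:
    """Pack one sensor sample for transmission by the mobile terminal."""
    return struct.pack(_FORMAT, sample.kind, sample.x, sample.y, sample.z)


def decode(payload: bytes) -> SensorSample:
    """Unpack one sensor sample on the karaoke apparatus body side."""
    return SensorSample(*struct.unpack(_FORMAT, payload))
```

A fixed-size message keeps parsing trivial, which suits readings that arrive every few milliseconds; values are stored as 32-bit floats, so they round-trip exactly only when representable at that precision.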

The communication unit 66 is a device that communicates with the communication unit 17 of the karaoke apparatus body 20. The communication unit 66 is an interface for a wireless standard, such as infrared communication or Bluetooth (registered trademark), that matches the interface of the communication unit 17.

(Description of main processing)
The main processing executed by the karaoke apparatus body 20 is described below with reference to FIG. 3.
When the user operates the operation unit 19 and selects the “accompaniment video information” to be played back through the speaker 40 and the image display device 50, the main processing starts and proceeds to S11.
In S11 “acquire accompaniment music information”, the accompaniment video information acquisition program 13a acquires the “accompaniment video information” via the external communication unit 18 and stores it in the accompaniment video information storage area 13j. The accompaniment music information acquisition program 13b then extracts the “accompaniment music information” from the “accompaniment video information” stored in the accompaniment video information storage area 13j and stores it in the accompaniment music information storage area 13k. When S11 ends, the processing proceeds to S12.

In S12 “start accompaniment music information playback”, the output signal generation program 13c outputs the “accompaniment music information” stored in the accompaniment music information storage area 13k to the output signal generation unit 15. The output signal generation unit 15 generates the “output signal” and outputs it to the speaker 40. Also in S12, the accompaniment video generation program 13i outputs to the image generation unit 16 a drawing command for generating, from the “accompaniment video information” stored in the accompaniment video information storage area 13j, the “video signal” to be output to the image display device 50. When S12 ends, the processing proceeds to the determination in S13.

In the determination of S13, the CPU 11 determines whether a “playback stop signal”, which stops playback of the “accompaniment music information” and the “accompaniment video information”, has been input to the bus 9. The “playback stop signal” is input to the bus 9 either when the user operates the operation unit 19 or when the output signal generation program 13c has output the “accompaniment music information” stored in the accompaniment music information storage area 13k to its end. If the CPU 11 determines that the “playback stop signal” has been input to the bus 9 (YES in S13), the processing proceeds to S31. If the CPU 11 determines that the “playback stop signal” has not been input to the bus 9 (NO in S13), the processing proceeds to the determination in S14.

In the determination of S14 “voice input present”, the CPU 11 determines whether “voice information” has been input to the bus 9 from the voice input interface 14. If the CPU 11 determines that “voice information” has been input (YES in S14), it stores the “voice information” in the voice information storage area 13p and the processing proceeds to S15. If the CPU 11 determines that no “voice information” has been input (NO in S14), the processing returns to the determination in S13.

In S15 “timbre determination processing”, the timbre type determination program 13d executes the timbre determination processing, which determines the timbre type to which the “voice information” is converted. The details are described later using the flow of the timbre determination processing shown in FIG. 4. When S15 ends, the processing proceeds to S16.

In S16 “pitch correction processing”, the pitch correction program 13f executes the pitch correction processing, which corrects the deviation of the pitch of the “voice signal” from the “accompaniment music information”. The details are described later using the flow of the pitch correction processing shown in FIG. 5. When S16 ends, the processing proceeds to S17.

In S17 “rhythm correction processing”, the rhythm correction program 13g executes the rhythm correction processing, which corrects the deviation of the rhythm of the “voice information” from the “accompaniment music information”. The details are described later using the flow of the rhythm correction processing shown in FIG. 6. When S17 ends, the processing proceeds to S18.

In S18 “timbre conversion processing”, the timbre conversion program 13e converts the “voice information” stored in the voice information storage area 13p to the timbre type that was determined in S15 and stored in the timbre type storage area 13q, generates “converted voice information”, and stores it in the converted voice information storage area 13r. If “nothing” is stored in the timbre type storage area 13q (that is, when the timbre type determination program 13d determined A = 0 and B = 0 in S15-2 and set the timbre to “nothing” in S15-3), the timbre conversion program 13e performs no conversion in S18. When S18 ends, the processing proceeds to the determination in S19.

In the determination of S19 “motion input present”, the CPU 11 determines whether “acceleration information” at or above a predetermined value has been input to the bus 9. If the CPU 11 determines that such “acceleration information” has been input (YES in S19), the processing proceeds to S20. If the CPU 11 determines that no such “acceleration information” has been input (NO in S19), the processing proceeds to S21.
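The S19 check compares the incoming acceleration against a predetermined value. A minimal Python sketch follows, assuming the comparison is made on the magnitude of the three-axis acceleration vector; the threshold value, its unit, and the use of the vector magnitude are assumptions, since the source only says “at or above a predetermined value”.

```python
import math

# Hypothetical threshold (m/s^2); the source does not state the predetermined value.
ACCEL_THRESHOLD = 15.0


def motion_detected(ax: float, ay: float, az: float,
                    threshold: float = ACCEL_THRESHOLD) -> bool:
    """Return True when the acceleration magnitude is at or above the threshold (S19: YES)."""
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    return magnitude >= threshold
```

With this choice, a terminal at rest (about 9.8 m/s^2 from gravity alone) stays below the threshold, while a deliberate shake exceeds it and leads to the effect addition processing of S20.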

Ｓ２０「付加効果処理」において、効果付加プログラム１３ｈは、携帯端末６０の加速度センサー６５が出力した「加速度情報」に基づいて、「変換音声情報」に効果を付加させる付加効果処理を実行する。詳しくは、図７に示される効果付加処理のフローを用いて後述する。Ｓ２０の処理が終了すると、Ｓ２１の処理に進む。 In S20 “addition effect processing”, the effect addition program 13h executes addition effect processing for adding an effect to the “converted voice information” based on the “acceleration information” output from the acceleration sensor 65 of the mobile terminal 60. Details will be described later with reference to the flow of effect addition processing shown in FIG. When the process of S20 ends, the process proceeds to S21.

Ｓ２１「出力信号生成」の処理において、出力信号生成プログラム１３ｃは、伴奏音楽情報記憶領域１３ｋに記憶されている「伴奏音楽情報」及び変換音声情報記憶領域１３ｒに記憶されている「変換音声情報」を合成して「合成音声情報」を生成し、当該「合成音声情報」を出力信号生成部１５に出力する。出力信号生成部１５は、「出力信号」を生成して、当該「出力信号」をスピーカー４０に出力する。スピーカー４０は、「出力信号」を音声として再生する。Ｓ２１の処理が終了すると、Ｓ１３の判断処理に戻る。 In the process of S21 “output signal generation”, the output signal generation program 13c synthesizes the “accompaniment music information” stored in the accompaniment music information storage area 13k and the “converted voice information” stored in the converted voice information storage area 13r to generate “synthesized voice information”, and outputs the “synthesized voice information” to the output signal generation unit 15. The output signal generation unit 15 generates an “output signal” and outputs it to the speaker 40. The speaker 40 reproduces the “output signal” as sound. When the process of S21 ends, the process returns to the determination process of S13.

Ｓ３１「伴奏音楽情報再生終了」の処理において、出力信号生成プログラム１３ｃは、出力信号生成部１５への「伴奏音楽情報」の出力を停止する。また、伴奏動画生成プログラム１３ｉは、画像生成部１６への描画命令の出力を停止する。Ｓ３１の処理が終了すると、メイン処理が終了する。 In the process of S31 “Accompaniment music information reproduction end”, the output signal generation program 13c stops outputting the “accompaniment music information” to the output signal generation unit 15. In addition, the accompaniment moving image generation program 13 i stops outputting the drawing command to the image generation unit 16. When the process of S31 ends, the main process ends.
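
The per-cycle flow of S18 to S21 described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name process_cycle, the threshold value, the callables convert and add_effect, and the representation of audio as lists of floats are all hypothetical and do not appear in the specification. The sketch also assumes that a timbre type of “nothing” means the voice passes through unchanged.

```python
from typing import Callable, List, Optional

Samples = List[float]

# Hypothetical stand-in for the "predetermined value" checked in S19.
ACCEL_THRESHOLD = 1.5


def process_cycle(
    voice: Samples,
    accompaniment: Samples,
    timbre_type: Optional[str],
    accel_magnitude: float,
    convert: Callable[[Samples, str], Samples],
    add_effect: Callable[[Samples], Samples],
) -> Samples:
    """One pass through S18 to S21 (illustrative only)."""
    # S18: convert the timbre unless the stored type is "nothing" (None here).
    if timbre_type is None:
        converted = list(voice)  # assumption: the voice passes through as-is
    else:
        converted = convert(voice, timbre_type)

    # S19/S20: add an effect only when the acceleration reaches the threshold.
    if accel_magnitude >= ACCEL_THRESHOLD:
        converted = add_effect(converted)

    # S21: synthesize the converted voice with the accompaniment (simple sum).
    return [v + a for v, a in zip(converted, accompaniment)]
```

In this sketch the timbre conversion and the effect are passed in as callables, so the control flow of the main process can be read independently of how either step is actually implemented.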

（音色決定処理）
図４を用いて、以下に音色決定処理について説明する。
音色決定処理が開始すると、Ｓ１５−１の処理に進む。
Ｓ１５−1「情報取得」の処理において、音色種類決定プログラム１３ｄは、傾斜情報記憶領域１３ｍに記憶されている「傾斜情報」及び加速度情報記憶領域１３ｎに記憶されている「加速度情報」を取得し、ＲＡＭのワーキングエリアに記憶させる。Ｓ１５−１の処理が終了すると、Ｓ１５−２の処理に進む。 (Tone determination processing)
The timbre determination process will be described below with reference to FIG.
When the tone determination process starts, the process proceeds to S15-1.
In the process of S15-1 “information acquisition”, the timbre type determination program 13d acquires the “tilt information” stored in the tilt information storage area 13m and the “acceleration information” stored in the acceleration information storage area 13n, and stores them in the working area of the RAM. When the process of S15-1 ends, the process proceeds to S15-2.

Ｓ１５−２「姿勢算出」の処理において、音色種類決定プログラム１３ｄは、ＲＡＭのワーキングエリアに記憶されている「傾斜情報」及び「加速度情報」に基づいて、携帯端末６０の傾度を算出する。具体的には、まず、音色種類決定プログラム１３ｄは、「加速度情報」に基づいて、重力方向を算出する。次に、音色種類決定プログラム１３ｄは、前記算出した重力方向及び「傾度情報」から、携帯端末６０の姿勢を算出する。図８を用いて、具体的に説明する。図８の（Ａ）、（Ｂ）、（Ｄ）示されるように携帯端末６０に対するＸ、Ｙ、Ｚ座標が設定されている。なお、Ｘ座標方向は携帯端末６０の幅方向であり、Ｙ座標方向はＸ座標方向と直交する携帯端末６０の縦方向であり。Ｚ座標方法はＸ座標方向及びＹ座標方向と直交する携帯端末６０の厚さ方向である。音色種類決定プログラム１３ｄは、重力方向及び「傾度情報」から、携帯端末６０のＸＺ平面上における傾き角θｘｚ（図８の（Ａ）に示す）及び、携帯端末６０のＹＺ平面上における傾き角θｙｚ（図８の（Ｂ）に示す）を算出する。そして、音色種類決定プログラム１３ｄは、算出された携帯端末６０の傾き角θｘｚ及びθｙｚを、図８の（Ｃ）に示される基準に照合せることにより、Ａ及びＢの値を算出する。なお、ＡとＢの初期値は１である。
図８の（Ｃ）に示される基準を詳述すると、携帯端末６０の傾き角θｘｚ及びθｙｚを、下式１〜４を満たすか否かによってＡ及びＢの値が決定される。
−５°≦θｘｚ≦５°…式１
−５°≦θｙｚ≦５°…式２
式１及び式２の両方満たす場合には、Ａ＝０、Ｂ＝０と決定される。
θｙｚ≦θｘｙ…式３
式１及び式２を満たすこと無く、式３を満たす場合には、Ａ＝２と決定される。
θｙｚ≦−θｘｙ…式４
式１及び式２を満たすこと無く、式４を満たす場合には、Ｂ＝２と決定される。
Ｓ１５−２の処理が終了すると、Ｓ１５−３の処理に進む。 In the process of S15-2 “posture calculation”, the timbre type determination program 13d calculates the tilt of the portable terminal 60 based on the “tilt information” and “acceleration information” stored in the working area of the RAM. Specifically, the timbre type determination program 13d first calculates the direction of gravity from the “acceleration information”. Next, it calculates the attitude of the portable terminal 60 from the calculated direction of gravity and the “tilt information”. This is described concretely with reference to FIG. 8. As shown in FIG. 8(A), (B), and (D), X, Y, and Z coordinates are set for the portable terminal 60. The X coordinate direction is the width direction of the portable terminal 60, the Y coordinate direction is the longitudinal direction of the portable terminal 60 orthogonal to the X coordinate direction, and the Z coordinate direction is the thickness direction of the portable terminal 60 orthogonal to both the X and Y coordinate directions. From the direction of gravity and the “tilt information”, the timbre type determination program 13d calculates the tilt angle θxz of the portable terminal 60 in the XZ plane (shown in FIG. 8(A)) and the tilt angle θyz of the portable terminal 60 in the YZ plane (shown in FIG. 8(B)). The timbre type determination program 13d then calculates the values of A and B by checking the calculated tilt angles θxz and θyz against the criteria shown in FIG. 8(C). The initial values of A and B are both 1.
The criteria shown in FIG. 8(C) are as follows. The values of A and B are determined by whether the tilt angles θxz and θyz of the portable terminal 60 satisfy Formulas 1 to 4 below.
−5° ≦ θxz ≦ 5° ... Formula 1
−5° ≦ θyz ≦ 5° ... Formula 2
When both Formula 1 and Formula 2 are satisfied, A = 0 and B = 0.
θyz ≦ θxy ... Formula 3
When Formula 3 is satisfied without Formulas 1 and 2 being satisfied, A = 2.
θyz ≦ −θxy ... Formula 4
When Formula 4 is satisfied without Formulas 1 and 2 being satisfied, B = 2.
When the process of S15-2 ends, the process proceeds to S15-3.
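
The threshold check of Formulas 1 to 4 can be sketched as below. Two assumptions are made and should be read with care. First, the text defines only θxz and θyz, while Formulas 3 and 4 refer to θxy; this sketch assumes that θxy denotes θxz. Second, “without Formulas 1 and 2 being satisfied” is taken to mean that the terminal is outside the ±5° zone in which both hold. The function name classify_tilt is hypothetical.

```python
from typing import Tuple

DEAD_ZONE_DEG = 5.0  # the +/-5 degree range of Formulas 1 and 2


def classify_tilt(theta_xz: float, theta_yz: float) -> Tuple[int, int]:
    """Return (A, B) for tilt angles given in degrees (illustrative only)."""
    a, b = 1, 1  # initial values stated in the description

    formula1 = -DEAD_ZONE_DEG <= theta_xz <= DEAD_ZONE_DEG
    formula2 = -DEAD_ZONE_DEG <= theta_yz <= DEAD_ZONE_DEG
    if formula1 and formula2:
        return 0, 0  # terminal held roughly level

    # ASSUMPTION: "theta_xy" in Formulas 3 and 4 is read as theta_xz.
    if theta_yz <= theta_xz:   # Formula 3
        a = 2
    if theta_yz <= -theta_xz:  # Formula 4
        b = 2
    return a, b
```

Under these assumptions, the two diagonal comparisons split the region outside the level zone into four sectors, one for each combination of A and B in {1, 2}.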

Ｓ１５−３「音色決定」の処理において、音色種類決定プログラム１３ｄは、Ｓ１５−２の処理で算出された携帯端末６０の姿勢に基づいて、音色を決定し、決定した音色の種類を音色種類記憶領域１３ｑに記憶させる。具体的には、音色種類決定プログラム１３ｄは、Ｓ１５−２の処理で決定されたＡ及びＢを図８の（Ｅ）に示される基準に照合させることにより音色を決定する。なお本実施形態では、図８の（Ｅ）に示されるように、音色の種類は、楽器の楽音となっている。Ｓ１５−３の処理が終了すると、音色決定処理が終了する。
このように、音色種類決定プログラム１３ｄは、「傾度情報」から所定閾値（図８の（Ｃ）に示される基準）を用いて、携帯端末６０の姿勢を判定し、更に、携帯端末６０の姿勢から音色の種類を決定することにしたので、音色の種類が確実に決定される。 In the process of S15-3 “timbre determination”, the timbre type determination program 13d determines the timbre based on the attitude of the portable terminal 60 calculated in S15-2, and stores the determined timbre type in the timbre type storage area 13q. Specifically, the timbre type determination program 13d determines the timbre by checking A and B, as determined in S15-2, against the criteria shown in FIG. 8(E). In this embodiment, as shown in FIG. 8(E), the timbre types are the tones of musical instruments. When the process of S15-3 ends, the timbre determination process ends.
As described above, the timbre type determination program 13d determines the attitude of the portable terminal 60 from the “tilt information” using predetermined thresholds (the criteria shown in FIG. 8(C)), and then determines the timbre type from that attitude, so the timbre type is determined reliably.
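
The lookup of S15-3 amounts to a table keyed by (A, B). The sketch below shows the idea only. The contents of FIG. 8(E) are not reproduced in this text, so every instrument name in the table is a hypothetical placeholder; the only entry taken from the description is that A = 0, B = 0 yields “nothing” (represented here as None). The names TIMBRE_TABLE and determine_timbre are also hypothetical.

```python
from typing import Dict, Optional, Tuple

# (A, B) -> timbre type. Instrument names are HYPOTHETICAL placeholders,
# not the actual contents of FIG. 8(E).
TIMBRE_TABLE: Dict[Tuple[int, int], Optional[str]] = {
    (0, 0): None,        # "nothing": from the description of S18
    (1, 1): "trumpet",   # hypothetical
    (2, 1): "violin",    # hypothetical
    (1, 2): "flute",     # hypothetical
    (2, 2): "piano",     # hypothetical
}


def determine_timbre(a: int, b: int) -> Optional[str]:
    """Look up the timbre type for the values A and B from S15-2."""
    return TIMBRE_TABLE[(a, b)]
```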

（音程補正処理）
図５を用いて、以下に音程補正処理のフローについて説明する。音程補正処理が開始すると、Ｓ１６−１の処理に進む。
Ｓ１６−１「音声情報取得」の処理において、音程補正プログラム１３ｆは、音声情報記憶領域１３ｐに記憶された「音声情報」を、ＲＡＭのワーキングエリアに記憶させる。Ｓ１６−１の処理が終了すると、Ｓ１６−２の処理に進む。 (Pitch correction processing)
The flow of the pitch correction process will be described below with reference to FIG. When the pitch correction process is started, the process proceeds to S16-1.
In the process of S16-1 “acquisition of voice information”, the pitch correction program 13f stores the “voice information” stored in the voice information storage area 13p in the working area of the RAM. When the process of S16-1 ends, the process proceeds to S16-2.

Ｓ１６−２「伴奏音楽情報取得」の処理において、音程補正プログラム１３ｆは、伴奏音楽情報記憶領域１３ｋに記憶された「伴奏音楽情報」を、ＲＡＭのワーキングエリアに記憶させる。Ｓ１６−２の処理が終了すると、Ｓ１６−３の処理に進む。 In the process of S16-2 “accompaniment music information acquisition”, the pitch correction program 13f stores the “accompaniment music information” stored in the accompaniment music information storage area 13k in the working area of the RAM. When the process of S16-2 ends, the process proceeds to S16-3.

Ｓ１６−３「音程のズレが所定以上」の判断処理において、音程補正プログラム１３ｆは、ＲＡＭのワーキングエリアに記憶された「伴奏音楽情報」と「音声情報」を比較し、「音声情報」の音程が「伴奏音楽信号情報」の音程から所定以上ズレているか否かを判断する。なお、図９において、縦方向は音程、横方向は時間を意味する。図９の（１）や（２）に示されるように、音程補正プログラム１３ｆが、「音声情報」の音程が「伴奏音楽信号情報」の音程に基づく音程から所定以上（例えば、四分音以上）ズレていると判断した場合には（Ｓ１６−３の判断処理がＹＥＳ）、Ｓ１６−４の処理に進む。一方で、音程補正プログラム１３ｆが、「音声情報」の音程が「伴奏音楽信号情報」の音程から所定以上ズレていないと判断した場合には（Ｓ１６−３の判断処理がＮＯ）、音程補正処理が終了する。なお、「伴奏音楽信号情報」の音程に基づく音程とは、「伴奏音楽信号情報」の音程に一致する音程はもちろん、「伴奏音楽信号情報」の音程と所定の関係にある音程（例えば、「伴奏音楽信号情報」の音程から、１オクターブずれた音程や、３度又は５度ずれた和音の関係にある音程など）も含む意である。 In the determination process of S16-3 “pitch deviation is at or above a predetermined amount”, the pitch correction program 13f compares the “accompaniment music information” and the “voice information” stored in the working area of the RAM, and determines whether the pitch of the “voice information” deviates from the pitch of the “accompaniment music signal information” by a predetermined amount or more. In FIG. 9, the vertical direction represents pitch and the horizontal direction represents time. As shown in (1) and (2) of FIG. 9, when the pitch correction program 13f determines that the pitch of the “voice information” deviates by the predetermined amount or more (for example, a quarter tone or more) from a pitch based on the pitch of the “accompaniment music signal information” (YES in S16-3), the process proceeds to S16-4. When the pitch correction program 13f determines that the pitch of the “voice information” does not deviate from the pitch of the “accompaniment music signal information” by the predetermined amount or more (NO in S16-3), the pitch correction process ends.
Here, a pitch based on the pitch of the “accompaniment music signal information” means not only a pitch that matches the pitch of the “accompaniment music signal information” but also a pitch that has a predetermined relationship to it (for example, a pitch one octave away from the pitch of the “accompaniment music signal information”, or a pitch in a chord relationship a third or a fifth away).

Ｓ１６−４「音程補正」の処理において、音程補正プログラム１３ｆは、図９の（３）、（４）に示されるように、Ｓ１６−４の判断処理でズレていると判断した「音声情報」の音程を補正し、ＲＡＭのワーキングエリアに更新記憶させる。具体的には、図９の（３）（４）以外の音と同様に、「伴奏音楽信号情報」の音程に基づく音程となるように、音程が補正される。Ｓ１６−４の処理が終了すると、音程補正処理が終了する。
この音程補正処理により、ユーザーがスピーカー４０から再生される伴奏音楽の音程を外して発声した場合であっても、伴奏音楽に合った「変換音声情報」が生成される。 In the process of S16-4 “pitch correction”, the pitch correction program 13f corrects the pitch of the “voice information” that was judged to deviate in the determination process of S16-3, as shown in (3) and (4) of FIG. 9, and updates the copy stored in the working area of the RAM. Specifically, the pitch is corrected so that, like the notes other than (3) and (4) in FIG. 9, it becomes a pitch based on the pitch of the “accompaniment music signal information”. When the process of S16-4 ends, the pitch correction process ends.
Through this pitch correction process, “converted voice information” that matches the accompaniment music is generated even if the user sings off the pitch of the accompaniment music reproduced from the speaker 40.
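
The check of S16-3 and the correction of S16-4 can be sketched as follows, assuming that pitches are expressed as MIDI-style note numbers in semitones, so that a quarter tone is 0.5. The set of permitted offsets (unison, one octave up or down, a major third, and a perfect fifth) is one possible reading of “a pitch based on the pitch of the accompaniment”; the exact intervals, and the names nearest_target and correct_pitches, are assumptions and are not specified in the description.

```python
from typing import List

QUARTER_TONE = 0.5  # semitones; the example threshold given in the text

# ASSUMPTION: allowed offsets from the accompaniment pitch, in semitones
# (octave down, unison, major third, perfect fifth, octave up).
ALLOWED_OFFSETS = (-12, 0, 4, 7, 12)


def nearest_target(voice_pitch: float, accomp_pitch: float) -> float:
    """Closest pitch that counts as 'based on' the accompaniment pitch."""
    candidates = (accomp_pitch + off for off in ALLOWED_OFFSETS)
    return min(candidates, key=lambda t: abs(voice_pitch - t))


def correct_pitches(voice: List[float], accomp: List[float]) -> List[float]:
    """Snap each sung pitch to its target when it is off by a quarter tone or more."""
    corrected = []
    for v, a in zip(voice, accomp):
        target = nearest_target(v, a)
        if abs(v - target) >= QUARTER_TONE:  # S16-3: YES
            corrected.append(target)         # S16-4: correct the pitch
        else:                                # S16-3: NO, leave as sung
            corrected.append(v)
    return corrected
```

In this sketch a note sung slightly sharp of the accompaniment, or close to an octave above it, is left untouched, while a note a full semitone off is pulled to the nearest permitted pitch.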

（リズム補正処理）
図６を用いて、以下にリズム補正処理のフローについて説明する。リズム補正処理が開始すると、Ｓ１７−１の処理に進む。
Ｓ１７−１「リズムのズレが所定以上」の処理において、リズム補正プログラム１３ｇは、ＲＡＭのワーキングエリアに記憶された「伴奏音楽情報」と「音声情報」を比較し、「音声情報」のリズムが「伴奏音楽情報」のリズムから所定以上ずれているか否かを判断する。なお、図１０において、横軸は時間（デルタタイム）を意味する。図１０の（１）〜（４）に示されるように、リズム補正プログラム１３ｇが、「音声情報」のリズムが「伴奏音楽情報」のリズムから所定以上ずれていると判断した場合には（Ｓ１７−１の判断処理がＹＥＳ）、Ｓ１７−２の処理に進む。一方で、リズム補正プログラム１３ｇが、「音声情報」のリズムが「伴奏音楽情報」のリズムから所定以上ずれていないと判断した場合には（Ｓ１７−１の判断処理がＮＯ）、リズム補正処理が終了する。 (Rhythm correction processing)
The flow of the rhythm correction process will be described below using FIG. When the rhythm correction process is started, the process proceeds to S17-1.
In the process of S17-1 “rhythm deviation is at or above a predetermined amount”, the rhythm correction program 13g compares the “accompaniment music information” and the “voice information” stored in the working area of the RAM, and determines whether the rhythm of the “voice information” deviates from the rhythm of the “accompaniment music information” by a predetermined amount or more. In FIG. 10, the horizontal axis represents time (delta time). As shown in (1) to (4) of FIG. 10, when the rhythm correction program 13g determines that the rhythm of the “voice information” deviates from the rhythm of the “accompaniment music information” by the predetermined amount or more (YES in S17-1), the process proceeds to S17-2. When it determines that the rhythm does not deviate by the predetermined amount or more (NO in S17-1), the rhythm correction process ends.

Ｓ１７−２「リズム補正処理」の処理において、リズム補正プログラム１３ｇは、図１０の（５）〜（８）に示されるように、Ｓ１７−１の判断処理において、ズレていると判断した「音声情報」のリズムを補正し、音声情報記憶領域１３ｐに更新記憶させる。具体的には、図１０の（５）〜（８）に示されるように、リズム補正プログラム１３ｇは「音声情報」の早く入力を止めてしまった音を「伴奏音楽情報」に合うように伸ばす処理や、早く入力してしまった音を「伴奏音楽情報」に合うように入力を遅らせる処理を行う。Ｓ１７−２の処理が終了すると、リズム補正処理が終了する。
このリズム補正処理により、ユーザーがスピーカー４０から再生される伴奏音楽のリズムを外して発声した場合であっても、伴奏音楽に合った「変換音声情報」が生成される。 In the process of S17-2 “rhythm correction”, the rhythm correction program 13g corrects the rhythm of the “voice information” that was judged to deviate in the determination process of S17-1, as shown in (5) to (8) of FIG. 10, and updates the copy stored in the voice information storage area 13p. Specifically, as shown in (5) to (8) of FIG. 10, the rhythm correction program 13g lengthens a note of the “voice information” whose input stopped too early so that it matches the “accompaniment music information”, and delays a note that was input too early so that it matches the “accompaniment music information”. When the process of S17-2 ends, the rhythm correction process ends.
Through this rhythm correction process, “converted voice information” that matches the accompaniment music is generated even if the user sings off the rhythm of the accompaniment music reproduced from the speaker 40.
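
The two corrections named in S17-2, lengthening a note that was released too early and delaying a note that was entered too early, can be sketched as below. The sketch assumes that sung and reference notes are already paired one-to-one and described by an onset and a duration in delta-time ticks. The Note type, the tolerance value, and the function name correct_rhythm are hypothetical and do not come from the description.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical stand-in for the "predetermined amount" of S17-1, in ticks.
RHYTHM_TOLERANCE = 30


@dataclass
class Note:
    onset: int     # start time in delta-time ticks
    duration: int  # length in ticks

    @property
    def end(self) -> int:
        return self.onset + self.duration


def correct_rhythm(sung: List[Note], reference: List[Note]) -> List[Note]:
    """Align sung notes to reference notes (illustrative only)."""
    out = []
    for s, r in zip(sung, reference):
        onset, end = s.onset, s.end
        if r.onset - onset >= RHYTHM_TOLERANCE:  # entered too early: delay it
            onset = r.onset
        if r.end - end >= RHYTHM_TOLERANCE:      # released too early: extend it
            end = r.end
        out.append(Note(onset, max(end - onset, 0)))
    return out
```

Notes whose onset and release both fall within the tolerance are returned unchanged, which corresponds to the NO branch of S17-1.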

（効果付加処理）
図７を用いて、以下に効果付加処理のフローについて説明する。
効果付加処理が開始すると、Ｓ２０−１の処理に進む。
Ｓ２０−１「加速度データ取得」の処理において、効果付加プログラム１３ｈは、加速度情報記憶領域１３ｎに記憶された「加速度情報」をＲＡＭのワーキングエリアに記憶させる。Ｓ２０−１の処理が終了すると、Ｓ２０−２の処理に進む。 (Effect addition processing)
The effect adding process flow will be described below with reference to FIG.
When the effect addition process is started, the process proceeds to S20-1.
In the processing of S20-1 “Acquire acceleration data”, the effect addition program 13h stores “acceleration information” stored in the acceleration information storage area 13n in the working area of the RAM. When the process of S20-1 ends, the process proceeds to S20-2.

Ｓ２０−２「動き認識」の処理において、効果付加プログラム１３ｈは、ＲＡＭのワーキングエリアに記憶された「加速度情報」から、携帯端末６０の動きを認識する。具体的には、図１１の（Ａ）、（Ｂ）に示されるように、効果付加プログラム１３ｈは、携帯端末６０のＸ、Ｙ、Ｚ座標方向の動きを認識する。Ｓ２０−２の処理が終了すると、Ｓ２０−３の処理に進む。 In the process of S20-2 “motion recognition”, the effect addition program 13h recognizes the motion of the mobile terminal 60 from the “acceleration information” stored in the working area of the RAM. Specifically, as shown in FIGS. 11A and 11B, the effect addition program 13 h recognizes the movement of the mobile terminal 60 in the X, Y, and Z coordinate directions. When the process of S20-2 ends, the process proceeds to S20-3.

Ｓ２０−３「付加効果決定」の処理において、効果付加プログラム１３ｈは、Ｓ２０−２の処理で認識された携帯端末６０の動きから、付加効果を決定する。具体的には、効果付加プログラム１３ｈは、図１１の（Ｃ）の表に示されるような基準により、「変換音声情報」に付加する効果（例えば、出力を遅らせる、ビブラートを付加するなど）を決定する。つまり、ユーザーが携帯端末６０をＸ、Ｙ、Ｚの特定の方向に振った場合には、振った方向に対応する効果が決定される。或いは、図１１の（Ｄ）に示されように、効果付加プログラム１３ｈは、携帯端末６０の回転等の特定の動きを検知して、「変換音声情報」に付加する効果を決定する。Ｓ２０−３の処理が終了すると、Ｓ２０−４の処理に進む。 In the process of S20-3 “additional effect determination”, the effect addition program 13h determines the additional effect from the movement of the portable terminal 60 recognized in S20-2. Specifically, the effect addition program 13h determines the effect to be added to the “converted voice information” (for example, delaying the output or adding vibrato) according to criteria such as those shown in the table in FIG. 11(C). That is, when the user swings the portable terminal 60 in a specific X, Y, or Z direction, the effect corresponding to that direction is selected. Alternatively, as shown in FIG. 11(D), the effect addition program 13h detects a specific movement of the portable terminal 60, such as rotation, and determines the effect to be added to the “converted voice information”. When the process of S20-3 ends, the process proceeds to S20-4.

Ｓ２０−４「効果付加」の処理において、効果付加プログラム１３ｈは、「変換音声情報」にＳ２０−３の処理で決定された効果を付加し、変換音声情報記憶領域１３ｒに更新記憶させる。Ｓ２０−４の処理が終了すると、効果付加処理が終了する。
この効果付加処理により、ユーザーは、携帯端末６０を動かすという簡単な操作により、「変換音声情報」に効果を付加させることが可能となる。 In the process of S20-4 “add effect”, the effect addition program 13h adds the effect determined in the process of S20-3 to “converted voice information”, and updates and stores it in the converted voice information storage area 13r. When the process of S20-4 ends, the effect addition process ends.
By this effect addition process, the user can add an effect to the “converted voice information” by a simple operation of moving the mobile terminal 60.
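
The selection of an effect from the swing direction (S20-2 and S20-3) can be sketched as a dominant-axis test on the acceleration vector. The table of FIG. 11(C) is not reproduced in this text, so the axis-to-effect mapping below is hypothetical: “delay” and “vibrato” are the two examples named in the description, while “reverb”, the threshold value, and the name determine_effect are placeholders introduced only for illustration.

```python
from typing import Dict, Optional, Tuple

# HYPOTHETICAL mapping; the real table is FIG. 11(C), which is not shown here.
EFFECT_BY_AXIS: Dict[str, str] = {"x": "delay", "y": "vibrato", "z": "reverb"}

# Hypothetical stand-in for the "predetermined value" checked in S19.
SWING_THRESHOLD = 1.5


def determine_effect(accel: Tuple[float, float, float]) -> Optional[str]:
    """Pick the effect for the axis with the largest acceleration magnitude."""
    axes = dict(zip("xyz", accel))
    dominant = max(axes, key=lambda k: abs(axes[k]))
    if abs(axes[dominant]) < SWING_THRESHOLD:
        return None  # no sufficiently strong swing: no effect is added
    return EFFECT_BY_AXIS[dominant]
```

Recognition of compound gestures such as the rotation shown in FIG. 11(D) would need more than a single acceleration sample and is outside this sketch.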

（総括）
以上説明したように、本発明では、ユーザーが携帯端末６０を傾けるという簡単な操作により、ユーザーが発声した音声が変換される音色の種類を選択できるようになっている。 (Summary)
As described above, according to the present invention, the type of timbre to which the voice uttered by the user is converted can be selected by a simple operation in which the user tilts the portable terminal 60.

なお、以上説明した実施形態では、カラオケ装置本体２０は、「伴奏動画情報」を取得しているが、動画無しの「伴奏音楽情報」を取得する実施形態であっても差し支えない。また、以上説明した実施形態では、カラオケ装置本体２０は「伴奏動画情報」や「伴奏音楽情報」を公衆通信網９００から取得しているが、ＤＶＤやＣＤＲＯＭ等リムーバブルディスクに記憶さされた「伴奏動画情報」や「伴奏音楽情報」を、ＤＶＤドライブやＣＤドライブ等の読み取り装置で取得することにしても差し支えない。 In the embodiment described above, the karaoke apparatus body 20 acquires “accompaniment video information”, but may be an embodiment that acquires “accompaniment music information” without a video. Further, in the embodiment described above, the karaoke apparatus body 20 acquires “accompaniment video information” and “accompaniment music information” from the public communication network 900, but “accompaniment” stored in a removable disk such as a DVD or CDROM. The “moving image information” and “accompaniment music information” may be acquired by a reading device such as a DVD drive or a CD drive.

以上、現時点において、もっとも、実践的であり、かつ好ましいと思われる実施形態に関連して本発明を説明したが、本発明は、本願明細書中に開示された実施形態に限定されるものではなく、請求の範囲および明細書全体から読み取れる発明の要旨あるいは思想に反しない範囲で適宜変更可能であり、そのような変更を伴うカラオケシステムもまた技術的範囲に包含されるものとして理解されなければならない。 The present invention has been described above in connection with the embodiments currently considered most practical and preferable. However, the present invention is not limited to the embodiments disclosed in this specification and can be modified as appropriate within a range that does not depart from the gist or idea of the invention readable from the claims and the specification as a whole; a karaoke system incorporating such modifications must also be understood to fall within the technical scope.

９バス
１１ＣＰＵ
１３記憶部
１３ａ伴奏動画情報取得プログラム
１３ｂ伴奏音楽情報取得プログラム
１３ｃ出力信号生成プログラム
１３ｄ音色種類決定プログラム
１３ｅ音色変換プログラム
１３ｆ音程補正プログラム
１３ｇリズム補正プログラム
１３ｈ効果付加プログラム
１３ｉ伴奏動画生成プログラム
１３ｊ伴奏動画情報記憶領域
１３ｋ伴奏音楽情報記憶領域
１３ｍ傾斜情報記憶領域
１３ｎ加速度情報記憶領域
１３ｐ音声情報記憶領域
１３ｑ音色種類記憶領域
１３ｒ変換音声情報記憶領域
１４音声入力インターフェース
１５出力信号生成部
１６画像生成部
１７通信部
１８外部通信部
１９操作部
２０カラオケ装置本体
３０マイクロフォン
４０スピーカー
５０画像表示装置
６０携帯端末
６１ＣＰＵ
６３記憶部
６４傾斜センサー
６５加速度センサー
６６通信部
６９バス
１００カラオケシステム
９００公衆通信網
9 Bus
11 CPU
13 Storage unit
13a Accompaniment video information acquisition program
13b Accompaniment music information acquisition program
13c Output signal generation program
13d Timbre type determination program
13e Timbre conversion program
13f Pitch correction program
13g Rhythm correction program
13h Effect addition program
13i Accompaniment video generation program
13j Accompaniment video information storage area
13k Accompaniment music information storage area
13m Tilt information storage area
13n Acceleration information storage area
13p Voice information storage area
13q Timbre type storage area
13r Converted voice information storage area
14 Voice input interface
15 Output signal generation unit
16 Image generation unit
17 Communication unit
18 External communication unit
19 Operation unit
20 Karaoke apparatus body
30 Microphone
40 Speaker
50 Image display device
60 Portable terminal
61 CPU
63 Storage unit
64 Tilt sensor
65 Acceleration sensor
66 Communication unit
69 Bus
100 Karaoke system
900 Public communication network

Claims

1. A karaoke system comprising:
a microphone that generates a voice signal from a voice;
a karaoke apparatus body having accompaniment music information acquisition means for acquiring accompaniment music information, and output signal generation means for generating an output signal based on the voice signal output from the microphone and the accompaniment music information acquired by the accompaniment music information acquisition means; and
a speaker that reproduces the output signal as sound,
the karaoke system further comprising a portable terminal having a tilt sensor that measures tilt and communication means for transmitting the tilt information measured by the tilt sensor to the karaoke apparatus body,
wherein the karaoke apparatus body further has:
timbre type determination means for determining, based on the tilt information transmitted by the portable terminal, the timbre type into which the voice signal output from the microphone is converted; and
timbre conversion means for converting the timbre of the voice signal output from the microphone based on the timbre type determined by the timbre type determination means, thereby generating converted voice information,
and wherein the output signal generation means generates the output signal by synthesizing the converted voice information and the accompaniment music information.

2. The karaoke system according to claim 1, wherein the timbre type determination means determines the attitude of the portable terminal from the tilt information transmitted by the portable terminal using a threshold value, and thereby determines the timbre type.

3. The karaoke system according to claim 1, wherein the karaoke apparatus main body further includes pitch correction means for correcting the pitch of the audio signal output from the microphone based on the accompaniment music information.

4. The karaoke system according to any one of claims 1 to 3, wherein the karaoke apparatus main body further includes rhythm correction means for correcting the rhythm of the audio signal output from the microphone based on the accompaniment music information.

5. The karaoke system according to any one of the preceding claims, wherein the portable terminal further has an acceleration sensor that measures acceleration,
the communication means transmits the acceleration information measured by the acceleration sensor to the karaoke apparatus body, and
the karaoke apparatus body further has effect adding means for adding an effect to the converted voice information generated by the timbre conversion means, based on the acceleration information transmitted from the portable terminal.