JP5776205B2

JP5776205B2 - Sound signal generating apparatus and program

Info

Publication number: JP5776205B2
Application number: JP2011028622A
Authority: JP
Inventors: 山内　明; 明山内; 光加瀬
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2011-02-14
Filing date: 2011-02-14
Publication date: 2015-09-09
Anticipated expiration: 2031-02-14
Also published as: JP2012168323A

Description

The present invention relates to a sound signal generation apparatus and program that generate one or more harmony sound signals by pitch-shifting an input audio signal. In particular, it relates to a technique for reflecting the pitch fluctuation of the input audio signal in each of a plurality of harmony sound signals at an arbitrary magnitude. The sound signal generation apparatus and method of the present invention can be used in systems for processing human voice or instrument sound that accompany music-related equipment such as karaoke machines, electronic musical instruments, effectors, and personal computers.

Electronic music apparatuses and programs are known that, based on an input audio signal such as an instrument performance sound or a human voice input by a user via a microphone or the like, automatically generate one or more harmony sound signals whose pitch lies above or below the input audio signal by a predetermined interval such as a third or a fifth, and reproduce them together with the input audio signal so that a lead sound (input sound) and harmony sounds (additional sounds) are sounded simultaneously. The apparatuses described in Patent Document 1 and Patent Document 2 below are examples.

In conventional apparatuses such as those described in Patent Documents 1 and 2, a fundamental frequency, that is, a pitch corresponding to one of the musical note names, is identified for each predetermined section (or predetermined period) based on frequency information (pitch) obtained by frequency analysis of the input audio signal. The input audio signal (more specifically, one cycle of waveform element data cut out by a window function corresponding to the identified pitch and then stored) is pitch-shifted according to a predetermined pitch shift amount determined by the identified pitch, so that one or more harmony sound signals at predetermined target pitches (each corresponding to one of the musical note names) are generated as separate, independent additional sounds. Furthermore, Patent Document 2 discloses that when a key-press sound (corresponding to the input audio signal) is bent up in response to the user's wheel operation, that is, when the pitch bend value (also called the pitch shift amount) is changed, the pitch bend amount of the additional sound (corresponding to the harmony sound signal) is corrected so that the additional sound has a pitch that harmonizes with the chord. In other words, the apparatus of Patent Document 2 generates a plurality of additional sounds consisting of the constituent notes of a chord according to chord information designated in advance (specifically, chord names such as C major and A minor).

Patent Document 1: Japanese Patent No. 2879948
Patent Document 2: Japanese Patent Laid-Open No. 06-202660

However, in conventional apparatuses such as those of Patent Documents 1 and 2, the generated harmony sound signal only ever has a semitone-unit pitch corresponding to one of the musical note names. Its pitch is always constant and has no subtle pitch change of less than a semitone (100 cents), referred to in this specification as pitch fluctuation, so the musical expression of the harmony sound becomes mechanical, which is undesirable. In particular, when the input audio signal has pitch fluctuation, a large gap can arise between the musically rich expression of the lead sound (input sound) and the mechanical expression of the harmony sound (additional sound), and the user tends to find the result unnatural. A harmony sound that merely has some pitch fluctuation (one that does not reflect the pitch fluctuation of the input audio signal) can be generated by applying pitch control such as vibrato to the constant-pitch harmony sound signal. However, to finally produce harmony that matches the user's own preference, it is necessary to generate a plurality of harmony sounds having different pitch fluctuations. Examples of such harmony are harmony that is easy to listen to overall because the pitch fluctuation on the low side is smaller and more stable than on the high side, harmony that clearly brings out the mood of the tune (a major or minor feel, for example), and harmony whose sense of tension is strengthened or weakened by adjusting the pitch fluctuation of tension notes. Conventionally, generating a plurality of harmony sounds with different pitch fluctuations required setting parameters for vibrato control individually for each harmony sound, which was very troublesome for the user. A sound signal generation apparatus has therefore been desired that can, with a simple operation, generate one or more harmony sound signals each reflecting the pitch fluctuation of the input audio signal at an arbitrary magnitude, but no such apparatus has yet been proposed.

The present invention has been made in view of the above. Its object is to provide a sound signal generation apparatus and program that make it easy to reflect the pitch change of a semitone or less (pitch fluctuation) contained in the input audio signal itself, at an arbitrary magnitude, in each of one or more harmony sound signals automatically generated from that input audio signal.

A sound signal generation apparatus according to the present invention generates one or more harmony sound signals whose pitch is controlled to follow the pitch variation of an input audio signal, and comprises: input means for inputting an audio signal; pitch detection means for successively detecting a specific pitch of the input audio signal and detecting, from the specific pitch, a normalized pitch corresponding to a note name; difference generation means for obtaining difference pitch information related to the difference between the specific pitch and the normalized pitch; target pitch determination means for determining a plurality of pitches having mutually different intervals relative to the normalized pitch as target pitches of a plurality of sound signals to be generated; setting means for setting, for each of the plurality of target pitches, pitch adjustment information indicating a difference pitch addition ratio in order to adjust the degree of pitch fluctuation added to that target pitch; and sound signal generation means for generating, for each of the plurality of target pitches, a sound signal having a pitch obtained by modulating that target pitch according to the difference pitch information whose addition ratio has been adjusted according to the pitch adjustment information.
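The relationship between these means can be written compactly. The following is only a sketch in notation introduced here for illustration. It assumes that all pitches are expressed in cents and that "modulating" a target pitch means adding the scaled difference to it; the specification does not state the relationship in this form. Here p(t) is the specific pitch, \hat{p}(t) the normalized pitch, I_i the interval of harmony voice i, and r_i its difference pitch addition ratio.

```latex
\begin{aligned}
d(t)   &= p(t) - \hat{p}(t)                        && \text{(difference pitch)} \\
q_i(t) &= \hat{p}(t) + I_i                         && \text{(target pitch of harmony voice } i\text{)} \\
h_i(t) &= q_i(t) + r_i\, d(t), \quad 0 \le r_i \le 1 && \text{(pitch of the generated harmony voice } i\text{)}
\end{aligned}
```

Under these assumptions, r_i = 0 holds voice i at its target pitch, while r_i = 1 makes it follow the fluctuation of the lead sound in full.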

Since the difference pitch information indicates the pitch fluctuation (the fluctuation component of the pitch) in the input audio signal, generating a sound signal (harmony sound) whose pitch is the target pitch modulated according to the difference pitch information yields a harmony sound signal that reflects the pitch fluctuation of the input audio signal (lead sound). Moreover, when a plurality of target pitches are modulated according to the difference pitch information in order to generate a plurality of harmony sound signals, pitch adjustment information indicating a difference pitch addition ratio is set for each target pitch in order to adjust the degree of pitch fluctuation added to it, and for each target pitch a harmony sound signal is generated whose pitch is that target pitch modulated according to the difference pitch information with the addition ratio adjusted by the pitch adjustment information. As a result, the harmony sound signals reproduce the pitch fluctuation of the input audio signal, so they do not differ greatly from the lead sound and do not sound unnatural to the user, while the degree of pitch fluctuation can still be adjusted independently for each target pitch. A plurality of harmony sound signals with different degrees of pitch fluctuation can therefore be generated, and the user can easily generate and output harmony that matches his or her own preference.

The present invention can be configured and implemented not only as an apparatus invention but also as a method invention. It can also be implemented in the form of a program for a processor such as a computer or a DSP, or in the form of a storage medium storing such a program.

According to the present invention, a difference pitch, which is the difference between the pitch of the input audio signal and the pitch corresponding to the identified note name, is obtained. When a plurality of target pitches are modulated according to the difference pitch in order to generate a plurality of harmony sound signals, pitch adjustment information indicating a difference pitch addition ratio is set for each target pitch in order to adjust the degree of pitch fluctuation added to it, and for each target pitch a harmony sound signal is generated whose pitch is that target pitch modulated according to the difference pitch information with the addition ratio adjusted by the pitch adjustment information. The resulting harmony sound signals reproduce the pitch fluctuation of the input audio signal, so they do not differ greatly from the lead sound and do not sound unnatural to the user, while the degree of pitch fluctuation can be adjusted independently for each target pitch. The invention thus has the effect that a plurality of harmony sound signals with different degrees of pitch fluctuation can be generated.

FIG. 1 is a hardware configuration block diagram showing an embodiment of the overall configuration of a sound signal generation apparatus (electronic music apparatus) according to the present invention.
FIG. 2 is a conceptual diagram showing an embodiment of the data structure of the harmony table.
FIG. 3 is a flowchart showing the first half of the harmony sound generation process.
FIG. 4 is a flowchart showing the second half of the harmony sound generation process.
FIG. 5 is a conceptual diagram showing an embodiment of the data structure of the difference pitch addition ratio table.
FIG. 6 is a flowchart showing the difference pitch addition ratio determination process.
FIG. 7 is a flowchart showing the rule 2 process.
FIG. 8 is a flowchart showing the rule 3 or 4 process.
FIG. 9 is a flowchart showing the rule 5 or 6 process.
FIG. 10 is a conceptual diagram showing a specific example for explaining the harmony sound generation process.
FIG. 11 is a flowchart showing an example of the pitch detection process.

Embodiments of the present invention are described below in detail with reference to the accompanying drawings.

FIG. 1 is a hardware configuration block diagram showing an embodiment of the overall configuration of a sound signal generation apparatus (electronic music apparatus) according to the present invention. The electronic music apparatus of this embodiment is controlled by a microcomputer consisting of a microprocessor unit (CPU) 1, a read-only memory (ROM) 2, and a random access memory (RAM) 3. The CPU 1 controls the operation of the entire apparatus. The ROM 2, the RAM 3, an input operation unit 4, a display unit 5, a sound source 6, a communication interface (I/F) 7, and a storage device 8 are each connected to the CPU 1 via a data and address bus 1D.

The ROM 2 stores various control programs executed or referenced by the CPU 1 and various data such as the harmony table (pitch determination table) shown in FIG. 2. The RAM 3 is used as working memory that temporarily stores various data generated when the CPU 1 executes a given control program, or as memory that temporarily stores the control program currently being executed and its related data. Predetermined address areas of the RAM 3 are assigned to the respective functions and used as registers, flags, tables, memories, and so on.

The input operation unit 4 may include, for example, an input device such as a microphone for inputting audio signals such as a human voice uttered by the user or the performance sound of an instrument played by the user, a performance start/stop button for instructing the start or stop of a performance (input of the audio signal), various operators such as switches for setting parameters, a numeric keypad for entering numeric data, a keyboard for entering character data, and a mouse. The input device is not limited to a microphone. It may be a performance operator such as a keyboard that generates, in response to user operation, the chord information needed to generate harmony sound signals, or a data input device such as a sequencer that supplies chord information stored in advance in the ROM 2 or the like in order of performance progression. The input operation unit 4 further includes setting operators with which the user can edit or set to any value the "difference pitch addition ratio" (pitch adjustment information) defined in the difference pitch addition ratio table described later (see FIG. 5).

The display unit 5 consists of, for example, a liquid crystal display panel (LCD) or a CRT. It displays various information such as a musical score for the lead sound produced from the audio signal input via the microphone or the like, a musical score for the harmony sound produced from the generated harmony sound signals, the parameter settings made with the various operators, a list of the various data stored in advance, and the control state of the CPU 1.

The sound source 6 can generate musical tone signals on a plurality of channels simultaneously. Based on a waveform signal supplied via the data and address bus 1D, for example one obtained by temporarily buffering the input audio signal input via the microphone, it generates the sound signal of the lead sound on one channel and generates harmony sound signals on other channels. As the source waveform of the lead sound, the temporarily buffered waveform of the input audio signal may be used as is, or a waveform whose pitch, timbre, or the like has been suitably controlled on the basis of the input audio signal may be used. As the source waveform for the harmony sound signals, a waveform based on the temporarily buffered input audio signal may be used, or any other suitable source waveform may be used.
These signals generated by the sound source 6 are sounded through a sound system 6A that includes an amplifier, speakers, and the like. When generating the input audio signal, the harmony sound signals, and so on, the sound source 6 can apply various effects such as gender (the type and depth of voice quality, such as a male or female voice), tremolo, volume, pan (localization), detune, and reverb (reverberation). Any conventional configuration may be used for the sound source 6 and the sound system 6A. For example, the sound source 6 may generate or reproduce source waveforms using any tone synthesis method, encoding method, or data compression method, such as FM, PCM, physical modeling, formant synthesis, or MP3. All or part of the sound source 6 may be implemented in dedicated hardware, or in software processing by the CPU 1 or a DSP (Digital Signal Processor).

The communication interface (I/F) 7 is an interface for exchanging various information, such as control programs and data, between the apparatus and an external device (not shown). The communication interface 7 may be, for example, a MIDI interface, a LAN, the Internet, or a telephone line, and it may provide both wired and wireless connections rather than only one of them.

The storage device 8 stores various information such as the harmony table prepared in advance (see FIG. 2, described later) and the various control programs executed by the CPU 1. It may also be made able to store the input audio signal, the generated harmony sound signals, and the like. When no control program is stored in the ROM 2, the control program can be stored in the storage device 8 (for example, a hard disk) and read into the RAM 3, which lets the CPU 1 perform the same operations as when the control program is stored in the ROM 2. This makes it easy to add or upgrade control programs. The storage device 8 is not limited to a hard disk (HD). It may be any storage device that uses a storage medium of some form, such as a flexible disk (FD), a compact disk (CD-ROM, CD-RAM), a magneto-optical disk (MO), or a DVD (Digital Versatile Disk). It may also be a semiconductor memory such as a flash memory.

Needless to say, the sound signal generation apparatus (electronic music apparatus) described above is not limited to one in which the input operation unit 4, the display unit 5, the sound source 6, and so on are built into a single body. These may be separate devices connected to one another through communication interfaces such as a MIDI interface or various networks.
The sound signal generation apparatus (electronic music apparatus) and program according to the present invention may be applied to apparatus or equipment of any form, such as a karaoke machine, an electronic musical instrument, a personal computer, a portable communication terminal such as a mobile phone, or a game machine. When applied to a portable communication terminal, the predetermined functions need not be completed on the terminal alone. Some of the functions may be placed on a server so that the system consisting of the terminal and the server realizes the predetermined functions as a whole.

The sound signal generation apparatus (electronic music apparatus) shown in FIG. 1 has a harmony sound generation (addition) function. It detects the pitch of the input audio signal input via a microphone or the like by frequency analysis (ultimately identifying a specific pitch corresponding to one of the musical note names), newly determines one or more target pitches (likewise specific pitches corresponding to musical note names) based on the identified pitch and on chord information input from a keyboard or the like, and automatically generates harmony sound signals having the determined target pitches.
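The step of identifying "a specific pitch corresponding to one of the musical note names" from a detected frequency can be illustrated as follows. This is a minimal sketch, not the detection method of the embodiment: it assumes twelve-tone equal temperament, A4 = 440 Hz, and the MIDI convention in which note 60 is C4, none of which is stated in the specification. The function name `normalize_pitch` is hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def normalize_pitch(freq_hz, a4_hz=440.0):
    """Return (note_name, midi_number, deviation_cents) for a detected frequency.

    Assumptions (not from the specification): equal temperament,
    A4 = a4_hz, MIDI note 69 = A4 and MIDI note 60 = C4.
    """
    # Continuous (fractional) MIDI note number of the detected frequency.
    midi_float = 69.0 + 12.0 * math.log2(freq_hz / a4_hz)
    # Nearest semitone: this plays the role of the normalized pitch.
    midi = math.floor(midi_float + 0.5)
    # Signed deviation from the normalized pitch, in cents.
    cents = (midi_float - midi) * 100.0
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}", midi, cents
```

The third return value corresponds to the pitch fluctuation of the lead sound, that is, the part of the detected pitch that is lost when the pitch is rounded to a note name.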

Here, the target pitch is determined as one of the note names of the twelve-tone scale according to the harmony table (pitch determination table) prepared in advance and shown in FIG. 2, based on the specific pitch corresponding to a musical note name obtained by frequency analysis of the input audio signal and on the chord information input from a keyboard or the like (the so-called chord input method). FIG. 2 is a conceptual diagram showing the data structure of the harmony table. FIG. 2(A) shows the table referenced when the chord name "C major" is designated as the chord information, and FIG. 2(B) shows the table referenced when the chord name "C minor" is designated. Each table is shown, by way of example, with a data structure for generating several series of harmony sound signals made up of the constituent notes of the corresponding chord, according to the specific pitch obtained from the input audio signal.

A plurality of harmony tables are stored in advance, for example one table per chord name, and the one corresponding table is selected according to the chord information (here, the chord name). As can be seen from FIGS. 2(A) and 2(B), the harmony table defines, for each specific pitch (input pitch) corresponding to a musical note name obtained by frequency analysis of the input audio signal, the target pitches of a plurality of harmony series made up of the constituent notes of the chord corresponding to the chord information. In FIGS. 2(A) and 2(B), the input pitch is written with the note names "C, C♯, D, D♯, E, F, F♯, G, G♯, A, A♯, B", and the target pitches are likewise written as note names.

In the note-name notation of the target pitches, the target pitch "G", for example, denotes the G in the same octave region as the input pitch, "C+1" denotes the C in the octave region one above the input pitch, and "C+2" denotes the C in the octave region two above the input pitch. Although not shown in the figure, the target pitch "E-1" would likewise denote the E in the octave region one below the input pitch, and "E-2" the E in the octave region two below. In the example of FIG. 2(A), therefore, when the input pitch is "E3" the target pitches of the harmony sound signals are determined as "G3", "C4", and "E4", and when the input pitch is "A2♯" they are determined as "E3", "G3", and "C4". In this embodiment, the octave regions are delimited between "B" and "C".
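The relative octave notation can be resolved to absolute note names mechanically. The following is a minimal sketch under stated assumptions: the two table rows are reconstructed only from the two worked examples in the text (they are not copied from FIG. 2), the note "A2♯" is written "A#2" in the code, and the names `C_MAJOR_ROWS`, `resolve_entry`, and `target_pitches` are hypothetical.

```python
import re

# Partial C-major harmony table. These two rows are inferred from the
# worked examples in the text (E3 -> G3, C4, E4 and A#2 -> E3, G3, C4);
# the remaining rows of FIG. 2(A) are not reproduced here.
C_MAJOR_ROWS = {
    "E": ["G", "C+1", "E+1"],
    "A#": ["E+1", "G+1", "C+2"],
}


def resolve_entry(entry, input_octave):
    """Turn a table entry such as "C+1" into an absolute note such as "C4"."""
    match = re.fullmatch(r"([A-G]#?)([+-]\d+)?", entry)
    if match is None:
        raise ValueError(f"unrecognized table entry: {entry!r}")
    name = match.group(1)
    # No suffix means "same octave region as the input pitch".
    shift = int(match.group(2) or 0)
    return f"{name}{input_octave + shift}"


def target_pitches(input_note, table):
    """Look up the target pitches for an input note such as "E3"."""
    match = re.fullmatch(r"([A-G]#?)(-?\d+)", input_note)
    if match is None:
        raise ValueError(f"unrecognized note: {input_note!r}")
    name, octave = match.group(1), int(match.group(2))
    return [resolve_entry(entry, octave) for entry in table[name]]
```

Because the octave regions start at C, an entry with no suffix always stays within the octave number of the input note, which is why "G" resolves to "G3" for the input "E3".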

The embodiment above uses the chord input method, in which the pitches of the harmony sound signals are determined from chord information (more precisely, from the harmony table). The invention is not limited to this, and any other known method that determines the pitches of the harmony sound signals without chord information may be used. For example, the pitches of the harmony sound signals may be set uniformly to a plurality of pitches separated from the pitch of the input audio signal by predetermined intervals, such as the two pitches four semitones above (a major third) and seven semitones above (a perfect fifth). This is the so-called fixed method.
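As a small illustration of the fixed method, the harmony frequencies for given semitone offsets can be computed from the lead frequency. This sketch assumes twelve-tone equal temperament, where one semitone is a frequency ratio of 2^(1/12); the specification does not prescribe a tuning, and the function name `fixed_harmony_freqs` is hypothetical.

```python
def fixed_harmony_freqs(lead_hz, semitone_offsets=(4, 7)):
    """Frequencies lying a fixed number of semitones above the lead pitch.

    Assumes equal temperament. The default offsets are the major third
    (4 semitones) and perfect fifth (7 semitones) named in the text.
    """
    return [lead_hz * 2.0 ** (n / 12.0) for n in semitone_offsets]
```

For a lead sound at 440 Hz this gives roughly 554.4 Hz and 659.3 Hz. Unlike the chord input method, the same offsets are applied whatever the lead note is.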

The electronic music apparatus shown in FIG. 1 can of course generate harmony sound signals of a constant pitch (the target pitch) without pitch fluctuation. In addition, when the input audio signal has pitch fluctuation (of less than 100 cents), it can generate harmony sound signals that reflect that fluctuation at arbitrary, mutually different magnitudes. The harmony sound generation function that generates one or more harmony sound signals reflecting the pitch fluctuation of the input audio signal in this way is described below with reference to FIGS. 3 to 10. FIGS. 3 and 4 are flowcharts showing an embodiment of the "harmony sound generation process" by which the CPU 1 realizes this function. For convenience of illustration, the first half of the process is shown in FIG. 3 and the second half, which follows it, in FIG. 4. The process starts when the start of a performance is instructed, for example by user operation of the performance start/stop button, and is executed repeatedly until the performance is instructed to stop. The process is described below with reference to FIG. 10 as appropriate. FIG. 10 is a conceptual diagram showing a specific example for explaining the harmony sound generation process.

As shown in FIG. 3, step S1 performs initial setup. This includes, for example, clearing the various buffer areas (the chord buffer that stores chord information, the lead buffer that stores the input pitch, the note buffer that stores the harmony pitches (target pitches), and the residual buffer that stores the difference pitch between the input pitch and the target pitch) and selecting the harmony pitch determination method (for example, the chord input method or the fixed method described above) according to a user operation. In step S2, the difference pitch addition ratios are edited and set. For example, the user edits and sets a difference pitch addition ratio table such as that shown in FIG. 5. A difference pitch addition ratio (pitch adjustment information) is a parameter that determines, on the basis of the difference pitch, how much pitch fluctuation is added to a harmony pitch (here expressed, as one example, as a ratio from 0 to 100%). A plurality of such ratios are defined in the difference pitch addition ratio table of FIG. 5. The table is referenced when harmony sound signals are generated, so that pitch adjustment (adjustment of less than 100 cents) can be applied individually to each of a plurality of harmony pitches (details are given later).

The difference pitch addition ratio table is now described. FIG. 5 is a conceptual diagram showing an embodiment of the data structure of the table. The table shown in FIG. 5(A) defines the difference pitch addition ratio (pitch adjustment information) applied to each harmony sound, for the first, second, third, and subsequent notes (corresponding to numbers 1 to 3 in the figure) in a predetermined pitch order (lowest first, highest first, or any other suitable order). The ratio is pitch adjustment information that adjusts the difference between the specific pitch, described later, and the normalized pitch; this difference corresponds to the magnitude of the pitch fluctuation. In the example shown, when the harmony target pitches are "E3", "G3", and "C4", the difference is adjusted so that the pitch fluctuation of the harmony sounds at "E3", "G3", and "C4" becomes 20%, 25%, and 15%, respectively, of the pitch fluctuation of the input audio signal (that is, of the difference pitch), taken as 100%.
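The per-note table of FIG. 5(A) can be pictured with a short sketch. The 20%, 25%, and 15% values come from the example above; the list representation, the function name, and the use of cents for the difference pitch at this point are assumptions made for illustration.

```python
# Ratios in percent for the 1st, 2nd and 3rd harmony notes (example values of FIG. 5(A)).
RATIO_PERCENT_BY_ORDER = [20, 25, 15]


def scaled_fluctuations(diff_cents, ratios_percent=RATIO_PERCENT_BY_ORDER):
    """Scale one input difference pitch (in cents) by each harmony note's ratio."""
    return [diff_cents * percent / 100 for percent in ratios_percent]


# An input deviation of +40 cents is reflected as +8, +10 and +6 cents.
print(scaled_fluctuations(40.0))  # [8.0, 10.0, 6.0]
```

Each harmony note therefore follows the same fluctuation shape as the input, only at a different depth.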

The difference pitch addition ratio table shown in FIG. 5(B), on the other hand, defines the difference pitch addition ratio (pitch adjustment information) to apply for each magnitude (level) of pitch fluctuation added to a harmony sound (in this example four levels: large, medium, small, and none).
Which of the two tables is used is determined by the harmony sound generation rule (described in detail later). When the level-based ratios are used, the generation rule also fixes in advance which level is applied to which harmony sound (see the description of each rule process later).
Needless to say, the difference pitch addition ratio is not limited to the forms described above and may, for example, be given as a concrete numerical value.

Returning to FIG. 3, step S3 determines whether the end of the performance has been detected. If it has (YES in step S3), a termination process (step S28), such as muting any lead sound and/or harmony sound currently sounding, is executed and the process ends. If it has not (NO in step S3), the process determines whether an input sound off has been detected (step S4).
In parallel with the harmony sound generation process, a known pitch detection process such as that shown in FIG. 11 is executed continually (for example, at a fixed interrupt period). Specifically, the input audio signal received through a microphone or the like is digitized by an A/D conversion circuit, and the "frequency detection" process (S70) detects the specific pitch of the digitized signal to obtain a specific frequency signal (specific pitch information). The specific pitch is the pitch before it is rounded to a note-name frequency (the normalized pitch). The "vowel section detection" process (S71) detects the vowel sections of the input audio signal and divides the signal into its different vowel sections. The "frequency detection" process may use any known technique, such as the zero-cross method well known in the field of speech analysis, so a detailed description is omitted here. While one (the same) vowel section continues, one input sound is regarded as remaining on. Step S4 of FIG. 3 therefore determines whether an input sound off has been detected by checking whether the same vowel section is continuing.
If no input sound off has been detected, that is, if the same vowel section is continuing, step S4 is NO and the process jumps to step S7. If an input sound off has been detected, that is, if one vowel section has ended, step S4 is YES; the process then mutes the lead sound produced by reproducing the input audio signal (step S5) and mutes the harmony sound produced by reproducing the harmony sound signal (step S6).
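The source leaves the "frequency detection" step (S70) to known techniques and names the zero-cross method as one option. The following is only a rough sketch of that idea, assuming a clean, roughly periodic signal held as a list of floats; the function name is hypothetical, and a practical detector would need filtering and noise handling that are not shown.

```python
import math


def zero_cross_frequency(samples, sample_rate):
    """Estimate frequency in Hz from the spacing of rising zero crossings.

    Returns None when fewer than two rising crossings are found.
    """
    # Indices where the signal passes from negative to non-negative.
    rising = [i for i in range(1, len(samples))
              if samples[i - 1] < 0.0 <= samples[i]]
    if len(rising) < 2:
        return None
    span = rising[-1] - rising[0]          # samples covered by whole periods
    return sample_rate * (len(rising) - 1) / span


# A 0.1 s test tone at 220 Hz, sampled at 44.1 kHz.
rate = 44100
tone = [math.sin(2 * math.pi * 220.0 * n / rate + 0.3) for n in range(4410)]
print(zero_cross_frequency(tone, rate))
```

The estimate is only as fine as the sample spacing allows, so longer analysis windows give steadier values.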

Step S7 determines whether a new input sound (an input audio signal in a new vowel section) has been detected. If no new input sound has been detected, that is, if the current vowel section has not ended (NO in step S7), the process jumps to step S19 shown in FIG. 4. If a new input sound has been detected, that is, if one vowel section has ended and a different one has begun (YES in step S7), the frequency information of the input audio signal is quantized to note-name units to identify the input pitch (step S8). Specifically, the frequency signal obtained by the "frequency detection" process is passed through a "flattening" process (also called smoothing) that evens out its variation. The flattened frequency signal is then discretized by a "note name detection" process, at predetermined time intervals, to one of the note names of the 12-tone scale. Because the flattened frequency signal is rounded to a predetermined pitch corresponding to one of the musical note names defined in semitone (100 cent) steps, the input audio signal is assigned to one of the pitches that correspond to musical note names (this is called the input pitch or normalized pitch). The identified input pitch (normalized pitch) is stored in the lead buffer.
At this time, one period of waveform element data, cut out by a window function corresponding to the detected normalized pitch, is also stored. This stored period of waveform element data may be updated continually.
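The quantization of step S8 can be sketched as rounding a detected frequency to the nearest semitone. The sketch assumes twelve-tone equal temperament with A4 = 440 Hz and MIDI note numbers as the note-name index, none of which is stated in the source, and it omits the flattening (smoothing) stage.

```python
import math

A4_HZ = 440.0    # assumed reference tuning
A4_MIDI = 69     # MIDI note number of A4


def fractional_note(freq_hz):
    """Convert a frequency to a fractional MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


def normalized_pitch(freq_hz):
    """Round the specific pitch to the nearest semitone (the normalized pitch)."""
    return int(math.floor(fractional_note(freq_hz) + 0.5))


print(normalized_pitch(265.0))  # 60: slightly sharp of C4, still quantized to C4
print(normalized_pitch(290.0))  # 62: slightly flat of D4, quantized to D4
```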

FIG. 10(A) shows an example of how the specific pitch of a series of input audio signals changes. In this example, the specific pitch of the input audio signal moves from a first vowel section, in which it fluctuates slightly up and down (by less than a semitone, on the order of a few cents to a few tens of cents) around the pitch of note name "C", to a second vowel section, in which it fluctuates slightly (again by a few cents to a few tens of cents) around the pitch of note name "D". For example, if the lyric being sung is "ai", the first vowel section corresponds to the vowel phoneme "a" of the syllable "a", and the second vowel section corresponds to the vowel phoneme "i" of the syllable "i". When the frequency information of an input audio signal with such pitch fluctuation is quantized to note-name units (that is, when the normalized pitch is detected), the input pitch (normalized pitch) is identified as the pitch of note name "C" for the first vowel section and as the pitch of note name "D" for the second vowel section, as shown in FIG. 10(B).

Step S9 generates the difference pitch (difference information relating to the difference) between the frequency information (specific pitch) of the input audio signal and the identified input pitch (normalized pitch). The difference pitch is expressed in cents, that is, as information indicating an interval. FIG. 10(C) shows an example of the difference pitch generated in each vowel section (see the solid line). As the figure shows, this step produces a difference pitch that reproduces the pitch fluctuation of the input audio signal exactly, centered on a difference pitch of "0". Because a pitch difference of 100 cents or more should properly be detected as a different normalized pitch, the maximum value of the generated difference pitch should be less than 100 cents. In other words, a momentary pitch difference of 100 cents or more that occurred too briefly to be detected as a normalized pitch may be ignored in the difference pitch (for example, by rounding it to 99 cents or by reusing the immediately preceding pitch difference of less than 100 cents).
The invention is not limited to this, however, and such a momentary pitch difference of 100 cents or more may instead be passed through unchanged as the difference pitch. That is, the difference pitch is not restricted to less than 100 cents. For convenience, FIG. 10(C) also shows with a dotted line the difference pitch after adjustment by a difference pitch addition ratio (limited to less than 100 cents) defined in the difference pitch addition ratio table (see FIG. 5) edited and set in step S2; only one such adjusted difference pitch is illustrated.
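The difference pitch of step S9 and the optional limit to less than 100 cents can be sketched as follows. Equal temperament with A4 = 440 Hz and MIDI note numbers are assumptions, and the clamp to 99 cents is only one of the handling options the text mentions.

```python
import math


def difference_pitch_cents(freq_hz, normalized_note, limit=99.0):
    """Deviation in cents of the specific pitch from the normalized pitch.

    normalized_note is assumed to be a MIDI note number. Values beyond
    +/-limit are clamped, which mirrors the "round to 99 cents" option.
    """
    reference_hz = 440.0 * 2.0 ** ((normalized_note - 69) / 12.0)
    cents = 1200.0 * math.log2(freq_hz / reference_hz)
    return max(-limit, min(limit, cents))


print(difference_pitch_cents(445.0, 69))  # roughly +19.6 cents above A4
```

Passing `limit=float("inf")` would correspond to the variant in which momentary differences of 100 cents or more are kept as they are.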

Step S10 reproduces the input audio signal to sound the lead sound. The lead sound may be produced by sequentially reproducing the temporarily buffered input audio signal, so that the pitch fluctuation of the original input audio signal is reproduced exactly. Alternatively, it may be produced from the stored and continually updated single period of waveform element data, with the combination of the normalized pitch and the difference pitch reproducing the pitch fluctuation of the original signal. As a further alternative, an arbitrary sound source waveform may be used, again with the combination of the normalized pitch and the difference pitch reproducing the pitch fluctuation of the original signal.
Step S11 determines whether the harmony pitches are to be determined from chord information, that is, whether the chord input method has been selected as the method of determining the pitch of the harmony sound signal. If the chord input method has not been selected (NO in step S11), the fixed method is used: one or more pitches separated from the input pitch (normalized pitch) by predetermined intervals (for example, four semitones up (a major third) and seven semitones up (a perfect fifth)) are determined as the target pitches (step S14). The determined target pitches are stored in the note buffer. If the chord input method has been selected (YES in step S11), the process determines whether the chord information stored in the chord buffer is valid (step S12).

If the chord information is not valid, that is, if no chord information has been input and none is stored (NO in step S12), the process jumps to step S19 shown in FIG. 4. If the chord information is valid, that is, if some chord information has been input and is stored (YES in step S12), the corresponding harmony table stored in the ROM 2 or the storage device 8 is consulted on the basis of the chord information in the chord buffer and the input pitch in the lead buffer, and one or more pitches are determined as the pitches (target pitches) of the harmony sound signals (step S13). For example, if the input chord information is "C major" and the harmony table shown in FIG. 2 is followed, the target pitches in the first vowel section of FIG. 10(A) are "E", "G", and "C+1", and those in the second vowel section are "G", "C+1", and "E+1". The determined target pitches are stored in the note buffer. Step S15 then executes the difference pitch addition ratio determination process.

The difference pitch addition ratio determination process (see step S15) is now described. FIG. 6 is a flowchart showing this process. As shown in FIG. 6, the process follows a harmony sound generation rule selected in advance (for example, one of the six rules, rule 1 to rule 6, described later) to identify the harmony sounds (target pitches) to which a difference pitch addition ratio is applied and to determine the ratio applied to each. To that end, the process checks in turn whether rule 1 is selected (step S31), whether rule 2 is selected (step S32), whether rule 3 or 4 is selected (step S33), and whether rule 5 or 6 is selected (step S34), and determines the difference pitch addition ratios according to whichever rule is found to be selected.

There are, for example, six such rules. Rule 1 uses the one or more difference pitch addition ratios set for each input pitch (normalized pitch), that is, the difference pitch addition ratio table (see FIG. 5(A)). Rule 2 applies a higher ratio to higher-pitched harmony sounds. Rule 3 applies a high ratio to the harmony sound whose pitch is the third among the harmony target pitches. Rule 4 stabilizes the harmony sound whose pitch is the third among the harmony target pitches (gives it no pitch fluctuation). Rule 5 applies a high ratio to a harmony sound whose pitch is a tension note relative to the input pitch (normalized pitch). Rule 6 stabilizes a harmony sound whose pitch is a tension note relative to the input pitch (gives it no pitch fluctuation). The rules are provided to suit different aims: harmony sounds that are easy to listen to, harmony sounds with an atmosphere that matches the character of the piece (a major or minor feel, for instance), or harmony sounds with a stronger or weaker sense of tension.

Returning to FIG. 6, if rule 1 is found to be selected in step S31 (YES in step S31), the rule 1 process uses the per-harmony-sound difference pitch addition ratio table (see FIG. 5(A)) to identify, for example in ascending (or descending) pitch order, the harmony sounds that are to reflect the pitch fluctuation of the input signal, and determines a difference pitch addition ratio for each identified harmony sound (step S35). If rule 2 is found to be selected in step S32 (NO in step S31, YES in S32), the rule 2 process described later is executed (step S36). If rule 3 or 4 is found to be selected in step S33 (NO in steps S31 and S32, YES in S33), the rule 3 or 4 process described later is executed (step S37). If rule 5 or 6 is found to be selected in step S34 (NO in steps S31 to S33, YES in S34), the rule 5 or 6 process described later is executed (step S38). If rule 5 or 6 is found not to be selected in step S34 (NO in all of steps S31 to S34), the rule 1 process is applied: the per-harmony-sound table (see FIG. 5(A)) is used to identify the harmony sounds to be affected, for example in ascending (or descending) pitch order, and a difference pitch addition ratio is determined for each (step S39).
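The branching of FIG. 6 amounts to a dispatch on the selected rule with rule 1 as the fallback. The sketch below returns a label for the process that would run; the integer encoding of the rule and the label strings are hypothetical.

```python
def select_rule_process(selected_rule):
    """Return a label for the process chosen by steps S31 to S34 of FIG. 6."""
    if selected_rule == 1:            # S31 YES -> S35
        return "rule1"
    if selected_rule == 2:            # S32 YES -> S36
        return "rule2"
    if selected_rule in (3, 4):       # S33 YES -> S37
        return "rule3_or_4"
    if selected_rule in (5, 6):       # S34 YES -> S38
        return "rule5_or_6"
    return "rule1"                    # all NO -> S39 applies the rule 1 process


print(select_rule_process(4))     # rule3_or_4
print(select_rule_process(None))  # rule1 (fallback)
```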

The rule 2 process (see step S36) is described with reference to FIG. 7, which is a flowchart of that process. Step S41 determines whether one harmony sound is to be generated. If so (YES in step S41), that single harmony sound is identified as the one to be affected, and its difference pitch addition ratio is determined from the per-harmony-sound table (see FIG. 5(A)) (step S46). If not (NO in step S41), the process determines whether two harmony sounds are to be generated (step S42). If so (YES in step S42), both harmony sounds are identified as the ones to be affected, and the ratio for each is determined from the level-based difference pitch addition ratio table (see FIG. 5(B)) (step S45). In this case the higher-pitched harmony sound is given the "large" level and the lower-pitched one the "none" level.

If the number of harmony sounds to be generated is not two, that is, if it is three or more (NO in step S42), the pitch midway between the highest and lowest harmony pitches is identified (step S43). Step S44 then identifies all the harmony sounds as the ones to be affected and determines the ratio for each from the level-based difference pitch addition ratio table (see FIG. 5(B)). In this case, for example, the highest-pitched harmony sound is given the "large" level, the lowest-pitched one the "none" level, harmony sounds at or above the identified middle pitch (other than the highest) the "medium" level, and the remaining harmony sounds the "small" level.
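The level assignment of rule 2 (FIG. 7) can be sketched as below. The sketch assumes a non-empty list of distinct pitches given as MIDI note numbers and takes the "middle pitch" to be the arithmetic midpoint of the highest and lowest, which is one reading of the text. It returns level names only, since the percentages behind each level in FIG. 5(B) are not given here.

```python
def rule2_levels(pitches):
    """Map each harmony pitch to a fluctuation level under rule 2.

    "table" means the per-note table of FIG. 5(A) is used instead of a level.
    """
    if len(pitches) == 1:                      # S41 YES -> S46
        return {pitches[0]: "table"}
    low, high = min(pitches), max(pitches)
    if len(pitches) == 2:                      # S42 YES -> S45
        return {high: "large", low: "none"}
    middle = (low + high) / 2.0                # S43 (assumed midpoint)
    levels = {}
    for pitch in pitches:                      # S44
        if pitch == high:
            levels[pitch] = "large"
        elif pitch == low:
            levels[pitch] = "none"
        elif pitch >= middle:
            levels[pitch] = "medium"
        else:
            levels[pitch] = "small"
    return levels


print(rule2_levels([64, 67, 70, 72]))
# {64: 'none', 67: 'small', 70: 'medium', 72: 'large'}
```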

The rule 3 or 4 process (see step S37) is described with reference to FIG. 8, which is a flowchart of that process. Step S51 determines whether any of the harmony target pitches is the third. If none is (NO in step S51), the harmony sounds to be affected are identified and their difference pitch addition ratios determined from the per-harmony-sound table (see FIG. 5(A)) (step S55). If one is (YES in step S51), the process determines whether rule 3 is selected (step S52). If rule 3 is selected (YES in step S52), all the harmony sounds are identified as the ones to be affected, and the ratio for each is determined from the level-based difference pitch addition ratio table (see FIG. 5(B)) (step S53). In this case, for example, the harmony sound at the third is given the "large" level and the other harmony sounds the "small" level.
If rule 3 is not selected, that is, if rule 4 is selected (NO in step S52), all the harmony sounds are identified as the ones to be affected, and the ratio for each is determined from the level-based table (see FIG. 5(B)) (step S54). In this case, for example, the harmony sound at the third is given the "none" level and the other harmony sounds the "medium" level.
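Rules 3 and 4 share one flow (FIG. 8) and differ only in the levels they hand out, which the sketch below tries to show. How "the third" is recognized is not detailed in the text, so it is passed in as a predicate; the example predicate, which treats pitch class E as the third of a C major chord, is purely an assumption, as are the MIDI note numbers.

```python
def rule3_or_4_levels(pitches, is_third, rule):
    """Assign levels per FIG. 8; rule must be 3 or 4."""
    if not any(is_third(p) for p in pitches):      # S51 NO -> S55
        return {p: "table" for p in pitches}
    if rule == 3:                                  # S52 YES -> S53
        return {p: ("large" if is_third(p) else "small") for p in pitches}
    # S52 NO (rule 4) -> S54: keep the third steady.
    return {p: ("none" if is_third(p) else "medium") for p in pitches}


def is_third_of_c_major(pitch):
    """Hypothetical predicate: pitch class E (4) is the third of C major."""
    return pitch % 12 == 4


print(rule3_or_4_levels([64, 67, 72], is_third_of_c_major, 3))
# {64: 'large', 67: 'small', 72: 'small'}
```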

The rule 5 or 6 process (see step S38) is described with reference to FIG. 9, which is a flowchart of that process. Step S61 determines whether any harmony sound has a pitch that is a tension note relative to the input pitch (normalized pitch). If none does (NO in step S61), the harmony sounds to be affected are identified and their difference pitch addition ratios determined from the per-harmony-sound table (see FIG. 5(A)) (step S65). If one does (YES in step S61), the process determines whether rule 5 is selected (step S62).

If rule 5 is selected (YES in step S62), all the harmony sounds are identified as the ones to be affected, and the ratio for each is determined from the level-based difference pitch addition ratio table (see FIG. 5(B)) (step S63). In this case, for example, a harmony sound at a tension note is given the "large" level and the other harmony sounds the "small" level. If rule 5 is not selected, that is, if rule 6 is selected (NO in step S62), all the harmony sounds are identified as the ones to be affected, and the ratio for each is determined from the level-based table (see FIG. 5(B)) (step S64). In this case, for example, a harmony sound at a tension note is given the "none" level and the other harmony sounds the "medium" level.

Returning to FIG. 3, step S16 mutes any harmony sound that is currently sounding. Step S17 compares the target pitch stored in the note buffer with the input pitch stored in the lead buffer to obtain their difference (which corresponds to the pitch shift amount that a conventional apparatus used to generate a harmony sound signal), and calculates one or more pitch shift amounts by adding one or more difference pitches to that difference. The difference pitches added here are the modified difference pitches obtained by adjusting the pitch variation of the difference pitch stored in the residual buffer according to the one or more difference pitch addition ratios (pitch adjustment information) determined in step S15. Step S18 pitch-shifts the input audio signal (more precisely, the stored single period of waveform element data) by the calculated pitch shift amounts, thereby generating one or more harmony sound signals that are pitch-modulated around the target pitch so as to reflect the pitch fluctuation of the input audio signal, and reproduces them to sound one or more harmony sounds.

FIG. 10(D) shows, as a solid line, the pitch shift amount calculated from the modified difference pitch whose pitch change has been adjusted according to the difference pitch addition ratio, and, as a dotted line for reference, the variation that follows the original difference pitch (only one example is shown here). As FIG. 10(D) shows, a pitch shift amount that fluctuates up and down is obtained: around "+400" for the first vowel section, whose target pitch was determined to be "E", and around "+500" for the second vowel section, whose target pitch was determined to be "G". By pitch-shifting the input audio signal (the stored waveform element data) in each section according to the plural pitch shift amounts obtained for each vowel section in this way, a plurality of harmony sound signals can be generated that reflect the pitch fluctuation of the input audio signal yet differ in the magnitude of that fluctuation, as shown in FIG. 10(E).

As can be understood from FIG. 10(E), a conventional apparatus generated a plurality of harmony sound signals with constant pitches such as "E", "G", "C+1" or "G", "C+1", "E+1" (shown by dotted lines). In this embodiment, by contrast, a plurality of harmony sound signals are generated with different pitch fluctuations, each reflecting the pitch fluctuation of the input audio signal to a greater or lesser degree. That is, depending on the difference pitch addition ratio determined for each harmony sound, the magnitude of the pitch fluctuation of each harmony sound signal can be freely adjusted to be larger or smaller than that of the input audio signal. For example, the apparatus is set up in advance so that a difference pitch addition ratio of 100% gives a harmony sound signal whose pitch fluctuation has the same magnitude as that of the input audio signal. When the ratio is set below 100%, the pitch fluctuation of the harmony sound signal becomes progressively smaller than that of the input audio signal as the ratio approaches 0%, and at 0% a harmony sound signal of constant pitch with no pitch fluctuation is produced, as in the conventional apparatus.
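
The effect of the addition ratio on the size of the fluctuation can be illustrated with a short sketch. It assumes the difference pitch is available as a sequence of per-frame values in cents; the helper names are hypothetical.

```python
def harmony_pitch_track(target_pitch, difference_track, ratio_percent):
    """Target pitch plus the scaled difference pitch, frame by frame (assumed unit: cents)."""
    return [target_pitch + d * ratio_percent / 100.0 for d in difference_track]


def swing(track):
    """Peak-to-peak size of the pitch fluctuation in a track."""
    return max(track) - min(track)


# A made-up vibrato-like difference pitch of +/-10 cents around the input pitch.
difference_track = [0, 10, 0, -10, 0]

same = harmony_pitch_track(400, difference_track, 100)     # same swing as the input
smaller = harmony_pitch_track(400, difference_track, 50)   # half the swing
constant = harmony_pitch_track(400, difference_track, 0)   # constant pitch
larger = harmony_pitch_track(400, difference_track, 150)   # larger swing than the input
```

Here `swing(same)` equals the swing of `difference_track` itself, `swing(constant)` is zero, and a ratio above 100% is used only to illustrate the statement that the fluctuation can also be made larger than that of the input.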

Returning to FIG. 4, step S19 determines whether chord information has been acquired, for example chord information entered by the user operating a keyboard or the like (it may instead be supplied automatically along with a karaoke accompaniment). If it is determined that no chord information has been acquired (NO in step S19), the process jumps to step S2. If it is determined that chord information has been acquired (YES in step S19), the chord information is extracted and stored in the chord buffer (step S20). Step S21 determines whether the input pitch stored in the lead buffer is valid. If the input pitch is determined not to be valid (NO in step S21), the process returns to step S2 shown in FIG. 3. If the input pitch is determined to be valid (YES in step S21), it is determined whether the chord input method is selected as the method for determining the pitch of the harmony sound signal (step S22). If it is determined that the chord input method is not selected (NO in step S22), the process jumps to step S2.

If it is determined that the chord input method is selected (YES in step S22), the corresponding harmony table stored in the ROM 2 or the storage device 8 is referenced based on the chord information stored in the chord buffer and the input pitch stored in the lead buffer, and the pitches (target pitches) of one or more harmony sound signals are determined (step S23). Step S24 executes the difference pitch addition ratio determination process (see FIG. 6). Step S25 mutes any harmony sound that is currently sounding. Step S26 compares the target pitch stored in the note buffer with the input pitch stored in the lead buffer to obtain a difference (the conventional pitch shift amount), and calculates the pitch shift amount by adding the difference pitch to that difference. As already explained, the difference pitch added here is the modified difference pitch obtained by adjusting the pitch change of the difference pitch stored in the residual buffer according to the preset difference pitch addition ratio. Step S27 pitch-shifts the input audio signal (more precisely, the stored one-cycle waveform element data) based on the calculated pitch shift amount, generates a harmony sound signal that is centered on the target pitch and reflects the pitch fluctuation of the input audio signal, and reproduces it to sound the harmony sound.
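
The gating in steps S19 to S23 can be sketched as a single function. This is a simplified illustration: the harmony table below is a hypothetical stand-in for the tables held in the ROM 2 or the storage device 8, its entries and the pairing of chords with input notes are assumed for the example, and an invalid input pitch is represented by `None`.

```python
# Hypothetical harmony table: (chord, input note) -> harmony notes. Entries are illustrative.
EXAMPLE_HARMONY_TABLE = {
    ("C", "C"): ["E", "G", "C+1"],
    ("C", "E"): ["G", "C+1", "E+1"],
}


def target_pitches_for_chord(chord_info, input_pitch, chord_mode_selected, harmony_table):
    """Return the target pitches, or None when processing should go back to step S2."""
    if chord_info is None:           # S19: no chord information acquired
        return None
    chord_buffer = chord_info        # S20: store the chord information
    if input_pitch is None:          # S21: input pitch is not valid
        return None
    if not chord_mode_selected:      # S22: chord input method not selected
        return None
    # S23: look up the target pitches from the chord and the input pitch
    return harmony_table.get((chord_buffer, input_pitch))
```

Steps S24 to S27 would then run only when this function returns a list of target pitches.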

As described above, the present invention obtains a difference pitch, which is the difference between the pitch detected by analyzing the input audio signal and one of the pitches corresponding to the note names specified for each predetermined section of the input audio signal in the course of that pitch detection. The difference pitch is then modified by arbitrary, mutually different difference pitch addition ratios to generate a plurality of different difference pitches. The pitch shift amount needed to shift the input audio signal to the pitch of each of the one or more harmony sounds determined from the specified pitch (the conventional pitch shift amount) is then changed by adding the pitch variation of the modified difference pitch. Because the difference pitch represents the pitch fluctuation (the fluctuation component of the pitch) that the input audio signal originally had around a given pitch, the changed pitch shift amount takes that pitch fluctuation into account. Therefore, pitch-shifting the input audio signal based on the changed pitch shift amount generates one or more harmony sound signals that fluctuate in pitch around the determined one or more pitches. In this way, with a simple operation, the user can have the pitch fluctuation of the input audio signal reflected in a plurality of harmony sound signals at any desired magnitude, producing harmony sound signals whose pitch fluctuation resembles that of the input audio signal and does not sound unnatural.
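
How a difference pitch might be derived for one section can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the text: the detected pitch is given as per-frame frequencies in Hz, note names are represented by MIDI note numbers with A4 = 440 Hz, and the section's note is chosen as the note nearest the median frame. The function name is hypothetical.

```python
import math
from statistics import median

A4_HZ = 440.0  # assumed tuning reference


def section_difference_pitch(freqs_hz):
    """Return (note number, per-frame difference pitch in cents) for one section."""
    # Convert each detected frequency to a fractional MIDI note number.
    midi = [69.0 + 12.0 * math.log2(f / A4_HZ) for f in freqs_hz]
    # Pick one note for the whole section (assumption: nearest note to the median).
    note = round(median(midi))
    # The difference pitch is the residual around that note, in cents.
    return note, [(m - note) * 100.0 for m in midi]
```

The residuals returned here play the role of the difference pitch that is later scaled by each harmony sound's addition ratio.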

An example embodiment has been described above with reference to the drawings, but the present invention is not limited to it, and various other embodiments are of course possible. For example, a harmony sound signal of constant pitch (the target pitch) with no pitch fluctuation may first be generated, as in the conventional approach, by pitch-shifting the input audio signal by the difference obtained by comparing the target pitch with the input pitch (the conventional pitch shift amount); pitch modulation control that simply adds the difference pitch adjusted by the difference pitch addition ratio may then be applied to that signal, so that the pitch fluctuation of the input audio signal is reflected in the harmony sound signal.
In this specification, "pitch fluctuation" or pitch modulation is not limited to periodic pitch changes such as vibrato and may include non-periodic pitch changes such as bend-up and bend-down. It may also include pitch changes too fine for the user to recognize as a performance expression.

The chord information input for generating the harmony sound signal may, as described above, be detected from input information entered by user operation of a performance operator such as a keyboard on this apparatus or connected to it, may be obtained by entering chord names in sequence, or may be supplied automatically along with a karaoke accompaniment.
When chords are input by the user's performance operation, chords are of course detected from the key-depression state. Any chord detection method may be used, such as the so-called fingered method, in which a chord is specified by pressing all the keys of its actual constituent notes, or the so-called single-finger method, in which a chord is specified by pressing one to three or so keys according to predetermined rules. Alternatively, the root and type of each chord may be specified by operating predetermined switches on the operation panel.

The pitch of the harmony sound signal need not be determined from the pitch detection result of the input audio signal as it is; it may instead be determined from a pitch-converted version of that result, for example one raised or lowered by a predetermined amount such as one octave or three semitones.
In the embodiment described above, the input audio signal from which the harmony sound signal is generated was the user's voice input through a microphone, but it is not limited to this. It may be, for example, an instrument performance sound input through a microphone, a stored audio signal, or an audio signal delivered from an external source as appropriate.

The pitch adjustment information is not limited to a predetermined difference pitch addition ratio (%) as in the embodiment described above; the ratio (%) may instead be calculated. In that case, several rules serving as the basis for the calculation may be prepared in advance and made selectable. Furthermore, the user may be allowed to edit the pitch adjustment information after it has been calculated automatically. The magnitude of the pitch fluctuation of the input audio signal may also be detected automatically, and the pitch adjustment information for each harmony sound determined according to the detected magnitude. The user may also be allowed to designate which harmony sounds are subject to pitch adjustment based on the pitch adjustment information.
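
One way the ratio could be derived from the detected fluctuation magnitude is sketched below. The rule itself is purely an assumption made for illustration (the text does not specify any formula): the ratio stays at 100% while the input's peak-to-peak fluctuation is within a chosen limit and is scaled down beyond it so that the harmony sound's fluctuation never exceeds that limit. The function and parameter names are hypothetical.

```python
def auto_ratio_percent(difference_track, max_harmony_swing_cents):
    """Pick an addition ratio (%) from the measured fluctuation (illustrative rule only)."""
    if not difference_track:
        return 100.0  # nothing measured yet; leave the fluctuation unscaled
    # Detected magnitude: peak-to-peak size of the difference pitch, in cents.
    swing = max(difference_track) - min(difference_track)
    if swing <= max_harmony_swing_cents:
        return 100.0
    # Scale down so the harmony fluctuation equals the allowed maximum.
    return 100.0 * max_harmony_swing_cents / swing
```

A per-harmony variant could call this function with a different limit for each harmony sound, and the returned value could still be edited by the user afterwards.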

The number of harmony sounds to be generated may be made specifiable in advance. For example, in the embodiment described above, the pitches for generating three harmony sounds are determined based on the harmony table specified according to the chord information (see FIG. 2(A)). With such a table, when two sounds are specified as the number of sounds, the pitches may be determined by combining any two of the three, such as the two on the low side (or the high side) of the table.
A plurality of difference pitch addition ratio tables may be prepared even for the same type (per harmony sound, per level, and so on) so that one can be selected and switched in. Such switching may be possible at any time during a song.
When the level-based difference pitch addition ratio table (see FIG. 5(B)) is used and several harmony sounds share the same level, it is preferable that, rather than applying the same ratio to all of them, the values can be adjusted to differ slightly from one another (for example, offset by 2).
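
The last point, giving slightly different ratios to harmony sounds that share a level, can be sketched as follows. The sketch assumes that "offset by 2" means two percentage points and that offsets are applied upward in order of appearance; both are assumptions, and the function name is hypothetical.

```python
def stagger_ratios(ratios_percent, step=2):
    """Offset repeated ratios so harmony sounds sharing a level get different values."""
    seen = {}    # how many times each ratio value has appeared so far
    result = []
    for ratio in ratios_percent:
        count = seen.get(ratio, 0)
        result.append(ratio + count * step)  # first occurrence is left unchanged
        seen[ratio] = count + 1
    return result
```

For three harmony sounds that all sit at a level of 60%, this yields 60, 62 and 64, so each harmony sound fluctuates by a slightly different amount.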

Description of symbols: 1 ... CPU, 2 ... ROM, 3 ... RAM, 4 ... input operation unit, 5 ... display unit, 6 ... sound source, 6A ... sound system, 7 ... communication interface, 8 ... storage device, 1D ... data and address bus

Claims

A sound signal generation device that generates one or more harmony sound signals whose pitches are controlled to follow the pitch fluctuation of an input audio signal, the device comprising:
input means for inputting an audio signal;
pitch detection means for sequentially detecting a specific pitch of the input audio signal and detecting, from the specific pitch, a normalized pitch corresponding to a note name;
difference generating means for obtaining difference pitch information relating to the difference between the specific pitch and the normalized pitch;
target pitch determining means for determining a plurality of pitches at different intervals relative to the normalized pitch as target pitches of a plurality of sound signals to be generated;
setting means for setting, for each of the plurality of target pitches, pitch adjustment information indicating a difference pitch addition ratio in order to adjust the degree of pitch fluctuation added to that target pitch; and
sound signal generating means for generating, for each of the plurality of target pitches, a sound signal having a pitch obtained by modulating the target pitch according to the difference pitch information whose addition ratio has been adjusted according to the pitch adjustment information.

The sound signal generation device according to claim 1, wherein the sound signal generating means generates pitch information indicating a pitch obtained by modulating the target pitch according to the difference pitch information whose addition ratio has been adjusted according to the pitch adjustment information, and generates the sound signal based on the generated pitch information.

The sound signal generation device according to claim 1, wherein the sound signal generating means generates a sound signal having the target pitch and modulates the generated sound signal according to the difference pitch information whose addition ratio has been adjusted according to the pitch adjustment information, thereby generating the sound signal having a pitch obtained by modulating the target pitch according to that difference pitch information.

The sound signal generation device according to claim 1, wherein the sound signal generating means generates, for each of the plurality of target pitches, a sound signal having the waveform characteristics of the input audio signal, with its pitch modulated according to the difference pitch information whose addition ratio has been adjusted according to the pitch adjustment information.

A program for generating one or more harmony sound signals whose pitches are controlled to follow the pitch fluctuation of an input audio signal, the program causing a computer to execute:
a procedure of inputting an audio signal;
a procedure of sequentially detecting a specific pitch of the input audio signal and detecting, from the specific pitch, a normalized pitch corresponding to a note name;
a procedure of obtaining difference pitch information relating to the difference between the specific pitch and the normalized pitch;
a procedure of determining a plurality of pitches at different intervals relative to the normalized pitch as target pitches of a plurality of sound signals to be generated;
a procedure of setting, for each of the plurality of target pitches, pitch adjustment information indicating a difference pitch addition ratio in order to adjust the degree of pitch fluctuation added to that target pitch; and
a procedure of generating, for each of the plurality of target pitches, a sound signal having a pitch obtained by modulating the target pitch according to the difference pitch information whose addition ratio has been adjusted according to the pitch adjustment information.