JP2876604B2

JP2876604B2 - Signal compression method

Info

Publication number: JP2876604B2
Application number: JP63292932A
Authority: JP
Inventors: 真古橋
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1988-11-19
Filing date: 1988-11-19
Publication date: 1999-03-31
Anticipated expiration: 2014-03-31
Also published as: JPH02140020A

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a signal compression method for compressing, for example, musical tone signals, and in particular to signal compression performed with a so-called audio processing unit (APU) that digitally processes musical tone data and the like.

[Summary of the Invention]

The present invention provides a signal compression method in which, from among modes that output the input signal either directly or through a filter, the mode yielding the output signal with the highest compression ratio is selected and the output signal is transmitted. A zero-level pseudo input signal, which causes the mode that outputs the input signal directly to be selected, is added over a period of at least one block preceding the start point of the input signal, and signal processing is then performed on the signal including this pseudo input signal. As a result, no initial values need to be held for the first block, and the hardware does not become complicated.

[Conventional technology]

Sound sources used in electronic musical instruments, video game machines, and the like are generally divided into analog sound sources composed of VCOs, VCAs, VCFs, and so on, and digital sound sources such as PSGs (programmable sound generators) and waveform-ROM readout types. As one kind of digital sound source, sampler sound sources have recently become widely known; these store in a memory or the like sound source data obtained by sampling and digitally processing the sounds of real instruments (see, for example, Japanese Unexamined Patent Publications Nos. 62-264099 and 62-267798).

In such a sampler sound source, the memory for storing the sound source data generally needs a large capacity, so various memory-saving techniques have been proposed. Typical examples are looping, which exploits the periodicity of the musical tone waveform, and bit compression by nonlinear quantization or the like. Looping is also a technique for sustaining a sound for longer than the original duration of the sampled tone.

As one example of bit-compressing the sound source data, it has been considered to save memory by using a bit compression encoding process that selects, for each block of several samples, the filter giving the highest compression ratio.

[Problems to be solved by the invention]

In a filter-selecting bit compression encoding system of this kind, parameters (header information) such as range and filter information are attached to each block, where one block consists of, for example, 16 samples of the sampled peak-value data of the musical tone waveform. The filter information selects, from among three modes (for example straight PCM, a first-order difference filter, and a second-order difference filter), the filter that is optimal at encoding time, that is, the one giving the highest compression ratio. Because the first-order and second-order difference filters become IIR filters at decoding (playback) time, decoding the first sample of a block requires, respectively, the one or two samples preceding that block as initial values.
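The per-block mode selection described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the predictor forms (previous sample for the first-order filter, twice the previous sample minus the one before for the second-order filter), and the "fewest bits for the largest residual" selection rule are not taken from this document.

```python
from typing import List, Tuple

BLOCK = 16  # samples per block, as in the text


def residuals(block: List[int], prev1: int, prev2: int, mode: int) -> List[int]:
    """Residuals of one block.

    mode 0 = straight PCM, 1 = first-order difference, 2 = second-order
    difference. prev1/prev2 are the last and second-to-last samples of the
    preceding block (assumed predictor history).
    """
    out = []
    for x in block:
        if mode == 0:
            pred = 0
        elif mode == 1:
            pred = prev1
        else:
            pred = 2 * prev1 - prev2
        out.append(x - pred)
        prev2, prev1 = prev1, x
    return out


def bits_needed(values: List[int]) -> int:
    """Magnitude bits of the largest residual plus one sign bit (a simple upper bound)."""
    peak = max(abs(v) for v in values)
    return peak.bit_length() + 1


def choose_mode(block: List[int], prev1: int, prev2: int) -> Tuple[int, List[int]]:
    """Pick the mode whose residuals need the fewest bits; ties go to the lowest mode number."""
    best = min(range(3), key=lambda m: bits_needed(residuals(block, prev1, prev2, m)))
    return best, residuals(block, prev1, prev2, best)
```

For a smooth ramp that continues from the previous block, the second-order mode leaves all-zero residuals and is chosen. The same ramp with no usable history (prev1 = prev2 = 0) shows why the first block is the problematic case: the difference modes then start from a large first residual.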

However, when the first-order or second-order difference filter is selected for the first block of the sound source data, there are no earlier samples, that is, no samples from before the start of sound generation, so one or two data values must be stored as initial values in a storage medium such as a memory. From the decoder's point of view, requiring such a storage medium increases the hardware burden and is undesirable, particularly when the decoder is to be implemented as a low-cost IC.

The present invention has been proposed in view of these circumstances. Its object is to provide a signal compression method that does not require the initial values that would otherwise be needed when a mode such as the first-order or second-order difference filter is selected for the first block, and that allows a simple hardware configuration.

[Means for solving the problem]

The present invention has been proposed to achieve the above object. In a signal compression method in which, from among a plurality of modes comprising a mode that outputs the input signal directly and modes that output it through a filter, the mode yielding the output signal with the highest compression ratio is selected and the output signal is transmitted, the invention is characterized in that a zero-level pseudo input signal, which causes the mode that outputs the input signal directly to be selected, is added over a period of at least one block preceding the start point of the input signal, and signal processing is then performed on the signal including this pseudo input signal.

[Action]

According to the present invention, a zero-level pseudo input signal that causes the mode outputting the input signal directly to be selected is added over a period of at least one block preceding the start point of the input signal. When decoding for signal reproduction, the direct-output mode is therefore selected automatically at the beginning, and no initial values for the first-order or second-order difference filter or the like need to be held.
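The effect of the zero-level pseudo block can be illustrated with a small round trip. The sketch below is an assumption-laden toy, not the codec of this document: residuals are kept as plain integers (no range or bit packing), the tail is zero-padded to a whole block, and ties between modes are resolved in favor of straight PCM, which is what makes the all-zero block select the direct mode.

```python
BLOCK = 16  # samples per block


def predict(mode, prev1, prev2):
    """Predictor per mode: 0 = straight PCM, 1 = first-order, 2 = second-order difference."""
    return (0, prev1, 2 * prev1 - prev2)[mode]


def encode(samples):
    """Prepend one zero-level block, then choose a mode for each block."""
    padded = [0] * BLOCK + list(samples)
    padded += [0] * (-len(padded) % BLOCK)  # pad the tail to whole blocks (assumption)
    blocks, prev1, prev2 = [], 0, 0
    for start in range(0, len(padded), BLOCK):
        block = padded[start:start + BLOCK]
        candidates = []
        for mode in range(3):
            p1, p2, res = prev1, prev2, []
            for x in block:
                res.append(x - predict(mode, p1, p2))
                p2, p1 = p1, x
            candidates.append((max(abs(r) for r in res), mode, res))
        _, mode, res = min(candidates)  # smallest peak residual; ties go to the lowest mode
        blocks.append((mode, res))
        prev2, prev1 = block[-2], block[-1]
    return blocks


def decode(blocks):
    """Decode with no stored initial values: history comes from already decoded samples."""
    out = []
    for mode, res in blocks:
        for r in res:
            if mode == 0:
                out.append(r)
            else:
                out.append(r + predict(mode, out[-1], out[-2]))
    return out[BLOCK:]  # drop the pseudo block
```

Because the first encoded block is always the zero block in straight PCM mode, the decoder already holds two decoded samples by the time a difference mode can appear, so `decode` never needs externally supplied initial values.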

[Example]

Before describing the embodiment of the present invention, the looping process mentioned above is briefly explained with reference to the musical tone waveform shown in FIG. 2. Immediately after the start of sound generation, a tone generally contains non-pitched components such as the key-strike noise of a piano or the breath noise of a wind instrument, which produces a formant part FR in which the periodicity of the waveform is unclear. After that, the same waveform appears repeatedly at the fundamental period corresponding to the pitch of the tone. A stretch of n periods (n being an integer) of this repeating waveform is taken as the looping section LP, which is delimited by two looping points: the looping start point LP_S and the looping end point LP_E. The formant part FR and the looping section LP are recorded on a storage medium; at playback, the formant part FR is reproduced and then the looping section LP is reproduced repeatedly, so that a tone can be generated for an arbitrarily long time.
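The playback scheme just described (formant part once, then the looping section repeated) can be sketched in a few lines. The function name and the convention that the loop covers indices from `loop_start` up to but not including `loop_end` are assumptions made for this illustration.

```python
from itertools import chain, cycle, islice


def play(samples, loop_start, loop_end, n_out):
    """Return n_out samples: the formant part samples[:loop_start] once,
    followed by the looping section samples[loop_start:loop_end] repeated."""
    formant = samples[:loop_start]
    loop = samples[loop_start:loop_end]
    return list(islice(chain(formant, cycle(loop)), n_out))
```

Only the formant part and one looping section need to be stored, however long the requested output is.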

An embodiment of the present invention is described below with reference to the drawings. Needless to say, the present invention is not limited to the following embodiment.

FIG. 3 is a functional block diagram showing a concrete example of the functions involved, from sampling the input musical tone signal to recording it on a storage medium, when the sound source data compression encoding method of the embodiment is applied to a sound source data forming apparatus. The input musical tone signal supplied to the input terminal 10 may be, for example, a signal picked up directly by a microphone or a signal obtained by playing back a digital audio recording medium or the like, in either analog or digital form.

First, the sampling function block 11 of FIG. 3 samples the input musical tone signal at a frequency of, for example, 38 kHz and outputs it as digital data of 16 bits per sample. This sampling corresponds to A/D conversion when the input musical tone signal is analog, and to sampling-rate conversion and bit-depth conversion when the input signal is digital.

Next, the pitch detection function block 12 detects the frequency of the fundamental (the fundamental frequency) f₀ that determines the pitch of the digital musical tone signal obtained by the sampling process, that is, the pitch information.

The detection principle of the pitch detection function block 12 is as follows. In a musical tone signal used as a sampled sound source, the fundamental frequency is often considerably lower than the sampling frequency fs, so simply detecting the peak of the tone on the frequency axis makes it difficult to identify the pitch with high accuracy. The spectrum of the harmonic components of the tone therefore has to be exploited by some means.

First, let f(t) be the waveform of the musical tone signal whose pitch is to be detected. Expressing f(t) in terms of the amplitude a(ω) and phase φ(ω) of each harmonic component, its Fourier expansion is

$$f(t) = \sum_{\omega} a(\omega)\cos\bigl(\omega t + \phi(\omega)\bigr).$$

If the phase shifts φ(ω) of all harmonics are set to zero, this becomes

$$\tilde{f}(t) = \sum_{\omega} a(\omega)\cos(\omega t).$$

The peaks of the phase-aligned waveform $\tilde{f}(t)$ occur at t = 0 and at the points that are integer multiples of the periods of all of its harmonics. Their spacing is none other than the period of the fundamental.

Based on this principle, the pitch detection procedure is described with reference to the functional block diagram shown in FIG. 4.

In FIG. 4, the musical tone data is supplied from the real-part data input terminal 31, and "0" from the imaginary-part data input terminal 32, to the fast Fourier transform (FFT) function block 33.

In the fast Fourier transform performed by the FFT function block 33, let x(t) be the musical tone signal whose pitch is to be estimated, and let each harmonic component contained in x(t) be

$$a_n \cos(2\pi f_n t + \theta).$$

Then x(t) is

$$x(t) = \sum_n a_n \cos(2\pi f_n t + \theta),$$

which, rewritten in complex form using

$$\cos\theta = \frac{\exp(j\theta) + \exp(-j\theta)}{2},$$

becomes

$$x(t) = \sum_n \frac{a_n}{2}\Bigl[\exp\bigl(j(\omega_n t + \theta)\bigr) + \exp\bigl(-j(\omega_n t + \theta)\bigr)\Bigr], \qquad \omega_n = 2\pi f_n.$$

Taking the Fourier transform of this expression gives, for the positive-frequency terms,

$$X(\omega) = \sum_n \frac{a_n}{2}\exp(j\theta)\,\delta(\omega - \omega_n),$$

where δ(ω − ω_n) is the delta function.

The next function block 34 computes the norm of the data after the fast Fourier transform (the absolute value, that is, the square root of the sum of the squares of the real part and the imaginary part).

That is, taking the absolute value Y(ω) of X(ω) cancels the phase component:

$$Y(\omega) = \Bigl[X(\omega)\,\overline{X(\omega)}\Bigr]^{1/2} = \sum_n \frac{1}{2}\,a_n\,\delta(\omega - \omega_n).$$

This is done in order to align the phases of all frequency components of the musical tone data; making the imaginary part zero aligns the phase components.

Next, the computed norm is supplied as real-part data to the fast Fourier transform function block 36 (which here performs the inverse FFT), "0" is supplied to the imaginary-part data input terminal 35, and the inverse FFT is applied to restore the musical tone data. That is, the inverse Fourier transform of Y(ω) is

$$\mathcal{F}^{-1}\bigl[Y(\omega)\bigr] \propto \sum_n a_n \cos(\omega_n t).$$

The musical tone data restored by this inverse Fourier transform is obtained as a waveform expressible as a sum of cosine waves in which the phases of all frequency components are aligned.

After that, the peak detection function block 37 detects the peaks of the restored sound source data. These peaks are the points at which the extrema (peaks) of all the harmonic components of the musical tone data coincide. In the next function block 38, the detected peak values are sorted in descending order of value. By measuring the period of the detected peaks, the pitch of the musical tone signal can be determined.
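The procedure of FIG. 4 (transform, take the norm, inverse-transform, find and sort the peaks) can be sketched as below. This is a minimal illustration, not the implementation of the embodiment: it uses a direct O(N²) DFT from the standard library instead of an FFT, searches only the first half of the restored waveform, and the function names are assumptions.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct discrete Fourier transform (stand-in for FFT blocks 33 and 36)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def zero_phase(x):
    """Discard the phase of every component and rebuild the waveform."""
    norms = [abs(c) for c in dft(x)]                   # block 34: norm of each bin
    return [c.real for c in dft(norms, inverse=True)]  # block 36: imaginary input is 0


def pitch_period(x):
    """Lag, in samples, of the largest peak of the phase-aligned waveform (excluding t = 0)."""
    y = zero_phase(x)
    half = len(y) // 2  # the restored waveform is symmetric, so half is enough
    peaks = [t for t in range(1, half) if y[t - 1] < y[t] >= y[t + 1]]  # block 37
    peaks.sort(key=lambda t: y[t], reverse=True)                         # block 38
    return peaks[0]
```

With a sampling frequency fs, the fundamental frequency estimate would be fs divided by the returned lag. The sketch assumes the input contains at least one clear peak; a signal with no local maximum in the searched range would need separate handling.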

FIG. 5 illustrates a configuration for detecting the local maxima (peaks) of the musical tone data in the peak detection function block 37 of FIG. 4.

The musical tone data contains many peaks (extrema) of differing values, and the pitch of the tone can be determined by finding the maxima of the data and detecting their period.

In FIG. 5, the musical tone data sequence after the inverse Fourier transform is supplied through the input terminal 41 to an (N+1)-stage shift register 42 and passes in turn through the registers a_{-N/2}, ..., a₀, ..., a_{N/2} of its stages to the output terminal 43. The (N+1)-stage shift register 42 acts on the data sequence as a window N+1 samples wide, and the N+1 samples in this window are sent to the maximum value detection circuit 44. That is, the musical tone data first enters register a_{-N/2} and is then shifted stage by stage to register a_{N/2}, and the N+1 samples held in the registers a_{-N/2}, ..., a₀, ..., a_{N/2} are sent to the maximum value detection circuit 44.

When the value in, for example, the center register a₀ of the shift register 42 is the largest of the N+1 sample values, the maximum value detection circuit 44 detects the data in register a₀ as a peak value and outputs it from the output terminal 45. The window width N+1 can be set arbitrarily.
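The shift-register window and maximum detector of FIG. 5 behave like a sliding-window local-maximum test, which might be sketched as follows. The function name, the use of a `deque` as the shift register, and the handling of ties (a flat window reports its center) are assumptions of this sketch.

```python
from collections import deque


def window_peaks(data, n):
    """Return (index, value) pairs where the center of an (n + 1)-sample
    window holds the largest value in that window. n must be even."""
    if n % 2:
        raise ValueError("n must be even so that the window has a center")
    window = deque(maxlen=n + 1)  # plays the role of shift register 42
    peaks = []
    for i, x in enumerate(data):
        window.append(x)
        if len(window) == n + 1:
            center = window[n // 2]  # register a_0
            if center == max(window):  # maximum value detection circuit 44
                peaks.append((i - n // 2, center))
    return peaks
```

A wider window suppresses minor local maxima, which is one reason the width N+1 is left adjustable.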

Returning to FIG. 3, the envelope detection function block 13 applies envelope detection using the pitch information to the digital musical tone signal after sampling, and obtains the so-called envelope waveform of the tone. This is a waveform such as that shown in FIG. 6B, obtained by successively connecting the peak points of a musical tone waveform such as that shown in FIG. 6A, and it represents the change in level (or volume) over time from immediately after sound generation. The envelope waveform is commonly described by parameters such as ADSR (attack time / decay time / sustain level / release time). Taking as a concrete example a piano tone produced by a key stroke, the attack time T_A is the time from when the key is pressed (key-on) until the volume, rising gradually, reaches its target level; the decay time T_D is the time from the level reached at the end of the attack time T_A until the next level (for example the level of the instrument's sustained sound) is reached; the sustain level L_S is the level of the sustained sound held until the key is released (key-off); and the release time T_R is the time from key-off until the sound dies away. The times T_A, T_D, and T_R may also denote the slope or rate of the volume change. Further envelope parameters may be used in addition to these four.

In addition to the envelope waveform information expressed by parameters such as ADSR (attack time T_A / decay time T_D / sustain level L_S / release time T_R), the envelope detection function block 13 also obtains information indicating the overall decay rate of the signal waveform, so that the formant part can be extracted with the attack waveform left intact. As shown in FIG. 7, for example, this decay rate information represents a waveform that takes the reference value "1" from the time of sound generation (key-on) through the attack time T_A and decays monotonically thereafter.

A configuration example of the envelope detection function block 13 of FIG. 3 is described with reference to the functional block diagram of FIG. 8.

The principle of this envelope detection is the same as envelope detection of a so-called AM (amplitude-modulated) signal: the envelope is detected by regarding the pitch of the musical tone signal as the carrier frequency of the AM signal. The envelope information is used when reproducing the tone, which is formed on the basis of the envelope information and the pitch information.

For the musical tone data supplied to the input terminal 51 in FIG. 8, the absolute value output function block 52 obtains the absolute value of the peak-value data of the tone. This absolute value data is sent to the function block 55, an FIR (finite impulse response) digital filter. The FIR filter function block 55 acts as a low-pass filter; its cutoff characteristic is determined by filter coefficients that are formed beforehand in the function block 54 from the pitch information supplied to the input terminal 53 and then supplied to the FIR filter function block 55.

The filter characteristic is, for example, as shown in FIG. 9, with zeros at the frequency of the fundamental (f₀) of the musical tone signal and at the frequencies of its harmonics. For example, from the musical tone signal shown in FIG. 6A, the FIR filter attenuates the fundamental and harmonic frequencies, so that envelope information as shown in FIG. 6B is detected. The filter coefficients are formed according to the following expression.

$$H(f) = k \cdot \frac{\sin(\pi f / f_0)}{f}$$

where f₀ is the fundamental frequency (pitch) of the musical tone signal.
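One simple way to obtain a response of this shape is a moving average over exactly one pitch period: its magnitude response is approximately proportional to sin(πf/f₀)/f for frequencies well below the sampling rate, with zeros at f₀ and its harmonics. The sketch below rectifies the signal and applies such an average. Treating the FIR filter as a plain boxcar, and the names `envelope` and `period`, are assumptions of this illustration rather than the coefficient set used in the embodiment.

```python
import math


def envelope(samples, period):
    """Rectify, then average over one pitch period (a boxcar FIR low-pass).

    period is the pitch period in samples, i.e. fs / f0 rounded to an integer.
    The first period - 1 outputs average over the samples available so far.
    """
    rectified = [abs(s) for s in samples]  # absolute value block 52
    out, acc = [], 0.0
    for i, r in enumerate(rectified):
        acc += r
        if i >= period:
            acc -= rectified[i - period]  # drop the sample leaving the window
        out.append(acc / min(i + 1, period))
    return out
```

The output follows the mean rectified level, which for a sinusoid is about 2/π of the peak amplitude, so it is proportional to the peak envelope rather than equal to it.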

Next, the process of generating, from the peak-value data (sampling data) of the sampled musical tone signal, the peak-value data of the signal of the formant part FR shown in FIG. 2 and the peak-value data of the signal of the looping section LP (loop data) is described.

In the first function block 14 for generating the loop data, the peak-value data of the sampled musical tone signal is divided by (or multiplied by the reciprocal of) the data of the previously detected envelope waveform (FIG. 6B). This envelope correction yields the peak-value data of a signal with constant amplitude, as shown in FIG. 10. The envelope-corrected signal (its peak-value data) is then filtered to obtain a signal in which components other than the pitch components are attenuated, or in which the pitch components are relatively emphasized. Here the pitch components are the frequency components at integer multiples of the fundamental frequency f₀. Specifically, the envelope-corrected signal is first passed through an HPF (high-pass filter) to remove low-frequency components such as vibrato. It is then passed through a comb filter with the frequency characteristic shown by the dash-dot line in FIG. 11, whose pass bands lie at integer multiples of the fundamental frequency f₀, so that only the pitch components contained in the HPF output pass and the remaining non-pitched components and noise are attenuated. Finally, where necessary, it is passed through an LPF (low-pass filter) to remove noise components superimposed on the comb filter output.
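The envelope-correction step at the start of this stage amounts to a sample-by-sample division, which can be sketched as follows. The function name and the small `floor` guard against division by a zero envelope are assumptions added for this illustration.

```python
def envelope_correct(samples, env, floor=1e-9):
    """Divide each sample by the envelope value at the same position.

    floor keeps the divisor away from zero where the envelope has died out
    (an assumed safeguard, not part of the described process).
    """
    return [s / max(e, floor) for s, e in zip(samples, env)]
```

After this division a decaying tone has roughly constant amplitude, which makes waveform periods from different parts of the tone directly comparable.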

That is, when the input signal is a musical tone signal such as the sound of an instrument, the tone normally has a definite pitch, so its frequency spectrum shows, as indicated by the solid line in FIG. 11, a distribution in which energy is concentrated near the fundamental frequency f₀ corresponding to the pitch of the tone and near its integer multiples. General noise components, by contrast, are known to have a uniform frequency distribution. Passing the input musical tone signal through a comb filter with the frequency characteristic shown by the dash-dot line in FIG. 11 therefore passes or emphasizes only the frequency components at integer multiples of f₀ (the so-called pitch components) and attenuates the other components (non-pitched components and part of the noise), which improves the S/N ratio. The frequency characteristic of the comb filter shown by the dash-dot line in FIG. 11 is expressed by

$$H(f) = \left[\frac{\cos(2\pi f / f_0) + 1}{2}\right]^{N}$$

where f₀ is the fundamental frequency of the input signal (the frequency of the fundamental corresponding to the pitch) and N is the number of stages of the comb filter.
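A single stage with the response (cos(2πf/f₀) + 1)/2 can be realized as a three-tap zero-phase FIR whose taps are one pitch period apart, and N stages in cascade give the N-th power. The sketch below evaluates the response and applies the filter. The tap structure, the zero padding at the signal edges, and the function names are assumptions chosen to match the formula, not a description of the filter actually used.

```python
import math


def comb_response(f, f0, stages):
    """Magnitude response H(f) of the comb filter for fundamental f0."""
    return ((math.cos(2 * math.pi * f / f0) + 1) / 2) ** stages


def comb_stage(x, period):
    """One zero-phase stage: y[n] = x[n]/2 + (x[n-P] + x[n+P])/4, P = fs/f0 samples.

    Samples outside the signal are treated as zero (edge handling is assumed).
    """
    n = len(x)

    def at(i):
        return x[i] if 0 <= i < n else 0.0

    return [0.5 * x[i] + 0.25 * (at(i - period) + at(i + period)) for i in range(n)]


def comb_filter(x, period, stages):
    """Cascade of identical stages."""
    for _ in range(stages):
        x = comb_stage(x, period)
    return x
```

Components at integer multiples of f₀ pass unchanged away from the edges, while a component halfway between two harmonics is cancelled; more stages narrow the pass bands.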

The musical tone signal whose noise components have been reduced in this way is sent to a repetitive-waveform extraction circuit, which extracts a suitable repeating waveform section such as the looping section LP of FIG. 2, and is then sent to and recorded on a storage medium such as a semiconductor memory. Because non-pitched components and part of the noise have been attenuated in the musical tone signal data recorded on this storage medium, the noise that arises when the repeating waveform section is played back repeatedly, so-called looping noise, can be reduced.

The frequency characteristics of the HPF, the comb filter, and the LPF are set on the basis of the fundamental frequency f₀, which is the pitch information previously detected by the pitch detection function block 12.

Next, the loop section detection function block 16 of FIG. 3 detects a suitable repeating waveform section in the musical tone signal whose non-pitch components have been attenuated by the filtering, and thereby sets the looping points, namely the looping start point LP_S and the looping end point LP_E.

That is, the loop section detection function block 16 selects as looping points two points separated by (an integer multiple of) the repetition period corresponding to the pitch of the musical tone signal. The selection principle is described below.

When musical tone data is looped, the looping interval must be an integer multiple of the fundamental period of the tone (the reciprocal of the fundamental frequency). The interval can therefore be determined easily once the pitch of the tone has been identified accurately.

In other words, the looping interval is determined in advance, pairs of points separated by that interval are taken, and the looping points are set by evaluating the correlation or similarity of the signal waveforms in the neighborhood of the two points. As one example of the evaluation function, the use of a convolution (sum of products) over the waveform samples near each of the two points is described. The convolution is applied in turn to every pair of points to evaluate the correlation or similarity of the waveforms. For this evaluation, the musical tone data is, for example, fed sequentially into shift registers, the data held in each register is input to a multiply-accumulate unit implemented with, for example, a DSP (digital signal processor) described later, and the unit computes and outputs the convolution. The pair of points for which the convolution obtained in this way is largest is taken as the looping start point LP_S and the looping end point LP_E.

That is, in FIG. 12, let a₀ be a candidate for the looping start point LP_S and b₀ a candidate for the looping end point LP_E. Let the peak value data of several points around the candidate a₀, for example 2N+1 points, be a_-N, ..., a_-2, a_-1, a₀, a₁, a₂, ..., a_N, and let the peak value data of the same number (2N+1) of points around the candidate b₀ be b_-N, ..., b_-2, b_-1, b₀, b₁, b₂, ..., b_N. The evaluation function E(a₀, b₀) can then be defined by the following equation:

$$E(a_0, b_0) = \sum_{k=-N}^{N} a_k \, b_k$$

This equation gives the convolution centered on the points a₀ and b₀. The pair of candidates a₀, b₀ is then changed successively, the value of the evaluation function E is computed for every candidate looping point, and the pair for which E is largest is taken as the looping points.
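The search over candidate pairs can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names and the fixed `interval` argument are hypothetical, and the score is the plain sum of products over the 2N+1 samples around each candidate.

```python
def correlation_score(samples, a0, b0, n):
    """E(a0, b0): sum of products of the 2n+1 samples around a0 and b0."""
    return sum(samples[a0 + k] * samples[b0 + k] for k in range(-n, n + 1))


def find_loop_points(samples, interval, n):
    """Try every start candidate a0 with b0 = a0 + interval and keep the
    pair whose score E is largest (the first such pair on ties)."""
    best = None
    # a0 - n must be >= 0 and b0 + n must stay inside the data.
    for a0 in range(n, len(samples) - interval - n):
        b0 = a0 + interval
        score = correlation_score(samples, a0, b0, n)
        if best is None or score > best[0]:
            best = (score, a0, b0)
    if best is None:
        raise ValueError("signal too short for this interval and window")
    return best[1], best[2]
```

Because the score is not normalized, windows with larger amplitude score higher; a practical search might divide by the window energy, which the text does not discuss.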

Besides the convolution method described above, the looping points can also be found by the least squares method of the error. In that case the evaluation function for the candidate points a₀, b₀ can be expressed as

$$\varepsilon(a_0, b_0) = \sum_{k=-N}^{N} (a_k - b_k)^2$$

and it suffices to find the a₀ and b₀ that minimize the value of ε.
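The least squares variant differs only in the score and in taking the minimum. A minimal sketch, assuming the same hypothetical fixed `interval` between the two candidates and a sum of squared differences over the 2N+1 samples around each:

```python
def squared_error(samples, a0, b0, n):
    """Sum of squared differences of the 2n+1 samples around a0 and b0."""
    return sum((samples[a0 + k] - samples[b0 + k]) ** 2 for k in range(-n, n + 1))


def find_loop_points_lsq(samples, interval, n):
    """Return the candidate pair (a0, a0 + interval) with the smallest error."""
    candidates = range(n, len(samples) - interval - n)
    if not candidates:
        raise ValueError("signal too short for this interval and window")
    a0 = min(candidates, key=lambda a: squared_error(samples, a, a + interval, n))
    return a0, a0 + interval
```

With a decaying onset followed by a steady alternation, the search skips the onset and settles on the first pair whose neighborhoods match exactly.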

In the loop section detection function block 16 described above, a pitch conversion ratio is also calculated, as required, from the looping start point LP_S and the looping end point LP_E. This pitch conversion ratio is used as the time axis correction value in the time axis correction processing of the next function block 17. The time axis correction is performed so that the pitches of the various sound source data are aligned when they are actually recorded in a storage means such as a memory; the pitch information detected in the pitch detection function block 12 may be used instead of the pitch conversion ratio.

The pitch normalizing operation in the time axis correction function block 17 is described with reference to FIG. 13.

FIG. 13A shows the tone signal waveform before time axis correction (mainly time axis compression and expansion), and FIG. 13B shows the corrected waveform after that compression and expansion. The time axes of FIGS. 13A and 13B are graduated in units of the blocks used in the quasi-instantaneous bit compression encoding described later.

In waveform A before time axis correction, the looping section LP normally bears no relation to the blocks. As shown in FIG. 13B, the waveform is compressed or expanded along the time axis so that the looping section LP becomes an integer multiple (m times) of the block length (block period), and it is then shifted along the time axis so that block boundaries coincide with the looping start point LP_S and the looping end point LP_E. In other words, by applying time axis correction (compression/expansion and shift) so that the start point LP_S and the end point LP_E of the looping section LP fall on block boundaries, looping can be carried out in units of an integer number (m) of blocks, and the pitch of the sound source data is normalized at recording time. The offset ΔT from the block boundary that the time shift produces at the head of the tone signal waveform may be filled with "0" as peak value data.
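The stretch-then-shift step can be sketched as below. This is only an illustration under stated assumptions: the function name is hypothetical, linear interpolation stands in for the oversampling filter mentioned in the text, and the stretched start point is rounded to the nearest sample.

```python
def normalize_loop(samples, loop_start, loop_end, block_len):
    """Stretch the time axis so the loop spans m whole blocks, then pad the
    head with zeros so the loop starts on a block boundary."""
    if loop_start >= loop_end:
        raise ValueError("loop_end must lie after loop_start")
    loop_len = loop_end - loop_start
    m = max(1, round(loop_len / block_len))   # blocks the loop will occupy
    ratio = m * block_len / loop_len          # time axis stretch factor
    out_len = round(len(samples) * ratio)
    last = len(samples) - 1
    stretched = []
    for i in range(out_len):
        pos = i / ratio                       # position in the original data
        j = min(int(pos), last)
        frac = pos - j
        nxt = samples[min(j + 1, last)]
        stretched.append(samples[j] * (1 - frac) + nxt * frac)
    new_start = round(loop_start * ratio)
    pad = (-new_start) % block_len            # zeros filling the offset at the head
    start = new_start + pad
    return [0.0] * pad + stretched, start, start + m * block_len
```

For a 7-sample loop starting at sample 3 with 4-sample blocks, the loop is stretched to 8 samples (m = 2) and one zero is inserted at the head, so the loop runs from sample 4 to sample 12.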

FIG. 14 shows the block structure used when the peak value data of the waveform after time axis correction are divided into blocks for the bit compression encoding described later; the number of peak value data (samples, or words) in one block is h. Pitch normalization here means, in general, compressing or expanding the time axis so that the number of words in n periods of the waveform of constant period Tw of the tone signal waveform shown in FIG. 2, that is, in the looping section LP, becomes an integer multiple (m times) of the number of words h in a block. More preferably, it also means shifting the time axis so that the start point LP_S and the end point LP_E of the looping section LP coincide with block boundaries on the time axis. When the points LP_S and LP_E coincide with block boundaries in this way, the error caused by block switching during decoding in the bit compression encoding system can be reduced.

The words WLP_S and WLP_E shown hatched within a block in FIG. 14A are the words holding the samples at the looping start point LP_S and the looping end point LP_E (more precisely, the point immediately before LP_E) of the corrected waveform. When the shift processing is not performed, the looping start point LP_S and the end point LP_E do not necessarily coincide with block boundaries, so the words WLP_S and WLP_E fall at arbitrary positions within their blocks, as shown in FIG. 14B. Even so, the number of words from word WLP_S to word WLP_E is an integer multiple (m times) of the number of words h in one block, and the pitch is normalized.

Various methods are conceivable for the time axis compression and expansion that makes the number of words in the looping section LP an integer multiple of the number of words h in one block. It can be realized, for example, by interpolating the peak value data of the sampled waveform, and as one concrete example a filter configuration for oversampling can be used.

The looping period of a real tone waveform may have a fractional part relative to the sampling period, so that the sampled peak value at the looping start point LP_S differs from that at the looping end point LP_E. In that case, interpolation using oversampling or the like can be used to find, near the looping end point LP_E (at a distance shorter than the sampling period), a peak value that matches the sampled peak value at the looping start point LP_S. A looping period that is a non-integer multiple of the sampling period (one with a fractional part), counted with the interpolated samples included, is thereby realized. Such a looping period can also be made an integer multiple of the block period by the time axis correction described above. For example, when 256-times oversampling is used for the time axis compression and expansion, the error in peak value between the looping start point LP_S and the end point LP_E is reduced to 1/256, giving smoother looped playback.

The waveform whose looping section LP has been determined and which has undergone time axis correction (compression/expansion) as described above is passed to the next function block 21, where loop data are generated by joining looping sections LP end to end as shown in FIG. 15. FIG. 15 shows a loop data waveform obtained by cutting only the looping section LP out of the tone waveform after time axis correction (FIG. 13B) and lining up several copies of it; the looping end point LP_E of each copy is connected to the looping start point LP_S of the next. This loop data waveform is generated in the loop data generation function block 21.
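Loop data generation amounts to cutting out the looping section and repeating it. A minimal sketch, assuming list-like sample data and a hypothetical `repeats` count:

```python
def make_loop_data(samples, loop_start, loop_end, repeats):
    """Cut out samples[loop_start:loop_end] and join `repeats` copies end to
    end, so each copy's end point is followed by the next copy's start point."""
    section = list(samples[loop_start:loop_end])
    return section * repeats
```

The end index is exclusive here, which matches the text's remark that the last word of the loop is the sample immediately before LP_E.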

Because the loop data are formed by connecting the looping section LP many times, the start block containing the word WLP_S that corresponds to each looping start point LP_S is immediately preceded, unchanged, by the data of the end block containing the word WLP_E that corresponds to the looping end point LP_E (more precisely, the point immediately before LP_E). In principle, it suffices that at least the end block is present immediately before the start block of the looping section LP₀ to be stored when bit compression encoding is carried out. Stated more generally, in block-by-block bit compression encoding, the parameters of the start block (the bit compression encoding information for each compressed block, such as the range information and filter selection information described later) need only be formed from the data of both the start block and the end block. This technique can also be applied when a tone signal consisting only of loop data, without the formant portion described later, is used as the sound source.

In this way, at encoding time the same data appear over several samples before and after both the looping start point LP_S and the end point LP_E. The bit compression encoding parameters for the blocks immediately preceding LP_S and LP_E are therefore identical, which reduces the error (noise) at looped playback during decoding. The looped tone data are thus stable and free of connection noise. In this embodiment, about 500 samples of looping section LP data are placed immediately before the start block.

In the data generation process for the signal of the formant portion FR, envelope correction is first performed in function block 18, as in function block 14 for loop data generation. Here, however, the sampled tone signal is divided by an envelope waveform carrying only the decay rate information described earlier (FIG. 7), which yields (the peak value data of) a signal with the waveform shown in FIG. 16. In the output signal of FIG. 16, the envelope of the attack portion (during time T_A) is retained and the remainder has constant amplitude.

This envelope-corrected signal is filtered in function block 19 as required. The filtering in function block 19 uses a comb filter like that of function block 15, for example one with the frequency characteristic shown by the dash-dot line in FIG. 11. This comb filter emphasizes the frequency band components at integer multiples of the fundamental frequency f₀ corresponding to the pitch and thereby relatively attenuates non-pitch components; its frequency characteristic is likewise set from the pitch information (fundamental frequency f₀) detected in the pitch detection function block 12. The resulting signal is used to generate the formant portion of the sound source data finally recorded on a storage medium such as a memory.

In the next function block 20, the same time axis correction as in function block 17 is applied to the formant portion generation signal. It compresses or expands the time axis on the basis of the pitch conversion ratio obtained in function block 16 or the pitch information detected in function block 12, so as to align (normalize) the pitch of each sound source.

Next, in function block 22, the loop data and the formant portion generation data, both time-axis corrected with the same pitch conversion ratio or pitch information, are mixed. A Hamming window is applied to the formant portion generation signal from function block 20 to form a fade-out signal that decays with time over the part to be mixed with the loop data. A similar Hamming window is applied to the loop data from function block 21 to form a fade-in signal that grows with time over the part to be mixed with the formant signal. Mixing (cross-fading) these two signals yields the tone signal that finally becomes the sound source data. As the loop data to be recorded on a storage medium such as a memory, the data of one looping section some distance away from the cross-fade portion are taken, which reduces the noise at looped playback (looping noise). In this way the peak value data are obtained for a sound source signal consisting of the formant portion FR, which is the waveform portion from the onset of sounding that contains non-pitch components, and the looping section LP, which is the repeated waveform portion containing only pitch components.
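The cross-fade can be sketched as follows. This is a minimal illustration with assumed details: the rising half of a standard Hamming window (coefficients 0.54 and 0.46) serves as the fade-in, its mirror image as the fade-out, and the function names and the `overlap` length (at least 2 samples) are hypothetical.

```python
import math


def hamming_fade_in(length):
    """Rising half of a Hamming window, from 0.08 up to 1.0."""
    return [0.54 - 0.46 * math.cos(math.pi * i / (length - 1)) for i in range(length)]


def crossfade(formant, loop, overlap):
    """Fade the tail of `formant` out while fading the head of `loop` in,
    summing the two over `overlap` samples."""
    fade_in = hamming_fade_in(overlap)
    fade_out = fade_in[::-1]                  # falling half of the window
    split = len(formant) - overlap
    mixed = [formant[split + i] * fade_out[i] + loop[i] * fade_in[i]
             for i in range(overlap)]
    return formant[:split] + mixed + loop[overlap:]
```

With these two half windows the weights always sum to 1.08, so two identical constant signals come out slightly louder in the overlap; scaling the mix by 1/1.08 would remove that, a detail the text does not address.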

Alternatively, the parts may be spliced so that the start of the loop data signal is connected at the position of the looping start point in the formant portion generation signal.

In practice, when loop section detection, looping, and the mixing of loop data with the formant portion are carried out, a rough mix is first made by hand through trial and error with repeated listening, and higher-precision processing is then performed on the basis of the loop point information (looping start point LP_S and looping end point LP_E) obtained at that stage.

That is, before the high-precision loop section detection in function block 16, loop section detection, mixing and so on are carried out manually with repeated listening, following the procedure shown in the flowchart of FIG. 17, and the high-precision processing described above (step S26 onward) is carried out afterwards.

In FIG. 17, in the first step S21 the loop points are detected with relatively coarse precision, for example by using the zero-cross points of the signal waveform or by visually checking a display of the waveform. In step S22 looping is performed and the waveform between the loop points is played back repeatedly, and in step S23 a person listens and judges whether the result is acceptable. If not, the process returns to step S21 and the loop points are detected again. Once a satisfactory result is obtained, the process proceeds to step S24, where the loop is mixed with the formant portion signal by cross-fading or the like, and in step S25 a person listens and judges whether the transition from the formant portion to the looping portion is acceptable. If not, the process returns to step S24 and the mixing is redone. The process then proceeds to step S26, where the loop section detection function block 16 performs high-precision loop section detection. Specifically, loop section detection that includes the interpolated samples is performed, for example to a precision of 1/256 of the sampling period with 256-times oversampling. In step S27 the pitch conversion ratio for pitch normalization is calculated. Based on this ratio, the time axis correction of function blocks 17 and 20 is performed in step S28, the loop data are generated in function block 21 in step S29, and the mixing of function block 22 is performed in step S30. The processing from step S26 onward makes use of the loop point information and other results obtained in steps S21 to S25. Steps S21 to S25 may be omitted so that looping and the related processing are fully automated.

The peak value data of the signal consisting of the formant portion FR and the looping section LP obtained by this mixing are subjected to bit compression encoding in the next function block 23.

Various bit compression encoding schemes are conceivable. Here it is assumed that the quasi-instantaneous companding type of high-efficiency coding proposed earlier by the present applicant in Japanese Unexamined Patent Publications No. 62-008629 and No. 62-003516, among others, is used, in which the peak value data are divided into blocks of a fixed number of words (h samples) and bit compression is applied block by block. This high-efficiency bit compression encoding scheme is outlined with reference to FIG. 18.

In FIG. 18, the high-efficiency bit compression encoding system consists of an encoder 70 on the recording side and a decoder 90 on the playback side. The peak value data x(n) of the sound source signal are supplied to input terminal 71 of the encoder 70.

The input signal (peak value data) x(n) is supplied to an FIR (finite impulse response) digital filter 74 composed of a predictor 72 and an adder 73, and the predicted signal (peak value data) x̃(n) from the predictor 72 is sent to the adder 73 as a subtraction signal. The adder 73 subtracts the predicted signal x̃(n) from the input signal x(n) and outputs the prediction error signal, or difference output in the broad sense, d(n). The predictor 72 generally calculates the predicted value x̃(n) as a linear combination of the past p inputs x(n−p), x(n−p+1), ..., x(n−1). The FIR filter 74 is referred to below as the encode filter.
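The encode filter can be sketched as a prediction from past inputs followed by a subtraction. In this illustration the coefficient tuples are assumptions, since the text gives no coefficient values: an empty tuple models passing the input straight through, `(1.0,)` a first-order difference and `(2.0, -1.0)` a second-order difference.

```python
def prediction_error(x, coeffs):
    """d(n) = x(n) - sum over k of coeffs[k-1] * x(n-k).
    Inputs before the start of the data are treated as 0."""
    d = []
    for n in range(len(x)):
        predicted = sum(c * x[n - k]
                        for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        d.append(x[n] - predicted)
    return d
```

For the input 1, 3, 6, 10 the first-order difference leaves 1, 2, 3, 4 and the second-order difference leaves 1, 1, 1, 1, which shows how a well matched filter shrinks the range of values the quantizer has to cover.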

In this high-efficiency bit compression encoding system, the sound source data are divided into blocks of data within a fixed time, that is, of a fixed number of words h of input data, and the encode filter 74 with the optimum characteristic is selected for each block. This can be realized by providing in advance several (for example four) encode filters with mutually different characteristics and selecting the one with the optimum characteristic, that is, the one that gives the highest compression ratio. In an actual digital filter, however, several sets (for example four sets) of coefficients for the predictor 72 of the single encode filter 74 shown in FIG. 18 are usually stored in a coefficient memory or the like, and switching among these sets in a time-division manner gives an operation substantially equivalent to selecting one of several encode filters.

The difference output d(n), the prediction error, is sent through an adder 81 to a bit compressor made up of a shifter 75 of gain G and a quantizer 76, where it undergoes compression or ranging such that, in floating point terms, the exponent corresponds to the gain G and the mantissa to the output of the quantizer 76. That is, the shifter 75 shifts the input data by the number of bits corresponding to the gain G to switch the range, and the quantizer 76 requantizes by taking a fixed number of bits from the shifted data. The noise shaping circuit (noise shaper) 77 obtains the difference between the output and the input of the quantizer 76, the so-called quantization error, in an adder 78, passes it through a shifter 79 of gain G^-1 to a predictor 80, and feeds the predicted quantization error back to the adder 81 as a subtraction signal; this is so-called error feedback. After requantization by the quantizer 76 and error feedback by the noise shaping circuit 77, the output d̂(n) is taken from output terminal 82.

The output d′(n) of the adder 81 is the difference output d(n) minus the predicted quantization error ẽ(n) from the noise shaper 77, and the output d″(n) of the gain-G shifter 75 is the output d′(n) of the adder 81 multiplied by the gain G. The output d̂(n) of the quantizer 76 is the output d″(n) of the shifter 75 plus the quantization error e(n) introduced by quantization, and the adder 78 of the noise shaper 77 extracts this quantization error e(n). Passed through the shifter 79 of gain G^-1 and then through the predictor 80, which takes a linear combination of the past r inputs, the quantization error e(n) becomes the predicted quantization error ẽ(n).
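The ranging, requantization and error feedback loop can be sketched as follows. The details are assumptions made for illustration: the gain is G = 2^(-shift), the quantizer rounds to the nearest integer (Python's `round`, which rounds halves to even) and clamps to a signed 4-bit range, and the feedback predictor defaults to a single tap of 1.0.

```python
def quantize_block(d, shift, feedback=(1.0,), bits=4):
    """Requantize prediction errors d(n) with range shifting and error feedback."""
    gain = 2.0 ** -shift
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    past_errors = [0.0] * len(feedback)       # most recent error first
    codes = []
    for value in d:
        e_pred = sum(c * e for c, e in zip(feedback, past_errors))
        scaled = gain * (value - e_pred)      # d''(n) = G * (d(n) - predicted error)
        q = max(lo, min(hi, round(scaled)))   # quantizer output
        error = (q - scaled) / gain           # e(n) after the G^-1 shifter
        past_errors = [error] + past_errors[:-1]
        codes.append(q)
    return codes
```

With a constant input of 10 and a shift of 2 bits, the codes alternate between 2 and 3, so their average of 2.5 scales back to 10; without feedback every code would be 2 and the decoded level would stay at 8.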

The sound source data are encoded as described above and taken out through output terminal 82 as the output d̂(n) of the quantizer 76.

The prediction/range adaptation circuit 84 outputs mode selection information, as optimum filter selection information, which is sent to, for example, the predictor 72 of the encode filter 74 and to output terminal 87. It also outputs range information, which determines the gains G and G^-1 or the bit shift amount, and this is sent to the shifters 75 and 79 and to output terminal 86.

Input terminal 91 of the decoder 90 on the playback side receives the signal d̂′(n) obtained by transmitting, or recording and reproducing, the output d̂(n) from output terminal 82 of the encoder 70. This input signal d̂′(n) is sent to an adder 93 through a shifter 92 of gain G^-1. The output x̂′(n) of the adder 93 is sent to a predictor 94 to give a predicted signal x̃′(n), which is returned to the adder 93 and added to the output d̂″(n) of the shifter 92. The sum is output from output terminal 95 as the decoded output x̂′(n).

The range information and the mode selection information output from output terminals 86 and 87 of the encoder 70, and transmitted or recorded and reproduced, are input to input terminals 96 and 97 of the decoder 90, respectively. The range information from input terminal 96 is sent to the shifter 92 and determines the gain G^-1, and the mode selection information from input terminal 97 is sent to the predictor 94 and determines its prediction characteristic. The prediction characteristic selected for the predictor 94 is equal to that of the predictor 72 of the encoder 70.

In a decoder 90 of this configuration, the output d̂″(n) of the shifter 92 is the input signal d̂′(n) multiplied by the gain G^-1, and the output x̂′(n) of the adder 93 is the sum of the output d̂″(n) of the shifter 92 and the predicted signal x̃′(n).
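The decoder loop can be sketched in the same style. As assumptions for illustration, the gain G^-1 is taken to be 2^shift, the predictor works on the two most recent decoded outputs, and the `history` argument (a hypothetical name) carries those outputs over from the previous block.

```python
def decode_block(codes, shift, coeffs, history=(0, 0)):
    """Rebuild samples: undo the range shift, then add the prediction
    formed from the previously decoded outputs."""
    past = list(history)                      # most recent output first
    out = []
    for code in codes:
        restored = code * 2 ** shift          # shifter with gain G^-1
        predicted = sum(c * p for c, p in zip(coeffs, past))
        x = restored + predicted              # adder output = decoded sample
        past = [x] + past[:-1]
        out.append(x)
    return out
```

Starting from an all-zero history, the first-order coefficients turn the codes 1, 2, 3, 4 back into 1, 3, 6, 10. Unlike the encode filter, this predictor is fed by its own output, so any error in the starting history propagates, which is why the initial state of the first block matters.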

FIG. 19 shows an example of one block of output data from the bit compression encoder 70. One block consists of 1 byte of header information RF (parameter information or side information about the compression) and 8 bytes of sample data D_A0 to D_B3. The header information RF consists of the 4-bit range information, the 2-bit mode selection information (filter selection information), and two 1-bit flags, for example information LI indicating whether there is a loop and information EI indicating whether the block is the end block of the waveform. Each sample of peak value data is bit-compressed to 4 bits, and the data D_A0 to D_B3 contain 16 samples of 4-bit data, D_A0H to D_B3L.
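The 9-byte block can be sketched as a pack and unpack pair. The field widths come from the text, but the bit positions are assumptions made for illustration: range in the upper 4 bits of the header, then the 2-bit filter selection, then LI and EI, with each data byte holding the earlier sample in its high nibble as 4-bit two's complement.

```python
def pack_block(range_info, filter_sel, loop_flag, end_flag, samples):
    """Pack one header byte and sixteen 4-bit samples into 9 bytes."""
    if len(samples) != 16:
        raise ValueError("a block holds exactly 16 samples")
    header = ((range_info & 0xF) * 16 + (filter_sel & 0x3) * 4
              + (loop_flag & 1) * 2 + (end_flag & 1))
    body = bytes((hi & 0xF) * 16 + (lo & 0xF)
                 for hi, lo in zip(samples[0::2], samples[1::2]))
    return bytes([header]) + body


def unpack_block(block):
    """Inverse of pack_block; returns (range, filter, LI, EI, samples)."""
    header = block[0]
    nibbles = []
    for byte in block[1:9]:
        nibbles += [byte >> 4, byte & 0xF]
    samples = [n - 16 if n >= 8 else n for n in nibbles]   # sign-extend 4 bits
    return header >> 4, (header >> 2) & 3, (header >> 1) & 1, header & 1, samples
```

At 9 bytes per 16 samples the format spends 4.5 bits per sample, compared with 16 bits per sample if the peak value data were stored as 16-bit words (the source word length is not stated in this passage).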

FIG. 20 shows the blocks of quasi-instantaneously (block-wise) bit compression encoded peak value data corresponding to the head of a tone signal waveform like that of FIG. 2. In FIG. 20 the headers are omitted and only the peak value data are shown, and for convenience of illustration one block is drawn as 8 samples, although the block size can of course be set arbitrarily, for example to 16 samples per block. The same applies to FIG. 14.

Here, the quasi-instantaneous bit compression encoding system selects, from among a mode that outputs the input tone signal directly (the straight PCM mode) and modes that output the tone signal through a filter (the first-order or second-order difference filter modes), the mode that yields the signal with the highest compression ratio, and transmits the resulting tone data as the output signal.
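
The mode selection can be sketched as follows. This passage does not give the filter coefficients or the selection criterion, so the sketch assumes plain difference predictors (previous sample for first order, `2*x[n-1] - x[n-2]` for second order) and picks the mode with the smallest peak residual as a stand-in for "highest compression ratio". The names `residuals` and `select_mode` are hypothetical.

```python
STRAIGHT_PCM, FIRST_ORDER, SECOND_ORDER = 0, 1, 2

def residuals(block, history, mode):
    """Return the values that would be transmitted for `block` in `mode`.

    `history` is (x[n-2], x[n-1]) from before the block. The predictors
    are assumed simple differences, not coefficients taken from the patent.
    """
    prev2, prev1 = history
    out = []
    for x in block:
        if mode == STRAIGHT_PCM:
            pred = 0
        elif mode == FIRST_ORDER:
            pred = prev1
        else:
            pred = 2 * prev1 - prev2
        out.append(x - pred)
        prev2, prev1 = prev1, x
    return out

def select_mode(block, history=(0, 0)):
    """Pick the mode whose residual has the smallest peak magnitude.

    list.index returns the first minimum, so ties go to straight PCM.
    """
    peaks = [max(abs(r) for r in residuals(block, history, m))
             for m in (STRAIGHT_PCM, FIRST_ORDER, SECOND_ORDER)]
    return peaks.index(min(peaks))
```

Under these assumptions an all-zero block always selects straight PCM, whatever the history, because its straight PCM residual is already zero.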

When a musical tone is sampled and recorded on a storage medium such as a memory, capture of the tone signal waveform starts at the sound generation start point KS. If a filter mode that requires initial values, such as the first-order or second-order difference filter mode, is selected for the first block from the start point KS, those initial values have to be prepared in advance; a form that needs no such initial values is therefore desirable. For this reason, a pseudo input signal that causes the straight PCM mode (the mode that outputs the input tone signal directly) to be selected is added in the period preceding the start point KS, and signal processing is then performed on the input including that pseudo input signal.

Specifically, in FIG. 20, a block whose data is all "0" is placed ahead of the sound generation start point KS as the pseudo input signal, and the all-"0" data from the head of this block is bit-compressed and captured as sampled peak value data. This can be achieved, for example, by creating in advance a block whose data is all "0", storing it in a memory or the like and using it, or by starting the sampling from the portion of the input signal in which the data is all "0" before the start point KS (that is, the silent portion before sound generation begins). The pseudo input signal occupies at least one block.
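
The first of the two options above, using a prepared all-zero block, amounts to prepending zeros before encoding. A minimal sketch, in which the function name and default block size are illustrative assumptions:

```python
def add_pseudo_input(samples, block_size=16, n_blocks=1):
    """Prepend n_blocks all-zero blocks ahead of the start point KS.

    The text requires at least one block of pseudo input signal,
    so fewer than one is rejected.
    """
    if n_blocks < 1:
        raise ValueError("the pseudo input signal must span at least one block")
    return [0] * (block_size * n_blocks) + list(samples)
```

The padded sequence is then cut into blocks and encoded as usual, so the encoder needs no special case for the first block.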

The music data including the pseudo input signal formed as described above is compressed by the high-efficiency bit compression encoding system shown in FIG. 18 and recorded on a storage medium such as a memory, and the compressed signal is then reproduced.

Therefore, when tone data including the pseudo input signal is reproduced, the straight PCM mode is selected as the filter at the start of reproduction (the block of the pseudo input signal), so the initial values of the first-order or second-order difference filter need not be set in advance.
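
Why no initial values are needed can be illustrated with a simplified decoder. This is a lossless toy that leaves out the range scaling of the real format, and the difference predictors are assumed rather than taken from the patent. The decoder deliberately starts with no filter history at all: a leading straight PCM block fills the history with its own output before any difference-mode block needs it.

```python
def decode(blocks):
    """Decode a list of (mode, residuals) blocks into samples.

    Modes: 0 = straight PCM, 1 = first-order difference,
    2 = second-order difference (assumed predictor 2*x[n-1] - x[n-2]).
    No initial filter values are stored: history starts as None and is
    filled by whatever the first block decodes to.
    """
    out = []
    prev2 = prev1 = None
    for mode, res in blocks:
        for r in res:
            if mode == 0:
                x = r
            elif mode == 1:
                x = r + prev1
            else:
                x = r + 2 * prev1 - prev2
            out.append(x)
            prev2, prev1 = prev1, x
    return out
```

With a zero block in front, `decode([(0, [0, 0, 0, 0]), (2, [1, 0, 0, 0])])` returns `[0, 0, 0, 0, 1, 2, 3, 4]`. Without it, a stream that opens in a difference mode fails in this sketch because there is no history to predict from.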

Here, there may be a concern about the delay in the onset of sound caused at the start of reproduction by the pseudo input signal (which is silent because its data is all "0"). However, with a sampling frequency of 32 kHz and 16 samples per block, for example, this delay is about 0.5 msec, which is too short to be perceived by ear and poses no problem.
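
The 0.5 msec figure follows directly from the block length and the sampling frequency. A short helper (its name is hypothetical) makes the arithmetic explicit and shows how the delay scales if more than one pseudo block is used:

```python
def onset_delay_ms(n_blocks: int, block_size: int, sample_rate_hz: float) -> float:
    """Silent lead-in, in milliseconds, added by the pseudo input signal."""
    return 1000.0 * n_blocks * block_size / sample_rate_hz

# One 16-sample block at 32 kHz, the case given in the text.
delay = onset_delay_ms(1, 16, 32_000)
```

Here `delay` is 0.5, that is 16 / 32000 s expressed in milliseconds.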

Incidentally, the bit compression encoding described above and other digital signal processing for generating sound source data are often implemented in software using a digital signal processor (DSP), and a DSP-based software configuration is also often adopted for reproducing recorded sound source data. As one example, FIG. 21 shows the overall configuration of a system including an audio processing unit (APU) 107, serving as a sound source unit that handles sound source data, and its peripherals.

In FIG. 21, a host computer 104, provided for example in a general personal computer, a digital electronic musical instrument, a video game machine or the like, is connected to the APU 107 serving as the sound source unit, and sound source data and the like are loaded from the host computer 104 into the APU 107. The APU 107 comprises at least a CPU (central processing unit) 103 such as a microprocessor, a DSP (digital signal processor) 101, and a memory 102 in which sound source data such as that described above is stored. That is, at least the sound source data is stored in the memory 102, and the DSP 101 performs various kinds of processing, including control of reading out the sound source data: for example looping, bit expansion (restoration), pitch conversion, envelope addition, and echo (reverb) processing. The memory 102 is also used as a buffer memory for these kinds of processing. The CPU 103 controls the operation and content of the processing performed by the DSP 101.

Further, the digital tone data finally obtained by applying the above processing in the DSP 101 to the sound source data from the memory 102 is converted into an analog signal by a digital/analog (D/A) converter 105 and supplied to a speaker 106.

The present invention is not limited to the embodiment described above. For example, in the embodiment the sound source data is formed by connecting a formant portion and a looping section, but the invention can readily be applied to sound source data consisting of a looping section only. The decoder-side configuration and the external memory for sound source data may be supplied as a ROM cartridge, an adapter or the like. The invention is also applicable not only to sound sources for tone signals but also to speech synthesis.

[Effect of the Invention]

According to the signal compression method of the present invention, a 0-level pseudo input signal that causes the direct input signal output mode to be selected is added in a period of at least one block preceding the start point of the input signal. When the signal is decoded for reproduction, decoding therefore starts from the pseudo input signal, and there is no need to hold initial values for, for example, the first-order and second-order difference filters. Signals such as sound source data can thus be compressed without complicating the hardware configuration.

In addition, because the pseudo input signal is at the 0 level, setting a long period for the pseudo input signal allows muting before sound generation to be performed at the same time, which prevents the generation of abnormal noise.

[Brief description of the drawings]

FIG. 1 is a waveform diagram for explaining the operation of the signal compression method of the present invention.
FIG. 2 is a waveform diagram of a tone signal.
FIG. 3 is a functional block diagram for explaining a specific example of the signal recording method of the present invention.
FIG. 4 is a functional block diagram for explaining the pitch detection operation.
FIG. 5 is a block diagram for explaining the peak detection operation.
FIG. 6 is a waveform diagram of a tone signal and its envelope.
FIG. 7 is a waveform diagram of the decay rate information of a tone signal.
FIG. 8 is a functional block diagram for explaining the envelope detection operation.
FIG. 9 is a characteristic diagram of the FIR filter.
FIG. 10 is a waveform diagram showing the peak value data of a tone signal after envelope correction.
FIG. 11 is a characteristic diagram of the comb filter.
FIG. 12 is a waveform diagram for explaining the operation of setting the optimum looping points.
FIG. 13 is a waveform diagram showing a tone signal before and after time axis correction.
FIG. 14 is a schematic diagram showing the structure of the blocks for quasi-instantaneous bit compression of the peak value data after time axis correction.
FIG. 15 is a waveform diagram showing loop data obtained by repeatedly connecting the waveform of the looping section.
FIG. 16 is a waveform diagram showing the data for generating the formant portion after envelope correction based on the decay rate information.
FIG. 17 is a flowchart for explaining the operations before and after the actual looping processing.
FIG. 18 is a block circuit diagram showing the schematic configuration of the quasi-instantaneous bit compression encoding system.
FIG. 19 is a schematic diagram showing a specific example of one block of data obtained by quasi-instantaneous bit compression encoding.
FIG. 20 is a schematic diagram showing the contents of the blocks at the head portion of a tone signal.
FIG. 21 is a block diagram showing a configuration example of a system including an audio processing unit (APU) and its peripherals.

Claims

(57) [Claims]

1. A signal compression method in which, from among a plurality of modes comprising a mode that outputs an input signal directly and a mode that outputs it through a filter, the mode that yields the output signal having the highest compression ratio is selected and the output signal is transmitted, characterized in that a 0-level pseudo input signal that causes the mode of directly outputting the input signal to be selected is added in a period of at least one block preceding the start point of the input signal, and signal processing is then performed including the pseudo input signal.