JP2806047B2

JP2806047B2 - Automatic transcription device

Info

Publication number: JP2806047B2
Application number: JP1143191A
Authority: JP
Inventors: 慈明小松; せい子石川
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 1991-01-07
Filing date: 1991-01-07
Publication date: 1998-09-30
Anticipated expiration: 2013-09-30
Also published as: JPH04261590A

Abstract

PURPOSE: To transcribe music without being limited by the kinds or number of musical instruments, by combining synthesizer sounds to synthesize a signal similar to the signal obtained by sampling a music signal, and generating a code corresponding to a musical score. CONSTITUTION: A signal capturing means 21 samples and captures a music signal, and a signal synthesizing means 23 controls a synthesizer 25 so that various musical instrument sounds are successively combined at various pitches and output repeatedly. A distance calculating means 26 repeatedly calculates the distance between the signal from the signal capturing means 21 and the output signal of the synthesizer 25 controlled by the signal synthesizing means 23. A signal storage means 27 then stores the information on each sound used to generate the output signal with the highest degree of matching, as determined by the distance calculation, and a code conversion means 28 converts the information in the signal storage means 27 into a musical score or a code corresponding to a musical score.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an automatic transcription apparatus that converts music into a musical score or a code corresponding to a musical score.

[0002]

[Prior Art] Music performed by a plurality of instruments has conventionally been transcribed by a transcriber with musical knowledge. Apparatuses that transcribe music consisting of single notes, or that transcribe from information on which keys of a keyboard were pressed, have been proposed, but they restrict the types and number of instruments that can be transcribed.

[0003]

[Problem to Be Solved by the Invention] That is, because conventional transcription apparatuses are configured to transcribe a music signal played by a single instrument, they cannot transcribe a music signal played by many instruments. The present invention has been made to solve this problem. Its object is to provide a transcription apparatus that is not restricted in the type or number of instruments, by synthesizing, from a combination of synthesizer sounds, a signal resembling the signal obtained by sampling the music signal, and generating a code corresponding to a musical score from the types, pitches, durations, and so on of the sounds used to generate that synthesized signal.

[0004]

[Means for Solving the Problem] To achieve the above object, the present invention provides a transcription apparatus for converting a music signal into a musical score or a code corresponding to a musical score, comprising: signal capturing means for sampling and capturing a music signal played by a plurality of instruments; a synthesizer capable of outputting the sounds of various instruments at various pitches; signal synthesizing means for causing the synthesizer to repeatedly synthesize various instrument sounds combined successively at various pitches; distance calculating means for repeatedly calculating the distance between the signal from the signal capturing means and the output signal of the synthesizer; signal storage means for storing information on each sound that generated the output signal with the highest degree of matching among the results of the distance calculating means; and code conversion means for converting the information in the signal storage means into a musical score or a code corresponding to a musical score.

[0005]

[Operation] With the above configuration, the signal capturing means samples and captures the music signal. The signal synthesizing means controls the synthesizer so that it repeatedly outputs various instrument sounds combined successively at various pitches. The distance calculating means repeatedly calculates the distance between the signal from the signal capturing means and the output signal of the synthesizer controlled by the signal synthesizing means. Based on the results of the distance calculation, the signal storage means stores information on each sound that generated the output signal with the highest degree of matching. The code conversion means converts the information in the signal storage means into a musical score or a code corresponding to a musical score.

[0006]

[Embodiment] An embodiment of the present invention will now be described with reference to the drawings. FIG. 1 shows the block configuration of the automatic transcription apparatus of the present invention. The apparatus comprises an audio amplifier 1 to which the signal is input, a low-pass filter 2, an A/D converter 3, an I/O port 4, a CPU 5, a RAM 6, a ROM 7, and a display 8 that displays the transcription result. FIG. 2 shows the functional configuration of the transcription apparatus configured as above, and FIG. 3 shows the functional configuration of the frequency analysis unit 22. In the signal capturing unit 21, the input music signal is amplified by the audio amplifier 1. The amplified signal is input to the low-pass filter 2, which passes only components at or below 5.5 kHz and thereby suppresses aliasing distortion during sampling. The filter output is sampled by the A/D converter 3 at 12 kHz with 16-bit resolution. The sampled data are taken into the CPU 5 via the I/O port 4 and stored in the RAM 6.
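The figures quoted for the capture stage can be checked with a little arithmetic. The following is a minimal sketch, assuming ideal uniform quantization of samples normalized to [-1.0, 1.0]; the function names and the quantization scheme are illustrative assumptions and are not taken from the patent.

```python
import math
from typing import List

SAMPLE_RATE_HZ = 12_000   # sampling rate of A/D converter 3 (from the text)
CUTOFF_HZ = 5_500         # cutoff of low-pass filter 2 (from the text)
BITS = 16                 # A/D resolution (from the text)


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be sampled without aliasing."""
    return sample_rate_hz / 2.0


def quantize(x: float, bits: int = BITS) -> int:
    """Map a sample in [-1.0, 1.0] to a signed integer code (assumed scheme)."""
    full_scale = 2 ** (bits - 1) - 1          # 32767 for 16 bits
    clipped = max(-1.0, min(1.0, x))          # clip out-of-range input
    return round(clipped * full_scale)


def sample_tone(freq_hz: float, n: int) -> List[int]:
    """Sample and quantize n points of a unit-amplitude sine (test signal)."""
    return [
        quantize(math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ))
        for i in range(n)
    ]
```

With a 12 kHz sampling rate the Nyquist frequency is 6 kHz, so the 5.5 kHz cutoff leaves a 500 Hz guard band for the filter roll-off.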

[0007] As shown in FIG. 3, the frequency analysis unit 22 comprises an FFT processing unit 31, an analysis section determination unit 32, and a fundamental frequency candidate extraction unit 33. In the FFT processing unit 31, the CPU 5 reads the sampled data from the RAM 6, treats every 25 msec as one frame, applies an 85.3 msec Hamming window to each frame, and computes the logarithmic power spectrum by FFT analysis. The CPU 5 then finds the peak frequencies from the computed logarithmic power spectrum by parabolic interpolation and converts each peak frequency into a keyboard number. FIG. 4 shows the peak spectrum obtained in this way, with time on the horizontal axis, keyboard number on the vertical axis, and intensity shown by shading.
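The FFT stage described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: at 12 kHz a 25 msec frame period is 300 samples and an 85.3 msec window is about 1024 samples; the sketch finds only the single strongest peak per frame; and the keyboard numbering (88-key piano, A4 = 440 Hz = key 49) is an assumption, since the patent does not state its mapping. All function names are hypothetical.

```python
import cmath
import math
from typing import List

SAMPLE_RATE_HZ = 12_000
HOP = 300        # 25 msec frame period at 12 kHz
WINDOW = 1024    # about 85.3 msec at 12 kHz


def split_frames(samples: List[float]) -> List[List[float]]:
    """Cut full-length windows every HOP samples."""
    return [samples[s:s + WINDOW]
            for s in range(0, len(samples) - WINDOW + 1, HOP)]


def hamming(n: int) -> List[float]:
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x: List[complex]) -> List[complex]:
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def log_power_spectrum(frame: List[float]) -> List[float]:
    """Hamming-windowed log power spectrum in dB (positive frequencies)."""
    win = hamming(len(frame))
    spec = fft([complex(s * w) for s, w in zip(frame, win)])
    return [10 * math.log10(abs(c) ** 2 + 1e-12) for c in spec[:len(frame) // 2]]


def peak_frequency(log_power: List[float], n_fft: int) -> float:
    """Strongest bin, refined by fitting a parabola through its neighbours."""
    k = max(range(1, len(log_power) - 1), key=lambda i: log_power[i])
    a, b, c = log_power[k - 1], log_power[k], log_power[k + 1]
    denom = a - 2 * b + c
    delta = 0.0 if denom == 0 else 0.5 * (a - c) / denom
    return (k + delta) * SAMPLE_RATE_HZ / n_fft


def key_number(freq_hz: float) -> int:
    """Assumed mapping: nearest semitone, A4 = 440 Hz = key 49."""
    return round(12 * math.log2(freq_hz / 440.0)) + 49
```

The bin spacing here is 12000 / 1024, roughly 11.7 Hz, which is coarser than a semitone in the lower octaves; the parabolic refinement is what makes the conversion to a keyboard number workable there.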

[0008] In the analysis section determination unit 32, the CPU 5 extracts the analysis sections, each being a section over which a stable part of the peak spectrum continues. A stable part of the peak spectrum is a section in which the change in the peak spectrum between adjacent frames is at or below a certain threshold. In the fundamental frequency candidate extraction unit 33, the CPU 5 judges whether a tone in a given analysis section is a fundamental or a harmonic using the following three measures: (1) the intensity of the tone; (2) assuming the tone is a fundamental, whether its harmonics are contained in the peak spectrum (its fundamental-likeness); and (3) assuming the tone is the n-th harmonic (2 ≤ n ≤ 8) of another tone, whether the harmonics of the tone that would then be the fundamental are contained in the peak spectrum (its harmonic-likeness). A tone is judged to be a fundamental frequency candidate when (1) its intensity is above a threshold calculated from the peak-spectrum intensities within the analysis section, (2) its fundamental-likeness is greater than a certain threshold, and (3) its harmonic-likeness is smaller than a certain threshold. Harmonics that remain among the fundamental frequency candidates at this stage can be removed in later processing, so the only requirement is that no true fundamental be dropped.
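The three-measure test for fundamental frequency candidates can be illustrated with a small sketch. The patent does not give formulas for the two likeness scores or for the thresholds, so everything numeric below is an assumption: fundamental-likeness is taken as the fraction of harmonics 2 to 8 found among the peaks, harmonic-likeness as the best such fraction over the hypothetical fundamentals f/n, the intensity threshold as 40 dB below the strongest peak, and peaks are matched within 3 % of the target frequency.

```python
from typing import Dict, List

MAX_HARMONIC = 8  # the text considers harmonics n = 2..8


def _present(peaks: Dict[float, float], freq: float, tol: float) -> bool:
    """True if some peak lies within a relative tolerance of freq."""
    return any(abs(p - freq) <= tol * freq for p in peaks)


def fundamental_likeness(peaks: Dict[float, float], f: float, tol: float = 0.03) -> float:
    """Fraction of harmonics 2..8 of f that appear among the peaks (assumed score)."""
    hits = sum(_present(peaks, n * f, tol) for n in range(2, MAX_HARMONIC + 1))
    return hits / (MAX_HARMONIC - 1)


def harmonic_likeness(peaks: Dict[float, float], f: float, tol: float = 0.03) -> float:
    """Best support for 'f is the n-th harmonic of f/n' over n = 2..8 (assumed score)."""
    best = 0.0
    for n in range(2, MAX_HARMONIC + 1):
        f0 = f / n
        others = [k for k in range(1, MAX_HARMONIC + 1) if k != n]
        hits = sum(_present(peaks, k * f0, tol) for k in others)
        best = max(best, hits / len(others))
    return best


def fundamental_candidates(
    peaks: Dict[float, float],        # peak frequency in Hz -> intensity in dB
    strength_margin_db: float = 40.0,  # assumed intensity rule
    fund_min: float = 0.5,             # assumed threshold for measure (2)
    harm_max: float = 0.5,             # assumed threshold for measure (3)
) -> List[float]:
    floor = max(peaks.values()) - strength_margin_db
    return sorted(
        f for f, strength in peaks.items()
        if strength > floor
        and fundamental_likeness(peaks, f) > fund_min
        and harmonic_likeness(peaks, f) < harm_max
    )
```

For a single 220 Hz tone with harmonics up to the eighth, only 220 Hz passes all three tests: 440 Hz, for example, is rejected because every harmonic of its hypothetical fundamental at 220 Hz is present. The thresholds are deliberately loose in spirit with the text, which prefers keeping a stray harmonic over dropping a true fundamental.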

[0009] In the signal synthesis unit 23, the CPU 5 proceeds one analysis section at a time. The following description assumes that the frequency analysis unit 22 has found N fundamental frequency candidates for a given analysis section. For the n-th fundamental frequency candidate counted from the lowest, the pitch, onset, and duration are determined from FIG. 4, while the instrument type X(n) and the intensity I(n) are treated as variables, and these are input to the synthesizer control unit 24. In the synthesizer control unit 24, the CPU 5 drives and controls the synthesizer 25 on the basis of the output of the signal synthesis unit 23 for the N fundamental frequency candidates, so that N instrument sounds are synthesized simultaneously. In the distance calculation unit 26, the CPU 5 performs FFT frequency analysis on the input sound and the synthesized sound of each analysis section in the same way as the frequency analysis unit 22, obtains their power spectra, and averages them over the analysis section to compute average power spectra. It then computes the Euclidean distance D between the average power spectra of the input sound and the synthesized sound.
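The distance step amounts to averaging per-frame power spectra over the analysis section and taking a Euclidean distance between the two averages. A minimal sketch follows, assuming each frame's power spectrum is already available as a list of equal length; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def average_spectrum(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-frame power spectra of one analysis section, bin by bin."""
    if not frames:
        raise ValueError("at least one frame is required")
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance D between two spectra of equal length."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bins")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def section_distance(
    input_frames: Sequence[Sequence[float]],
    synth_frames: Sequence[Sequence[float]],
) -> float:
    """Distance D between the input sound and the synthesized sound."""
    return euclidean_distance(
        average_spectrum(input_frames), average_spectrum(synth_frames)
    )
```

Averaging over the section discards timing within the section, so D compares timbre and level rather than exact waveform alignment.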

[0010] In the signal storage unit 27, when the Euclidean distance D calculated by the distance calculation unit 26 is at or below a certain threshold, the Euclidean distance D, the instrument types X, and the intensities I are stored in the RAM 6. The combination of instrument types X and intensities I is then changed, and the signal synthesis unit 23 repeats the same processing. When all possible combinations of X and I have been tried, processing moves to the code conversion unit 28. In the code conversion unit 28, the CPU 5 finds the minimum among the Euclidean distances between synthesized and input sound stored in the RAM 6 by the above processing, and outputs the X and I corresponding to that minimum as the instrument types and intensities for the analysis section. At this point, if I(n) is at or below a certain threshold, the n-th fundamental frequency candidate is regarded as a harmonic and removed. The processing from the signal synthesis unit 23 through the code conversion unit 28 is repeated for all analysis sections, after which processing ends.
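The search over instrument types X and intensities I is an exhaustive loop with a keep-if-close-enough rule, followed by a minimum and a pruning step. The sketch below illustrates that control flow only. The synthesizer is replaced by a toy function that adds two-partial templates into a four-bin spectrum; the templates, bin layout, thresholds, and all names are invented for illustration.

```python
import itertools
import math
from typing import Callable, List, Optional, Sequence, Tuple

Result = Tuple[float, Tuple[str, ...], Tuple[float, ...]]  # (D, X, I)


def search_best(
    input_spectrum: Sequence[float],
    n_candidates: int,
    instruments: Sequence[str],
    intensities: Sequence[float],
    synthesize: Callable[[Tuple[str, ...], Tuple[float, ...]], Sequence[float]],
    accept_threshold: float,
) -> Optional[Result]:
    """Try every (X, I) combination, store those with D <= threshold, return the minimum."""
    saved: List[Result] = []                      # plays the role of RAM 6
    for xs in itertools.product(instruments, repeat=n_candidates):
        for levels in itertools.product(intensities, repeat=n_candidates):
            d = math.dist(input_spectrum, synthesize(xs, levels))
            if d <= accept_threshold:
                saved.append((d, xs, levels))
    return min(saved, key=lambda r: r[0]) if saved else None


def prune_harmonics(best: Result, intensity_floor: float) -> List[Tuple[int, str, float]]:
    """Drop candidates whose best-fit intensity is at or below the floor."""
    _, xs, levels = best
    return [(n, x, i) for n, (x, i) in enumerate(zip(xs, levels)) if i > intensity_floor]


# Toy stand-in for synthesizer 25: (fundamental, 2nd-harmonic) amplitudes per instrument.
TEMPLATES = {"flute": (1.0, 0.1), "organ": (1.0, 0.8)}
# Candidate 0 sits at f (harmonic at 2f); candidate 1 sits at 2f (harmonic at 4f).
BINS = ((0, 1), (1, 3))


def toy_synthesize(xs: Tuple[str, ...], levels: Tuple[float, ...]) -> List[float]:
    out = [0.0] * 4
    for (b1, b2), x, level in zip(BINS, xs, levels):
        h1, h2 = TEMPLATES[x]
        out[b1] += level * h1
        out[b2] += level * h2
    return out
```

Given an input spectrum `[1.0, 0.8, 0.0, 0.0]` (one "organ" note whose second partial was also picked up as candidate 1), the search assigns intensity 0 to candidate 1, and the pruning step then removes it as a harmonic. The number of combinations grows exponentially with N, so a practical system would need coarse intensity steps or a smarter search; the patent text itself describes only the exhaustive form.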

[0011] The present invention is not limited to the embodiment detailed above, and various modifications can be made without departing from its gist. For example, in this embodiment the fundamental frequency candidate extraction unit 33 determines the pitch, onset, and duration of each tone and treats them as constants, but these can also be processed as variables.

[0012]

[Effects of the Invention] As described above, according to the present invention, a signal resembling the sampled music signal is synthesized by combining synthesizer sounds, and a code corresponding to a musical score is generated from the types, pitches, durations, and so on of the sounds used to generate the synthesized signal. Music can therefore be transcribed without restriction on the type or number of instruments.

[Brief description of the drawings]

FIG. 1 is a block diagram of an automatic transcription apparatus according to an embodiment of the present invention.

FIG. 2 is a functional configuration diagram of the automatic transcription apparatus.

FIG. 3 is a functional configuration diagram of the frequency analysis unit 22.

FIG. 4 is an explanatory diagram of the peak spectrum calculated by the FFT processing unit 31.

[Explanation of symbols]

5 CPU
21 signal capturing unit
22 frequency analysis unit
23 signal synthesis unit
24 synthesizer control unit
25 synthesizer
26 distance calculation unit
27 signal storage unit
28 code conversion unit
31 FFT processing unit
32 analysis section determination unit
33 fundamental frequency candidate extraction unit

Claims

(57) [Claims]

1. An automatic transcription apparatus for converting a music signal into a musical score or a code corresponding to a musical score, comprising: signal capturing means for sampling and capturing a music signal played by a plurality of musical instruments; a synthesizer capable of outputting the sounds of various musical instruments at various pitches; signal synthesizing means for causing the synthesizer to repeatedly synthesize various musical instrument sounds combined successively at various pitches; distance calculating means for repeatedly calculating the distance between the signal from the signal capturing means and the output signal of the synthesizer; signal storage means for storing information on each sound that generated the output signal with the highest degree of matching among the calculation results of the distance calculating means; and code conversion means for converting the information in the signal storage means into a musical score or a code corresponding to a musical score.