JP2604406B2

JP2604406B2 - Automatic music transcription method and device

Info

Publication number: JP2604406B2
Application number: JP4612088A
Authority: JP
Inventors: 七郎鶴田; 洋典高島; 正樹藤本; 正典水野
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-02-29
Filing date: 1988-02-29
Publication date: 1997-04-30
Anticipated expiration: 2012-04-30
Also published as: JPH01219630A

Abstract

PURPOSE: To determine pitch accurately by calculating the median of the pitch data in each section of an acoustic signal divided by segmentation processing, and determining the pitch of that section from the calculated value. CONSTITUTION: In response to a command from a keyboard 4, a CPU 1 stores a digital acoustic signal, such as singing, received through an acoustic signal input device 8 and an A/D converter 7, in an auxiliary storage device 6 used as working memory, and executes a program held in a main storage device 3. Pitch data is extracted from the acoustic signal, and segmentation processing divides the signal into sections that can each be regarded as having a single pitch. The median of the pitch data in each section is then extracted, and the pitch on the absolute pitch axis nearest to that median is identified as the pitch of the section. Even when the acoustic signal fluctuates, the pitch is identified well, which improves the accuracy of the score transcribed from the signal.

Description

[Detailed Description of the Invention]

[Field of Industrial Application] The present invention relates to an automatic transcription method and apparatus for creating musical score data from acoustic signals such as singing voices, humming voices, and instrument sounds. More particularly, it relates to a pitch identification process that identifies the pitch of a given section of an acoustic signal as a pitch on the absolute pitch axis.

[Background Art] An automatic transcription system that converts acoustic signals such as singing voices, humming voices, and instrument sounds into musical score data must detect, from the acoustic signal, the basic information of a score: note length, pitch, key, time signature, and tempo.

An acoustic signal, however, is merely a signal that continuously contains repetitions of a fundamental waveform, and none of the above information can be obtained from it directly.

Conventional automatic transcription systems therefore obtain this information in the following order. First, repetition information of the fundamental waveform, which represents the pitch of the acoustic signal (hereinafter called pitch information), and power information are extracted for each analysis cycle. Next, based on the extracted pitch information and/or power information, the acoustic signal is divided into sections (segments) that can each be regarded as having a single pitch; this processing is called segmentation. The pitch of each segment is then identified, from the segment's pitch information, as a pitch on the absolute pitch axis. The key of the acoustic signal is determined from the distribution of the pitch information around the pitch axis, and finally the time signature and tempo of the acoustic signal are determined from the segments.

[Problems to be Solved by the Invention] Even when one tries to identify a segment of an acoustic signal as a pitch on the absolute pitch axis, the pitch of an acoustic signal, particularly one produced by the human voice, is not stable and fluctuates considerably even when a single pitch is intended. This has made the pitch identification process very difficult.

Because pitch is, together with note length, a fundamental element of musical score data, it must be identified accurately. If it cannot be, the accuracy of the score data suffers.

The present invention has been made in view of the above. It proposes a new pitch identification method capable of identifying pitch accurately, with the aim of providing an automatic transcription method and apparatus that further improve the accuracy of the final musical score data.

[Means for Solving the Problems] To solve this problem, the first aspect of the invention is an automatic transcription method for converting an acoustic signal into musical score data, including at least: a process of extracting pitch information, which is the repetition period of the input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; a segmentation process of dividing the acoustic signal, based on the pitch information and/or the power information, into sections that can each be regarded as having a single pitch; and a pitch identification process of determining, for each such section, the pitch of the acoustic signal on the absolute pitch axis. In this method, the pitch identification process consists of a process of extracting the median of the pitch information of each section and a process of determining the pitch of each section to be the pitch on the absolute pitch axis closest to the extracted median.

The second aspect of the invention is an automatic transcription apparatus for converting an acoustic signal into musical score data, having among its parts: pitch/power extraction means for extracting pitch information, which is the repetition period of the input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; segmentation means for dividing the acoustic signal, based on the pitch information and/or the power information, into sections that can each be regarded as having a single pitch; and pitch identification means for determining, for each such section, the pitch of the acoustic signal on the absolute pitch axis. In this apparatus, the pitch identification means consists of a median extraction unit, which extracts the median of the pitch information of each section produced by the segmentation means, and a pitch determination unit, which determines the pitch of the section to be the pitch on the absolute pitch axis closest to the extracted median.

[Operation] The first aspect of the invention rests on the observation that, even though an acoustic signal fluctuates, it fluctuates around the pitch intended by its source, so the intended pitch corresponds very closely to the median of the pitch information. Accordingly, when the pitch of each section is identified as a pitch on the absolute pitch axis, the median of the section is extracted and the section is assigned the pitch on the absolute pitch axis closest to that median.

Likewise, in the second aspect of the invention, on the basis that the median of the pitch information is close to the intended pitch of the acoustic signal, the median extraction unit extracts the median of the pitch information of each segmented section, and the pitch determination unit identifies the pitch of that section as the pitch on the absolute pitch axis closest to the extracted median.

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings.

Automatic transcription system

First, the automatic transcription system to which the present invention is applied is described.

In FIG. 3, a central processing unit (CPU) 1 controls the entire apparatus and executes the transcription processing program shown in FIG. 4, which is stored in a main storage device 3 connected via a bus 2. In addition to the CPU 1 and the main storage device 3, the bus 2 connects a keyboard 4 as an input device, a display device 5 as an output device, an auxiliary storage device 6 used as working memory, and an analog/digital converter 7.

An acoustic signal input device 8, for example a microphone, is connected to the analog/digital converter 7. The acoustic signal input device 8 captures acoustic signals such as singing or humming produced by a user, or musical tones produced by an instrument, converts them into an electric signal, and outputs that signal to the analog/digital converter 7.

When processing is commanded from the keyboard 4, the CPU 1 starts the transcription process. Executing the program stored in the main storage device 3, it first stores the acoustic signal, converted into a digital signal by the analog/digital converter 7, in the auxiliary storage device 6. It then converts that signal into musical score data and outputs the result to the display device 5 as required.

Next, the transcription process that the CPU 1 executes after capturing the acoustic signal is described in detail, following the functional-level flowchart of FIG. 4.

First, the CPU 1 extracts pitch information for each analysis cycle by autocorrelation analysis of the acoustic signal, extracts power information for each analysis cycle by sum-of-squares processing, and then performs post-processing such as noise removal and smoothing (steps SP1, SP2). Next, from the distribution of the pitch information, the CPU 1 calculates the amount by which the pitch axis of the acoustic signal deviates from the absolute pitch axis, and performs a tuning process that shifts the pitch information by that amount (step SP3). In other words, the pitch information is corrected so that the difference between the absolute pitch axis and the pitch axis of the singer or instrument that produced the acoustic signal becomes small.
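The extraction of steps SP1 and SP2 can be illustrated with a minimal sketch. It assumes one analysis frame is a list of samples, takes the lag of the largest autocorrelation value as the pitch period, and omits the noise removal and smoothing mentioned above. The function names and the search range `fmin`/`fmax` are hypothetical and do not come from the patent.

```python
import math

def frame_power(frame):
    """Power of one analysis frame as the sum of squared samples."""
    return sum(x * x for x in frame)

def frame_pitch_hz(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate pitch from the lag that maximizes the autocorrelation.

    fmin and fmax bound the lag search; both are assumed values.
    Returns None when no positive autocorrelation peak is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None

# Usage: a 200 Hz sine sampled at 8 kHz has a period of exactly 40 samples.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
pitch = frame_pitch_hz(frame, sample_rate)
power = frame_power(frame)
```

A practical extractor would normally normalize the autocorrelation and reject weak peaks; this sketch only shows the two quantities the text names.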

The CPU 1 then performs segmentation that divides the acoustic signal into one segment per note by finding continuous periods over which the pitch information can be regarded as indicating the same pitch, and also performs segmentation based on changes in the power information (steps SP4, SP5). From the two sets of segment information thus obtained, the CPU 1 calculates a reference length corresponding to the duration of a quarter note, an eighth note, or the like, and performs segmentation again based on this reference length (step SP6).
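The patent does not give the rule used to decide that consecutive pitch values "indicate the same pitch". The sketch below illustrates only the pitch-based segmentation of step SP4 under an assumed rule: a new segment starts when a frame differs from the running mean of the current segment by more than a tolerance. The name `segment_by_pitch` and the 50-cent tolerance are hypothetical, and the power-based and reference-length segmentation are not shown.

```python
def segment_by_pitch(pitches_cents, tolerance_cents=50.0):
    """Split a pitch track into (start, end) frame ranges, end exclusive.

    Assumed rule: a frame opens a new segment when it deviates from the
    running mean of the current segment by more than tolerance_cents.
    """
    segments = []
    if not pitches_cents:
        return segments
    start, total = 0, pitches_cents[0]
    for i in range(1, len(pitches_cents)):
        mean = total / (i - start)  # mean of frames start .. i-1
        if abs(pitches_cents[i] - mean) > tolerance_cents:
            segments.append((start, i))
            start, total = i, 0.0
        total += pitches_cents[i]
    segments.append((start, len(pitches_cents)))
    return segments

# Usage: three notes roughly 200 cents apart.
track = [6000, 6010, 5990, 6200, 6210, 6195, 6400]
print(segment_by_pitch(track))  # [(0, 3), (3, 6), (6, 7)]
```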

Based on the pitch information of each segment thus obtained, the CPU 1 identifies the pitch of the segment as the pitch on the absolute pitch axis judged closest to that pitch information, and then performs segmentation again according to whether the identified pitches of consecutive segments are the same (steps SP7, SP8).

After that, the CPU 1 tallies the tuned pitch information to obtain the frequency of occurrence of the pitches of the segments, computes the sum of products of these frequencies and predetermined weighting coefficients defined for each key, and determines the key of the input piece, for example C major or A minor, from the key with the largest sum of products. For certain pitches on the scale of the determined key, it then reviews the identified pitch against the pitch information to confirm or correct it (steps SP9, SP10). Next, the CPU 1 reviews the segmentation according to whether consecutive segments have the same finally determined pitch and whether the power changes between consecutive segments, and produces the final segmentation (step SP11).
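The sum-of-products key decision of step SP9 can be sketched as follows. The patent does not list its weighting coefficients, so the weights here are purely illustrative assumptions: 2.0 for the tonic, 1.0 for the other notes of the scale, and 0.0 elsewhere. The names `key_weights` and `estimate_key` are likewise hypothetical. The input is a 12-element histogram of pitch-class occurrences, with index 0 standing for C.

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)
MINOR_SCALE = (0, 2, 3, 5, 7, 8, 10)  # natural minor

def key_weights(tonic, scale):
    """Assumed weights: 2.0 for the tonic, 1.0 for other scale notes, else 0.0."""
    in_key = {(tonic + step) % 12 for step in scale}
    return [2.0 if pc == tonic else 1.0 if pc in in_key else 0.0
            for pc in range(12)]

def estimate_key(histogram):
    """Return (tonic, mode) with the largest sum of histogram[pc] * weight[pc]."""
    best = None
    for mode, scale in (("major", MAJOR_SCALE), ("minor", MINOR_SCALE)):
        for tonic in range(12):
            weights = key_weights(tonic, scale)
            score = sum(h * w for h, w in zip(histogram, weights))
            if best is None or score > best[0]:
                best = (score, tonic, mode)
    return best[1], best[2]

# Usage: counts of C, D, E, F, G, A, B in a melody centred on C.
histogram = [5, 0, 2, 0, 3, 1, 0, 4, 0, 2, 0, 1]
print(estimate_key(histogram))  # (0, 'major')
```

Weighting the tonic more heavily is what lets this toy version separate relative keys such as C major and A minor, which share the same seven notes.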

Once the pitches and segments have been determined in this way, the CPU 1 extracts measures based on considerations such as the following: a piece starts on the first beat, the last note of a phrase does not extend into the next measure, and there is a break at each measure. It determines the time signature from this measure information and the segmentation information, and determines the tempo from the time signature and the measure length (steps SP12, SP13).

Finally, the CPU 1 organizes the determined pitch, note length, key, time signature, and tempo information and creates the musical score data (step SP14).

Pitch identification process

Next, the pitch identification process (step SP7) in this automatic transcription system is described in detail with reference to the flowchart of FIG. 1.

The CPU 1 first takes the first of the segments obtained by segmentation and then extracts the median of all the pitch information in that segment (steps SP20, SP21). Here, the median is defined with the segment's pitch information sorted by magnitude: if the number of data points is odd, it is the value of the middle one; if even, it is the average of the two middle values.
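The median definition above translates directly into code. The following is a minimal sketch; the function name `segment_median` is hypothetical, and the error raised for an empty segment is an added assumption.

```python
def segment_median(values):
    """Median as defined in the text.

    Odd count: the middle value of the sorted data.
    Even count: the average of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty segment is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Usage: one wildly wrong frame (6700) does not move the median.
print(segment_median([6000, 6005, 5995, 6700, 6002]))  # 6002
```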

The CPU 1 then identifies the pitch on the absolute pitch axis closest to the extracted median as the pitch of the segment (step SP22). The pitch of each segment is identified as one of the pitches spaced a semitone apart on the absolute pitch axis. The CPU 1 then determines whether the segment just identified is the last segment (step SP23). If it is, the processing program ends; if not, the next segment becomes the processing target and control returns to step SP21 (step SP24).
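The loop of steps SP20 to SP24 can be sketched as below. The sketch assumes that pitch information is expressed in cents on a scale where every multiple of 100 is a pitch on the absolute pitch axis, and that segments are `(start, end)` frame ranges with `end` exclusive. The function name and this data layout are hypothetical.

```python
import math
from statistics import median

def identify_pitches(pitches_cents, segments):
    """Snap the median pitch of each segment to the nearest 100-cent step."""
    notes = []
    for start, end in segments:
        center = median(pitches_cents[start:end])
        # Round half up to the nearest semitone on the assumed grid.
        notes.append(math.floor(center / 100.0 + 0.5) * 100)
    return notes

# Usage: the first segment contains an unstable frame (6340).
track = [5960, 6010, 6020, 6340, 6190, 6210, 6195, 6205]
print(identify_pitches(track, [(0, 4), (4, 8)]))  # [6000, 6200]
```

In this example the mean of the first segment is 6082.5 cents, which would snap to 6100, whereas the median of 6015 cents snaps to 6000. This is the robustness to fluctuation that the text attributes to the median.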

By repeating the processing loop of steps SP21 to SP24, pitch identification based on the median of the pitch information is carried out for every segment.

The median is used for pitch identification because, although the acoustic signal fluctuates, it is considered to fluctuate around the pitch the singer intends, so the median is considered to correspond to that intended pitch.

FIG. 2 shows an example of pitch identification by this process. The dotted curve PIT shows the pitch information of the acoustic signal, and the solid vertical lines VP show the segment boundaries. The median of each segment in this example is shown by the solid horizontal lines HP, and the identified pitch by the dotted horizontal lines HP. As FIG. 2 makes clear, the median deviates little from a pitch on the absolute pitch axis, so the pitch can be identified well. The pitch can also be identified without being affected by the unstable pitch information just before and after segment boundaries (for example, curve portions C1 and C2).

Thus, according to the embodiment described above, the median of the pitch information of each segment is extracted and the pitch of the segment is identified as the pitch on the absolute pitch axis closest to that median, so the pitch can be determined with high accuracy. Moreover, because the acoustic signal undergoes the tuning process before pitch identification, the median obtained by this method takes a value close to a pitch on the absolute pitch axis, which makes identification much easier.
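The patent states only that the tuning process (step SP3) derives a shift from the distribution of the pitch information. One possible way to do that, shown here purely as an assumption, is to take the circular mean of each frame's position within the 100-cent semitone cycle and subtract it. The function names are hypothetical, and pitch values are assumed to be in cents.

```python
import math

def tuning_offset_cents(pitches_cents):
    """Circular mean of the frames' offsets within a 100-cent cycle.

    The result is in the range (-50, 50]. Positive means the performer is
    sharp relative to the absolute pitch axis.
    """
    s = sum(math.sin(2 * math.pi * p / 100.0) for p in pitches_cents)
    c = sum(math.cos(2 * math.pi * p / 100.0) for p in pitches_cents)
    return math.atan2(s, c) * 100.0 / (2 * math.pi)

def tune(pitches_cents):
    """Shift every pitch value by the estimated offset."""
    offset = tuning_offset_cents(pitches_cents)
    return [p - offset for p in pitches_cents]

# Usage: a singer who is consistently 30 cents sharp.
sharp = [6030, 6230, 6430, 6030]
offset = tuning_offset_cents(sharp)   # close to 30.0
tuned = tune(sharp)                   # close to [6000, 6200, 6400, 6000]
```

A circular mean is used because an offset of +49 cents and one of -49 cents are neighbours on the semitone cycle, which a plain arithmetic mean would not reflect.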

Other embodiments

The pitch information used in the pitch identification process may be expressed in Hz, the unit of frequency, or in cents, a unit often used in music.
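The two units are related by a logarithm: 1200 cents correspond to one octave, that is, a doubling of frequency. A minimal conversion sketch follows. The reference frequency of 440 Hz is an assumption, since the patent does not name a reference, and the function names are hypothetical.

```python
import math

REFERENCE_HZ = 440.0  # assumed reference (A4); not specified in the patent

def hz_to_cents(freq_hz, reference_hz=REFERENCE_HZ):
    """Cents above the reference frequency: 1200 * log2(f / f_ref)."""
    return 1200.0 * math.log2(freq_hz / reference_hz)

def cents_to_hz(cents, reference_hz=REFERENCE_HZ):
    """Inverse conversion: f = f_ref * 2 ** (cents / 1200)."""
    return reference_hz * 2.0 ** (cents / 1200.0)

# Usage: one octave above the reference is 1200 cents.
print(hz_to_cents(880.0))  # 1200.0
```

On a cent scale the semitones of the absolute pitch axis are evenly spaced, 100 cents apart, so finding the nearest pitch reduces to rounding. In Hz the spacing grows with frequency.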

In the embodiment described above, the CPU 1 executes all of the processing shown in FIG. 4 according to the program stored in the main storage device 3, but part or all of that processing may instead be performed in hardware. For example, as shown in FIG. 5, in which parts corresponding to FIG. 3 bear the same reference numerals, the acoustic signal from the acoustic signal input device 8 may be amplified by an amplifier circuit 10, passed through a pre-filter 11, and converted into a digital signal by an analog/digital converter 12. A signal processor 13 may then extract pitch information from the digital signal by autocorrelation analysis and power information by sum-of-squares processing, and supply them to the software processing performed by the CPU 1. As the signal processor 13 in this hardware configuration (10 to 13), a processor that can process voice-band signals in real time and that provides interface signals to the host CPU 1 (for example, the μPD7720 made by NEC Corporation) can be used.

[Effects of the Invention] As described above, according to the present invention the pitch of each segment is identified from the median of the segment's pitch information, so the pitch can be determined well and the accuracy of the musical score data is further improved.

[Brief description of the drawings]

FIG. 1 is a flowchart showing the pitch identification process according to an embodiment of the present invention; FIG. 2 is a schematic diagram showing an example of that pitch identification process; FIG. 3 is a block diagram showing the configuration of an automatic transcription system to which the present invention is applied; FIG. 4 is a flowchart showing its automatic transcription procedure; and FIG. 5 is a block diagram showing another configuration of the automatic transcription system.

1: CPU; 3: main storage device; 6: auxiliary storage device; 7: analog/digital converter; 8: acoustic signal input device.

Continued from the front page

(72) Inventor: Masaki Fujimoto, c/o NEC Technical Information System Development Co., Ltd., 5-7-15 Shiba, Minato-ku, Tokyo
(72) Inventor: Masanori Mizuno, c/o NEC Technical Information System Development Co., Ltd., 5-7-15 Shiba, Minato-ku, Tokyo
Examiner: Shigeo Arai

Claims

(57) [Claims]

1. An automatic transcription method for converting an acoustic signal into musical score data, the method including at least: a process of extracting pitch information, which is the repetition period of an input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; a segmentation process of dividing the acoustic signal, based on the pitch information and/or the power information, into sections that can each be regarded as having the same pitch; and a pitch identification process of determining, for each divided section, the pitch of the acoustic signal on the absolute pitch axis, characterized in that the pitch identification process comprises a process of extracting the median of the pitch information of each divided section, and a process of determining the pitch of each section to be the pitch on the absolute pitch axis closest to the extracted median.

2. An automatic transcription apparatus for converting an acoustic signal into musical score data, the apparatus having among its parts: pitch/power extraction means for extracting pitch information, which is the repetition period of an input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; segmentation means for dividing the acoustic signal, based on the pitch information and/or the power information, into sections that can each be regarded as having the same pitch; and pitch identification means for determining, for each divided section, the pitch of the acoustic signal on the absolute pitch axis, characterized in that the pitch identification means comprises a median extraction unit that extracts the median of the pitch information of each section divided by the segmentation means, and a pitch determination unit that determines the pitch of the section to be the pitch on the absolute pitch axis closest to the extracted median.