JP2614631B2

JP2614631B2 - Automatic music transcription method and device

Info

Publication number: JP2614631B2
Application number: JP4611988A
Authority: JP
Inventors: 七郎鶴田; 洋典高島; 正樹藤本; 正典水野
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-02-29
Filing date: 1988-02-29
Publication date: 1997-05-28
Anticipated expiration: 2012-05-28
Also published as: JPH01219629A

Abstract

PURPOSE: To determine the interval easily and accurately by determining the interval of each section, for sections of an acoustic signal divided by segmentation processing, on the basis of the average value of the pitch data in that section. CONSTITUTION: In response to a command from a keyboard 4, a CPU 1 stores the digital acoustic signal obtained through an acoustic signal input device 8 and an A/D converter 7 in an auxiliary memory device 6, which serves as working memory, and executes the program held in a main memory device 3. The pitch data of the acoustic signal is extracted, and segmentation processing divides the acoustic signal into sections of the same interval. The average value of the pitch data of each section is then calculated, and the interval on the absolute interval axis nearest to that average value is determined as the interval of the section. The interval is thus determined easily and accurately, which improves the accuracy of the score transcribed automatically from the acoustic signal.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to an automatic music transcription method and apparatus for creating musical score data from acoustic signals such as singing voices, humming voices, and instrument sounds, and more particularly to pitch identification processing that identifies the pitch of a given section of the acoustic signal as a pitch on the absolute pitch axis.

[Prior Art] An automatic transcription system that converts an acoustic signal such as a singing voice, a humming voice, or an instrument sound into musical score data must detect, from the acoustic signal, the note length, pitch, key, time signature, and tempo that make up the basic information of a score.

An acoustic signal, however, is merely a signal that continuously contains repetitions of a basic waveform, so none of this information can be obtained from it directly.

Therefore, a conventional automatic transcription system obtains each item of information in the following order. First, repetition information of the basic waveform, which represents the pitch of the acoustic signal (hereinafter called pitch information), and power information are extracted for each analysis cycle. Next, the acoustic signal is divided, on the basis of the extracted pitch information and/or power information, into sections (segments) that can each be regarded as having the same pitch (this processing is called segmentation). The pitch of the acoustic signal in each segment along the absolute pitch axis is then identified from the pitch information of the segment, the key of the acoustic signal is determined from the distribution of the pitch information around the pitch axis, and finally the time signature and tempo of the acoustic signal are determined from the segments.

[Problems to be Solved by the Invention] Even when one tries to identify a segment of an acoustic signal as a pitch on the absolute pitch axis, the pitch of an acoustic signal, particularly one uttered by a person, is not stable and fluctuates considerably even when a single pitch is intended. This has made pitch identification processing very difficult.

Pitch, together with note length, is a fundamental element of musical score data and must therefore be identified accurately. If it cannot be identified accurately, the accuracy of the musical score data suffers.

The present invention has been made in view of the above points. It proposes a new pitch identification method that can identify pitch accurately, and aims to provide an automatic transcription method and apparatus that further improve the accuracy of the final musical score data.

[Means for Solving the Problems] To solve this problem, the first aspect of the present invention is an automatic transcription method for converting an acoustic signal into musical score data, including at least: processing that extracts pitch information, which is the repetition period of the input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; segmentation processing that divides the acoustic signal, on the basis of the pitch information and/or the power information, into sections that can each be regarded as having the same pitch; and pitch identification processing that determines, for each divided section, the pitch of the acoustic signal on the absolute pitch axis. In this method, the pitch identification processing consists of processing that calculates the average value of the pitch information for each divided section and processing that determines the pitch of each section to be the pitch on the absolute pitch axis closest to the calculated average value.

The second aspect of the present invention is an automatic transcription apparatus for converting an acoustic signal into musical score data, comprising in part: pitch/power extraction means that extracts pitch information, which is the repetition period of the input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; segmentation means that divides the acoustic signal, on the basis of the pitch information and/or the power information, into sections that can each be regarded as having the same pitch; and pitch identification means that determines, for each divided section, the pitch of the acoustic signal on the absolute pitch axis. In this apparatus, the pitch identification means consists of an average value calculation unit that calculates the average value of the pitch information for each section divided by the segmentation means, and a pitch determination unit that determines the pitch of the section to be the pitch on the absolute pitch axis closest to the calculated average value.

[Operation] In identifying the pitch of each section as a pitch on the absolute pitch axis, the first aspect of the present invention relies on the observation that even though the acoustic signal fluctuates, it fluctuates around the pitch intended by its source, so that the intended pitch corresponds very well to the average value of the pitch information. The average value of each section is therefore calculated, and the section is identified as the pitch on the absolute pitch axis closest to that average value.

Likewise, the second aspect of the present invention relies on the fact that the average value of the pitch information is close to the intended pitch of the acoustic signal. The average value calculation unit calculates the average value of the pitch information of each segmented section, and the pitch determination unit identifies the pitch of the section as the pitch on the absolute pitch axis closest to that calculated average value.

[Embodiment] An embodiment of the present invention will now be described in detail with reference to the drawings.

Automatic transcription system

First, the automatic transcription system to which the present invention is applied will be described.

In FIG. 3, a central processing unit (CPU) 1 controls the apparatus as a whole and executes the transcription processing program shown in FIG. 4, which is stored in a main storage device 3 connected to it via a bus 2. In addition to the CPU 1 and the main storage device 3, a keyboard 4 serving as an input device, a display device 5 serving as an output device, an auxiliary storage device 6 used as working memory, and an analog/digital converter 7 are connected to the bus 2.

An acoustic signal input device 8, for example a microphone, is connected to the analog/digital converter 7. The acoustic signal input device 8 captures acoustic signals such as singing or humming uttered by a user or musical tones produced by an instrument, converts them into an electric signal, and outputs that electric signal to the analog/digital converter 7.

When processing is commanded from the keyboard 4, the CPU 1 starts the transcription processing. Executing the program stored in the main storage device 3, it first stores the acoustic signal, converted into a digital signal by the analog/digital converter 7, in the auxiliary storage device 6, then converts the acoustic signal into musical score data and outputs the result to the display device 5 as required.

Next, the transcription processing that the CPU 1 executes after the acoustic signal has been captured will be described in detail with reference to the function-level flowchart of FIG. 4.

First, the CPU 1 performs autocorrelation analysis on the acoustic signal to extract its pitch information for each analysis cycle, performs sum-of-squares processing to extract power information for each analysis cycle, and then executes post-processing such as noise removal and smoothing (steps SP1, SP2). Next, for the pitch information, the CPU 1 calculates, from its distribution, the amount by which the pitch axis of the acoustic signal deviates from the absolute pitch axis, and executes tuning processing that shifts the obtained pitch information by that amount (step SP3). In other words, the pitch information is corrected so that the difference between the pitch axis of the singer or instrument that produced the acoustic signal and the absolute pitch axis becomes small.
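Steps SP1 and SP2 can be illustrated with a minimal Python sketch. It assumes one analysis frame is available as a list of samples and that the pitch period is taken as the lag that maximizes the raw (unnormalized) autocorrelation within a search range. The function names, the lag range, and the 8 kHz sampling rate are illustrative assumptions, not details given in the patent.

```python
import math


def frame_power(frame):
    """Power information of one frame as the sum of squared samples (SP2)."""
    return sum(x * x for x in frame)


def autocorr_pitch_period(frame, min_lag, max_lag):
    """Return the lag (in samples) with the largest autocorrelation (SP1).

    min_lag and max_lag bound the search; they are assumed parameters.
    """
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Raw autocorrelation at this lag over the overlapping samples.
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


if __name__ == "__main__":
    sample_rate = 8000  # assumed sampling rate in Hz
    # Synthetic test tone with a period of exactly 40 samples.
    frame = [math.sin(2 * math.pi * n / 40) for n in range(400)]
    period = autocorr_pitch_period(frame, min_lag=20, max_lag=100)
    print("period (samples):", period)
    print("pitch (Hz):", sample_rate / period)
    print("power:", frame_power(frame))
```

A practical extractor would also normalize the autocorrelation and apply the noise removal and smoothing mentioned in the text; those steps are omitted here for brevity.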

The CPU 1 then finds the continuous periods over which the obtained pitch information can be considered to indicate the same pitch and executes segmentation that divides the acoustic signal into one segment per note; it also executes segmentation based on changes in the obtained power information (steps SP4, SP5). From the two sets of segment information thus obtained, the CPU 1 calculates a reference length corresponding to the duration of a quarter note or an eighth note and executes segmentation again on the basis of this reference length (step SP6).
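The pitch-based segmentation of step SP4 could look roughly like the following Python sketch. The patent does not state the criterion for "the same pitch", so this sketch assumes pitch values in cents and starts a new segment whenever a frame departs from the running mean of the current segment by more than a tolerance. The function name and the 50-cent tolerance are hypothetical.

```python
def segment_by_pitch(pitches_cents, tolerance=50.0):
    """Split a pitch track into (start, end) index ranges, end exclusive.

    A new segment begins when a frame differs from the running mean of the
    current segment by more than `tolerance` cents (an assumed criterion).
    """
    if not pitches_cents:
        return []
    segments = []
    start = 0
    total = pitches_cents[0]  # running sum of the current segment
    for i in range(1, len(pitches_cents)):
        mean = total / (i - start)
        if abs(pitches_cents[i] - mean) > tolerance:
            segments.append((start, i))  # close the current segment
            start = i
            total = 0.0
        total += pitches_cents[i]
    segments.append((start, len(pitches_cents)))
    return segments


if __name__ == "__main__":
    track = [700, 710, 695, 705, 200, 190, 205, 500, 510]
    print(segment_by_pitch(track))
```

The power-based segmentation of step SP5 and the reference-length pass of step SP6 would refine these boundaries further and are not shown.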

On the basis of the pitch information of each segment obtained in this way, the CPU 1 identifies the pitch of the segment as the pitch on the absolute pitch axis to which that pitch information is judged to be closest, and then executes segmentation again according to whether consecutive segments have been identified as the same pitch (steps SP7, SP8).

After that, the CPU 1 computes the sum of products of the frequencies of occurrence of the pitches, obtained by tallying the tuned pitch information, and predetermined weighting coefficients defined for each key, and determines the key of the piece in the input acoustic signal, for example C major or A minor, from the key that gives the largest sum of products. For certain pitches on the scale of the determined key, it then re-examines the pitch information to confirm and correct the identified pitch (steps SP9, SP10). Next, the CPU 1 reviews the segmentation according to whether consecutive segments share the same finally determined pitch and whether the power changes between consecutive segments, and performs the final segmentation (step SP11).
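The sum-of-products key decision of step SP9 can be sketched in Python as below. The patent does not list its weighting coefficients, so the weights here (2 for the tonic, 1 for the other major-scale degrees, 0 elsewhere) are purely hypothetical, and only the twelve major keys are scored for brevity. The histogram is assumed to hold occurrence counts per pitch class, with index 0 standing for C.

```python
# Hypothetical weights per scale degree of a major key, indexed by the
# number of semitones above the tonic. Not taken from the patent.
MAJOR_WEIGHTS = [2, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1]


def key_scores(histogram):
    """Sum of products of pitch-class counts and weights for each tonic."""
    scores = []
    for tonic in range(12):
        score = sum(
            histogram[(tonic + degree) % 12] * MAJOR_WEIGHTS[degree]
            for degree in range(12)
        )
        scores.append(score)
    return scores


def determine_key(histogram):
    """Return the tonic pitch class (0 = C) with the largest score."""
    scores = key_scores(histogram)
    return max(range(12), key=lambda tonic: scores[tonic])


if __name__ == "__main__":
    # Counts for C, C#, D, ..., B in a melody that stays in C major.
    histogram = [4, 0, 2, 0, 3, 1, 0, 3, 0, 1, 0, 1]
    print("scores:", key_scores(histogram))
    print("tonic pitch class:", determine_key(histogram))
```

Distinguishing a major key from its relative minor, as in the C major versus A minor example in the text, would need a separate weight table for minor keys, since both share the same scale tones.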

Once the pitches and segments have been determined in this way, the CPU 1 extracts measures on the assumptions that a piece starts on the first beat, that the last note of a phrase does not extend into the next measure, and that there is a break at each measure. It determines the time signature from this measure information and the segmentation information, and determines the tempo from the determined time signature and the length of a measure (steps SP12, SP13).
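One plausible reading of the tempo step (SP13) is simple arithmetic on the number of beats per measure and the measure duration. The Python sketch below assumes the measure length has already been converted to seconds and that tempo is expressed in beats per minute; the function name is hypothetical and the patent does not specify the formula.

```python
def tempo_bpm(beats_per_measure, measure_seconds):
    """Tempo in beats per minute from the meter and the measure duration."""
    if measure_seconds <= 0:
        raise ValueError("measure_seconds must be positive")
    return 60.0 * beats_per_measure / measure_seconds


if __name__ == "__main__":
    # A 4-beat measure lasting 2 seconds.
    print(tempo_bpm(4, 2.0))
```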

Finally, the CPU 1 organizes the determined pitch, note length, key, time signature, and tempo information and creates the musical score data (step SP14).

Pitch identification processing

Next, the pitch identification processing (see step SP7) in this automatic transcription system will be described in detail with reference to the flowchart of FIG. 1.

The CPU 1 first takes the first of the segments obtained by segmentation and then calculates the average value of all the pitch information within that segment (steps SP20, SP21).

The CPU 1 then identifies the pitch on the absolute pitch axis closest to the calculated average value as the pitch of the segment (step SP22). The pitch of each segment of the acoustic signal is identified as one of the pitches on the absolute pitch axis, which are spaced a semitone apart. The CPU 1 then determines whether the segment whose pitch has just been identified is the last segment (step SP23). If it is, the CPU 1 ends the processing program; if not, it takes the next segment as the processing target and returns to step SP21 (step SP24).
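The loop of steps SP20 to SP24 can be sketched in Python as follows. The sketch assumes the pitch information of each segment is given in cents on an axis where semitones fall on multiples of 100; that representation, the function names, and the explicit handling of exact halfway values are assumptions made for illustration.

```python
import math

SEMITONE_CENTS = 100.0  # spacing of pitches on the assumed absolute axis


def nearest_semitone(cents):
    """Nearest multiple of 100 cents; exact halfway values round upward."""
    return int(math.floor(cents / SEMITONE_CENTS + 0.5)) * 100


def identify_pitches(segments):
    """Identify one pitch per segment from the mean of its pitch values.

    `segments` is a list of non-empty lists of pitch values in cents.
    """
    identified = []
    for values in segments:                 # SP20 / SP24: walk the segments
        mean = sum(values) / len(values)    # SP21: average pitch information
        identified.append(nearest_semitone(mean))  # SP22: nearest pitch
    return identified                       # SP23: stop after last segment


if __name__ == "__main__":
    segments = [[690, 710, 705], [195, 180, 210, 190]]
    print(identify_pitches(segments))
```

Because the mean smooths out fluctuations around the intended note, a single outlying frame within a segment has limited influence on which semitone is chosen.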

By repeating the processing loop consisting of steps SP21 to SP24, pitch identification based on the average value of the pitch information within the segment is carried out for every segment.

The average value is used for pitch identification because, although the acoustic signal fluctuates, it is considered to fluctuate around the pitch intended by the singer or other source, so the average value is considered to correspond to that intended pitch.

FIG. 2 shows an example of pitch identification by this processing. The dotted curve P1 shows the pitch information of the acoustic signal, and the vertical solid lines VR show the segment boundaries. The average value of each segment in this example is shown by a horizontal solid line HR, and the identified pitch by a horizontal dotted line HP. As FIG. 2 makes clear, the average value deviates only slightly from a pitch on the absolute pitch axis, so the pitch can be identified reliably.

According to the embodiment described above, the average value of the pitch information is calculated for each segment and the pitch of the segment is identified as the pitch on the absolute pitch axis closest to that average value, so the pitch can be determined with high accuracy. Moreover, because the acoustic signal undergoes tuning processing before pitch identification, the average value lies close to a pitch on the absolute pitch axis, which makes identification very easy.

Other embodiments

The pitch information used for pitch identification processing may be expressed in Hz, the unit of frequency, or in cents, a unit commonly used in the field of music.
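For reference, pitch values in Hz can be converted to and from a logarithmic musical unit with a few lines of Python. The sketch assumes the musical unit in question is the cent (1200 cents per octave) and uses A4 = 440 Hz as the reference frequency; the reference and the function names are assumptions, not values stated in the patent.

```python
import math

A4_HZ = 440.0  # assumed reference frequency


def hz_to_cents(freq_hz, ref_hz=A4_HZ):
    """Distance of freq_hz from ref_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)


def cents_to_hz(cents, ref_hz=A4_HZ):
    """Frequency in Hz lying `cents` above (or below) ref_hz."""
    return ref_hz * 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    print(hz_to_cents(880.0))    # one octave above the reference
    print(cents_to_hz(-1200.0))  # one octave below the reference
```

On a cent scale the semitones are evenly spaced, so finding the nearest pitch on the absolute axis reduces to rounding to a multiple of 100; with values in Hz, the comparison has to account for the unequal spacing between neighboring semitones.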

In the embodiment described above, the CPU 1 executes all the processing shown in FIG. 4 in accordance with the program stored in the main storage device 3, but some or all of that processing may instead be executed by hardware. For example, as shown in FIG. 5, in which parts corresponding to those in FIG. 3 bear the same reference numerals, the acoustic signal from the acoustic signal input device 8 may be amplified by an amplifier circuit 10 and passed through a pre-filter 11 to an analog/digital converter 12, where it is converted into a digital signal. A signal processor 13 then performs autocorrelation analysis on this digital signal to extract pitch information and sum-of-squares processing to extract power information, and supplies both to the software processing system of the CPU 1. As the signal processor 13 in such a hardware configuration (10 to 13), a processor that can process voice-band signals in real time and that provides interface signals for the host CPU 1 (for example, the μPD7720 made by NEC Corporation) can be used.

[Effects of the Invention] As described above, according to the present invention, the pitch of each segment is identified on the basis of the average value of the pitch information of the segment, so the pitch can be determined reliably and the accuracy of the musical score data can be further improved.

[Brief description of the drawings]

FIG. 1 is a flowchart showing pitch identification processing according to an embodiment of the present invention; FIG. 2 is a schematic diagram showing an example of that pitch identification processing; FIG. 3 is a block diagram showing the configuration of an automatic transcription system to which the present invention is applied; FIG. 4 is a flowchart showing its automatic transcription procedure; and FIG. 5 is a block diagram showing another configuration of the automatic transcription system.

1: CPU; 3: main storage device; 6: auxiliary storage device; 7: analog/digital converter; 8: acoustic signal input device.

Continued from the front page

(72) Inventor: Masaki Fujimoto, NEC Technical Information Systems Development Co., Ltd., 5-7-15 Shiba, Minato-ku, Tokyo
(72) Inventor: Masanori Mizuno, NEC Technical Information Systems Development Co., Ltd., 5-7-15 Shiba, Minato-ku, Tokyo
Examiner: Shigeo Arai

Claims

(57) [Claims]

1. An automatic transcription method for converting an acoustic signal into musical score data, the method including at least: processing that extracts pitch information, which is the repetition period of the input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; segmentation processing that divides the acoustic signal, on the basis of the pitch information and/or the power information, into sections that can each be regarded as having the same pitch; and pitch identification processing that determines, for each divided section, the pitch of the acoustic signal on the absolute pitch axis, wherein the pitch identification processing comprises: processing that calculates the average value of the pitch information for each of the divided sections; and processing that determines the pitch of each section to be the pitch on the absolute pitch axis closest to the calculated average value.

2. An automatic transcription apparatus for converting an acoustic signal into musical score data, the apparatus comprising in part: pitch/power extraction means that extracts pitch information, which is the repetition period of the input acoustic signal waveform and represents its pitch, and power information of the acoustic signal; segmentation means that divides the acoustic signal, on the basis of the pitch information and/or the power information, into sections that can each be regarded as having the same pitch; and pitch identification means that determines, for each divided section, the pitch of the acoustic signal on the absolute pitch axis, wherein the pitch identification means comprises: an average value calculation unit that calculates the average value of the pitch information for each of the sections divided by the segmentation means; and a pitch determination unit that determines the pitch of each section to be the pitch on the absolute pitch axis closest to the calculated average value.