JP2003242439A

JP2003242439A - Musical score recognizing device

Info

Publication number: JP2003242439A
Application number: JP2003030971A
Authority: JP
Inventors: Seiji Nakano; 誠至中野; Ren Sumida; 錬澄田; Tetsuo Hino; 鉄夫日野; Atsushi Ooba; 厚始大場
Original assignee: Kawai Musical Instrument Manufacturing Co Ltd
Current assignee: Kawai Musical Instrument Manufacturing Co Ltd
Priority date: 2003-02-07
Filing date: 2003-02-07
Publication date: 2003-08-29
Anticipated expiration: 2015-09-29
Also published as: JP3812836B2

Abstract

PROBLEM TO BE SOLVED: To provide a musical score recognition device capable of efficiently recognizing a musical score by simple processing.

SOLUTION: This musical score recognition device recognizes various symbols in input score image data and converts them into performance information. It has a separating means that separates the parts forming thin lines from the score image data; a thin-symbol detecting means that detects symbols composed of thin lines from the separated thin-line image; a thick-symbol detecting means that detects filled noteheads, beams, and flags from the image data that remains after the thin-line parts are excluded; and a note detecting means that detects notes from the results of the thin-symbol and thick-symbol detecting means. Because symbols are recognized efficiently from image data separated by simple processing, the recognition rate improves.

COPYRIGHT: (C)2003, JPO

Description

Detailed Description of the Invention

[0001]

Technical Field of the Invention

The present invention relates to a musical score recognition device, and in particular to one that can efficiently recognize the components of a score with simple processing.

[0002]

Prior Art

Some conventional score recognition devices take score image data read by, for example, a scanner, recognize the staves, notes, and other symbols, and generate performance data such as MIDI file data. To recognize the symbols in the score, one known method detects the staff lines and erases them, thereby separating the labels (the images of the individual symbols).

[0003]

Problem to Be Solved by the Invention

When a performer reads a score, the pitch is normally judged from the filled notehead written on the staff, and the duration from, among other things, the number of thick flags (flags or beams) attached to the thin vertical line (stem) extending from that notehead. The reader thus appears to separate the thin and thick parts of the score unconsciously.

[0004] The biggest problem in conventional score recognition devices, on the other hand, is symbol separation. In a score the symbols are drawn on the staff, so during recognition the staff lines join the labels of the individual musical symbols together, which makes recognition difficult. Many score recognition devices therefore erase the staff lines as preprocessing, taking care to avoid splitting labels unnecessarily. Even so, because a score arranges many symbols so that they are musically easy to read, many labels remain in which symbols touch each other, such as an accidental touching a ledger line.

[0005] A further property characteristic of scores is that some symbols have no fixed shape. Notes, for example, have a regular structure, but their shapes as labels vary endlessly. Matching each label against a dictionary therefore cannot recognize them, and a structural-analysis approach is needed. With staff-line erasure, the conventional symbol separation method, labels that touch each other cannot be separated, and symbols whose shape varies, such as notes, are difficult to recognize. An object of the present invention is to solve these problems of the prior art and to provide a musical score recognition device that can recognize a score efficiently with simple processing.

[0006]

Means for Solving the Problem

The present invention is a musical score recognition device that recognizes various symbols from input score image data, characterized by comprising: a separating means that separates, from the score image data, the image data of the parts forming thin lines according to a threshold corresponding to a predetermined line width; a thin-symbol detecting means that detects thin lines as symbols from the image data of the thin-line parts separated by the separating means; a thick-symbol detecting means that detects filled noteheads, beams, and flags from the image data that is included in the score image data but is not part of the thin-line image data; and a note detecting means that detects notes from the detection results of the thin-symbol detecting means and the thick-symbol detecting means.

With this configuration, the image data of the thin-line parts of the score image and the remaining image data are separated in advance, for example as preprocessing for score recognition. This yields a recognition method that is intuitively close to the way a person reads a score: thin lines (thin symbols) and filled noteheads, beams, and flags (thick symbols) are detected efficiently from the separated image data and combined to recognize notes, and the recognition rate improves. Bar lines, stems, ledger lines, and the like are all thin lines but may differ in thickness; using different thin-line detection thresholds for the vertical and horizontal directions accommodates these differences.

[0007]

Embodiments of the Invention

Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of one embodiment of the musical score recognition device of the present invention. The device is a general-purpose computer system, such as a personal computer, to which a scanner and a MIDI interface circuit have been added. The CPU 1 is a central processing unit that controls the entire device according to programs stored in the ROM 2 or the RAM 3. It also incorporates a timer circuit that interrupts the CPU 1 at a preset period. The RAM 3 is used as a program area, an image data buffer, a work area, and so on. The hard disk drive HDD 4 and the floppy disk drive FDD 5 store programs, image data, performance data, and the like. Under the control of the CPU 1, the CRT 6 displays the video information output from the CRT interface circuit 7, and information entered on the keyboard 8 reaches the CPU 1 through the keyboard interface circuit 9. The printer 10 prints the print information output from the printer interface circuit 11 under the control of the CPU 1.

[0008] The scanner 12 optically scans a (printed) score and converts it into binary or grayscale image data; any type may be used, such as flatbed, handheld, or sheet-fed. The image information read by the scanner 12 is transferred to the RAM 3 or the HDD 4 through the scanner interface circuit 13. The MIDI interface circuit 14 exchanges MIDI data with external MIDI equipment such as a sound module. The bus 15 connects the circuits within the device. A pointing device such as a mouse, a serial interface circuit such as RS-232C, and the like may also be provided.

[0009] FIG. 3 is a flowchart showing the main processing of the CPU 1. In S1, the scanner 12 reads the score image into the RAM 3 as a binary image. In S2, image smoothing such as figure fusion is performed to reduce faint strokes and dot noise. In S3, an image quality check is performed. To obtain magnification and density information, and reference data for the later staff detection, the check first measures the line width of the staff lines and the space width between adjacent staff lines. To do so, the image is scanned in the vertical (y) direction at several positions along the horizontal (x) direction, the lengths of all black runs (consecutive black pixels) and white runs are measured, and a frequency distribution (histogram) is built over run length.

[0010] Staff lines are the most numerous symbol on a score, so the peaks of the black run-length histogram and the white run-length histogram give estimates of the staff line width and space width, respectively. The magnification of the image data can then be estimated from, for example, the space width, and the density from the ratio of line width to space width. Because the recognition rate drops when the magnification or density falls outside a predetermined range, S3 checks whether these values lie within that range. S4 determines whether the S3 check found the image quality acceptable; if not, processing returns to S1 and the image is read again with a different magnification or density.
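The run-length histogram step of S3 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the binary image is a list of rows of 0/1 values (1 = black), and the names `run_lengths` and `estimate_staff_metrics` are hypothetical.

```python
from collections import Counter

def run_lengths(column):
    """Return (black_runs, white_runs) for one column of 0/1 pixels (1 = black)."""
    black, white = [], []
    if not column:
        return black, white
    current, length = column[0], 1
    for pixel in column[1:]:
        if pixel == current:
            length += 1
        else:
            (black if current else white).append(length)
            current, length = pixel, 1
    (black if current else white).append(length)
    return black, white

def estimate_staff_metrics(image, sample_xs):
    """Estimate (line_width, space_width) as the histogram peaks of run lengths."""
    black_hist, white_hist = Counter(), Counter()
    for x in sample_xs:  # scan vertically at a few x positions only
        column = [row[x] for row in image]
        black, white = run_lengths(column)
        black_hist.update(black)
        white_hist.update(white)
    line_width = black_hist.most_common(1)[0][0]
    space_width = white_hist.most_common(1)[0][0]
    return line_width, space_width
```

The ratio `line_width / space_width` could then serve as the density measure mentioned in the text, and `space_width` as the magnification measure; the acceptable ranges are not given in the source.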

[0011] In S5, staff recognition is performed. It consists broadly of detecting a staff scan start position and detecting the staff shift amount. In outline, start-position detection measures the widths of the successive black and white runs at some position along the x axis and looks for a place where the measured line widths and space widths are arranged like a staff, allowing for some error. To exclude the effect of ledger lines (horizontal lines added for notes that lie outside the staff), a further condition requires a white run wider than the space width on both sides of the staff-like arrangement. At an x position with a run sequence that meets these conditions, the midpoints of the five black runs are taken as the staff scan start position.
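A minimal sketch of the start-position search for a single column, under stated assumptions: the column is a list of 0/1 pixels, `line_width` and `space_width` come from the earlier image quality check, and the tolerance `tol` and the function names are hypothetical.

```python
def column_runs(column):
    """Return a list of (value, start, length) runs for a column of 0/1 pixels."""
    runs = []
    start = 0
    for y in range(1, len(column) + 1):
        if y == len(column) or column[y] != column[start]:
            runs.append((column[start], start, y - start))
            start = y
    return runs

def find_staff_start(column, line_width, space_width, tol=1):
    """Return the midpoints of five staff-like black runs, or None if not found."""
    runs = column_runs(column)
    # A staff is 9 alternating runs (B W B W B W B W B) with a white run on each side.
    for i in range(1, len(runs) - 9):
        window = runs[i:i + 9]
        blacks, whites = window[0::2], window[1::2]
        if any(v != 1 for v, _, _ in blacks):
            continue
        if any(abs(n - line_width) > tol for _, _, n in blacks):
            continue
        if any(abs(n - space_width) > tol for _, _, n in whites):
            continue
        # Margins wider than one space rule out columns that cross a ledger line.
        if runs[i - 1][2] <= space_width or runs[i + 9][2] <= space_width:
            continue
        return [start + n // 2 for _, start, n in blacks]
    return None
```

A column that crosses a ledger line fails the margin test and yields `None`, so in practice several x positions would be tried until one succeeds.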

[0012] Staff shift detection, in outline, works as follows. Starting from the staff scan start position found at that x position (five black-pixel positions), the position is moved left and right one dot at a time. When the number of black pixels among the five points falls to or below a certain count (for example, three or four), the five points are moved up and down, the black-pixel count is checked, and the y coordinate is shifted in the direction that raises the proportion of black pixels. The offset from the start position is the staff shift amount. The staff is detected by scanning left and right from the scan start position until the black-pixel count reaches zero.
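The tracking loop can be sketched as below. This is an illustrative reading of the description, assuming a list-of-rows binary image; the function name `track_staff`, the `min_black` threshold, and the one-pixel trial shift are assumptions rather than details confirmed by the source.

```python
def track_staff(image, start_x, start_ys, step=1, min_black=3):
    """Follow five staff lines from start_x in direction `step` (+1 or -1).

    Returns {x: shift}, where shift is the vertical offset of the staff
    relative to start_ys. Tracking stops where none of the five points is black.
    """
    height, width = len(image), len(image[0])

    def black_count(x, shift):
        return sum(
            1 for y in start_ys
            if 0 <= y + shift < height and image[y + shift][x]
        )

    shifts, shift = {}, 0
    x = start_x
    while 0 <= x < width:
        count = black_count(x, shift)
        if count <= min_black:
            # Try one pixel up and down; move toward the larger black count.
            best = max((shift - 1, shift + 1), key=lambda s: black_count(x, s))
            if black_count(x, best) > count:
                shift, count = best, black_count(x, best)
        if count == 0:
            break
        shifts[x] = shift
        x += step
    return shifts
```

Calling it once with `step=1` and once with `step=-1` covers both directions from the start column.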

[0013] In S6, system recognition is performed. It consists broadly of system recognition proper and bracket recognition. System recognition detects the staves across the whole image, looks for groups of staves whose left ends are at nearly the same position, and checks whether the ends of those staves are joined by black pixels, thereby recognizing the systems. If the rectangles enclosing systems lie side by side, they are also arranged in time order. The presence of systems may alternatively be estimated in advance by taking histograms of black pixels along the x and y axes and finding the blank regions. When staves are joined by a bracket, notes and other symbols may span them, so staves joined by a bracket are best processed as a single unit. Bracket recognition operates on a predetermined range to the left of the system line, using the same technique as the fixed-shape symbol recognition described later. It only needs to recognize brackets and braces.

[0014] In S7, the system recognition result is displayed and the user checks whether it is correct. If it is not, processing moves to S8, where the result is corrected. In a full score, every system may have the same part layout, but parts may also be omitted or added partway through, and the same part may alternate between a single staff and a grand staff from system to system. Parts are matched between systems by comparing brackets and similar cues, but the match cannot always be determined uniquely, so the system recognition result can be corrected beforehand. If staff recognition fails, later processing cannot proceed, and the image must be read again with a different magnification or density. Step S7 may therefore first display the staff recognition result and have the user judge it: if it is wrong, processing returns to S1 to read the image again; if the staves were recognized correctly, the system recognition result is then displayed for checking.

[0015] In S9, the processing rectangle is determined. A reasonably wide rectangle containing the detected staff (for a grand staff, the staves in it) is taken as the recognition rectangle. The rectangle is made at least as large as the maximum region in which musical symbols related to that staff can appear, and large enough that the staff skew correction does not remove any required symbol. All subsequent recognition takes place within this rectangle.

[0016] In S10, staff skew correction is performed. In outline, each pixel column of the rectangular image is shifted up or down according to the staff shift amount obtained earlier. Computing the shift for each staff and correcting within its rectangle is more accurate, but a single shift amount may instead be computed for the whole scanned image and the whole image shifted. Graphic labels (isolated black-pixel regions) that touch the top or bottom edge of the rectangle are then deleted as belonging to the parts above or below. Finally, the blank areas at the top and bottom are detected and the rectangle is shrunk.
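The column-shift correction can be sketched as follows, assuming a list-of-rows binary image and a per-column shift table such as one produced by staff tracking. The name `correct_skew` and the `{x: shift}` layout are hypothetical.

```python
def correct_skew(image, shifts):
    """Shift each pixel column vertically so that the staff becomes horizontal.

    `shifts` maps x to the staff's downward offset at that column; columns
    missing from the table are left unshifted.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        s = shifts.get(x, 0)
        for y in range(height):
            src = y + s  # pull the pixel back from where the staff drifted to
            if 0 <= src < height:
                out[y][x] = image[src][x]
    return out
```

Pixels shifted in from outside the image are filled with white, which is one reason the text asks for a rectangle large enough that no required symbol is lost.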

[0017] In S11 to S15, the various symbols are recognized. In terms of shape and position, score symbols fall broadly into three types:
(1) Fixed shape, with a nearly fixed vertical position (clefs, time signatures, etc.).
(2) Fixed shape, with a free vertical position (accidentals, rests, etc.).
(3) Variable shape and variable position (notes, slurs, ties, etc.).
Each type is recognized by a method suited to it, in this order: clef and time signature recognition, note recognition, fixed-shape symbol recognition, character string recognition, and slur and tie recognition.

[0018] Clefs and time signatures are recognized first for two reasons: performing the cheapest recognition first and deleting the recognized symbols lowers the cost of the later steps, and recognizing the most reliable symbols first reduces misrecognition later. Fixed-shape symbol recognition follows note recognition because note recognition is a method that is little affected by touching labels; once the notes are recognized and deleted, accidentals and other symbols that touched them can be recognized. Slur and tie recognition comes last so that as few labels as possible are subjected to slur recognition, which is computationally expensive. Restricting slur and tie recognition to labels near previously detected notes lowers its cost further and also reduces misrecognition of slurs and ties.

[0019] In S11, clefs and time signatures are recognized as symbols that sit at fixed positions relative to the staff. First, a histogram of black pixels is taken column by column over the rectangular region containing the detected staff, and bands in which the black-pixel count is at or above a threshold are treated as places where a symbol may exist and become matching targets. Matching uses horizontal peripheral features at several positions between the staff lines. A peripheral feature is obtained by scanning inward from the left and right edges of the rectangle that contains only the symbol being matched, along the white pixels at several positions between the staff lines, and recording the distance to a black region, either first order (the first black region reached) or up to several orders (the second and later ones). If matching fails, adjacent bands are merged and recognition is tried again. Recognized symbols are deleted from the image data.
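The peripheral feature described above can be sketched as follows. The sketch assumes the symbol's bounding rectangle is given as a list of rows of 0/1 pixels and that `rows` lists the y positions to sample between staff lines; the function names and the flat feature-vector layout are hypothetical, and the comparison against dictionary entries is not shown.

```python
def peripheral_feature(row, order=1):
    """Distance from the left edge of `row` to the start of the order-th black run.

    Returns len(row) if the row has fewer than `order` black runs.
    """
    seen = 0
    previous = 0
    for x, pixel in enumerate(row):
        if pixel and not previous:  # a new black run begins here
            seen += 1
            if seen == order:
                return x
        previous = pixel
    return len(row)

def peripheral_vector(image, rows, max_order=2):
    """Left and right peripheral distances for the given rows, orders 1..max_order."""
    features = []
    for y in rows:
        row = image[y]
        for order in range(1, max_order + 1):
            features.append(peripheral_feature(row, order))        # from the left
            features.append(peripheral_feature(row[::-1], order))  # from the right
    return features
```

For the row `[0, 0, 1, 1, 0, 1, 0]` the first- and second-order distances from the left are 2 and 5, and from the right 1 and 3.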

[0020] In S12, the note recognition described later is performed. In outline, the image data is first separated into a vertical thin-line image, a horizontal thin-line image, and a thick-line image. Symbols and parts of symbols are then detected in each image and combined to recognize notes and other symbols. In S13, fixed-shape symbol recognition is performed. A known feature, the contour weighted direction index, is first computed; for each symbol entry in the dictionary, the degree of match of the label size and of the contour weighted direction index is calculated; the match scores are normalized and combined, and the symbol with the highest combined score is output. Other features, such as peripheral features, may be used in addition to size and the weighted direction index. To handle labels broken by staff-line erasure, such broken labels are registered in the dictionary; when a label is recognized as one of them, the surrounding labels are joined to it and recognition is repeated. Recognized symbols are deleted from the image.

[0021] In S14, character string recognition is performed. To recognize character strings such as tempo markings, the letters and other symbols found by fixed-shape symbol recognition are used: those whose bounding rectangles line up like a character string are extracted and matched against a character string dictionary. This allows string-like symbols to be recognized even when some of their constituent characters were misrecognized.
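Error-tolerant dictionary matching of this kind can be illustrated with Python's standard `difflib` module. The source does not say which matching method is used, so the similarity measure, the `cutoff` value, and the small `TEMPO_WORDS` dictionary are all assumptions made for the sketch.

```python
import difflib

# Hypothetical character string dictionary; the real one is not given in the source.
TEMPO_WORDS = ["Allegro", "Andante", "Adagio", "Moderato", "Presto"]

def match_text_symbol(recognized, dictionary=TEMPO_WORDS, cutoff=0.6):
    """Return the dictionary entry closest to a noisy recognized string, or None."""
    matches = difflib.get_close_matches(recognized, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this sketch, a string in which one letter was misread, such as `"Allegno"`, still resolves to `"Allegro"`, while a string with no resemblance to any entry is rejected.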

[0022] In S15, slur and tie recognition is performed. Among the remaining labels, those near the detected notes are thinned and approximated by multiple circular arcs. Because a line may have been broken by a previously erased symbol, the resulting arc sequences are then joined to one another. Finally, slurs and ties are recognized from the shape of the arcs, the stroke thickness in the original image, their relationship to the notes, and so on. This is the last recognition routine, so the recognized symbols need not be deleted from the image; however, if the recognized slurs and ties are deleted and fixed-shape symbol recognition is run again, symbols that touched a slur or tie can also be recognized.

[0023] In S16, score image data is, for example, synthesized from the recognition results and displayed, and the user checks whether it is correct. If it is not, processing moves to S17, where the recognition results are corrected manually with the mouse, keyboard, or the like. In S18, performance data is created: from the recognized symbols and note information, the device generates, for example, MIDI file data, a well-known performance data format.

[0024] FIG. 4 is a flowchart showing the details of the note recognition in S12 of FIG. 3. In S20, short black runs in the horizontal direction are detected and the image is separated. The run-length threshold is set relative to the staff line width detected earlier, for example twice the line width. FIG. 2 illustrates an example of the image separation. FIG. 2(a) is the original image, and FIG. 2(b) is an example of the vertical thin-line image separated from it in S20. The bar line 20 and the stem 21 are cut into several labels, for example where they cross staff lines. Thick parts of the image, such as the beamed portion of a stem, have also disappeared.

[0025] In S21, vertical lines are detected from the separated image data. FIG. 5 is a flowchart showing the details of the vertical line detection in S21 of FIG. 4. In S40, the vertical center line of each label (isolated black region) is obtained from the image made up of short horizontal black runs. The center line may be obtained by scanning the image along the line joining the midpoint of a horizontal run near the top of the label and the midpoint of one near the bottom and computing the probability of black pixels along it, or by taking the midpoints of the horizontal black runs at several vertical positions in the label and fitting a line to them by least squares. In S41, center lines are connected to one another. The decision is based on, for example, the slopes of the segments, the distance between their end points (in the x coordinate), and the probability that pixels of the original image between the segments are black. In S42, each connected line is extended outward from its ends until no black pixel remains in the original image. In S43, lines whose x coordinates are close together are merged and recognized as a single line.
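The least-squares variant of S40 and a simplified connection test for S41 can be sketched as follows. The sketch assumes a label is given as a collection of `(x, y)` black-pixel coordinates and a segment as a `(slope, intercept, y_top, y_bottom)` tuple; the thresholds `max_dx` and `max_dslope` are hypothetical, and the black-pixel probability test on the original image is omitted.

```python
def fit_center_line(pixels):
    """Fit x = slope * y + intercept through the row midpoints of one label."""
    rows = {}
    for x, y in pixels:
        rows.setdefault(y, []).append(x)
    ys = sorted(rows)
    mids = [(min(rows[y]) + max(rows[y])) / 2 for y in ys]
    n = len(ys)
    mean_y = sum(ys) / n
    mean_x = sum(mids) / n
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_y == 0:
        return 0.0, mean_x  # single-row label: treat it as exactly vertical
    slope = sum((y - mean_y) * (m - mean_x) for y, m in zip(ys, mids)) / var_y
    return slope, mean_x - slope * mean_y

def should_connect(upper, lower, max_dx=2.0, max_dslope=0.2):
    """Decide whether two segments (slope, intercept, y_top, y_bottom) form one line."""
    s1, b1, _, bottom1 = upper
    s2, b2, top2, _ = lower
    x_end_upper = s1 * bottom1 + b1    # x where the upper segment ends
    x_start_lower = s2 * top2 + b2     # x where the lower segment begins
    return abs(s1 - s2) <= max_dslope and abs(x_end_upper - x_start_lower) <= max_dx
```

Expressing the line as x in terms of y keeps the fit well conditioned for near-vertical stems, where the slope of y against x would be unbounded.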

[0026] Conventional methods detect stems from, for example, a vertical projection histogram of the image. With such methods the histogram peaks become low when the image is strongly skewed, and accurate stem detection becomes difficult. The stem detection method of the present invention, by contrast, detects stems accurately even when they are slanted. In particular, when many noteheads lie close together, stems are recognized accurately along with their slope, so stems and noteheads can be associated more reliably. Moreover, because the method connects center lines detected from the cut thin labels, stems that are slightly curved, as in a handwritten score, can also be recognized.

[0027] Returning to FIG. 4, in S22, short black runs in the vertical direction are detected and the image is separated. The run-length threshold is again set relative to the staff line width, but it may differ from the value used in S20. FIG. 2(c) is an example of the horizontal thin-line image separated from the original image (a) in S22. The ledger line 22 is cut into several labels at the notehead positions. In S23, horizontal lines are detected from the separated image data in the same way as vertical lines.
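The two separation steps (S20 and S22) and the remaining thick image can be sketched together as follows. The sketch assumes a list-of-rows binary image and uses twice the staff line width as the threshold in both directions, which is only the example value from the text; the function names are hypothetical.

```python
def short_run_mask(image, max_len, horizontal=True):
    """Mark black pixels that lie in a black run of at most max_len pixels.

    horizontal=True measures runs along rows (keeps vertical thin lines);
    horizontal=False measures runs along columns (keeps horizontal thin lines).
    """
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    outer, inner = (height, width) if horizontal else (width, height)
    for i in range(outer):
        start = None
        for j in range(inner + 1):
            y, x = (i, j) if horizontal else (j, i)
            black = j < inner and image[y][x]
            if black and start is None:
                start = j
            elif not black and start is not None:
                if j - start <= max_len:  # run is short enough to be a thin line
                    for k in range(start, j):
                        yy, xx = (i, k) if horizontal else (k, i)
                        mask[yy][xx] = 1
                start = None
    return mask

def separate(image, line_width):
    """Split a binary image into vertical-thin, horizontal-thin and thick images."""
    vertical = short_run_mask(image, 2 * line_width, horizontal=True)
    horizontal = short_run_mask(image, 2 * line_width, horizontal=False)
    thick = [
        [1 if p and not v and not h else 0 for p, v, h in zip(prow, vrow, hrow)]
        for prow, vrow, hrow in zip(image, vertical, horizontal)
    ]
    return vertical, horizontal, thick
```

Passing separate thresholds for the two directions would be a small change and would match the remark that the S22 value may differ from the S20 value.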

[0028] In S24, ellipses that are notehead candidates are detected in the image from which the thin lines have been removed. FIG. 2(d) is an example of the thick-line image that remains after the thin lines are separated from the original image (a). From this thick-line image, the noteheads of notes shorter than a quarter note (called filled noteheads in this specification), flags, and beams can be separated. A filled notehead is recognized by sampling boundary coordinates at some interval from the boundary chain of a thick label, computing an ellipse equation by a known method, and evaluating the ellipse's shape and its degree of match with the image. Because the ellipse computation can fail, the ellipse is recomputed with a different sampling interval and the best-matching result is chosen.

【００２９】従来の符頭検出方法としては、符頭領域に
おいて、黒画素の存在する点の座標情報を辞書として持
っておく方法があった。このような方法において、ある
いは他の特徴量を辞書に持つ方式にしても、様々な音符
のフォントあるいは特徴量に対応するために、大きな辞
書を用意する必要があり、また、多数の音符のサンプル
を認識させて辞書を構築する必要があった。これに対し
て、符頭を楕円式として検出する方式においては、楕円
式で玉の形状が表されているので、楕円式の形状、即ち
式の係数が所定の範囲内であるか否かによって符頭の認
識を行うことができ、多量の辞書が不要になるととも
に、楕円式の係数範囲も感覚的に決定でき、多数の音符
のサンプルを認識させる必要が無くなる。また、楕円式
の係数範囲を広げることにより、手書き楽譜の認識にも
対応可能である。更に、音符消去の際にも楕円式で決定
される形状に基づいて正確な消去を行うことができる。One conventional notehead detection method holds, as a dictionary, the coordinates of the points at which black pixels exist within the notehead region. With such a method, or with one that stores other feature quantities in the dictionary, a large dictionary must be prepared to cover the various note fonts or feature quantities, and the dictionary must be built by having the system recognize a large number of note samples. By contrast, in the method that detects a notehead as an ellipse equation, the shape of the notehead is expressed by the equation, so a notehead can be recognized according to whether the shape of the ellipse, that is, the coefficients of the equation, falls within a predetermined range. A large dictionary becomes unnecessary, the coefficient ranges can be set intuitively, and there is no need to recognize a large number of note samples. Widening the coefficient ranges also makes it possible to handle handwritten scores. Furthermore, when notes are erased, they can be erased accurately based on the shape determined by the ellipse equation.

【００３０】和音への対応として、まず横方向に並んだ
和音を認識するために、符尾候補の縦線により太ラベル
を切断する。また縦方向の和音への対応は、太ラベルの
くぼみを検出し、左右のくぼみ同士の組を作ってこれを
結ぶ線で太ラベルを切断する。画像がつぶれている場合
など、くぼみの組が見つからなかった場合には、くぼみ
の位置を推定する。図７は、和音の音符の切断処理例を
示す説明図である。図７（ａ）は和音の音符を含む原画
像であり、（ｂ）は細線を消去した画像である。この段
階では和音の符頭は連結している。（ｃ）は切断処理を
行った結果であり、左の符頭は（ｂ）に点線で示す符尾
の縦線により、また右の符頭は（ｂ）に矢印で示す左右
のくぼみを結ぶ線によってそれぞれ切断されている。To handle chords, thick labels are first cut along the vertical lines of the stem candidates so that chords whose noteheads lie side by side can be recognized. For chords stacked vertically, indentations in the thick label are detected, left and right indentations are paired, and the thick label is cut along the line connecting each pair. When no pair of indentations is found, for example because the image is blotted, the positions of the indentations are estimated. FIG. 7 is an explanatory diagram showing an example of the cutting process for chord notes. FIG. 7(a) is an original image containing chord notes, and FIG. 7(b) is the image with the thin lines erased. At this stage the noteheads of the chord are still connected. FIG. 7(c) shows the result of the cutting process: the left notehead has been cut along the vertical stem line shown by the dotted line in (b), and the right noteheads have been cut along the lines connecting the left and right indentations shown by the arrows in (b).

【００３１】Ｓ２５においては、先に求めた符尾候補の
縦線と結合して音符を検出する。連結には、「符尾の端
の玉は、結合する側が決まっている」などの楽典知識を
利用することができる。Ｓ２６においては、２分音符、
全音符の符頭（以下白抜き符頭）を検出する。白抜き符
頭は、原画像の境界線のチェーンを全てチェックし、白
抜きの孔を検出して、楕円式を計算する。そして、楕円
の形や画像とのマッチング度を取って認識する。音符が
五線上にあるものに対しては、２つのチェーンを結合し
たものから楕円式を計算する。なお、楕円式を計算して
認識する代わりに、単純に符頭部分の太ラベルをテンプ
レートとして辞書に持っていてもよい。この場合、和音
状になった音符を認識するために太ラベルを切断するの
ではなく、接触した符頭の太ラベル自体を辞書に持って
おいてもよい。また、テンプレートマッチングではな
く、ペリフェラル特徴等の特徴抽出によって認識しても
よい。楕円式を使用する場合でも、全ての境界線のチェ
ーンから楕円式を計算するのではなく、予め符頭で計算
された楕円式をテンプレートとして持っておき、これを
五線の間隔で正規化して太ラベルとマッチングさせても
よい。In S25, the detected noteheads are combined with the vertical lines of the stem candidates obtained earlier to detect notes. For this combination, knowledge of music notation can be used, such as the rule that the side on which the notehead at the end of a stem attaches is fixed. In S26, the noteheads of half notes and whole notes (hereinafter, hollow noteheads) are detected. For hollow noteheads, all boundary chains of the original image are checked, the hollow holes are detected, and an ellipse equation is calculated. Recognition is then based on the shape of the ellipse and its degree of matching with the image. For a note lying on a staff line, the ellipse equation is calculated from two chains joined together. Instead of calculating an ellipse equation for recognition, the thick label of the notehead portion may simply be held in a dictionary as a template. In that case, rather than cutting the thick label to recognize notes that form a chord, the thick labels of touching noteheads may themselves be held in the dictionary. Recognition may also be performed not by template matching but by feature extraction, such as peripheral features. Even when the ellipse equation is used, instead of calculating an ellipse equation from every boundary chain, an ellipse equation calculated in advance for a notehead may be held as a template, normalized by the staff line spacing, and matched against the thick labels.

【００３２】Ｓ２７においては、先に求めた符尾候補の
縦線と結合して音符を検出する。ここでの音符は、旗を
考えないものである。Ｓ２８においては、検出された音
符と太線画像から連鉤を検出する。連鉤は、これまでに
求められた旗を考えない音符の符尾の周辺に存在する太
ラベルを検出し、これの形状から連鉤の本数を計算す
る。また、この太ラベルに連結している他の音符も検出
する。連結する他の音符が無い場合には単独の旗あるい
は、半連鉤と考えられる。連鉤の本数により、音符の長
さ情報を決定する。なお、五線との交差の関係で、数本
の連鉤が１つの太ラベルとなっている場合があるので、
これを考慮した上で、連鉤の数を以下のように計算す
る。In S27, the detected noteheads are combined with the vertical lines of the stem candidates obtained earlier to detect notes. The notes at this point do not take flags into account. In S28, beams are detected from the detected notes and the thick line image. To detect beams, thick labels lying around the stems of the notes obtained so far (without flags) are detected, and the number of beams is calculated from their shape. Other notes connected to the same thick label are also detected. If no other note is connected, the label is regarded as a single flag or a partial beam. The note duration is determined from the number of beams. Because several beams may form a single thick label where they cross the staff, the number of beams is calculated as follows, taking this into account.

【００３３】連鉤の可能性のある太ラベルのｙ軸方向の
最大値、最小値を求め、この長さの標準偏差を求める。
標準偏差がある値以下だったら、この太ラベルは、同じ
長さの連鉤が１あるいは数本結合したものと考えられ
る。この後、太ラベルの縦方向の隙間（ギャップ）を計
測することで、連鉤の本数を計測し、音符の長さ情報に
反映させる。標準偏差がある値より大きかったら、この
太ラベルは、長さの違う連鉤が結合していると考えられ
るので、この場合には、このラベルに結合している他の
音符の符尾で領域を分割し、ギャップの計算を行う。符
尾の端からある程度の距離内に存在する太ラベルを検出
した後、該ラベルと関係する他の音符が存在しなかった
場合には、単独旗を持つ音符と考えられる。For a thick label that may be a beam, the maximum and minimum values in the y-axis direction are obtained, and the standard deviation of this length is obtained. If the standard deviation is at or below a certain value, the thick label is considered to consist of one beam, or of several joined beams, of the same length. The number of beams is then counted by measuring the vertical gaps in the thick label, and the count is reflected in the note duration. If the standard deviation exceeds that value, the thick label is considered to consist of joined beams of different lengths, so the region is divided at the stems of the other notes connected to the label and the gaps are then calculated. If a thick label is detected within a certain distance of the end of a stem but no other note is related to that label, the note is considered to have a single flag.
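The beam-counting rule above can be sketched in Python as follows. This is one possible reading, not the patented implementation: it assumes the "length" is the vertical extent of the label in each column, counts beams as the most common number of black runs per column, and uses hypothetical names (`column_extents`, `runs_in_column`, `count_beams`, `note_value`) and a hypothetical threshold `max_std`.

```python
import statistics
from collections import Counter
from fractions import Fraction
from typing import List, Optional

Label = List[List[int]]  # 1 = black pixel of the thick label, 0 = background


def column_extents(label: Label) -> List[int]:
    """Vertical extent (max y - min y + 1) of the label in each non-empty column."""
    extents = []
    for x in range(len(label[0])):
        ys = [y for y, row in enumerate(label) if row[x]]
        if ys:
            extents.append(max(ys) - min(ys) + 1)
    return extents


def runs_in_column(label: Label, x: int) -> int:
    """Number of separate black runs in column x (gaps + 1)."""
    runs, prev = 0, 0
    for row in label:
        if row[x] and not prev:
            runs += 1
        prev = row[x]
    return runs


def count_beams(label: Label, max_std: float) -> Optional[int]:
    """Return the beam count, or None when the extents vary too much and the
    label should first be divided at the stems of the connected notes."""
    extents = column_extents(label)
    if not extents:
        return 0
    if statistics.pstdev(extents) > max_std:
        return None
    counts = [runs_in_column(label, x) for x in range(len(label[0]))
              if any(row[x] for row in label)]
    return Counter(counts).most_common(1)[0][0]


def note_value(beams: int) -> Fraction:
    """Undotted note value as a fraction of a whole note (1 beam = 1/8)."""
    return Fraction(1, 4 * 2 ** beams)


# Two beams of equal length, fused in one column (for example at a staff crossing).
joined = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
# A full-length beam above a half-length beam: extents differ along the label.
uneven = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
]
print(count_beams(joined, max_std=1.0), count_beams(uneven, max_std=1.0))
```

Taking the most common run count rather than the maximum is a design choice of this sketch; it keeps a single fused column from hiding the gap that is visible everywhere else.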

【００３４】Ｓ２９においては、音符の音高を検出す
る。音の高さは、五線の間にある音符の場合には、楕円
の中心の座標から判別でき、そうでない場合には、やは
り楕円の中心の座標から求めてもよいが、加線の間隔が
五線の間隔と異なっている場合もあるので加線を考慮し
たほうがよい。加線は、横線検出により検出されてい
る。ただし、線上にある４分音符の加線は、細線の部分
が少ないので、横線が検出されない場合もある。よっ
て、この場合には、楕円の両端のある範囲の黒画素の存
在確率によって線上か線間かを判別しても良い。また、
大譜表の２つの五線の間にある音符の場合、予想される
位置に加線があるかどうかによって、その音符の高さが
上の五線を基準にしたものか、下の五線を基準にしたも
のかを判別しなければならない。Ｓ３０においては、こ
れまでに検出した音符の画像を画像から消去する。認識
された音符は、楕円や縦横線などの、検出された音符部
品の形にそって画像から削除することで、認識された音
符に接触した他のラベルを分離することができる。Ｓ３
１においては、残りの横線と縦線からくり返しかっこを
認識し、該画像を消去する。Ｓ３２においては、残った
横線からクレッシェンド、デクレッシェンド等を認識
し、消去する。In S29, the pitch of each note is detected. For a note lying within the staff, the pitch can be determined from the coordinates of the center of the ellipse. Otherwise the pitch may likewise be obtained from the center of the ellipse, but because the spacing of ledger lines can differ from the spacing of the staff lines, it is better to take the ledger lines into account. The ledger lines have been detected by the horizontal line detection. However, the ledger line of a quarter note that sits on the line has only a small thin-line portion, so the horizontal line may not be detected. In that case, whether the note is on a line or between lines may be determined from the existence probability of black pixels within a certain range at both ends of the ellipse. In addition, for a note lying between the two staves of a grand staff, it must be determined, from whether a ledger line exists at the expected position, whether the pitch of the note is referenced to the upper staff or to the lower staff. In S30, the images of the notes detected so far are erased from the image. By deleting each recognized note from the image along the shapes of its detected parts, such as ellipses and vertical and horizontal lines, other labels touching the recognized note can be separated. In S31, repeat brackets are recognized from the remaining horizontal and vertical lines, and their images are erased. In S32, crescendo, decrescendo, and similar marks are recognized from the remaining horizontal lines and erased.
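For a note lying within the staff, the pitch determination of S29 amounts to converting the vertical position of the ellipse center into a staff position. A minimal Python sketch follows, assuming evenly spaced staff lines and image coordinates in which y grows downward; the function name `staff_step` and the step numbering are assumptions of this sketch, and clef and key signature are deliberately left out.

```python
from typing import List


def staff_step(center_y: float, staff_lines_y: List[float]) -> int:
    """Return the staff position of a notehead in half-space steps above the
    bottom staff line (0 = bottom line, 1 = first space, ..., 8 = top line).
    Negative values and values above 8 lie outside the staff."""
    lines = sorted(staff_lines_y)
    top, bottom = lines[0], lines[-1]      # smallest y is the top line
    half_space = (bottom - top) / 8.0      # five lines span eight half-space steps
    return round((bottom - center_y) / half_space)


staff = [100.0, 110.0, 120.0, 130.0, 140.0]
print(staff_step(140.0, staff))  # on the bottom line
print(staff_step(125.0, staff))  # in the second space
print(staff_step(150.0, staff))  # one ledger line below the staff
```

Positions outside the staff are extrapolated here from the staff spacing, which is exactly the case the text warns about: when ledger lines are spaced differently, the detected ledger lines should be used instead.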

【００３５】Ｓ３３においては、残りの横線と縦線から
連音符のかっこを認識し、消去する。Ｓ３４において
は、残った縦線から小節線を認識することができる。複
縦線の太い縦線は、太ラベルで縦線検出を行うことによ
って検出できる。五線との関係や、小節線候補同士の位
置関係により小節線を認識する。認識された記号は、認
識された記号の形にそって画像から削除することで、認
識された記号に接触した他のラベルを分離することがで
きる。この後、削除されなかった画像について、一般の
楽譜認識装置と同様に五線消去を行い、ラベルを抽出し
ながら他の記号を認識する。In S33, tuplet brackets are recognized from the remaining horizontal and vertical lines and erased. In S34, bar lines can be recognized from the remaining vertical lines. The thick line of a double bar line can be detected by performing vertical line detection on the thick labels. Bar lines are recognized from their relationship to the staff and from the positional relationship among bar line candidates. By deleting each recognized symbol from the image along its shape, other labels touching the recognized symbol can be separated. Thereafter, for the image that remains undeleted, the staff is erased as in a general musical score recognition apparatus, and the other symbols are recognized while labels are extracted.

【００３６】図６は、細線分離処理の第２の実施例を示
すフローチャートである。第２の実施例は、太い部分と
細い部分とを分離する方式として、画像の縮退と膨張に
より分離する方式を採用したものである。まず、Ｓ５０
においては、原画像を保存する。Ｓ５１においては、黒
画素領域の周辺を１ドットずつ削る処理（縮退処理）を
ｎ回くり返し、細い部分を消滅させる。Ｓ５２において
は、縮退された図形の周辺に１ドットずつ黒画素の肉を
つけていく処理（膨張処理）をｍ回くり返し、太い部分
のみを元の太さに戻す。なお縮退処理回数ｎと膨張処理
回数ｍは等しくなくてもよい。FIG. 6 is a flowchart showing a second embodiment of the thin line separation processing. The second embodiment separates thick portions from thin portions by shrinking (erosion) and expansion (dilation) of the image. First, in S50, the original image is saved. In S51, the process of shaving one dot from the periphery of each black pixel region (shrinking) is repeated n times so that the thin portions disappear. In S52, the process of adding one dot of black pixels around the periphery of the shrunken figure (expansion) is repeated m times so that only the thick portions return to their original thickness. The number of shrinking passes n and the number of expansion passes m need not be equal.

【００３７】Ｓ５３においては、Ｓ５０において保存し
た原画像と処理後の画像の画素（黒を１とする）ごとの
ＡＮＤ（論理積）を取ることにより太線画像を得る。Ｓ
５４においては、原画像から太線画像を引くことにより
細線画像を得る。細線画像については、このまま縦線、
横線を検出してもよいし、更に、縦線と横線を分離して
もよい。In S53, a thick line image is obtained by taking the pixelwise AND (logical product, with black as 1) of the original image saved in S50 and the processed image. In S54, a thin line image is obtained by subtracting the thick line image from the original image. Vertical and horizontal lines may be detected directly from the thin line image, or the image may be further separated into vertical lines and horizontal lines.
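Steps S50 to S54 can be sketched in Python as follows. This is an illustrative sketch only: the function names `shrink`, `expand`, and `separate`, the list-of-lists image representation, the choice of an 8-neighbourhood, and the treatment of pixels outside the image as white are assumptions not stated in the specification.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def _get(img: Image, y: int, x: int) -> int:
    """Pixel value, treating everything outside the image as white."""
    return img[y][x] if 0 <= y < len(img) and 0 <= x < len(img[0]) else 0


def shrink(img: Image) -> Image:
    """Shave one dot: a pixel stays black only if it and all 8 neighbours are black."""
    return [[int(img[y][x] and all(_get(img, y + dy, x + dx) for dy, dx in N8))
             for x in range(len(img[0]))] for y in range(len(img))]


def expand(img: Image) -> Image:
    """Add one dot: a pixel becomes black if it or any of its 8 neighbours is black."""
    return [[int(img[y][x] or any(_get(img, y + dy, x + dx) for dy, dx in N8))
             for x in range(len(img[0]))] for y in range(len(img))]


def separate(original: Image, n: int, m: int) -> Tuple[Image, Image]:
    work = [row[:] for row in original]      # S50: keep the original image
    for _ in range(n):                       # S51: shrink n times
        work = shrink(work)
    for _ in range(m):                       # S52: expand m times
        work = expand(work)
    thick = [[a & b for a, b in zip(r0, r1)] for r0, r1 in zip(original, work)]  # S53
    thin = [[a - b for a, b in zip(r0, r1)] for r0, r1 in zip(original, thick)]  # S54
    return thick, thin


# A 1-pixel-high line running through a 5x5 block.
line = [1] * 10
block = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
empty = [0] * 10
original = [empty, block, block, line, block, block, empty]
thick, thin = separate(original, n=1, m=1)
for row in thin:
    print("".join("#" if p else "." for p in row))
```

In this toy case the thick image is exactly the block and the thin image keeps only the parts of the line outside it. Because `thick` is the AND of the original and the expanded image, `thin` never contains negative values.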

【００３８】この方式は、特に斜め方向において、短ラ
ン消去方式よりも太細の分離性が良い。ただし、処理時
間が多少長くなる。また、縮退、膨張処理の際の連結数
（４連結、８連結）を変化させることにより、太い部分
の抽出のされかたを、目的の部品の形に応じて変化させ
ることができる。例えば、縮退を８連結、膨張を４連結
で行ったり、縮退を８連結、４連結交互に行い、膨張を
８連結で行うなどの組み合わせが考えられ、更に上下あ
るいは左右の２連結により、縮退、膨張を行うことも考
えられる。例えば、２連結の縮退の場合には、原画像に
おいて注目画素および上下（あるいは左右）の３画素が
黒の場合にのみ、該画素を黒とする。この２連結の処理
により、縦あるいは横方向のみの細線消去を行うことも
可能である。This method separates thick and thin portions better than the short run erasing method, particularly in oblique directions, although the processing time is somewhat longer. In addition, by changing the connectivity (4-connectivity or 8-connectivity) used in the shrinking and expansion processes, the way thick portions are extracted can be varied according to the shape of the target part. For example, shrinking may be performed with 8-connectivity and expansion with 4-connectivity, or shrinking may alternate between 8-connectivity and 4-connectivity while expansion uses 8-connectivity. Shrinking and expansion may also be performed with 2-connectivity, using only the upper and lower neighbors or only the left and right neighbors. In 2-connected shrinking, for example, a pixel is set to black only when three pixels in the original image, namely the pixel of interest and its upper and lower (or left and right) neighbors, are all black. This 2-connected processing also makes it possible to erase thin lines in the vertical or horizontal direction only.
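The effect of the connectivity choice can be illustrated with shrinking and expansion routines that take the neighbourhood as a parameter. The Python sketch below is an assumption-laden illustration: the offset tables `N4`, `N8`, `N2_VERTICAL`, and `N2_HORIZONTAL` and the function names are hypothetical, and pixels outside the image are treated as white.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel
Offsets = Sequence[Tuple[int, int]]

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]
N2_VERTICAL = [(-1, 0), (1, 0)]      # upper and lower neighbours only
N2_HORIZONTAL = [(0, -1), (0, 1)]    # left and right neighbours only


def _get(img: Image, y: int, x: int) -> int:
    """Pixel value, treating everything outside the image as white."""
    return img[y][x] if 0 <= y < len(img) and 0 <= x < len(img[0]) else 0


def shrink(img: Image, offsets: Offsets) -> Image:
    """Keep a pixel black only if it and every listed neighbour are black."""
    return [[int(img[y][x] and all(_get(img, y + dy, x + dx) for dy, dx in offsets))
             for x in range(len(img[0]))] for y in range(len(img))]


def expand(img: Image, offsets: Offsets) -> Image:
    """Turn a pixel black if it or any listed neighbour is black."""
    return [[int(img[y][x] or any(_get(img, y + dy, x + dx) for dy, dx in offsets))
             for x in range(len(img[0]))] for y in range(len(img))]


# A cross made of a 1-pixel-wide vertical line and a 1-pixel-high horizontal line.
cross = [[1 if (x == 3 or y == 3) else 0 for x in range(7)] for y in range(7)]

# 2-connected (vertical) shrinking removes only what is thin in the vertical
# direction, so the horizontal line disappears and the vertical line survives.
kept = expand(shrink(cross, N2_VERTICAL), N2_VERTICAL)
for row in kept:
    print("".join("#" if p else "." for p in row))
```

Mixed schemes such as alternating 8-connected and 4-connected shrinking can be expressed by calling `shrink` with `N8` and `N4` in turn before expanding.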

【００３９】[0039]

【発明の効果】以上述べたように、本発明は、例えば楽
譜認識の前処理として、楽譜画像中の細い線を構成する
部分の画像データ、それ以外の画像データを分離してお
くことによって、簡単な処理により、楽譜を人間の認識
感覚に近い記号に分離することができ、以下の認識処理
を簡潔かつ、感覚的に行うことができるようになる。そ
して、分離した画像データから効率良く細い線（細記
号）および黒玉符頭、連鉤および旗（太記号）を検出し
て音符を認識することができ、認識率が向上するという
効果がある。また、小節線、符尾、加線などは、同じ細
い線であっても太さが異なっている場合もあるが、細線
検出のしきい値を縦と横で変えて検出することにより、
これらの相違にも対応可能であるという効果もある。As described above, according to the present invention, the image data of the portions forming thin lines in a score image is separated from the remaining image data, for example as preprocessing for score recognition. This simple processing separates the score into symbols close to the way a human perceives them, so that the subsequent recognition processing can be carried out simply and intuitively. Thin lines (thin symbols) and black ball noteheads, beams, and flags (thick symbols) can then be detected efficiently from the separated image data to recognize notes, which improves the recognition rate. In addition, bar lines, stems, ledger lines, and the like may differ in thickness even though all are thin lines, and these differences can also be accommodated by using different thin line detection thresholds for the vertical and horizontal directions.

[Brief description of drawings]

【図１】本発明の楽譜認識装置の実施例の構成を示すブ
ロック図である。FIG. 1 is a block diagram showing a configuration of an embodiment of a musical score recognition apparatus of the present invention.

【図２】画像分離処理例を示す説明図である。FIG. 2 is an explanatory diagram showing an example of image separation processing.

【図３】ＣＰＵ１のメイン処理を示すフローチャートで
ある。FIG. 3 is a flowchart showing a main process of CPU 1.

【図４】Ｓ１２の音符認識処理の詳細を示すフローチャ
ートである。FIG. 4 is a flowchart showing details of a note recognition process in S12.

【図５】縦線検出処理の詳細を示すフローチャートであ
る。FIG. 5 is a flowchart showing details of vertical line detection processing.

【図６】細線分離処理の第２の実施例を示すフローチャ
ートである。FIG. 6 is a flowchart showing a second embodiment of thin line separation processing.

【図７】和音の音符の切断処理例を示す説明図である。FIG. 7 is an explanatory diagram showing an example of a chord note cutting process.

[Explanation of symbols]

１…ＣＰＵ、２…ＲＯＭ、３…ＲＡＭ、４…ハードディ
スク装置、５…フロッピディスク装置、６…ＣＲＴディ
スプレイ装置、７…ＣＲＴインターフェース回路、８…
キーボード、９…キーボードインターフェース回路、１
０…プリンタ、１１…プリンタインターフェース回路、
１２…スキャナ、１３…スキャナインターフェース回
路、１４…ＭＩＤＩインターフェース回路、１５…バス 1 ... CPU, 2 ... ROM, 3 ... RAM, 4 ... Hard disk device, 5 ... Floppy disk device, 6 ... CRT display device, 7 ... CRT interface circuit, 8 ... Keyboard, 9 ... Keyboard interface circuit, 10 ... Printer, 11 ... Printer interface circuit, 12 ... Scanner, 13 ... Scanner interface circuit, 14 ... MIDI interface circuit, 15 ... Bus

フロントページの続き (72)発明者日野鉄夫静岡県浜松市寺島町200番地株式会社河合楽器製作所内 (72)発明者大場厚始静岡県浜松市寺島町200番地株式会社河合楽器製作所内Ｆターム(参考） 5B064 AA06 AB02 AB13 AB17 CA07 CA12 EA10 5L096 AA07 BA17 BA18 DA02 EA02 EA06 EA16 EA27 FA03 FA04 FA06 FA35 FA64 FA66 FA69 GA07 GA34 GA36 GA51 Continuation of front page (72) Inventor: Tetsuo Hino, c/o Kawai Musical Instrument Manufacturing Co., Ltd., 200 Terajima-cho, Hamamatsu City, Shizuoka Prefecture (72) Inventor: Atsushi Ooba, c/o Kawai Musical Instrument Manufacturing Co., Ltd., 200 Terajima-cho, Hamamatsu City, Shizuoka Prefecture F-terms (reference): 5B064 AA06 AB02 AB13 AB17 CA07 CA12 EA10 5L096 AA07 BA17 BA18 DA02 EA02 EA06 EA16 EA27 FA03 FA04 FA06 FA35 FA64 FA66 FA69 GA07 GA34 GA36 GA51

Claims

[Claims]

1. A musical score recognition apparatus for recognizing various symbols from inputted musical score image data, comprising: separating means for separating, from the musical score image data, image data of portions forming thin lines according to a threshold value corresponding to a predetermined line width; thin symbol detecting means for detecting thin lines as symbols based on the image data of the portions forming thin lines separated by the separating means; thick symbol detecting means for detecting black ball noteheads, beams, and flags based on image data, included in the musical score image data, other than the image data of the portions forming thin lines; and note detecting means for detecting notes based on the detection results of the thin symbol detecting means and the thick symbol detecting means.

2. The musical score recognition apparatus according to claim 1, wherein the separating means separates the image data of the portions forming thin lines by detecting, in each of the vertical and horizontal directions, pixel runs whose length is equal to or shorter than a predetermined threshold value and separating those pixel runs.

3. The musical score recognition apparatus according to claim 1, wherein the separating means separates the image data of the portions forming thin lines by taking the difference between the original image and an image obtained by repeating shrinking a certain number of times and then repeating expansion a certain number of times.

4. The musical score recognition apparatus according to claim 3, wherein the separating means separates the image data of the portions forming thin lines while changing the pixel connectivity used in the shrinking and expansion processes.

5. The musical score recognition apparatus according to any one of claims 1 to 4, wherein the thin symbol detecting means comprises means for detecting a thin line as a symbol by connecting median lines that run along the center, in the line width direction, of the thin lines in the image data of the portions forming thin lines separated by the separating means.

6. The musical score recognition apparatus according to any one of claims 1 to 5, wherein the thick symbol detecting means comprises: ellipse equation calculating means for obtaining an ellipse equation from boundary line coordinate information of an image figure in the image data, included in the musical score image data, other than the image data of the portions forming thin lines; and notehead detecting means for detecting a notehead from the shape of the ellipse equation obtained by the ellipse equation calculating means.