JP5999689B2 - Performance system and program

Info

Publication number: JP5999689B2
Application number: JP2012096835A
Authority: JP
Inventors: 哲晃馬場
Original assignee: Tokyo Metropolitan University
Current assignee: Tokyo Metropolitan University
Priority date: 2012-04-20
Filing date: 2012-04-20
Publication date: 2016-09-28
Anticipated expiration: 2032-04-20
Also published as: JP2013225016A

Description

The present invention relates to a performance system and a program, and more particularly to a performance system and a program that carry out a musical performance using a drawn staff notation.

Japanese Unexamined Patent Application Publication No. 2008-123181 (Patent Document 1) discloses a score recognition apparatus that analyzes an image of a musical score and automatically generates MIDI data. Because the score recognition apparatus of Patent Document 1 includes an interface for exchanging MIDI data with an external MIDI device, connecting the apparatus to an external MIDI device allows MIDI data based on a scanned score to be output directly to that device and played.

However, the score recognition apparatus of Patent Document 1 generates MIDI data by batch processing in units of a score sheet or a paragraph. The generation of MIDI data from the score by the score recognition apparatus and the performance by the external MIDI device based on the generated MIDI data are essentially unrelated to each other.

JP 2008-123181 A (Japanese Unexamined Patent Application Publication No. 2008-123181)

An object of the present invention is to provide a novel performance system and program that carry out a performance using staff notation.

The present inventor re-examined staff notation, which until now has been used merely as a format for describing music, and as a result conceived a new form of performance that uses staff notation, thereby arriving at the present invention.

That is, according to the present invention, there is provided a performance system that carries out a performance using a drawn staff notation, the system including: a performance device equipped with photographing means; an image analysis unit that analyzes an image of the staff notation captured by the photographing means; a pitch determination unit that determines, based on the analysis result of the image analysis unit, the pitch of the sound represented by a note drawn on the staff notation; a sound generation data generation unit that generates sound generation data based on the determined pitch; a sound generation control unit that outputs an audio signal based on the sound generation data; and a performance GUI unit that generates a performance GUI based on the analysis result of the image analysis unit and accepts performance operations via the performance GUI. The performance GUI unit generates a performance GUI in which a sound generation line extending in the Y-axis direction is superimposed on a background image representing the staff notation, and the sound generation data generation unit outputs the sound generation data to the sound generation control unit while the sound generation line is superimposed on the image of the note on the staff notation.

As described above, the present invention provides a novel performance system that carries out a performance using staff notation. Because the performance system of the present invention can play back simple printed scores intended for music beginners, it can offer users an interactive music-learning experience that makes use of the many score-based teaching materials published to date.

FIG. 1 is a functional block diagram of the performance system of this embodiment.
FIG. 2 is a diagram showing an implementation of the performance system of this embodiment.
FIG. 3 is a diagram showing a staff notation drawn in pencil on paper.
FIG. 4 is a diagram showing the performance device of this embodiment.
FIG. 5 is a flowchart showing the real-time performance processing.
FIG. 6 is a flowchart showing the image analysis processing.
FIG. 7 is a conceptual diagram for explaining the image analysis processing.
FIG. 8 is a conceptual diagram for explaining the image analysis processing.
FIG. 9 is a conceptual diagram for explaining the performance GUI screen.
FIG. 10 is a conceptual diagram for explaining the process of determining the volume.
FIG. 11 is a conceptual diagram for explaining the process of applying an effect.
FIG. 12 is a conceptual diagram for explaining the sound setting process.

The present invention is described below with reference to the embodiments shown in the drawings, but the invention is not limited to those embodiments. In the drawings referred to below, common elements are given the same reference numerals, and their description is omitted where appropriate.

FIG. 1 is a functional block diagram of a performance system 1000 according to an embodiment of the present invention. The performance system 1000 of this embodiment includes a performance device 100, an information processing apparatus 200 that executes predetermined information processing based on input from the performance device 100, sound generation means 300 that converts the audio signal output from the information processing apparatus 200 into sound waves, and a display device 400 that displays image data output from the information processing apparatus 200.

The performance device 100 of this embodiment includes photographing means 110 and a mode changeover switch 120. The photographing means 110 can be configured as a digital camera equipped with a CCD image sensor or a CMOS image sensor. The performance device 100 transmits the images captured by the photographing means 110 and the switch signal of the mode changeover switch 120 to the information processing apparatus 200.

The information processing apparatus 200 is a computer that includes a CPU, RAM, ROM, an external storage device such as a hard disk or flash memory, and a suitable communication interface such as a network interface or a wireless interface. Under the control of a suitable operating system (OS) such as WINDOWS (registered trademark), UNIX (registered trademark), or LINUX (registered trademark), it executes a predetermined application program written in an object-oriented programming language such as C, C++, Visual C++, Visual Basic, or Java (registered trademark), and thereby functions as an image analysis unit 211, an OCR unit 212, a pitch determination unit 214, a volume determination unit 216, an effect unit 218, a performance GUI unit 220, a tone color setting unit 224, a sound generation data generation unit 226, a sound generation control unit 228, and a mode switching unit 230.

The image analysis unit 211 and the OCR unit 212 execute predetermined image analysis processing on images input from the performance device 100 via an image input I/F 210. The pitch determination unit 214, the volume determination unit 216, the effect unit 218, and the performance GUI unit 220 each execute the following processing based on the analysis result of the image analysis unit 211.

Specifically, the pitch determination unit 214 determines the pitch of the sound represented by a note written on the staff notation, and the volume determination unit 216 determines the volume of that sound. The effect unit 218 applies a predetermined effect to the sound represented by the note, and the performance GUI unit 220 generates a performance GUI for accepting performance operations and displays it on the display device 400.

The sound generation data generation unit 226 is a functional unit that generates sound generation data, referred to here as MIDI data, based on the pitch determined by the pitch determination unit 214. It outputs the sound generation data, generated in response to the user's operation input via the performance GUI screen, to the sound generation control unit 228.

The sound generation control unit 228 converts the sound generation data input from the sound generation data generation unit 226 into an audio signal, such as WAV, that reflects the volume determined by the volume determination unit 216 and the predetermined effect applied by the effect unit 218, and outputs it to the sound generation means 300, referred to here as a speaker device.

The performance device 100, the information processing apparatus 200, the sound generation means 300, and the display device 400 shown in FIG. 1 may each be physically separate, or may be integrated in any suitable combination. For example, the functions of the information processing apparatus 200, the sound generation means 300, and the display device 400 may be consolidated in a single apparatus (for example, a desktop PC, notebook PC, smartphone, tablet terminal, or PDA), or all the functions of the performance device 100, the information processing apparatus 200, the sound generation means 300, and the display device 400 may be consolidated in a single housing. When consolidated in a single housing, the performance system 1000 can be referred to as a performance apparatus. Physically separate components can be connected by wire or wirelessly, either through a local connection or over a network such as the Internet or a LAN.

Furthermore, the display device 400 of this embodiment may be any means capable of displaying images. Depending on how the performance system 1000 is implemented, suitable means such as a desktop display, a head-mounted display, a projector, or a small liquid crystal panel can be adopted.

The basic configuration of the performance system 1000 of this embodiment has been described above. The performance system 1000 is next described based on a more concrete implementation. In the following description, FIG. 1 is referred to as appropriate.

FIG. 2 illustrates an implementation of the performance system 1000. The performance system 1000 illustrated in FIG. 2 includes a performance device 100, which the user grips and operates like a pen, and an output device 500. Here, the output device 500 is to be understood as an apparatus that consolidates the functions of the information processing apparatus 200, the sound generation means 300, and the display device 400 described above.

To perform, the user first draws a staff notation by hand on any medium such as paper, as shown in FIG. 2(a). FIG. 3(a) shows a staff notation drawn in pencil on paper. As shown in FIG. 3(a), improvised performance using the performance system 1000 presupposes that simplified notes (simple notes consisting of note heads only) are written on the staff lines. This is because, in improvised performance according to this embodiment, the user is assumed to determine the length of each sound (its note value) dynamically, so a note need only express pitch. Using such simple notes reduces the amount of data required to recognize the staff notation, which speeds up recognition and improves real-time performance.

After writing the staff notation, the user starts the performance device 100 and the output device 500 and, as shown in FIG. 2(b), moves the performance device 100 from left to right so as to trace the staff notation while keeping the device in contact with the paper. The user moves the performance device 100 at any desired speed while watching the performance GUI screen 20 displayed on the display screen of the output device 500. When the performance device 100 reaches a note written on the staff notation, the sound corresponding to that note is produced by the sound generation means of the output device 500.

FIG. 4 shows a front view, top view, bottom view, side view, and side sectional view of the performance device 100 shown in FIG. 2. As shown in FIG. 4, the performance device 100 includes a cylindrical portion 132 and a substantially hemispherical reading portion 134 connected to one end of the cylindrical portion 132. Two toggle switches 122 and 123 for switching modes are arranged on the outer circumferential surface of the cylindrical portion 132. Pressing the toggle switches 122 and 123 sets the output device 500 to the real-time performance mode and the recording performance mode, and pressing the toggle switch 124 resets the recorded content.

The substantially hemispherical reading portion 134 is hollow, and two opposing notches 136, 136 are formed in its outer periphery to guide the user as to the reading direction. The two notches 136, 136 are formed so as to face each other along the X-axis direction of the camera coordinate system of the CCD camera 110. To read the staff notation, the user places the reading portion 134 against the paper so that all five drawn lines fit within the width of the notches 136, as shown in FIG. 2(b) and FIG. 3(b), and, maintaining that state, moves the performance device 100 so as to trace roughly along the X-axis direction.

Furthermore, the performance device 100 carries the CCD camera 110, mounted so as to face the interior of the reading portion 134, and an LED 137 arranged to illuminate the interior of the reading portion 134. With the reading portion 134 in contact with the paper, the CCD camera 110 faces the staff notation on the paper. The angle of view of the CCD camera 110 is set so that, in this state, the camera can simultaneously capture the staff lines and at least one note written on them, as shown in FIG. 3(b). An infrared camera and infrared irradiation means may be used in place of the CCD camera 110 and the LED 137. The image output of the CCD camera 110 and the ON/OFF signals of the toggle switches 122, 123, and 124 (the mode changeover switch 120) are transferred to the output device 500 in real time via a wireless communication I/F 140.

The performance device 100 has been described above, but note that the shape and structure shown in FIG. 4 are merely examples and that various design changes are possible.

In this embodiment, in response to the user pressing the toggle switch 122 of the performance device 100, the mode switching unit 230 switches the operation mode of the information processing apparatus 200 to the real-time performance mode. The real-time performance processing executed by the information processing apparatus 200 is described below based on the flowchart shown in FIG. 5.

In the real-time performance mode, the image analysis unit 211 first executes image analysis processing (step 100). The image analysis processing executed by the image analysis unit 211 is described below based on the flowchart shown in FIG. 6.

First, in step 101, a digital image of the handwritten score, transferred in real time from the performance device 100, is acquired. FIG. 7(a) illustrates an acquired digital image of a handwritten score. The image shown in FIG. 7(a) contains five drawn lines extending in the X-axis direction of the image coordinate system and two simple notes (hereinafter simply called notes).

Next, in step 102, binarization is executed on the acquired digital image. Step 102 may also include, as appropriate, image processing used to improve recognition accuracy (noise removal, smoothing, edge enhancement, thinning, background normalization, and so on). FIG. 7(b) shows the binarized image obtained by binarizing the digital image shown in FIG. 7(a).
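
The binarization of step 102 could be sketched as follows. This is only an illustration: the patent does not say how the threshold is chosen, so the fixed threshold of 128 and the function name `binarize` are assumptions, and the image is represented as a plain list of grayscale rows.

```python
def binarize(gray, threshold=128):
    """Return a binary image: 1 for ink (dark pixels), 0 for paper.

    `gray` is a list of rows of 0-255 grayscale values. The fixed
    threshold is an assumption made for this sketch only.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


# Tiny 3x4 grayscale patch: one dark horizontal stroke on light paper.
patch = [
    [250, 248, 251, 249],
    [40, 35, 52, 47],
    [245, 250, 247, 252],
]
binary = binarize(patch)
```

In practice an adaptive threshold would probably cope better with the uneven illumination inside the reading portion, but that choice is outside what the source describes.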

Next, in step 103, the binarized image is divided into regions along the X-axis direction of the image coordinate system. Specifically, the binarized image is divided into regions in which only the staff lines are present (hereinafter, first regions) and the remaining regions (hereinafter, second regions). As a result, a region in which a note is drawn on the staff lines is separated out as a second region. FIG. 7(c) shows the binarized image after region division. In FIG. 7(c), the first and second regions are indicated by the circled numerals 1 and 2, respectively (the same applies in the subsequent figures). In step 103, however, a region in which no staff lines are drawn, such as either end of the staff notation, or a region in which not all five staff lines could be recognized because a line is faint or broken, may exceptionally be separated out as a second region regardless of whether a note is drawn there.
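
One possible way to realize the region division of step 103 is sketched below. The patent does not give the classification rule, so the criterion used here is an assumption: a pixel column counts as "staff only" when it contains exactly five vertical runs of ink, none thicker than `max_line_thickness`. All function names are hypothetical.

```python
def count_runs(column):
    """Count vertical runs of ink (1s) in one pixel column."""
    runs, previous = 0, 0
    for pixel in column:
        if pixel and not previous:
            runs += 1
        previous = pixel
    return runs


def longest_run(column):
    """Length of the longest vertical run of ink in one pixel column."""
    best = current = 0
    for pixel in column:
        current = current + 1 if pixel else 0
        best = max(best, current)
    return best


def segment_regions(binary, max_line_thickness=2):
    """Split the image along X into (label, x_start, x_end) regions.

    Label 1: only the five staff lines are present (first region).
    Label 2: everything else (second region).
    """
    height, width = len(binary), len(binary[0])
    regions = []
    for x in range(width):
        column = [binary[y][x] for y in range(height)]
        staff_only = (count_runs(column) == 5
                      and longest_run(column) <= max_line_thickness)
        label = 1 if staff_only else 2
        if regions and regions[-1][0] == label:
            regions[-1] = (label, regions[-1][1], x)  # extend current region
        else:
            regions.append((label, x, x))             # start a new region
    return regions


def make_demo_image():
    """11x9 binary image: five staff lines and a 3x3 note blob."""
    image = [[0] * 9 for _ in range(11)]
    for y in (1, 3, 5, 7, 9):          # staff lines
        for x in range(9):
            image[y][x] = 1
    for y in (4, 5, 6):                # note head covering columns 3..5
        for x in (3, 4, 5):
            image[y][x] = 1
    return image


regions = segment_regions(make_demo_image())
```

Under this rule a column with no ink at all, or with a broken staff line, also falls into a second region, which matches the exceptional cases noted for step 103.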

Next, based on the result of step 103, step 104 determines whether a first region exists. If a first region exists (step 104, Yes), processing proceeds directly to step 105. If no first region exists (step 104, No), processing proceeds to the tone color setting process (step 300) and then to step 105. The tone color setting process is described later.

In step 105, five estimated lines are defined within the second region by linear interpolation based on the coordinate information of the staff lines in the first regions. Specifically, as shown enlarged in FIG. 8(a), for the drawn lines in the two first regions adjacent to the left and right of the second region of interest, the line segment S connecting the right-end pixel R of a drawn line in the first region on the left with the left-end pixel L of the corresponding drawn line in the first region on the right is estimated to be a staff line within the second region. The five estimated lines are marked, in order from the bottom of the image to the top, as the first through fifth lines.
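
The linear interpolation of step 105 can be sketched as below. The function name and the data layout (one `(x, y)` pair per staff line for the pixels R and L) are assumptions for illustration; the sketch also assumes the usual image convention in which Y grows downward, so the bottom line has the largest y value.

```python
def estimate_staff_lines(right_ends, left_ends, x):
    """Estimate the y position of the five staff lines at column x.

    right_ends[i] is pixel R of line i: the right end of the drawn line
    in the first region on the left of the second region of interest.
    left_ends[i] is pixel L of the same line: its left end in the first
    region on the right. Each segment S from R to L is evaluated at x.
    Returns the five y values ordered from the first (bottom) line to
    the fifth (top) line, assuming image Y grows downward.
    """
    ys = []
    for (xr, yr), (xl, yl) in zip(right_ends, left_ends):
        t = (x - xr) / (xl - xr)        # 0 at R, 1 at L
        ys.append(yr + t * (yl - yr))
    return sorted(ys, reverse=True)


# Staff drifting downward by 2 pixels across a second region from x=10 to x=30.
R = [(10, 20), (10, 30), (10, 40), (10, 50), (10, 60)]
L = [(30, 22), (30, 32), (30, 42), (30, 52), (30, 62)]
lines_at_20 = estimate_staff_lines(R, L, 20)
```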

Next, in step 106, the pixel region of the note drawn in the second region is delimited using a technique such as labeling, and then, as shown enlarged in FIG. 8(b), the coordinates of the note's edge pixels E and of the note's representative pixel D are obtained. FIG. 8(b) shows an example in which a reference line B extending in the Y-axis direction is defined so as to bisect the second region, and the pixel at the midpoint of the line segment connecting the two edge pixels lying on the reference line B is taken as the representative pixel D. Alternatively, the representative pixel D may be determined by any suitable rule, for example by taking the centroid of the note's pixel region as the representative pixel D.
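
Both rules for the representative pixel D mentioned in step 106 can be sketched as follows. The sketch assumes that labeling has already produced the set of `(x, y)` pixels belonging to the note; the function names and that input format are illustrative assumptions.

```python
def representative_pixel(note_pixels, x_start, x_end):
    """Midpoint rule: D is the midpoint of the two edge pixels on line B.

    note_pixels: set of (x, y) pixels of the note (assumed output of labeling).
    x_start, x_end: X extent of the second region; the reference line B
    is the column that bisects it.
    Returns (x, y) of D, or None if the note does not cross B.
    """
    x_b = (x_start + x_end) // 2
    ys_on_b = [y for (x, y) in note_pixels if x == x_b]
    if not ys_on_b:
        return None
    top, bottom = min(ys_on_b), max(ys_on_b)   # the two edge pixels on B
    return (x_b, (top + bottom) // 2)


def centroid(note_pixels):
    """Alternative rule: D is the centroid of the note's pixel region."""
    n = len(note_pixels)
    return (sum(x for x, _ in note_pixels) / n,
            sum(y for _, y in note_pixels) / n)


# 3x3 note blob occupying columns 3..5 and rows 4..6.
note = {(x, y) for x in (3, 4, 5) for y in (4, 5, 6)}
d_midpoint = representative_pixel(note, 3, 5)
d_centroid = centroid(note)
```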

Finally, in step 107, for each second region, the coordinate information of the five estimated lines defined in step 105 and the various coordinate information about the note obtained in step 106 (edge pixels and representative pixel) are held in a temporary storage buffer as score image information, and processing ends. If no note is drawn in a second region, the score image information for that second region is held as blank.

Returning to FIG. 5, the description continues. When the image analysis processing (step 100) ends, processing proceeds to step 201, where the performance GUI unit 220 determines whether the sound generation line displayed on the performance GUI screen lies over a second region. The performance GUI screen generated and provided by the performance GUI unit 220 is described here.

FIG. 9 is a conceptual diagram for explaining the performance GUI screen of this embodiment. In FIG. 9, for ease of understanding, the performance GUI screen 20 is shown superimposed on the handwritten score (the same applies to FIG. 10 and FIG. 11). The performance GUI screen of this embodiment is generated by the performance GUI unit 220 based on the analysis result of the image analysis unit 211. The performance GUI screen 20 shown in FIG. 9 uses, as its background image, the binarized version of the image transferred from the performance device 100, with the first and second regions displayed so as to be distinguishable (for example, by color coding). It also shows, in a fixed position, a straight line P that extends in the Y-axis direction of the image coordinate system and crosses the background image at a suitable position near its center. In this embodiment, this straight line P is defined as the sound generation line P.

As the user traces the handwritten score from left to right with the performance device 100, the background image displayed on the performance GUI screen 20 changes from moment to moment. If, at step 201, the performance GUI screen 20 is in the state shown in FIG. 9(a), the sound generation line P does not lie over a second region (step 201, No), so processing returns to step 100 and the image analysis processing is executed again on the latest image transferred from the performance device 100.

When the user then moves the performance device 100 to the right so that the sound generation line P is superimposed on a second region, as shown in FIG. 9(b) (step 201, Yes), processing proceeds to step 202, which determines whether there is a note in the second region on which the sound generation line P is superimposed. This determination can be made according to whether score image information is held in the temporary storage buffer for that second region. In short, steps 201 and 202 together determine whether the sound generation line P displayed on the performance GUI screen is superimposed on the image of a note.

In the state shown in FIG. 9(b), a note Q exists in the second region on which the sound generation line P is superimposed (step 202, Yes), so processing proceeds to step 203, where the pitch determination unit 214 determines the pitch of the sound represented by the note Q based on the score image information for that second region held in the temporary storage buffer. For example, if the score image information shown in FIG. 8(c) is held for the second region containing the note Q, the pitch is determined by a suitable algorithm based on the positional relationship between the representative pixel D and the first to third lines. Techniques for determining the pitch represented by a note from an image of staff notation are known, and this embodiment is not limited to the procedure described above; the pitch can be determined with any suitable algorithm, including known techniques. The description below assumes that the pitch "sol" (G) was determined in step 203.
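
The source leaves the pitch algorithm open, so the following is only one possible sketch of step 203. It assumes a treble clef with no accidentals (so the first line is E4 and the second line is G4, i.e. "sol"), assumes image Y grows downward, and snaps the representative pixel D to the nearest line or space. The names `staff_step` and `step_to_pitch` are hypothetical.

```python
NATURAL_SEMITONES = [0, 2, 4, 5, 7, 9, 11]          # C D E F G A B
SOLFEGE = ["do", "re", "mi", "fa", "sol", "la", "si"]
FIRST_LINE_DIATONIC = 4 * 7 + 2                      # E4, treble clef assumed


def staff_step(d_y, line_ys):
    """Number of staff steps (line or space) of D above the first line.

    line_ys: y of the estimated first..fifth lines (bottom to top) at
    the x position of D. One step is half the spacing between lines.
    """
    half_spacing = (line_ys[0] - line_ys[4]) / 8.0
    return round((line_ys[0] - d_y) / half_spacing)


def step_to_pitch(step):
    """Map a staff step to (solfege name, MIDI note number)."""
    octave, degree = divmod(FIRST_LINE_DIATONIC + step, 7)
    midi = 12 * (octave + 1) + NATURAL_SEMITONES[degree]
    return SOLFEGE[degree], midi


# D lies about one pixel above the second line of a staff with 10 px spacing.
lines = [61, 51, 41, 31, 21]
pitch = step_to_pitch(staff_step(50, lines))
```

With these assumptions the example note resolves to "sol" with MIDI note number 67; a different clef would only change `FIRST_LINE_DIATONIC`.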

Processing then proceeds to step 204, where the volume determination unit 216 determines the volume of the sound represented by the note Q based on the same score image information. FIG. 10(a) is a conceptual diagram for explaining the volume determination process. The volume is determined from the width of the note Q in the Y-axis direction. FIG. 10(a) shows an example in which the pixel distance between the two edge pixels of the note Q lying on the reference line B, which extends in the Y-axis direction so as to bisect the second region, is taken as the width of the note Q in the Y-axis direction, and the volume is determined from that width. Alternatively, a line extending in the Y-axis direction through the representative pixel D of the note Q may be taken as the reference line B, and the pixel distance between the two edge pixels of the note Q lying on that reference line B may be taken as the width of the note Q in the Y-axis direction.
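
A minimal sketch of the width-to-volume mapping of step 204 follows. The patent states only that the volume is determined from the Y-direction width; the linear mapping to a MIDI velocity in the range 1 to 127 and the calibration values `min_width` and `max_width` are assumptions made for illustration.

```python
def vertical_width(note_pixels, x_b):
    """Pixel distance between the two edge pixels of the note on line B."""
    ys = [y for (x, y) in note_pixels if x == x_b]
    return max(ys) - min(ys) if ys else 0


def width_to_velocity(width_px, min_width=4, max_width=40):
    """Map a width in pixels linearly to a MIDI velocity (1..127).

    min_width and max_width are hypothetical calibration values; widths
    outside that range are clamped.
    """
    clamped = max(min_width, min(max_width, width_px))
    fraction = (clamped - min_width) / (max_width - min_width)
    return 1 + round(fraction * 126)


# A note whose vertical extent on B spans rows 10..32.
note = {(7, y) for y in range(10, 33)}
velocity = width_to_velocity(vertical_width(note, 7))
```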

Processing then proceeds to step 205, where sound generation starts at the pitch and volume determined by the above procedure. Specifically, the sound generation data generation unit 226 generates sound generation data (MIDI data) based on the tone color set by the tone color setting unit 224 and the pitch "sol" determined by the pitch determination unit 214, and outputs it to the sound generation control unit 228. The sound generation control unit 228 converts the sound generation data for the pitch "sol" input from the sound generation data generation unit 226 into an audio signal such as WAV, based on the volume determined by the volume determination unit 216, and outputs it to the sound generation means 300.

After sound generation has started, the performance GUI unit 220 instructs the sound generation data generation unit 226 to continue sounding for as long as the sound generation line P is superimposed on the second region. In response, the sound generation data generation unit 226 continues to output sound generation data to the sound generation control unit 228. During this period, the processing of step 206 or step 207, described below, is executed in response to predetermined movements of the performance device 100, and the character of the sound changes dynamically. This point is explained below.

In step 206, in response to the performance device 100 being rotated about its longitudinal axis while the sound generation line P remains superimposed on the second region, the volume determination unit 216 changes the volume of the sound represented by note Q. The processing in step 206 is essentially the same as that performed by the volume determination unit 216 in step 204: the unit constantly monitors the width of note Q in the Y-axis direction and dynamically resets the volume based on the latest width. Because a note (note head) is generally drawn as an ellipse tilted relative to the staff lines, rotating the performance device 100 clockwise from the state shown on the left of FIG. 10(a) to the state shown on the left of FIG. 10(b) changes the pixel distance from W1 to W2 (W2 > W1), as shown enlarged on the right of FIG. 10(b). In response, the volume determination unit 216 increases the set volume of the sound represented by note Q.

In step 207, in response to the performance device 100 being moved in the vertical direction of the score (that is, the Y-axis direction of the image coordinate system) while the sound generation line P remains superimposed on the second region, the effect unit 218 applies a pitch bend effect to the sound. Specifically, when the user moves the performance device 100 in the vertical direction of the score from the initial state shown in FIG. 11(a) to the state shown in FIG. 11(b), the effect unit 218 detects, as the amount of movement, the absolute value of the difference between the Y coordinate Y1 of the representative pixel D of note Q at the time sounding started and the Y coordinate Y2 of the representative pixel D of note Q after the movement. It then changes the pitch steplessly from the pitch "sol" according to that amount (that is, it increments the frequency from the frequency of the pitch "sol" as the baseline). Although the description here uses a pitch bend effect, the effect applied by the effect unit 218 is not limited to this, and the system may be configured to apply any effect (for example, reverb or chorus) according to the amount of movement in the vertical direction of the score.
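The relationship between the movement amount and the bent pitch in step 207 can be sketched as below. The specification only states that the frequency is incremented from the baseline according to the absolute difference of the two Y coordinates; the exponential (equal-tempered) mapping, the `semitones_per_pixel` scale factor, and the function name are assumptions made for illustration.

```python
def bent_frequency(base_hz, y_start, y_now, semitones_per_pixel=0.05):
    """Return the frequency after a pitch bend.

    The movement amount is |y_now - y_start| in pixels, as in step 207.
    The frequency is raised from base_hz by an assumed number of
    semitones per pixel; the direction of movement does not matter
    because the absolute value is used.
    """
    movement = abs(y_now - y_start)
    semitones = movement * semitones_per_pixel
    return base_hz * 2 ** (semitones / 12)
```

With these assumed settings, no movement leaves the baseline frequency unchanged, and moving up or down by the same number of pixels produces the same upward bend.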

In the following step 208, the performance GUI unit 220 again determines whether the sound generation line P is on the second region. If it still is, the procedure from step 206 onward is repeated. If, on the other hand, the user moves the camera further toward the right of the score and the state shown in FIG. 9(c) is reached, the performance GUI unit 220 determines that the sound generation line P has left the second region (step 208, No) and instructs the sound generation data generation unit 226 to stop sounding. In response, the sound generation data generation unit 226 stops outputting sound generation data to the sound generation control unit 228, and sounding ends (step 209). The process then returns to the image analysis of step 100, and the series of steps described above is repeated.
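The start, continue, and stop behaviour of steps 205 to 209 amounts to a small state machine driven by whether the sound generation line P lies on the second region in each frame. The sketch below assumes that this per-frame test has already been reduced to a boolean; the function name and the event tuples are hypothetical.

```python
def sounding_events(line_on_note_per_frame):
    """Turn a per-frame sequence of booleans (True while the sound
    generation line is on a second region) into start/stop events.

    Returns a list of ("start", frame_index) and ("stop", frame_index)
    tuples. Frames in which the state does not change produce no event,
    which corresponds to sounding simply continuing (or staying silent).
    """
    events = []
    sounding = False
    for index, on_note in enumerate(line_on_note_per_frame):
        if on_note and not sounding:
            events.append(("start", index))
            sounding = True
        elif not on_note and sounding:
            events.append(("stop", index))
            sounding = False
    return events
```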

Next, the timbre setting process of this embodiment is described. The timbre setting unit 224 of this embodiment manages arbitrary characters in association with predetermined timbres. For example, the timbre setting unit 224 can associate the character string "pf" with the "piano" timbre and the character string "dr" with the "drum" timbre.

For example, suppose that, as shown in FIG. 12(a), the user handwrites the character strings "pf" and "dr" in a suitable margin of the handwritten score and then places the reading part 134 of the performance device 100 on the position where "pf" is drawn. Returning to FIG. 6, in this case the acquired image contains no first region (a region in which only staff lines are present) at all (step 104, No), so the process proceeds to step 300 and the timbre setting process is executed. The timbre setting process is described below with reference to the flowchart shown in FIG. 12(b).

First, in step 301, an image of the handwritten characters "pf" photographed by the performance device 100 is acquired. Then, in step 302, the OCR unit 212 performs OCR on the acquired image of the handwritten characters "pf" and recognizes them. Finally, in step 302, the timbre setting unit 224 sets the piano timbre associated with the recognized characters "pf" in the sound generation data generation unit 226.
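The association between recognized characters and timbres can be pictured as a simple lookup table. The sketch below takes the OCR result as an ordinary string and uses the two example pairs from the description; the normalization (trimming and lower-casing), the fallback to the current timbre, and all names are assumptions for illustration.

```python
# Example associations taken from the description: "pf" -> piano, "dr" -> drum.
TIMBRE_BY_LABEL = {
    "pf": "piano",
    "dr": "drum",
}


def select_timbre(recognized_text, current_timbre):
    """Return the timbre linked to the recognized characters.

    If the text matches no registered label, the current timbre is kept
    (an assumed fallback; the specification does not describe this case).
    """
    label = recognized_text.strip().lower()
    return TIMBRE_BY_LABEL.get(label, current_timbre)
```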

The real-time performance mode of this embodiment has been described above. In this embodiment, when the user presses the toggle switch 124 of the performance device 100, the mode switching unit 230 switches the operation mode of the information processing device 200 to the recording performance mode. In the recording performance mode, while the user scans the staff notation, the sound generation data generation unit 226 generates and stores MIDI data based on the pitch and volume determined by the pitch determination unit 214 and the volume determination unit 216 and on a note value determined by how long the sound generation line P stays in the second region, and plays back the performance at the user's request.
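One way to derive a note value from the time the sound generation line P stays in the second region is to convert the dwell time to beats and snap it to the nearest standard note value. The specification does not say how the note value is computed, so the fixed tempo, the candidate set, and the nearest-value quantization below are all assumptions.

```python
# Candidate note values in beats (sixteenth to whole note, assuming a
# quarter note is one beat). This set is an illustrative assumption.
NOTE_VALUES_IN_BEATS = (0.25, 0.5, 1.0, 2.0, 4.0)


def dwell_to_note_value(dwell_seconds, tempo_bpm=120.0):
    """Quantize a dwell time in seconds to the nearest candidate note value."""
    beats = dwell_seconds * tempo_bpm / 60.0
    return min(NOTE_VALUES_IN_BEATS, key=lambda value: abs(value - beats))
```

The resulting value, together with the determined pitch and volume, would be enough to write one note of the stored MIDI data.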

The present invention has been described above by way of an embodiment, but the present invention is not limited to that embodiment.

For example, the embodiment above describes performing from a hand-drawn staff notation, but the performance system of the present invention is of course applicable to any staff notation, including printed staff notation and notation in which handwritten notes are drawn on printed staff lines. The medium on which the staff notation is drawn is also not limited to paper and may be any medium.

In the performance GUI screen of the embodiment above, a sound generation line is defined and a straight line representing it is superimposed on the background image. However, the graphic representing the sound generation line may be anything that lets the user recognize the line intuitively. For example, a graphic such as an arrow may be used in place of the straight line.

In the embodiment above, the timbre setting process is executed automatically based on the content of the acquired image, but a dedicated switch for the timbre setting process may be provided so that the process is executed when the switch is pressed. Likewise, although the embodiment uses the OCR function to set the timbre, the present invention may be configured to use the OCR function to set other predetermined items such as effects, a rhythm box, the pitch range, or the tempo.

The present invention may also be configured so that the timbre setting unit associates a plurality of colors with timbres and, by analyzing a color image acquired by a 3CCD camera mounted on the performance device, sets the timbre of the sound represented by a note according to the color of that note. The volume determination unit may be configured to determine the volume from the area (number of pixels) of the note's pixel region, or the sound may be produced at a constant volume regardless of the size and shape of the note. Furthermore, a suitable vibration actuator may be mounted on the performance device to provide tactile feedback, for example by giving the user a vibration of about 10 ms when a note sounds or when OCR recognition succeeds. Other embodiments that a person skilled in the art could conceive are also included in the scope of the present invention as long as they provide its operation and effects.

Each function of the embodiment above can be realized by a device-executable program written in C, C++, C#, Java (registered trademark), or another language such as an object-oriented programming language. The program of this embodiment can be stored and distributed on a device-readable recording medium such as a hard disk drive, CD-ROM, MO, DVD, flexible disk, EEPROM, or EPROM, and can be transmitted over a network in a format usable by other devices.

Finally, possible applications of the performance system of the present invention are described.

(Use as music education content)
In Japan, although learning to read staff notation is mandatory in compulsory education, the reality is that many people still reach adulthood unable to read a score. The cause is that conventional teaching has focused solely on having students learn the format of staff notation in theory, so the visual information of the staff and the sound it represents (auditory information) never become linked in the student's mind.

If the performance system of the present invention is introduced as music education content, students can immediately hear, on the spot, the score they have written by hand as the ideas come to them. By playing through repeated cycles of notation, sounding, notation, and sounding, they can be expected to become able to read scores naturally.

(Proposal of a new style of performance)
In recent years, a performance style called laptop music, in which the performer uses a laptop computer, has attracted attention. In laptop music, live performance is carried out using only operating equipment such as laptop computers and samplers, but the sight of a performer operating them is, in itself, far from an engaging performance. Audiences at live performances are drawn not only to the sound the performer produces but also, strongly, to the performer's physical performance, and in this respect laptop music must be said to be inferior to performance on classical instruments.

If, on the other hand, the performance system of the present invention is applied to laptop music, the performer can present the audience with a human performance akin to playing a conventional classical instrument while retaining the inherent advantages of laptop music. For example, a performance such as the following could be realized.

The performer improvises a staff notation on stage and traces the finished notation with the performance device of the present invention. This is projected on a large screen, together with the performance GUI screen. The performer can speed up or slow down the tracing to vary the pace of the melody, reverse the tracing direction to produce a reverse-playback-like sound, move the performance device to waver the volume and pitch and so improvise nuances such as tremolo, vibrato, and pitch bend, or add new scores one after another as the audience watches and loop-sequence each score with a different instrument sound, and so on. Applied to laptop music in this way, the performance system of the present invention could captivate the audience both visually and aurally.

Description of symbols
20 ... Performance GUI screen
100 ... Performance device
110 ... Photographing means (CCD camera)
120 ... Mode changeover switch
122, 123, 124 ... Toggle switches
132 ... Cylindrical part
134 ... Reading part
136 ... Notch
137 ... LED
140 ... Wireless communication I/F
200 ... Information processing device
210 ... Image input I/F
211 ... Image analysis unit
212 ... OCR unit
214 ... Pitch determination unit
216 ... Volume determination unit
218 ... Effect unit
220 ... Performance GUI unit
224 ... Timbre setting unit
226 ... Sound generation data generation unit
228 ... Sound generation control unit
230 ... Mode switching unit
300 ... Sound generation means
400 ... Display device
500 ... Output device
1000 ... Performance system

Claims

A performance system for performing music using a drawn staff notation, comprising:
a performance device equipped with a photographing means;
an image analysis unit that analyzes an image of the staff notation photographed by the photographing means;
a pitch determination unit that determines, based on the analysis result of the image analysis unit, the pitch of a sound represented by a note drawn on the staff;
a sound generation data generation unit that generates sound generation data based on the determined pitch;
a sound generation control unit that outputs an audio signal based on the sound generation data; and
a performance GUI unit that generates a performance GUI based on the analysis result of the image analysis unit and receives performance operations via the performance GUI,
wherein the performance GUI unit generates a performance GUI that superimposes, on a background image representing the staff notation, a straight line extending in the Y-axis direction as a sound generation line,
the sound generation data generation unit outputs the sound generation data to the sound generation control unit while the sound generation line is superimposed on the image of the note on the staff, and
the performance system further comprises an effect unit that, in response to the photographing means being moved in the Y-axis direction, applies a predetermined effect to the sound represented by the note in an amount corresponding to the amount of movement in the Y-axis direction.

The performance system according to claim 1, wherein the staff notation is drawn by hand and the notes are written as simplified notes consisting of note heads only.

The performance system according to claim 1 or 2, further comprising a volume determination unit that determines the volume of the sound represented by the note based on the width of the note in the Y-axis direction, wherein the sound generation control unit outputs the audio signal based on the determined volume.

The performance system according to claim 1, wherein the predetermined effect is a pitch bend effect.

The performance system according to any one of claims 1 to 4, further comprising:
an OCR unit that performs character recognition on an image photographed by the photographing means; and
a timbre setting unit that manages arbitrary characters in association with predetermined timbres,
wherein the timbre setting unit sets the predetermined timbre associated with the characters recognized by the OCR unit as the timbre of the sound represented by the note.

A computer-executable program for causing a computer to perform music using a drawn staff notation, the program causing the computer to execute the steps of:
analyzing an image of the staff notation photographed by a performance device equipped with a photographing means;
determining, based on the result of the image analysis of the staff notation, the pitch of a sound represented by a note drawn on the staff;
generating sound generation data based on the determined pitch;
generating, based on the result of the image analysis of the staff notation, a performance GUI that superimposes, on a background image representing the staff notation, a straight line extending in the Y-axis direction as a sound generation line;
outputting an audio signal based on the sound generation data while the sound generation line is superimposed on the image of the note on the staff; and
applying, in response to the photographing means being moved in the Y-axis direction, a predetermined effect to the sound represented by the note in an amount corresponding to the amount of movement in the Y-axis direction.

The program according to claim 6, wherein the step of analyzing the image of the staff notation includes the steps of:
binarizing the image of the staff notation to generate a binarized image; and
dividing the binarized image, along the X-axis direction of the image coordinate system, into a first region in which only staff lines are present and a second region in which a note is drawn on the staff,
and wherein the step of generating the performance GUI includes the step of:
using, as the background image, a display image in which the first region and the second region of the binarized image are shown so as to be distinguishable, and displaying, in a fixed position, a straight line crossing the center of the background image as the sound generation line.

The program according to claim 7, wherein, when the first region does not exist in the region-dividing step, character recognition is performed on the image photographed by the photographing means, and a predetermined timbre associated with the recognized characters is set as the timbre of the sound represented by the note.