JP2011164162A

JP2011164162A - Support device for giving expression to performance

Info

Publication number: JP2011164162A
Application number: JP2010023795A
Authority: JP
Inventors: Mitsuyo Hashida; 光代橋田; Haruhiro Katayose; 晴弘片寄
Original assignee: Kwansei Gakuin Educational Foundation
Current assignee: Kwansei Gakuin Educational Foundation
Priority date: 2010-02-05
Filing date: 2010-02-05
Publication date: 2011-08-25

Abstract

PROBLEM TO BE SOLVED: To provide a performance expression support device with which expression can easily be given to a performance, by making the relationship among tempo, dynamics, and articulation easy to grasp when an expression curve is corrected.

SOLUTION: The performance expression support device 10 includes a display device 18 and supports giving expression to a performance based on phrasing. When expression is given to the performance of a phrase specified by the user, the display device 18 displays the expression curves 50 of three performance expressions (tempo, dynamics, and articulation) overlaid on the same screen 42 on the same time axis. The shape of each expression curve 50 can be changed by operating a mouse or the like. The user gives expression to the performance of the specified phrase by correcting the expression curves 50. As a result, the user can easily grasp the relationship among tempo, dynamics, and articulation, so expression can be given to the performance easily and efficiently.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a performance expression support device, and more particularly to a performance expression support device that supports user-led, phrasing-based expression of a performance by, for example, providing the user with music analysis information.

In recent years, adding expression to a performance has become increasingly important in the design of music content. Adding expression means varying the performance expressions (performance parameters), including volume, tempo, and articulation, of a given sequence of notes so that the music is realized as something lively; it is carried out as a manifestation of human intellect and sensibility. Articulation is the expression of the shape of a sound, which affects the sound itself and the connection between sounds, and refers to symbols such as slurs, staccato marks, and accents and to the expression they denote.

Research on performance expression has been pursued as one branch of music-related artificial intelligence research, and much of it has focused on developing automation techniques based on rule learning or case-based reasoning. Such automation improves the productivity of music content, but the problem remains that it is not always easy for the user to arrive at the performance expression the user actually wants.

Accordingly, in Non-Patent Document 1 and elsewhere, the present inventors have proposed a system in which the user takes the lead in adding expression during performance design while automation takes over (supports) the tedious parts of the work, so that the expression the user wants can be produced efficiently. The technique of Non-Patent Document 1 is a user-led performance expression design system focused on phrasing, and it provides a phrase structure analysis support function and a performance expression editing function. With the phrase structure analysis support function, once the user specifies some of the phrase boundaries (structure), the remaining phrase structure can be analyzed by applying GTTM (A Generative Theory of Tonal Music). With the performance expression editing function, when the user selects any phrase to which expression is to be added, the tempo and volume (dynamics) expression curves for that phrase are each displayed on their own edit screen, and the desired performance expression is formed by correcting each curve individually in response to the user's operations. When expression is added to the phrase selected by the user, it is reflected throughout the piece, which makes the work of adding expression more efficient.

Non-Patent Document 1: "Mixtract: Performance expression design support environment that responds to the user's intention" (17th Osaka University Health Center Health Science Forum "Interdisciplinary Fusion of Music and Wellness", pp. 36-37, November 2009)

People with musical experience understand how to express a phrase as tacit knowledge, but the combination of performance parameters (tempo, dynamics, and so on) that produces a given audible impression in a performance is not necessarily unique. Moreover, the individuality of a performance of a phrase structure emerges from the organic combination of tempo expression, dynamics expression, and so on. In the technique of Non-Patent Document 1, however, the tempo and dynamics expression curves for the specified phrase were displayed on separate edit screens, and each curve was corrected individually on its own screen. This made it difficult to grasp how the performance parameters relate to one another, and difficult to grasp which combination of performance parameters produces the audible impression of the performance.

In addition, the user arrives at the desired expression by repeatedly correcting (editing) the performance expression (expression curve) of the selected phrase and listening to the result. However, when no methodology for generating the performance is prescribed and there is little information to serve as a guide or clue, for example when only score information (piano-roll information) is available for reference, adding expression to a performance is not always easy.

Therefore, a principal object of the present invention is to provide a novel performance expression support device.

Another object of the present invention is to provide a performance expression support device with which a desired expression can easily be given to a performance.

To solve the above problems, the present invention adopts the following configuration. The reference numerals and supplementary explanations in parentheses indicate correspondence with the embodiment described later in order to aid understanding of the invention, and do not limit the invention in any way.

A first invention is a performance expression support device that supports phrasing-based expression of a performance, comprising: first display means that, for each phrase constituting a hierarchical phrase structure specified by the user or obtained by automatic analysis, displays three expression curves of tempo, dynamics, and articulation overlaid on the same screen on the same time axis; and correction accepting means that accepts corrections to the expression curves displayed on the screen.

In the first invention, the performance expression support device (10) includes first display means (12, 18, 42, S13) and supports the user's phrasing-based expression of a performance. For example, after the hierarchical phrase structure of a piece has been analyzed through user specification or automatic analysis, the first display means displays, for a phrase arbitrarily selected from the phrases making up that structure, the expression curves (50) of the three performance expressions of tempo, dynamics, and articulation, overlaid on the same screen on the same time axis. The correction accepting means (12, 16, 42, S51) accepts corrections to the expression curves made by the user with a mouse or the like. By correcting the expression curves, the user adds expression to the performance of that phrase.

According to the first invention, the three expression curves are displayed overlaid on the same screen on the same time axis, which makes the relationship among tempo, dynamics, and articulation easier to grasp. In other words, because the user can edit while judging the relationship among these three performance expressions as a whole, the user can easily add expression to the performance.
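As a rough illustration of what "three curves on one time axis" can mean in data terms, the following Python sketch stores tempo, dynamics, and articulation values for one phrase against a single shared list of time points. The class name `PhraseCurves`, the beat-based time unit, and the convention that 1.0 means "no change" are all assumptions made for this example; the patent does not specify a data layout.

```python
from dataclasses import dataclass, field

CURVE_NAMES = ("tempo", "dynamics", "articulation")


@dataclass
class PhraseCurves:
    """Three expression curves for one phrase, sampled on one shared time axis."""
    times: list[float]  # beat positions inside the phrase (hypothetical unit)
    tempo: list[float] = field(default_factory=list)
    dynamics: list[float] = field(default_factory=list)
    articulation: list[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        for name in CURVE_NAMES:
            values = getattr(self, name)
            if not values:
                # Start from a flat curve: 1.0 means "no change" (assumed scale).
                setattr(self, name, [1.0] * len(self.times))
            elif len(values) != len(self.times):
                raise ValueError(f"{name} must have one value per time point")

    def correct(self, curve: str, index: int, value: float) -> None:
        """Apply one user correction to a single point of one curve."""
        if curve not in CURVE_NAMES:
            raise KeyError(curve)
        getattr(self, curve)[index] = value

    def column(self, index: int) -> dict[str, float]:
        """Return all three parameters at one time point of the shared axis."""
        row = {"time": self.times[index]}
        row.update({name: getattr(self, name)[index] for name in CURVE_NAMES})
        return row


curves = PhraseCurves(times=[0.0, 1.0, 2.0, 3.0])
curves.correct("tempo", 2, 0.9)     # slow down slightly at beat 2
curves.correct("dynamics", 2, 1.2)  # and play louder at the same point
print(curves.column(2))
```

Because every curve is indexed by the same time points, reading one column gives the combination of parameters at that moment, which is the relationship the overlaid display is meant to expose.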

In addition, the way a phrase is expressed, which previously existed only as tacit knowledge, that is, how performance parameters such as tempo were combined to achieve that expression, can be externalized. Users can therefore formalize their own knowledge and convey performance expression techniques to third parties more concretely.

A second invention is dependent on the first invention and further includes calculation means for calculating the apex-likeness of each note in a phrase, and second display means for displaying the apex-likeness of each note in an identifiable manner.

The second invention includes calculation means (12, 14, S7) that calculates the apex-likeness of each note, that is, an index for quantitatively distinguishing the importance of each note in the performance expression, and second display means (12, 18, 40, S9) that displays the calculated apex-likeness of each note in an identifiable manner. The second display means displays the apex-likeness of each note as, for example, the shade of the color in which that note is drawn on the piano roll.
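One simple way to turn apex-likeness values into shades is a linear mapping onto a grey scale, sketched below in Python. The function name `shade_levels`, the 0 to 255 grey range, and the choice of a linear mapping are assumptions for illustration only; the patent states only that apex-likeness is shown as the shade of each note's drawing color.

```python
def shade_levels(energies, lightest=230, darkest=30):
    """Map apex-likeness values to grey levels (0 = black, 255 = white).

    The note with the largest value is drawn darkest; the note with the
    smallest value is drawn lightest. Both limits are assumed defaults.
    """
    if not energies:
        return []
    low, high = min(energies), max(energies)
    if high == low:
        # All notes are equally apex-like, so nothing stands out.
        return [lightest] * len(energies)
    span = high - low
    return [round(lightest - (e - low) / span * (lightest - darkest))
            for e in energies]


print(shade_levels([0, 5, 10]))
```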

According to the second invention, the user can judge quantitatively which notes in a phrase are important and, by referring to this, can add expression to the performance easily and efficiently.

A third invention is dependent on the second invention, wherein the calculation means includes: rule storage means that stores a group of rules, based on a predetermined music theory, for judging the apex-likeness of a note; and adding means that checks each note in the phrase against each rule in the group and, for a note to which a rule applies, adds the evaluation points assigned to that rule as an energy value indicating the apex-likeness of that note.

In the third invention, the calculation means (12, 14, S7) includes rule storage means (14), which stores, for example, a group of rules based on the music interpretation theory proposed by Hoshina. Evaluation points are set for each rule according to its importance and the like. In other words, the conditions for calculating the apex-likeness of a note and the evaluation point values are written as rules. The adding means (12, S47) compares each note in the phrase (specifically, its pitch, note value, harmony, and so on) with the rule group stored in the rule storage means and, for a note to which a rule applies, adds the evaluation points assigned to that rule as an energy value indicating the apex-likeness of that note. The calculated energy value of each note is stored in the memory (14) or the like as appropriate.
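The accumulation described here, checking every note against every rule and summing the points of the rules that apply, can be sketched in Python as follows. The two rules and their point values are placeholders invented for this example; they are not the rules of Hoshina's theory or of the patent's rule tables, and the names `Note`, `Rule`, and `apex_energies` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Note:
    pitch: int       # MIDI note number
    duration: float  # note value in beats


@dataclass
class Rule:
    name: str
    points: float  # evaluation points added when the rule applies
    applies: Callable[[list[Note], int], bool]


def is_highest(notes: list[Note], i: int) -> bool:
    return notes[i].pitch == max(n.pitch for n in notes)


def is_longest(notes: list[Note], i: int) -> bool:
    return notes[i].duration == max(n.duration for n in notes)


# Placeholder rules and point values, assumed for illustration only.
RULES = [
    Rule("highest pitch in phrase", 2.0, is_highest),
    Rule("longest note in phrase", 1.0, is_longest),
]


def apex_energies(notes: list[Note], rules: list[Rule]):
    """Return (energy per note, names of the rules matched by each note)."""
    energies = [0.0] * len(notes)
    matched: list[list[str]] = [[] for _ in notes]
    for i in range(len(notes)):
        for rule in rules:
            if rule.applies(notes, i):
                energies[i] += rule.points
                matched[i].append(rule.name)
    return energies, matched


phrase = [Note(60, 1.0), Note(67, 2.0), Note(64, 1.0)]
energies, matched = apex_energies(phrase, RULES)
print(energies)
print(matched[1])
```

Keeping the list of matched rule names alongside each energy value is one way to show the user the basis of the calculation.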

According to the third invention, the magnitude of the calculated energy value expresses quantitatively the apex-likeness of each note, that is, its importance. Moreover, because the conditions and evaluation point values used to calculate apex-likeness are written as rules, the basis of the calculation can easily be shown to the user. Expression of a performance can therefore be supported from the standpoint of helping users estimate and grasp the apex themselves.

According to the first invention, the relationship among tempo, dynamics, and articulation is easy to grasp when an expression curve is corrected, so expression can easily be added to a performance.

According to the third invention, because the conditions and evaluation point values used to calculate the apex-likeness of a note are written as rules, the basis of the calculation can easily be shown to the user. Expression of a performance can therefore be supported from the standpoint of helping users estimate and grasp the apex themselves.

The above object, other objects, features, and advantages of the present invention will become more apparent from the following detailed description of an embodiment given with reference to the drawings.

FIG. 1 is a block diagram showing the internal configuration of a performance expression support device according to one embodiment of the present invention.
FIG. 2 is an illustrative view showing an example of the phrase structure analysis screen displayed on the display device of the performance expression support device of FIG. 1.
FIG. 3 is an illustrative view showing the classification of the apex estimation rules.
FIG. 4 is a table showing part of the apex estimation rules.
FIG. 5 is a table showing another part of the apex estimation rules.
FIG. 6 is an illustrative view showing an example of the performance expression editing screen displayed on the display device of the performance expression support device of FIG. 1.
FIG. 7 is a flowchart showing an example of the overall performance design processing executed by the CPU of the performance expression support device of FIG. 1.
FIG. 8 is a flowchart showing an example of the phrase structure analysis processing in FIG. 7.
FIG. 9 is a flowchart showing an example of the apex-likeness calculation processing in FIG. 7.
FIG. 10 is a flowchart showing an example of the expression curve editing processing in FIG. 7.

Referring to FIG. 1, a performance expression support device (hereinafter simply the "support device") 10 according to one embodiment of the present invention is a device for supporting user-led expression of a performance built on hierarchical phrase expression. The support device 10 is configured using a computer such as a personal computer and, as described in detail later, provides a phrase structure analysis support function, an apex analysis function, a performance expression editing function, and the like in order to support the user in adding expression to a performance.

The support device 10 includes a CPU 12. The CPU 12, also called a microcomputer or a processor, performs overall control of the support device 10. A memory 14, an input device 16, a display device 18, a sound source 20, and the like are connected to the CPU 12 via a bus or the like.

Although not illustrated, the memory 14 includes a ROM, an HDD, a RAM, and the like. The ROM and HDD store the programs and data the CPU 12 uses to perform the performance expression support processing, together with any other necessary programs and data. The RAM is used as a work area or buffer area for the CPU 12. That is, the CPU 12 loads the programs and data stored in the ROM or HDD into the RAM, controls the operation of each part of the support device 10 according to those programs, and executes various kinds of processing. The CPU 12 also stores in the RAM temporary data, such as performance data generated as appropriate while the expression processing proceeds.

The programs stored in the memory 14 (ROM or HDD) include: a screen display program for displaying predetermined screens on the display device 18; a phrase structure analysis support program for analyzing the remaining phrase structure based on a partial phrase structure entered by the user; an apex estimation program for estimating the apex-likeness of each note; an expression curve generation program for generating the expression curves of a phrase specified by the user; a performance data generation program for generating (updating) performance data in response to, for example, the user's corrections to the expression curves; and a performance sound output program for outputting sound (performance sound) corresponding to the generated performance data.

The data stored in the memory 14 include character data and image data displayed on the display device 18, data on the rules for estimating the apex-likeness of each note, and score data for the desired piece (information on the note sequence and performance markings). The score data can be, for example, in Standard MIDI File (SMF) format or MusicXML format, and are entered as appropriate using a MIDI score editor, a MusicXML import function, or the like and stored in the memory 14.

The programs and data described above may be read from various information recording media such as a CD-ROM, DVD-ROM, or memory card, or may be obtained from a server or the like on a network such as a LAN or the Internet.

The input device 16 is operation means operated by the user and includes a keyboard, a pointing device, and the like. A mouse, touch panel, trackball, trackpad, digitizer, tablet, or the like is used as the pointing device. An operation signal (operation information) corresponding to the user's operation of the input device 16 is given to the CPU 12, and the CPU 12 executes processing based on that signal.

The display device 18 is an LCD, a CRT, or the like and, based on display signals given by the CPU 12, displays various screens such as the piano roll screen and the edit screen described later.

Under the control of the CPU 12, the sound source 20 generates an audio signal based on the performance data generated in the RAM (for example, performance data to which expression has been added) and outputs it to the audio output device 22. The audio output device 22 includes a D/A converter, an amplifier, a speaker, and the like, and outputs sound (performance sound) corresponding to the audio signal. The sound source 20 may be an internal sound source, such as a sound card with a built-in sound source, or an external sound source, such as an electronic piano.

As described above, the support device 10 configured in this way focuses on hierarchical phrase expression (phrasing) and supports user-led expression of a performance. The phrase structure in musical expression is briefly explained here to the extent needed to understand the invention. Some performers and composers refer to a phrase as a group; because the two terms are synonymous, the term "phrase" is used here.

In any music, the sound at a given moment and the sounds before and after it form a unit bound together by some relationship. This unit is called a phrase. Phrases are in turn related to one another to form a hierarchical phrase structure. Phrasing means giving the music room to breathe by dividing its flow at appropriate positions (that is, by setting phrases), and performing with the phrasing in mind gives the performance its inflection. It is therefore no exaggeration to say that the central concern of musical expression is how to convey the hierarchical phrase structure of a piece to the listener.

The phrase structure that can be read from a composer's score is not necessarily unique; it is ambiguous. Performers, conductors, and others settle on one of the possible interpretations of the phrase structure and embody it as performance expression so that the structure is conveyed. The performance expression of a given phrase structure is not unique either, and the individuality of a performance emerges from the combination of tempo, dynamics, and articulation. To produce the performance design the user wants, it therefore matters how the phrase structure of the piece is interpreted (set) and how that structure is expressed. However, having the user decide the entire hierarchical phrase structure of a piece, and every aspect of how that phrase structure is to be expressed, without any clues requires a great deal of effort and is beyond the ability of a beginner.

The support device 10 therefore supports user-led expression of a performance by providing the user with music analysis information (phrase structure analysis information, apex analysis information, and so on) based on specific rules, case references, and the like, on the premise that the expression is always led by the user.

This embodiment is described for a keyboard instrument such as a piano, but the support device 10 is also applicable to other instruments, for example string instruments, wind instruments, percussion instruments, and combinations of these.

The flow of the performance expression support processing provided by the support device 10 is described below by way of example. Put simply, adding expression to a performance with the support device 10 proceeds as follows: (1) determining a group of phrases connected in a single row (the primary phrase line); (2) determining the apex of each phrase in the primary phrase line; (3) designing the performance of the primary phrase line using the apex information; and (4) giving a sense of unity to the structures above the primary phrase line. The support device 10 displays a GUI (Graphical User Interface) screen on the display device 18 as appropriate so that the user can make inputs and selections, and executes the performance expression support processing interactively while providing music analysis information in response to the user's input instructions. Performance data for outputting performance sound from the audio output device 22 are generated as appropriate as the performance design progresses, so the user can listen to the performance at any stage, for example by clicking a preview button (not shown) displayed on the screen of the display device 18. The details are as follows.

A user who adds expression to a performance with the support device 10 first inputs the score information (score data) of the desired piece into the support device 10, or selects the desired piece from pieces whose score information is already stored in the memory 14 or the like of the support device 10. Once the piece has been decided, the user determines its phrase structure using the phrase structure analysis support function of the support device 10. FIG. 2 shows an example of the phrase structure analysis screen displayed on the display device 18.

Specifically, when the phrase structure is analyzed, the support device 10 displays a piano roll screen 30 showing the score information of the piece on the display device 18 as a GUI (see the upper part of FIG. 2). The user selects any part of the piano roll, for example the first few bars or a few bars in the middle of the piece, and forms a group of phrases connected in a single row (the primary phrase line), using roughly one to several bars as a guide for the length of each phrase. In other words, the user partially specifies the phrase boundaries by placing several dividing lines on the piano roll in any part of the piece.
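The effect of placing dividing lines on the piano roll can be pictured as cutting a sequence of note onsets into consecutive groups. The Python sketch below does only that grouping; it does not perform any automatic analysis of the unspecified part. The function name `split_into_phrases`, the use of beat positions for onsets, and the convention that a note starting exactly on a dividing line belongs to the later phrase are assumptions for this example.

```python
from bisect import bisect_right


def split_into_phrases(onsets, boundaries):
    """Group note onsets into consecutive phrases separated by boundaries.

    A note whose onset falls exactly on a boundary is placed in the
    phrase that starts at that boundary (assumed convention).
    """
    cuts = sorted(set(boundaries))
    phrases = [[] for _ in range(len(cuts) + 1)]
    for onset in onsets:
        phrases[bisect_right(cuts, onset)].append(onset)
    return phrases


# Eight notes, one per beat, with a single dividing line at beat 4.
print(split_into_phrases([0, 1, 2, 3, 4, 5, 6, 7], [4]))
```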

When the user partially specifies the phrase boundaries, the support device 10 automatically analyzes the phrase boundaries and hierarchical phrase structure of the parts the user did not specify. The analysis uses exGTTM (see Hamanaka, M., Hirata, K. and Tojo, S.: Implementing "A Generative Theory of Tonal Music", Journal of New Music Research, Vol. 35, No. 4, pp. 249-277 (2006)), which employs the rules of GTTM (A Generative Theory of Tonal Music) for determining boundaries (GPR: Grouping Preference Rules).

When the phrase structure analysis is finished, the hierarchical phrase structure 32 is displayed on the display device 18 as a GUI (see the lower part of FIG. 2). In FIG. 2, the phrase structure 32a shown with hatching (phrases U(0) to U(5)) is the phrase structure specified by the user, and the phrase structure 32b shown with a dot pattern (phrases C(0) to C(27)) is the upper and lower phrase structure automatically analyzed by the support device 10. The vertical lines 34 shown in the hierarchical phrase structure 32 are phrase boundary candidates presented by automatic processing that does not depend on the user's boundary specification. The user can also revise the phrase structure by referring to these candidates. For convenience of drawing, hatching and a dot pattern are used to distinguish the user-specified phrase structure 32a from the automatically analyzed phrase structure 32b; in practice they are distinguished by color, and within the automatically analyzed phrase structure 32b the upper and lower structures are further given different colors so that they can be told apart.

また、支援装置１０では、ユーザによってフレーズ境界が入力または更新されると、自動演奏を再生するための演奏データが順次生成される。楽譜情報のみを手掛かりとしてフレーズ境界を判断することは、ユーザにとって必ずしも容易な作業ではないが、楽曲の試聴機能を利用することで、専門家以外のユーザでもほぼ直感的にこの作業を実施できる。 Further, in the support device 10, when the phrase boundary is input or updated by the user, performance data for reproducing the automatic performance is sequentially generated. Although it is not always easy for a user to determine a phrase boundary using only musical score information as a clue, a user other than an expert can perform this operation almost intuitively by using a music preview function.

上述の作業で意図するフレーズ構造が得られなかった場合には、ユーザは、適宜、フレーズ構造を編集する。たとえば、あるフレーズを選択して、そのフレーズとその直前または直後のフレーズとの境界位置を移動させることによって、修正作業を実施する。ユーザによって修正されたフレーズ構造の上位構造および下位構造は、自動的に再解析が実施され、この作業を繰り返すことにより、最終的にユーザの意図するフレーズ構造に近づけていくことができる。 If the intended phrase structure is not obtained by the above operation, the user edits the phrase structure as appropriate. For example, the user selects a phrase and moves the boundary between it and the immediately preceding or following phrase. The upper and lower structures of the phrase structure corrected by the user are automatically reanalyzed, and by repeating this work the result can be brought progressively closer to the phrase structure the user intends.

なお、ユーザの指定するフレーズ境界は、ＧＴＴＭのプレファランスと反駁することがあるが、この際には、ユーザ指定のフレーズ境界が優先され、ＧＴＴＭの支持するデフォルトプレファランスの重みを下げることによって自動解析処理自体のユーザ適合が図られる。また、支援装置１０には、ＧＰＲのユーザプレファランスを、ユーザが指定したフレーズ境界をサポートするものになるように更新していく機能を備えるようにしてもよい。この場合には、支援装置１０が実行するフレーズ構造の自動解析に関して、ＧＰＲの基本プレファランスと随時更新されるユーザプレファランスとのどちらを優先させるかをユーザが選択できるようにしてもよい。さらに、ユーザのフレーズ指定作業を補助するものとして、旋律外形の計算により、ユーザが指定したフレーズと同型のフレーズを検出する機能を備えるようにしてもよい。 Note that a phrase boundary specified by the user may conflict with the GTTM preferences. In that case, the user-specified phrase boundary takes priority, and the automatic analysis itself is adapted to the user by lowering the weights of the default preferences supported by GTTM. The support device 10 may also have a function of updating the GPR user preferences so that they support the phrase boundaries specified by the user. In this case, the user may be allowed to select whether the GPR basic preferences or the continually updated user preferences take priority in the automatic phrase structure analysis executed by the support device 10. Further, to assist the user in specifying phrases, a function may be provided that detects phrases of the same shape as a user-specified phrase by computing the melodic contour.

上述のような作業によってユーザが望むフレーズ構造が得られると、続いて、支援装置１０のフレーズ頂点解析機能を利用して、プライマリフレーズライン中のフレーズに対しての頂点を決定する。 When the phrase structure desired by the user is obtained by the above-described operation, the vertex for the phrase in the primary phrase line is subsequently determined using the phrase vertex analysis function of the support apparatus 10.

具体的には、フレーズに対しての頂点を決定する際には、先ず、ユーザは、プライマリフレーズラインに含まれるフレーズ群の中から１つのフレーズを選択する。ユーザがフレーズを選択すると、支援装置１０の表示装置１８には、選択したフレーズ部分のピアノロール画面が表示され、支援装置１０では、選択されたフレーズに含まれる各音符の頂点らしさ（尤度）が推定される。 Specifically, to determine the apex of a phrase, the user first selects one phrase from the phrase group included in the primary phrase line. When the user selects a phrase, the display device 18 of the support device 10 displays a piano roll screen of the selected phrase, and the support device 10 estimates the apex likelihood of each note included in the selected phrase.

この実施例では、保科の提唱した音楽解釈理論（保科洋：生きた音楽表現へのアプローチ，エネルギー思考に基づく演奏解釈法，音楽之友社，（１９９９）参照。以下、「保科理論」という。）に基づいて、頂点解析を行うようにしている。 In this embodiment, apex analysis is performed based on the music interpretation theory proposed by Hoshina (see Hiroshi Hoshina: An Approach to Living Musical Expression, A Performance Interpretation Method Based on Energy Thinking, Ongaku no Tomo Sha (1999); hereinafter referred to as “Hoshina theory”).

保科理論では、楽譜の解析から得られるフレーズにおいて、最もエネルギー値が高い音符を頂点または重心と定義している。そして、音符のエネルギー値を音量やテンポ等の演奏表現上での制御に置き換えていくことで、作曲者の意図に沿った演奏が可能になると説明している。頂点の演奏表現法は一意ではないが、頂点を中心とした表情カーブを描くことによって、音楽的な破綻を避けた演奏の表情付け（演奏デザイン）が可能となる。 In Hoshina theory, the note with the highest energy value in a phrase obtained by analyzing the score is defined as the apex, or center of gravity. The theory explains that a performance in line with the composer's intention becomes possible by translating the energy values of notes into control of performance expression such as volume and tempo. Although there is no single way to express an apex in performance, drawing an expression curve centered on the apex makes it possible to give the performance an expression (performance design) that avoids musical breakdown.

そこで、この実施例では、保科理論に即した頂点らしさの推定方法を定式化し、フレーズが構成する音符が持つエネルギー値を推定して表示することによって、ユーザが演奏の表情付けを実行する上での目安あるいは手掛かりとなる情報を提示している。 Therefore, in this embodiment, a method for estimating apex likelihood in accordance with Hoshina theory is formulated, and the energy values of the notes that make up a phrase are estimated and displayed, thereby presenting information that serves as a guide or clue when the user gives expression to the performance.

具体的には、保科理論では、音高の高い音（つまり音形輪郭における頂点音）、音価の長い音、および各音の緊張-弛緩構造における緊張音は、エネルギー値（つまり頂点らしさ）が高いとされる。また、和声によって頂点らしさは変化し、フレーズの最終音は頂点にならないという原則も与えられている。この実施例では、これらの音形輪郭や和音進行の原則をルールの条件節として使用できるように類型化すると共に、臨時記号などに関するルールを追加している。図３−５には、頂点らしさの計算に使用されるルール群の一例を示す。図３に示すように、頂点らしさの計算に用いるルール群は、音価、音高、旋律的緊張-弛緩、音積（和声）、グループおよびその他の６つに分類される。また、ドミナント・モーションなど音楽的に意味を持つ音の並びの表現にも対応するため、各ルールにおいては、該当する音符の他に、先行音および後続音にも評価ポイントを設定している場合がある。 Specifically, in Hoshina theory, notes with a high pitch (that is, peak notes of the melodic contour), notes with a long note value, and tension notes in the tension-relaxation structure of the notes are said to have high energy values (that is, high apex likelihood). The principles that apex likelihood varies with harmony and that the final note of a phrase does not become the apex are also given. In this embodiment, these principles of melodic contour and chord progression are categorized so that they can be used as condition clauses of rules, and rules concerning accidentals and the like are added. FIGS. 3 to 5 show an example of the rule group used to calculate apex likelihood. As shown in FIG. 3, the rules are classified into six categories: note value, pitch, melodic tension-relaxation, harmony, group, and other. In addition, to handle musically meaningful note sequences such as dominant motion, some rules assign evaluation points to the preceding and following notes as well as to the note in question.

図４および５に示すように、音価に属するルールとしては、たとえば「隣接する２音の第１音が後続音より短いとき、後続音に１ポイント加算する。」というルールがあり、音高に属するルールとしては、たとえば「隣接する２音の第１音が後続音より低いとき、後続音に１ポイント加算する。」というルールがあり、旋律的緊張-弛緩に属するルールとしては、たとえば「倚音の音価が後続音より長いとき、その倚音に３ポイント加算する。」というルールがある。また、音積に属するルールとしては、たとえば「Ｉ６の和音には１ポイント加算する。」というルールがあり、グループに属するルールとしては、たとえば「グループ（つまりフレーズ）の開始音には１ポイント加算する。」というルールがあり、その他に属するルールとしては、たとえば「アクセント記号が付された音には１ポイント加算する。」というルールがある。 As shown in FIGS. 4 and 5, the rules in the note value category include, for example, “when the first of two adjacent notes is shorter than the following note, add 1 point to the following note”; the rules in the pitch category include, for example, “when the first of two adjacent notes is lower than the following note, add 1 point to the following note”; and the rules in the melodic tension-relaxation category include, for example, “when the note value of an appoggiatura is longer than that of the following note, add 3 points to the appoggiatura”. The rules in the harmony category include, for example, “add 1 point to a I6 chord”; the rules in the group category include, for example, “add 1 point to the first note of a group (that is, a phrase)”; and the rules in the other category include, for example, “add 1 point to a note with an accent mark”.

各音符の頂点らしさの計算は、各ルールに適合するか否かを判定していき、該当するルールがあったときに、そのルールに割り当てられた頂点らしさの評価ポイントを、その音符のエネルギー値として加算する処理によって実行される。このエネルギー値の合計ポイントは、音符の頂点らしさを表し、合計ポイントの高い音符が、頂点音として表現することが好ましい音符とされる。各音符についての計算結果は、メモリ１４等に適宜記憶される。 The apex likelihood of each note is calculated by checking the note against each rule in turn and, whenever a rule applies, adding the evaluation points assigned to that rule to the energy value of the note. The total of these points represents the apex likelihood of the note, and a note with a high total is regarded as a note that is preferably expressed as the apex. The calculation result for each note is stored in the memory 14 or the like as appropriate.
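The point-accumulation procedure described above can be sketched as follows. This is a minimal illustration, not the rule set of FIGS. 4 and 5: it implements only four of the example rules quoted in the text (note value, pitch, phrase-initial note, accent mark), and the `Note` class, the function names, and the tie-breaking choice in `pick_apex` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int            # MIDI note number (assumed representation)
    duration: float       # note value in beats
    accent: bool = False  # True if the note carries an accent mark


def apex_scores(notes):
    """Accumulate rule points (energy values) for every note of one phrase."""
    scores = [0] * len(notes)
    for i in range(len(notes) - 1):
        first, following = notes[i], notes[i + 1]
        if first.duration < following.duration:  # note-value rule: +1 to following note
            scores[i + 1] += 1
        if first.pitch < following.pitch:        # pitch rule: +1 to following note
            scores[i + 1] += 1
    for i, note in enumerate(notes):
        if i == 0:                               # group rule: first note of the phrase
            scores[i] += 1
        if note.accent:                          # "other" rule: accent mark
            scores[i] += 1
    return scores


def pick_apex(scores):
    """Index of the highest-scoring note, never the final note of the phrase.

    Assumption: ties are resolved in favour of the earliest note.
    """
    candidates = scores[:-1] if len(scores) > 1 else scores
    return max(range(len(candidates)), key=candidates.__getitem__)


phrase = [Note(60, 1.0), Note(62, 1.0), Note(67, 2.0, accent=True), Note(64, 1.0)]
scores = apex_scores(phrase)
print(scores, pick_apex(scores))
```

Excluding the last note in `pick_apex` follows the stated principle that the final note of a phrase does not become the apex; a fuller implementation would also need the harmony and tension-relaxation rules, which require chord and non-chord-tone information not modelled here.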

なお、ここで適用したルール群は単なる例示であり、他のルールを適用して音符の頂点らしさを計算してもよい。たとえば、図４および５に示すルール群の一部のみを用いることもできるし、図４および５に示すルール群に他のルールを付加することもできる。また、保科理論とは別の理論に基づくルール群を適用することもできる。さらに、特定のルールに該当するか否かで頂点らしさを推定することにも限定されず、たとえば事例参照によって頂点らしさを推定することもできる。 Note that the rule group applied here is merely an example, and other rules may be applied to calculate the apex likelihood of notes. For example, only part of the rule group shown in FIGS. 4 and 5 may be used, or other rules may be added to the rule group shown in FIGS. 4 and 5. A rule group based on a theory other than Hoshina theory may also be applied. Furthermore, estimation is not limited to checking whether specific rules apply; apex likelihood may also be estimated, for example, by referring to past cases.

上述のような方法でフレーズ内の各音符の頂点らしさが推定されると、支援装置１０の表示装置１８には、ユーザが各音符の頂点らしさを識別できるように、ピアノロール上の各音符の描画色の濃淡情報などとして表示される。ユーザは、この各音符の頂点らしさの情報を適宜参照して、フレーズの頂点音を指定する。ただし、フレーズの頂点音は、支援装置１０によって自動的に決定されてもよく、たとえば、頂点らしさのポイントが１番高い音符を頂点音として自動的に決定するようにしてもよい。 When the apex likelihood of each note in the phrase has been estimated as described above, it is displayed on the display device 18 of the support device 10, for example as the shade of each note's drawing color on the piano roll, so that the user can distinguish the apex likelihood of each note. The user designates the apex note of the phrase, referring to this information as appropriate. However, the apex note may instead be determined automatically by the support device 10; for example, the note with the highest apex likelihood points may automatically be taken as the apex note.

フレーズの頂点音が決定すると、続いて、支援装置１０の演奏表情編集機能を利用して、選択したフレーズに対する演奏デザイン（演奏の表情付け）を行う。 When the apex sound of the phrase is determined, subsequently, using the performance expression editing function of the support device 10, performance design (performance expression of the performance) for the selected phrase is performed.

具体的には、フレーズの頂点音が決定すると、支援装置１０では、テンポ、ダイナミクスおよびアーティキュレーションの３つの表情カーブのデフォルト値が設定され、ユーザからの指示に応じて、表情カーブがＧＵＩとして表示装置１８に表示される。表情カーブのデフォルト値としては、決定した頂点音に対するシンプルな山型の表情カーブ、算出した各音符の頂点らしさに応じた表情カーブ、または頂点音に依存しない直線や自由曲線の表情カーブなどが設定される。これらは、ユーザによって選択可能にされてもよいし、自動的に決定されてもよい。ユーザは、表示画面１８に表示された３つの表情カーブの形状を修正（編集）することによって演奏の表情付けを実施する。たとえば、マウスによるドラッグ操作によって表情カーブの形状を修正するとよい。 Specifically, when the apex note of the phrase is determined, the support device 10 sets default values for the three expression curves of tempo, dynamics, and articulation and, in response to an instruction from the user, displays the expression curves on the display device 18 as a GUI. The default may be a simple hill-shaped expression curve peaking at the determined apex note, an expression curve that follows the calculated apex likelihood of each note, or a straight-line or free-form expression curve that does not depend on the apex note. The choice may be left to the user or made automatically. The user gives expression to the performance by correcting (editing) the shapes of the three expression curves displayed on the display screen 18, for example by dragging with the mouse.
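The "simple hill-shaped" default mentioned above could, for example, be generated as a piecewise-linear curve that rises to the apex note and falls back by the last note of the phrase. The text does not specify the curve shape beyond "hill-shaped", so the linear segments, the function name `hill_curve`, and the `peak` parameter are assumptions made for illustration.

```python
def hill_curve(onsets, apex_index, peak=1.0):
    """Piecewise-linear default curve for one phrase.

    onsets: score-time onsets of the notes in the phrase, in ascending order.
    Returns one curve value per note: 0 at the first note, `peak` at the
    apex note, and 0 again at the last note (assumed shape).
    """
    start, top, end = onsets[0], onsets[apex_index], onsets[-1]
    values = []
    for t in onsets:
        if t <= top:
            span = top - start
            # If the apex is the very first note there is no rising segment.
            values.append(peak if span == 0 else peak * (t - start) / span)
        else:
            values.append(peak * (end - t) / (end - top))
    return values


# Five quarter notes with the apex on the third note.
print(hill_curve([0, 1, 2, 3, 4], apex_index=2, peak=0.5))
```

The same generator could seed the tempo and dynamics curves with different `peak` values; the user would then reshape the result by dragging, as described in the text.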

図６には、表示装置１８に表示される演奏表情編集画面の一例を示す。演奏表情編集画面は、画面上部のピアノロール画面４０、および画面下部のエディット画面４２を含む。なお、ピアノロール画面４０およびエディット画面４２においては、横軸が時間軸を表している。エディット画面４２の左部には、３つ表示ボタン、つまりＤ（ダイナミクス）ボタン４４、Ｔ（テンポ）ボタン４６、およびＡ（アーティキュレーション）ボタン４８が設けられており、これら表示ボタン４４，４６，４８をユーザがオンにすると、対応する演奏表現の表情カーブ５０がエディット画面４２上に点線表示される。たとえば、Ｄボタン４４のみをオンにすると、ダイナミクスカーブのみが表示され、３つの表示ボタン４４，４６，４８の全てをオンにすると、ダイナミクスカーブ、テンポカーブおよびアーティキュレーションカーブが多重表示される。つまり、エディット画面４２には、テンポ、ダイナミクスおよびアーティキュレーションの３つの表情カーブ５０を同じ時間軸上に多重表示することが可能である。 FIG. 6 shows an example of a performance expression editing screen displayed on the display device 18. The performance expression editing screen includes a piano roll screen 40 at the top of the screen and an edit screen 42 at the bottom of the screen. In the piano roll screen 40 and the edit screen 42, the horizontal axis represents the time axis. On the left side of the edit screen 42, three display buttons, that is, a D (dynamics) button 44, a T (tempo) button 46, and an A (articulation) button 48 are provided. , 48 are turned on, the expression curve 50 of the corresponding performance expression is displayed as a dotted line on the edit screen 42. For example, when only the D button 44 is turned on, only the dynamics curve is displayed, and when all the three display buttons 44, 46, and 48 are turned on, the dynamics curve, the tempo curve, and the articulation curve are displayed in a multiplexed manner. That is, on the edit screen 42, three facial expression curves 50 of tempo, dynamics, and articulation can be displayed in multiple on the same time axis.

また、表示ボタン４４，４６，４８の横に設けられるエディットラジオボタン５２で選択された表情カーブ５０は、実線で表示され、ユーザは、入力装置１６のマウスなどを操作することにより、エディット画面４２上でその形状を直接修正（編集）できる。この編集操作を各表情カーブ５０について行うことによって、ユーザは、所望の演奏表情を形成する。 The expression curve 50 selected with the edit radio button 52 provided beside the display buttons 44, 46, 48 is displayed as a solid line, and the user can directly correct (edit) its shape on the edit screen 42 by operating the mouse or the like of the input device 16. By performing this editing operation on each expression curve 50, the user forms the desired performance expression.

このとき、エディット画面４２の下部に設けられるShow apecesのチェックボックス５４を指定すると、ピアノロール画面４０上に各音符の頂点らしさの情報が表示される。図６では、図面の都合上、密度が異なるように音５６の塗りつぶしのパターンを変化させ、そのパターンによって表現される密度の違いによって、各音５６の頂点らしさの違いを表現している。ただし、実際には、任意の色を配色し、その濃淡情報によって頂点らしさを識別可能に表示している。この実施例では、頂点らしさが高ければ色が濃く、頂点らしさが低ければ色が薄くされる。ただし、その他の方法、たとえば頂点らしさの高い順に番号を付けるなどして、各音符の頂点らしさの違いを表現してもよい。 At this time, if the Show apeces check box 54 provided at the bottom of the edit screen 42 is designated, information on the likelihood of the vertices of each note is displayed on the piano roll screen 40. In FIG. 6, for the convenience of the drawing, the pattern of filling the sound 56 is changed so that the density is different, and the difference in the vertices of each sound 56 is expressed by the difference in density expressed by the pattern. However, in practice, an arbitrary color is arranged and the apex-likeness is displayed in an identifiable manner based on the shading information. In this embodiment, the color is darker when the apex likelihood is high, and the color is lightened when the apex likelihood is low. However, the difference in the likelihood of the vertices of each note may be expressed by other methods, for example, by assigning numbers in descending order of the likelihood of vertices.
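One way to realise the "darker for higher apex likelihood" display is to map each note's score linearly onto a gray level. The sketch below is only an illustration of that idea: the 0-255 gray scale, the end-point values, and the function name `shade_levels` are assumptions, and the actual color scheme of the device is not specified in the text.

```python
def shade_levels(scores, lightest=230, darkest=40):
    """Map apex-likelihood scores to gray levels (lower value = darker).

    The highest score gets `darkest`, the lowest gets `lightest`;
    if all scores are equal every note gets `lightest`.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [lightest] * len(scores)
    scale = (lightest - darkest) / (hi - lo)
    return [round(lightest - (s - lo) * scale) for s in scores]


print(shade_levels([1, 1, 3, 0]))
```

Ranking the notes and labelling them with numbers, the alternative the text mentions, would replace the linear mapping with a sort on the same scores.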

また、エディット画面４２の下部に設けられるShow articulationのチェックボックス５８を指定すると、たとえば、エディット画面４２に表示されるデフォルト表情カーブが、各音符の頂点らしさに応じた表情カーブとなり、チェックボックス５８の指定を外すと、頂点音に対するシンプルな山型の表情カーブとなる。 If the Show articulation check box 58 provided at the bottom of the edit screen 42 is designated, for example, the default expression curve displayed on the edit screen 42 becomes an expression curve corresponding to the vertices of each note. If it is not specified, it becomes a simple mountain-shaped expression curve for the vertex sound.

そして、Reset curvesボタン６０が押されると、選択したフレーズに対する表情カーブが更新され、この表情付けは、残りのフレーズに対しても反映される。また、設定装置１０では、プライマリフレーズラインの上位構造に対しても、頂点情報に基づいて表情カーブのデフォルト値が自動設定される。この上位構造については、その構造がまとまりをもって聞かれる程度に控えめの表情カーブが付与される。なお、ユーザは、上記と同様の操作を行うことによって、この上位フレーズの表情カーブを編集することも可能である。 Then, when the Reset curves button 60 is pressed, the expression curve for the selected phrase is updated, and this expression is reflected to the remaining phrases. In addition, the setting device 10 automatically sets the default value of the expression curve based on the vertex information for the upper structure of the primary phrase line. For this upper structure, a modest expression curve is given to the extent that the structure is heard together. Note that the user can edit the expression curve of the upper phrase by performing the same operation as described above.

階層的フレーズ構造の各フレーズの表情カーブが設定されると、これら表情カーブは合成され、楽曲全体としての表情カーブが生成される。楽曲全体（階層的フレーズ構造全体）での各表情カーブの算出方法を以下に示す。 When the expression curves of each phrase having a hierarchical phrase structure are set, these expression curves are combined to generate an expression curve for the entire music. The calculation method of each expression curve in the whole music (the whole hierarchical phrase structure) is shown below.

テンポカーブ自体は、指数表現、すなわち変化なしの場合は０、テンポが倍になる場合は１、テンポが半分になる場合は−１として与えられる。楽曲全体でのテンポカーブＴｅｍｐｏ（ｔ）は、式（１）で与えられる。 The tempo curve itself is given as an exponential expression, that is, 0 when there is no change, 1 when the tempo is doubled, and -1 when the tempo is halved. The tempo curve Tempo (t) for the entire music is given by equation (1).

式（１）

Formula (1)

ここで、ｔは楽譜時間、ＧｒｏｕｐＴｅｍｐｏ_ｋ（ｔ）は楽譜上での時刻ｔをその範囲に持つフレーズのテンポカーブ、ｗ_ｋはそのフレーズの重みを示す。また、各音の発音時刻および消音時刻は、Ｔｅｍｐｏ（ｔ）の積分値によって定められる。つまり、楽譜上の時刻としてｔ_０，ｔ_１に配置されたイベント間の経過時間Ｔｉｍｅ（ｔ_０，ｔ_１）は、テンポカーブが指数で表記されていること、およびテンポが時間進行の逆数になることから、式（２）によって表される。 Here, t is score time, GroupTempo_k(t) is the tempo curve of a phrase whose range contains score time t, and w_k is the weight of that phrase. The onset and offset times of each note are determined by integrating Tempo(t). That is, because the tempo curve is expressed as an exponent and tempo is the reciprocal of time progression, the elapsed time Time(t_0, t_1) between events placed at score times t_0 and t_1 is given by equation (2).

式（２）

Formula (2)

ここで、ＤｅｌｔａＤｕｒａｔｉｏｎは、曲全体のｂｐｍ（Beats Per Minute）と式（２）での積分区間の分解能によって規定される値である。このような計算を実施することによって、指揮システム等の実時間のスケジューラを構成する際に、拍以下の単位で、だんだん速くする或いはだんだん遅くする等の処理をシンプルに実装できる。 Here, DeltaDuration is a value determined by the bpm (beats per minute) of the whole piece and the resolution of the integration interval in equation (2). With this calculation, gradual acceleration or deceleration in units smaller than a beat can be implemented simply when building a real-time scheduler such as a conducting system.
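The images of equations (1) and (2) are not reproduced in this text, so the following sketch is a reading of the surrounding definitions rather than the patent's own formulas. It assumes that Tempo(t) is the weighted sum of GroupTempo_k(t) over the phrases containing t, and that elapsed time is obtained by summing 2^(-Tempo(t)) × DeltaDuration over small score-time steps. The function names, the tuple layout of `phrases`, and the `steps_per_beat` parameter are hypothetical.

```python
def combined_curve(t, phrases):
    """Weighted sum of the phrase curves whose range contains score time t.

    phrases: list of (start, end, weight, curve_fn) tuples, with times in beats.
    """
    return sum(w * fn(t) for start, end, w, fn in phrases if start <= t < end)


def elapsed_seconds(t0, t1, phrases, bpm=120.0, steps_per_beat=480):
    """Approximate real time between score times t0 and t1 (in beats).

    Assumed form: each step lasts DeltaDuration * 2**(-Tempo(t)), so a curve
    value of 1 halves the duration (tempo doubled) and -1 doubles it.
    """
    delta_duration = 60.0 / bpm / steps_per_beat  # seconds per step at base tempo
    n_steps = round((t1 - t0) * steps_per_beat)
    total = 0.0
    for k in range(n_steps):
        t = t0 + k / steps_per_beat
        total += 2.0 ** (-combined_curve(t, phrases)) * delta_duration
    return total


# One phrase over beats 0-8 with a flat curve of 1 (tempo doubled throughout).
doubled = [(0.0, 8.0, 1.0, lambda t: 1.0)]
print(elapsed_seconds(0.0, 4.0, doubled))  # about 1 second at 120 bpm
```

Because the sum is taken in sub-beat steps, a curve that changes within a beat produces a gradual accelerando or ritardando without any special handling, which is the property the text attributes to this formulation.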

上述のテンポカーブと同様に、楽曲全体でのダイナミクスカーブＤｙｎ（ｔ）は、式（３）によって与えられる。各音の音量の計算についても、上述の発音・消音と同じく、楽譜上での時刻ｔをその範囲に持つフレーズのダイナミクスカーブの足し合わせによって計算される。 As with the tempo curve described above, the dynamics curve Dyn(t) for the whole piece is given by equation (3). Like the note onset and offset times described above, the volume of each note is calculated by summing the dynamics curves of the phrases whose range contains score time t.

式（３）

Formula (3)

ここで、ｔは楽譜時間、ＧｒｏｕｐＤｙｎ_ｋ（ｔ）は楽譜上での時刻ｔをその範囲に持つフレーズのダイナミクスカーブ、ｗ_ｋはそのフレーズの重みを示す。ダイナミックスカーブは、ＭＩＤＩにおけるｖｅｌｏｃｉｔｙ（音の強さ）を基準として設定されることから、楽譜上での時刻ｔの音符のｖｅｌｏｃｉｔｙは、式（４）によって計算される。 Here, t is score time, GroupDyn_k(t) is the dynamics curve of a phrase whose range contains score time t, and w_k is the weight of that phrase. Because the dynamics curve is defined relative to MIDI velocity (note intensity), the velocity of the note at score time t is calculated by equation (4).

式（４）

Formula (4)

ここで、ＳｔｄＶｅｌは、デフォルトのｖｅｌｏｃｉｔｙ設定値である。 Here, StdVel is a default velocity setting value.
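The images of equations (3) and (4) are likewise missing here. The sketch below assumes that Dyn(t) is the weighted sum of GroupDyn_k(t), as the text states, and further assumes, by analogy with the exponent-based tempo curve, that velocity is StdVel × 2^Dyn(t). That second assumption is not confirmed by the text; an additive form would be equally consistent with it. The clamp to 1-127 reflects the valid range of a MIDI note-on velocity, and all names are hypothetical.

```python
def dyn(t, phrases):
    """Weighted sum of the dynamics curves of the phrases containing score time t.

    phrases: list of (start, end, weight, curve_fn) tuples, with times in beats.
    """
    return sum(w * fn(t) for start, end, w, fn in phrases if start <= t < end)


def velocity(t, phrases, std_vel=64):
    """MIDI velocity at score time t (assumed form: std_vel * 2**Dyn(t))."""
    raw = std_vel * 2.0 ** dyn(t, phrases)
    return max(1, min(127, round(raw)))  # keep within the MIDI note-on range


phrases = [
    (0.0, 8.0, 1.0, lambda t: 0.0),  # upper-level phrase, flat
    (0.0, 4.0, 0.5, lambda t: 1.0),  # lower-level phrase, weighted 0.5
]
print(velocity(2.0, phrases), velocity(6.0, phrases))
```

With these inputs the note inside the lower-level phrase is louder than the note outside it, which shows how curves from different levels of the hierarchy add up at a given score time.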

また、この実施例では、打鍵楽器を対象としているので、アーティキュレーションの表現における主要な制御対象は、各音符のノートオフのタイミングとなる。ｉ番目の音符の楽譜上の終了時間Ｎ_{Ｏｆｆｓｅｔ}（ｉ）は、式（５）によって設定される。 In this embodiment the target is a keyboard instrument, so the main control target for articulation is the note-off timing of each note. The end time N_Offset(i) of the i-th note on the score is set by equation (5).

式（５）

Formula (5)

ここで、Ｎ_{Ｏｎｓｅｔ}（ｉ）およびＮ_ＴＶ（ｉ）のそれぞれは、ｉ番目の音符の楽譜上の開始時間および音価を示す。ＧｒｏｕｐＡｒｔｃｌｔｎ（ｔ）は、その音符が含まれるフレーズのアーティキュレーションカーブの時刻ｔにおける比率を示す。Ｎ_{Ｏｆｆｓｅｔ}（ｉ）が実際に発行される時間は、式（２）に従って計算される。 Here, N_Onset(i) and N_TV(i) denote the start time on the score and the note value of the i-th note, respectively. GroupArtcltn(t) denotes the ratio given at time t by the articulation curve of the phrase containing the note. The real time at which N_Offset(i) is actually issued is calculated according to equation (2).
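Equation (5) is also absent from this text. Given that GroupArtcltn(t) is described as a ratio, one plausible reading is that the notated note value is scaled by the ratio sampled at the note's onset and added to the onset time. The sketch below implements that reading only; sampling at the onset, the function name, and the parameter names are assumptions.

```python
def note_offset(onset, note_value, artic_ratio):
    """Score-time note-off position of one note.

    onset: N_Onset(i), start time on the score in beats.
    note_value: N_TV(i), notated length in beats.
    artic_ratio: callable giving the articulation ratio at a score time
                 (assumed to be sampled at the note onset).
    """
    return onset + note_value * artic_ratio(onset)


# A flat articulation curve of 0.5 shortens every note to half its notated length.
print(note_offset(2.0, 1.0, lambda t: 0.5))
```

The result is still a score time; converting it to the real time at which the note-off is issued would go through the tempo integration of equation (2).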

なお、この支援装置１０では、楽譜情報（楽譜データ）として、クレッシェンド、リタルダントおよびスタッカート等の演奏記号が付加されている場合には、それらをルール処理によって表情カーブに展開することもできる。演奏記号が適用されると、その演奏記号に対応する新たなフレーズが形成され、対応する表情カーブのデフォルト値が設定される。 In the support device 10, when performance marks such as crescendo, ritardando, and staccato are included in the score information (score data), they can also be expanded into expression curves by rule processing. When a performance mark is applied, a new phrase corresponding to that mark is formed and a default value of the corresponding expression curve is set.

このようにして、楽曲全体としての表情カーブは生成される。なお、選択したフレーズの表情カーブを編集する際には、ユーザは、必要に応じて、各音の開始時刻、終了時刻および音量を個別に設定することもできる。また、プライマリフレーズラインよりも、上位構造の演奏デザインを重視したくなった場合には、予め設定されたプライマリフレーズラインの重み係数を下げ、上位構造をあらためてプライマリフレーズラインとして再設定して編集操作を継続してもよい。また、上述の一連の過程は、必要に応じて前の操作に戻り、処理を繰り返すこともできる。 In this way, the expression curve for the entire music is generated. When editing the expression curve of the selected phrase, the user can individually set the start time, end time, and volume of each sound as necessary. Also, if you want to emphasize the performance design of the upper structure rather than the primary phrase line, lower the preset primary phrase line weighting factor and re-set the upper structure as the primary phrase line for editing operation. May be continued. Further, the series of processes described above can return to the previous operation and repeat the process as necessary.

支援装置１０では、上述のようなユーザによる表情カーブの修正に応じて、自動演奏を再生するための演奏データが生成または更新され、ピアノロール画面４０のプレビュも更新される。そして、ユーザは、ピアノロール画面４０およびエディット画面４２に表示される視覚情報を参考にしながら、試聴と上述の編集操作とを繰り返すことにより、所望の演奏表情を形成していく。 In the support device 10, performance data for reproducing the automatic performance is generated or updated in accordance with the correction of the expression curve by the user as described above, and the preview of the piano roll screen 40 is also updated. Then, the user repeats the audition and the editing operation described above with reference to the visual information displayed on the piano roll screen 40 and the edit screen 42, thereby forming a desired performance expression.

この編集操作の際には、テンポ、ダイナミクスおよびアーティキュレーションの３つの表情カーブ５０が、エディット画面４２上に、同じ時間軸で多重表示されるので、ユーザは、これら演奏表現の関係性を把握し易くなる。つまり、これら３つの演奏表現の関係性を総合的に判断しながら編集作業を実行できるようになる。また、テンポや音量などの演奏表現をどのように組み合わせることで、その演奏表情が出来上がっているかを認識できるので、ユーザは、演奏の表情付けを容易に行うことができるようになる。 During this editing operation, three facial expression curves 50 of tempo, dynamics and articulation are displayed in multiple on the same time axis on the edit screen 42, so the user grasps the relationship between these performance expressions. It becomes easy to do. That is, the editing operation can be executed while comprehensively determining the relationship between these three performance expressions. In addition, it is possible to recognize how a performance expression is completed by combining performance expressions such as tempo and volume, so that the user can easily perform the expression of the performance.

また、ピアノロール画面４０に頂点らしさの情報が表示されるので、どの音符が重要であるかを定量的に判断できるようになり、これによってもユーザは、演奏の表情付けを容易に実行できるようになる。 In addition, since the information about the apex is displayed on the piano roll screen 40, it is possible to quantitatively determine which note is important, and this also allows the user to easily perform the expression of the performance. become.

なお、１分間程度の中級者向けのピアノ曲に対して表情付けを行う場合、対象音符は概ね５００音程度であり、それぞれに演奏表情を付けることは困難な作業である。しかし、支援装置１０を用いた場合、ユーザが指定するフレーズ（境界）は１０数個であり、ユーザが指定したフレーズの表情カーブを修正するだけで演奏の表情付けが実行できるので、演奏デザインの効率化という観点において、支援装置１０は有効であることがわかる。 In addition, when performing expression on a piano song for intermediate players of about 1 minute, the target notes are approximately 500 sounds, and it is difficult to add a performance expression to each. However, when the support device 10 is used, there are a dozen or so phrases (boundaries) specified by the user, and the expression of the performance can be executed simply by correcting the expression curve of the phrase specified by the user. It can be seen that the support device 10 is effective in terms of efficiency.

続いて、上述のような支援装置１０の動作の一例をフロー図に従って説明する。具体的には、支援装置１０のＣＰＵ１２が、図７に示す全体処理を実行する。図７を参照して、ＣＰＵ１２は、全体処理を開始すると、ステップＳ１で、楽譜情報（楽譜データ）を取得し、その楽譜情報をピアノロール形式で提示する。すなわち、メモリ１４等に記憶されたユーザ所望の楽曲の楽譜情報を読み出し、その情報に基づいてピアノロール画面を生成して表示装置１８の表示画面に表示する。次のステップＳ３では、詳細は後述するフレーズ構造の解析処理を実行する。ステップＳ３で楽曲のフレーズ構造が決定すると、ステップＳ５に進む。 Next, an example of the operation of the support device 10 described above is explained with reference to flowcharts. Specifically, the CPU 12 of the support device 10 executes the overall process shown in FIG. 7. Referring to FIG. 7, when the overall process starts, the CPU 12 acquires score information (score data) in step S1 and presents it in piano roll format. That is, the score information of the piece desired by the user, stored in the memory 14 or the like, is read out, and a piano roll screen is generated from it and displayed on the display screen of the display device 18. In the next step S3, the phrase structure analysis process, described in detail later, is executed. When the phrase structure of the piece is determined in step S3, the process proceeds to step S5.

ステップＳ５では、フレーズ選択の入力があるか否かを判断する。すなわち、ステップＳ３で決定されたフレーズ構造における一連のフレーズ群の中から、ユーザが入力装置１６等を用いて１つのフレーズを選択したか否かを判断する。ステップＳ５で“ＹＥＳ”の場合、すなわちフレーズ選択の入力がある場合には、ステップＳ７に進む。一方、ステップＳ５で“ＮＯ”の場合、すなわちフレーズ選択の入力がない場合には、ステップＳ１７に進む。 In step S5, it is determined whether a phrase selection has been input. That is, it is determined whether the user has selected one phrase, using the input device 16 or the like, from the series of phrases in the phrase structure determined in step S3. If “YES” in step S5, that is, if a phrase selection has been input, the process proceeds to step S7. If “NO” in step S5, that is, if no phrase selection has been input, the process proceeds to step S17.

ステップＳ７では、詳細は後述する頂点らしさの推定処理を実行し、ステップＳ９に進む。ステップＳ９では、頂点らしさの提示、つまり頂点音の候補の提示を行う。たとえば、ピアノロール画面上の音に任意の色を配色し、その濃淡情報によって頂点らしさを識別可能に表示する。 In step S7, an apex likelihood estimation process, which will be described later in detail, is performed, and the process proceeds to step S9. In step S9, the likelihood of a vertex, that is, a candidate for a vertex sound is presented. For example, an arbitrary color is arranged on the sound on the piano roll screen, and the apex-likeness is displayed in an identifiable manner based on the shading information.

次のステップＳ１１では、頂点指定の入力を受け付ける。ただし、ユーザによる頂点指定が無い場合や、自動決定に設定されているような場合には、ステップＳ７で算出した各音符の頂点らしさに基づいて、フレーズの頂点音を自動的に決定してもよい。また、ユーザによる頂点指定の入力があった場合には、その情報をユーザプレファランスとして適宜記憶しておき、頂点推定ルールのパラメータなどをそのユーザに合うように更新していくようにすることもできる。 In the next step S11, input of an apex designation is accepted. However, if the user does not designate an apex, or if automatic determination is enabled, the apex note of the phrase may be determined automatically based on the apex likelihood of each note calculated in step S7. When the user does input an apex designation, that information may be stored as a user preference as appropriate, and the parameters of the apex estimation rules and the like may be updated to suit that user.

次のステップＳ１３では、頂点情報に基づく表情カーブの初期値を設定し、提示する。たとえば、ステップＳ１１で決定された頂点音を頂点とする山型形状となるようにダイナミクスカーブやテンポカーブのデフォルト値を設定したり、ステップＳ７で推定した頂点らしさに応じた形状になるようにダイナミクスカーブやテンポカーブのデフォルト値を設定したりする。どのようなデフォルト値を取るかはユーザが選択できるようにすることもできる。そして、ユーザからの入力指示に応じて、各表情カーブをＧＵＩとして表示する。たとえば、ユーザが、テンポ、ダイナミクスおよびアーティキュレーションの３つの表情カーブを表示するよう入力したときは、これら３つの表情カーブを同じ画面上に、同じ時間軸で多重表示する。 In the next step S13, an initial value of the expression curve based on the vertex information is set and presented. For example, a default value of a dynamics curve or a tempo curve is set so as to have a peak shape with the vertex sound determined in step S11 as a vertex, or a dynamics so as to have a shape according to the vertex-likeness estimated in step S7. Set default values for curves and tempo curves. The user can also select what default value to take. Each facial expression curve is displayed as a GUI in response to an input instruction from the user. For example, when the user inputs to display three expression curves of tempo, dynamics, and articulation, these three expression curves are displayed in multiple on the same screen on the same time axis.

次のステップＳ１５では、後述する表情カーブの編集処理を実行し、ステップＳ１７に進む。ステップＳ１７では、全体処理を終了するか否かを判断する。たとえば、ユーザから終了指示があった場合や、ユーザによる入力操作が所定時間以上行われなかった場合には、全体処理の終了と判断する。ステップＳ１７で“ＹＥＳ”の場合には、この全体処理をそのまま終了する。一方、ステップＳ１７で“ＮＯ”の場合には、処理はステップＳ３またはユーザの指定に応じた適宜のステップに戻る。 In the next step S15, the expression curve editing process described later is executed, and the process proceeds to step S17. In step S17, it is determined whether to end the overall process. For example, the overall process is judged to be finished when the user gives an end instruction or when no input operation has been performed by the user for a predetermined time or longer. If “YES” in step S17, the overall process ends. If “NO” in step S17, the process returns to step S3 or to an appropriate step designated by the user.

図８は、図７に示したステップＳ３のフレーズ構造解析処理を示すフロー図である。図８を参照して、ＣＰＵ１２は、フレーズ構造解析処理を開始すると、ステップＳ２１で、ＧＰＲプレファレンスの指定を受け付ける。すなわち、ＧＰＲの基本プレファレンスとユーザプレファレンスのどちらを優先するのかのユーザによる選択を受け付け、ユーザの選択に応じたＧＰＲプレファレンスに設定する。 FIG. 8 is a flowchart showing the phrase structure analysis processing in step S3 shown in FIG. Referring to FIG. 8, when starting the phrase structure analysis process, CPU 12 accepts designation of a GPR preference in step S21. In other words, the selection by the user regarding whether to give priority to the GPR basic preference or the user preference is accepted, and the GPR preference is set according to the user selection.

次のステップＳ２３では、フレーズ境界の入力を受け付ける。すなわち、図７のステップＳ１で提示したピアノロール上のどの部分にユーザによってフレーズ境界が入力されたかを検出し、メモリ１４等に一時記憶する。ステップＳ２５では、ユーザＧＰＲプレファレンスを更新する。すなわち、ステップＳ２３でのユーザによるフレーズの境界指定に応じて、ユーザＧＰＲプレファレンスをそのユーザに合うように更新する。 In the next step S23, an input of a phrase boundary is accepted. That is, it is detected in which part on the piano roll presented in step S1 of FIG. 7 that the phrase boundary is input by the user, and is temporarily stored in the memory 14 or the like. In step S25, the user GPR preference is updated. That is, according to the phrase boundary designation by the user in step S23, the user GPR preference is updated to suit the user.

次のステップＳ２７では、類似フレーズを検出する。すなわち、ステップＳ２３でユーザが指定したフレーズと類似するフレーズを楽曲全体から抽出し、ステップＳ２９に進む。ステップＳ２９では、ｅｘＧＴＴＭに基づいて、残りのフレーズ構造を解析する。すなわち、ユーザが境界指定したフレーズの前後のフレーズ構造、および上位や下位のフレーズ構造を解析する。この際には、ステップＳ２１で設定したＧＰＲプレファレンス（ユーザまたは基本）を優先する。また、ユーザの指定したフレーズ境界が、ＧＴＴＭのプレファレンスと反駁するときには、基本ＧＰＲプレファレンスの重みを下げると共に、ユーザＧＰＲプレファレンスをそのユーザに合うように更新する。ステップＳ３１では、フレーズ構造を提示する。すなわち、ステップＳ２９で解析した階層的フレーズ構造を表示装置１８にＧＵＩとして表示する。 In the next step S27, similar phrases are detected. That is, phrases similar to the phrase specified by the user in step S23 are extracted from the whole piece, and the process proceeds to step S29. In step S29, the remaining phrase structure is analyzed based on exGTTM. That is, the phrase structures before and after the phrase whose boundaries the user specified, and the upper and lower phrase structures, are analyzed. At this time, the GPR preference set in step S21 (user or basic) is given priority. When a phrase boundary specified by the user conflicts with the GTTM preferences, the weight of the basic GPR preference is lowered and the user GPR preference is updated to suit that user. In step S31, the phrase structure is presented. That is, the hierarchical phrase structure analyzed in step S29 is displayed on the display device 18 as a GUI.
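The text does not say how the similar-phrase detection of step S27 is computed beyond "calculating the melodic contour". As one simple possibility, the sketch below treats two segments as the same shape when their sequences of pitch intervals are identical, which makes the match independent of transposition. This exact-match criterion, the MIDI pitch representation, and the function names are assumptions; a practical detector would likely also consider rhythm and tolerate small differences.

```python
def intervals(pitches):
    """Successive pitch differences in semitones (a simple contour descriptor)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def find_similar(pitches, start, end):
    """Start indices of other segments with the same interval sequence
    as the user-specified phrase pitches[start:end]."""
    target = intervals(pitches[start:end])
    length = end - start
    hits = []
    for s in range(len(pitches) - length + 1):
        if s != start and intervals(pitches[s:s + length]) == target:
            hits.append(s)
    return hits


# The motif C-D-E-C returns a fifth higher as G-A-B-G.
melody = [60, 62, 64, 60, 67, 69, 71, 67, 65]
print(find_similar(melody, 0, 4))
```

Each hit could then be offered to the user as a candidate phrase with the same boundaries as the one already specified.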

In the next step S33, it is determined whether the phrase structure is to be finalized. That is, it is determined whether the user has given an instruction to finalize the phrase structure. For example, the user judges whether the intended phrase structure has been obtained by listening to the performance at this stage and visually checking the phrase structure, and inputs that judgment when satisfied. If "YES" in step S33, that is, if the phrase structure is finalized, the process returns to the overall process of FIG. 1. If "NO" in step S33, that is, if the phrase structure is to be analyzed again, the process returns to step S23 or to another appropriate step designated by the user.
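The control flow of FIG. 8 described above can be sketched as a loop that repeats until the user accepts the presented structure. This is only an illustrative sketch: the function name, the callback parameters, the tuple representation of a phrase, and the `max_rounds` safety limit are all assumptions made for the example, and the preference update of step S25 is omitted because the text does not say how it is computed.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: a phrase is (first_note_index, last_note_index).
Phrase = Tuple[int, int]


def analyze_phrase_structure(
    read_boundaries: Callable[[], List[Phrase]],
    find_similar: Callable[[List[Phrase]], List[Phrase]],
    analyze_rest: Callable[[List[Phrase], str], List[Phrase]],
    user_accepts: Callable[[List[Phrase]], bool],
    preference: str = "user",  # S21: "user" or "basic" GPR preference
    max_rounds: int = 10,      # safety limit, not part of the flowchart
) -> List[Phrase]:
    """Repeat S23 to S33 until the user finalizes the phrase structure."""
    structure: List[Phrase] = []
    for _ in range(max_rounds):
        specified = read_boundaries()                # S23: user-specified phrases
        seeds = specified + find_similar(specified)  # S27: add similar phrases
        structure = analyze_rest(seeds, preference)  # S29: analyze the remainder
        if user_accepts(structure):                  # S31 and S33: present, confirm
            break
    return structure
```

Passing the analysis steps in as callbacks keeps the sketch independent of any particular GTTM implementation; only the order of the steps is taken from the description.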

FIG. 9 is a flowchart showing the apex-likeness calculation process of step S7 shown in FIG. 7. In FIG. 9, "i" is an index assigned to the notes in their order of appearance in order to designate a note. Note(i) describes information such as the pitch, note value, and harmony of the i-th note, and ApexVal(i) is a value (energy value) representing the apex-likeness of the i-th note. Referring to FIG. 9, on starting the apex-likeness calculation process, the CPU 12 initializes the apex-likeness by setting ApexVal(i) = 0 in step S41, and sets i = 0 in step S43.

In the next step S45, it is determined whether i = max. That is, it is determined whether the apex-likeness has been calculated for all target notes. If "YES" in step S45, that is, if the apex-likeness has been calculated for all target notes, the process returns to the overall process of FIG. 1. If "NO" in step S45, that is, if there is a note for which the apex-likeness has not yet been calculated, the process proceeds to step S47.

In step S47, the information in Note(i) is compared against each of the apex estimation rules (see FIGS. 4 and 5), and for every rule that applies to Note(i), the evaluation points assigned to that rule are added to ApexVal(i). The calculated value of ApexVal(i), that is, the total number of points indicating apex-likeness, is stored in the memory 14 or the like in association with the corresponding note. In the next step S49, i is incremented (i = i + 1), and the process returns to step S45.
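The loop of steps S41 to S49 can be sketched as follows. The names `Note`, `Rule`, `RULES`, and `apex_values` are assumptions made for the example, and the two rules shown are hypothetical placeholders; they are not the apex estimation rules of FIGS. 4 and 5, which are not reproduced in this section.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Note:
    pitch: int       # MIDI note number (assumed encoding)
    duration: float  # note value in beats (assumed encoding)


@dataclass
class Rule:
    name: str
    points: int  # evaluation points added when the rule applies
    applies: Callable[[List[Note], int], bool]


# Hypothetical stand-ins for the rules of FIGS. 4 and 5.
RULES = [
    Rule("highest pitch in phrase", 2,
         lambda notes, i: notes[i].pitch == max(n.pitch for n in notes)),
    Rule("longest note in phrase", 1,
         lambda notes, i: notes[i].duration == max(n.duration for n in notes)),
]


def apex_values(notes: List[Note], rules: List[Rule]) -> List[int]:
    apex_val = [0] * len(notes)             # S41: initialize ApexVal(i) = 0
    i = 0                                   # S43
    while i != len(notes):                  # S45: stop when i = max
        for rule in rules:                  # S47: check Note(i) against each rule
            if rule.applies(notes, i):
                apex_val[i] += rule.points
        i += 1                              # S49
    return apex_val


phrase = [Note(60, 1.0), Note(67, 2.0), Note(64, 1.0)]
print(apex_values(phrase, RULES))  # prints [0, 3, 0]
```

In this example the second note satisfies both placeholder rules, so it receives 2 + 1 = 3 points and would be shown as the most apex-like note of the phrase.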

FIG. 10 is a flowchart showing the expression curve editing process of step S15 shown in FIG. 7. Referring to FIG. 10, on starting the expression curve editing process, the CPU 12 determines in step S51 whether an expression curve has been corrected. That is, it determines whether the user has modified the tempo curve, the dynamics curve, or the like. If "YES" in step S51, that is, if an expression curve has been corrected, the process proceeds to step S53. If "NO" in step S51, that is, if no expression curve has been corrected, the process proceeds to step S61.

In step S53, the expression curves for the entire piece are calculated. That is, default values of the expression curves are also set, based on the apex information, for the higher-level structures of the phrase corrected in step S51, and the expression curves of the individual phrases in the hierarchical phrase structure are combined to generate expression curves for the piece as a whole. When correcting an expression curve, the user can also input a performance symbol such as a crescendo; when a performance symbol is input, it is expanded into the expression curve by rule processing.
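The description does not state how the curves of the individual phrases are combined in step S53. As one possible reading only, the sketch below assumes that each phrase curve holds one multiplicative factor per note (for example a tempo ratio relative to the nominal tempo) and that the whole-piece curve is the product of the factors of every phrase level covering a note. The names `PhraseCurve` and `combine_curves` and the multiplicative model are assumptions, not the method of the embodiment.

```python
from typing import List, Tuple

# Hypothetical representation: (index of the first note, one factor per note).
PhraseCurve = Tuple[int, List[float]]


def combine_curves(num_notes: int, phrase_curves: List[PhraseCurve]) -> List[float]:
    """Multiply the factors of all phrases (at any level) that cover each note."""
    whole = [1.0] * num_notes  # 1.0 means "no deviation from the nominal value"
    for start, factors in phrase_curves:
        for offset, factor in enumerate(factors):
            whole[start + offset] *= factor
    return whole


# One upper-level phrase over four notes and two lower-level phrases inside it.
curves = [
    (0, [1.0, 1.5, 1.5, 1.0]),  # upper level
    (0, [1.0, 2.0]),            # lower level, notes 0 and 1
    (2, [2.0, 1.0]),            # lower level, notes 2 and 3
]
print(combine_curves(4, curves))  # prints [1.0, 3.0, 3.0, 1.0]
```

An additive model (summing offsets instead of multiplying ratios) would fit the same structure; only the combining operation inside the inner loop would change.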

In the next step S55, it is determined whether there is an individual correction to any note. That is, it is determined whether the user has corrected the sounding time, the mute time, the volume, or the like of an individual note. If "YES" in step S55, that is, if there is an individual note correction, the process proceeds to step S57. If "NO" in step S55, that is, if there is no individual note correction, the process proceeds to step S59.

In step S57, the performance expression is corrected note by note in accordance with the correction instruction accepted in step S55. In step S59, the editing result is displayed on the piano roll screen and the edit screen, and the performance data used to output a performance reflecting the edited expression curves is updated.
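The note-by-note correction of step S57 can be pictured as overriding selected fields of the performance data for individual notes. The sketch below is a minimal illustration under assumed names: `PerformedNote`, `NoteEdit`, and `apply_note_edits` are hypothetical, as is the choice of seconds for times and a 0 to 127 range for volume.

```python
from dataclasses import asdict, dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PerformedNote:
    onset: float   # sounding time in seconds (assumed unit)
    offset: float  # mute time in seconds (assumed unit)
    velocity: int  # volume, 0 to 127 as in MIDI (assumed range)


@dataclass(frozen=True)
class NoteEdit:
    # None means "leave this field unchanged".
    onset: Optional[float] = None
    offset: Optional[float] = None
    velocity: Optional[int] = None


def apply_note_edits(notes: List[PerformedNote],
                     edits: Dict[int, NoteEdit]) -> List[PerformedNote]:
    """Return new performance data with the per-note corrections applied."""
    result = []
    for i, note in enumerate(notes):
        edit = edits.get(i)
        if edit is not None:
            changes = {k: v for k, v in asdict(edit).items() if v is not None}
            note = replace(note, **changes)  # copy with the edited fields
        result.append(note)
    return result
```

Returning a new list rather than modifying the notes in place makes it straightforward to go back to the previous state when the user answers "NO" at the confirmation step and edits again.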

Then, in step S61, it is determined whether the expression curves are to be finalized. That is, it is determined whether the user has given an instruction to finalize the expression curves. For example, the user judges whether the intended expression has been given to the performance by listening to the expressive performance and visually checking the piano roll screen and the edit screen, and inputs that judgment when satisfied. If "YES" in step S61, that is, if the expression curves are finalized, the process returns to the overall process of FIG. 1. If "NO" in step S61, that is, if the expression curves are to be corrected again, the process returns to step S51 or to another appropriate step designated by the user.

According to this embodiment, when the expression curves are edited, the three expression curves for tempo, dynamics, and articulation are displayed together on the edit screen along the same time axis, so the user can easily grasp the relationship among these performance expressions. In other words, the user can see how performance expressions such as tempo and volume are combined to produce a given performance expression, and can therefore give expression to a performance easily and efficiently. Furthermore, by giving expression to a performance with this awareness, the user's thinking about performance expression becomes more objective, and the user comes to understand the techniques of music analysis and performance expression more concretely. That is, the way of expressing a phrase that previously existed only as tacit knowledge, namely how performance parameters such as tempo were combined to realize that expression, can be externalized, so the user can formalize his or her own knowledge. As a result, techniques of performance expression can be conveyed more concretely to a third party, and the support device 10 is therefore well suited for use as an educational tool.

In addition, unlike a phrase boundary, the apex note of a phrase cannot be identified explicitly by listening, so identifying the apex note is a difficult task for a person without musical experience. According to this embodiment, however, apex-likeness information (guidance) is displayed on the piano roll screen, so the user can judge quantitatively which notes are important and, by referring to this information, can give expression to the performance easily and efficiently. Moreover, since the user can see what kinds of notes tend to become apex notes, the user can more easily understand how to find apex notes. In this respect as well, the support device 10 is well suited for use as an educational tool.

Furthermore, in this embodiment, the conditions and evaluation point values used to calculate the apex-likeness of a note are described as rules, so the basis of the calculation can easily be shown to the user. The device can therefore support giving expression to a performance from the standpoint of helping users estimate and understand apexes themselves.
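Because each rule carries its own condition and point value, the basis of a note's score can be reported simply by listing the rules that applied to it. The following sketch illustrates the idea; the note encoding, the tuple form of a rule, the function `explain_apex`, and the two example rules are all hypothetical and are not taken from FIGS. 4 and 5.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical note encoding: a dict with "pitch" and "duration" keys.
NoteInfo = Dict[str, float]
# A rule is (description, evaluation points, condition on the phrase and index).
Rule = Tuple[str, int, Callable[[List[NoteInfo], int], bool]]

EXAMPLE_RULES: List[Rule] = [
    ("highest pitch in phrase", 2,
     lambda ns, i: ns[i]["pitch"] == max(n["pitch"] for n in ns)),
    ("longest note in phrase", 1,
     lambda ns, i: ns[i]["duration"] == max(n["duration"] for n in ns)),
]


def explain_apex(notes: List[NoteInfo], rules: List[Rule],
                 i: int) -> Tuple[int, List[Tuple[str, int]]]:
    """Return the total points of note i and the rules that contributed to it."""
    matched = [(desc, pts) for desc, pts, cond in rules if cond(notes, i)]
    total = sum(pts for _, pts in matched)
    return total, matched


phrase = [{"pitch": 60, "duration": 1.0}, {"pitch": 67, "duration": 2.0}]
total, reasons = explain_apex(phrase, EXAMPLE_RULES, 1)
for desc, pts in reasons:
    print(f"+{pts}: {desc}")
print("total:", total)
```

A display such as a tooltip on the piano roll could show the returned list next to the note, although the description does not specify how the basis is presented.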

In the embodiment described above, the user partially designates the phrase boundaries (designates the primary phrase line), after which the phrase structure of the remaining parts is analyzed automatically; however, the invention is not limited to this. For example, the user may designate the entire hierarchical phrase structure of the piece, or the entire hierarchical phrase structure may be analyzed automatically. Naturally, the user may also correct the phrase boundaries after the entire hierarchical phrase structure has been analyzed automatically.

Also, in the embodiment described above, the apex-likeness is calculated and corrections to the expression curves are accepted for the phrases included in the primary phrase line; however, the invention is not limited to this. The apex-likeness may be calculated, and corrections to the expression curves may be accepted, for each of the phrases constituting the hierarchical phrase structure of the piece.

DESCRIPTION OF SYMBOLS
10 ... Performance expression support device
12 ... CPU
14 ... Memory
16 ... Input device
18 ... Display device
20 ... Sound source
40 ... Piano roll screen
42 ... Edit screen
50 ... Expression curve

Claims

A performance expression support device that supports giving expression to a performance based on phrasing, comprising:
first display means for displaying, for each phrase constituting a hierarchical phrase structure designated by a user or analyzed automatically, the expression curves of tempo, dynamics, and articulation together on the same screen along the same time axis; and correction accepting means for accepting correction of the expression curves displayed on the screen.

The performance expression support device according to claim 1, further comprising: calculating means for calculating the apex-likeness of each note in the phrase; and second display means for displaying the apex-likeness of each note in an identifiable manner.

The performance expression support device according to claim 2, wherein the calculating means includes:
rule storage means for storing a group of rules for determining the apex-likeness of a note based on a predetermined music theory; and addition means for comparing each note in the phrase with each rule of the rule group and adding the evaluation points assigned to each matching rule as an energy value indicating the apex-likeness of the note.