JP2005099068A

JP2005099068A - Musical instrument and musical sound control method

Info

Publication number: JP2005099068A
Application number: JP2003329338A
Authority: JP
Inventors: Takuya Shinkawa; 拓也新川
Original assignee: Individual
Current assignee: Individual
Priority date: 2003-09-22
Filing date: 2003-09-22
Publication date: 2005-04-14

Abstract

PROBLEM TO BE SOLVED: To provide a musical instrument that, unlike existing musical instruments and welfare musical instruments, does not require the player to move the hands or feet, so that persons with quadriplegia or limb disabilities can enjoy playing a musical instrument.

SOLUTION: The musical instrument analyzes images of the player's face and of the expressions of the mouth, tongue, eyes, nose, etc., extracts feature quantities, and controls musical sound according to those feature quantities. The instrument also displays, together with a musical sound template, an image of the mouth, tongue, nose, or part of the face extracted from the video signal, or a graphic indicating the size, shape, or position of that image. Further, the instrument accepts an operation that places a prescribed part of the image or graphic on the musical sound template, and associates the size and shape of the image or graphic, or the position of the graphic, with an attribute of the musical sound to be generated.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a welfare musical instrument that allows persons with quadriplegia or limb disabilities to enjoy playing music. It does not require the use of the limbs for performance, and does not require any controls needed for performance to be attached to the body.

With existing musical instruments and welfare instruments, moving the hands and feet is essential to performance. No instrument is known that persons with quadriplegia or limb disabilities can enjoy playing. On the other hand, a technique is known in which sensors attached to the wrists, elbows, shoulders, feet, and so on of the human body supply performance control information to an electronic musical instrument (see, for example, Patent Document 1).
JP-A-10-097250 (page 1, FIG. 1)

For persons with quadriplegia or limb disabilities to enjoy playing a musical instrument, an instrument is needed that, unlike existing musical instruments and welfare instruments, does not require moving the hands and feet. The present invention provides a musical instrument that does not use the limbs for performance and does not require controls needed for performance to be attached to the body.

To solve the above problem, the musical instrument of the present invention employs the following means.
(1) A musical instrument comprising an imaging unit that captures the face of a player or a part thereof, an image analysis unit that analyzes the video signal output by the imaging unit and extracts a feature quantity, and a musical sound synthesis unit that generates a musical sound signal based on the feature quantity.
(2) The musical instrument according to (1), wherein the player's lip region is extracted from the video signal, height information and width information of the opening of the lip region are extracted as the feature quantity, and generation of the musical sound signal is controlled by the height information and the width information.
(3) The musical instrument according to (2), wherein one of the pitch and the volume of the musical sound signal is controlled by the height information, and the other is controlled by the width information.
(4) The musical instrument according to (1), wherein position information of the player's nose is extracted from the video signal as the feature quantity, and generation of the musical sound signal is controlled by the nose position information.
(5) The musical instrument according to (4), wherein two-dimensional position information of the player's nose in the vertical and horizontal directions is extracted from the video signal as the feature quantity, one of the pitch and the volume of the musical sound signal is controlled by the vertical position information, and the other is controlled by the horizontal position information.
(6) The musical instrument according to (1), wherein position information of the player's tongue is extracted from the video signal as the feature quantity, and generation of the musical sound signal is controlled by the tongue position information.

(7) The musical instrument according to (6), wherein two-dimensional position information of the player's tongue in the vertical and horizontal directions is extracted from the video signal as the feature quantity, one of the pitch and the volume of the musical sound signal is controlled by the vertical position information, and the other is controlled by the horizontal position information; or wherein three-dimensional position information of the player's tongue in the vertical, horizontal, and depth directions is extracted, and any of the pitch, volume, and timbre of the musical sound signal is controlled by any of the respective pieces of position information.
(8) The musical instrument according to (1), wherein opening/closing pattern information of the player's eyelids or lips is extracted from the video signal as the feature quantity, and the opening/closing pattern information is used to control an attribute of the musical sound signal or to control operation of the musical instrument.
(9) The musical instrument according to (1), comprising a display unit, wherein an image of the lips, tongue, nose, or part of the face extracted from the video information, or a graphic indicating the size, shape, or position of the image, is displayed on the display unit together with a musical sound template.
(10) The musical instrument according to (9), wherein an operation of placing a predetermined part of the image or the graphic on the musical sound template is accepted, and the size or shape of the image or the graphic, or the position of the graphic, is associated with an attribute of the musical sound to be generated.

(11) A musical sound control method that executes any one of the following sets of procedures:
(a) an image analysis procedure of analyzing a video signal of a face or a part thereof and extracting a feature quantity, and a procedure of controlling generation of a musical sound signal based on the feature quantity;
(b) a procedure of extracting the lip region from the video signal and extracting, as a feature quantity, either height information or width information of the opening of the lip region, and a procedure of controlling generation of a musical sound signal by either the height information or the width information;
(c) a procedure of extracting the lip region from the video signal and extracting, as feature quantities, height information and width information of the opening of the lip region, and a procedure of controlling one of the pitch and the volume of a musical sound signal by the height information and the other by the width information;
(d) a procedure of extracting position information of the player's nose from the video signal as a feature quantity, and a procedure of controlling generation of a musical sound signal by the nose position information;
(e) a procedure of extracting two-dimensional position information of the player's nose in the vertical and horizontal directions from the video signal as a feature quantity, and a procedure of controlling one of the pitch and the volume of a musical sound signal by the vertical position information and the other by the horizontal position information;
(f) a procedure of extracting position information of the player's tongue from the video signal as a feature quantity, and a procedure of controlling generation of the musical sound signal by the tongue position information;
(g) a procedure of extracting two-dimensional position information of the player's tongue in the vertical and horizontal directions from the video signal as a feature quantity, and a procedure of controlling one of the pitch and the volume of a musical sound signal by the vertical position information and the other by the horizontal position information;
(h) a procedure of extracting three-dimensional position information of the player's tongue in the vertical, horizontal, and depth directions from the video signal as a feature quantity, and a procedure of controlling any of the pitch, volume, and timbre of a musical sound signal by any of the respective pieces of position information;
(i) a procedure of extracting opening/closing pattern information of the player's eyelids from the video signal as a feature quantity, and a procedure of controlling an attribute of a musical sound signal, or controlling operation of the musical instrument, by the opening/closing pattern information.
(12) The musical sound control method according to (11), comprising a procedure of displaying on a display unit, together with a musical sound template, an image of the lips, tongue, nose, or part of the face extracted from the video information, or a graphic indicating the size, shape, or position of the image.
(13) The musical sound control method according to (12), comprising a procedure of accepting an operation of placing a predetermined part of the image or the graphic on the musical sound template, and associating the size or shape of the image or the graphic, or the position of the graphic, with an attribute of the musical sound to be generated.
(14) A program that causes a computer to perform the procedures according to any one of (11), (12), and (13).
(15) A recording medium on which is recorded a program that causes a computer to perform the procedures according to any one of (11), (12), and (13).

With the musical instrument of the present invention as described above, persons with quadriplegia or limb disabilities can play a musical instrument without moving their hands or feet. In addition, no controls needed for performance have to be attached to the body. Such persons can play the instrument as if singing a song.

Embodiments of the musical instrument of the present invention are described below with reference to the drawings. Where components given the same reference numerals in the embodiments operate in the same way, repeated description may be omitted.
(Embodiment 1)
FIG. 1 shows an embodiment of the musical instrument of the present invention. The instrument comprises an imaging unit 2 that captures the face of a player 1 or a part thereof, an image analysis unit 3 that analyzes the video signal output by the imaging unit and extracts feature quantities from the face or facial expression, and a musical sound synthesis unit 4 that generates a musical sound signal based on the feature quantities.

In FIG. 1, the imaging unit 2 captures the face of the player 1 and supplies the video signal of the face to the image analysis unit 3. The image analysis unit 3 performs the facial image analysis described later on the video signal, detects feature quantities, and supplies them to the musical sound synthesis unit 4. The musical sound synthesis unit 4 outputs a musical sound signal according to the feature quantities. The musical sound signal is converted into sound by headphones, a loudspeaker, or the like, and is heard by the player 1 and people nearby. The headphones and loudspeaker are omitted from FIG. 1.
The description first assumes that the display unit 5 and the operation unit 6 in FIG. 1 are absent.
Through image analysis, the image analysis unit 3 recognizes the lip region of the player 1, measures in real time the position of the lips, the sample points set at various positions on the lips, the opening area, and so on, and converts their displacement into pitch data and volume data for performance. The lip position alone may also be used as the means of performance.

FIG. 2(A) is an example image of the lip region extracted by the image analysis unit 3. With the mouth open as in lip region 13, W1 is the width of the opening and H1 is its height. In this example the mouth is stretched wide horizontally. In lip region 14 of FIG. 2(B), the mouth is opened wide vertically and slightly pursed horizontally; the width and height are W2 and H2, respectively. The image analysis unit 3 recognizes and extracts such a lip region and extracts feature quantities such as the width information W and the height information H.
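The extraction of W and H can be illustrated with a minimal sketch. It assumes the mouth opening has already been segmented into a binary mask; the patent does not say how the lip region is recognized, so the function name `opening_size`, the mask representation, and the `mm_per_pixel` scale are all hypothetical.

```python
def opening_size(mask, mm_per_pixel=1.0):
    """Return (W, H) in millimetres for the bounding box of the truthy cells.

    `mask` is a list of equal-length rows; truthy cells mark the mouth opening.
    The segmentation that produces `mask` is assumed, not taken from the patent.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return 0.0, 0.0  # no opening detected (mouth closed)
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    width = (cols[-1] - cols[0] + 1) * mm_per_pixel
    height = (rows[-1] - rows[0] + 1) * mm_per_pixel
    return width, height
```

A bounding box is only one possible definition of the opening size; the sample points on the lips mentioned in Embodiment 1 could equally be used.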

FIG. 3 shows an example of the image analysis unit 3 and the musical sound synthesis unit 4. The image analysis unit 3 comprises an H detection unit 31, a W detection unit 32, a tongue position detection unit 33, a right eye detection unit 34, and a left eye detection unit 35. The H detection unit 31 detects the height H of the opening in FIGS. 2(A) and 2(B). The W detection unit 32 detects the width W of the opening in the images of lip regions 13 and 14 in FIGS. 2(A) and 2(B). The opening height data H is supplied to terminal T1 of the musical sound synthesis unit 4 and controls the scale note of the musical sound. The opening width data W is supplied to terminal T2 of the musical sound synthesis unit 4 and controls the loudness of the musical sound, that is, the volume.

The opening height H is divided into 5 mm bands: H = 5 to 10 mm is assigned to do, H = 10 to 15 mm to re, H = 15 to 20 mm to mi, and so on, so that 5 to 45 mm spans one octave. An opening width W of 30 mm or less is assigned to the dynamic marking mezzo piano (mp), 30 to 40 mm to mezzo forte (mf), and 40 mm or more to forte (f). When the H data is applied to terminal T1 as scale data, the musical sound synthesis unit 4 generates a musical sound signal of the corresponding scale note. When the W data is applied to terminal T2, it generates a musical sound signal with the corresponding dynamics. When H = 0 to 5 mm, that is, when the mouth is almost closed, generation of the musical sound signal stops. Silence is tied to H rather than W because setting W to 0, or keeping the corners of the mouth drawn in to hold W at a small value so that the sound does not keep sounding, does not seem to be an easy movement.
The pitch may instead be controlled by the width information W and the volume by the height information H. In that case, the volume becomes zero when the mouth is closed.
When generation of musical sound is controlled by the shape of the lips as in this embodiment, the generated sound can be kept unchanged as long as the lips are held in the same shape, even if the player moves the face somewhat in time with the rhythm, so the player 1 feels little sense of the face being constrained.
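The band assignment described above can be sketched as a pair of lookup functions. This is an illustration only: the text names just do, re, and mi, so the remaining note names, the handling of exact band boundaries, and the clamping of heights above 45 mm are assumptions, as are the function names.

```python
# Eight 5 mm bands from 5 mm to 45 mm; names beyond "mi" are assumed.
NOTES = ["do", "re", "mi", "fa", "so", "la", "ti", "do'"]

def note_for_height(h_mm):
    """Map opening height H (mm) to a scale note, or None for silence."""
    if h_mm < 5:
        return None  # mouth almost closed: sound generation stops
    index = int((h_mm - 5) // 5)
    return NOTES[min(index, len(NOTES) - 1)]  # clamp above 45 mm (assumption)

def dynamic_for_width(w_mm):
    """Map opening width W (mm) to a dynamic marking."""
    if w_mm < 30:
        return "mp"
    if w_mm < 40:
        return "mf"
    return "f"
```

For example, `note_for_height(12)` gives `"re"` and `dynamic_for_width(35)` gives `"mf"`, matching the bands in the text.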

In FIG. 3, the tongue position detection unit 33 detects the position of the tongue while the mouth is open and outputs that position information. It detects where the tongue is, up or down and left or right, relative to the center of the opening, and supplies tongue coordinate data, which is two-dimensional tongue position information, to terminal T3. The tongue coordinate data can be used to control various characteristics of the musical sound signal. For example, it can be used to make the timbre softer or harder, or to apply vibrato when the tongue is moved. The imaging unit 2 may also be a stereoscopic camera that detects whether the tongue is pushed forward or drawn back, so that three-dimensional position coordinates of the tongue are supplied and used to control the musical sound.

In FIG. 3, the right eye detection unit 34 and the left eye detection unit 35 detect deliberate blinks of the right and left eyes. When an eye is blinked several times, or kept closed for a predetermined time or longer, a detection signal is supplied to terminal 4 and terminal 5. By defining a simple code along the lines of Morse code, eyelid opening/closing pattern information can be obtained from the left and right code detection signals and used to switch the timbre or instrument sound of the musical sound signal. Brief, momentary blinks that occur physiologically are ignored. The volume of the musical sound may be set to zero while one eye is closed. The eyelid opening/closing pattern information can be used not only to control and select attributes of the musical sound but also to select and switch various functions of the instrument.
The continuously varying H data output by the image analysis unit 3 may be converted into scale data in the musical sound synthesis unit 4, or it may be converted in the image analysis unit 3 into scale data representing note names and then supplied to the musical sound synthesis unit 4. The resulting pitches are spaced in semitone units according to equal temperament.
The techniques for generating and controlling musical sound in the musical sound synthesis unit 4 are well known in the field of electronic musical instruments, so a detailed description is omitted.
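The eyelid pattern idea can be sketched as follows, assuming the eye detection units already report how long each closure lasted. The thresholds, the dot/dash encoding, and the command table are hypothetical; the patent only says that a simple Morse-like code is defined and that short physiological blinks are ignored.

```python
def classify_closures(durations, ignore_below=0.3, long_from=1.0):
    """Turn eye-closure durations (seconds) into a Morse-like string.

    Closures shorter than `ignore_below` are treated as involuntary blinks
    and dropped; both thresholds are illustrative values, not from the patent.
    """
    symbols = []
    for d in durations:
        if d < ignore_below:
            continue  # physiological blink: ignore
        symbols.append("-" if d >= long_from else ".")
    return "".join(symbols)

# Hypothetical command table mapping patterns to instrument functions.
COMMANDS = {".-": "next timbre", "--": "toggle range-setting mode"}

pattern = classify_closures([0.1, 0.5, 1.2, 0.15])
action = COMMANDS.get(pattern)  # None when the pattern is not a known command
```

Keeping the pattern table separate from the classifier makes it straightforward to use different codes for the left and right eyes.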

(Embodiment 2)
Next, an embodiment in which the display unit 5 is provided, so that the player 1 can practice with the lips and check them during performance, is described with reference to FIG. 1.
In FIG. 1, the image analysis unit 3 supplies image data of a part of the face and musical sound template data to the display unit 5, which displays them. The other parts are the same as in Embodiment 1.

FIG. 4(A) is an example of an image displayed on the display unit 5. In FIG. 4(A), a musical sound template in the form of a grid of vertical and horizontal lines is displayed. As shown in FIG. 4(A), taking the center of the template as the reference, scale notes are assigned along the vertical direction of the first quadrant and the dynamic markings p, mp, and mf along the horizontal direction. The second to fourth quadrants mirror the first quadrant left-right and up-down. An image of the lip region of the player 1 is displayed on the musical sound template. The image analysis unit 3 detects the center of the lip region and supplies display data aligned with the center of the musical sound template so that the lip region is displayed there. For the lip image in FIG. 4(A), the scale note is re and the dynamic marking is forte. The player 1 can perform while watching the image of his or her own lips on the display unit 5 and adjusting or correcting the note and volume. The player can also use it for everyday practice, checking how far the mouth must be opened to produce each scale note.
The image analysis unit 3 may instead handle the lip image data and the musical sound template data internally, align the center point data of the lip image with the center point data of the musical sound template, evaluate the size of the opening on the musical sound template to calculate W and H, and determine the scale note and volume.
For simplicity, the musical sound template in FIG. 4(A) uses the C major scale and omits chromatic notes with sharps or flats. The number of rows in the template may be increased so that all chromatic notes are assigned.
(Embodiment 3)

Next, a function for setting the scale range and dynamic range of the generated musical sound to suit the player 1, using a scale/dynamic range setting mode, is described. That is, an operation of placing a predetermined part of the lip image on the musical sound template is accepted, and on accepting it, the size and shape of the image, or a specific part of the image, is associated with an attribute of the musical sound to be generated.
In general, the size of the lips and the range over which the mouth can be opened differ from player to player. The player 1 can cancel out this individual difference by adjusting the distance to the imaging unit 2. Alternatively, individual differences in lip size may be corrected and normalized as follows. The case of using the operation unit 6 shown in FIG. 1 is described.

First, the player 1 presses a scale/dynamic range setting mode button (not shown) on the operation unit 6 to put the image analysis unit 3 into the scale/dynamic range setting mode. With the lips opened as far as possible vertically, the player 1 presses a highest-note setting button (not shown) on the operation unit 6. In response, the image analysis unit 3 decides to place the upper lip of the lip image at the top of the musical sound template of FIG. 4(A) and the lower lip at the bottom of the template. Next, with the lips stretched as far as possible horizontally, the player 1 presses a dynamic marking setting button (not shown) on the operation unit 6. In response, the image analysis unit 3 decides to place the right end of the lip image at the right edge of the musical sound template and the left end at the left edge. The original lip image is thus processed and displayed with different magnifications in the vertical and horizontal directions. From then on, the lip image is processed according to these magnifications, its position on the musical sound template is determined, and the scale note and dynamics are decided. The lip image is also displayed according to these magnifications. This operation makes it possible to control the scale note and dynamics in a way that suits the individual manner in which the player 1 opens the lips.
Instead of changing the vertical and horizontal magnification of the displayed lips, the musical sound template may be shrunk or enlarged vertically or horizontally to fit the lip size of the player 1.
To perform music, the player presses a scale/dynamic range setting mode release button (not shown) to leave the scale/dynamic range setting mode.
If a player with limb disabilities has difficulty operating the buttons of the operation unit 6, a caregiver may operate them. Alternatively, if the image analysis unit 3 detects specific blink patterns of the player 1 and treats them as selection of the scale/dynamic range setting mode and as the scale and dynamics setting operations, the scale and dynamic ranges can be set without using the operation unit 6.
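The range-setting procedure amounts to recording the player's maximum opening in each direction and scaling later measurements by it. A minimal sketch under that reading follows; the class name, its methods, and the normalization to the range 0 to 1 are hypothetical and stand in for the separate vertical and horizontal magnifications of the text.

```python
class RangeCalibration:
    """Per-player scaling of lip opening size (illustrative, not the patent's code)."""

    def __init__(self):
        self.max_height = None
        self.max_width = None

    def set_max_height(self, height):
        # Called when the highest-note setting button is pressed.
        self.max_height = float(height)

    def set_max_width(self, width):
        # Called when the dynamic marking setting button is pressed.
        self.max_width = float(width)

    def normalize(self, height, width):
        """Return (height, width) as fractions of the template extent."""
        if not self.max_height or not self.max_width:
            raise ValueError("calibration is incomplete")
        return (min(height / self.max_height, 1.0),
                min(width / self.max_width, 1.0))
```

A fraction of 1.0 then corresponds to the top row or the outermost column of the musical sound template, whatever the physical size of the player's mouth.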
(Embodiment 4)

In Embodiment 3 above, the imaging unit 2 is preferably provided with an autofocus function so that the lip image does not blur even when the player 1 moves closer to or farther from the imaging unit 2. The subject distance information obtained from the autofocus function may be supplied to the image analysis unit 3, and the magnification of the lip image may be further corrected so that, even if the player 1 approaches the imaging unit 2 and the lip image becomes larger, the note does not become higher or the sound louder. With the distance between the player 1 and the imaging unit 2 thus prevented from affecting the scale note and dynamics, the distance information may then be used to control properties of the sound other than scale note and dynamics, such as timbre or vibrato depth.
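One way to picture the distance-based correction is the sketch below. It assumes a simple pinhole camera model, in which apparent size is inversely proportional to subject distance; the patent does not state the correction formula, and the function and parameter names are hypothetical.

```python
def compensate_for_distance(size_px, distance, reference_distance):
    """Rescale a measured size to what it would be at the reference distance.

    Assumes apparent size is proportional to 1 / distance (pinhole model),
    which is an assumption rather than something the patent specifies.
    """
    if distance <= 0 or reference_distance <= 0:
        raise ValueError("distances must be positive")
    return size_px * distance / reference_distance
```

With a reference distance of 1.0 m, a lip opening that measures 100 pixels at 0.5 m is corrected to 50 pixels, so leaning toward the camera does not raise the note or the volume.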

(Embodiment 5)
Alternatively, the normalization of individual differences in lip size and the distance-based magnification correction described in Embodiments 3 and 4 may be omitted, so that the player 1 can control the scale note and volume by moving the face closer to or farther from the imaging unit 2.

(Embodiment 6)
In the description above, the continuously varying H data is assigned to the notes of the C major scale so that music is played in an equal-tempered scale. Alternatively, the H data may be mapped directly to a continuous pitch so that oriental or folk music can be played.
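The difference between quantized and continuous pitch can be sketched with the standard equal-temperament relation, in which each semitone multiplies frequency by 2^(1/12). The linear mapping of the 5 to 45 mm range onto one octave, the base frequency of about 261.63 Hz (middle C), and the function name are assumptions made for illustration.

```python
def height_to_frequency(h_mm, quantize=False, base_hz=261.63,
                        h_min=5.0, h_max=45.0):
    """Map opening height H (mm) to a frequency in Hz over one octave.

    With quantize=True the pitch snaps to the nearest equal-tempered
    semitone; otherwise it varies continuously with H.
    """
    h = min(max(h_mm, h_min), h_max)          # clamp to the playable range
    semitones = 12.0 * (h - h_min) / (h_max - h_min)
    if quantize:
        semitones = round(semitones)
    return base_hz * 2 ** (semitones / 12)
```

Calling the function with `quantize=False` gives the gliding pitch described in this embodiment, while `quantize=True` reproduces stepwise semitone behaviour.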

(Embodiment 7)
Of the tongue coordinate data described in Embodiment 1, the horizontal position of the tongue may be used as the pitch data; for example, moving the tongue to the right produces a higher note. Volume may be controlled by the front-to-back position of the tongue: making the sound quiet when the tongue is drawn back toward the throat and louder the further it is pushed forward gives an operation that matches human intuition. The vertical position of the tongue may control various other characteristics of the sound, such as timbre.
In Embodiment 2, the position of the tongue tip may be displayed in addition to the lips, and in Embodiment 3, the pitch range and the dynamic range may be set with position-setting buttons for the up, down, left, and right positions of the tongue tip instead of with the vertical and horizontal widths of the lips.
This method suits performers who find it easier to control the position of the tongue tip than the opening of the mouth.
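A minimal sketch of this three-axis assignment is given below. It assumes the tongue coordinates have already been normalized to the range 0 to 1 on each axis; the function name and the dictionary keys are hypothetical.

```python
def tongue_to_controls(x: float, y: float, z: float) -> dict:
    """Map normalized tongue coordinates to tone controls.

    x: horizontal position, 1.0 = performer's right  -> higher pitch
    y: vertical position                              -> timbre
    z: front-to-back position, 1.0 = fully protruded -> louder
    The normalization to [0, 1] is an assumption of this sketch.
    """
    def clamp(t: float) -> float:
        return max(0.0, min(1.0, t))

    return {"pitch": clamp(x), "volume": clamp(z), "timbre": clamp(y)}
```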

(Embodiment 8)
A nose position detection unit may be added to the image analysis unit 3 of FIG. 3 to extract the position of the tip of the performer's nose from the video signal as position information, and the pitch of the musical tone signal, or both its pitch and dynamics, may be controlled by that position information. Two-dimensional position information for the performer's nose, in the vertical and horizontal directions, is extracted, and each component is used to control the musical tone.

In Embodiment 2, as shown in FIG. 4(B), an image of the tip of the nose, or a point N indicating the tip, is displayed on the musical tone template instead of the lip image. The center of this template corresponds to the middle pitch of the pitch range, which spans one or two octaves, and to a middle dynamic marking, mp or mf. The leftmost position may be a silent state, off. In the example of FIG. 4(B), an X mark is also displayed at the point N recognized as the tip of the nose. The center of the template lies on the boundary between fa and so and on the boundary between p and mp.
For simplicity, the template of FIG. 4(B) uses the C major scale and omits the chromatic notes carrying sharps or flats. All chromatic notes may be assigned by increasing the number of rows in the template.

Following the same idea as the lip-size normalization in Embodiment 3, the performer places the tip of the nose image on the musical tone template to associate nose positions with attributes of the generated tone. First, the performer 1, on a bed or in a wheelchair, holds the face in its normal position, and the image analysis unit 3 sets the coordinates so that the nose position N0 at that moment becomes the center of the template of FIG. 4(B). Next, the performer turns the face slightly upward to move the nose up, and this position Nh is set as the position corresponding to the upper so, the highest note. Then the performer turns the face slightly to the right of the normal position to move the nose right, and this position Nf is set as the position corresponding to the maximum volume, the dynamic marking ff. These settings complete the pitch/dynamics range setting mode. The operation unit 6 may be provided with buttons for these setting operations.
After leaving the range setting mode, the performer 1 can play music by moving the face slightly up, down, left, and right to control pitch and dynamics. When the performance ends, the performer can stop the sound by turning the face slightly to the left. In the example of FIG. 4(B), the lower do is sounded at mf.
Alternatively, the horizontal coordinate may be assigned to pitch and the vertical coordinate to the dynamic markings.
The lips need not move. For a performer whose mouth stays open and who has difficulty controlling the degree of opening precisely, this embodiment makes playing easier.
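The three-point calibration with N0, Nh, and Nf could be sketched as follows. The row and column lists are hypothetical: they were chosen only to satisfy the constraints stated in the text (center on the fa/so and p/mp boundaries, upper so at the top, off at the far left, ff at the far right) and do not reproduce FIG. 4(B). The names `Calibration` and `locate` are likewise assumptions, and the sketch presumes that Nh differs vertically and Nf horizontally from N0.

```python
from dataclasses import dataclass

# Hypothetical template rows (MIDI notes, F3 .. G5 in C major) and columns.
ROWS = [53, 55, 57, 59, 60, 62, 64, 65, 67, 69, 71, 72, 74, 76, 77, 79]
COLS = ["off", "ppp", "pp", "p", "mp", "mf", "f", "ff"]


@dataclass
class Calibration:
    n0: tuple  # (x, y) of the nose in the normal position -> template center
    nh: tuple  # (x, y) with the face turned up            -> highest note
    nf: tuple  # (x, y) with the face turned right         -> ff


def _cell(t: float, count: int) -> int:
    """Convert a normalized offset in [-1, 1] to a row or column index."""
    t = max(-1.0, min(1.0, t))
    return min(int((t + 1.0) / 2.0 * count), count - 1)


def locate(cal: Calibration, nose: tuple):
    """Return (MIDI note, dynamic marking) for the current nose position."""
    v = (cal.n0[1] - nose[1]) / (cal.n0[1] - cal.nh[1])  # +1 at Nh
    u = (nose[0] - cal.n0[0]) / (cal.nf[0] - cal.n0[0])  # +1 at Nf
    return ROWS[_cell(v, len(ROWS))], COLS[_cell(u, len(COLS))]
```

Because both offsets are expressed as ratios to the calibrated displacements, the same code works whether or not the camera image is mirrored. The lower and left extremes are taken to be symmetric with Nh and Nf, which is a simplification of this sketch.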

The W and H data of the lips, the opening area, and similar quantities may also be detected and used for other kinds of tone control. Dynamics may be controlled by the degree of mouth opening rather than by the nose position. In that case the nose position is used only in the horizontal direction, for pitch control, and the performer opens the mouth somewhat wider to produce a louder sound. Face orientation and mouth size are easy to control independently of each other, and the resulting movements agree fairly well with a person's musical intuition. The performer can normally keep the mouth closed and open it only to sound a tone.
Moving the nose instantly from one note position to another is generally difficult, so the pitch tends to change continuously, as in portamento. This is a form of musical expression in its own right. When portamento is not wanted, the performer can easily close the mouth to silence the sound before producing the next note.

When the performer opens the mouth and closes it again immediately, the sound may decay gradually instead of stopping at once. This allows percussive sounds such as piano or guitar to be played. The speed at which the mouth opens or closes may control the attack, the tension, or the timbre of the sound. If the performer closes the mouth, moves the nose to select a new note while the previous sound is still decaying, and then opens the mouth to sound the next note, a chord can be produced, if only briefly. Two-voice or polyphonic music can be played in this pseudo fashion.
Turning the face left or right changes the lip image and so affects the volume; the performer may therefore play by shifting the head left and right while continuing to face forward. When the face does turn left or right, the angle may be detected by image analysis and used for tone control. Conversely, the angle data may be used to correct the distortion of the lip image.
The nose position data need not be assigned discretely to semitone steps; a continuous pitch may be generated, as in Embodiment 6.
Instead of setting the pitch and dynamics ranges on the musical tone template from the nose position as described above, the ranges may be determined by placing the imaging unit 2 at a suitable position, front to back and left to right, in front of the performer 1.
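The gradual decay and the resulting brief overlap of notes can be illustrated with a small sketch. The specification says only that the sound decays gradually; the exponential envelope, the time constant, the audibility threshold, and the function names are assumptions made for this example.

```python
import math


def level(gate_off_time: float, now: float,
          peak: float = 1.0, decay_s: float = 0.5) -> float:
    """Amplitude of a note whose gate closed (mouth shut) at gate_off_time.

    Exponential decay is an assumed envelope shape.
    """
    if now < gate_off_time:
        return peak
    return peak * math.exp(-(now - gate_off_time) / decay_s)


def sounding(notes, now: float, threshold: float = 0.05):
    """Return the notes still audible at time `now`.

    notes: list of (midi_note, gate_off_time) pairs.
    """
    return [n for n, off in notes if level(off, now) >= threshold]
```

For example, with notes released at 1.0 s and 1.2 s, both are still audible at 1.3 s and so sound together as a brief chord, while neither remains at 3.0 s.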

(Other embodiments and supplementary notes)
In each of the above embodiments, when the image analysis and recognition performance of the image analysis unit 3 is insufficient, or when the lips, tongue, or nose tip of the performer 1 have a shape that is difficult to recognize, small marker labels may be attached, or marks applied, at the top, bottom, left, and right of the lips or at the tip of the tongue or nose to make analysis and recognition easier, although this is not especially desirable. The marker may be a substance that responds to a specific wavelength but is not noticeable to the ordinary eye.

The display unit 5 has been described as displaying an image of the lips, tongue, nose, or part of the face extracted from the video information. Instead of the actual video, a graphic indicating the size, shape, or position of that part may be drawn from the image of the performer 1's face and displayed together with the musical tone template. The size, shape, or position of the graphic is then associated with the attributes of the generated tone by an operation that places a predetermined portion of the graphic on the template. The musical tone template is not limited to those shown in FIGS. 4(A) and 4(B).
The tone control signals that the image analysis unit 3 of FIG. 3 outputs to the musical tone synthesis unit 4 may be supplied in the form of MIDI signals, which are well known in the field of electronic musical instruments. In that case the tone control signals are combined into a single signal, so the terminals T1 to T5 of the musical tone synthesis unit 4 are consolidated into one.
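By way of illustration, the control signals could be packed into standard MIDI channel messages as sketched below. The status bytes (0x90 for note-on, 0x80 for note-off, 0xE0 for pitch bend) follow the MIDI 1.0 format; which feature is carried by which message, and the function names, are assumptions of this example.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a MIDI note-on message (status 0x90 | channel)."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


def note_off(channel: int, note: int) -> bytes:
    """Build a MIDI note-off message (status 0x80 | channel)."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, 0])


def pitch_bend(channel: int, bend: float) -> bytes:
    """Build a pitch-bend message from bend in [-1, 1].

    The 14-bit value is centered at 8192 and sent least significant
    7 bits first, as in the MIDI 1.0 format.
    """
    bend = max(-1.0, min(1.0, bend))
    value = int(round(8192 + bend * 8191))
    return bytes([0xE0 | (channel & 0x0F), value & 0x7F, (value >> 7) & 0x7F])
```

Pitch and dynamics then travel as the note number and velocity of a single byte stream, which is why one input terminal suffices.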

The musical instrument of the present invention can be built on a notebook computer by connecting a camera to it and implementing the operation of the image analysis unit 3 in each of the above embodiments as a computer program. The tone synthesis function provided on the notebook computer can serve as the musical tone synthesis unit 4, and displays such as those of FIGS. 4(A) and 4(B) can be shown on its liquid crystal display. Mobile phones also often include a camera and a tone synthesis function, so a mobile phone can be used as the musical instrument of the present invention by installing the function of the image analysis unit 3 on it as software.

That is, the processing of the musical instrument of the present invention in all of the above embodiments may be realized in software. The software may be distributed by download or similar means, or recorded and distributed on a recording medium such as a CD-ROM. Such software is a program that causes a computer to execute an imaging instruction step of instructing that the performer's face or a part of it be photographed, an image analysis step of analyzing the video signal resulting from that instruction and extracting a feature amount, and a musical tone synthesis step of generating a musical tone signal based on the feature amount. The program may further cause the computer to execute a step of displaying, together with a musical tone template, an image of the lips, tongue, nose, or part of the face extracted from the video signal, or a graphic indicating the size, shape, or position of the image. The program may further cause the computer to execute a step of accepting an operation that places a predetermined portion of the image or graphic on the musical tone template and a step of associating, upon accepting the operation, the size or shape of the image or graphic, or the position of the graphic, with an attribute of the generated tone.
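The three program steps can be arranged as a simple processing loop, sketched below. The function name `run_instrument` and its parameters are hypothetical; the frame source, the analysis function, and the synthesis function are passed in as placeholders rather than tied to any particular camera or synthesizer library.

```python
from typing import Any, Callable, Iterable, Iterator


def run_instrument(frames: Iterable[Any],
                   analyze: Callable[[Any], Any],
                   synthesize: Callable[[Any], Any]) -> Iterator[Any]:
    """Run the capture -> analysis -> synthesis steps for each video frame.

    frames:     frames delivered in response to the imaging instruction
    analyze:    image analysis step, returns a feature amount
    synthesize: tone synthesis step, returns a tone control value
    """
    for frame in frames:
        feature = analyze(frame)
        yield synthesize(feature)


# Usage with stand-in data: each "frame" already carries a normalized
# mouth-opening height, and synthesis maps it to a MIDI velocity.
fake_frames = [{"h": 0.2}, {"h": 0.8}]
velocities = list(run_instrument(fake_frames,
                                 lambda f: f["h"],
                                 lambda h: round(h * 127)))
```

Keeping the steps as separate callables mirrors the separation between the image analysis unit 3 and the musical tone synthesis unit 4.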

The musical instrument according to the present invention is useful for enabling people with quadriplegia or limb disabilities to enjoy playing music. It can be implemented on a variety of information devices, and it can also be used as an instrument by able-bodied people.

FIG. 1 is a block diagram of an embodiment of the musical instrument of the present invention.
FIG. 2 is a diagram explaining the feature amounts of the mouth of a performer of the musical instrument of the present invention.
FIG. 3 is a block diagram of the main part of an embodiment of the musical instrument of the present invention.
FIG. 4 is a diagram showing an example of the display of an embodiment of the musical instrument of the present invention.

Explanation of symbols

1 Performer
2 Imaging unit
3 Image analysis unit
4 Musical tone synthesis unit
5 Display unit
6 Operation unit
31 H detection unit
32 W detection unit
33 Tongue position detection unit
34 Right eye detection unit
35 Left eye detection unit
T1 to T5 Input terminals

Claims

A musical instrument comprising:
an imaging unit that photographs a performer's face or a part thereof;
an image analysis unit that analyzes a video signal output from the imaging unit and extracts a feature amount; and
a musical tone synthesis unit that generates a musical tone signal based on the feature amount.

The musical instrument according to claim 1, wherein
the image analysis unit extracts the performer's lips from the video signal and extracts height information and width information of the opening of the lips as the feature amount, and
the musical tone synthesis unit controls generation of the musical tone signal based on the height information and the width information.

The musical instrument according to claim 2, wherein
the musical tone synthesis unit controls one of the pitch and the volume of the musical tone signal based on the height information and controls the other based on the width information.

The musical instrument according to claim 1, wherein
the image analysis unit extracts position information of the performer's nose from the video signal as the feature amount, and
the musical tone synthesis unit controls generation of the musical tone signal based on the position information of the nose.

The musical instrument according to claim 4, wherein
the image analysis unit extracts, from the video signal, two-dimensional position information of the performer's nose in the vertical and horizontal directions as the feature amount, and
the musical tone synthesis unit controls one of the pitch and the volume of the musical tone signal based on the position information in the vertical direction and controls the other based on the position information in the horizontal direction.

The musical instrument according to claim 1, wherein
the image analysis unit extracts position information of the performer's tongue from the video signal as the feature amount, and
the musical tone synthesis unit controls generation of the musical tone signal based on the position information of the tongue.

The musical instrument according to claim 6, wherein
the image analysis unit extracts, from the video signal, two-dimensional position information of the performer's tongue in the vertical and horizontal directions as the feature amount, and
the musical tone synthesis unit controls one of the pitch and the volume of the musical tone signal based on the position information in the vertical direction and controls the other based on the position information in the horizontal direction.

The musical instrument according to claim 6, wherein
the image analysis unit extracts three-dimensional position information of the performer's tongue in the vertical, horizontal, and depth directions, and
the musical tone synthesis unit controls any one of the pitch, the volume, and the timbre of the musical tone signal based on any one of the pieces of position information.

The musical instrument according to claim 1, wherein
the image analysis unit extracts, from the video signal, opening/closing pattern information of the performer's eyelids or lips as the feature amount, and
the musical tone synthesis unit controls an attribute of the musical tone signal or an operation of the musical instrument based on the opening/closing pattern information.

The musical instrument according to claim 1, further comprising a display unit,
wherein an image of the lips, tongue, nose, or part of the face extracted from the video signal, or a graphic indicating the size, shape, or position of the image, is displayed on the display unit together with a musical tone template.

The musical instrument according to claim 10, wherein an operation of placing a predetermined portion of the image or the graphic on the musical tone template is accepted and, upon acceptance of the operation, the size or shape of the image or the graphic, or the position of the graphic, is associated with an attribute of the musical tone to be generated.

A musical tone control method comprising:
an imaging step of photographing a performer's face or a part thereof;
an image analysis step of analyzing a video signal output in the imaging step and extracting a feature amount from the face or its expression; and
a musical tone synthesis step of generating a musical tone signal based on the feature amount.

The musical tone control method according to claim 12, further comprising a step of displaying, on a display unit together with a musical tone template, an image of the lips, tongue, nose, or part of the face extracted from the video signal, or a graphic indicating the size, shape, or position of the image.

The musical tone control method according to claim 13, further comprising a step of accepting an operation of placing a predetermined portion of the image or the graphic on the musical tone template, and a step of associating, upon acceptance of the operation, the size or shape of the image or the graphic, or the position of the graphic, with an attribute of the musical tone to be generated.

A program for causing a computer to execute:
an imaging instruction step of instructing that a performer's face or a part thereof be photographed;
an image analysis step of analyzing a video signal that is the imaging result for the instruction in the imaging instruction step and extracting a feature amount; and
a musical tone synthesis step of generating a musical tone signal based on the feature amount.

The program according to claim 15, further causing the computer to execute a step of displaying, together with a musical tone template, an image of the lips, tongue, nose, or part of the face extracted from the video signal, or a graphic indicating the size, shape, or position of the image.

The program according to claim 16, further causing the computer to execute a step of accepting an operation of placing a predetermined portion of the image or the graphic on the musical tone template, and a step of associating, upon acceptance of the operation, the size or shape of the image or the graphic, or the position of the graphic, with an attribute of the musical tone to be generated.