JP7452855B2

JP7452855B2 - Swallowing movement prediction method and system using time-series data prediction

Info

Publication number: JP7452855B2
Application number: JP2020131221A
Authority: JP
Inventors: 誠佐々木; 宇曦劉
Original assignee: Iwate University
Current assignee: Iwate University
Priority date: 2020-07-31
Filing date: 2020-07-31
Publication date: 2024-03-19
Anticipated expiration: 2040-07-31
Also published as: JP2022027304A

Description

The present invention relates to an eating and swallowing function evaluation and training method and system that detect biosignals related to eating and swallowing movements in the anterior neck and its surroundings during eating and swallowing, extract feature quantities from the detected biosignals, and predict the movements of the swallowing organs, such as the oral cavity, pharynx, larynx, and esophagus, as well as the movement of the swallowed material (bolus), in order to evaluate and train the eating and swallowing function.

When swallowing function declines because of cerebrovascular disease, neuromuscular disease, age-related muscle weakness, or similar causes, the risk of pharyngeal residue of the bolus, laryngeal penetration, aspiration, and suffocation increases, as shown in FIG. 4. This results from declines in the motor and sensory functions of the swallowing organs, such as descent of the hyoid bone and larynx, the accompanying reduction in the elevation and anterior displacement of the hyoid bone and larynx, delayed laryngeal elevation due to reduced elevation speed, delayed triggering of the swallowing reflex, and mistimed laryngeal closure. Medical institutions therefore evaluate swallowing function by assessing the movements of the various swallowing organs involved, such as the tongue, hyoid bone, larynx, epiglottis, and esophageal entrance, together with the movement of the bolus.

For example, the gold standard for detailed examination of swallowing function is the videofluoroscopic examination of swallowing (VF). As shown in FIGS. 5(a) to 5(f), in VF the subject swallows a bolus containing a contrast agent under X-ray fluoroscopy, and the movements of the swallowing organs and of the bolus are evaluated during the preparatory/oral, pharyngeal, and esophageal phases. In the preparatory/oral phase, VF can evaluate the process and function of bolus formation by mastication, the transfer of the bolus to the pharynx by tongue movement, and oral residue. In the pharyngeal phase, it allows detailed evaluation of the triggering of the swallowing reflex as the bolus is transferred, elevation of the hyoid bone and larynx, velopharyngeal closure, laryngeal closure accompanying epiglottic inversion, opening of the esophageal entrance, pharyngeal residue of the bolus, laryngeal penetration, and aspiration, all while observing the movements of swallowing organs such as the hyoid bone, epiglottis, and esophageal entrance together with the movement of the bolus. On the other hand, VF carries risks such as radiation exposure and aspiration of the contrast agent, so the number, duration, and frequency of examinations, the examination site, and the examination conditions are restricted.

Recently, videoendoscopic evaluation of swallowing (VE) has also come into wide use as a way to observe the movements of the swallowing organs and the bolus at the bedside or at home. VE has the advantage that the state of the bolus and pharyngeal residue can be evaluated without the risk of radiation exposure. However, because the endoscope is inserted through the nasal cavity, it carries a risk of mucosal injury and pain, and the swallowing observed is not necessarily natural. In addition, the moment of swallowing in the pharyngeal phase and the movements in the preparatory/oral and esophageal phases cannot be observed.

Image processing is used to evaluate the movements of the swallowing organs and the bolus quantitatively, and the hyoid bone is one of the swallowing organs that most often receives attention. The hyoid bone is highly unusual: it is the only bone in the human body that does not articulate with an adjacent bone or cartilage and is instead suspended. As shown in FIG. 1, the hyoid bone lies between the tongue and the larynx, and many muscles involved in swallowing attach to it. Examples include the digastric, stylohyoid, mylohyoid, geniohyoid, sternohyoid, thyrohyoid, omohyoid, and pharyngohyoid muscles and the middle pharyngeal constrictor. The coordinated activity of these muscle groups produces skilled movements such as mastication, swallowing, and vocalization.

As shown in FIG. 2, the hyoid bone is known to trace a roughly triangular trajectory during swallowing. First, it begins to rise relatively slowly, often with a slight backward movement (1: elevation with retraction). Second, it rises substantially while moving rapidly forward (2: elevation with advancement). Third, after pausing at the position of maximum elevation and maximum advancement, it moves backward and downward to return to its original position (3: descent with retraction). The suprahyoid and infrahyoid muscle groups are heavily involved in these movements. As shown in FIG. 3, during this sequence the suprahyoid muscle group contracts together with the pharyngeal and tongue muscles, moving the hyoid bone upward and forward. The larynx then rises, following the hyoid bone, through contraction of the infrahyoid muscle group, while the cricopharyngeal muscle relaxes and then contracts in succession so that the bolus passes through the esophageal entrance. The timing and duration of hyoid elevation are also closely related to the timing and duration of laryngeal closure, which prevents the bolus from entering the larynx.

The technique disclosed in Patent Document 1 is a known technique for evaluating eating and swallowing function that focuses on the muscle activity of the suprahyoid and infrahyoid muscle groups. It detects biosignals from the start to the end of eating and swallowing, extracts feature quantities from the detected biosignals, and uses machine learning to identify the eating and swallowing movement from those feature quantities in order to evaluate eating and swallowing function. The biosignals used are a suprahyoid muscle group biosignal arising from the muscle activity of the suprahyoid muscle group and an infrahyoid muscle group biosignal arising from the muscle activity of the infrahyoid muscle group, and the feature quantities are extracted from both. However, Patent Document 1 provides a swallowing function evaluation method and device that can discriminate differences in swallowing state, such as the strength of voluntary swallowing, the volume swallowed at one time, and the physical properties of the food or bolus (hardness, viscosity, temperature, liquid or solid, and so on), as well as the presence and type of aspiration (overt aspiration, silent aspiration, aspiration before, during, or after swallowing, and so on) and the risk (laryngeal inflow and so on). It cannot directly predict, as time-series data, the movements of the swallowing organs and the bolus that can be observed with VF or VE. If those movements could be predicted, noninvasive and simple eating and swallowing function evaluation and training usable at the bedside or at home could be realized.

Patent Document 1: JP2019-208629A (Japanese Unexamined Patent Application Publication No. 2019-208629)

In view of the above, an object of the present invention is to provide an eating and swallowing function evaluation and training technique that is noninvasive and low-risk and that can easily predict the movements of the swallowing organs and the bolus, even at the bedside or in home medical care.

[1] The invention is characterized by comprising:
a swallowing imaging step of imaging the movement of a bolus ingested by a subject and the movements of the swallowing organs;
a preprocessing step of acquiring the positions of the swallowing organs and the bolus from the images obtained in the swallowing imaging step, quantifying them as coordinates, and creating teacher signals for the movements of the swallowing organs and the bolus;
a biosignal detection step of detecting, in synchronization with the swallowing imaging step, biosignals during eating and swallowing with a sensor unit placed on a predetermined skin surface of the subject;
a feature extraction step of extracting feature quantities from the biosignals in an analysis unit;
a learning step of learning the movements of the swallowing organs and the bolus from the teacher signals and the feature quantities, using a time-series data prediction technique that includes RNN (Recurrent Neural Network) and its derivatives LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit), and LSTNet (Long- and Short-term Time-series Network), as well as the AR (Autoregressive) model and its derivatives, the ARMA (Autoregressive Moving Average), ARIMA (Autoregressive Integrated Moving Average), and SARIMA (Seasonal AutoRegressive Integrated Moving Average) models, thereby generating a model capable of predicting the movement of at least one of the swallowing organs and the bolus from the feature quantities; and
a prediction step of predicting the movement of at least one of the swallowing organs and the bolus from the feature quantities, using the prediction model generated in the learning step.

This configuration comprises a swallowing imaging step, a preprocessing step, a biosignal detection step synchronized with the swallowing imaging step, a feature extraction step, a learning step, a prediction step, and an evaluation/training step. In the swallowing imaging step, if the swallowed material that is mixed with, or surface-coated with, a contrast agent is spherical, the movement of the bolus is easier to quantify by image processing. The imaging may be video or still images, and the image type does not matter, as long as continuous changes in the swallowing process can be confirmed. In the learning and prediction steps, when LSTM is used as the time-series data prediction technique, for example, the movements of the swallowing organs, including the hyoid bone, and of the bolus are learned from the teacher signals and the feature quantities using long short-term memory, a recurrent neural network architecture that learns both long-term and short-term time dependencies. LSTM is a recurrent neural network architecture used in the field of deep learning. It solves the problem that a conventional RNN cannot learn long-term time dependencies during training, so that both long-term and short-term time dependencies can be learned. It adapts to new patterns when new inputs and outputs arrive during learning, which makes it possible to deal with the input weight conflicts and output weight conflicts that occur in RNNs. Accordingly, the subject first undergoes at least one examination, such as a videofluoroscopic (VF) examination or a videoendoscopic (VE) examination, while biosignals are acquired with the sensor unit at the same time, so that the characteristics of the movement of at least one of the swallowing organs and the bolus during that subject's swallowing are learned. From the second time onward, the movement of at least one of the swallowing organs and the bolus can be predicted without a VF examination or the like, solely from the feature quantities of biosignals newly detected after the learning step. As a result, the X-ray fluoroscope required for a VF examination becomes unnecessary, and noninvasive, low-risk eating and swallowing function evaluation and training that predicts the movements of the swallowing organs and the bolus can be performed easily, even at the bedside or in home medical care. If the same amount of material with the same physical properties is swallowed in the same way, a single swallow yields adequate learning data. To obtain more suitable learning data, swallowing data acquired with different amounts or physical properties may be added to the learning data.
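As an illustration of the LSTM computation referred to above, the following is a minimal sketch of a forward pass over a sequence of feature vectors, written in plain Python. It is not the implementation of the invention: the function names (`lstm_step`, `run_lstm`), the hidden size, and the random, untrained weights are assumptions chosen for illustration. The output layer that would map the hidden state to predicted coordinates, and the training against the teacher signals, are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    params maps a gate name to (W, b); W acts on the concatenation [h_prev, x].
    """
    z = h_prev + x  # list concatenation of previous hidden state and input

    def gate(name, activation):
        W, b = params[name]
        return [activation(a + bias) for a, bias in zip(matvec(W, z), b)]

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much of the candidate to admit
    g = gate("candidate", math.tanh)  # candidate values for the cell state
    o = gate("output", sigmoid)       # how much of the cell state to expose

    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def init_params(n_in, n_hidden, seed=0):
    """Random, untrained weights (hypothetical; for illustration only)."""
    rng = random.Random(seed)

    def layer():
        W = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + n_in)]
             for _ in range(n_hidden)]
        return W, [0.0] * n_hidden

    return {name: layer() for name in ("forget", "input", "candidate", "output")}


def run_lstm(features, n_hidden=4, seed=0):
    """Run the cell over a sequence of feature vectors; return all hidden states."""
    params = init_params(len(features[0]), n_hidden, seed)
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    states = []
    for x in features:
        h, c = lstm_step(x, h, c, params)
        states.append(h)
    return states
```

The forget gate scales the previous cell state and the input gate admits new candidate values. This gating is what allows the cell state to carry information over long intervals while still responding to short-term changes in the input.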

[2] Preferably, in the biosignal detection step, a suprahyoid muscle group biosignal is detected with a suprahyoid muscle group myoelectric sensor placed over the suprahyoid muscle group, an infrahyoid muscle group biosignal is detected with an infrahyoid muscle group myoelectric sensor placed over the infrahyoid muscle group, and a laryngeal behavior signal is detected with a laryngeal behavior sensor placed over the larynx, and
in the feature extraction step, feature quantities are extracted from the suprahyoid muscle group biosignal, the infrahyoid muscle group biosignal, and the laryngeal behavior signal, which serve as the biosignals.

With this configuration, the biosignal detection step detects the suprahyoid muscle group biosignal, the infrahyoid muscle group biosignal, and the laryngeal behavior signal, so the movement of at least one of the swallowing organs and the bolus can be predicted more accurately.
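As one possible illustration of feature extraction from multichannel biosignals, the sketch below computes a sliding-window root mean square (RMS) per channel and stacks the results into one feature vector per time frame. This section does not specify which feature quantities are used; RMS is assumed here only because it appears in the figure descriptions, and the function names, window length, and step are hypothetical.

```python
import math


def sliding_rms(signal, window, step):
    """RMS of each window of `window` samples, advancing by `step` samples."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    values = []
    for start in range(0, len(signal) - window + 1, step):
        frame = signal[start:start + window]
        values.append(math.sqrt(sum(v * v for v in frame) / window))
    return values


def feature_vectors(channels, window, step):
    """Combine per-channel RMS sequences into one feature vector per frame.

    `channels` is a list of equally long sample lists, for example one list
    per myoelectric electrode plus one for the laryngeal behavior signal.
    """
    per_channel = [sliding_rms(ch, window, step) for ch in channels]
    return [list(frame) for frame in zip(*per_channel)]
```

For example, `feature_vectors([[1, 1, 1, 1], [2, 2, 2, 2]], window=2, step=2)` yields two frames, each holding one RMS value per channel.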

[3] Preferably, in the learning step, coordinate data of the swallowing organs and the bolus are used as learning data, and the learning data are augmented into multiple sets by shifting one set of source data at a predetermined period in such a way that identical coordinate data are not included.

With this configuration, the learning step augments one set of source data into multiple sets by shifting it at a predetermined period in such a way that identical coordinate data are not included. The error between predicted and measured values is therefore reduced even with the first single round of learning, and the movement of at least one of the swallowing organs and the bolus can be predicted more accurately.
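One way to read the shift-based augmentation described above is sketched below: a single sequence is split into several subsampled sequences, each starting at a different offset within the period, so that no sample appears in more than one of them. This reading, the function name `shift_augment`, and the example period are assumptions; the actual frame-shift procedure of the embodiment may differ.

```python
def shift_augment(samples, period):
    """Split one sequence into `period` disjoint subsequences.

    Subsequence k takes samples k, k + period, k + 2 * period, and so on,
    so no sample is shared between any two subsequences.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return [samples[offset::period] for offset in range(period)]


# Example: ten coordinate samples augmented into three training sequences.
source = list(range(10))
augmented = shift_augment(source, 3)
# augmented == [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Applying the same offsets to the feature sequence and to the teacher-signal sequence keeps each input frame paired with its corresponding coordinate.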

[4] Preferably, the preprocessing step defines a coordinate system and its origin for obtaining the coordinate data of the swallowing organs and the bolus. The inferior end of the anterior border of the subject's fifth cervical vertebra is taken as the origin, and the superior end of the anterior border of the subject's third cervical vertebra is taken as a point on one axis. With this coordinate system, each position of the swallowing organs and the bolus in the subject's sagittal or frontal plane is obtained as coordinates in that system.

With this configuration, the preprocessing step sets a coordinate system in the subject's sagittal or frontal plane and obtains the XY coordinates of each position of the swallowing organs and the bolus. This makes the predicted movement of at least one of the swallowing organs and the bolus easy to interpret in each direction of the subject, such as anterior-posterior and superior-inferior.
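The coordinate system described above can be sketched as a change of frame from image pixel coordinates to a vertebra-based frame. In this sketch the origin is the inferior end of the anterior border of C5 and the Y axis passes through the superior end of the anterior border of C3. Which axis passes through C3, the sign convention of the X axis, and the function name `make_cervical_frame` are assumptions made for illustration.

```python
import math


def make_cervical_frame(c5_lower, c3_upper):
    """Return a function mapping image points to the C5/C3-based frame.

    c5_lower: (x, y) image position of the origin (C5 anterior-inferior corner).
    c3_upper: (x, y) image position of a point on the Y axis (C3 anterior-superior corner).
    """
    ox, oy = c5_lower
    dx, dy = c3_upper[0] - ox, c3_upper[1] - oy
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("the two landmarks must be distinct")
    uy = (dx / length, dy / length)  # unit vector of the Y axis (C5 toward C3)
    ux = (uy[1], -uy[0])             # perpendicular X axis; sign is an assumption
                                     # and depends on which way the subject faces

    def to_frame(point):
        px, py = point[0] - ox, point[1] - oy
        # Project the offset from the origin onto each axis.
        return (px * ux[0] + py * ux[1], px * uy[0] + py * uy[1])

    return to_frame


# Example with hypothetical pixel positions (image Y grows downward).
to_frame = make_cervical_frame((100.0, 200.0), (100.0, 100.0))
hyoid_xy = to_frame((80.0, 150.0))
```

Because the frame is tied to the vertebrae, the resulting coordinates do not change when the subject's neck is translated or rotated within the image, as long as both landmarks are tracked in each frame.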

[5] Preferably, the system comprises:
a preprocessing unit that acquires the positions of the swallowing organs and the bolus from images of the movement of a bolus ingested by a subject and the movements of the swallowing organs, quantifies them as coordinates, and creates teacher signals for the movements of the swallowing organs and the bolus;
a sensor unit that is placed on a predetermined skin surface of the subject and detects biosignals during eating and swallowing in synchronization with the capture of the images; and
an analysis unit that extracts feature quantities from the biosignals; learns the movement of at least one of the swallowing organs and the bolus from the teacher signals and the feature quantities, using a time-series data prediction technique that includes RNN (Recurrent Neural Network) and its derivatives LSTM (Long Short-Term Memory), GRU (Gated Recurrent Unit), and LSTNet (Long- and Short-term Time-series Network), as well as the AR (Autoregressive) model and its derivatives, the ARMA (Autoregressive Moving Average), ARIMA (Autoregressive Integrated Moving Average), and SARIMA (Seasonal AutoRegressive Integrated Moving Average) models; generates a model capable of predicting the movement of at least one of the swallowing organs and the bolus from the feature quantities; and uses the generated prediction model to predict the movement of at least one of the swallowing organs and the bolus from the feature quantities.

With this configuration, it is possible to provide a swallowing function evaluation and training system using time-series data prediction that is noninvasive and low-risk and that can easily perform eating and swallowing function evaluation and training by predicting the movement of at least one of the swallowing organs and the bolus, even at the bedside or in home medical care.

Eating and swallowing function evaluation and training that predicts the movement of at least one of the swallowing organs and the bolus can be performed easily, noninvasively, and with little risk, even at the bedside or in home medical care.

Explanatory diagram showing the suprahyoid and infrahyoid muscle groups.
Explanatory diagram showing the movement of the hyoid bone.
Explanatory diagram showing the mechanism of swallowing, which consists of voluntary movement and the swallowing reflex.
Explanatory diagram showing aspiration risk.
Explanatory diagram showing the information obtained by VF.
Configuration diagram of the swallowing function evaluation and training system using time-series data prediction according to the present invention.
Explanatory diagram showing the sensor unit.
Explanatory diagram showing the laryngeal behavior sensor and the electrode jig.
Explanatory diagram showing the output voltage as a function of stretch ratio.
Explanatory diagram showing the circuit configuration.
Schematic diagram of Σ-Δ AD conversion and explanatory diagram showing the parallelization and synchronization of the AD conversion modules.
Explanatory diagram showing the data processing and transfer circuit.
Explanatory diagram showing the isolation configuration.
Explanatory diagram showing the synchronization microphone and the portable multi-mixer.
Flow diagram showing the swallowing function evaluation and training method using time-series data prediction.
Explanatory diagram showing frame shifting.
Explanatory diagram showing how target objects are set in an X-ray image.
Explanatory diagram showing the distance of the hyoid bone from its starting point (X axis and Y axis).
Explanatory diagram showing the hyoid bone position (white circles) at times A to F on the VF images.
Explanatory diagram showing determination of the motion interval from video.
Explanatory diagram showing determination of the motion interval from biosignals.
Basic diagram of an RNN.
Simplified diagram of the internal structure of an LSTM block.
Explanatory diagram showing the internal structure of an LSTM block.
Explanatory diagram showing the forget gate layer, input gate layer, cell update, and output gate layer.
Explanatory diagram showing the X-ray fluoroscope and the swallowing function evaluation and training system using time-series data prediction.
Explanatory diagram showing the electrode arrangement.
Explanatory diagram showing the sensor unit and a fluoroscopic view with the sensor unit attached to the subject.
Explanatory diagram showing the quantification of hyoid bone movement.
Explanatory diagram showing the learning conditions.
Explanatory diagram showing the creation of learning data and test data.
Explanatory diagram showing data augmentation.
Explanatory diagram showing the relationship between data augmentation and RMSE.
Explanatory diagram showing an example of learning (corresponding to condition 6 in FIG. 31).
Explanatory diagram showing an example of the sEMG signals and laryngeal movement for the first swallow.
Explanatory diagram showing time-series data of the sEMG signals, RMS, and CC of each muscle group together with laryngeal movement and hyoid bone movement.
Explanatory diagram showing, as an example, the results of learning A - prediction C.
Explanatory diagram showing, as an example, the trajectory of the hyoid bone for learning A - prediction C.
Explanatory diagram showing the RMSE between measured and predicted values on the X axis and on the Y axis.
Explanatory diagram showing the correlation coefficient between measured and predicted values in the X-axis direction and in the Y-axis direction.
Explanatory diagram showing, as an example, the results of learning A - prediction C.
Explanatory diagram showing, as an example, the trajectory of the leading edge of the bolus for learning A - prediction C.

An embodiment of the present invention is described below with reference to the accompanying drawings, taking prediction of hyoid bone movement as an example. The drawings show the general configuration of the eating and swallowing function evaluation system conceptually (schematically).

First, the overall configuration of the eating and swallowing function evaluation system 10 according to an embodiment of the present invention is described.
As shown in FIGS. 6 to 8, 26, and 27, the eating and swallowing function evaluation system 10 comprises a sensor unit 20 that detects biosignals during eating and swallowing; a multifunctional myoelectric potential measuring device 30 that amplifies the detected biosignals and transmits them to a PC; an analysis unit 40 that predicts, from the biosignals, the movement of at least one of the swallowing organs (including the hyoid bone) and the bolus, for use in evaluating and training swallowing function; a recording unit (not shown) that records the evaluation and training results; a display unit 41 that displays the evaluation and training results; and a battery (not shown) that supplies power to these components.

The eating and swallowing function evaluation system 10 also comprises an X-ray fluoroscope 50 that images, by fluoroscopy, the movement of the bolus ingested by the subject 60 and the movements of the swallowing organs to obtain a videofluoroscopic eating and swallowing video, and a preprocessing unit 51 that acquires the positions of the swallowing organs and the bolus from that video, quantifies the movement of at least one of the swallowing organs and the bolus as coordinates, and creates a teacher signal for each movement. The X-ray fluoroscope 50 is used only when learning data are first obtained, that is, when the subject 60 undergoes the VF examination while biosignals are detected with the sensor unit 20 at the same time.

次にセンサ部２０について説明する。
センサ部２０は、舌骨上筋群部分に配置され舌骨上筋群の筋活動による舌骨上筋群生体信号を検出する舌骨上筋群用筋電センサ２１と、舌骨下筋群部分に配置され舌骨下筋群の筋活動による舌骨下筋群生体信号を検出する舌骨下筋群用筋電センサ２２と、喉頭部分に配置され喉頭の挙上による喉頭挙動信号を検出する喉頭挙動センサ２５とを備えている。センサ部２０は、被験６０者の所定の皮膚表面に配置され、摂食嚥下造影動画の取得に同期させて、摂食嚥下時における生体信号を検出するものである。 Next, the sensor section 20 will be explained.
The sensor unit 20 includes a myoelectric sensor 21 for the suprahyoid muscle group, which is placed over the suprahyoid muscle group and detects the suprahyoid biosignal produced by its muscle activity; a myoelectric sensor 22 for the infrahyoid muscle group, which is placed over the infrahyoid muscle group and detects the infrahyoid biosignal produced by its muscle activity; and a laryngeal behavior sensor 25, which is placed over the larynx and detects a laryngeal behavior signal produced by elevation of the larynx. The sensor unit 20 is placed on predetermined areas of the skin surface of the subject 60 and detects biological signals during eating and swallowing in synchronization with acquisition of the eating and swallowing contrast video.

舌骨上筋群用筋電センサ２１は、多チャンネルの電極２１aが整列したアレイ状電極が用いられている。舌骨下筋群用筋電センサ２２は、多チャンネルの電極２２aが整列したアレイ状電極が用いられている。 The myoelectric sensor 21 for the suprahyoid muscle group uses an array electrode in which multi-channel electrodes 21a are arranged in rows. The myoelectric sensor 22 for the infrahyoid muscle group likewise uses an array electrode in which multi-channel electrodes 22a are arranged in rows.

多チャンネルの電極２１a、２２aは多機能筋電位計測装置３０に接続して使用する。舌骨上筋群用筋電センサ２１は後頭部に干渉しないように、かつ下顎底部奥に存在する茎突舌骨筋部分も計測できるような形状である。舌骨下筋群用筋電センサ２２は喉頭隆起の動きに干渉せず計測できるような形状である。 The multi-channel electrodes 21a and 22a are used by connecting them to the multifunctional myoelectric potential measuring device 30. The myoelectric sensor 21 for the suprahyoid muscle group is shaped so as not to interfere with the back of the head while still being able to measure the stylohyoid muscle, which lies deep at the base of the mandible. The myoelectric sensor 22 for the infrahyoid muscle group is shaped so that it can measure without interfering with the movement of the laryngeal prominence.

基板自体の厚さは０．３mmであり、基板保護のために全体をシリコンで覆い、シリコン上に埋め込んだ銀電極を介して筋肉の表面筋電位信号（surface Electromyography、以下sEMG信号という）を抽出する。 The substrate itself is 0.3 mm thick. To protect it, the entire substrate is covered with silicone, and the surface electromyography signals of the muscles (hereinafter referred to as sEMG signals) are picked up via silver electrodes embedded in the silicone.

銀電極は直径２mm、高さ２．５mmであり、舌骨上筋群用の電極２１aは縦８mm、横１１．５mm間隔で埋め込み、下顎底部全体を覆うように２２個配置した。舌骨下筋群用の電極２２aは縦８mm、横８mm間隔で埋め込み、頸部前面を覆うように２２個配置した．また、GND電極２３aとバイポーラ電極の基準電極２３bを左右の耳朶に、RLD電極２４を第7頸椎棘突起にそれぞれ配置した．計測の際は接触抵抗を抑えるために電極部分にペースト（Elefix、日本光電）を塗布した多チャンネルの電極２１a、２２aを被験者にとりつける。得られた信号は多機能筋電位計測装置３０に送られる。本発明において、周波数帯域は２０～４０００Hz、ゲインは１２５倍である。 The silver electrodes had a diameter of 2 mm and a height of 2.5 mm, and the electrodes 21a for the suprahyoid muscle group were embedded at intervals of 8 mm in length and 11.5 mm in width, and 22 electrodes were arranged to cover the entire bottom of the mandible. Electrodes 22a for the infrahyoid muscle group were implanted at intervals of 8 mm vertically and 8 mm horizontally, and 22 electrodes were placed so as to cover the front surface of the neck. In addition, a GND electrode 23a and a bipolar reference electrode 23b were placed on the left and right earlobes, and an RLD electrode 24 was placed on the spinous process of the 7th cervical vertebra. During measurement, multi-channel electrodes 21a and 22a whose electrodes are coated with paste (Elefix, Nihon Kohden) are attached to the subject in order to suppress contact resistance. The obtained signal is sent to the multifunctional myoelectric potential measuring device 30. In the present invention, the frequency band is 20 to 4000 Hz and the gain is 125 times.

また、本発明では舌骨が挙上する際、それに追従するかたちで喉頭も挙上するため、喉頭隆起の位置変化を記録するために、図８の（a）に示す喉頭挙動センサ（伸縮性ひずみセンサ）２５を用いた。本発明で用いたのは喉頭挙動センサC-STRETCH（登録商標）（F51FS01、バンドー化学株式会社）である。本センサは、エラストマーフィルムと保護膜で構成されている誘電容量式のひずみセンサで、電源電圧を入力することで、センサの伸びに応じたアナログ電圧を出力する。センサ伸縮部は長さ５０mm、幅５mmである。センサ伸縮部の伸縮レンジは０～１００％であり、伸縮の変位に対応する出力電圧は図９に示す値になる。 In addition, because the larynx rises following the hyoid bone when the hyoid bone is elevated, the laryngeal behavior sensor (stretchable strain sensor) 25 shown in FIG. 8(a) was used to record changes in the position of the laryngeal prominence. The laryngeal behavior sensor used in the present invention is C-STRETCH (registered trademark) (F51FS01, Bando Chemical Co., Ltd.). This sensor is a capacitive strain sensor composed of an elastomer film and a protective film; when supplied with a power supply voltage, it outputs an analog voltage corresponding to its elongation. The stretchable part of the sensor is 50 mm long and 5 mm wide. Its stretch range is 0 to 100%, and the output voltage corresponding to the stretch displacement is as shown in FIG. 9.

また、２２chフレキシブル電極２１a及び２２chフレキシブル電極２２aの装着の際は、テーピングを施したのちに、図８の（b）に示す舌骨上筋群用筋電センサ用治具２７（帽子とバンド）と、図８の（c）に示す舌骨下筋群用筋電センサ用治具２８（バンド）で固定した。 When attaching the 22ch flexible electrode 21a and the 22ch flexible electrode 22a, the electrodes were first taped in place and then fixed with the jig 27 (cap and band) for the suprahyoid myoelectric sensor shown in FIG. 8(b) and the jig 28 (band) for the infrahyoid myoelectric sensor shown in FIG. 8(c).

次に多機能筋電位計測装置３０について説明する。
多機能筋電位計測装置３０は、複数の異なるセンサを同時に利用することを前提に設計された、生体活動をモニタリングするための計測装置である。最大６４チャンネルのセンサを同時にサンプリングすることが可能である。USB２．０（High Speed）インターフェースを介して、計測データを取り込むためのPCと接続される。任意のアプリケーションソフトウェアから装置を制御することも可能である。DC１２（V）のACアダプタまたは外部バッテリ入力電源により作動する。多機能筋電位計測装置の回路構成は、以下に示すように、シグナルコンディショニング部、AD変換部、データ転送部、絶縁部の４つに分けられる。 Next, the multifunctional myoelectric potential measuring device 30 will be explained.
The multifunctional myoelectric potential measurement device 30 is a measurement device for monitoring biological activity that is designed on the premise that a plurality of different sensors are used simultaneously. It is possible to sample up to 64 channels of sensors simultaneously. It is connected to a PC for importing measurement data via a USB 2.0 (High Speed) interface. It is also possible to control the device from any application software. Operates from DC12 (V) AC adapter or external battery input power source. The circuit configuration of the multifunctional myoelectric potential measurement device is divided into four parts: a signal conditioning section, an AD conversion section, a data transfer section, and an insulation section, as shown below.

図１０に示すように、シグナルコンディショニング部は、最大で２個の多チャンネル電極と、４個の汎用筋電位センサ、１６個の任意のアナログセンサが入力可能である。まず、差動増幅回路にて、耳朶に張り付けられた基準電極から得られる信号と、多チャンネル電極の各電極から得られる信号間の同相ノイズを除去して、信号成分の差のみを増幅する。単極誘導計測とも呼ばれる。 As shown in FIG. 10, the signal conditioning section can input up to two multichannel electrodes, four general-purpose myoelectric potential sensors, and 16 arbitrary analog sensors. First, a differential amplifier circuit removes in-phase noise between the signal obtained from the reference electrode attached to the earlobe and the signal obtained from each electrode of the multi-channel electrode, and only the difference in signal components is amplified. Also called unipolar lead measurement.

また、得られた差動信号からDCサーボ回路にて１（Hz）以下の低周波帯域信号を検出して除去する。次に、信号増幅回路PGA（Programmable Gain Amplifier）にて、１２５か１０００倍のいずれかに信号を増幅する。３極のアンチエイリアシングフィルタ回路にて不要な高周波雑音を除去する。これにはAD変換時の帯域折り返しを防止する効果もある。最後にAD変換を駆動するための高速アンプに入力し、出力信号を得る。 Next, a DC servo circuit detects and removes low-frequency components of 1 (Hz) or less from the resulting differential signal. The signal is then amplified by a factor of either 125 or 1000 in a programmable gain amplifier (PGA). A three-pole anti-aliasing filter circuit removes unwanted high-frequency noise, which also prevents aliasing (spectral folding) during AD conversion. Finally, the signal is fed to a high-speed amplifier that drives the AD converter, yielding the output signal.

その他、低周波信号を追加で除去するために、デジタルフィルタによる１次ローカットフィルタ処理を施すことも可能である。遮断周波数は、disable、０．０１、０．１、１．０、１０．０、２０．０（Hz）のいずれかである。 In addition, in order to additionally remove low frequency signals, it is also possible to perform first-order low-cut filter processing using a digital filter. The cutoff frequency is any one of disabled, 0.01, 0.1, 1.0, 10.0, and 20.0 (Hz).
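The optional first-order low-cut stage described above can be illustrated with a short sketch. The device's actual filter implementation is not disclosed in this description, so the code below uses the textbook discretization of an RC high-pass filter purely as an assumption; the function name `low_cut_first_order` and its parameters are hypothetical, and `cutoff_hz=None` stands in for the "disable" setting.

```python
import math

def low_cut_first_order(samples, cutoff_hz, fs_hz):
    """Textbook first-order high-pass (low-cut) filter.

    cutoff_hz=None models the 'disable' setting and returns the input unchanged.
    This is an illustrative RC discretization, not the device's firmware.
    """
    if cutoff_hz is None:
        return list(samples)
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # time constant for the cutoff
    alpha = rc / (rc + 1.0 / fs_hz)          # smoothing factor for this sample rate
    out = []
    for i, x in enumerate(samples):
        if i == 0:
            out.append(float(x))             # start from the first sample
        else:
            # y[i] = alpha * (y[i-1] + x[i] - x[i-1])
            out.append(alpha * (out[-1] + x - samples[i - 1]))
    return out

# A constant (DC) offset decays toward zero with a 20 Hz cutoff at fs = 1 kHz.
dc = [1.0] * 1000
filtered = low_cut_first_order(dc, cutoff_hz=20.0, fs_hz=1000.0)
print(abs(filtered[-1]) < 1e-6)
```

With one of the listed cutoff values (0.01, 0.1, 1.0, 10.0 or 20.0 Hz), a slowly drifting baseline is attenuated while the faster sEMG components pass largely unchanged.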

汎用筋電位センサの信号処理回路は、多チャンネル電極のそれと殆ど同じ構成であるが、体表面に張り付けられた任意の電極２点から得られる信号を差動増幅する点が異なる。双極誘導計測とも呼ばれる。 The signal processing circuit of a general-purpose myoelectric potential sensor has almost the same configuration as that of a multi-channel electrode, but differs in that it differentially amplifies signals obtained from two arbitrary electrodes attached to the body surface. Also called bipolar induction measurement.

汎用アナログセンサの信号処理回路は、様々なセンサを任意に接続できるように、最大で±１５（V）のアナログ信号を入力できる仕様になっている。振幅の大きな信号を入力する場合、多機能筋電位計測装置の計測範囲（±２．５（V））に調整するためにPGAによりゲイン調整を行う。PGAの値は、disable、１／４、１／２、１倍のいずれかである。出力信号は、AD変換を駆動するための高速アンプから得られる。 The signal processing circuit of the general-purpose analog sensor is designed to accept analog signals of up to ±15 (V) so that various sensors can be connected as desired. When inputting a signal with a large amplitude, gain adjustment is performed using PGA to adjust it to the measurement range (±2.5 (V)) of the multifunctional myoelectric potential measuring device. The value of PGA is one of disable, 1/4, 1/2, and 1 times. The output signal is obtained from a high speed amplifier to drive the AD conversion.

図１１に示すように、多機能筋電位計測装置３０が内蔵するAD変換機能は、Σ-Δ変換方式で、１６（bit）の分解能、最大１０（kHz）で全チャンネルの同時サンプリングが可能である。Σ-Δ AD変換の概略図を示す。アナログ信号Vinに対して、サンプリング周波数fs（Hz）×nのオーバーサンプリングとΣ-Δ変調を施すことにより、帯域外の高周波帯域に不要なノイズの周波数スペクトルを移行させ、これをデジタルフィルタにより除去する。最後にfs（Hz）にダウンレートすることで、デジタライズされた出力信号を得る。広く用いられる逐次比較AD変換と比べてSN比を高くとることができ、またアンチエイリアシングフィルタを単純化することができる。 As shown in FIG. 11, the AD conversion function built into the multifunctional myoelectric potential measuring device 30 uses the Σ-Δ conversion method and can sample all channels simultaneously at up to 10 (kHz) with a resolution of 16 (bit). FIG. 11 shows a schematic diagram of Σ-Δ AD conversion. The analog signal Vin is oversampled at n times the sampling frequency fs (Hz) and Σ-Δ modulated, which shifts the spectrum of unwanted noise to a high-frequency band outside the band of interest, where it is removed by a digital filter. Finally, the rate is reduced to fs (Hz) to obtain the digitized output signal. Compared with the widely used successive-approximation AD conversion, this achieves a higher signal-to-noise ratio and allows a simpler anti-aliasing filter.

多機能筋電位計測装置３０のデジタルフィルタは、振幅が平坦で、線形位相の特性を持つ（有効帯域は、サンプリング周波数の１／２）。サンプリング周波数は、１k、１．２５k、２k、２．５k、４k、５k、８k、１０k（Hz）から選択する。 The digital filter of the multifunctional myoelectric potential measuring device 30 has flat amplitude and linear phase characteristics (the effective band is 1/2 of the sampling frequency). The sampling frequency is selected from 1k, 1.25k, 2k, 2.5k, 4k, 5k, 8k, and 10k (Hz).

また、各チャンネルに対応した（６４個の）Σ-Δ AD変換モジュールは、等長配線された同一のクロック源により駆動されるため、各々が同期してAD変換動作を行う。 Furthermore, since the (64) Σ-Δ AD conversion modules corresponding to each channel are driven by the same clock source with wires of equal length, they each perform AD conversion operations in synchronization.

図１２に示すように、データ転送部では、AD変換によりデジタライズされた計測データはUSB２．０（High Speed）インターフェースを介してPCに取り込まれる。これらの処理はDSP（Digital Signal Processor）に書き込まれたファームウェアによって実現される。サンプリング周波数毎に、各AD変換モジュールから転送される計測データは、DMA（Direct Memory Access）によって、DSP内のメモリに転送される。DSPは、デジタルフィルタなどの追加の信号処理を行い、SDRAMで構成されるFIFO（First In First Out）メモリに計測データを保存する。USB送信バッファが空になると、FIFOメモリから対象の計測データを順次読み込み、PC（USBホスト）に送信する。このようにFIFOメモリを、データ処理とUSB転送処理の間に入れることで、抜けを起こさずに全ての計測データを、PCに転送できるようにした。 As shown in FIG. 12, in the data transfer section, the measurement data digitized by AD conversion is taken into the PC via the USB 2.0 (High Speed) interface. These processes are realized by firmware written in a DSP (Digital Signal Processor). Measurement data transferred from each AD conversion module at each sampling frequency is transferred to the memory in the DSP by DMA (Direct Memory Access). The DSP performs additional signal processing, such as digital filtering, and stores measurement data in FIFO (First In First Out) memory, which consists of SDRAM. When the USB transmission buffer becomes empty, the target measurement data is sequentially read from the FIFO memory and sent to the PC (USB host). In this way, by inserting FIFO memory between data processing and USB transfer processing, we have made it possible to transfer all measurement data to the PC without any omissions.
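As a rough plausibility check of the data transfer path, the raw payload rate at the maximum settings stated above (64 channels, 16 bits, 10 kHz) can be compared with the 480 Mbit/s signaling rate of USB 2.0 High Speed. The sketch below also models the role of the FIFO with a simple queue. It assumes no packet or protocol overhead and an unbounded queue, neither of which is stated in the description; all names are hypothetical.

```python
from collections import deque

CHANNELS = 64                       # maximum simultaneously sampled channels
BITS_PER_SAMPLE = 16                # AD resolution
MAX_FS_HZ = 10_000                  # maximum sampling frequency
USB2_HIGH_SPEED_BPS = 480_000_000   # USB 2.0 High Speed signaling rate

def raw_payload_bps(channels=CHANNELS, bits=BITS_PER_SAMPLE, fs_hz=MAX_FS_HZ):
    """Raw measurement payload in bit/s, ignoring any framing overhead (assumption)."""
    return channels * bits * fs_hz

def simulate_fifo(n_frames, usb_ready):
    """Queue one frame per sampling period; drain the queue whenever USB is ready.

    usb_ready is a repeating pattern of booleans (hypothetical bus availability).
    Returns the frames in the order they would reach the PC.
    """
    fifo = deque()
    sent = []
    for t in range(n_frames):
        fifo.append(t)                       # DSP stores the newest frame
        if usb_ready[t % len(usb_ready)]:    # transmit buffer became empty
            while fifo:
                sent.append(fifo.popleft())  # oldest frame goes out first
    sent.extend(fifo)                        # flush what remains at the end
    return sent

print(raw_payload_bps())                            # 10240000 bit/s
print(raw_payload_bps() < USB2_HIGH_SPEED_BPS)
print(simulate_fifo(10, [False, False, True]) == list(range(10)))
```

Under these assumptions the payload is roughly 2% of the bus signaling rate, so the FIFO mainly has to absorb short periods in which the host is not ready rather than a sustained bandwidth shortfall.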

図１３に示すように、絶縁部において、多機能筋電位計測装置３０は、生体活動をモニタリングするための計測装置であるため、安全性についても考慮する必要がある。電極と生体が接触するアナログ部(既述のシグナルコンディショニング部とAD変換部)と、電源やPCへの接続を可能にするデジタル部(既述のデータ転送部)は、電気的に絶縁する仕様とした。アナログ部の駆動電力は、１２（V）入力から絶縁電源回路により生成される。デジタル部とアナログ部のデータ通信は、デジタルアイソレータを介して行われる。 As shown in FIG. 13, the insulation section addresses safety, which must be considered because the multifunctional myoelectric potential measuring device 30 is a measuring device for monitoring biological activity. The analog part (the signal conditioning section and AD conversion section described above), whose electrodes come into contact with the living body, is electrically isolated from the digital part (the data transfer section described above), which provides the connections to the power supply and the PC. Drive power for the analog part is generated from the 12 (V) input by an isolated power supply circuit. Data communication between the digital part and the analog part takes place via a digital isolator.

次にX線透視装置５０について説明する。
図２６に示すように、本発明で用いたVF検査装置はX線透視装置５０（SHIMADZU Corp 、Safire II ZS-100）であり、検査時の電圧の出力状態は７９kVp、電流は２５０mAである。 Next, the X-ray fluoroscope 50 will be explained.
As shown in FIG. 26, the VF examination device used in the present invention is an X-ray fluoroscopy device 50 (SHIMADZU Corp, Safire II ZS-100); during the examination the voltage was 79 kVp and the current was 250 mA.

次に同期用マイク５２及びポータブルマルチミキサー５３について説明する。
図１４に示すように、同期用マイクは、検査の際、X線透視装置５０によって得られた動画と多機能筋電位計測装置３０によって得られたsEMG信号を同期させるのを目的に同期用マイク５２を使用した。図１４の（a）に示す同期用マイク５２の本体にはステレオマイクロホン（AT9941、audio-technica（登録商標））を使用し、図１４の（b）に示すポータブルマルチミキサー５３（AT-PMX5P、audio-technica（登録商標））を同期用マイク５２の本体、X線透視装置５０、多機能筋電位計測装置３０に接続することで同期を図った。 Next, the synchronization microphone 52 and the portable multi-mixer 53 will be explained.
As shown in FIG. 14, a synchronization microphone 52 was used during the examination to synchronize the video obtained by the X-ray fluoroscope 50 with the sEMG signals obtained by the multifunctional myoelectric potential measurement device 30. A stereo microphone (AT9941, audio-technica (registered trademark)) was used as the main body of the synchronization microphone 52 shown in FIG. 14(a), and synchronization was achieved by connecting the portable multi-mixer 53 (AT-PMX5P, audio-technica (registered trademark)) shown in FIG. 14(b) to the main body of the synchronization microphone 52, the X-ray fluoroscope 50, and the multifunctional myoelectric potential measuring device 30.

次に解析部４０について説明する。
解析部４０（図６及び図２６参照）は、生体信号から特徴量を抽出するとともに、ここでは長期の時間依存性及び短期の時間依存性を学習する回帰型ニューラルネットワークアーキテクチャである長・短期記憶（LSTM）を用いて教師信号及び特徴量に基づいて舌骨をはじめとする嚥下諸器官や食塊の運動を学習して予測するものである（詳細は後述する）。 Next, the analysis section 40 will be explained.
The analysis unit 40 (see FIGS. 6 and 26) extracts feature quantities from the biological signals and, using long short-term memory (LSTM), a recurrent neural network architecture that learns both long-term and short-term time dependencies, learns and predicts the movement of the swallowing organs (including the hyoid bone) and of the bolus on the basis of the teacher signals and the feature quantities (details are described later).

次に本発明の実施例に係る摂食嚥下機能評価方法について説明する。
図１５に示すように、摂食嚥下機能評価方法は、嚥下撮影工程（VF動画工程）と、前処理工程と、生体信号検出工程（信号計測工程）と、特徴量抽出工程と、学習工程及び予測工程（学習・予測工程）と、予測結果の評価工程（動作予測工程）とを備えている。 Next, a method for evaluating eating and swallowing function according to an example of the present invention will be explained.
As shown in FIG. 15, the eating and swallowing function evaluation method includes a swallowing imaging process (VF video process), a preprocessing process, a biological signal detection process (signal measurement process), a feature extraction process, a learning process and a prediction process (learning/prediction process), and a process of evaluating the prediction results (motion prediction process).

嚥下撮影工程（VF動画工程）では、被験者が摂食した食塊の動き及び嚥下関連器官の動きをX線透視下で観察可能な嚥下造影検査により摂食嚥下造影動画を得る。前処理工程では、摂食嚥下造影動画から舌骨をはじめとする嚥下諸器官や食塊の運動の位置を取得して各運動を座標として数値化し、嚥下諸器官や食塊の運動の教師信号を作成する。 In the swallowing imaging process (VF video process), a contrast video of eating and swallowing is obtained by a videofluoroscopic swallowing examination, in which the movement of the bolus ingested by the subject and the movements of the swallowing-related organs can be observed under X-ray fluoroscopy. In the preprocessing process, the positions of the swallowing organs (including the hyoid bone) and the bolus are acquired from the contrast video, each movement is digitized as coordinates, and teacher signals for the movements of the swallowing organs and the bolus are created.

生体信号検出工程（信号計測工程）では、造影動画工程に同期させ、被験者の所定の皮膚表面に配置したセンサ部で摂食嚥下時の生体信号を検出する。特徴量抽出工程では、解析部で生体信号から特徴量を抽出する。 In the biosignal detection process (signal measurement process), biosignals during ingestion and swallowing are detected by a sensor section placed on a predetermined skin surface of the subject in synchronization with the contrast video process. In the feature amount extraction step, the analysis unit extracts feature amounts from the biosignal.

学習工程では、長期の時間依存性及び短期の時間依存性を学習する回帰型ニューラルネットワークアーキテクチャであるLSTM（長・短期記憶）を用いて教師信号及び特徴量に基づいて舌骨をはじめとする嚥下諸器官や食塊の運動を学習して特徴量から嚥下諸器官及び食塊の運動を予測しうるモデル（予測モデル）を生成する。予測工程では、学習工程で生成した被験者の運動予測モデルにより、学習工程の後に新たに検出された、あるいは、学習工程で使用していない被験者の生体信号について特徴量から、舌骨をはじめとする嚥下諸器官や食塊の運動を予測する。また、予測工程の結果を用いて摂食嚥下機能評価・訓練する評価・訓練工程を備える。 In the learning process, LSTM (long short-term memory), a recurrent neural network architecture that learns both long-term and short-term time dependencies, is used to learn the movement of the swallowing organs (including the hyoid bone) and the bolus on the basis of the teacher signals and the feature quantities, generating a model (prediction model) that can predict the movement of the swallowing organs and the bolus from the feature quantities. In the prediction process, the subject's motion prediction model generated in the learning process is used to predict the movement of the swallowing organs (including the hyoid bone) and the bolus from the feature quantities of the subject's biological signals that were newly detected after the learning process or were not used in the learning process. The method also includes an evaluation/training process that evaluates and trains the eating and swallowing function using the results of the prediction process.

さらに、生体信号検出工程では、舌骨上筋群部分に配置した舌骨上筋群用筋電センサで舌骨上筋群生体信号を検出し、舌骨下筋群部分に配置した舌骨下筋群用筋電センサで舌骨下筋群生体信号を検出し、喉頭部分に配置した喉頭挙動センサで喉頭挙動信号を検出し、特徴量抽出工程では、生体信号としての、舌骨上筋群生体信号、舌骨下筋群生体信号及び喉頭挙動信号から特徴量を抽出している。 Furthermore, in the biological signal detection process, the suprahyoid muscle group biosignal is detected by the myoelectric sensor for the suprahyoid muscle group placed in the suprahyoid muscle group, and A myoelectric sensor for the muscle group detects the infrahyoid muscle group biosignal, a laryngeal behavior sensor placed in the larynx detects the laryngeal behavior signal, and in the feature extraction process, the suprahyoid muscle group is detected as the biosignal. Features are extracted from biological signals, subhyoid muscle group biological signals, and laryngeal behavior signals.

さらに、学習工程では、学習データとして舌骨をはじめとする嚥下諸器官や食塊の運動の座標データを用い、学習データを、１つの元データを所定の周期で同一の座標データが含まれないようにシフトして複数に増幅させている（図３２参照）。また、データを増幅する際、１つの元データを所定の周期で同一の座標データが含まれないようにシフトした各点（例えば、図３２のデータ１、データ２、データ３）において、各点を含む平均値（移動平均）を用いてもよい。 Furthermore, in the learning process, coordinate data of the movement of the swallowing organs (including the hyoid bone) and the bolus are used as learning data, and the learning data are augmented: one original data series is shifted at a predetermined period so that the resulting series do not share the same coordinate data, yielding multiple series (see FIG. 32). When augmenting the data, an average value (moving average) that includes each point may be used at each of the shifted points (for example, data 1, data 2, and data 3 in FIG. 32).
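One possible reading of this augmentation is sketched below. FIG. 32 is not reproduced here, so the interpretation is an assumption: the original series is split into interleaved sub-series by taking every `period`-th sample from different starting offsets, so that no sample appears in two sub-series. The function names, the choice of `period`, and the window used for the averaging variant are all hypothetical.

```python
def augment_by_shift(series, period):
    """Split one series into `period` interleaved sub-series with no shared samples."""
    if period < 1:
        raise ValueError("period must be at least 1")
    return [series[offset::period] for offset in range(period)]

def augment_with_local_mean(series, period):
    """Same split, but each picked point is replaced by a mean over a window around it.

    The window half-width (period // 2) is an illustrative choice, not from the source.
    """
    half = period // 2

    def local_mean(i):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        return sum(series[lo:hi]) / (hi - lo)

    return [[local_mean(i) for i in range(offset, len(series), period)]
            for offset in range(period)]

# One coordinate track (e.g. hyoid Y displacement in mm) becomes three shorter tracks.
track = [10, 11, 12, 13, 14, 15, 16]
for sub in augment_by_shift(track, 3):
    print(sub)
```

For two-dimensional coordinates, the same split can be applied to a list of (x, y) pairs, and the averaging variant can be applied to each axis separately.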

さらに、前処理工程では、被験者の側面視で、被験者の第５頸椎前縁下端を原点とし、被験者の第３頸椎前縁上端をY軸上の点としたXY座標系を設定し、被験者の舌骨の下端を前記舌骨の位置として前記XY座標系の座標を取得している。 Furthermore, in the preprocessing step, in a side view of the subject, an XY coordinate system is set with the lower end of the anterior edge of the subject's 5th cervical vertebrae as the origin and the upper end of the anterior edge of the subject's 3rd cervical vertebrae as a point on the Y axis. The coordinates of the XY coordinate system are obtained with the lower end of the hyoid bone as the position of the hyoid bone.

次に摂食嚥下機能評価方法における時系列データによる舌骨の運動予測アルゴリズムについて説明する。時系列データによる舌骨の運動の予測アルゴリズムの概略図は、図１５に示す通りである。 Next, an algorithm for predicting the motion of the hyoid bone using time-series data in the method of evaluating eating and swallowing function will be explained. A schematic diagram of an algorithm for predicting the movement of the hyoid bone using time-series data is shown in FIG. 15.

次に信号計測の特徴部抽出部について説明する。
発明者らは、信号計測によって得られた舌骨上筋群２２ch、舌骨下筋群２２chにおいてバンドパスフィルタ（２５０－７００Hz）をかけることでノイズの除去を行った。その後、舌骨上筋群及び舌骨下筋群の各チャンネルにおいて動作に関連した特徴的な信号成分（特徴量）を抽出する。本研究は舌骨上筋群と舌骨下筋群の各チャンネルのそれぞれのsEMG信号に対して、長さ２５６サンプル分のフレームを、１６サンプルの周期でシフトさせながら特徴量を抽出し作成した。この特徴量には以下のものを用いた。 Next, the characteristic part extraction unit for signal measurement will be explained.
The inventors removed noise by applying a band-pass filter (250-700 Hz) to the 22 suprahyoid channels and the 22 infrahyoid channels obtained by signal measurement. Characteristic signal components related to the motion (feature quantities) were then extracted from each channel of the suprahyoid and infrahyoid muscle groups. Specifically, for the sEMG signal of each channel, features were computed over frames 256 samples long while the frame was shifted in steps of 16 samples. The following feature quantities were used.

RMS（Root Mean Square）は数式１で表され、EMG信号の振幅に関する特徴が得られる。 RMS (Root Mean Square) is expressed by Equation 1, and provides characteristics regarding the amplitude of the EMG signal.

CC（Cepstrum coefficient）は数式２で表される。周波数領域から抽出する特徴量であり、パワースペクトルの包絡形状と微細構造の分離を行える特徴がある。次数が低いと包絡形状の特徴が、次数が高いと微細構造の特徴が表れる。 CC (Cepstrum coefficient) is expressed by Formula 2. It is a feature extracted from the frequency domain, and has the characteristic of being able to separate the envelope shape and fine structure of the power spectrum. When the order is low, the features of the envelope shape appear, and when the order is high, the features of the fine structure appear.

ここで、nは総サンプル数、sEMGはsEMG信号を表す。 Here, n represents the total number of samples, and sEMG represents the sEMG signal.
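Equations 1 and 2 appear as images in the publication and are not reproduced in this text. As a hedged reconstruction, the RMS of one frame of n samples follows directly from the definitions above. The cepstrum is written here in its common textbook (real cepstrum) form, which is an assumption and may differ in detail from Equation 2. X(m), the discrete Fourier transform of the frame, and k, the order of the coefficient, are symbols introduced here for illustration.

```latex
\mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \mathrm{sEMG}_i^{2}}
\qquad
\mathrm{CC}(k) = \frac{1}{n}\sum_{m=0}^{n-1} \log\lvert X(m)\rvert \, e^{\,j 2\pi k m / n}
```

In this form, low orders k capture the smooth spectral envelope and high orders capture the fine structure, which matches the behavior described for CC above.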

図１６に示すように、RMSの計算には過去nサンプルのEMGを用いる。この際、nサンプル分を一つのフレームとして切り出して計算し、切り出す範囲を一定周期でシフトさせていくフレームシフト方式を用いる。 As shown in FIG. 16, the past n samples of EMG are used for RMS calculation. At this time, a frame shift method is used in which calculations are performed by cutting out n samples as one frame, and the range to be cut out is shifted at a constant cycle.
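The frame-shift computation can be sketched as follows, using the frame length of 256 samples and the shift of 16 samples given earlier in this description. The function name `frame_rms` is hypothetical, and the sketch handles one channel at a time. Each feature value uses only the current and past samples of its frame.

```python
import math

def frame_rms(signal, frame_len=256, shift=16):
    """RMS of successive frames; each frame covers the past `frame_len` samples."""
    features = []
    for end in range(frame_len, len(signal) + 1, shift):
        frame = signal[end - frame_len:end]          # cut out one frame
        mean_square = sum(v * v for v in frame) / frame_len
        features.append(math.sqrt(mean_square))
    return features

# A constant-amplitude signal of 288 samples yields three frames (ends at 256, 272, 288).
feats = frame_rms([2.0] * 288)
print(len(feats), feats[0])
```

Because consecutive frames overlap by 240 samples, the resulting feature series is smooth, and its rate is the sampling frequency divided by 16.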

次にVF検査動画の前処理について説明する。
図１７に示すように、VF検査によって得られた動画は動画解析ソフトウェアのDIPP‐Motion V（株式会社ディテクト）を用いて３０fpsのサンプリング速度で取り込み、舌骨の運動の数値化を行った。座標系は、第３頸椎前縁上端をP１、第５頸椎前縁下端をP２、舌骨体の下端をP３とし、P１とP２を通過する直線をY軸とした。Y軸に垂直かつP２を通る直線をX軸と設定した。画面上のスケール設定においては多チャンネル電極の２２個ある純銀棒の１つの直径２mmを基準とした。 Next, preprocessing of VF inspection videos will be explained.
As shown in FIG. 17, the video obtained in the VF examination was imported at 30 fps using the video analysis software DIPP-Motion V (Detect Co., Ltd.), and the movement of the hyoid bone was digitized. For the coordinate system, the upper end of the anterior edge of the third cervical vertebra was P1, the lower end of the anterior edge of the fifth cervical vertebra was P2, and the lower end of the hyoid body was P3; the straight line through P1 and P2 was the Y axis, and the straight line perpendicular to the Y axis through P2 was the X axis. The on-screen scale was set using the 2 mm diameter of one of the 22 pure silver rods of the multichannel electrode as the reference.

図１８に示すように、舌骨体の下端の動きの解析においては安静状態の舌骨体の下端を座標の原点とし、嚥下時におけるX軸及びY軸の移動距離（mm）を算出し、その後移動平均を行い、平滑化を行った。 As shown in Fig. 18, in analyzing the movement of the lower end of the hyoid body, the lower end of the hyoid body in a resting state is used as the origin of coordinates, and the moving distance (mm) of the X-axis and Y-axis during swallowing is calculated. After that, moving average was performed and smoothing was performed.
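The preprocessing geometry can be illustrated with a small sketch: pixel positions of P1, P2 and P3 are converted into the vertebra-based XY frame, scaled to millimeters using the 2 mm electrode diameter, referenced to the resting position, and smoothed. The sign convention of the X axis and the moving-average window length are not stated in this description and are assumptions here; all function names are hypothetical, and this is not the DIPP-Motion V procedure itself.

```python
import math

def mm_per_pixel(electrode_diameter_px, electrode_diameter_mm=2.0):
    """Scale factor from the known 2 mm silver electrode diameter."""
    return electrode_diameter_mm / electrode_diameter_px

def to_cervical_frame(p1, p2, p3, scale):
    """Coordinates of P3 (mm) in the frame with origin P2 and Y axis from P2 toward P1."""
    ax, ay = p1[0] - p2[0], p1[1] - p2[1]
    norm = math.hypot(ax, ay)
    uy = (ax / norm, ay / norm)      # unit vector along the Y axis
    ux = (uy[1], -uy[0])             # perpendicular unit vector (sign is an assumption)
    dx, dy = p3[0] - p2[0], p3[1] - p2[1]
    x = (dx * ux[0] + dy * ux[1]) * scale
    y = (dx * uy[0] + dy * uy[1]) * scale
    return x, y

def displacement_from_rest(track, rest):
    """Shift a track so that the resting position becomes the origin."""
    return [(x - rest[0], y - rest[1]) for x, y in track]

def moving_average(values, window=3):
    """Centered moving average with a shrinking window at the edges (window assumed)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

scale = mm_per_pixel(4.0)                                   # electrode spans 4 px
print(to_cervical_frame((100, 50), (100, 150), (60, 100), scale))
```

Applying `to_cervical_frame` to every video frame gives the hyoid track; `displacement_from_rest` and `moving_average` (per axis) then yield the smoothed displacement used as the teacher signal.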

図１９に示すように、画像結果から、Aを随意嚥下開始に伴う舌尖の運動開始、Bを舌骨挙上運動開始、Cを嚥下反射開始に伴う急速な舌骨挙上開始、Dを舌骨の最前上方位到達、Eを舌骨の急速下降開始、Fを嚥下終了後の舌骨安静位として決定し、AからFまでの区間を舌骨の運動を予測する区間とした。 As shown in FIG. 19, the following events were identified from the images: A, the start of tongue-tip movement accompanying the onset of voluntary swallowing; B, the start of hyoid elevation; C, the start of rapid hyoid elevation accompanying the onset of the swallowing reflex; D, the hyoid reaching its most anterior-superior position; E, the start of rapid hyoid descent; and F, the hyoid resting position after the end of swallowing. The interval from A to F was taken as the interval over which the movement of the hyoid bone is predicted.

図２０に示すように、動作区間の決定については、教師信号となる舌骨の運動から、開始点を随意嚥下開始に伴う舌尖の運動開始A（図１９参照）とし、終了点を嚥下終了後の舌骨安静位Fとした。 As shown in FIG. 20, the motion interval was determined from the movement of the hyoid bone, which serves as the teacher signal: the start point was A, the start of tongue-tip movement accompanying the onset of voluntary swallowing (see FIG. 19), and the end point was F, the hyoid resting position after the end of swallowing.

そして、図２１に示すように、舌骨の運動に対して、舌骨上筋群のsEMG信号、舌骨下筋群のsEMG信号、及び喉頭挙動センサ（伸縮ひずみセンサ）による喉頭運動を合わせて、動作区間を決定した。 Then, as shown in FIG. 21, the sEMG signals of the suprahyoid and infrahyoid muscle groups and the laryngeal movement measured by the laryngeal behavior sensor (stretchable strain sensor) were aligned with the movement of the hyoid bone to determine the motion interval.

次に学習器について説明する。
ここではsEMG信号からの舌骨の運動予測に、時系列データ予測に適しているLSTMを用いた。LSTM（Long short-term memory、長・短期記憶）とは、深層学習の分野において用いられる回帰型ニューラルネットワーク（Recurrent Neural Network ：RNN）アーキテクチャであり、従来のRNNで訓練する際に、長期の時間依存性では学習できない問題を解決し、長期の時間依存性も短期の時間依存性も学習できる手法である。 Next, the learning device will be explained.
Here, LSTM, which is well suited to time-series prediction, was used to predict the movement of the hyoid bone from the sEMG signals. LSTM (long short-term memory) is a recurrent neural network (RNN) architecture used in the field of deep learning. It addresses the problem that conventional RNNs cannot learn long-term time dependencies during training, and it can learn both long-term and short-term time dependencies.

RNNは、ニューラルネットワークを拡張した深層学習の一つで、時系列データの分野で優れた性能をもつ手法である。現在では機械翻訳や音声認識の分野にてよく使用される。図２２に示すように、可変長データをニューラルネットワークで扱うために中間層で得られた値を再び中間層に入力するというネットワーク構造になっている。中間層h_tは、入力x_tを見て、値h_tを出力する。ループは、情報をネットワークの１ステップから次のステップに渡すことを可能にした。しかし、長期間の予測になればなるほど、予測する値に関連する情報が最初に位置していると、RNNの場合、関連づけて学習することが困難となる。 An RNN is a deep learning method that extends the neural network and performs well on time-series data. It is now widely used in fields such as machine translation and speech recognition. As shown in FIG. 22, to let a neural network handle variable-length data, the values obtained in the intermediate layer are fed back into the intermediate layer. The intermediate layer looks at the input x_t and outputs the value h_t. The loop allows information to be passed from one step of the network to the next. However, the longer the span over which a prediction is made, the harder it becomes for an RNN to learn the association when the information relevant to the predicted value lies far back at the beginning of the sequence.

図２３に示すように、RNNと異なる部分として、LSTMにはCEC（Constant Error Carousel、記憶セル、セルとも呼ばれる）、入力ゲート、出力ゲート、忘却ゲートがある。CECとは過去のデータを保存するためのユニットで記憶セル、セルとも呼ばれる。これを導入することにより、長周期の規則性を検出することが可能になる。入力ゲート及び出力ゲートでは、学習過程で新たな入力、出力が来た時に、新たなパターンに適合するようにし、RNNで発生していた入力重み衝突、出力重み衝突の問題に対処可能となった。忘却ゲートがないモデルの場合、大きな変化のある入力が来たとしても、相対的にその入力の影響は小さくなってしまい、今までと同様の結果しか出力されなくなってしまう。この問題に対処するために忘却ゲートを導入することで、入力のパターンが大きく変化した際、セルの状態を一気に更新することを可能にした。 As shown in FIG. 23, LSTM has different parts from RNN: CEC (Constant Error Carousel, also called memory cell, cell), input gate, output gate, and forget gate. CEC is a unit for storing past data and is also called a memory cell. By introducing this, it becomes possible to detect long-period regularity. Input gates and output gates are made to adapt to new patterns when new inputs and outputs arrive during the learning process, making it possible to deal with the problems of input weight collisions and output weight collisions that occurred in RNNs. . In the case of a model without a forgetting gate, even if an input with a large change comes, the influence of that input will be relatively small, and the output will only be the same as before. By introducing a forgetting gate to deal with this problem, it became possible to update the cell state all at once when the input pattern changed significantly.

図２４に示すように、LSTMブロックの内部の詳細は図２４の（a）のようになる。図２５の（b）のようにそれぞれの線はベクトル全体を、１つのノードの出力から他のノードの入力に運ぶ。小さな円は、ベクトルの加算のような１点の操作を表し、矩形のボックスは、学習されるニューラルネットワークの層である。合流している線は連結を意味し、分岐している線は内容がコピーされ、そのコピーが別の場所に行くことを意味する。 As shown in FIG. 24, the internal details of the LSTM block are as shown in FIG. 24(a). Each line carries an entire vector from the output of one node to the input of another node, as shown in FIG. 25(b). The small circles represent single point operations, such as vector addition, and the rectangular boxes are the layers of the neural network being trained. A line that joins means a connection, and a line that diverges means that the content is copied and the copy goes somewhere else.

図２５は忘却ゲートを示しており、図２５の（a）に示すように、LSTMブロック内では、最初のステップで捨てる情報を判定する。この判定は「忘却ゲート層」と呼ばれるシグモイド層によって行われる。入力されたh_t－１とx_tを見て、セル状態C_t－１の中の各数値のために０と１の間の数値を出力します。１は「完全に維持する」を表し、０は「完全に取り除く」を表す。また、その時の式は数式３、ゲート活性化関数であるσの式は数式４に示す。 FIG. 25 shows the forget gate. As shown in FIG. 25(a), the first step within the LSTM block decides which information to discard. This decision is made by a sigmoid layer called the "forget gate layer". It looks at the inputs h_{t-1} and x_t and outputs a number between 0 and 1 for each value in the cell state C_{t-1}, where 1 means "keep completely" and 0 means "remove completely". The corresponding expression is given in Equation 3, and the gate activation function σ is given in Equation 4.

ここで、Wは入力の重み、bはバイアスを表す。 Here, W represents the input weight and b represents the bias.
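Equations 3 and 4 appear as images in the publication and are not reproduced in this text. Assuming they take the standard LSTM form that the description follows, they would read as below. The subscript f on W and b, and the bracket notation for concatenating h_{t-1} and x_t, are conventions assumed here.

```latex
f_t = \sigma\!\left(W_f \cdot [h_{t-1},\, x_t] + b_f\right)
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```

Because σ maps any real value into the interval (0, 1), each component of f_t acts as a retention factor for the corresponding component of the cell state C_{t-1}.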

図２５の（b）は入力ゲートを示しており、次のステップは、セル状態で保存する新たな状態を判定する。これには２つの部分がある。まず、「入力ゲート層」と呼ばれるシグモイド層は、どの値を更新するか判定する。次に、tanh層は、セル状態に加えられる新たな候補地のベクトルC_tを作成する。そして次のステップで状態を更新するために、これら２つを組み合わせる。その時の式を数式５、数式６に表す．ここでtanhは双曲線正接関数を表す。 FIG. 25(b) shows the input gate. The next step decides what new information to store in the cell state, and it has two parts. First, a sigmoid layer called the "input gate layer" decides which values to update. Next, a tanh layer creates a vector of new candidate values, C̃_t, that could be added to the cell state. In the following step these two are combined to update the state. The corresponding expressions are given in Equations 5 and 6, where tanh denotes the hyperbolic tangent function.

図２５の（c）はセルの更新を示しており、古いセル状態C_t－１から新しいセル状態C_tに更新する。古いセル状態にf_tを掛け、先ほど忘れると判定されたものを忘れる。そして、i_t×ベクトルC_tを加える。これは、各状態値を更新すると決定した割合でスケーリングされた新たな候補値である。その時の式を数式７に表す． FIG. 25(c) shows the cell update, in which the old cell state C_{t-1} is updated to the new cell state C_t. The old cell state is multiplied by f_t, forgetting what was previously decided to be forgotten. Then i_t × C̃_t is added; these are the new candidate values, scaled by how much each state value was decided to be updated. The corresponding expression is given in Equation 7.
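Equations 5 to 7 are likewise not reproduced in this text. Assuming the standard LSTM formulation, with the candidate vector written with a tilde to distinguish it from the new cell state and ⊙ denoting element-wise multiplication (both notational assumptions), the input gate, candidate values and cell update would read:

```latex
i_t = \sigma\!\left(W_i \cdot [h_{t-1},\, x_t] + b_i\right)
\qquad
\tilde{C}_t = \tanh\!\left(W_C \cdot [h_{t-1},\, x_t] + b_C\right)
\qquad
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
```

The first term of the update keeps the part of the old state that the forget gate retains, and the second term adds the part of the candidate that the input gate admits.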

FIG. 25(d) shows the output gate. Finally, the output must be decided, and it is based on the cell state. First, a sigmoid layer decides which parts of the cell state to output. The cell state is then passed through tanh (compressing the values to between -1 and 1) and multiplied by the output of the sigmoid layer, so that only the selected parts are output. These operations are given by Equations 8 and 9.
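The four steps of FIG. 25 can be sketched as a single time step in NumPy. This is a minimal illustration of the standard LSTM cell, not the patent's implementation: the equation images are not reproduced in this text, and the names `lstm_step`, `init_params`, and the parameter keys `W_f`, `b_f`, and so on are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Gate activation: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases for the four layers (illustrative only)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "c", "o"):
        params["W_" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_hidden + n_in))
        params["b_" + gate] = np.zeros(n_hidden)
    return params


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following the order of FIG. 25(a) to (d)."""
    z = np.concatenate([h_prev, x_t])                    # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])     # (a) forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])     # (b) input gate
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])   # (b) candidate values
    c_t = f_t * c_prev + i_t * c_hat                     # (c) cell update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])     # (d) output gate
    h_t = o_t * np.tanh(c_t)                             # (d) gated output
    return h_t, c_t
```

Running `lstm_step` over successive feature vectors while carrying `h_t` and `c_t` forward gives the recurrent behavior described above. In practice a library layer would be used instead of hand-written gates.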

Next, prediction of hyoid bone movement from sEMG signals is described.
The subject was a woman in her 70s with suspected impairment of oral function who visited Iwate Medical University Hospital and consented to a VF examination. The examination was approved by the Ethics Review Committee of the Iwate Medical University School of Dentistry (No. 01304) and the Iwate University Research Ethics Review Committee (No. 201905), and was carried out within the scope of a routine examination.

As shown in FIG. 26, the VF examination used an X-ray fluoroscope (SafireII ZS-100, Shimadzu Corporation) to image the movement of the hyoid bone from the lateral side of the neck with the subject seated in a chair. The test food was 3 ml of a 98 w/w% barium sulfate solution thickened at 1%. The examiner injected it under the subject's tongue with a syringe, and the subject swallowed on the examiner's instruction. Three trials were recorded at 30 fps. As shown in FIGS. 27 and 28, simultaneously with the VF examination, a 22-channel flexible electrode for the suprahyoid muscle group was attached to the mandibular region, a 22-channel flexible electrode for the infrahyoid muscle group was attached to the neck, and an ear electrode was attached to the earlobe. The suprahyoid electrode was positioned so that electrodes 1 and 2 in FIG. 7(c) were 25 mm to 30 mm from the mentum (chin) and no electrode lay over the jawbone. The infrahyoid electrode was positioned so that electrodes 5 and 6 in FIG. 7(d) lay over the most anteriorly prominent part of the thyroid cartilage (Adam's apple).

To measure laryngeal movement, a laryngeal behavior sensor (a stretchable strain sensor) was attached over the thyroid cartilage and connected to the same multifunctional myoelectric potential measurement device used for the sEMG recording. The sEMG signal was amplified by a factor of 125, and both the sEMG signal and the laryngeal behavior sensor were sampled at 2000 Hz. To synchronize the sEMG and laryngeal movement recording with the VF examination, a synchronization microphone was connected to both the multifunctional myoelectric potential measurement device and the X-ray fluoroscope, and a sound was used as the trigger input. FIG. 29 shows how the hyoid bone movement was quantified: the movement of the hyoid bone in the X-axis (anteroposterior) direction and in the Y-axis (vertical) direction is plotted from the VF video. The VF examination was the primary purpose of this measurement, and the sEMG recording was performed alongside it.

For hyoid movement prediction with the LSTM, six input patterns, shown in FIG. 30, were compared to examine how the prediction changes as more muscle groups are used as input and how adding laryngeal movement affects prediction accuracy.

To create the training and test datasets, the three trials were labeled A, B, and C, and six training/test combinations were created as shown in FIG. 31. To improve prediction accuracy, each training dataset was augmented (referred to below as data amplification). As shown in FIG. 32, points were taken from the original data at a period of one every 30 samples, with a different starting offset each time so that no point is shared, which yields 30 series. Each of these series has 1/30 of the original number of samples, so first-order (linear) interpolation was applied to restore each series to the original number of samples.
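The data amplification of FIG. 32 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `amplify` is hypothetical, the input is a single one-dimensional series (a multichannel recording would be processed channel by channel), and the interpolation is evaluated on the original sample indices, which the text does not specify.

```python
import numpy as np


def amplify(series, period=30):
    """Split one series into `period` decimated copies, each restored to the
    original length by linear interpolation."""
    series = np.asarray(series, dtype=float)
    n = series.shape[0]
    t_full = np.arange(n)
    copies = []
    for offset in range(period):
        # Every `period`-th sample starting at `offset`; different offsets
        # never select the same original point.
        idx = np.arange(offset, n, period)
        # np.interp holds the first/last retained value outside [idx[0], idx[-1]].
        copies.append(np.interp(t_full, idx, series[idx]))
    return np.stack(copies)
```

With `period=30` this returns 30 series of the original length from one recording, matching the 30-fold amplification described above.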

Regarding the effectiveness of data amplification, FIG. 33 compares the 30-fold amplified data with the unamplified case. It shows the relationship between data amplification and RMSE: the error between predicted and measured values decreases with amplification, which confirms that data amplification is effective.

図３４は学習についての例を示しており、学習の入力値が舌骨上筋群と舌骨下筋群及び喉頭運動とした場合、計２６５次元が入力の次元数となる。出力値は舌骨の前後方向の運動、上下方向の運動の２次元となる。また、テストデータで予測する際、予測結果には平滑化処理を行っている。 FIG. 34 shows an example of learning, and when the input values for learning are the suprahyoid muscle group, the infrahyoid muscle group, and laryngeal movement, the total number of input dimensions is 265. The output value is two-dimensional: the anteroposterior and vertical movements of the hyoid bone. Furthermore, when making predictions using test data, the prediction results are smoothed.

Next, the evaluation metrics are described.
RMSE (root mean square error) and Pearson's product-moment correlation coefficient were used to quantify the difference between the measured values obtained from the VF video and the predicted values. The reported RMSE and correlation coefficient are averages over the six training/test combinations. RMSE is given by Equation 10. It is the most common performance metric for regression models, and a smaller value indicates better accuracy.

ピアソンの積率相関係数（以下、相関係数と呼ぶ）は、数式１１で表され、値が大きいほど波形の類似度が高く、予測精度が高いといえる。 Pearson's product moment correlation coefficient (hereinafter referred to as correlation coefficient) is expressed by Equation 11, and it can be said that the larger the value, the higher the waveform similarity and the higher the prediction accuracy.

ここで、y_obsは出力の実測値、y_predは出力の予測値を表す。 Here, y _obs represents the actual measured value of the output, and y _pred represents the predicted value of the output.
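The images for Equations 10 and 11 are not reproduced in this text. The sketch below uses the textbook definitions of RMSE and Pearson's product-moment correlation coefficient, with `y_obs` and `y_pred` as defined above. The function names are illustrative.

```python
import numpy as np


def rmse(y_obs, y_pred):
    """Root mean square error between measured and predicted values."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))


def pearson_r(y_obs, y_pred):
    """Pearson's product-moment correlation coefficient."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    d_obs = y_obs - y_obs.mean()      # deviations from the mean
    d_pred = y_pred - y_pred.mean()
    return float(np.sum(d_obs * d_pred)
                 / np.sqrt(np.sum(d_obs ** 2) * np.sum(d_pred ** 2)))
```

Each metric would be computed separately for the X-axis and Y-axis outputs and then averaged over the six training/test combinations.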

Next, the results are described.
FIG. 35 shows the collected sEMG signals and laryngeal movement for one of the three recorded swallows.

また、図３６は、各筋群のsEMG信号、RMS、CCと喉頭運動及び舌骨の動きの時系列データを示している。なお、ここで用いられるqrとはケプストラム係数（CC）のquefrencyのことである。 Moreover, FIG. 36 shows time series data of the sEMG signal, RMS, CC of each muscle group, laryngeal movement, and hyoid bone movement. Note that qr used here refers to the quefrency of cepstral coefficients (CC).

For the predictions in the X-axis (anteroposterior) and Y-axis (vertical) directions, with the X and Y axes of the hyoid bone defined as in FIG. 17, FIG. 37 shows the measured and predicted hyoid movement in each direction for one of the six combinations.

図３７における結果に対応する、舌骨の運動の矢状面（XY座標面）での軌跡は、図３８のようになる。なお、図３８では、実測値と予測値を含むように示している。 The trajectory of the movement of the hyoid bone in the sagittal plane (XY coordinate plane) corresponding to the result in FIG. 37 is as shown in FIG. 38. Note that in FIG. 38, the values are shown to include actual measured values and predicted values.

Next, the accuracy of hyoid movement prediction for each combination of muscle groups is described.
FIG. 39 shows the RMSE in the X-axis and Y-axis directions for each combination of muscle groups.

図４０は相関係数の結果を示しており、筋群の組み合わせにおけるX軸方向及びY軸方向の結果である。 FIG. 40 shows the results of the correlation coefficients, which are the results in the X-axis direction and Y-axis direction for the combination of muscle groups.

These results suggest that the suprahyoid muscle group, including the geniohyoid muscle, acts strongly in moving the hyoid bone in the X-axis direction. Movement of the hyoid bone in the Y-axis direction, by contrast, appears to result not from either muscle group alone but from the coordinated action of both. When the hyoid bone moves upward during swallowing, the larynx is pulled up after it by the infrahyoid muscle group, which would explain why adding laryngeal movement information from the stretchable strain sensor further improved accuracy.

舌骨の矢状面の予測結果において、実測値と同様の三角形状の軌跡を描く予測結果が多くみられた。これにより、嚥下時の舌骨の運動からも正しい予測ができたと考えられる。 Among the prediction results of the sagittal plane of the hyoid bone, there were many prediction results that drew a triangular trajectory similar to the actual measurement value. This suggests that accurate predictions could be made from the movement of the hyoid bone during swallowing.

Next, another embodiment is described in which the movement of the bolus, rather than the hyoid bone, was estimated. The teacher signal was the change in position of the leading edge of the bolus (3 ml of a 98 w/w% barium sulfate solution thickened at 1%) imaged by VF, and the training conditions, including the features used as training data, were the same as for the hyoid movement estimation. As shown in FIGS. 41 and 42, the measured and estimated (training A, prediction C) XY coordinates of the bolus leading edge as it passes the esophageal entrance, with the origin at the lower end of the anterior edge of the fifth cervical vertebra, nearly coincide, confirming that bolus movement can be predicted with high accuracy.

The operation and effects of the eating and swallowing function evaluation method and system described above are as follows.
Apart from the VF examination, swallowing function and hyoid movement can also be evaluated by ultrasonography (echo examination). According to previous studies, however, while adhesive foods such as rice porridge are easy to detect in ultrasound images, for fluid materials such as water or jelly it is difficult to observe the brief sequence from the swallowing reflex to aspiration, so aspiration is hard to identify by ultrasound. The observed image also depends on how the probe is applied, which makes reproducible observation difficult. In contrast, the present invention can predict not only the hyoid bone but also the movement of at least one of the swallowing organs and the bolus that can be observed by VF or VE, such as the timing of laryngeal closure and the opening and closing of the esophageal entrance, and is therefore promising as an evaluation method free of risks such as radiation exposure. It was confirmed that, with data from as little as a single swallow, swallowing movement under the same conditions can be predicted with an RMSE of 1.03 and a correlation coefficient of 0.64 in the X-axis direction, and an RMSE of 1.20 and a correlation coefficient of 0.96 in the Y-axis direction.

The embodiment of the present invention comprises a contrast video step (VF video step), a preprocessing step, a biosignal detection step synchronized with the VF video step, a feature extraction step, a learning step, and a prediction step. In the learning step, the movement interval is determined by aligning the features with the teacher signal, and the movement of the hyoid bone is learned from the teacher signal and the features using LSTM (long short-term memory), a recurrent neural network architecture that learns both long-term and short-term time dependencies. LSTM addresses the difficulty conventional RNNs have in learning long-term dependencies. It also adapts to new input and output patterns as they arrive during training, which addresses the input weight conflict and output weight conflict problems that occur in RNNs. Consequently, by recording biosignals with the sensor unit simultaneously with a VF examination just once, the system learns the characteristics of that subject's hyoid movement during swallowing, and from the second time onward (the prediction step) the hyoid movement can be predicted without a VF examination. As a result, no X-ray fluoroscopy facility is needed, and a non-invasive, low-risk evaluation of eating and swallowing function that predicts hyoid movement can be performed simply at the bedside or in home care. For swallows of the same amount and the same physical properties performed in the same way, a single recording provides adequate training data. For better training data, swallowing data recorded with different amounts or physical properties may be added to the training data.

Furthermore, once the subject's hyoid movement has been learned at least once at a facility with an X-ray fluoroscope, such as a hospital, swallowing function evaluation and training can subsequently be performed anywhere by predicting hyoid movement without a VF examination. For swallows of the same amount and the same physical properties performed in the same way, a single recording provides adequate training data. Adding swallowing data recorded with different amounts or physical properties to the training data allows swallowing under those conditions to be estimated more accurately.

さらに、生体信号検出工程では、舌骨上筋群生体信号、舌骨下筋群生体信号、及び喉頭挙動信号を検出するので、より精度の高い舌骨の運動の予測ができる。 Furthermore, in the biosignal detection step, the suprahyoid muscle biosignal, the infrahyoid muscle biosignal, and the laryngeal behavior signal are detected, so the movement of the hyoid bone can be predicted with higher accuracy.

Furthermore, in the learning and prediction steps, one original data series is shifted at a predetermined period so that no coordinate data is shared and is thereby amplified into multiple series, which reduces the error between predicted and measured values even with a single initial recording and enables more accurate prediction of hyoid movement.

さらに、前処理工程では、被験者の側面視におけるXY座標系を設定し、舌骨の下端を舌骨の位置としてXY座標系の座標を取得しているので、被験者の前後・上下の舌骨の運動の予測を分かり易くすることができる。 Furthermore, in the preprocessing step, the XY coordinate system in the subject's side view is set, and the coordinates of the XY coordinate system are acquired with the lower end of the hyoid bone as the position of the hyoid bone. Prediction of motion can be made easier to understand.
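As a rough illustration of this preprocessing, the sketch below converts a landmark position in image pixels (for example, the lower end of the hyoid bone) into an anatomical XY coordinate system. Following the claims, the origin is the lower end of the anterior edge of the fifth cervical vertebra, and the upper end of the anterior edge of the third cervical vertebra lies on one axis. Taking that axis as Y, the pixel scale `mm_per_px`, the `anterior_sign` switch, and the function name are all assumptions made for this example.

```python
import numpy as np


def to_cervical_frame(point_px, c5_px, c3_px, mm_per_px=1.0, anterior_sign=1.0):
    """Express an image point in a frame whose origin is the C5 landmark and
    whose Y axis passes through the C3 landmark (assumed axis assignment)."""
    origin = np.asarray(c5_px, dtype=float)
    y_axis = np.asarray(c3_px, dtype=float) - origin
    y_axis /= np.linalg.norm(y_axis)                  # unit vector from C5 toward C3
    # Perpendicular to Y; anterior_sign selects which side is anterior,
    # which depends on the direction the subject faces in the image.
    x_axis = anterior_sign * np.array([-y_axis[1], y_axis[0]])
    rel = np.asarray(point_px, dtype=float) - origin
    return np.array([rel @ x_axis, rel @ y_axis]) * mm_per_px
```

Applying this to the hyoid landmark in every VF frame would give X (anteroposterior) and Y (vertical) time series that are independent of how the subject's neck is tilted in the image.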

In the embodiment, the laryngeal behavior sensor is a stretchable strain sensor, but it is not limited to this. Laryngeal movement may instead be measured with a pressure sensor, an acceleration sensor, a non-contact sensor, or the like. Also, in the embodiment the myoelectric sensors for the suprahyoid and infrahyoid muscle groups are electrode arrays, but the electrodes need not be arranged in an equally spaced array as long as a plurality of electrodes (at least two channels) is provided. Swallowing sounds may also be added. In the above description and the figures, "swallowing" is sometimes written where eating and swallowing is meant.

In the embodiment, LSTM (Long Short-Term Memory), which is derived from the RNN (Recurrent Neural Network), is used as the time-series prediction method, but the method is not limited to this. Other time-series prediction methods may be used, such as an RNN and its derivatives GRU (Gated Recurrent Unit) and LSTNet (Long- and Short-term Time-series Network), or the AR (Autoregressive) model and its derivatives ARMA (Autoregressive Moving Average), ARIMA (Autoregressive Integrated Moving Average), and SARIMA (Seasonal Autoregressive Integrated Moving Average).

In the embodiment, the swallowing imaging step is a videofluoroscopic swallowing examination (VF examination) observed under X-ray fluoroscopy, but it is not limited to this. Observation by videoendoscopic swallowing examination (VE examination) may be used, or the hyoid movement may be observed by ultrasound and expressed as coordinates. The images captured in the swallowing imaging step are not limited to video. Still images may be used as long as the continuous change can be followed, and the still images may be treated as frame-by-frame images.

In the prediction step, the movement of the swallowing organs and the bolus is predicted from biosignals during eating and swallowing, but it is not limited to this. The prediction may also be applied to biosignals recorded during training, that is, direct training with food or indirect training without food, including so-called biofeedback training. The invention is thus suited to training techniques for eating and swallowing function, such as evaluating the movement of the swallowing organs during direct training with food or indirect (basic) training without food, and use in biofeedback training.

即ち、本発明の作用及び効果を奏する限りにおいて、本発明は、実施例に限定されるものではない。 That is, the present invention is not limited to the examples as long as the functions and effects of the present invention are achieved.

The present invention is suited to eating and swallowing function evaluation techniques that detect biosignals during eating and swallowing, extract features from the detected biosignals, and predict the movement of at least one of the swallowing organs and the bolus to evaluate eating and swallowing function. It is also suited to eating and swallowing function training techniques, such as evaluating the movement of the swallowing organs during direct training with food or indirect (basic) training without food, and use in biofeedback training.

１０…摂食嚥下機能評価システム、２０…センサ部、２１…舌骨上筋群用筋電センサ、２１a…電極、２２…舌骨下筋群用筋電センサ、２２a…電極、２５…喉頭挙動センサ（伸縮性ひずみセンサ）、３０…多機能筋電位計測装置、４０…解析部、５０…X線透視装置、５１…前処理部、６０…被験者。 DESCRIPTION OF SYMBOLS 10... Eating and swallowing function evaluation system, 20... Sensor section, 21... Myoelectric sensor for suprahyoid muscle group, 21a... Electrode, 22... Myoelectric sensor for infrahyoid muscle group, 22a... Electrode, 25... Laryngeal behavior Sensor (stretchable strain sensor), 30... Multifunctional myoelectric potential measurement device, 40... Analysis section, 50... X-ray fluoroscopy device, 51... Preprocessing section, 60... Subject.

Claims

a swallowing photographing step of photographing the movement of the food bolus ingested by the subject and the movements of various swallowing organs;
a preprocessing step of acquiring the positions of the swallowing organs and the bolus from the image of the swallowing photographing step and digitizing them as coordinates to create a teacher signal of the movement of the swallowing organs and the bolus;
a biosignal detection step of detecting biosignals during ingestion and swallowing with a sensor unit placed on a predetermined skin surface of the subject in synchronization with the swallowing imaging step;
a feature quantity extraction step of extracting a feature quantity from the biological signal in an analysis unit;
a learning step of learning the movements of the swallowing organs and the bolus based on the teacher signal and the feature quantities using a time-series data prediction method, and generating a model capable of predicting the movement of at least one of the swallowing organs and the bolus from the feature quantities; and
a prediction step of predicting the movement of at least one of the swallowing organs and the bolus from the feature quantities using the prediction model generated in the learning step;
A swallowing movement prediction method using time-series data prediction, characterized by comprising:

A swallowing movement prediction method using time series data prediction according to claim 1,
wherein, in the biosignal detection step, a suprahyoid muscle group biosignal is detected by a myoelectric sensor for the suprahyoid muscle group placed over the suprahyoid muscle group, an infrahyoid muscle group biosignal is detected by a myoelectric sensor for the infrahyoid muscle group placed over the infrahyoid muscle group, and a laryngeal behavior signal is detected by a laryngeal behavior sensor placed over the larynx, and
in the feature quantity extraction step, feature quantities are extracted from the suprahyoid muscle group biosignal, the infrahyoid muscle group biosignal, and the laryngeal behavior signal as the biosignals.

A swallowing movement prediction method using time series data prediction according to claim 1 or claim 2,
wherein, in the learning step, the coordinate data of the swallowing organs and the bolus are used as training data, and the training data is amplified into multiple data series by shifting one original data series at a predetermined period so that the same coordinate data is not shared.

A swallowing movement prediction method using time series data prediction according to any one of claims 1 to 3,
wherein, in the preprocessing step, the coordinate system used to obtain the coordinate data of the swallowing organs and the bolus is set with the lower end of the anterior edge of the subject's fifth cervical vertebra as the origin and the upper end of the anterior edge of the subject's third cervical vertebra as a point on one axis, and the positions of the swallowing organs and the bolus in the sagittal plane or the frontal plane of the subject are obtained as coordinates in that coordinate system.

A swallowing movement prediction system using time-series data prediction, comprising:
a preprocessing unit that obtains the positions of the swallowing organs and the bolus from images of the movement of the bolus ingested by the subject and the movements of the swallowing organs, digitizes them as coordinates, and creates a teacher signal of the movements of the swallowing organs and the bolus;
a sensor unit that is placed on a predetermined skin surface of the subject and detects biosignals during eating and swallowing in synchronization with the capture of the images; and
an analysis unit that extracts feature quantities from the biosignals, learns the movement of at least one of the swallowing organs and the bolus based on the teacher signal and the feature quantities using a time-series data prediction method, generates a model capable of predicting the movement of at least one of the swallowing organs and the bolus from the feature quantities, and uses the generated prediction model to predict the movement of at least one of the swallowing organs and the bolus from the feature quantities.