CN117789249A - Motor imagery classification method based on multichannel fNIRS - Google Patents
- Publication number
- CN117789249A CN117789249A CN202311781458.0A CN202311781458A CN117789249A CN 117789249 A CN117789249 A CN 117789249A CN 202311781458 A CN202311781458 A CN 202311781458A CN 117789249 A CN117789249 A CN 117789249A
- Authority
- CN
- China
- Prior art keywords
- fnirs
- motor imagery
- channel
- near infrared
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a motor imagery classification method based on multichannel fNIRS, belonging to the technical field of brain signals. The method comprises the following steps: generating images of near-infrared data through data acquisition and data processing; and classifying left- and right-hand motor imagery, namely, based on the Gramian angular field images, extracting important features through a simplified model designed with the Swin Transformer, with SCConv replacing standard convolution in SC-Swin Transformer blocks, thereby realizing classification of left- and right-hand motor imagery. The invention adopts this motor imagery classification method based on multichannel fNIRS, provides a visual fNIRS framework that combines SCConv and the Swin Transformer, and introduces a channel attention mechanism, which helps obtain feature maps with channel attention and enables the model to focus more effectively on channel information. The advantages of the Swin Transformer in the field of image classification can be fully exploited, the latest computer vision models are applied to the near-infrared signal classification problem, and deep learning techniques are brought to bear on the near-infrared classification problem.
Description
Technical Field
The invention relates to the technical field of brain signals, and in particular to a motor imagery classification method based on multichannel fNIRS.
Background
Functional near-infrared spectroscopy (fNIRS) is a novel non-invasive neuroimaging technique that uses near-infrared light with wavelengths of 650 to 950 nanometers to measure hemodynamic responses in the brain. During the metabolism of brain tissue, neuronal activity consumes oxygen carried by hemoglobin, causing changes in the concentrations of oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HbR) in the activated area. Many other neuroimaging techniques, such as electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI), are also used for brain signal analysis. EEG has better temporal resolution than fNIRS signals, but lower spatial resolution, and is more susceptible to electrical noise and motion artifacts. While fMRI provides excellent spatial resolution, its signal acquisition is time-consuming and generates a lot of noise; more importantly, fMRI equipment is cumbersome and expensive and unsuitable for mobile scenarios. fNIRS, by contrast, has the advantages of safety, portability, and relatively low cost. fNIRS signal analysis has therefore attracted increasing attention from researchers in recent years.
Much past work has applied machine learning to the analysis of BCI systems based on near-infrared spectroscopy (fNIRS). Coyle et al. first demonstrated the potential of fNIRS for developing brain-computer interfaces (BCI) that provide non-muscular support for patients with severe movement disorders. Typically, statistical features such as mean, variance, peak, slope, skewness, and kurtosis are extracted from the fNIRS signal to train a machine learning classifier. Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), and Artificial Neural Networks (ANN) are classical machine learning algorithms. However, these methods generalize poorly, require complicated parameter tuning, and depend too heavily on hand-designed feature extractors and expertise, so high accuracy cannot be achieved in practical applications. Over the past decade, deep learning techniques have developed rapidly. Deep learning can extract effective features by itself and thus exhibits good classification performance. Despite the progress made in fNIRS classification, deep learning still faces many obstacles. First, fNIRS equipment is costly, so large-scale collection of fNIRS signals is challenging; in addition, subjects must undergo a cumbersome signal acquisition procedure. Second, directly applying the latest Computer Vision (CV) and Natural Language Processing (NLP) deep learning techniques to fNIRS data is challenging because fNIRS data differ from the data in those fields.
Recent studies have shown that the Gramian Angular Field (GAF) has found wide application in many fields. For example, Mitiche et al. used GADF in electromagnetic-interference image studies, combined with the feature-reduction methods of local binary patterns and local phase quantization to reject redundant information; in that work, images were classified with random forests, achieving good results. In addition, the GASF method was applied to classify electrocardiogram (ECG) data from the 2017 PhysioNet/CinC challenge to detect atrial fibrillation. In the financial field, Chen et al. proposed a method of encoding a time series into two-dimensional images and compared it with the GAF method.
Disclosure of Invention
The invention aims to provide a motor imagery classification method based on multichannel fNIRS, which addresses the problems that fNIRS equipment is costly, that subjects must undergo a cumbersome signal acquisition procedure, and that directly applying the latest Computer Vision (CV) and Natural Language Processing (NLP) deep learning techniques to fNIRS data is challenging.
In order to achieve the above purpose, the invention provides a motor imagery classification method based on multichannel fNIRS, comprising the following steps:
S1, generating images of near-infrared data through data acquisition and data processing;
S2, classifying left- and right-hand motor imagery: based on the Gramian angular field images, extracting important features using a simplified model designed with the Swin Transformer, with SCConv replacing standard convolution, and SC-Swin Transformer blocks, thereby realizing classification of left- and right-hand motor imagery.
Preferably, in step S1, the data set used in data acquisition is the fNIRS signal, and light sources and detectors are used to record near-infrared spectral data, generating the physiological channels.
Preferably, in step S1, during data processing, the signals acquired by the near-infrared device are converted into concentration changes of oxyhemoglobin and deoxyhemoglobin; the optical density OD over time t is converted into the concentration changes of HbR and HbO due to near-infrared absorption by means of the modified Beer-Lambert law.
Preferably, the conversion formula for converting the optical density OD over time t into the concentration changes of HbR and HbO due to near-infrared light absorption is as follows:

$$\begin{bmatrix} \Delta[\mathrm{HbR}] \\ \Delta[\mathrm{HbO}] \end{bmatrix} = \frac{1}{d} \begin{bmatrix} \varepsilon_{\mathrm{HbR}}(\lambda_1) & \varepsilon_{\mathrm{HbO}}(\lambda_1) \\ \varepsilon_{\mathrm{HbR}}(\lambda_2) & \varepsilon_{\mathrm{HbO}}(\lambda_2) \end{bmatrix}^{-1} \begin{bmatrix} \Delta OD(t,\lambda_1)/DPF(\lambda_1) \\ \Delta OD(t,\lambda_2)/DPF(\lambda_2) \end{bmatrix} \tag{1}$$

where d is the distance between the light source and the detector, λ1 and λ2 are the two illumination wavelengths, DPF is the differential path-length factor, and ε denotes the extinction coefficients of HbR and HbO.
Preferably, in step S1, during image generation, the data from the experimental sessions are integrated to form a time matrix sequence; the fNIRS data are segmented using a sliding time window method; and the fNIRS signal is reshaped into image form, compressed to a predetermined length using PAA while maintaining the signal trend;
adjusting the sequence to the interval [-1, 1]:

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \tag{2}$$

and converting the scaled time series to polar coordinates:

$$\phi = \arccos(\tilde{x}_i), \qquad r = \frac{t_i}{N} \tag{3}$$

where x̃ denotes the processed near-infrared signal, φ and r are the corresponding angle and radius respectively, and N is a constant that regularizes the span of the polar coordinate system.
Preferably, in step S1, during image generation, the sequence is converted into a Gramian angular field, comprising the Gramian angular summation field GASF and the Gramian angular difference field GADF, defined as follows:

$$GASF = [\cos(\phi_i + \phi_j)]_{i,j=1}^{S} \tag{4}$$
$$GADF = [\sin(\phi_i - \phi_j)]_{i,j=1}^{S} \tag{5}$$

In formulas (4) and (5), φ₁ represents the first element of the time series and φ_S the last element;
the 3 Gramian angular difference fields generated from the 3 channels are stacked together along a third dimension, and the file is saved in picture format, yielding a 3-channel true-color image with a resolution of 224 × 224.
Preferably, in step S2, SCConv is composed of a spatial reconstruction unit (SRU) and a channel reconstruction unit (CRU). The SRU separates and reconstructs redundant features according to their weights, thereby suppressing redundancy in the spatial dimension and enhancing the representativeness of the features; the CRU uses a split-transform-fuse strategy to reduce redundancy in the channel dimension as well as computation cost and storage. The SRU and CRU are combined sequentially, replacing the standard convolution.
Preferably, in step S2, an efficient channel attention (ECA) module and SCConv are introduced into the SC-Swin Transformer blocks; the SC-Swin module comprises LN layers, W-MSA, SW-MSA, SRU, CRU, and GELU layers, where W-MSA and SW-MSA are multi-head self-attention modules employing regular and shifted window configurations, respectively.
Preferably, in step S2, the SRU uses the scaling factors in the group normalization (GN) layer to evaluate the information content of different feature maps. Given an intermediate feature map X ∈ R^{N×C×H×W}, where N is the batch axis, C the channel axis, and H and W the spatial height and width axes, the input feature X is first standardized by subtracting the mean μ and dividing by the standard deviation σ:

$$X_{out} = GN(X) = \gamma \,\frac{X - \mu}{\sqrt{\sigma^2 + \varepsilon}} + \beta \tag{6}$$

where μ and σ are the mean and standard deviation of X, ε is a small positive number added for numerically stable division, and γ and β are trainable affine transformation parameters.
The variance of the spatial pixels in each batch and channel is measured in the GN layer using the trainable parameter γ ∈ R^C; the weight values of the feature map re-weighted by w_γ are mapped into the range (0, 1) by a sigmoid function and gated by a threshold. Weights above the threshold are set to 1 to obtain the informative weight W1, and the others are set to 0 to obtain the non-informative weight W2; the whole process of obtaining W can be expressed by formula (7):

$$W = \mathrm{Gate}(\mathrm{Sigmoid}(w_\gamma(GN(X)))) \tag{7}$$

Multiplying the input feature X by W1 and W2 respectively yields two weighted features: the information-rich feature X₁ʷ and the less informative feature X₂ʷ. The input features are thus divided into two parts: X₁ʷ, spatial content with rich information and strong expressive power, and X₂ʷ, which carries almost no information and is considered redundant.
Preferably, in step S2, the channel reconstruction unit CRU makes use of a "split-transform-fusion" strategy, comprising the steps of:
s21, segmentation: refining feature X for a given space ω ∈R C×H×W First X is taken up ω Wherein one part is provided with alpha C channels, and the other part is provided with (1-alpha) C channels, wherein 0 is less than or equal to alpha is less than or equal to 1; subsequently, the channels of the feature map are compressed using a 1×1 convolution to refine the feature X spatially ω Divided into an upper portion Xup and a lower portion Xlow;
s22, conversion: xup is input to the up-conversion stage as a "rich feature extractor", advanced representative information is extracted using GWC and PWC instead of standard kxk convolution, xlow is input to the bottom conversion stage, and a feature map with shallow hidden details is generated as a complement to the rich feature extractor.
S23, fuse: after transformation, the channel-refined feature is obtained by concatenating the upper and lower features along the channel dimension.
Therefore, the motor imagery classification method based on the multichannel fNIRS has the following beneficial effects:
(1) The invention provides a visual fNIRS framework that combines SCConv and the Swin Transformer and introduces a channel attention mechanism, which helps obtain feature maps with channel attention and enables the model to focus more effectively on channel information.
(2) The invention can fully exploit the advantages of the Swin Transformer in the field of image classification, applies the latest computer vision models to the near-infrared signal classification problem, and brings deep learning techniques to bear on the near-infrared classification problem.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a schematic diagram of data preprocessing and image generation of a motor imagery classification method based on a multi-channel fNIRS according to the invention;
FIG. 2 is a schematic diagram of a 3-channel true color map of a motor imagery classification method based on a multi-channel fNIRS according to the invention;
FIG. 3 is a schematic diagram of the SC-Swin Transformer of the motor imagery classification method based on multichannel fNIRS;
fig. 4 is a schematic diagram of the accuracy curves of the training set and validation set of the motor imagery classification method based on multichannel fNIRS.
Detailed Description
The technical scheme of the invention is further described below through the attached drawings and the embodiments.
Unless defined otherwise, technical or scientific terms used herein have the ordinary meaning understood by one of ordinary skill in the art to which this invention belongs. The terms "first", "second", and the like, as used herein, do not denote any order, quantity, or importance, but merely distinguish one element from another. Words such as "comprising" or "comprises" mean that the elements or items listed after the word are encompassed, without excluding other elements or items. Terms such as "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", etc. merely indicate relative positional relationships, which may change when the absolute position of the object described changes.
Examples
As shown in fig. 1, the invention provides a motor imagery classification method based on multichannel fNIRS, comprising the following steps:
S1, generating images of near-infrared data through data acquisition and data processing, comprising the following steps:
S101, Subjects
29 healthy subjects participated in the experiment, including 14 men and 15 women, with an average age of 28.5 ± 3.7 years. No subject reported neurological, psychiatric, or other brain-related disorders. All volunteers received detailed instructions on the experimental procedure and provided written informed consent before the experiment.
S102, data acquisition
The data set consists of fNIRS signals provided by Shin et al. of the Technical University of Berlin. Two experimental tasks were performed: first, a motor imagery experiment, in which the subject was asked to imagine gripping movements of the left or right hand; and second, mental arithmetic and resting-state experiments. We mainly use the motor-imagery-related data from the fNIRS system. During the acquisition experiment, near-infrared spectral data were recorded with 14 light sources and 16 detectors, yielding thirty-six physiological channels. The fNIRS data were downsampled to 10 Hz after acquisition for better processing and analysis.
S103, data processing
In BCI experiments, the signals collected by the near-infrared device are light signals that have been refracted, scattered, and absorbed, which can be converted into concentration changes of oxyhemoglobin (HbO) and deoxyhemoglobin (HbR). From optical theory, the propagation of light in biological tissue obeys the Lambert-Beer law. Via the modified Beer-Lambert law, we can convert the optical density OD over time t into the HbR and HbO concentration changes due to near-infrared absorption.
The conversion formula is as follows:

$$\begin{bmatrix} \Delta[\mathrm{HbR}] \\ \Delta[\mathrm{HbO}] \end{bmatrix} = \frac{1}{d} \begin{bmatrix} \varepsilon_{\mathrm{HbR}}(\lambda_1) & \varepsilon_{\mathrm{HbO}}(\lambda_1) \\ \varepsilon_{\mathrm{HbR}}(\lambda_2) & \varepsilon_{\mathrm{HbO}}(\lambda_2) \end{bmatrix}^{-1} \begin{bmatrix} \Delta OD(t,\lambda_1)/DPF(\lambda_1) \\ \Delta OD(t,\lambda_2)/DPF(\lambda_2) \end{bmatrix} \tag{1}$$

where d is the distance between the light source and the detector, λ1 and λ2 are the two illumination wavelengths, DPF is the differential path-length factor, and ε denotes the extinction coefficients of HbR and HbO.
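As a rough illustration of this conversion, the NumPy sketch below inverts the 2×2 extinction matrix to recover ΔHbR and ΔHbO from optical-density changes at two wavelengths. The extinction coefficients, distance, and DPF values here are made-up placeholders for checking the algebra, not tabulated physiological constants.

```python
import numpy as np

def mbll(delta_od, d, dpf, eps):
    """Modified Beer-Lambert law: recover (dHbR, dHbO) from optical-density
    changes measured at two wavelengths.

    delta_od : (2, T) array, Delta OD at lambda1/lambda2 over time
    d        : source-detector distance
    dpf      : (2,) differential path-length factors for the two wavelengths
    eps      : (2, 2) extinction matrix, rows = wavelengths,
               columns = (HbR, HbO)
    """
    od_norm = delta_od / (d * dpf[:, None])   # divide out effective path length
    return np.linalg.inv(eps) @ od_norm       # (2, T): rows = (dHbR, dHbO)

# Placeholder numbers purely for shape/logic checking
eps = np.array([[1.5, 0.8],
                [0.7, 1.2]])
delta_od = np.array([[0.02], [0.01]])
conc = mbll(delta_od, d=3.0, dpf=np.array([6.0, 6.0]), eps=eps)
```

A quick sanity check is the round trip: multiplying the recovered concentrations back through the extinction matrix must reproduce the path-length-normalized optical densities.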
fNIRS data noise derives primarily from internal physiological noise, including heartbeat (1-1.5 Hz) and respiration (about 0.4 Hz), while the activity and metabolism of endothelial cells of the cerebral cortex are mainly concentrated between 0.005 and 0.21 Hz. In the dataset we use, the fNIRS signal is downsampled to 10 Hz. To reject physiological noise and motion disturbances, we use a third-order Butterworth filter with the passband set to 0.01-0.1 Hz. The signal is then adjusted to a specific level using baseline correction, with the period from 5 seconds to 2 seconds before the start of each trial serving as the baseline parameter. Finally, we segment the data into 0 to 15 seconds from the start of each trial. It should be noted that since the cerebral hemodynamic response exhibits a certain delay profile, these segmented data contain the delayed response of the task-related part (10 to 15 seconds after the trial start). To facilitate subsequent processing and analysis, we save the processed data in xls format.
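The filtering step just described can be sketched as below, assuming SciPy is available; the synthetic signal (an in-band 0.05 Hz component plus a 1.2 Hz "cardiac" component) merely stands in for real fNIRS data, and the cutoffs follow the text: a third-order Butterworth with a 0.01-0.1 Hz passband at fs = 10 Hz.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_fnirs(sig, fs=10.0, low=0.01, high=0.1, order=3):
    """Zero-phase third-order Butterworth bandpass, as described in the text."""
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    return filtfilt(b, a, sig)   # filtfilt avoids phase distortion

# Synthetic check: keep the slow in-band wave, suppress the fast component
t = np.arange(0.0, 600.0, 0.1)                 # 600 s sampled at 10 Hz
inband = np.sin(2 * np.pi * 0.05 * t)          # inside the 0.01-0.1 Hz band
cardiac = 0.5 * np.sin(2 * np.pi * 1.2 * t)    # heartbeat-like interference
filtered = bandpass_fnirs(inband + cardiac)
```

Away from the edges, the filtered trace should track the in-band component almost exactly while the 1.2 Hz interference is strongly attenuated.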
S104, image generation
In our study we systematically performed a series of data processing steps to accommodate our further analysis. First, we integrated the data of the same subject over 30 experiments, each comprising the 10-second experiment span plus 5 seconds after the experiment, forming a time matrix sequence of length 4500 (30 × 15 × 10) and width 72 (36 × 2). Next, we segmented the fNIRS data using a sliding time window. We set the window size to 224 and the step size to 20, dividing the single-channel data of one subject into 213 time-series segments. After dividing the fNIRS signal into small segments with overlapping time windows, constructing appropriate network inputs becomes critical. CNN-based methods generally require an M×N×1 grayscale image or an M×N×3 RGB image as input. For this purpose we reshape the fNIRS signal into image form.
Image generation is the basis of our model. The length of the fNIRS signal may vary with the acquisition equipment and experimental paradigm. We compress the fNIRS signal to a predetermined length using piecewise aggregate approximation (PAA) while preserving the signal trend. Let X be a time-series segment of near-infrared acquisition data, X = {x₁, x₂, …, x_n}, where n equals 224.
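A minimal sketch of the PAA compression step (simplified to assume the segment length divides evenly into the target length, which the text's fixed sizes would not guarantee in general):

```python
import numpy as np

def paa(x, out_len):
    """Piecewise Aggregate Approximation: average equal-size frames so the
    series is compressed to out_len points while keeping its trend."""
    x = np.asarray(x, dtype=float)
    assert len(x) % out_len == 0, "sketch assumes an even split"
    return x.reshape(out_len, -1).mean(axis=1)

compressed = paa(np.arange(8), 4)   # frame means of [0,1], [2,3], [4,5], [6,7]
```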
First, we adjust the sequence to the interval [-1, 1]:

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \tag{2}$$

Second, we convert the scaled time series to polar coordinates:

$$\phi = \arccos(\tilde{x}_i), \qquad r = \frac{t_i}{N} \tag{3}$$

where x̃ denotes the processed near-infrared signal, φ and r are the corresponding angle and radius respectively, and N is a constant that regularizes the span of the polar coordinate system.
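The rescaling to [-1, 1] and the polar-coordinate mapping described above can be sketched as follows (a NumPy toy on a three-point series, not real fNIRS data):

```python
import numpy as np

def to_polar(x):
    """Rescale a series to [-1, 1], then map it to polar coordinates:
    angle phi = arccos(x_tilde), radius r = t / N."""
    x = np.asarray(x, dtype=float)
    x_tilde = ((x - x.max()) + (x - x.min())) / (x.max() - x.min())
    phi = np.arccos(x_tilde)
    r = np.arange(1, len(x) + 1) / len(x)
    return x_tilde, phi, r

x_tilde, phi, r = to_polar([0.0, 1.0, 2.0])
```

On this toy input the rescaled series is [-1, 0, 1], so the angles run from π down to 0 as the values rise.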
Finally, the sequence is converted into a Gramian angular field. The Gramian Angular Field (GAF) is a method of encoding a time-series signal into an image representation, comprising the Gramian Angular Summation Field (GASF) and the Gramian Angular Difference Field (GADF). The Gramian angular field converts time-series data into image data while retaining both the complete information of the signal and its temporal dependence.
GASF and GADF are defined as follows:

$$GASF = [\cos(\phi_i + \phi_j)]_{i,j=1}^{S} \tag{4}$$
$$GADF = [\sin(\phi_i - \phi_j)]_{i,j=1}^{S} \tag{5}$$

In formulas (4) and (5), φ₁ represents the first element of the time series and φ_S the last element, i.e. the 224th element. From this we obtain a Gramian angular difference field with a side length of 224. The 3 Gramian angular difference fields generated from the 3 channels are stacked together along a third dimension, and the file is saved in picture format, yielding a 3-channel true-color image with a resolution of 224 × 224.
Specifically, a given time series X is first scaled to [-1, 1] and then represented in a polar coordinate system: the scaled data are mapped to angles in polar coordinates. Next, the GAF matrix is calculated, then visualized and stored. The generated GADFs of the 3 channels are finally superimposed along a third dimension to form a 3-channel true-color map, as shown in fig. 2.
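The GADF construction and the 3-channel stacking above can be sketched as below; the linearly spaced angle series are synthetic stand-ins for the three processed fNIRS channels.

```python
import numpy as np

def gadf(phi):
    """Gramian angular difference field: G[i, j] = sin(phi_i - phi_j)."""
    phi = np.asarray(phi, dtype=float)
    return np.sin(phi[:, None] - phi[None, :])

# Stack the GADFs of 3 channels along a third axis -> a 224 x 224 x 3 image
phis = [np.linspace(0.0, np.pi, 224) for _ in range(3)]
img = np.stack([gadf(p) for p in phis], axis=-1)
```

Note the GADF is antisymmetric with a zero diagonal, which is visible as the dark main diagonal in GADF images.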
S2, classifying left- and right-hand motor imagery: based on the Gramian angular field images, extracting important features using a simplified model designed with the Swin Transformer, with SCConv replacing standard convolution, and SC-Swin Transformer blocks, thereby realizing classification of left- and right-hand motor imagery, comprising the following steps:
the goal of the S201, swin transducer is to expand the applicability of the transducer so that it can serve as a universal backbone network in the computer vision field. This is similar to the application of NLP and CNN in the field of vision. An important component of the Swin transform design is the way it changes the window divider between the different self-attention-growing phases. The moving windows create connections between windows of the previous layer, thereby significantly increasing modeling capabilities. As shown in FIG. 3, the upper part shows a simplified model of the Swin transducer design.
S202, SCConv
Convolutional Neural Networks (CNNs) achieve remarkable performance in a variety of computer vision tasks, but at the cost of enormous computational resources, partly because the convolutional layers extract redundant features. SCConv (spatial and channel reconstruction convolution) attempts to exploit the spatial and channel redundancy between features to compress CNNs, reducing redundant computation and facilitating representative feature learning. SCConv consists of two units: a Spatial Reconstruction Unit (SRU) and a Channel Reconstruction Unit (CRU). The SRU separates and reconstructs redundant features according to their weights, thereby suppressing redundancy in the spatial dimension and enhancing the representativeness of the features. The CRU uses a split-transform-fuse strategy to reduce redundancy in the channel dimension as well as computation cost and storage. The SRU and CRU are combined sequentially, replacing the standard convolution. SCConv can greatly reduce the computational load and improve model performance on difficult tasks.
S203: Since the time dimension of the fNIRS data is markedly larger than the channel dimension, a Transformer model may neglect the channel characteristics of the fNIRS data. To solve this problem, we created a dedicated structure that combines a channel attention mechanism with the Swin Transformer. We introduce an efficient channel attention (ECA) module and SCConv, which exploit dependencies between channels so that the model can extract important features under various conditions. The SC-Swin module consists of LN layers, W-MSA, SW-MSA, SRU, CRU, and GELU layers, as shown in FIG. 3. W-MSA and SW-MSA are multi-head self-attention modules employing regular and shifted window configurations, respectively.
(1)ECA
ECA is an attention mechanism model for image processing, which improves the effectiveness of image feature representation and extracts more critical information by performing attention modulation over the image channels. Specifically, the ECA attention model consists of two parts: global average pooling and a linear transformation. Global average pooling aggregates the information of each channel in order to judge whether the information in that channel is critical; the linear transformation scales and shifts the channel information so that key information is better retained and non-key information is suppressed.
(2)SRU
The information content of the different feature maps is first evaluated with the scaling factors in the Group Normalization (GN) layer. Specifically, given an intermediate feature map X ∈ R^{N×C×H×W}, where N is the batch axis, C the channel axis, and H and W the spatial height and width, we first standardize the input feature X by subtracting the mean μ and dividing by the standard deviation σ:

$$X_{out} = GN(X) = \gamma \,\frac{X - \mu}{\sqrt{\sigma^2 + \varepsilon}} + \beta \tag{6}$$

where μ and σ are the mean and standard deviation of X, ε is a small positive number added for numerically stable division, and γ and β are trainable affine transformation parameters.
The variance of the spatial pixels for each batch and channel is measured in the GN layer using the trainable parameter γe RC. The richer spatial information reflects more spatial pixel variations and thus contributes more gamma. The normalized correlation weight wγe RC is obtained by equation 2, which indicates the importance of the different feature maps. The weight values of the feature map re-weighted by wy are then mapped to the range (0, 1) by a sigmoid function and gated by a threshold. The weights above the threshold are then set to 1 to obtain an information rich weight W1, while they are set to 0 to obtain a non-information rich weight W2 (the threshold is set to 0.5 in the experiment). The entire process of acquiring W can be represented by equation 2:
W = Gate(Sigmoid(w_γ(GN(X)))) (7)
Finally, we multiply the input feature X by W1 and W2, respectively, yielding two weighted features: the information-rich feature X1^w = W1 ⊗ X and the less informative feature X2^w = W2 ⊗ X. Thus we succeed in dividing the input feature into two parts: X1^w carries spatial content that is rich in information and strongly expressive, while X2^w carries almost no information and is considered redundant.
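By way of illustration only, the SRU pipeline above (GN scaling factors, normalized weights w_γ, sigmoid mapping, threshold gate, and the two masked features) can be sketched as follows; the helper name and the single-group normalization are assumptions of this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru(x, gamma, beta, threshold=0.5, eps=1e-5):
    """Spatial Reconstruction Unit sketch (hypothetical helper).

    x: feature map (N, C, H, W); gamma, beta: GN affine params, shape (C,).
    Returns the information-rich part W1*x and the redundant part W2*x.
    """
    # Group normalization with a single group, standing in for the GN layer.
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    sigma = x.std(axis=(1, 2, 3), keepdims=True)
    xn = gamma[None, :, None, None] * (x - mu) / (sigma + eps) \
         + beta[None, :, None, None]
    # Normalized channel weights w_gamma measure feature-map importance.
    w_gamma = gamma / gamma.sum()
    # Re-weight, squash to (0, 1), then hard-gate at the threshold.
    scores = sigmoid(w_gamma[None, :, None, None] * xn)
    w1 = (scores >= threshold).astype(x.dtype)   # informative mask
    w2 = 1.0 - w1                                # redundant mask
    return w1 * x, w2 * x

x = np.random.randn(2, 4, 3, 3)
gamma = np.array([0.1, 0.5, 0.9, 1.5])
beta = np.zeros(4)
x_info, x_red = sru(x, gamma, beta)
```

Because W1 and W2 are complementary binary masks, the two outputs recompose the input exactly: x_info + x_red equals x.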
(3)CRU
To exploit channel redundancy, a Channel Reconstruction Unit (CRU) is then introduced, which uses a "split-transform-fuse" strategy. Split: given a spatially refined feature X_ω ∈ R^(C×H×W), X_ω is first divided into two parts, one with αC channels and the other with (1−α)C channels, where 0 ≤ α ≤ 1. Subsequently, the channels of the feature maps are further compressed using 1×1 convolutions to increase computational efficiency. After the split and compression operations, the spatially refined feature X_ω is divided into an upper part Xup and a lower part Xlow. Transform: Xup is fed into the upper transform stage as a "rich feature extractor". Efficient convolution operations (i.e., GWC and PWC) replace the expensive standard k×k convolution to extract high-level representative information while reducing computation cost. Xlow is fed into the lower transform stage, where a low-cost 1×1 PWC operation is applied to generate feature maps with shallow hidden details, as a complement to the rich feature extractor. Fuse: after the transforms, the channel-refined feature is obtained by combining the upper and lower features along the channel dimension.
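By way of illustration only, the split-transform-fuse flow can be sketched in NumPy; the GWC/PWC branches are simplified here to plain 1×1 channel mixes with random weights, and all names and shapes are assumptions of the sketch:

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution as a channel-mixing matmul; x: (C_in, H, W), w: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def cru(x, alpha=0.5, rng=None):
    """Channel Reconstruction Unit sketch ("split-transform-fuse").

    x: spatially refined feature (C, H, W). The expensive GWC + PWC branch
    and the cheap PWC branch are both stood in for by simple 1x1 mixes.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    C, H, W = x.shape
    c_up = int(alpha * C)
    # Split: alpha*C "rich" channels vs (1-alpha)*C "cheap" channels.
    x_up, x_low = x[:c_up], x[c_up:]
    # Transform: upper branch stands in for GWC + PWC,
    # lower branch for the low-cost 1x1 PWC complement.
    y_up = conv1x1(x_up, rng.standard_normal((c_up, c_up)))
    y_low = conv1x1(x_low, rng.standard_normal((C - c_up, C - c_up)))
    # Fuse: concatenate along the channel axis.
    return np.concatenate([y_up, y_low], axis=0)

y = cru(np.random.rand(8, 4, 4))
print(y.shape)  # (8, 4, 4)
```

The parameter α controls how much of the channel budget goes to the expensive branch; α = 0.5 splits the channels evenly.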
Therefore, the invention adopts the motor imagery classification method based on multichannel fNIRS described above, provides a vision-based fNIRS framework that combines ScConv with the Swin Transformer, and introduces a channel attention mechanism, which helps obtain feature maps with channel attention so that the model can focus on channel information more effectively. The advantages of the Swin Transformer in the image classification field can be fully exploited, applying a state-of-the-art computer vision model to the near-infrared signal classification problem and demonstrating the potential of deep learning techniques for solving it.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the invention may be modified or equivalently substituted without departing from its spirit and scope.
Claims (10)
1. A motor imagery classification method based on multichannel fNIRS is characterized in that: the method comprises the following steps:
s1, generating an image of near infrared data, and generating the image of the near infrared data through data acquisition and data processing;
s2, classifying left and right hand motor imagery, namely based on the gram angle field image, extracting important features by using a SwinTransformer design simplified model and ScConv to replace standard convolution and SC-SwinTransformer blocks, and realizing classification of the left and right hand motor imagery.
2. The method for classifying motor imagery based on multichannel fNIRS according to claim 1, wherein: in step S1, the data set used in data acquisition consists of fNIRS signals, and light sources and detectors are used to record near-infrared spectroscopy data, generating physiological channels.
3. The method for classifying motor imagery based on multichannel fNIRS according to claim 2, wherein: in step S1, during data processing, the signals acquired by the near-infrared device are converted into concentration changes of oxyhemoglobin and deoxyhemoglobin; the optical density OD over time t is converted into the concentration changes of HbR and HbO, which absorb the near-infrared light, by means of the modified Beer-Lambert law.
4. A method of classifying motor imagery based on a multi-channel fNIRS according to claim 3, wherein: the conversion formula for converting the optical density OD over time t into the concentration changes of HbR and HbO is as follows:

[ΔHbO; ΔHbR] = (1 / (d · DPF)) · [ε_HbO(λ1), ε_HbR(λ1); ε_HbO(λ2), ε_HbR(λ2)]^(−1) · [ΔOD(λ1); ΔOD(λ2)]

where d is the distance between the light source and the detector, λ1 and λ2 are the two illumination wavelengths, DPF is the differential path-length factor, and ε denotes the extinction coefficients of HbR and HbO.
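By way of illustration only (and not as part of the claim), the two-wavelength modified Beer-Lambert inversion can be sketched in NumPy; the extinction-coefficient values below are hypothetical placeholders for wavelengths near 760 nm and 850 nm, not values from the claim:

```python
import numpy as np

def mbll(delta_od, ext, d, dpf):
    """Modified Beer-Lambert law sketch.

    delta_od: optical-density changes at two wavelengths, shape (2,).
    ext: 2x2 extinction matrix [[eps_HbO(l1), eps_HbR(l1)],
                               [eps_HbO(l2), eps_HbR(l2)]].
    d: source-detector distance; dpf: differential path-length factor.
    Returns (dHbO, dHbR), the concentration changes.
    """
    # Solve  delta_od = d * DPF * ext @ [dHbO, dHbR]  for the concentrations.
    return np.linalg.solve(ext * d * dpf, delta_od)

# Hypothetical extinction coefficients and geometry.
ext = np.array([[1486.0, 3843.0],
                [2526.0, 1798.0]])
dHb = mbll(np.array([0.01, 0.02]), ext, d=3.0, dpf=6.0)
```

Feeding the result back through the forward model reproduces the measured ΔOD, which is a quick sanity check on the inversion.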
5. The method for classifying motor imagery based on multichannel fNIRS of claim 4, wherein: in step S1, during image generation, the data from the experimental procedure are integrated to form a time-matrix sequence, the fNIRS data are segmented with a sliding-time-window method, the fNIRS signals are reshaped into image form, and PAA is used to compress each fNIRS signal to a preset length while preserving the signal trend;
the sequence is rescaled to the interval [−1, 1]:

φ_i = (2x_i − max(X) − min(X)) / (max(X) − min(X))
the scaled time series is converted to polar coordinates.
Wherein phi represents the processed near infrared signal, theta and r are the corresponding radian and radius respectively, and N is a constant for normalizing the span of the polar coordinate system.
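By way of illustration only, the PAA compression and polar-coordinate mapping can be sketched as follows; the sketch assumes the series length divides evenly into the number of PAA segments:

```python
import numpy as np

def paa(x, segments):
    """Piecewise Aggregate Approximation: compress x to `segments` block means.

    Assumes len(x) is a multiple of `segments`.
    """
    return x.reshape(segments, -1).mean(axis=1)

def to_polar(x):
    """Rescale to [-1, 1] and map to polar coordinates (theta, r)."""
    lo, hi = x.min(), x.max()
    phi = (2 * x - hi - lo) / (hi - lo)   # scaled series in [-1, 1]
    theta = np.arccos(phi)                # angles in [0, pi]
    r = np.arange(1, len(x) + 1) / len(x) # radius encodes the time stamp
    return theta, r

sig = np.arange(16.0)                      # toy fNIRS segment
theta, r = to_polar(paa(sig, 4))
```

Because arccos is monotone decreasing, the minimum of the scaled series maps to angle π and the maximum to angle 0, while the radius grows linearly with time.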
6. The method for classifying motor imagery based on multichannel fNIRS according to claim 5, wherein: in step S1, during image generation, the sequence is converted into a Gramian angular field, comprising the Gramian angular summation field GASF and the Gramian angular difference field GADF, which are defined as follows:

GASF = [cos(θ_i + θ_j)], i, j = 1, …, S (5)
GADF = [sin(θ_i − θ_j)], i, j = 1, …, S (6)

in formulas (5) and (6), φ_1 represents the first element of the time series and φ_S represents the last element, with θ_i = arccos(φ_i);
the 3 Gramian angular difference fields generated from the 3 channels are stacked together along a third dimension, and the file is saved in picture format to obtain a 3-channel true-color image with a resolution of 224×224.
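By way of illustration only, the GASF/GADF construction and the 3-channel stacking of claim 6 can be sketched as follows; the linspace inputs are stand-ins for real rescaled fNIRS series:

```python
import numpy as np

def gramian_fields(phi):
    """GASF/GADF from a series already scaled to [-1, 1] (sketch).

    GASF[i, j] = cos(theta_i + theta_j), GADF[i, j] = sin(theta_i - theta_j).
    """
    theta = np.arccos(np.clip(phi, -1.0, 1.0))
    gasf = np.cos(theta[:, None] + theta[None, :])
    gadf = np.sin(theta[:, None] - theta[None, :])
    return gasf, gadf

# Stack the GADFs of 3 fNIRS channels along a third dimension,
# yielding one 3-channel 224x224 image as in the claim.
channels = [np.linspace(-1, 1, 224) for _ in range(3)]
image = np.stack([gramian_fields(c)[1] for c in channels], axis=-1)
print(image.shape)  # (224, 224, 3)
```

Note the structural properties: GASF is symmetric and GADF is antisymmetric, so the two fields encode complementary temporal correlations.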
7. The method for classifying motor imagery based on multichannel fNIRS of claim 6, wherein: in step S2, SCConv is composed of a spatial reconstruction unit SRU and a channel reconstruction unit CRU; the SRU separates and reconstructs redundant features according to their weights, thereby suppressing redundancy in the spatial dimension and enhancing the representativeness of the features; the CRU uses a split-transform-fuse strategy to reduce redundancy in the channel dimension, computation cost, and storage space; the SRU and CRU are combined sequentially, replacing the standard convolution.
8. The method for classifying motor imagery based on multichannel fNIRS of claim 7, wherein: in step S2, in the SC-Swin Transformer blocks, efficient channel attention (ECA) and SCConv are introduced; the SC-Swin modules include LN layers, W-MSA, SW-MSA, SRU, CRU, and GELU layers, where W-MSA and SW-MSA are multi-headed self-attention modules employing regular and shifted window configurations, respectively.
9. The method for classifying motor imagery based on multichannel fNIRS according to claim 8, wherein: in step S2, the SRU evaluates the information content of the different feature maps using the scaling factors in the group-normalization GN layer; given an intermediate feature map X ∈ R^(N×C×H×W), where N is the batch axis, C is the channel axis, and H and W are the spatial height and width axes, the input feature X is first normalized by subtracting the mean μ and dividing by the standard deviation σ, as follows:

X_N = γ · (X − μ) / √(σ² + ε) + β
where μ and σ are the mean and standard deviation of X, ε is a small positive number added for stable division, and γ and β are trainable affine transformations.
measuring the variance of the spatial pixels of each batch and channel in the GN layer using the trainable parameter γ ∈ R^C;
mapping the weight values of the feature maps re-weighted by w_γ into the range (0, 1) by a sigmoid function and gating them by a threshold; the weights above the threshold are set to 1 to obtain the information-rich weight W1, and those below are set to 0 to obtain the non-information-rich weight W2; the entire process of obtaining W can be expressed by equation (7):
W = Gate(Sigmoid(w_γ(GN(X)))) (7)
multiplying the input feature X by W1 and W2, respectively, yields two weighted features: the information-rich feature X1^w = W1 ⊗ X and the less informative feature X2^w = W2 ⊗ X; the input feature is thus divided into two parts: X1^w carries spatial content that is rich in information and strongly expressive, while X2^w carries almost no information and is considered redundant.
10. The method for classifying motor imagery based on multichannel fNIRS according to claim 9, wherein: in step S2, the channel reconstruction unit CRU uses a "split-transform-fuse" strategy comprising the following steps:
s21, segmentation: refining feature X for a given space ω ∈R C×H×W First X is taken up ω Wherein one part is provided with alpha C channels, and the other part is provided with (1-alpha) C channels, wherein 0 is less than or equal to alpha is less than or equal to 1; subsequently, the channels of the feature map are compressed using a 1×1 convolution to refine the feature X spatially ω Divided into an upper portion Xup and a lower portion Xlow;
s22, conversion: xup is input to the up-conversion stage as a "rich feature extractor", advanced representative information is extracted using GWC and PWC instead of standard kxk convolution, xlow is input to the bottom conversion stage, and a feature map with shallow hidden details is generated as a complement to the rich feature extractor.
S23, fuse: after the transforms, the channel-refined feature is obtained by combining the upper and lower features along the channel dimension.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311781458.0A CN117789249A (en) | 2023-12-22 | 2023-12-22 | Motor imagery classification method based on multichannel fNIRS |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117789249A true CN117789249A (en) | 2024-03-29 |
Family
ID=90395698
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311781458.0A Pending CN117789249A (en) | 2023-12-22 | 2023-12-22 | Motor imagery classification method based on multichannel fNIRS |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117789249A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104375635A (en) * | 2014-08-14 | 2015-02-25 | 华中科技大学 | Quick near-infrared brain-computer interface method |
US20170202518A1 (en) * | 2016-01-14 | 2017-07-20 | Technion Research And Development Foundation Ltd. | System and method for brain state classification |
WO2023092813A1 (en) * | 2021-11-25 | 2023-06-01 | 苏州大学 | Swin-transformer image denoising method and system based on channel attention |
CN116842329A (en) * | 2023-07-10 | 2023-10-03 | 湖北大学 | Motor imagery task classification method and system based on electroencephalogram signals and deep learning |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104375635A (en) * | 2014-08-14 | 2015-02-25 | 华中科技大学 | Quick near-infrared brain-computer interface method |
US20170202518A1 (en) * | 2016-01-14 | 2017-07-20 | Technion Research And Development Foundation Ltd. | System and method for brain state classification |
WO2023092813A1 (en) * | 2021-11-25 | 2023-06-01 | 苏州大学 | Swin-transformer image denoising method and system based on channel attention |
CN116842329A (en) * | 2023-07-10 | 2023-10-03 | 湖北大学 | Motor imagery task classification method and system based on electroencephalogram signals and deep learning |
Non-Patent Citations (6)
Title |
---|
HAN WANG等: ""A Novel Algorithmic Structure of EEG Channel Attention Combined With Swin Transformer for Motor Patterns Classification"", 《IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING》, vol. 31, 21 July 2023 (2023-07-21), pages 3132 - 3141 * |
JIAFENG LI等: ""SCConv: Spatial and Channel Reconstruction Convolution for Feature"", 《2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》, 22 August 2023 (2023-08-22), pages 6153 - 6162 * |
ZE LIU等: ""Swin transformer Hierarchical vision transformer using shifted windows"", 《PROCEEDINGS OF THE IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION》, 31 December 2021 (2021-12-31), pages 10012 - 10022 * |
ZENGHUI WANG等: ""A general and scalable vision framework for functional near-infrared spectroscopy classification"", 《IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING》, vol. 30, 13 July 2022 (2022-07-13), pages 1982 - 1991, XP011914933, DOI: 10.1109/TNSRE.2022.3190431 * |
ZENGHUI WANG等: ""A general and scalable vision framework for functional near-infrared spectroscopy classification"", 《IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING》, vol. 30, 22 July 2022 (2022-07-22), pages 1982 - 1991, XP011914933, DOI: 10.1109/TNSRE.2022.3190431 * |
LI Yu; XIONG Xin; LI Zhaoyang; FU Yunfa: "Recognition of three imagined right-foot movements based on functional near-infrared spectroscopy", Journal of Biomedical Engineering, no. 02, 25 April 2020 (2020-04-25) * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||