CN117789249A - Motor imagery classification method based on multichannel fNIRS - Google Patents
- Publication number
- CN117789249A CN117789249A CN202311781458.0A CN202311781458A CN117789249A CN 117789249 A CN117789249 A CN 117789249A CN 202311781458 A CN202311781458 A CN 202311781458A CN 117789249 A CN117789249 A CN 117789249A
- Authority
- CN
- China
- Prior art keywords
- fnirs
- motor imagery
- channel
- near infrared
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a motor imagery classification method based on multichannel fNIRS, belonging to the technical field of brain signals. The method comprises the following steps: generating images of near-infrared data through data acquisition and data processing; and classifying left- and right-hand motor imagery, namely, based on the Gramian angular field images, extracting important features through a simplified model designed with the Swin Transformer, with SCConv replacing standard convolution in SC-Swin Transformer blocks, thereby realizing classification of left- and right-hand motor imagery. The invention adopts this motor imagery classification method based on multichannel fNIRS, provides a visual fNIRS framework that combines SCConv and the Swin Transformer, and introduces a channel attention mechanism, which helps obtain feature maps with channel attention and enables the model to focus more effectively on channel information. The advantages of the Swin Transformer in the field of image classification can be fully exploited, the latest computer vision models are applied to the near-infrared signal classification problem, and deep learning techniques are brought to bear on the near-infrared classification problem.
Description
Technical Field
The invention relates to the technical field of brain signals, and in particular to a motor imagery classification method based on multichannel fNIRS.
Background
Functional near-infrared spectroscopy (fNIRS) is a novel non-invasive neuroimaging technique that uses near-infrared light with wavelengths of 650 to 950 nanometers to measure hemodynamic responses in the brain. During the metabolism of brain tissue, neuronal activity consumes oxygen carried by hemoglobin, causing changes in the concentrations of oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HbR) in the activated area. Many other neuroimaging techniques, such as electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI), are also used for brain signal analysis. EEG has better temporal resolution than fNIRS signals, but lower spatial resolution, and is more susceptible to electrical noise and motion artifacts. While fMRI provides excellent spatial resolution, its signal acquisition is time-consuming and generates a lot of noise; more importantly, fMRI equipment is cumbersome and expensive and unsuitable for mobile scenarios. fNIRS, by contrast, has the advantages of safety, portability, and relatively low cost. fNIRS signal analysis has therefore attracted increasing attention from researchers in recent years.
Much past work has applied machine learning to the analysis of BCI systems based on near-infrared spectroscopy (fNIRS). Coyle et al. first demonstrated the potential of fNIRS for developing brain-computer interfaces (BCI) that provide non-muscular support for patients with severe movement disorders. Typically, statistical features such as mean, variance, peak, slope, skewness, and kurtosis are extracted from the fNIRS signal to train a machine learning classifier. Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), and Artificial Neural Networks (ANN) are classical machine learning algorithms. However, these methods generalize poorly, require complicated parameter tuning, and depend too heavily on hand-designed feature extractors and expertise, so high accuracy cannot be achieved in practical applications. Over the past decade, deep learning techniques have developed rapidly. Deep learning can extract effective features by itself and thus exhibits good classification performance. Despite the progress made in fNIRS classification, deep learning still faces many obstacles. First, fNIRS equipment is costly, so large-scale collection of fNIRS signals is challenging; in addition, subjects must undergo a cumbersome signal acquisition procedure. Second, directly applying the latest Computer Vision (CV) and Natural Language Processing (NLP) deep learning techniques to fNIRS data is challenging because fNIRS data differ from the data in those fields.
Recent studies have shown that the Gramian Angular Field (GAF) has found wide application in many fields. For example, Mitiche et al. used GADF in electromagnetic-interference image studies, combined with the feature-reduction methods of local binary patterns and local phase quantization to reject redundant information; in that work, images were classified with random forests, achieving good results. In addition, the GASF method was applied to classify electrocardiogram (ECG) data from the 2017 PhysioNet/CinC challenge to detect atrial fibrillation. In the financial field, Chen et al. proposed a method of encoding a time series into two-dimensional images and compared it with the GAF method.
Disclosure of Invention
The invention aims to provide a motor imagery classification method based on multichannel fNIRS, which addresses the problems that fNIRS equipment is costly, that subjects must undergo a cumbersome signal acquisition procedure, and that directly applying the latest Computer Vision (CV) and Natural Language Processing (NLP) deep learning techniques to fNIRS data is challenging.
In order to achieve the above purpose, the invention provides a motor imagery classification method based on multichannel fNIRS, comprising the following steps:
S1, generating images of near-infrared data through data acquisition and data processing;
S2, classifying left- and right-hand motor imagery: based on the Gramian angular field images, extracting important features using a simplified model designed with the Swin Transformer, with SCConv replacing standard convolution, and SC-Swin Transformer blocks, thereby realizing classification of left- and right-hand motor imagery.
Preferably, in step S1, the data set used in data acquisition is the fNIRS signal, and light sources and detectors are used to record near-infrared spectral data, generating the physiological channels.
Preferably, in step S1, during data processing, the signals acquired by the near-infrared device are converted into concentration changes of oxyhemoglobin and deoxyhemoglobin; the optical density OD over time t is converted into the concentration changes of HbR and HbO due to near-infrared absorption by means of the modified Beer-Lambert law.
Preferably, the conversion formula for converting the optical density OD over time t into the concentration changes of HbR and HbO due to near-infrared light absorption is as follows:

$$\begin{bmatrix} \Delta[\mathrm{HbR}] \\ \Delta[\mathrm{HbO}] \end{bmatrix} = \frac{1}{d} \begin{bmatrix} \varepsilon_{\mathrm{HbR}}(\lambda_1) & \varepsilon_{\mathrm{HbO}}(\lambda_1) \\ \varepsilon_{\mathrm{HbR}}(\lambda_2) & \varepsilon_{\mathrm{HbO}}(\lambda_2) \end{bmatrix}^{-1} \begin{bmatrix} \Delta OD(t,\lambda_1)/DPF(\lambda_1) \\ \Delta OD(t,\lambda_2)/DPF(\lambda_2) \end{bmatrix} \tag{1}$$

where d is the distance between the light source and the detector, λ1 and λ2 are the two illumination wavelengths, DPF is the differential path-length factor, and ε denotes the extinction coefficients of HbR and HbO.
Preferably, in step S1, during image generation, the data from the experimental sessions are integrated to form a time matrix sequence; the fNIRS data are segmented using a sliding time window method; and the fNIRS signal is reshaped into image form, compressed to a predetermined length using PAA while maintaining the signal trend;
adjusting the sequence to the interval [-1, 1]:

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \tag{2}$$

and converting the scaled time series to polar coordinates:

$$\phi = \arccos(\tilde{x}_i), \qquad r = \frac{t_i}{N} \tag{3}$$

where x̃ denotes the processed near-infrared signal, φ and r are the corresponding angle and radius respectively, and N is a constant that regularizes the span of the polar coordinate system.
Preferably, in step S1, during image generation, the sequence is converted into a Gramian angular field, comprising the Gramian angular summation field GASF and the Gramian angular difference field GADF, defined as follows:

$$GASF = [\cos(\phi_i + \phi_j)]_{i,j=1}^{S} \tag{4}$$
$$GADF = [\sin(\phi_i - \phi_j)]_{i,j=1}^{S} \tag{5}$$

In formulas (4) and (5), φ₁ represents the first element of the time series and φ_S the last element;
the 3 Gramian angular difference fields generated from the 3 channels are stacked together along a third dimension, and the file is saved in picture format, yielding a 3-channel true-color image with a resolution of 224 × 224.
Preferably, in step S2, SCConv is composed of a spatial reconstruction unit (SRU) and a channel reconstruction unit (CRU). The SRU separates and reconstructs redundant features according to their weights, thereby suppressing redundancy in the spatial dimension and enhancing the representativeness of the features; the CRU uses a split-transform-fuse strategy to reduce redundancy in the channel dimension as well as computation cost and storage. The SRU and CRU are combined sequentially, replacing the standard convolution.
Preferably, in step S2, an efficient channel attention (ECA) module and SCConv are introduced into the SC-Swin Transformer blocks; the SC-Swin module comprises LN layers, W-MSA, SW-MSA, SRU, CRU, and GELU layers, where W-MSA and SW-MSA are multi-head self-attention modules employing regular and shifted window configurations, respectively.
Preferably, in step S2, the SRU uses the scaling factors in the group normalization (GN) layer to evaluate the information content of different feature maps. Given an intermediate feature map X ∈ R^{N×C×H×W}, where N is the batch axis, C the channel axis, and H and W the spatial height and width axes, the input feature X is first standardized by subtracting the mean μ and dividing by the standard deviation σ:

$$X_{out} = GN(X) = \gamma \,\frac{X - \mu}{\sqrt{\sigma^2 + \varepsilon}} + \beta \tag{6}$$

where μ and σ are the mean and standard deviation of X, ε is a small positive number added for numerically stable division, and γ and β are trainable affine transformation parameters.
The variance of the spatial pixels in each batch and channel is measured in the GN layer using the trainable parameter γ ∈ R^C; the weight values of the feature map re-weighted by w_γ are mapped into the range (0, 1) by a sigmoid function and gated by a threshold. Weights above the threshold are set to 1 to obtain the informative weight W1, and the others are set to 0 to obtain the non-informative weight W2; the whole process of obtaining W can be expressed by formula (7):

$$W = \mathrm{Gate}(\mathrm{Sigmoid}(w_\gamma(GN(X)))) \tag{7}$$

Multiplying the input feature X by W1 and W2 respectively yields two weighted features: the information-rich feature X₁ʷ and the less informative feature X₂ʷ. The input features are thus divided into two parts: X₁ʷ, spatial content with rich information and strong expressive power, and X₂ʷ, which carries almost no information and is considered redundant.
Preferably, in step S2, the channel reconstruction unit CRU makes use of a "split-transform-fusion" strategy, comprising the steps of:
s21, segmentation: refining feature X for a given space ω ∈R C×H×W First X is taken up ω Wherein one part is provided with alpha C channels, and the other part is provided with (1-alpha) C channels, wherein 0 is less than or equal to alpha is less than or equal to 1; subsequently, the channels of the feature map are compressed using a 1×1 convolution to refine the feature X spatially ω Divided into an upper portion Xup and a lower portion Xlow;
s22, conversion: xup is input to the up-conversion stage as a "rich feature extractor", advanced representative information is extracted using GWC and PWC instead of standard kxk convolution, xlow is input to the bottom conversion stage, and a feature map with shallow hidden details is generated as a complement to the rich feature extractor.
S23, fuse: after transformation, the channel-refined feature is obtained by concatenating the upper and lower features along the channel dimension.
Therefore, the motor imagery classification method based on the multichannel fNIRS has the following beneficial effects:
(1) The invention provides a visual fNIRS framework that combines SCConv and the Swin Transformer and introduces a channel attention mechanism, which helps obtain feature maps with channel attention and enables the model to focus more effectively on channel information.
(2) The invention can fully exploit the advantages of the Swin Transformer in the field of image classification, applies the latest computer vision models to the near-infrared signal classification problem, and brings deep learning techniques to bear on the near-infrared classification problem.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a schematic diagram of data preprocessing and image generation of a motor imagery classification method based on a multi-channel fNIRS according to the invention;
FIG. 2 is a schematic diagram of a 3-channel true color map of a motor imagery classification method based on a multi-channel fNIRS according to the invention;
FIG. 3 is a schematic diagram of the SC-Swin Transformer of the motor imagery classification method based on multichannel fNIRS;
fig. 4 is a schematic diagram of the accuracy curves of the training set and validation set of the motor imagery classification method based on multichannel fNIRS.
Detailed Description
The technical scheme of the invention is further described below through the attached drawings and the embodiments.
Unless defined otherwise, technical or scientific terms used herein have the ordinary meaning understood by one of ordinary skill in the art to which this invention belongs. The terms "first", "second", and the like, as used herein, do not denote any order, quantity, or importance, but merely distinguish one element from another. Words such as "comprising" or "comprises" mean that the elements or items listed after the word are encompassed, without excluding other elements or items. Terms such as "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", etc. merely indicate relative positional relationships, which may change when the absolute position of the object described changes.
Examples
As shown in fig. 1, the invention provides a motor imagery classification method based on multichannel fNIRS, comprising the following steps:
S1, generating images of near-infrared data through data acquisition and data processing, comprising the following steps:
S101, Subjects
29 healthy subjects participated in the experiment, including 14 men and 15 women, with an average age of 28.5 ± 3.7 years. No subject reported neurological, psychiatric, or other brain-related disorders. All volunteers received detailed instructions on the experimental procedure and provided written informed consent before the experiment.
S102, data acquisition
The data set consists of fNIRS signals provided by Shin et al. of the Technical University of Berlin. Two experimental tasks were performed: first, a motor imagery experiment, in which the subject was asked to imagine gripping movements of the left or right hand; and second, mental arithmetic and resting-state experiments. We mainly use the motor-imagery-related data from the fNIRS system. During the acquisition experiment, near-infrared spectral data were recorded with 14 light sources and 16 detectors, yielding thirty-six physiological channels. The fNIRS data were downsampled to 10 Hz after acquisition for better processing and analysis.
S103, data processing
In BCI experiments, the signals collected by the near-infrared device are light signals that have been refracted, scattered, and absorbed, which can be converted into concentration changes of oxyhemoglobin (HbO) and deoxyhemoglobin (HbR). From optical theory, the propagation of light in biological tissue obeys the Lambert-Beer law. Via the modified Beer-Lambert law, we can convert the optical density OD over time t into the HbR and HbO concentration changes due to near-infrared absorption.
The conversion formula is as follows:

$$\begin{bmatrix} \Delta[\mathrm{HbR}] \\ \Delta[\mathrm{HbO}] \end{bmatrix} = \frac{1}{d} \begin{bmatrix} \varepsilon_{\mathrm{HbR}}(\lambda_1) & \varepsilon_{\mathrm{HbO}}(\lambda_1) \\ \varepsilon_{\mathrm{HbR}}(\lambda_2) & \varepsilon_{\mathrm{HbO}}(\lambda_2) \end{bmatrix}^{-1} \begin{bmatrix} \Delta OD(t,\lambda_1)/DPF(\lambda_1) \\ \Delta OD(t,\lambda_2)/DPF(\lambda_2) \end{bmatrix} \tag{1}$$

where d is the distance between the light source and the detector, λ1 and λ2 are the two illumination wavelengths, DPF is the differential path-length factor, and ε denotes the extinction coefficients of HbR and HbO.
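As a rough illustration of this conversion, the NumPy sketch below inverts the 2×2 extinction matrix to recover ΔHbR and ΔHbO from optical-density changes at two wavelengths. The extinction coefficients, distance, and DPF values here are made-up placeholders for checking the algebra, not tabulated physiological constants.

```python
import numpy as np

def mbll(delta_od, d, dpf, eps):
    """Modified Beer-Lambert law: recover (dHbR, dHbO) from optical-density
    changes measured at two wavelengths.

    delta_od : (2, T) array, Delta OD at lambda1/lambda2 over time
    d        : source-detector distance
    dpf      : (2,) differential path-length factors for the two wavelengths
    eps      : (2, 2) extinction matrix, rows = wavelengths,
               columns = (HbR, HbO)
    """
    od_norm = delta_od / (d * dpf[:, None])   # divide out effective path length
    return np.linalg.inv(eps) @ od_norm       # (2, T): rows = (dHbR, dHbO)

# Placeholder numbers purely for shape/logic checking
eps = np.array([[1.5, 0.8],
                [0.7, 1.2]])
delta_od = np.array([[0.02], [0.01]])
conc = mbll(delta_od, d=3.0, dpf=np.array([6.0, 6.0]), eps=eps)
```

A quick sanity check is the round trip: multiplying the recovered concentrations back through the extinction matrix must reproduce the path-length-normalized optical densities.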
fNIRS data noise derives primarily from internal physiological noise, including heartbeat (1-1.5 Hz) and respiration (about 0.4 Hz), while the activity and metabolism of endothelial cells of the cerebral cortex are mainly concentrated between 0.005 and 0.21 Hz. In the dataset we use, the fNIRS signal is downsampled to 10 Hz. To reject physiological noise and motion disturbances, we use a third-order Butterworth filter with the passband set to 0.01-0.1 Hz. The signal is then adjusted to a specific level using baseline correction, with the period from 5 seconds to 2 seconds before the start of each trial serving as the baseline parameter. Finally, we segment the data into 0 to 15 seconds from the start of each trial. It should be noted that since the cerebral hemodynamic response exhibits a certain delay profile, these segmented data contain the delayed response of the task-related part (10 to 15 seconds after the trial start). To facilitate subsequent processing and analysis, we save the processed data in xls format.
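The filtering step just described can be sketched as below, assuming SciPy is available; the synthetic signal (an in-band 0.05 Hz component plus a 1.2 Hz "cardiac" component) merely stands in for real fNIRS data, and the cutoffs follow the text: a third-order Butterworth with a 0.01-0.1 Hz passband at fs = 10 Hz.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_fnirs(sig, fs=10.0, low=0.01, high=0.1, order=3):
    """Zero-phase third-order Butterworth bandpass, as described in the text."""
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    return filtfilt(b, a, sig)   # filtfilt avoids phase distortion

# Synthetic check: keep the slow in-band wave, suppress the fast component
t = np.arange(0.0, 600.0, 0.1)                 # 600 s sampled at 10 Hz
inband = np.sin(2 * np.pi * 0.05 * t)          # inside the 0.01-0.1 Hz band
cardiac = 0.5 * np.sin(2 * np.pi * 1.2 * t)    # heartbeat-like interference
filtered = bandpass_fnirs(inband + cardiac)
```

Away from the edges, the filtered trace should track the in-band component almost exactly while the 1.2 Hz interference is strongly attenuated.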
S104, image generation
In our study we systematically performed a series of data processing steps to accommodate our further analysis. First, we integrated the data of the same subject over 30 experiments, each comprising the 10-second experiment span plus 5 seconds after the experiment, forming a time matrix sequence of length 4500 (30 × 15 × 10) and width 72 (36 × 2). Next, we segmented the fNIRS data using a sliding time window. We set the window size to 224 and the step size to 20, dividing the single-channel data of one subject into 213 time-series segments. After dividing the fNIRS signal into small segments with overlapping time windows, constructing appropriate network inputs becomes critical. CNN-based methods generally require an M×N×1 grayscale image or an M×N×3 RGB image as input. For this purpose we reshape the fNIRS signal into image form.
Image generation is the basis of our model. The length of the fNIRS signal may vary with the acquisition equipment and experimental paradigm. We compress the fNIRS signal to a predetermined length using piecewise aggregate approximation (PAA) while preserving the signal trend. Let X be a time-series segment of near-infrared acquisition data, X = {x₁, x₂, …, x_n}, where n equals 224.
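A minimal sketch of the PAA compression step (simplified to assume the segment length divides evenly into the target length, which the text's fixed sizes would not guarantee in general):

```python
import numpy as np

def paa(x, out_len):
    """Piecewise Aggregate Approximation: average equal-size frames so the
    series is compressed to out_len points while keeping its trend."""
    x = np.asarray(x, dtype=float)
    assert len(x) % out_len == 0, "sketch assumes an even split"
    return x.reshape(out_len, -1).mean(axis=1)

compressed = paa(np.arange(8), 4)   # frame means of [0,1], [2,3], [4,5], [6,7]
```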
First, we adjust the sequence to the interval [-1, 1]:

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \tag{2}$$

Second, we convert the scaled time series to polar coordinates:

$$\phi = \arccos(\tilde{x}_i), \qquad r = \frac{t_i}{N} \tag{3}$$

where x̃ denotes the processed near-infrared signal, φ and r are the corresponding angle and radius respectively, and N is a constant that regularizes the span of the polar coordinate system.
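The rescaling to [-1, 1] and the polar-coordinate mapping described above can be sketched as follows (a NumPy toy on a three-point series, not real fNIRS data):

```python
import numpy as np

def to_polar(x):
    """Rescale a series to [-1, 1], then map it to polar coordinates:
    angle phi = arccos(x_tilde), radius r = t / N."""
    x = np.asarray(x, dtype=float)
    x_tilde = ((x - x.max()) + (x - x.min())) / (x.max() - x.min())
    phi = np.arccos(x_tilde)
    r = np.arange(1, len(x) + 1) / len(x)
    return x_tilde, phi, r

x_tilde, phi, r = to_polar([0.0, 1.0, 2.0])
```

On this toy input the rescaled series is [-1, 0, 1], so the angles run from π down to 0 as the values rise.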
Finally, the sequence is converted into a Gramian angular field. The Gramian Angular Field (GAF) is a method of encoding a time-series signal into an image representation, comprising the Gramian Angular Summation Field (GASF) and the Gramian Angular Difference Field (GADF). The Gramian angular field converts time-series data into image data while retaining both the complete information of the signal and its temporal dependence.
GASF and GADF are defined as follows:

$$GASF = [\cos(\phi_i + \phi_j)]_{i,j=1}^{S} \tag{4}$$
$$GADF = [\sin(\phi_i - \phi_j)]_{i,j=1}^{S} \tag{5}$$

In formulas (4) and (5), φ₁ represents the first element of the time series and φ_S the last element, i.e. the 224th element. From this we obtain a Gramian angular difference field with a side length of 224. The 3 Gramian angular difference fields generated from the 3 channels are stacked together along a third dimension, and the file is saved in picture format, yielding a 3-channel true-color image with a resolution of 224 × 224.
Specifically, a given time series X is first scaled to [-1, 1] and then represented in a polar coordinate system: the scaled data are mapped to angles in polar coordinates. Next, the GAF matrix is calculated, then visualized and stored. The generated GADFs of the 3 channels are finally superimposed along a third dimension to form a 3-channel true-color map, as shown in fig. 2.
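The GADF construction and the 3-channel stacking above can be sketched as below; the linearly spaced angle series are synthetic stand-ins for the three processed fNIRS channels.

```python
import numpy as np

def gadf(phi):
    """Gramian angular difference field: G[i, j] = sin(phi_i - phi_j)."""
    phi = np.asarray(phi, dtype=float)
    return np.sin(phi[:, None] - phi[None, :])

# Stack the GADFs of 3 channels along a third axis -> a 224 x 224 x 3 image
phis = [np.linspace(0.0, np.pi, 224) for _ in range(3)]
img = np.stack([gadf(p) for p in phis], axis=-1)
```

Note the GADF is antisymmetric with a zero diagonal, which is visible as the dark main diagonal in GADF images.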
S2, classifying left- and right-hand motor imagery: based on the Gramian angular field images, extracting important features using a simplified model designed with the Swin Transformer, with SCConv replacing standard convolution, and SC-Swin Transformer blocks, thereby realizing classification of left- and right-hand motor imagery, comprising the following steps:
the goal of the S201, swin transducer is to expand the applicability of the transducer so that it can serve as a universal backbone network in the computer vision field. This is similar to the application of NLP and CNN in the field of vision. An important component of the Swin transform design is the way it changes the window divider between the different self-attention-growing phases. The moving windows create connections between windows of the previous layer, thereby significantly increasing modeling capabilities. As shown in FIG. 3, the upper part shows a simplified model of the Swin transducer design.
S202, SCConv
Convolutional Neural Networks (CNNs) achieve remarkable performance in a variety of computer vision tasks, but at the cost of enormous computational resources, partly because the convolutional layers extract redundant features. SCConv (spatial and channel reconstruction convolution) attempts to exploit the spatial and channel redundancy between features to compress CNNs, reducing redundant computation and facilitating representative feature learning. SCConv consists of two units: a Spatial Reconstruction Unit (SRU) and a Channel Reconstruction Unit (CRU). The SRU separates and reconstructs redundant features according to their weights, thereby suppressing redundancy in the spatial dimension and enhancing the representativeness of the features. The CRU uses a split-transform-fuse strategy to reduce redundancy in the channel dimension as well as computation cost and storage. The SRU and CRU are combined sequentially, replacing the standard convolution. SCConv can greatly reduce the computational load and improve model performance on difficult tasks.
S203: Since the time dimension of the fNIRS data is markedly larger than the channel dimension, a Transformer model may neglect the channel characteristics of the fNIRS data. To solve this problem, we created a dedicated structure that combines a channel attention mechanism with the Swin Transformer. We introduce an efficient channel attention (ECA) module and SCConv, which exploit dependencies between channels so that the model can extract important features under various conditions. The SC-Swin module consists of LN layers, W-MSA, SW-MSA, SRU, CRU, and GELU layers, as shown in FIG. 3. W-MSA and SW-MSA are multi-head self-attention modules employing regular and shifted window configurations, respectively.
(1)ECA
ECA is an attention mechanism model for image processing, which improves the effectiveness of image feature representation and extracts more critical information by performing attention modulation over the image channels. Specifically, the ECA attention model consists of two parts: global average pooling and a linear transformation. Global average pooling aggregates the information of each channel in order to judge whether the information in that channel is critical; the linear transformation scales and shifts the channel information so that key information is better retained and non-key information is suppressed.
(2)SRU
The information content of the different feature maps is first evaluated with the scaling factors in the Group Normalization (GN) layer. Specifically, given an intermediate feature map X ∈ R^{N×C×H×W}, where N is the batch axis, C the channel axis, and H and W the spatial height and width, we first standardize the input feature X by subtracting the mean μ and dividing by the standard deviation σ:

$$X_{out} = GN(X) = \gamma \,\frac{X - \mu}{\sqrt{\sigma^2 + \varepsilon}} + \beta \tag{6}$$

where μ and σ are the mean and standard deviation of X, ε is a small positive number added for numerically stable division, and γ and β are trainable affine transformation parameters.
The variance of the spatial pixels for each batch and channel is measured in the GN layer using the trainable parameter γe RC. The richer spatial information reflects more spatial pixel variations and thus contributes more gamma. The normalized correlation weight wγe RC is obtained by equation 2, which indicates the importance of the different feature maps. The weight values of the feature map re-weighted by wy are then mapped to the range (0, 1) by a sigmoid function and gated by a threshold. The weights above the threshold are then set to 1 to obtain an information rich weight W1, while they are set to 0 to obtain a non-information rich weight W2 (the threshold is set to 0.5 in the experiment). The entire process of acquiring W can be represented by equation 2:
W = Gate(Sigmoid(w_γ(GN(X)))) (7)
Finally, we multiply the input feature X by W1 and W2, respectively, yielding two weighted features: the information-rich feature X1^w = W1 ⊗ X and the less informative feature X2^w = W2 ⊗ X. Thus we succeed in dividing the input feature into two parts: X1^w carries spatial content that is rich in information and strongly expressive, while X2^w carries almost no information and is considered redundant.
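By way of illustration only, the SRU pipeline above (GN scaling factors, normalized weights w_γ, sigmoid mapping, threshold gate, and the two masked features) can be sketched as follows; the helper name and the single-group normalization are assumptions of this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru(x, gamma, beta, threshold=0.5, eps=1e-5):
    """Spatial Reconstruction Unit sketch (hypothetical helper).

    x: feature map (N, C, H, W); gamma, beta: GN affine params, shape (C,).
    Returns the information-rich part W1*x and the redundant part W2*x.
    """
    # Group normalization with a single group, standing in for the GN layer.
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    sigma = x.std(axis=(1, 2, 3), keepdims=True)
    xn = gamma[None, :, None, None] * (x - mu) / (sigma + eps) \
         + beta[None, :, None, None]
    # Normalized channel weights w_gamma measure feature-map importance.
    w_gamma = gamma / gamma.sum()
    # Re-weight, squash to (0, 1), then hard-gate at the threshold.
    scores = sigmoid(w_gamma[None, :, None, None] * xn)
    w1 = (scores >= threshold).astype(x.dtype)   # informative mask
    w2 = 1.0 - w1                                # redundant mask
    return w1 * x, w2 * x

x = np.random.randn(2, 4, 3, 3)
gamma = np.array([0.1, 0.5, 0.9, 1.5])
beta = np.zeros(4)
x_info, x_red = sru(x, gamma, beta)
```

Because W1 and W2 are complementary binary masks, the two outputs recompose the input exactly: x_info + x_red equals x.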
(3)CRU
To exploit channel redundancy, a Channel Reconstruction Unit (CRU) is then introduced, which uses a "split-transform-fuse" strategy. Split: given a spatially refined feature X_ω ∈ R^(C×H×W), X_ω is first divided into two parts, one with αC channels and the other with (1−α)C channels, where 0 ≤ α ≤ 1. Subsequently, the channels of the feature maps are further compressed using 1×1 convolutions to increase computational efficiency. After the split and compression operations, the spatially refined feature X_ω is divided into an upper part Xup and a lower part Xlow. Transform: Xup is fed into the upper transform stage as a "rich feature extractor". Efficient convolution operations (i.e., GWC and PWC) replace the expensive standard k×k convolution to extract high-level representative information while reducing computation cost. Xlow is fed into the lower transform stage, where a low-cost 1×1 PWC operation is applied to generate feature maps with shallow hidden details, as a complement to the rich feature extractor. Fuse: after the transforms, the channel-refined feature is obtained by combining the upper and lower features along the channel dimension.
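By way of illustration only, the split-transform-fuse flow can be sketched in NumPy; the GWC/PWC branches are simplified here to plain 1×1 channel mixes with random weights, and all names and shapes are assumptions of the sketch:

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution as a channel-mixing matmul; x: (C_in, H, W), w: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def cru(x, alpha=0.5, rng=None):
    """Channel Reconstruction Unit sketch ("split-transform-fuse").

    x: spatially refined feature (C, H, W). The expensive GWC + PWC branch
    and the cheap PWC branch are both stood in for by simple 1x1 mixes.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    C, H, W = x.shape
    c_up = int(alpha * C)
    # Split: alpha*C "rich" channels vs (1-alpha)*C "cheap" channels.
    x_up, x_low = x[:c_up], x[c_up:]
    # Transform: upper branch stands in for GWC + PWC,
    # lower branch for the low-cost 1x1 PWC complement.
    y_up = conv1x1(x_up, rng.standard_normal((c_up, c_up)))
    y_low = conv1x1(x_low, rng.standard_normal((C - c_up, C - c_up)))
    # Fuse: concatenate along the channel axis.
    return np.concatenate([y_up, y_low], axis=0)

y = cru(np.random.rand(8, 4, 4))
print(y.shape)  # (8, 4, 4)
```

The parameter α controls how much of the channel budget goes to the expensive branch; α = 0.5 splits the channels evenly.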
Therefore, the invention adopts the motor imagery classification method based on multichannel fNIRS described above, provides a vision-based fNIRS framework that combines ScConv with the Swin Transformer, and introduces a channel attention mechanism, which helps obtain feature maps with channel attention so that the model can focus on channel information more effectively. The advantages of the Swin Transformer in the image classification field can be fully exploited, applying a state-of-the-art computer vision model to the near-infrared signal classification problem and demonstrating the potential of deep learning techniques for solving it.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the invention may be modified or equivalently substituted without departing from its spirit and scope.
Claims (10)
1. A motor imagery classification method based on multichannel fNIRS is characterized in that: the method comprises the following steps:
s1, generating an image of near infrared data, and generating the image of the near infrared data through data acquisition and data processing;
s2, classifying left and right hand motor imagery, namely based on the gram angle field image, extracting important features by using a SwinTransformer design simplified model and ScConv to replace standard convolution and SC-SwinTransformer blocks, and realizing classification of the left and right hand motor imagery.
2. The method for classifying motor imagery based on multichannel fNIRS according to claim 1, wherein: in step S1, the data set used in data acquisition consists of fNIRS signals, and light sources and detectors are used to record near-infrared spectroscopy data, generating physiological channels.
3. The method for classifying motor imagery based on multichannel fNIRS according to claim 2, wherein: in step S1, during data processing, the signals acquired by the near-infrared device are converted into concentration changes of oxyhemoglobin and deoxyhemoglobin; the optical density OD over time t is converted into the concentration changes of HbR and HbO, which absorb the near-infrared light, by means of the modified Beer-Lambert law.
4. A method of classifying motor imagery based on a multi-channel fNIRS according to claim 3, wherein: the conversion formula for converting the optical density OD over time t into the concentration changes of HbR and HbO is as follows:

[ΔHbO; ΔHbR] = (1 / (d · DPF)) · [ε_HbO(λ1), ε_HbR(λ1); ε_HbO(λ2), ε_HbR(λ2)]^(−1) · [ΔOD(λ1); ΔOD(λ2)]

where d is the distance between the light source and the detector, λ1 and λ2 are the two illumination wavelengths, DPF is the differential path-length factor, and ε denotes the extinction coefficients of HbR and HbO.
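By way of illustration only (and not as part of the claim), the two-wavelength modified Beer-Lambert inversion can be sketched in NumPy; the extinction-coefficient values below are hypothetical placeholders for wavelengths near 760 nm and 850 nm, not values from the claim:

```python
import numpy as np

def mbll(delta_od, ext, d, dpf):
    """Modified Beer-Lambert law sketch.

    delta_od: optical-density changes at two wavelengths, shape (2,).
    ext: 2x2 extinction matrix [[eps_HbO(l1), eps_HbR(l1)],
                               [eps_HbO(l2), eps_HbR(l2)]].
    d: source-detector distance; dpf: differential path-length factor.
    Returns (dHbO, dHbR), the concentration changes.
    """
    # Solve  delta_od = d * DPF * ext @ [dHbO, dHbR]  for the concentrations.
    return np.linalg.solve(ext * d * dpf, delta_od)

# Hypothetical extinction coefficients and geometry.
ext = np.array([[1486.0, 3843.0],
                [2526.0, 1798.0]])
dHb = mbll(np.array([0.01, 0.02]), ext, d=3.0, dpf=6.0)
```

Feeding the result back through the forward model reproduces the measured ΔOD, which is a quick sanity check on the inversion.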
5. The method for classifying motor imagery based on multichannel fNIRS of claim 4, wherein: in step S1, during image generation, the data from the experimental procedure are integrated to form a time-matrix sequence, the fNIRS data are segmented with a sliding-time-window method, the fNIRS signals are reshaped into image form, and PAA is used to compress each fNIRS signal to a preset length while preserving the signal trend;
the sequence is rescaled to the interval [−1, 1]:

φ_i = (2x_i − max(X) − min(X)) / (max(X) − min(X))
the scaled time series is converted to polar coordinates.
Wherein phi represents the processed near infrared signal, theta and r are the corresponding radian and radius respectively, and N is a constant for normalizing the span of the polar coordinate system.
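By way of illustration only, the PAA compression and polar-coordinate mapping can be sketched as follows; the sketch assumes the series length divides evenly into the number of PAA segments:

```python
import numpy as np

def paa(x, segments):
    """Piecewise Aggregate Approximation: compress x to `segments` block means.

    Assumes len(x) is a multiple of `segments`.
    """
    return x.reshape(segments, -1).mean(axis=1)

def to_polar(x):
    """Rescale to [-1, 1] and map to polar coordinates (theta, r)."""
    lo, hi = x.min(), x.max()
    phi = (2 * x - hi - lo) / (hi - lo)   # scaled series in [-1, 1]
    theta = np.arccos(phi)                # angles in [0, pi]
    r = np.arange(1, len(x) + 1) / len(x) # radius encodes the time stamp
    return theta, r

sig = np.arange(16.0)                      # toy fNIRS segment
theta, r = to_polar(paa(sig, 4))
```

Because arccos is monotone decreasing, the minimum of the scaled series maps to angle π and the maximum to angle 0, while the radius grows linearly with time.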
6. The method for classifying motor imagery based on multichannel fNIRS according to claim 5, wherein: in step S1, during image generation, the sequence is converted into a Gramian angular field, comprising the Gramian angular summation field GASF and the Gramian angular difference field GADF, which are defined as follows:

GASF = [cos(θ_i + θ_j)], i, j = 1, …, S (5)
GADF = [sin(θ_i − θ_j)], i, j = 1, …, S (6)

in formulas (5) and (6), φ_1 represents the first element of the time series and φ_S represents the last element, with θ_i = arccos(φ_i);
the 3 Gramian angular difference fields generated from the 3 channels are stacked together along a third dimension, and the file is saved in picture format to obtain a 3-channel true-color image with a resolution of 224×224.
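By way of illustration only, the GASF/GADF construction and the 3-channel stacking of claim 6 can be sketched as follows; the linspace inputs are stand-ins for real rescaled fNIRS series:

```python
import numpy as np

def gramian_fields(phi):
    """GASF/GADF from a series already scaled to [-1, 1] (sketch).

    GASF[i, j] = cos(theta_i + theta_j), GADF[i, j] = sin(theta_i - theta_j).
    """
    theta = np.arccos(np.clip(phi, -1.0, 1.0))
    gasf = np.cos(theta[:, None] + theta[None, :])
    gadf = np.sin(theta[:, None] - theta[None, :])
    return gasf, gadf

# Stack the GADFs of 3 fNIRS channels along a third dimension,
# yielding one 3-channel 224x224 image as in the claim.
channels = [np.linspace(-1, 1, 224) for _ in range(3)]
image = np.stack([gramian_fields(c)[1] for c in channels], axis=-1)
print(image.shape)  # (224, 224, 3)
```

Note the structural properties: GASF is symmetric and GADF is antisymmetric, so the two fields encode complementary temporal correlations.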
7. The method for classifying motor imagery based on multichannel fNIRS of claim 6, wherein: in step S2, SCConv is composed of a spatial reconstruction unit SRU and a channel reconstruction unit CRU; the SRU separates and reconstructs redundant features according to their weights, thereby suppressing redundancy in the spatial dimension and enhancing the representativeness of the features; the CRU uses a split-transform-fuse strategy to reduce redundancy in the channel dimension, computation cost, and storage space; the SRU and CRU are combined sequentially, replacing the standard convolution.
8. The method for classifying motor imagery based on multichannel fNIRS of claim 7, wherein: in step S2, in the SC-Swin Transformer blocks, efficient channel attention (ECA) and SCConv are introduced; the SC-Swin modules include LN layers, W-MSA, SW-MSA, SRU, CRU, and GELU layers, where W-MSA and SW-MSA are multi-headed self-attention modules employing regular and shifted window configurations, respectively.
9. The method for classifying motor imagery based on multichannel fNIRS according to claim 8, wherein: in step S2, the SRU evaluates the information content of the different feature maps using the scaling factors in the group-normalization GN layer; given an intermediate feature map X ∈ R^(N×C×H×W), where N is the batch axis, C is the channel axis, and H and W are the spatial height and width axes, the input feature X is first normalized by subtracting the mean μ and dividing by the standard deviation σ, as follows:

X_N = γ · (X − μ) / √(σ² + ε) + β
where μ and σ are the mean and standard deviation of X, ε is a small positive number added for stable division, and γ and β are trainable affine transformations.
measuring the variance of the spatial pixels of each batch and channel in the GN layer using the trainable parameter γ ∈ R^C;
mapping the weight values of the feature maps re-weighted by w_γ into the range (0, 1) by a sigmoid function and gating them by a threshold; the weights above the threshold are set to 1 to obtain the information-rich weight W1, and those below are set to 0 to obtain the non-information-rich weight W2; the entire process of obtaining W can be expressed by equation (7):
W = Gate(Sigmoid(w_γ(GN(X)))) (7)
multiplying the input feature X by W1 and W2, respectively, yields two weighted features: the information-rich feature X1^w = W1 ⊗ X and the less informative feature X2^w = W2 ⊗ X; the input feature is thus divided into two parts: X1^w carries spatial content that is rich in information and strongly expressive, while X2^w carries almost no information and is considered redundant.
10. The method for classifying motor imagery based on multichannel fNIRS according to claim 9, wherein: in step S2, the channel reconstruction unit CRU uses a "split-transform-fuse" strategy comprising the following steps:
s21, segmentation: refining feature X for a given space ω ∈R C×H×W First X is taken up ω Wherein one part is provided with alpha C channels, and the other part is provided with (1-alpha) C channels, wherein 0 is less than or equal to alpha is less than or equal to 1; subsequently, the channels of the feature map are compressed using a 1×1 convolution to refine the feature X spatially ω Divided into an upper portion Xup and a lower portion Xlow;
s22, conversion: xup is input to the up-conversion stage as a "rich feature extractor", advanced representative information is extracted using GWC and PWC instead of standard kxk convolution, xlow is input to the bottom conversion stage, and a feature map with shallow hidden details is generated as a complement to the rich feature extractor.
S23, fuse: after the transforms, the channel-refined feature is obtained by combining the upper and lower features along the channel dimension.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311781458.0A CN117789249A (en) | 2023-12-22 | 2023-12-22 | Motor imagery classification method based on multichannel fNIRS |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117789249A true CN117789249A (en) | 2024-03-29 |
Family
ID=90395698
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311781458.0A Pending CN117789249A (en) | 2023-12-22 | 2023-12-22 | Motor imagery classification method based on multichannel fNIRS |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117789249A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104375635A (en) * | 2014-08-14 | 2015-02-25 | 华中科技大学 | Quick near-infrared brain-computer interface method |
US20170202518A1 (en) * | 2016-01-14 | 2017-07-20 | Technion Research And Development Foundation Ltd. | System and method for brain state classification |
WO2023092813A1 (en) * | 2021-11-25 | 2023-06-01 | 苏州大学 | Swin-transformer image denoising method and system based on channel attention |
CN116842329A (en) * | 2023-07-10 | 2023-10-03 | 湖北大学 | Motor imagery task classification method and system based on electroencephalogram signals and deep learning |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104375635A (en) * | 2014-08-14 | 2015-02-25 | 华中科技大学 | Quick near-infrared brain-computer interface method |
US20170202518A1 (en) * | 2016-01-14 | 2017-07-20 | Technion Research And Development Foundation Ltd. | System and method for brain state classification |
WO2023092813A1 (en) * | 2021-11-25 | 2023-06-01 | 苏州大学 | Swin-transformer image denoising method and system based on channel attention |
CN116842329A (en) * | 2023-07-10 | 2023-10-03 | 湖北大学 | Motor imagery task classification method and system based on electroencephalogram signals and deep learning |
Non-Patent Citations (6)
Title |
---|
HAN WANG等: ""A Novel Algorithmic Structure of EEG Channel Attention Combined With Swin Transformer for Motor Patterns Classification"", 《IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING》, vol. 31, 21 July 2023 (2023-07-21), pages 3132 - 3141 * |
JIAFENG LI等: ""SCConv: Spatial and Channel Reconstruction Convolution for Feature"", 《2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》, 22 August 2023 (2023-08-22), pages 6153 - 6162 * |
ZE LIU等: ""Swin transformer Hierarchical vision transformer using shifted windows"", 《PROCEEDINGS OF THE IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION》, 31 December 2021 (2021-12-31), pages 10012 - 10022 * |
ZENGHUI WANG等: ""A general and scalable vision framework for functional near-infrared spectroscopy classification"", 《IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING》, vol. 30, 13 July 2022 (2022-07-13), pages 1982 - 1991, XP011914933, DOI: 10.1109/TNSRE.2022.3190431 * |
ZENGHUI WANG等: ""A general and scalable vision framework for functional near-infrared spectroscopy classification"", 《IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING》, vol. 30, 22 July 2022 (2022-07-22), pages 1982 - 1991, XP011914933, DOI: 10.1109/TNSRE.2022.3190431 * |
LI Yu; XIONG Xin; LI Zhaoyang; FU Yunfa: "Recognition of three imagined right-foot movements based on functional near-infrared spectroscopy", Journal of Biomedical Engineering, no. 02, 25 April 2020 (2020-04-25) * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||