CN116484290A - Depression recognition model construction method based on Stacking integration - Google Patents


Info

Publication number
CN116484290A
CN116484290A (application CN202310420813.5A)
Authority
CN
China
Prior art keywords
data
electroencephalogram
model
knn
depression
Prior art date
Legal status
Pending
Application number
CN202310420813.5A
Other languages
Chinese (zh)
Inventor
许进
余紫微
陈耀
Current Assignee
Wuhan Textile University
Original Assignee
Wuhan Textile University
Priority date
Filing date
Publication date
Application filed by Wuhan Textile University
Priority to CN202310420813.5A
Publication of CN116484290A
Legal status: Pending

Classifications

    • A61B5/165 Evaluating the state of mind, e.g. depression, anxiety
    • A61B5/369 Electroencephalography [EEG]
    • A61B5/372 Analysis of electroencephalograms
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, involving training the classification device
    • A61B5/7282 Event detection, e.g. detecting unique waveforms indicative of a medical condition
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2411 Classification based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G16H20/70 ICT specially adapted for mental therapies, e.g. psychological therapy or autogenous training
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The invention discloses a method for constructing a depression recognition model based on Stacking integration, comprising the following steps. Step 1: acquire electroencephalogram (EEG) signals from a number of subjects and divide the data set into a training set and a test set. Step 2: preprocess the EEG signals, chiefly by filtering, artifact removal, and normalization, to improve the signal-to-noise ratio and obtain usable data. Step 3: extract features from the EEG data of step 2 using the common spatial pattern (CSP) algorithm to obtain the corresponding feature vectors. Step 4: according to the Stacking integration strategy, construct the Stacking-based depression recognition model using the feature vectors of step 3 as input. The recognition model combines the strengths of several base learners, improving the overall classification performance; because the model uses cross-validation, overfitting is reduced and the robustness of the model is further enhanced.

Description

Depression recognition model construction method based on Stacking integration
Technical Field
The invention relates to a depression recognition model construction method based on a Stacking integrated algorithm, and belongs to the technical field of digital signal processing.
Background
Depression is a major contributor to the global burden of disease, affecting more than 300 million people worldwide. Roughly one in five people experiences an episode of depression during their lifetime, making it one of the leading causes of disability globally. Disease burden is a complex concept whose scope covers patients, caregivers, health systems, society, and the economy. Beyond the personal cost to patients and their families, the economic impact is enormous: in Europe alone the annual cost reaches about 92 billion euros, most of it due to lost productivity.
At present, the monitoring and diagnosis of depression rely mainly on questionnaires, with the PHQ-9 scale typically selected for testing. A doctor or psychological counselor evaluates the patient on the basis of this scale together with his or her own subjective experience, a process that is highly subjective. On the one hand, a subject filling out the questionnaire for the first time may misunderstand its wording and answer incorrectly, while a subject who has filled it out many times may see through the screening intent of the questions and give false answers, making the results inaccurate. On the other hand, doctors or counselors rely heavily on subjective experience when interpreting scale results and interviews, so symptoms may not be effectively identified, and there is even a risk of misdiagnosis.
Objective measurement of depression has long been a focus of research. Recently, "deep learning", "machine learning", and "artificial intelligence" have attracted great attention worldwide, and methods based on electroencephalogram (EEG) signals have received growing interest. EEG-based, deep-learning-assisted diagnosis promises to improve on traditional methods that lack a physiological basis, and to be more effective in monitoring and diagnosing depression.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a method for constructing a depression recognition model based on Stacking integration. The Stacking algorithm automatically combines the strengths of different models, substantially improving generalization and performance, and the resulting model can effectively distinguish the EEG characteristics of depressed patients from those of a normal control group.
The invention is realized in particular as follows:
a depression recognition model construction method based on Stacking integration comprises the following steps:
step 1: separately acquiring EEG signals from a number of subjects and defining them as a data set, the subjects comprising a normal control group and depressed patients; dividing the data set into a training set D = {X_1, X_2, X_3, ..., X_m} and a test set T = {X_{m+1}, X_{m+2}, X_{m+3}, ..., X_{m+n}}, where m is the number of training samples and n is the number of test samples;
step 2: preprocessing the EEG signals, chiefly filtering, artifact removal, and normalization, so as to improve the signal-to-noise ratio and obtain EEG data usable in later steps;
step 3: after preprocessing, extracting features from the EEG data of step 2 using the common spatial pattern (CSP) algorithm to obtain the corresponding feature vectors; CSP is a feature-extraction algorithm for two-class tasks that uses matrix diagonalization to maximize the difference between the variances of the two signal classes, yielding highly discriminative feature vectors that serve as classifier input for classifying the EEG signals;
step 4: according to a Stacking integration strategy, constructing a depression recognition model based on Stacking integration by using the feature vector in the step 3 as input;
selecting KNN, SVM, and LDA learners as the first-layer primary classifiers and a logistic regression classifier as the second-layer secondary classifier; using the feature vectors as classifier input to classify the EEG signals; and performing 5-fold cross-validation with the training and test sets of step 1 to obtain the depression recognition model.
Further, in step 1, an electroencephalogram acquisition device is used to acquire the EEG of the subject's forehead in a resting state.
Further, step 2 specifically comprises:
(1) Filtering the EEG data with a 4-45 Hz band-pass filter;
(2) Removing electrooculogram, electrocardiogram, electromyogram, and similar artifacts using independent component analysis;
(3) Normalizing the data to remove the dimensions and units of data of different scales. For LDA, normalization would bring the mapped one-dimensional data too close together and hurt accuracy, so Min-Max normalization is applied only to the data fed to the KNN and SVM classifiers, as follows:

x_j' = (x_j - x_min) / (x_max - x_min)

where x_j is the j-th element of sample x, x_max is the maximum value in the sample data, and x_min is the minimum value in the sample data.
Further, step 4 specifically comprises:
(1) The training set D = {X_1, X_2, X_3, ..., X_m} is randomly divided into 5 equally sized subsets {D_1, D_2, D_3, D_4, D_5}; {D_1, D_2, D_3, D_4} serve as the fitting set and {D_5} as the validation set; each subset contains m/5 samples.
(2) The KNN classifier selects a weighted KNN algorithm, using the Euclidean distance as the distance between samples and assigning weights according to distance, giving the KNN model:

f(x) = argmax_{v in V} sum_{i=1}^{k} w_i * δ(v, f(x_i)), with w_i = 1 / d(x, x_i)^2

where x_i is the i-th nearest sample, k is the number of neighbors, f(x_i) is the class label of sample x_i, the label set V = {0, 1} corresponds to the normal control group and depressed patients respectively, δ(a, b) equals 1 when a = b and 0 otherwise, and the weight w_i is the inverse of the squared Euclidean distance.

The SVM classifier selects the Gaussian radial basis function kernel (RBF kernel):

K(X_p, X_q) = exp(-||X_p - X_q||^2 / (2σ^2))

where X_p and X_q are the feature vectors of samples p and q, and σ is the width parameter of the function, controlling its radial range of action.

The LDA classifier selects the following optimization objective to obtain the LDA model:

J(w) = ||w^T μ_0 - w^T μ_1||^2 / (w^T Σ_0 w + w^T Σ_1 w)

where w^T μ_0 and w^T μ_1 are the projections onto the projection line w of the centers of the two sample classes, namely the EEG signals of the normal control group and of the depressed patients; w^T Σ_0 w and w^T Σ_1 w are the projected covariances of the two classes; μ_k (k = 0, 1) is the mean vector of class k; and Σ_k (k = 0, 1) is the covariance matrix of class k.
(3) The fitting set {D_1, D_2, D_3, D_4} is used to train the KNN model of step (2), yielding a new KNN model, which then predicts the validation set {D_5}, producing m/5 predictions;
(4) At the same time, the KNN model trained in step (3) predicts the test set T = {X_{m+1}, X_{m+2}, ..., X_{m+n}}, producing n predictions;
(5) After 5 rounds of cross-validation, predictions for all m validation samples and 5 sets of n test-set predictions are obtained; the validation-set results are concatenated into a column vector of length m, denoted A_1, and the test-set results are averaged into a column vector of length n, denoted B_1;
(6) The primary learners SVM and LDA are fitted and trained in the same way as steps (3), (4), and (5), finally yielding A_2, B_2, A_3, B_3, for six matrices in total: A_1, B_1, A_2, B_2, A_3, B_3;
(7) A_1, A_2, A_3 are placed side by side to form an m-row, 3-column matrix, the new training set {A_1, A_2, A_3}; B_1, B_2, B_3 are placed side by side to form an n-row, 3-column matrix, the new test set {B_1, B_2, B_3};
(8) The new training set {A_1, A_2, A_3} is used to train the LR model; the trained LR model then takes the new test set {B_1, B_2, B_3} as input and predicts the classification result. The depression recognition model is thus constructed.
Compared with the prior art, the invention has at least the following outstanding advantages:
1. Compared with other depression recognition models, the Stacking-integration-based model provided by the invention combines the strengths of several base learners, improving the overall classification performance; because the model uses cross-validation, overfitting is reduced and the robustness of the model is further enhanced.
2. The invention exploits the differences between the EEG signals of healthy people and depressed patients: features are extracted from the EEG of both groups and fed into the depression recognition model, so the two groups can be effectively distinguished and the accuracy of depression recognition is greatly improved.
Drawings
Fig. 1 is a schematic flow chart of a method for constructing a depression recognition model based on Stacking integration according to an embodiment of the present invention.
Fig. 2 is a graph of a Stacking ensemble learning model in an example of the present invention.
Detailed Description
The technical solutions of the embodiments of the invention are described below clearly and completely with reference to the drawings. The described embodiment is only one embodiment of the invention, not all of them; all other embodiments obtained by a person of ordinary skill in the art without inventive effort fall within the scope of this disclosure.
Referring to fig. 1, a method for constructing a depression recognition model based on Stacking integration includes the following steps: and acquiring an electroencephalogram signal, preprocessing electroencephalogram data, extracting characteristics of the electroencephalogram signal, and finally establishing a depression identification model based on Stacking integration. The method comprises the following steps:
s1, respectively acquiring brain electrical signals of a plurality of subjects, wherein the subjects comprise a normal control group and a patient suffering from depression; dividing the data set into training sets D { X ] 1 ,X 2 ,X 3 ...X m Sum test set T { X } m+1 ,X m+2 ,X m+3 ...X m+n M represents trainingNumber of set samples, n represents the number of test set samples.
S2, the EEG signals are preprocessed, chiefly by filtering, artifact removal, and normalization, to improve the signal-to-noise ratio and obtain EEG data usable in later steps.
S3, after preprocessing, features are extracted from the EEG data of S2 using the common spatial pattern (CSP) algorithm, yielding the corresponding feature vectors. CSP is a feature-extraction algorithm for two-class tasks that uses matrix diagonalization to maximize the difference between the variances of the two signal classes, yielding highly discriminative feature vectors that serve as classifier input for classifying the EEG signals.
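As an illustrative sketch (not code from the patent), CSP for two classes can be implemented with a generalized eigendecomposition of the class-average covariance matrices; the array shapes, the `n_pairs` parameter, and the log-variance feature choice below are assumptions:

```python
import numpy as np
from scipy.linalg import eigh

def csp_features(trials_a, trials_b, n_pairs=2):
    """Toy CSP. trials_* are (n_trials, n_channels, n_samples) arrays,
    one per class; returns log-variance features for each trial."""
    def mean_cov(trials):
        return np.mean([np.cov(t) for t in trials], axis=0)
    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    # Generalized eigenproblem Ca w = lambda (Ca + Cb) w: the eigenvectors
    # jointly diagonalize both class covariances.
    vals, vecs = eigh(Ca, Ca + Cb)
    # Keep the filters with the smallest/largest eigenvalues, i.e. those
    # maximizing the variance of one class relative to the other.
    order = np.argsort(vals)
    picks = np.concatenate([order[:n_pairs], order[-n_pairs:]])
    W = vecs[:, picks].T                     # spatial filters
    def transform(trials):
        feats = []
        for t in trials:
            z = W @ t                        # spatially filtered signals
            var = z.var(axis=1)
            feats.append(np.log(var / var.sum()))
        return np.array(feats)
    return transform(trials_a), transform(trials_b)
```

In practice a maintained implementation such as `mne.decoding.CSP` would normally be preferred over this sketch.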
S4, according to the Stacking integration strategy, a depression recognition model based on Stacking integration is constructed using the feature vectors of S3 as input.
In step S1, an electroencephalogram signal of the forehead of the subject in a resting state is acquired from an electroencephalogram sensor by using an electroencephalogram acquisition apparatus.
The specific method of the step S2 is as follows:
S21, the EEG data are filtered with a 4-45 Hz band-pass filter;
S22, electrooculogram, electrocardiogram, electromyogram, and similar artifacts are removed from the EEG data using independent component analysis;
S23, the EEG data are normalized to remove the dimensions and units of data of different scales. For LDA, normalization would bring the mapped one-dimensional data too close together and affect accuracy, so the KNN and SVM classifiers process the data with Min-Max normalization, as follows:

x_j' = (x_j - x_min) / (x_max - x_min)

where x_j is the j-th element of sample x, x_max is the maximum value in the sample data, and x_min is the minimum value in the sample data.
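A minimal sketch of the filtering (S21) and Min-Max normalization (S23) steps, assuming a sampling rate of 250 Hz and per-channel scaling; the ICA artifact removal of S22 is omitted here as it needs a dedicated library:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess(eeg, fs=250.0):
    """Band-pass 4-45 Hz, then Min-Max scale each channel to [0, 1].
    eeg: (n_channels, n_samples); fs is an assumed sampling rate."""
    b, a = butter(4, [4.0, 45.0], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, eeg, axis=1)   # zero-phase filtering
    x_min = filtered.min(axis=1, keepdims=True)
    x_max = filtered.max(axis=1, keepdims=True)
    return (filtered - x_min) / (x_max - x_min)
```
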
Referring to fig. 2, in step S4, according to the Stacking integration strategy, the specific method for constructing the Stacking integration-based depression recognition model by using the feature vector in S3 as an input is as follows:
in the invention, KNN, SVM and LDA learners are selected as primary classifiers of a first layer, and a logistic regression classifier is selected as a secondary classifier of a second layer. The feature vector is used as the input of the classifier to classify the electroencephalogram signals.
And 5-fold cross validation is carried out by using the training set and the testing set in the step S1, so as to obtain a depression recognition model. The method comprises the following steps:
s41, training set D { X } 1 ,X 2 ,X 3 ...X m 5 subsets { D } of equal size are randomly divided 1 ,D 2 ,D 3 ,D 4 ,D 5 (D) 1 ,D 2 ,D 3 ,D 4 As a feeding set, { D 5 Will be the validation set. Each subset has m/5 pieces of data.
S42, the KNN classifier selects a weighted KNN algorithm, using the Euclidean distance as the distance between samples and assigning weights according to distance:

f(x) = argmax_{v in V} sum_{i=1}^{k} w_i * δ(v, f(x_i)), with w_i = 1 / d(x, x_i)^2

where x_i is the i-th nearest sample, k is the number of neighbors, f(x_i) is the class label of sample x_i, the label set V = {0, 1} corresponds to the normal control group and depressed patients respectively, δ(a, b) equals 1 when a = b and 0 otherwise, and the weight w_i is the inverse of the squared Euclidean distance.
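The weighted vote above can be sketched in a few lines of NumPy; the tie-breaking guard for exact matches is an assumption (note that scikit-learn's `KNeighborsClassifier(weights="distance")` uses 1/d rather than the patent's 1/d^2):

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=5):
    """Distance-weighted KNN vote with weights w_i = 1/d^2."""
    d2 = np.sum((X_train - x) ** 2, axis=1)     # squared Euclidean distances
    idx = np.argsort(d2)[:k]                    # indices of k nearest samples
    w = 1.0 / np.maximum(d2[idx], 1e-12)        # guard against exact matches
    votes = np.zeros(2)                         # labels v in V = {0, 1}
    for i, wi in zip(idx, w):
        votes[y_train[i]] += wi
    return int(np.argmax(votes))                # 0 = control, 1 = depressed
```
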
The SVM classifier selects the Gaussian radial basis function kernel (RBF kernel):

K(X_p, X_q) = exp(-||X_p - X_q||^2 / (2σ^2))

where X_p and X_q are the feature vectors of samples p and q, and σ is the width parameter of the function, controlling its radial range of action.
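For reference, the kernel itself is one line; when using scikit-learn's `SVC(kernel="rbf", gamma=g)`, the correspondence is g = 1/(2σ^2):

```python
import numpy as np

def rbf_kernel(xp, xq, sigma=1.0):
    """Gaussian RBF kernel K(Xp, Xq) = exp(-||Xp - Xq||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((xp - xq) ** 2) / (2.0 * sigma ** 2))
```
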
the optimization targets selected by the LDA classifier are as follows:
wherein,,and->Projection of the centers of two types of samples, namely an electroencephalogram signal of a normal control group and an electroencephalogram signal of a depressive patient, on a straight line respectively, and (I)>And->For the covariance of these two types of samples, +.>Is projected straight line, mu k (k=0, 1) is the mean vector of the kth class of samples, Σ k (k=0, 1) is the covariance matrix of the kth sample.
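The objective J(w) has the well-known closed-form maximizer w ∝ (Σ_0 + Σ_1)^{-1}(μ_0 - μ_1); a small sketch (sample sizes and shapes are illustrative assumptions):

```python
import numpy as np

def fisher_lda_direction(X0, X1):
    """Fisher projection direction maximizing between-class separation
    over within-class scatter: w = (Sigma0 + Sigma1)^-1 (mu0 - mu1)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    S0 = np.cov(X0, rowvar=False)               # class-0 covariance
    S1 = np.cov(X1, rowvar=False)               # class-1 covariance
    w = np.linalg.solve(S0 + S1, mu0 - mu1)     # closed-form maximizer of J(w)
    return w / np.linalg.norm(w)                # unit-norm direction
```
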
S43, the fitting set {D_1, D_2, D_3, D_4} is used to train the KNN model, yielding a new KNN model, which then predicts the validation set {D_5}, producing m/5 predictions.
S44, at the same time, the trained KNN model predicts the test set T = {X_{m+1}, X_{m+2}, X_{m+3}, ..., X_{m+n}}, producing n predictions.
S45, after 5 rounds of cross-validation, predictions for all m validation samples and 5 sets of n test-set predictions are obtained; the validation-set results are concatenated into a column vector of length m, denoted A_1, and the test-set results are averaged into a column vector of length n, denoted B_1.
S46, the primary learners SVM and LDA are fitted and trained in the same way as steps S43, S44, and S45, finally yielding A_2, B_2, A_3, B_3, for six matrices in total: A_1, B_1, A_2, B_2, A_3, B_3.
S47, A_1, A_2, A_3 are placed side by side to form an m-row, 3-column matrix, the new training set {A_1, A_2, A_3}; B_1, B_2, B_3 are placed side by side to form an n-row, 3-column matrix, the new test set {B_1, B_2, B_3}.
S48, the new training set {A_1, A_2, A_3} is used to train the LR model; the trained LR model then takes the new test set {B_1, B_2, B_3} as input and predicts the classification result, where 0 denotes the normal control group and 1 denotes a depressed patient.
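As a minimal, self-contained sketch of the procedure in S41-S48 (not the patent's own code), the base learners and the LR meta-learner can be wired up with scikit-learn; the estimator settings and data shapes are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

def stacking_predict(X_train, y_train, X_test, n_folds=5, seed=0):
    """Manual Stacking: out-of-fold predictions of each base learner form
    the meta training set A (m x 3); per-fold test-set predictions are
    averaged into B (n x 3); logistic regression is the meta-learner."""
    bases = [KNeighborsClassifier(weights="distance"),   # KNN
             SVC(kernel="rbf"),                          # SVM with RBF kernel
             LinearDiscriminantAnalysis()]               # LDA
    m, n = len(X_train), len(X_test)
    A = np.zeros((m, len(bases)))
    B = np.zeros((n, len(bases)))
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for j, base in enumerate(bases):
        for tr, va in kf.split(X_train):
            base.fit(X_train[tr], y_train[tr])
            A[va, j] = base.predict(X_train[va])         # validation-fold preds
            B[:, j] += base.predict(X_test) / n_folds    # averaged test preds
    meta = LogisticRegression().fit(A, y_train)
    return meta.predict(B)                               # 0 = control, 1 = depressed
```

Here A collects each base learner's predictions on its held-out fold (m rows, 3 columns) and B the fold-averaged test-set predictions (n rows, 3 columns), matching steps S45-S47.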
In use, an EEG signal is collected from a subject, preprocessed, and features are extracted; the resulting feature vector is fed into the depression recognition model to obtain a classification result, and if the output is 1 the subject is identified as a depressed patient.
Although the invention has been described with reference to the illustrative embodiments above, these are merely preferred embodiments and the invention is not limited to them; numerous other modifications and embodiments devised by those skilled in the art fall within the scope and spirit of the principles of this disclosure.

Claims (5)

1. A depression recognition model construction method based on Stacking integration is characterized by comprising the following steps:
step 1: separately acquiring EEG signals from a number of subjects and defining them as a data set, the subjects comprising a normal control group and depressed patients; dividing the data set into a training set D = {X_1, X_2, X_3, ..., X_m} and a test set T = {X_{m+1}, X_{m+2}, X_{m+3}, ..., X_{m+n}}, where m is the number of training samples and n is the number of test samples;
step 2: preprocessing the EEG signals, chiefly filtering, artifact removal, and normalization, so as to improve the signal-to-noise ratio and obtain EEG data usable in later steps;
step 3: extracting features from the EEG data of step 2 using the common spatial pattern (CSP) algorithm to obtain the corresponding feature vectors;
step 4: according to a Stacking integration strategy, constructing a depression recognition model based on Stacking integration by using the feature vector in the step 3 as input;
selecting KNN, SVM, and LDA learners as the first-layer primary classifiers and a logistic regression classifier as the second-layer secondary classifier; using the feature vectors as classifier input to classify the EEG signals; and performing 5-fold cross-validation with the training and test sets of step 1 to obtain the depression recognition model.
2. The method for constructing a depression recognition model based on Stacking integration according to claim 1, wherein:
in step 1, acquiring an electroencephalogram signal means acquiring, with an electroencephalogram acquisition device, the EEG of the subject's forehead in a resting state.
3. The method for constructing a depression recognition model based on Stacking integration according to claim 1, wherein:
the step 2 specifically comprises the following steps:
(1) Filtering the electroencephalogram data by adopting 4-45 Hz band-pass filtering;
(2) Removing artifacts such as electrooculogram, electrocardiograph and myoelectricity by utilizing independent component analysis;
(3) Normalization is used to process the data, removing the dimensions and units of data of different scales; for LDA, normalization would bring the mapped one-dimensional data too close together and affect accuracy, so Min-Max normalization is applied to the data for the KNN and SVM classifiers, as follows:

x_j' = (x_j - x_min) / (x_max - x_min)

where x_j is the j-th element of sample x, x_max is the maximum value in the sample data, and x_min is the minimum value in the sample data.
4. The method for constructing a depression recognition model based on Stacking integration according to claim 1, wherein:
in step 4, specifically, the method includes:
(1) The training set D = {X_1, X_2, X_3, ..., X_m} is randomly divided into 5 equally sized subsets {D_1, D_2, D_3, D_4, D_5}; {D_1, D_2, D_3, D_4} serve as the fitting set and {D_5} as the validation set; each subset contains m/5 samples;
(2) The KNN classifier selects a weighted KNN algorithm, the Euclidean distance is used as the distance between samples, and weights are given according to the distance, so that a KNN model is obtained; the SVM classifier selects a Gaussian radial basis function kernel; the LDA classifier selects an optimization target to obtain an LDA model;
(3) Using { D ] 1 ,D 2 ,D 3 ,D 4 Training the KNN model in the step (2) by the feeding set to obtain a new KNN model, and then using the KNN model to verify the set { D ] 5 Predicting to obtain m/5 pieces of data;
(4) Simultaneously, the trained KNN model in the step (3) is utilized to test the test set T { X } 1 ,X 2 ,X 3 ...X n Predicting to obtain n pieces of data;
(5) After 5 times of cross checking, a verification set of m pieces of data and a prediction result of a test set of 5n pieces of data can be obtained, and then the result of the verification set is spliced into a matrix with m rows of length and is marked as A 1 The result of the test set is weighted and averaged to obtain a matrix with n rows of length, which is marked as B 1
(6) Train the other primary learners, the SVM and LDA models, in the same way as steps (3), (4) and (5), finally obtaining A_2, B_2, A_3 and B_3, for a total of 6 matrices: A_1, B_1, A_2, B_2, A_3, B_3;
(7) Place A_1, A_2, A_3 side by side to form a matrix of m rows and 3 columns, i.e. the new training set {A_1, A_2, A_3}; place B_1, B_2, B_3 side by side to form a matrix of n rows and 3 columns, i.e. the new test set {B_1, B_2, B_3};
(8) Train the LR model with the new training set {A_1, A_2, A_3} to obtain a trained LR model, then feed the new test set {B_1, B_2, B_3} into it as input for prediction to obtain the classification result. The depression recognition model is thus constructed.
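Steps (1) to (8) correspond closely to scikit-learn's StackingClassifier. A hedged sketch follows, assuming scikit-learn is available; note that StackingClassifier refits each base learner on the full training set rather than averaging five test-set predictions as in steps (4) and (5), so it approximates rather than reproduces the claimed procedure:

```python
from sklearn.ensemble import StackingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

# Primary learners mirroring the claim: distance-weighted KNN,
# SVM with Gaussian RBF kernel, and LDA.
base_learners = [
    ("knn", KNeighborsClassifier(weights="distance")),
    ("svm", SVC(kernel="rbf", probability=True)),
    ("lda", LinearDiscriminantAnalysis()),
]

# cv=5 reproduces the 5-fold splitting of steps (1)-(5); for a binary
# problem the stacked meta-features form an m-row, 3-column matrix
# (one probability column per base learner), matching {A1, A2, A3}.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),  # the LR meta-learner of step (8)
    cv=5,
)
```

Fitting `stack` on labelled feature vectors and calling `stack.predict` on the test set then yields the final classification of step (8).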
5. The method for constructing a depression recognition model based on Stacking integration according to claim 4, wherein:
in step (2), the KNN classifier uses a weighted KNN algorithm with Euclidean distance as the distance between samples; the distance-based weighting is implemented as follows:

$$f(x) = \arg\max_{v \in V} \sum_{i=1}^{k} w_i \cdot I\big(v = f(x_i)\big), \qquad w_i = \frac{1}{d(x, x_i)^2}$$

where $x_i$ denotes the i-th sample, k is the number of samples, $f(x_i)$ is the class label of sample $x_i$, whose value range V = {0, 1} corresponds to the normal control group and the depression patients respectively; the weight $w_i$ is the reciprocal of the squared Euclidean distance;
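A pure-Python sketch of the distance-weighted vote (the function name and the handling of an exact distance-zero match are assumptions of this sketch, not specified in the patent):

```python
from collections import defaultdict

def weighted_knn_predict(train, query, k=3):
    """Distance-weighted KNN vote: weight w_i = 1 / d(x, x_i)^2.

    `train` is a list of (feature_vector, label) pairs with labels in
    V = {0, 1} (0 = normal control, 1 = depression), as in the claim.
    """
    # Squared Euclidean distance from the query to every training sample.
    dists = []
    for x, label in train:
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        dists.append((d2, label))
    dists.sort(key=lambda t: t[0])
    # Accumulate inverse-squared-distance weights over the k nearest samples.
    votes = defaultdict(float)
    for d2, label in dists[:k]:
        votes[label] += float("inf") if d2 == 0 else 1.0 / d2
    return max(votes, key=votes.get)
```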
the SVM classifier uses a Gaussian radial basis function kernel as follows:

$$K(X_p, X_q) = \exp\left(-\frac{\lVert X_p - X_q \rVert^2}{2\sigma^2}\right)$$

where $X_p$ and $X_q$ are the feature vectors of sample p and sample q respectively, and $\sigma$ is the width parameter of the function, controlling its radial range of action;
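The kernel can be written directly (a sketch; the default value of sigma is an assumption):

```python
import math

def rbf_kernel(xp, xq, sigma=1.0):
    """Gaussian RBF kernel: K(Xp, Xq) = exp(-||Xp - Xq||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xp, xq))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

A smaller sigma narrows the kernel's radial range of action, so only very close samples receive a significant similarity score.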
the optimization target selected by the LDA classifier is as follows:

$$\max_{w}\; J(w) = \frac{\lVert w^{T}\mu_0 - w^{T}\mu_1 \rVert^2}{w^{T}\Sigma_0 w + w^{T}\Sigma_1 w}$$

where $w^{T}\mu_0$ and $w^{T}\mu_1$ are the projections onto the line of the centers of the two classes of samples, namely the electroencephalogram signals of the normal control group and of the depression patients, $w^{T}\Sigma_0 w$ and $w^{T}\Sigma_1 w$ are the covariances of the two classes after projection, w is the projection line, $\mu_k$ (k = 0, 1) is the mean vector of the k-th class of samples, and $\Sigma_k$ (k = 0, 1) is the covariance matrix of the k-th class of samples.
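The patent states only the Fisher criterion; its standard closed-form maximiser, w proportional to (Σ0 + Σ1)^(-1)(μ0 − μ1), can be sketched as follows (assuming NumPy; the function name is illustrative):

```python
import numpy as np

def fisher_direction(X0, X1):
    """Closed-form LDA direction w ∝ (Σ0 + Σ1)^{-1}(μ0 − μ1), which
    maximises J(w) = ||wᵀμ0 − wᵀμ1||² / (wᵀΣ0w + wᵀΣ1w)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two class covariance matrices.
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    w = np.linalg.solve(Sw, mu0 - mu1)
    return w / np.linalg.norm(w)  # unit-length projection direction
```

Projecting both classes onto this direction separates their means while keeping their projected variances small, which is exactly what the criterion above rewards.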
CN202310420813.5A 2023-04-18 2023-04-18 Depression recognition model construction method based on Stacking integration Pending CN116484290A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310420813.5A CN116484290A (en) 2023-04-18 2023-04-18 Depression recognition model construction method based on Stacking integration


Publications (1)

Publication Number Publication Date
CN116484290A true CN116484290A (en) 2023-07-25

Family

ID=87217177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310420813.5A Pending CN116484290A (en) 2023-04-18 2023-04-18 Depression recognition model construction method based on Stacking integration

Country Status (1)

Country Link
CN (1) CN116484290A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117338313A (en) * 2023-09-15 2024-01-05 武汉纺织大学 Multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology
CN117338313B (en) * 2023-09-15 2024-05-07 武汉纺织大学 Multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination