CN111126241B - Electroencephalogram mode extraction method based on optimal sequence feature subset - Google Patents

Electroencephalogram mode extraction method based on optimal sequence feature subset

Info

Publication number
CN111126241B
CN111126241B (application CN201911319017.2A; publication CN 111126241 B)
Authority
CN
China
Prior art keywords
sequence
data
representing
electroencephalogram
equal
Prior art date
Legal status
Active
Application number
CN201911319017.2A
Other languages
Chinese (zh)
Other versions
CN111126241A (en)
Inventor
臧明文
黄刚
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN201911319017.2A
Publication of CN111126241A
Application granted
Publication of CN111126241B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F 2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211 Selection of the most significant subset of features
    • G06F 18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/015 Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F 2203/01 Indexing scheme relating to G06F3/01
    • G06F 2203/011 Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F 2218/12 Classification; Matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Signal Processing (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Dermatology (AREA)
  • General Health & Medical Sciences (AREA)
  • Neurology (AREA)
  • Neurosurgery (AREA)
  • Human Computer Interaction (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

The invention discloses an electroencephalogram mode extraction method based on an optimal sequence feature subset, aimed at the high-precision classification requirements that brain-computer interface systems place on electroencephalogram data. PAA dimensionality reduction and SAX symbolization reduce the complexity of the data, and similarity computation between data is replaced by a Hash scheme, giving higher speed and efficiency. A discrimination measure is defined to determine the discriminative power of the symbol subsequences for different electroencephalogram signals.

Description

Electroencephalogram mode extraction method based on optimal sequence feature subset
Technical Field
The invention relates to the design and implementation of methods for electroencephalogram data processing, feature discovery, pattern classification, and pattern mining based on sequence feature subsets. It aims to form an electroencephalogram analysis and mining system based on sequence feature subsets, and belongs to the intersection of electroencephalogram analysis and pattern mining technologies.
Background
The optimal feature subset is the set of subsequences that most strongly represents a class. Classification methods based on feature subsets offer high classification accuracy, high classification speed, and strong interpretability. A feature subset consists of a number of discriminative subsequences that express the greatest differences between different sequence data. Besides supporting efficient classification, a feature subset carries intuitively interpretable sequence features. The principle of the conventional feature-subset discovery algorithm is as follows: from the set of all possible candidate subsequences, compute and compare the classification information gain of each, and finally select the subsequence with the largest between-class difference as the optimal feature subsequence. This style of subsequence feature extraction is therefore a promising approach in the field of electroencephalogram analysis.
Electroencephalogram patterns have become a research hotspot as they are increasingly used in gaming applications and in stroke rehabilitation, where brain signals of an imagined task are converted into the intended movement of a paralyzed limb. For example, a wheelchair controlled through a brain-computer interface can enable a disabled person to move around a house and perform basic tasks. Furthermore, through a brain-computer interface we can detect in advance that a person is about to suffer a seizure and notify them early, preventing accidents or serious injury. Non-invasive sensors have been widely used for acquiring electroencephalographic signals because of their low cost, ease of use, and freedom from the surgery that invasive sensors require.
Some results have been obtained in using electroencephalogram signals to drive dependable neural plasticity or rehabilitation robots, but brain-computer interfaces for rehabilitation remain an emerging field. Classifying different tasks from electroencephalogram signals more accurately is beneficial not only for gaming and rehabilitation but also for better detecting diseases or abnormal behaviors, such as epilepsy, sleep apnea, sleep stages, and drowsiness. However, the data transmitted by the sensor faces problems such as noise interference, distortion of the true data, and high data dimensionality. A method that classifies different types of electroencephalogram signals with high precision is therefore urgently needed to improve the performance of brain-computer interface systems.
The optimal feature subsequence is used to find the most discriminative features in the electroencephalogram signal, and only a portion of the sequence data is used when discrimination is needed; compared with methods that train and classify on complete data samples, this gives stronger noise resistance and more accurate data classification and extraction. In addition, because the optimal subset is extracted explicitly, the method is highly interpretable: the differences are obtained along with the output features, providing a basis for analyzing the causes behind the electroencephalogram data.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the high-precision classification requirement of a brain-computer interface system on classification of electroencephalogram data, the invention provides an electroencephalogram mode extraction method based on an optimal sequence feature subset.
The technical scheme is as follows: in order to achieve the purpose, the invention adopts the technical scheme that:
an electroencephalogram mode extraction method based on an optimal sequence feature subset comprises the following steps:
step 1, acquiring real-time electroencephalogram data through a sensor, sending a request to a central server, and requesting to send the electroencephalogram data.
Step 2, the central server receives the sensor request, receives the data, and processes it; the acquired data is preliminarily processed to obtain an original data sequence $C=(c_1,c_2,c_3,\dots,c_n)$, where n is the length of the original data, $C \in D$, and D is the original data set.
Step 3, apply piecewise aggregate approximation to the original data C to reduce the data dimensionality:

$$\bar{c}_t=\frac{w}{n}\sum_{j=\frac{n}{w}(t-1)+1}^{\frac{n}{w}t}c_j$$

where $1\le j\le n$ and $1\le t\le w$; $\bar{C}$ denotes the sequence after piecewise segmentation, $w$ the number of data segments in the piecewise aggregate approximation, $c_j$ an element of the original sequence, and $\bar{c}_t$ an element of the sequence after the segmented representation.
Step 4, symbolize the dimensionality-reduced data, converting it into the letter-sequence representation of the corresponding letter space:

$$\hat{c}_i=\alpha_p, \quad \beta_{p-1}\le \bar{c}_i<\beta_p$$

where $1\le i\le w$ and $\beta_{p-1}<\beta_p$; $\hat{C}$ denotes the symbolized segmented sequence, $\alpha_p$ the symbol assigned by the symbol mapping, and $\beta_{p-1}$ and $\beta_p$ the bounds of the interval mapped to the symbol $\alpha_p$.
Step 5, the server passes the symbolized sequence to the central online classifier for classification. If the result matches a normal electroencephalogram signal, no response is made and the server continues to wait for the next sensor request. If the result matches an abnormal electroencephalogram signal, feedback is sent to the target connected to the sensor.
The classification processing in step 5 includes the steps of:
the initialization of the classifier in the server requires a set of labeled data to be trained to obtain a sufficiently reliable and accurate feature subset, at step 51, the optimal feature sequence in the feature subset.
Set={s1,s2,…,sk,…,sl}
Wherein k is more than or equal to 1 and less than or equal to l, Set represents an optimal subsequence Set, and skRepresenting a certain sequence of features in the set, and l representing the current number of features.
Step 52, because data of the same type may be mapped to different symbolized sequences after the original data is mapped into the symbol space, similar sequences are obtained by masking some letters of the symbolized sequences, mitigating sequence ordering effects and outlier influence.
Round 1: $\hat{C}\,\&\,M_1$;
Round 2: $\hat{C}\,\&\,M_2$;
…
Round $m-1$: $\hat{C}\,\&\,M_{m-1}$;
in each round the mask sets the two adjacent positions $q-1$ and $q$ to 0, where $\hat{C}$ denotes a symbolized sequence of length m, m is the sequence length, $\&$ is the logical AND of corresponding elements, and $1\le q-1<q\le m$, with $q-1$ and $q$ two adjacent positions in the sequence.
Step 53, in each round the symbol sequences masked in step 52 are mapped into the Hash table T by a Hash operation, and the count of symbolized sequences sharing the same Hash value is incremented.
Step 54, the number of symbol sequences with Hash collisions is stored in the Hash table T, and the influence of each symbol subsequence on classification is computed by defining the discrimination Dist:

$$Dist=d_{far}+d_{close}$$

where $d_{far}$ is the discrimination value with respect to the other classes and $d_{close}$ the discrimination value within this class.
Step 55, extract the k candidate subsequences with the highest discrimination value among all symbol subsequences in each class, then compute the overall information entropy $I_S$ on the data set D and select the candidate with the maximum information gain

$$\Delta I = I - I_S$$

as the basis for the optimal feature subset of the class, where $I$ denotes the amount of information before classification and $I_S$ the amount of information after classification.
Step 56, for the original data acquired in each sensing round, if a better difference entropy can be obtained, the optimal symbol subsequence is added into Set and the data in Set is updated. Whether the overall subsequence is optimal is judged using the expectation rate. If the expectation rate lies within the preset threshold of the sequence expectation rate, the symbol sequence is matched against the sequences in the feature subset; whether it is a normal or abnormal sequence is judged from the matched feature sequence, and if it matches an abnormal sequence, the result is fed back to the sensor user.
Preferably: the preliminary processing in step 2 includes verification, de-duplication, and normalization operations.
Preferably: after the piecewise aggregation processing, any segment mean that satisfies $\beta_{p-1}\le \bar{c}<\beta_p$ is mapped to the letter $\alpha_p$.
Compared with the prior art, the invention has the following beneficial effects:
the invention does not train samples on the whole sequence of each sample of a data set like the traditional electroencephalogram analysis method when processing the electroencephalogram data, and establishes the mapping of the original data in a symbolic expression mode through dimension reduction processing. The method not only reduces the complexity of the data to a great extent, but also avoids the interference of noise on the whole data received by the algorithm, and has higher precision. On the other hand, the Hash matching mechanism is used in the model, so that the matching and calculation amount among symbols is reduced to a great extent, the access speed is improved, the requirement of back-end data on the sensor is further reduced, the requirement of electroencephalogram equipment on hardware is reduced, and the barrier on the hardware is broken.
In conclusion, the invention can overcome the problems of low precision and high complexity in the traditional electroencephalogram mode extraction, and because of using the optimal characteristic subsequence matching mode, most noise interference on data is avoided, the calculation complexity is reduced, and the system operation efficiency is improved.
Drawings
FIG. 1 is a flow chart of user data discrimination.
FIG. 2 illustrates a user data and background service interaction flow.
FIG. 3 is a diagram of the SAX symbolic representation.
FIG. 4 shows the optimal feature subsequence selection process.
Detailed Description
The present invention is further illustrated by the following description in conjunction with the accompanying drawings and the specific embodiments, it is to be understood that these examples are given solely for the purpose of illustration and are not intended as a definition of the limits of the invention, since various equivalent modifications will occur to those skilled in the art upon reading the present invention and fall within the limits of the appended claims.
An electroencephalogram mode extraction method based on an optimal sequence feature subset is shown in fig. 1-4, and comprises the following steps:
step 1, acquiring real-time electroencephalogram data through a sensor, sending a request to a central server, and requesting to send the electroencephalogram data.
Step 2, the central server receives the sensor request, receives the data, and processes it. The acquired data undergoes preliminary processing, including verification, de-duplication, and normalization, to obtain the original data C.
Step 3, apply piecewise aggregate approximation to the original data C, reducing the data dimensionality to improve processing speed and precision. Piecewise Aggregate Approximation (PAA) is a dimensionality-reduction method for high-dimensional sequence data: the data is divided into segments and aggregated approximately, with the mean of the data in each segment used as that segment's approximate value, which effectively reduces the data dimensionality and improves efficiency:

$$\bar{c}_t=\frac{w}{n}\sum_{j=\frac{n}{w}(t-1)+1}^{\frac{n}{w}t}c_j$$

where $1\le j\le n$ and $1\le t\le w$; $\bar{C}$ denotes the sequence after piecewise segmentation, $w$ the number of data segments in the piecewise aggregate approximation, $c_j$ an element of the original sequence, and $\bar{c}_t$ an element of the sequence after the segmented representation.
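As an illustrative sketch (not part of the patent text), the PAA step can be written as follows; the function name `paa` and the simplifying assumption that w divides n evenly are mine:

```python
import numpy as np

def paa(series, w):
    """Piecewise Aggregate Approximation: split a length-n series into w
    equal segments and replace each segment by its mean (illustrative;
    assumes w divides n evenly)."""
    n = len(series)
    assert n % w == 0, "sketch assumes w divides n"
    seg = n // w
    return np.asarray(series, dtype=float).reshape(w, seg).mean(axis=1)

c = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(paa(c, 4))  # -> [1.5 3.5 5.5 7.5]
```

Each output value is the mean of one segment, matching the summation formula above with segment size n/w.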
Step 4, Symbolic Aggregate approXimation (SAX): on the basis of the piecewise aggregate approximation, the segmented data is represented symbolically, which reduces the influence of data fluctuations within a range and improves precision and classification efficiency. The dimensionality-reduced data is therefore converted into the letter-sequence representation of the corresponding letter space; that is, the segmented data is mapped onto the alphabet through the Gaussian distribution, with the mapping determined by:

$$\hat{c}_i=\alpha_p, \quad \beta_{p-1}\le \bar{c}_i<\beta_p$$

where $1\le i\le w$ and $\beta_{p-1}<\beta_p$; $\hat{C}$ denotes the symbolized segmented sequence, $\alpha_p$ the symbol assigned by the symbol mapping, and $\beta_{p-1}$ and $\beta_p$ the bounds of the interval mapped to the symbol $\alpha_p$.

The data are mapped to the letter space according to the Gaussian distribution, and the letter corresponding to each datum is obtained by looking up the distribution table, where $\beta$ is the threshold of the data interval corresponding to each letter; that is, values falling in the same interval after the piecewise aggregate approximation are mapped to the same letter.
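A minimal sketch of this SAX mapping, assuming a 4-letter alphabet and the standard Gaussian breakpoints (−0.6745, 0, 0.6745) that cut the standard normal distribution into four equiprobable regions; the names `BREAKPOINTS`, `ALPHABET`, and `sax_symbolize` are illustrative:

```python
# Standard-normal quartile breakpoints for an alphabet of size 4.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"

def sax_symbolize(paa_values):
    """Map z-normalized PAA means to letters: a value falling in the
    interval (beta_{p-1}, beta_p] is assigned the letter alpha_p."""
    out = []
    for v in paa_values:
        p = sum(v > b for b in BREAKPOINTS)  # count breakpoints below v
        out.append(ALPHABET[p])
    return "".join(out)

print(sax_symbolize([-1.2, -0.3, 0.1, 1.5]))  # -> abcd
```

Larger alphabets work the same way with more breakpoints from the Gaussian distribution table.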
Step 5, the server passes the symbolized sequence to the central online classifier for classification. If the result matches a normal electroencephalogram signal, no response is made and the server continues to wait for the next sensor request. If the result matches an abnormal electroencephalogram signal, feedback is sent to the target connected to the sensor.
The classification processing in step 5 includes the steps of:
the initialization of the classifier in the server requires a set of labeled data to be trained to obtain a sufficiently reliable and accurate feature subset, at step 51, the optimal feature sequence in the feature subset.
Set={s1,s2,…,sk,…,sl}
Wherein k is more than or equal to 1 and less than or equal to l, Set represents an optimal subsequence Set, and skIn a representation setA certain sequence of features, l, represents the current number of features.
Step 52, since data of the same type may be mapped to different symbolized sequences after the original data is mapped into the symbol space, similar sequences are obtained by masking some letters of the symbolized sequences, mitigating sequence ordering effects and outlier influence.
Round 1: $\hat{C}\,\&\,M_1$;
Round 2: $\hat{C}\,\&\,M_2$;
…
Round $m-1$: $\hat{C}\,\&\,M_{m-1}$;
in each round the mask sets the two adjacent positions $q-1$ and $q$ to 0, where $\hat{C}$ denotes a symbolized sequence of length m, m is the sequence length, $\&$ is the logical AND of corresponding elements, and $1\le q-1<q\le m$, with $q-1$ and $q$ two adjacent positions in the sequence.
Step 53, to ensure efficiency and speed up the similarity measurement between symbols, direct pairwise similarity computation over all N × N sequence pairs cannot be adopted; a Hash operation is therefore used instead of direct measurement. In each round, the symbol sequences masked in step 52 are mapped into the Hash table T by a Hash operation, and the count of symbol sequences sharing the same Hash value is incremented.
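The masking and hashing of steps 52–53 can be sketched as below. This is an illustrative reconstruction, not the patent's implementation: a `"_"` placeholder stands in for the logical-AND zeroing of two adjacent positions, and Python's built-in hashing of tuples plays the role of the Hash table T:

```python
from collections import Counter

def masked_variants(symbol_seq):
    """Yield every masked variant of a symbol sequence: round q (for
    q = 1 .. m-1) blanks the two adjacent positions q-1 and q."""
    m = len(symbol_seq)
    for q in range(1, m):
        mask = [1] * m
        mask[q - 1] = mask[q] = 0
        yield tuple(s if keep else "_" for s, keep in zip(symbol_seq, mask))

def hash_table(sequences):
    """Map every masked variant of every sequence into a hash table T,
    incrementing the counter when variants collide (identical masked
    variants hash to the same bucket)."""
    T = Counter()
    for seq in sequences:
        for variant in masked_variants(seq):
            T[variant] += 1
    return T

T = hash_table(["bccb", "bcab"])
# "bccb" and "bcab" agree once positions 3 and 4 (q-1, q) are masked:
print(T[("b", "c", "_", "_")])  # -> 2
```

A collision count of 2 here means two sequences look identical after the same pair of positions is masked, which is exactly the similarity signal step 54 builds on.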
Step 54, the number of symbol sequences with Hash collisions is stored in the Hash table T, and the influence of each symbol subsequence on classification is computed by defining the discrimination Dist:

$$Dist=d_{far}+d_{close}$$

where $d_{far}$ is the discrimination value with respect to the other classes and $d_{close}$ the discrimination value within this class; the discrimination of this class against the others is obtained by summing the two. Thus, the higher a symbolic subsequence's discrimination value for class A and the higher its discrimination value against non-A classes, the higher its overall discrimination, making it well suited as a criterion for distinguishing between classes.
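The patent does not spell out how $d_{far}$ and $d_{close}$ are derived from the collision counts, so the following is only a hedged sketch of one plausible reading: $d_{close}$ as the pattern's collision count inside its own class, and $d_{far}$ as a penalty for occurrences in the other classes. All names here are mine:

```python
from collections import Counter

def dist(pattern, tables_by_class, cls):
    """Illustrative Dist = d_close + d_far for one candidate pattern:
    d_close = collision count inside class `cls`,
    d_far   = negative count of the pattern in all other classes
    (an assumed reading of the patent's unspecified definitions)."""
    d_close = tables_by_class[cls][pattern]
    others = sum(t[pattern] for c, t in tables_by_class.items() if c != cls)
    d_far = -others
    return d_close + d_far

tables = {"normal": Counter({"bc__": 5}), "abnormal": Counter({"bc__": 1})}
print(dist("bc__", tables, "normal"))  # -> 4
```

Under this reading, a pattern frequent in its own class and rare elsewhere scores highest, which matches the criterion described in step 54.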
Step 55, extract the k candidate subsequences with the highest discrimination value among all symbol subsequences in each class, then compute the overall information entropy $I_S$ on the data set D and select the candidate with the maximum information gain

$$\Delta I = I - I_S$$

as the basis for the optimal feature subset of the class, where $I$ denotes the amount of information before classification and $I_S$ the amount of information after classification.
Step 56, for the original data acquired in each sensing round, if a better difference entropy can be obtained, the optimal symbol subsequence is added into Set and the data in Set is updated. For the overall accuracy of the model, whether the overall subsequence is optimal is judged using the expectation rate. If the expectation rate lies within the preset threshold of the sequence expectation rate, the symbol sequence is matched against the sequences in the feature subset; whether it is a normal or abnormal sequence is judged from the matched feature sequence, and if it matches an abnormal sequence, the result is fed back to the sensor user.
The operation method of the invention is as follows:
step A: the electroencephalogram signal is acquired by the wearable device by a device user, the electroencephalogram sensor on the wearable device has the characteristics of low power consumption and low design requirement, a simple detection and judgment process is needed, and data can be submitted to an application end, as shown in figure 2.
Step B: because the sensor cannot communicate with the server directly, the device user's communication equipment (such as a computer or mobile phone) sends the data collected by the sensor in a request to the server end, as shown in (c), then obtains the interaction information returned by the processing server and presents it to the user to decide the next action.
Step C: the processor processes the data transmitted by the communication equipment and then returns the result to the equipment through process (iv), where it is handled by the device-side software.
The optimal-sequence selection process during initialization is shown in the flow chart of FIG. 4 and comprises the following specific steps:
step C1: and cleaning all data of the original data set, and performing operations such as duplicate removal, normalization, abnormal value processing and the like.
Step C2: cleaning data C ═ C1,…,cn) Represented by PAA segmentation into
Figure BDA0002326627880000071
Step C3: as shown in fig. 3, the segmented sequence is further represented as a symbol sequence by the SAX method
Figure BDA0002326627880000072
The value after PAA segmentation treatment satisfies
Figure BDA0002326627880000073
Will map to the letter b and a segment sequence can be represented as a character sequence of bccbaaabc.
Step C4: the character sequence is randomly masked, and the left side of the table I and the table II shows that the gray area is a masking part, and a part of the character is randomly masked each time. And further carrying out Hash operation on unmasked parts in the masked sub-character sequence, adding one to the numerical value of the sequence corresponding to the operation result, and carrying out an adding operation on the numerical value when a plurality of characters generate the same Hash value collision. The right side of the table I and the table II shows the result table of hash operation of different sequences, and then different sequences are added and subtracted to obtain the discrimination.
Table 1: randomly masking the first two characters
Table 2: randomly masking the selected middle two characters
Step C5: if the length of the character sequence is l, the characters at two positions are taken to be randomly covered and Hash operation is required to be carried out at most
Figure BDA0002326627880000083
After this operation, the mapping operation of the Hash table in step 4 is completed.
Table 3: discrimination computation process
Step C6: using Dist ═ dfar+dcloseThe calculated discriminations were calculated as shown in table 3.
Step C7: and extracting the 10 sub-symbol sequences with the highest discrimination.
Step C8: information entropy of 10 sub-symbol sequences of a calculator on an original data set
Figure BDA0002326627880000091
If it satisfies
Figure BDA0002326627880000092
Then the sub-symbol sequence is decoded
Figure BDA0002326627880000093
And adding the mixture into the Set.
After the initialization of the server is completed, a specific processing flow for receiving the request is shown in fig. 1.
Step D1: and cleaning all data of the original data set, and performing operations such as duplicate removal, normalization, abnormal value processing and the like.
Step D2: cleaning data C ═ C1,…,cn) Expressed by PAA segmentation into
Figure BDA0002326627880000094
Step D3: then the segmented sequence is expressed into a symbol sequence by an SAX method
Figure BDA0002326627880000095
The value after PAA segmentation treatment satisfies
Figure BDA0002326627880000096
Is mapped to the letter b and a segment sequence can be represented as a sequence of characters.
Step D4: the symbol sequence is matched with the symbol sequence of a tree formed by the character good sequences in the feature subset from the tree root to the leaf node through a classifier.
Step D5: and if the normal sequence is matched, returning a normal value to the user software, and if the abnormal sequence is matched, returning an abnormal value warning.
Aiming at the high-precision classification requirements of brain-computer interface systems for electroencephalogram data, the brain-computer interface classifies signals by finding the optimal feature subsequences of the electroencephalogram signal, providing strongly interpretable results while ensuring classification precision. PAA dimensionality reduction and SAX symbolization reduce the complexity of the data, and similarity computation between data is replaced by a Hash scheme, giving higher speed and efficiency. A discrimination measure is defined to determine the discriminative power of the symbol subsequences for different electroencephalogram signals.
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.

Claims (3)

1. An electroencephalogram mode extraction method based on an optimal sequence feature subset is characterized by comprising the following steps:
step 1, acquiring real-time electroencephalogram data through a sensor, sending a request to a central server, and requesting to send the electroencephalogram data;
step 2, the central server receives the sensor request, receives the data, and processes it; the acquired data is preliminarily processed to obtain an original data sequence $C=(c_1,c_2,c_3,\dots,c_n)$, where n is the length of the original data, $C \in D$, and D is the original data set;
step 3, performing piecewise aggregate representation on the original data C and reducing the data dimensionality:

$$\bar{c}_t=\frac{w}{n}\sum_{j=\frac{n}{w}(t-1)+1}^{\frac{n}{w}t}c_j$$

wherein $1\le j\le n$ and $1\le t\le w$, $\bar{C}$ represents the sequence after piecewise segmentation, $w$ represents the number of data segments in the piecewise aggregate approximation, $c_j$ represents an element of the original sequence, and $\bar{c}_t$ represents an element of the sequence after the segmented representation;
step 4, performing symbolic conversion on the dimensionality-reduced data, converting it into the letter-sequence representation of the corresponding letter space:

$$\hat{c}_i=\alpha_p, \quad \beta_{p-1}\le \bar{c}_i<\beta_p$$

wherein $1\le i\le w$ and $\beta_{p-1}<\beta_p$, $\hat{C}$ represents the symbolized segmented sequence, $\alpha_p$ represents the symbol assigned by the symbol mapping, and $\beta_{p-1}$ and $\beta_p$ represent the bounds of the interval mapped to the symbol $\alpha_p$;
step 5, the server passes the symbolized sequence to the central online classifier for classification; if the result matches a normal electroencephalogram signal, no response is made and the server continues to wait for sensor requests; if the result matches an abnormal electroencephalogram signal, feedback is sent to the target connected to the sensor;
the classification processing in step 5 includes the steps of:
step 51, a set of labeled data is required to train the classifier initialized in the server, so as to obtain a feature subset with sufficient confidence and accuracy, where the optimal feature sequences in the feature subset are:
Set = {s_1, s_2, …, s_k, …, s_l}
wherein 1 ≤ k ≤ l, Set represents the optimal subsequence set, s_k represents a feature sequence in the set, and l represents the current number of features;
step 52, because data of the same type may be mapped to different symbol sequences after the original data is mapped into the symbol space, similar sequences are obtained by covering some letters of the symbol sequences, so as to mitigate the effects of sequence ordering and outliers;
round 1:
$$\hat{C}_1 = \hat{C} \;\&\; M_1$$
round 2:
$$\hat{C}_2 = \hat{C} \;\&\; M_2$$
round p:
$$\hat{C}_p = \hat{C} \;\&\; M_p, \quad \text{with positions } q-1 \text{ and } q \text{ set to } 0;$$
wherein $\hat{C}$ represents a symbolized sequence of length m, m represents the sequence length, & represents the logical AND of corresponding elements, 1 ≤ q ≤ m, and q−1 and q represent two adjacent positions in the sequence;
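The round-by-round covering of step 52 can be sketched by replacing two adjacent positions with a wildcard that stands in for the zeroed mask positions. This is an assumption about the mask layout, which appears only as equation images in the original filing:

```python
def masked_variants(seq, wildcard="*"):
    # One variant per round: in round q, the adjacent positions
    # q-1 and q are covered (the logical-AND-with-mask of step 52).
    m = len(seq)
    variants = []
    for q in range(1, m):
        s = list(seq)
        s[q - 1] = s[q] = wildcard
        variants.append("".join(s))
    return variants
```
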
step 53, in each round, the symbol sequences covered in step 52 are mapped into the hash table T by a hash operation, and the count for symbol sequences with the same hash value is incremented;
step 54, the numbers of symbol sequences with hash collisions are stored in the hash table T, and the influence of each symbol on classification is computed by defining the discrimination value Dist;
Dist = d_far + d_close
wherein d_far refers to the discrimination value with respect to classes outside this class, and d_close refers to the discrimination value within this class;
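The hash-table bookkeeping of steps 53-54 amounts to counting equal covered sequences: equal keys collide and a counter is auto-incremented. Python's `Counter` (itself backed by a hash table) reproduces that bookkeeping; `count_patterns` is our own name for the sketch:

```python
from collections import Counter

def count_patterns(masked_seqs):
    # Equal covered sequences hash to the same bucket; the stored
    # count is incremented on each collision, as in steps 53-54.
    table = Counter()
    for s in masked_seqs:
        table[s] += 1
    return table
```
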
step 55, extracting the k candidate subsequences with the highest discrimination values from all symbol subsequences of each class, then computing the overall information entropy I_S over the data set D, and selecting the minimum information difference Min(I′_S) as the basis of the optimal feature subset for the class, wherein I′_S = I − I_S, I′_S represents the difference entropy, I represents the amount of information before classification, and I_S represents the amount of information after classification;
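The difference entropy I′_S = I − I_S of step 55 can be sketched as an information-gain computation: entropy of the labels before classification minus the weighted entropy after splitting on a candidate subsequence. The function names and the `partition` argument (which group each sample falls into) are illustrative assumptions, not from the patent:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (bits) of a label multiset.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, partition):
    # I' = I - I_S: entropy before minus weighted entropy after the split.
    n = len(labels)
    groups = {}
    for lab, g in zip(labels, partition):
        groups.setdefault(g, []).append(lab)
    i_s = sum(len(v) / n * entropy(v) for v in groups.values())
    return entropy(labels) - i_s
```
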
step 56, for the original data acquired in each sensing cycle, if a better difference entropy can be obtained, the optimal symbol subsequence is added to Set and the data in Set is updated; whether the overall subsequence is optimal is judged by the expectation rate;
(expectation-rate formula, presented as an equation image in the original filing)
if the expectation rate is within a preset threshold of the sequence expectation rate, the symbol sequence is matched against the sequences in the feature subset; whether it is a normal sequence is judged from the matched feature sequence, and if it matches an abnormal sequence, the result is fed back to the sensor user.
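The matching of an incoming symbol sequence against covered feature sequences can be sketched as a wildcard-tolerant comparison: every uncovered letter must agree. The patent does not spell out the matcher; this is an assumption consistent with the covering of step 52:

```python
def matches(symbol_seq, feature_seq, wildcard="*"):
    # A covered feature sequence matches when every non-wildcard
    # position agrees with the incoming symbol sequence.
    return len(symbol_seq) == len(feature_seq) and all(
        f == wildcard or f == s for s, f in zip(symbol_seq, feature_seq))
```
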
2. The electroencephalogram mode extraction method based on the optimal sequence feature subset according to claim 1, characterized in that: the preliminary processing in step 2 includes performing a verification and de-normalization operation.
3. The electroencephalogram mode extraction method based on the optimal sequence feature subset according to claim 2, characterized in that: all data whose mean values after the piecewise aggregation processing satisfy
$$\beta_{p-1} \le \bar{c}_i < \beta_p$$
will be mapped to the letter α_p.
CN201911319017.2A 2019-12-19 2019-12-19 Electroencephalogram mode extraction method based on optimal sequence feature subset Active CN111126241B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911319017.2A CN111126241B (en) 2019-12-19 2019-12-19 Electroencephalogram mode extraction method based on optimal sequence feature subset


Publications (2)

Publication Number Publication Date
CN111126241A CN111126241A (en) 2020-05-08
CN111126241B true CN111126241B (en) 2022-04-22

Family

ID=70500226


Country Status (1)

Country Link
CN (1) CN111126241B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107518894A (en) * 2017-10-12 2017-12-29 Nanchang Police Dog Base, Ministry of Public Security Construction method and device for an animal electroencephalogram classification model
CN109199414A (en) * 2018-10-30 2019-01-15 Wuhan University of Technology Audio-visually induced emotion recognition method and system based on EEG signals
CN109497996A (en) * 2018-11-07 2019-03-22 Taiyuan University of Technology Complex network construction and analysis method for EEG microstate temporal features
CN109691996A (en) * 2019-01-02 2019-04-30 Central South University EEG signal feature selection and classifier selection method based on hybrid binary coding

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080263432A1 (en) * 2007-04-20 2008-10-23 Entriq Inc. Context dependent page rendering apparatus, systems, and methods


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Turky Alotaiby et al.; "A review of channel selection algorithms for EEG signal processing"; EURASIP Journal on Advances in Signal Processing (2015); 2015-08-01; pp. 1-21 *


Similar Documents

Publication Publication Date Title
CN110008674B (en) High-generalization electrocardiosignal identity authentication method
Ahmed et al. Appearance-based arabic sign language recognition using hidden markov models
CN106529504B A bimodal video emotion recognition method based on composite spatio-temporal features
CN109190698B (en) Classification and identification system and method for network digital virtual assets
CN113486752B (en) Emotion recognition method and system based on electrocardiosignal
JP7330338B2 (en) Human image archiving method, device and storage medium based on artificial intelligence
CN115294658A (en) Personalized gesture recognition system and gesture recognition method for multiple application scenes
Chowdhury et al. Lip as biometric and beyond: a survey
Wen et al. A two-dimensional matrix image based feature extraction method for classification of sEMG: A comparative analysis based on SVM, KNN and RBF-NN
CN114384999B (en) User-independent myoelectric gesture recognition system based on self-adaptive learning
Jang et al. Motor-imagery EEG signal classification using position matching and vector quantisation
CN114239649B (en) Identity recognition method for discovering and recognizing new user by photoelectric volume pulse wave signal of wearable device
Karayaneva et al. Unsupervised Doppler radar based activity recognition for e-healthcare
CN111126241B (en) Electroencephalogram mode extraction method based on optimal sequence feature subset
CN112380903B (en) Human body activity recognition method based on WiFi-CSI signal enhancement
CN106709442B (en) Face recognition method
CN110502883B (en) PCA-based keystroke behavior anomaly detection method
Suriani et al. Smartphone sensor accelerometer data for human activity recognition using spiking neural network
Popescu-Bodorin Exploring new directions in iris recognition
Cheng et al. Advancing surface feature encoding and matching for more accurate 3D biometric recognition
CN113057654B (en) Memory load detection and extraction system and method based on frequency coupling neural network model
CN114764580A (en) Real-time human body gesture recognition method based on no-wearing equipment
KR101556696B1 (en) Method and system for recognizing action of human based on unit operation
Saqib et al. Recognition of static gestures using correlation and cross-correlation
KR101058719B1 (en) Method of input control based on hand posture recognization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant