CN109545372B - Patient physiological data feature selection method based on greedy-of-distance strategy - Google Patents
- Publication number: CN109545372B (application CN201811313953.8A)
- Authority
- CN
- China
- Prior art keywords
- wolf
- vector
- value
- distance
- greedy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
The invention discloses a patient physiological data feature selection method based on a distance-greedy strategy, which improves on the relatively low performance of existing feature selection algorithms.
Description
Technical Field
The invention belongs to the technical field of medical treatment, relates to a patient physiological data feature selection method, and particularly relates to a grey wolf feature selection method based on a distance-greedy strategy.
Background
Nowadays, science and technology develop at high speed, and medical detection systems are continuously updated and mature by the day. Heart disease is a major killer of human health, and detecting it before onset is of great significance. A patient's physiological data contains many features, and many of them are redundant; the redundant features make the workload of detecting heart disease enormous and degrade the result. The grey wolf optimization algorithm (GWO) is a swarm intelligence algorithm in current use: by simulating the hunting process of a wolf pack it determines the position of the prey, i.e. the optimal solution of the optimization problem, and it is widely used in feature selection. However, the algorithm itself converges slowly and searches inefficiently. The invention provides an improved grey wolf algorithm for the feature selection stage: the position-update step of the standard grey wolf algorithm is replaced with a greedy strategy, which improves the efficiency of searching for the optimum, so that a better feature set can be extracted and the detection of samples is facilitated.
The purpose of feature selection is to extract the important features from the data and remove the redundant ones. Feature selection can reduce data dimensionality, improve prediction performance, reduce overfitting, and enhance the understanding of the relations between features and feature values. In the real world, the data to be classified often carries a large number of redundant features: some features can be replaced by other features, and the replaced features can be removed during classification. Furthermore, the interconnections between features strongly influence the classification output; if we can find the connections between features, we can dig out a large amount of information hidden in the data.
All feature selection algorithms fall into three categories: filtering, embedding, and wrapping. The filtering method first performs feature selection on the data set and then trains a classifier, decoupling the two. Its key is to find a measure of feature importance, such as the Pearson correlation coefficient or mutual information; the features are then sorted by this measure, and the top-ranked ones are chosen as the classification features. The drawback of this method is that it ignores the interdependence between features. On the one hand, if some features are strongly correlated, the top-ranked set effectively introduces redundancy; on the other hand, a lower-ranked feature whose measure is small may still predict well in combination with other features, so valuable features are lost. The embedded method integrates feature selection into the training of the learner, completing the two in one unified process, as in LASSO and ridge regression. The core idea of the wrapping method is that, given a training model and an evaluation method for the prediction effect, the prediction effect of each feature subset in the feature space is evaluated, and the subset with the best prediction effect is selected as the final training subset.
Because it takes the interdependence between features into account, the feature subset selected by the wrapping method predicts better than that of the filtering method; its drawback is a large computation cost, since the number of feature subsets grows exponentially. Different algorithms have been devised to search the whole feature space efficiently.
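As an illustration of the filtering approach described above, the sketch below (helper names are my own, not from the patent) ranks features by absolute Pearson correlation with the label and keeps the top-k:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def filter_select(rows, labels, k):
    """Filter-style selection: rank features by |Pearson correlation|
    with the label and return the indices of the top-k features."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels))
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -scores[j])[:k]

# Toy data: feature 0 tracks the label, feature 1 alternates independently.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
rows = [[0.1, 1.0], [0.2, -1.0], [0.0, 1.0], [0.15, -1.0],
        [0.9, 1.0], [1.0, -1.0], [0.95, 1.0], [1.1, -1.0]]
selected = filter_select(rows, labels, 1)
```

As the Background notes, such a ranking ignores feature interdependence, which is exactly what the wrapper approach below addresses.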
The genetic algorithm was the first intelligent algorithm applied to this problem. Its idea derives from the reproductive genetics of natural populations: a solution of the optimization problem is regarded as a gene, and genetic operations including crossover and mutation are carried out over the whole population. The natural environment plays the role of the objective function, and genes well adapted to it are retained and passed on to the next generation. Genetic algorithms can solve complex nonlinear optimization problems, but they have drawbacks such as low operating efficiency and a tendency to fall into local optima.
Particle swarm optimization (PSO) stems from the study of the foraging behavior of bird flocks. Each potential solution of the optimization problem is regarded as a point in a d-dimensional search space, called a "particle". Every particle has a fitness value determined by the objective function and a velocity that determines its flight direction and distance, and the particles follow the current best particle to search the solution space. Compared with traditional multi-objective optimization methods, PSO has great advantages on multi-objective problems, but it suffers from low precision, easy divergence, and similar drawbacks.
Disclosure of Invention
The invention aims to solve the problems that existing patient physiological data feature selection algorithms converge slowly, search inefficiently, and easily fall into local optima, and provides a grey wolf feature selection method based on a distance-greedy strategy, improving classification accuracy and reducing data feature redundancy.
The technical scheme adopted by the invention is as follows: a patient physiological data feature selection method based on a greedy-of-distance strategy, comprising the following steps:
step 1: inputting data captured from physiological data of a patient, and forming sample data containing labels into a training set; wherein, the label marks that the physiological data of the patient represents the disease state of the patient, and the disease state is divided into diseased state and non-diseased state;
step 2: aiming at the captured data, utilizing a gray wolf feature selection method based on a greedy distance strategy to select the physiological data features of the patient;
step 2.1: initializing the current iteration times, the number of the wolf individuals, the population size of the wolf group and the position vector of each wolf individual; the position vector of each wolf individual represents a candidate solution of the feature selection problem;
step 2.2: calculating the coding vector of each wolf according to the position vector, and calculating the adaptive value of each wolf according to the coding vector;
step 2.3: setting the maximum iteration number as maximum, and selecting the first three as alpha, beta and delta according to the size of the adaptive value;
step 2.4: calculating a distance map of each wolf;
step 2.5: updating the coding vectors of alpha, beta and delta according to the distance mapping of each wolf head;
step 2.6: judging whether t is larger than maximum;
if yes, executing the following step 3;
if not, returning to the step 2.4 after t is equal to t + 1;
and step 3: and outputting the feature subset corresponding to the alpha code vector.
The invention remedies the relatively low performance of existing feature selection algorithms: it improves the position-update step of the original grey wolf algorithm with a greedy strategy, strengthens the algorithm's ability to exploit the optimal solution, raises the convergence rate, and can effectively improve classification accuracy while reducing data feature redundancy.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the present invention;
FIG. 2 is a graph comparing the detection error rate of the present invention with three other feature selection algorithms;
FIG. 3 is a comparison graph of feature selection numbers after feature selection in the present invention versus three other feature selection algorithms.
Detailed Description
In order to facilitate the understanding and implementation of the present invention for those of ordinary skill in the art, the present invention is further described in detail with reference to the accompanying drawings and examples, it is to be understood that the embodiments described herein are merely illustrative and explanatory of the present invention and are not restrictive thereof.
The core of the invention is to regard the feature selection problem for medical data containing N features as a discrete combinatorial optimization problem in a binary N-dimensional space: each feature subset can be represented by an N-dimensional binary vector, and the improved grey wolf optimization algorithm searches this N-dimensional binary space.
Referring to fig. 1, the method for selecting physiological data characteristics of a patient based on a greedy-of-distance strategy provided by the invention comprises the following steps:
step 1: inputting data captured from physiological data of a patient, and forming the labeled sample data into a training set; the label marks the diseased condition in the patient's physiological data, divided into diseased and non-diseased, where 0 represents normal and 1-4 represent the degree of vasoconstriction;
in this embodiment, Z pieces of captured medical data containing N features are input, each piece of data in the data set is a sample, the sample capacity is Z, each piece of input data is represented by a feature vector, each dimension of the vector represents one feature of the data, and all samples containing category labels constitute a training sample set T.
Step 2: aiming at the captured data, utilizing a gray wolf feature selection method based on a greedy distance strategy to select the physiological data features of the patient;
step 2.1: initializing the current iteration times, the number of the wolf individuals, the population size of the wolf group and the position vector of each wolf individual; the position vector of each wolf individual represents a candidate solution of the feature selection problem;
(1) initialize the current iteration number t = 1, the wolf individual index i = 1, and the wolf pack population size K;
(2) for each wolf individual i = 1, 2, …, K, randomly initialize its position vector within (0, max); the vector dimension is N, where max represents the maximum value of a wolf individual's position and is taken as 1;
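The initialization in step 2.1 can be sketched as follows (a minimal illustration; the pack size 12 follows the embodiment, the 13-dimensional feature count matches the heart data, and the fixed seed is an assumption for reproducibility):

```python
import random

def init_pack(K, N, max_pos=1.0, seed=42):
    """Randomly initialise the position vector of each of the K wolves
    inside (0, max_pos); each vector has one dimension per feature."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    return [[rng.uniform(0.0, max_pos) for _ in range(N)] for _ in range(K)]

pack = init_pack(K=12, N=13)  # pack size and feature count as in the embodiment
```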
step 2.2: calculating the coding vector of each wolf according to the position vector, and calculating the adaptive value of each wolf according to the coding vector;
(1) Find a mapping function f that maps values in the (0, max) interval into the discrete set {0, 1} and guarantees that there is a number δ in (0, max) such that f(temp1) < f(temp2) for all temp1 ∈ (0, δ) and temp2 ∈ [δ, max), so that the continuous feature vector becomes a binary coding vector containing only 0 and 1.
The following function is selected as the mapping function in this embodiment:
wherein position(i, j) represents the value of the j-th dimension of the position vector of wolf i, and the result is the j-th dimension of wolf i's coding vector; this function converts the grey wolf's position from a continuous value into a binary 0/1 coded value, which can then be used in the feature selection algorithm.
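The patent's concrete mapping formula is not legible in this text; any monotone threshold function with the property stated in step 2.2(1) works. A minimal stand-in with an assumed threshold at max/2:

```python
def encode(position, max_pos=1.0):
    """Map a continuous position vector in (0, max_pos) onto a 0/1 coding
    vector. A simple step at max_pos / 2 satisfies the required property:
    values below the threshold map to 0, values at or above it to 1.
    (Assumed stand-in; the patent's exact formula is not reproduced.)"""
    threshold = max_pos / 2.0
    return [1 if v >= threshold else 0 for v in position]

code = encode([0.1, 0.7, 0.49, 0.51])
```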
(2) In a wolf's coding vector, 1 indicates that the corresponding feature is selected and 0 that it is not. In the training set T, retain the features selected by the coding vector and delete the unselected ones, obtaining a new training set T_solution.
(3) Compute with a classifier the average precision (or classification error rate) on T_solution, and use it as the adaptive value Pi corresponding to the wolf's coding vector. The classifier can be chosen according to the actual situation, e.g. an SVM (support vector machine) or an artificial neural network; this embodiment uses a KNN classifier with K = 5;
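As a sketch of the adaptive-value computation, the snippet below restricts the data to the selected features and scores them with a small hand-rolled k-NN (the embodiment uses a train/test split with K = 5; the leave-one-out variant and all names here are illustrative assumptions):

```python
from collections import Counter

def knn_error_rate(rows, labels, code, k=5):
    """Adaptive value of a wolf: classification error rate of a k-NN
    classifier using only the features whose code bit is 1.
    Leave-one-out evaluation is an illustrative stand-in for the
    embodiment's train/test split."""
    idx = [j for j, bit in enumerate(code) if bit == 1]
    if not idx:
        return 1.0  # empty feature subset: worst possible fitness
    proj = [[row[j] for j in idx] for row in rows]
    errors = 0
    for i, x in enumerate(proj):
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, other)), labels[m])
            for m, other in enumerate(proj) if m != i
        )
        vote = Counter(lab for _, lab in neighbours[:k]).most_common(1)[0][0]
        errors += int(vote != labels[i])
    return errors / len(proj)

# Feature 0 separates the classes cleanly, feature 1 is constant noise.
rows = [[0.0, 5.0], [0.1, 5.0], [0.2, 5.0], [0.3, 5.0], [0.4, 5.0],
        [1.0, 5.0], [1.1, 5.0], [1.2, 5.0], [1.3, 5.0], [1.4, 5.0]]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
err = knn_error_rate(rows, labels, [1, 0])
```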
step 2.3: setting the maximum iteration number as maximum, and selecting the first three as alpha, beta and delta according to the size of the adaptive value;
Set the maximum iteration number maximum, then take the coding vector of the wolf with the best adaptive value Pi as the coding vector of alpha. Whether an adaptive value is good is relative and depends on the meaning of the chosen fitness function; the invention uses the classification error rate as a wolf's adaptive value, so the lower the classification error rate, the better the classification effect and the better the wolf. The initialization of α, β and δ in the present invention is therefore divided into the following three substeps:
(1) select the wolf j with the lowest adaptive value Pi and initialize the coding vector of alpha with the coding vector of wolf j;
(2) after removing j, select the wolf n with the lowest adaptive value Pi among the remaining wolves and initialize the coding vector of beta with the coding vector of wolf n;
(3) after removing n, select the wolf m with the lowest adaptive value Pi among the remaining wolves and initialize the coding vector of delta with the coding vector of wolf m.
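The three substeps above amount to sorting the pack by adaptive value (error rate, lower is better) and taking the three best wolves; a sketch with assumed names:

```python
def pick_leaders(codes, fitness):
    """Return (alpha, beta, delta): the coding vectors of the three wolves
    with the lowest adaptive value (classification error rate)."""
    order = sorted(range(len(fitness)), key=fitness.__getitem__)
    return [codes[i] for i in order[:3]]

codes = [[1, 0], [0, 1], [1, 1], [0, 0]]
fitness = [0.30, 0.10, 0.50, 0.20]
alpha, beta, delta = pick_leaders(codes, fitness)
```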
Step 2.4: calculating a distance map of each wolf;
the step is the core of the invention and is an innovation point, the invention improves the defects of the existing wolf optimization algorithm, improves the capability of the algorithm for developing the optimal solution, improves the convergence speed, and can effectively improve the classification accuracy and reduce the data characteristic redundancy.
In the embodiment, a greedy strategy is utilized to calculate the distance mapping of each wolf head; the specific implementation comprises the following substeps:
step 2.4.1: compute the continuous coded distance vectors according to the selected alpha, beta and delta:

D_α = |C_1 · X_α(t) − X(t)|,  D_β = |C_2 · X_β(t) − X(t)|,  D_δ = |C_3 · X_δ(t) − X(t)|

wherein C_1, C_2 and C_3 are three different random vectors of the parameter C, which is calculated in step 2.4.2; X_α(t), X_β(t) and X_δ(t) are the position vectors of α, β and δ in the t-th iteration;

step 2.4.2: compute the intermediate parameter X(t+1), which represents the final position towards which each wolf moves along α, β and δ in the t-th iteration; it is defined as follows:

X_1 = X_α(t) − A_1 · D_α,  X_2 = X_β(t) − A_2 · D_β,  X_3 = X_δ(t) − A_3 · D_δ
X(t+1) = (X_1 + X_2 + X_3) / 3

wherein A_1, A_2 and A_3 are three different random vectors of the parameter A, with

A = 2a · r_1 − a,  C = 2 · r_2,  a = 2 (1 − t / maximum)

wherein r_1 and r_2 are random vectors with values in the range [0, 1]; a is a parameter variable that controls the exploitation and exploration ability of the algorithm and decreases linearly from 2 to 0 as the iterations increase; t is the current iteration number, and maximum is the total number of algorithm iterations;
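The control parameters a, A and C above follow the standard grey wolf formulation; a sketch of their scalar form (one dimension of the random vectors):

```python
import random

def gwo_coefficients(t, max_iter, rng):
    """Standard GWO control parameters at iteration t:
    a decays linearly from 2 to 0; A = 2*a*r1 - a steers
    exploration (|A| > 1) vs exploitation (|A| < 1); C = 2*r2
    randomly weights the leader's position."""
    a = 2.0 * (1.0 - t / max_iter)
    r1, r2 = rng.random(), rng.random()
    return a, 2.0 * a * r1 - a, 2.0 * r2

rng = random.Random(7)
a0, A0, C0 = gwo_coefficients(0, 6, rng)   # first iteration: a = 2
a_end, _, _ = gwo_coefficients(6, 6, rng)  # last iteration: a = 0
```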
step 2.4.3: map each dimension of the continuous distance vector into the (0, 1) interval; wherein D_n represents the value of the n-th dimension of the distance vector computed in step 2.4.1, and x_n represents the value of the n-th dimension of the mapped vector; b represents the maximum value of the assumed problem search interval, and f_b represents the mapping function obtained for different problem search intervals;
step 2.4.4: compute X_d through the "change" and "hold" operations; wherein x_d is the value of the d-th dimension of the individual's binary coding vector, and X_d is the value of the d-th dimension after the update; D_d is the d-th dimension of the mapped continuous distance vector, and rand is a random number in the [0, 1] interval; the comparison of rand with D_d decides whether the "change" operation (complementing x_d) or the "hold" operation (keeping x_d) is applied, and the resulting value is taken as X_d.
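Step 2.4.4's change/hold rule can be sketched as follows, assuming (as in common binary grey wolf variants) that the mapped distance acts as a per-dimension flip probability compared against a fresh random number; the patent's exact mapping is not reproduced:

```python
import random

def binary_update(code, mapped_dist, rng):
    """Per-dimension update: draw rand in [0, 1); if rand < D_d the bit is
    'changed' (complemented), otherwise it is 'held'. mapped_dist is
    assumed to already lie in [0, 1] after the step-2.4.3 mapping."""
    return [1 - bit if rng.random() < d else bit
            for bit, d in zip(code, mapped_dist)]

rng = random.Random(3)
all_change = binary_update([0, 1, 0], [1.0, 1.0, 1.0], rng)  # always flips
all_hold = binary_update([0, 1, 0], [0.0, 0.0, 0.0], rng)    # never flips
```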
Step 2.5: updating the coding vectors of alpha, beta and delta according to the distance mapping of each wolf head;
Update the coding vectors of alpha, beta and delta: sort the updated wolves by their adaptive values and select the three best, with adaptive values P_α', P_β' and P_δ'; compare these with the original adaptive values P_α, P_β and P_δ of alpha, beta and delta. If a new adaptive value is better than the corresponding original one, update that coding vector with the coding vector of the new wolf; otherwise, do not update.
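The greedy leader update in step 2.5, the patent's key modification, only ever replaces a leader when the candidate's error rate is strictly lower, so alpha, beta and delta can never get worse between iterations. A sketch with assumed names:

```python
def greedy_update(leader_codes, leader_fit, new_codes, new_fit):
    """Compare the three best updated wolves with the current alpha, beta,
    delta (lists ordered best-first); keep each old leader unless the new
    candidate's adaptive value (error rate) is strictly better."""
    for i in range(len(leader_codes)):
        if new_fit[i] < leader_fit[i]:
            leader_codes[i] = new_codes[i]
            leader_fit[i] = new_fit[i]
    return leader_codes, leader_fit

codes, fits = [[1, 0], [0, 1], [1, 1]], [0.20, 0.30, 0.40]
codes, fits = greedy_update(codes, fits,
                            [[0, 0], [1, 1], [1, 0]], [0.10, 0.35, 0.40])
```

Only the first leader improves (0.10 < 0.20); the other two candidates are not strictly better and are discarded.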
Step 2.6: judging whether t is larger than maximum;
if yes, executing the following step 3;
if not, returning to the step 2.4 after t is equal to t + 1;
and step 3: and outputting the feature subset corresponding to the alpha code vector.
The coding vector of alpha is a binary string representing the optimal feature subset, where 1 means the feature is selected and 0 means it is not; output the features corresponding to the dimensions whose value is 1.
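Step 3 simply reads off the dimensions of alpha's coding vector that equal 1; with the heart-data feature names from the embodiment:

```python
# The 13 input features of the UCI heart data described in the embodiment.
FEATURES = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
            "thalach", "exang", "oldpeak", "slop", "ca", "thal"]

def decode(alpha_code):
    """Output the feature subset selected by alpha's binary coding vector."""
    return [name for name, bit in zip(FEATURES, alpha_code) if bit == 1]

subset = decode([1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0])
```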
The effects of the present invention will be further described below by comparative experiments.
(1) Simulation conditions;
the data set used in the experiment was a set of cardiac data in the uci database, which was divided equally into two parts, one as the training set and the other as the test set. In the experiment, the language used by each method is realized by matlab.
(2) Experimental content and results;
the method comprises the steps of utilizing a group of heart disease data in an uci database as a data set, utilizing a KNN algorithm in matlab as a classifier to detect, then optimizing a post-algorithm GWO, a Genetic Algorithm (GA) and a particle swarm algorithm (PSO) as algorithms of a feature selection part, utilizing KNN as a sample classifier, utilizing a sample classification error rate and a final feature selection number as comparison indexes, and comparing average performance indexes of four different feature selection algorithms under different running times.
The data set used in the experiment is a heart disease data set provided by the UCI database, with 303 records in total, each recording the physiological indicators of a heart patient. Each record consists of 13 features and one label (status). The population size of the wolf pack is set to 12, the maximum iteration number maximum of the algorithm is set to 6, and KNN is selected as the classifier with K = 5.
The 14 specific data fields are: age represents the patient's age; sex the patient's gender, where 0 denotes female and 1 male; cp the patient's chest pain type, divided into four types 1, 2, 3 and 4; trestbps the resting blood pressure; chol the cholesterol value; fbs the fasting blood glucose level; restecg the electrocardiogram result, where 0 means normal, 1 mild and 2 severe; thalach the maximum heart rate; exang whether the patient has exercise angina, where 0 indicates present and 1 absent; oldpeak the ST-segment depression induced by exercise; slop the slope of the exercise ST segment; ca the number of vessels visible under fluoroscopy; thal the patient's defect type, namely 3, 6 or 7; and status the disease status, where 0 indicates normal and 1-4 the degree of vasoconstriction.
The experiments compare the performance of the four feature selection algorithms under different numbers of algorithm runs, increased from 20 to 200. The abscissa of fig. 2 and fig. 3 is the experiment index: 1 denotes the first experiment with 20 runs, and 10 the tenth with 200 runs. Errorb and countb denote the error rate and number of selected features with the original grey wolf algorithm as the feature selection stage, and Errore and counte those with the improved grey wolf algorithm. As fig. 2 shows, except for the 2nd experiment (40 runs), the classification accuracy of the improved algorithm is better than that of all the other algorithms; the average error rate is below 1.85%, a clear improvement, with small fluctuation and stable behavior. As fig. 3 shows, the average number of selected features with the improved algorithm is below 3.85 in all ten experiments, lower than both PSO and GA; compared with the original grey wolf algorithm, the number of selected features is greatly reduced and the fluctuation is stable.
In conclusion, the experiments show that under the same conditions the proposed algorithm achieves a better feature selection effect. In the longitudinal comparison, its detection error rate after feature selection is better than that of the original BGWO, and it is better than the improved EBGWO in the number of selected features, thus combining the advantages of both. In the transverse comparison it is superior to PSO and GA in both the number of selected features and the detection error rate; the algorithm converges fast and achieves good results with few iterations.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (4)
1. A patient physiological data feature selection method based on a greedy-of-distance strategy, characterized by comprising the following steps:
step 1: inputting data captured from physiological data of a patient, and forming sample data containing labels into a training set; wherein, the label marks that the physiological data of the patient represents the disease state of the patient, and the disease state is divided into diseased state and non-diseased state;
step 2: aiming at the captured data, utilizing a gray wolf feature selection method based on a greedy distance strategy to select the physiological data features of the patient;
step 2.1: initializing the current iteration times, the number of the wolf individuals, the population size of the wolf group and the position vector of each wolf individual; the position vector of each wolf individual represents a candidate solution of the feature selection problem;
wherein, the initial iteration number t = 1 and the population size of the wolf pack is K; for each wolf individual i = 1, 2, …, K, the position vector of each wolf in the pack is randomly initialized within (0, max); the vector dimension is N, where max represents the maximum value of a wolf individual's position;
step 2.2: calculating the coding vector of each wolf according to the position vector, and calculating the adaptive value of each wolf according to the coding vector;
step 2.3: setting the maximum iteration number as maximum, and selecting the first three as alpha, beta and delta according to the size of the adaptive value;
step 2.4: calculating a distance map of each wolf;
wherein, the distance mapping of each wolf is calculated by utilizing a greedy strategy; the specific implementation comprises the following substeps:
step 2.4.1: compute the continuous coded distance vectors according to the selected alpha, beta and delta:

D_α = |C_1 · X_α(t) − X(t)|,  D_β = |C_2 · X_β(t) − X(t)|,  D_δ = |C_3 · X_δ(t) − X(t)|

wherein C_1, C_2 and C_3 are three different random vectors of the parameter C, which is calculated in step 2.4.2; X_α(t), X_β(t) and X_δ(t) are the position vectors of α, β and δ in the t-th iteration;

step 2.4.2: compute the intermediate parameter X(t+1), representing the final position towards which each wolf moves along α, β and δ in the t-th iteration, defined as follows:

X_1 = X_α(t) − A_1 · D_α,  X_2 = X_β(t) − A_2 · D_β,  X_3 = X_δ(t) − A_3 · D_δ
X(t+1) = (X_1 + X_2 + X_3) / 3

wherein A_1, A_2 and A_3 are three different random vectors of the parameter A, with

A = 2a · r_1 − a,  C = 2 · r_2,  a = 2 (1 − t / maximum)

wherein r_1 and r_2 are random vectors with values in the range [0, 1]; a is a parameter variable that controls the exploitation and exploration ability of the algorithm and decreases linearly from 2 to 0 as the iterations increase; t is the current iteration number, and maximum is the total number of algorithm iterations;
step 2.4.3: map each dimension of the continuous distance vector into the (0, 1) interval; wherein D_n represents the value of the n-th dimension of the distance vector computed in step 2.4.1, and x_n represents the value of the n-th dimension of the mapped vector; b represents the maximum value of the assumed problem search interval, and f_b represents the mapping function obtained for different problem search intervals;
step 2.4.4: compute X_d through the "change" and "hold" operations; wherein x_d is the value of the d-th dimension of the individual's binary coding vector, and X_d is the value of the d-th dimension after the update; D_d is the d-th dimension of the mapped continuous distance vector, and rand is a random number in the [0, 1] interval; the comparison of rand with D_d decides whether the "change" operation (complementing x_d) or the "hold" operation (keeping x_d) is applied, and the resulting value is taken as X_d;
step 2.5: updating the coding vectors of alpha, beta and delta according to the distance mapping of each wolf head;
wherein, updating the coding vectors of alpha, beta and delta comprises sorting the updated wolves by their adaptive values and selecting the three best, with adaptive values P_α', P_β' and P_δ'; these are compared with the original adaptive values P_α, P_β and P_δ of alpha, beta and delta, and if a new adaptive value is better than the corresponding original one, that coding vector is updated with the coding vector of the new wolf; otherwise, no update is performed;
step 2.6: judging whether t is larger than maximum;
if yes, executing the following step 3;
if not, returning to the step 2.4 after t is equal to t + 1;
and step 3: and outputting the feature subset corresponding to the alpha code vector.
2. The greedy-of-distance-strategy-based patient physiological data feature selection method as recited in claim 1, wherein: in step 1, for the data captured and labeled in known manner, each piece of data is represented by a feature vector, and each dimension of the vector represents a feature of the data.
3. The greedy-of-distance-strategy-based patient physiological data feature selection method as recited in claim 1, wherein: in step 2.2, a mapping function f is found that maps values in the (0, max) interval into the discrete set {0, 1} and ensures that there is a number δ in (0, max) such that f(temp1) < f(temp2) for all temp1 ∈ (0, δ) and temp2 ∈ [δ, max), so that the continuous feature vector becomes a binary coding vector containing only 0 and 1.
4. The greedy-of-distance-strategy-based patient physiological data feature selection method as recited in claim 1, wherein: in step 2.2, the adaptive value of each wolf is calculated according to its binary coding vector; in the coding vector, 1 indicates that the feature is selected and 0 that it is not; the training set T retains the features selected by the coding vector, giving the training set T_solution under the selected features; the average precision or classification error rate Pi after classifying T_solution is computed with a classifier, and this precision is used as the adaptive value corresponding to the wolf's coding vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811313953.8A CN109545372B (en) | 2018-11-06 | 2018-11-06 | Patient physiological data feature selection method based on greedy-of-distance strategy |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811313953.8A CN109545372B (en) | 2018-11-06 | 2018-11-06 | Patient physiological data feature selection method based on greedy-of-distance strategy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109545372A CN109545372A (en) | 2019-03-29 |
CN109545372B true CN109545372B (en) | 2021-07-06 |
Family
ID=65846544
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811313953.8A Active CN109545372B (en) | 2018-11-06 | 2018-11-06 | Patient physiological data feature selection method based on greedy-of-distance strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109545372B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111382366B (en) * | 2020-03-03 | 2022-11-25 | 重庆邮电大学 | Social network user identification method and device based on language and non-language features |
CN112002419B (en) * | 2020-09-17 | 2023-09-26 | 吾征智能技术(北京)有限公司 | Disease auxiliary diagnosis system, equipment and storage medium based on clustering |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832830A (en) * | 2017-11-17 | 2018-03-23 | 湖北工业大学 | Intruding detection system feature selection approach based on modified grey wolf optimized algorithm |
- 2018-11-06: CN application CN201811313953.8A filed (patent CN109545372B, status Active)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |
GR01 | Patent grant | | |