CN107845407A - Human physiological feature selection algorithm combining filter-based selection and improved clustering - Google Patents
- Publication number
- CN107845407A (application number CN201710733507.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
Abstract
The invention discloses a human physiological feature selection algorithm that combines filter-based selection with improved clustering, comprising: S1: select an impedance model, collect first feature parameter and second feature parameter data, and construct an initial feature set and a final optimal subset; S2: introduce a filter algorithm and compute the HSIC value of each feature in the collected data; S3: sort the feature set in descending order of HSIC value; S4: add the top K ranked features to the feature set, so that the Filter algorithm filters out parameters irrelevant to body composition, and construct the initial data set; S5: construct a sparse feature graph from the data set using the clustering algorithm; S6: screen out redundant features within clusters using the improved clustering algorithm. The feature selection algorithm established by this application can improve the accuracy of body composition prediction and provides a more effective means of detection for body composition research and clinical application.
Description
Technical field
The invention belongs to the field of bioinformatics, and in particular relates to a human physiological feature selection algorithm that combines filter-based feature selection with improved clustering.
Background
The balance of human body composition plays an important role in keeping the body's internal environment stable and is an important factor affecting health. When disease occurs, changes in body composition often precede the clinical symptoms. Changes in body composition can therefore be used to predict the risk of diseases such as hypertension, dyslipidemia and metabolic syndrome. However, many parameters influence body composition, and these parameters are highly nonlinear, redundant and partly irrelevant.
Existing Wrapper algorithms remove redundant features and generally achieve good performance, but their high algorithmic complexity makes them unsuitable for large-scale data sets. Filter algorithms assign each feature a weight according to a criterion; they are computationally efficient, but they do not fully account for redundancy between features, so the selected feature subset is likely to contain substantial redundancy. Clustering methods divide the body composition parameter data into multiple groups, or clusters, so that objects within a cluster are highly similar; judging features by the distance between each cluster and its center point effectively removes redundant features, but cannot effectively screen out irrelevant features. For this reason, a new dimensionality reduction method is needed before high-dimensional body composition data are analyzed.
Summary of the invention
In view of the shortcomings of the prior art, the invention proposes a human physiological feature selection algorithm that combines filter-based selection with improved clustering. A Filter feature selection algorithm first removes features that are irrelevant to the body composition category, and M-Chameleon feature clustering then removes redundant features, so that the advantages of both the Filter feature selection algorithm and feature clustering are fully exploited. A body composition prediction model built in this way can improve prediction accuracy and provides a more effective means of detection for body composition research and clinical application.
To achieve the above object, the invention provides a human physiological feature selection algorithm combining filter-based selection and improved clustering, comprising:
S1: Select an impedance model, collect first feature parameter and second feature parameter data, construct an initial feature set and a final optimal subset, and initialize both to the empty set;
Further, the body composition data measured by a body composition analyzer (INBODY) are used as the data set, denoted T = (O, F, C), where O is the set of data samples, F is the set of candidate features and C is the body composition category. Parameters that have a major influence on body composition, such as weight, height, age, sex and the impedance of each body segment, are used as first feature parameters; the reciprocal $1/R_i$, the square $R_i^2$ and the pairwise product $R_iR_j$ of the segment impedances are used as second feature parameters. The impedance values are the 1 kHz impedance parameters of the INBODY. The first feature parameters (R1, R2, R3, R4, R5, A, H, W, where A is age, H is height and W is weight) and the second feature parameters (1/R1, 1/R2, 1/R3, 1/R4, 1/R5, R1R2, R1R3, R1R4, R1R5, R2R3, R2R4, R2R5, R3R4, R3R5, R4R5, etc.) together form the original feature parameter set, denoted $F = \{f_1, f_2, \ldots, f_m\}$. The body composition category set C includes body fat mass (BFM) and total body water (TBW).
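The construction of the original feature set in S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_feature_set`, the dictionary keys and the sample impedance values are all assumptions made for the example.

```python
from itertools import combinations

def build_feature_set(r, age, height, weight):
    """Build first and second feature parameters for one subject.

    r is a list of the five segment impedances [R1, ..., R5].
    Key names such as "R1^2" are illustrative labels only.
    """
    # First feature parameters: raw impedances plus age, height, weight.
    features = {f"R{i}": v for i, v in enumerate(r, start=1)}
    features.update({"A": age, "H": height, "W": weight})
    # Second feature parameters: reciprocals and squares of each impedance.
    for i, v in enumerate(r, start=1):
        features[f"1/R{i}"] = 1.0 / v
        features[f"R{i}^2"] = v * v
    # Second feature parameters: pairwise products R_i * R_j with i < j.
    for (i, ri), (j, rj) in combinations(enumerate(r, start=1), 2):
        features[f"R{i}R{j}"] = ri * rj
    return features

# Hypothetical impedance values, used only to exercise the function.
example = build_feature_set([300.0, 310.0, 25.0, 250.0, 255.0], 30, 170.0, 65.0)
```

With five impedances this yields 8 first feature parameters and 20 second feature parameters (5 reciprocals, 5 squares, 10 products).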
S2: Introduce a filter algorithm (Filter): for each feature in the collected data, compute its HSIC value with respect to the body composition category C; this value characterizes the strength of the correlation between the physiological feature and the body composition category;
Further, for each feature $f_i \in F = \{f_1, f_2, \ldots, f_m\}$, define a nonlinear feature mapping $\phi: F \to \mathcal{H}$ that maps the feature points $f_1, f_2, \ldots, f_m$ into a reproducing kernel Hilbert space $\mathcal{H}$, with kernel function

$$k(f_i, f_j) = \langle \phi(f_i), \phi(f_j) \rangle_{\mathcal{H}},$$

where $\langle \cdot, \cdot \rangle_{\mathcal{H}}$ is the inner product on $\mathcal{H}$. Similarly, define a body composition category mapping $\psi: C \to \mathcal{G}$ that maps the body composition index space C into a reproducing kernel Hilbert space $\mathcal{G}$, with kernel function

$$l(c_i, c_j) = \langle \psi(c_i), \psi(c_j) \rangle_{\mathcal{G}}.$$

The cross-covariance operator between a feature and the body composition category is then defined as

$$C_{fc} = \mathbb{E}_{fc}\big[(\phi(f) - \mu_f) \otimes (\psi(c) - \mu_c)\big], \qquad \mu_f = \mathbb{E}_f[\phi(f)], \quad \mu_c = \mathbb{E}_c[\psi(c)],$$

where $\otimes$ denotes the tensor product and $\mathbb{E}_{fc}$, $\mathbb{E}_f$ and $\mathbb{E}_c$ denote expectations. For each feature $f_i \in F$, the HSIC value with respect to the body composition category c is computed. (HSIC is a kernel-based independence measure: a cross-covariance operator is defined on the reproducing kernel Hilbert spaces, and an independence criterion is obtained from an empirical estimate of the operator norm. It can be used to measure the similarity between two data distributions and is widely used in feature selection and dimensionality reduction.) This value characterizes the strength of the correlation between the physiological feature and the body composition category:

$$\mathrm{HSIC}(f, c) = \| C_{fc} \|_{HS}^2 .$$

For a feature f and body composition category c, a larger HSIC value indicates that c depends more strongly on f.
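In practice HSIC is estimated from samples. A widely used empirical estimator for n paired samples is $\operatorname{tr}(KHLH)/(n-1)^2$, where K and L are the kernel matrices of the feature and the category and $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^{\top}$ is the centering matrix. The text does not state which kernel or estimator is used, so the Gaussian kernel, the median-heuristic bandwidth and the function names in the sketch below are assumptions.

```python
import numpy as np

def gaussian_kernel(x, sigma=None):
    """Gaussian kernel matrix for samples x (assumed kernel choice)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        # Median heuristic for the bandwidth (an assumption, not from the text).
        med = np.median(sq[sq > 0]) if np.any(sq > 0) else 1.0
        sigma = np.sqrt(0.5 * med)
    return np.exp(-sq / (2.0 * sigma ** 2))

def hsic(f, c):
    """Biased empirical HSIC estimate between a feature f and a target c."""
    n = len(f)
    K = gaussian_kernel(f)
    L = gaussian_kernel(c)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

# Synthetic data only: a dependent pair should score higher than an independent one.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
dependent = 2.0 * x + 1.0
independent = rng.normal(size=200)
score_dep = hsic(x, dependent)
score_ind = hsic(x, independent)
```

The estimator costs O(n²) memory for the kernel matrices, which is unproblematic for sample sizes on the order of the 100 subjects used in the embodiment.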
S3: Sort the feature set in descending order of HSIC value;
S4: Add the top K ranked features to the feature set, so that the Filter algorithm filters out parameters irrelevant to body composition, and construct the initial data set;
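Steps S3 and S4 amount to a sort followed by a truncation. A minimal sketch, assuming the HSIC values are already available in a dictionary; the function name `top_k_features` and the scores shown are hypothetical.

```python
def top_k_features(hsic_scores, k):
    """Return the names of the k features with the largest HSIC values."""
    ranked = sorted(hsic_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical scores for illustration only.
scores = {"W": 0.41, "H": 0.12, "A": 0.08, "R5": 0.33, "1/R5": 0.27}
selected = top_k_features(scores, 3)
```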
S5: Construct a sparse feature graph from the data set using the clustering algorithm (M-Chameleon), where RI measures the interconnecting edges between features and RC measures the closeness between features, and initialize the desired number of clusters k;
Further, Chameleon uses agglomerative hierarchical clustering and constructs the sparse feature graph by the k-nearest-neighbor graph method. Each vertex in the graph represents a data object, an edge exists between two vertices when one is among the k nearest neighbors of the other, and the edge weight reflects the similarity of the objects; the principle of the algorithm is shown in Fig. 1. The similarity of feature sub-clusters is assessed on two points: 1) the interconnectivity of the objects in the clusters; 2) the closeness of the clusters. If two feature clusters are highly interconnected and close together, they are merged and replaced by the merged cluster. The similarity between two feature clusters is determined by their relative interconnectivity RI and relative closeness RC. Given the normalized, Filter-filtered feature data set $F = \{f_1, f_2, \ldots, f_m\}$, the cluster F is bisected into sub-clusters $f_1$ and $f_2$ so that the total weight of the cut edges is minimal; the larger this cut weight, the greater the interconnectivity between $f_1$ and $f_2$. The relative interconnectivity $RI(f_1, f_2)$ of two feature clusters is defined as the absolute interconnectivity between $f_1$ and $f_2$, normalized with respect to the internal interconnectivity of the two clusters, i.e.:

$$RI(f_1, f_2) = \frac{\left| EC_{\{f_1, f_2\}} \right|}{\frac{1}{2}\left( \left| EC_{f_1} \right| + \left| EC_{f_2} \right| \right)}$$

where $EC_{\{f_1, f_2\}}$ is the edge cut of the cluster containing $f_1$ and $f_2$, and $EC_{f_1}$ (or $EC_{f_2}$) is the minimum sum of the cut edges when $f_1$ (or $f_2$) is bisected into two roughly equal parts.
The relative closeness $RC(f_1, f_2)$ of two feature clusters is defined as the absolute closeness between $f_1$ and $f_2$, normalized with respect to the internal closeness of the two clusters, i.e.:

$$RC(f_1, f_2) = \frac{\bar{S}_{EC_{\{f_1, f_2\}}}}{\frac{|f_1|}{|f_1| + |f_2|}\,\bar{S}_{EC_{f_1}} + \frac{|f_2|}{|f_1| + |f_2|}\,\bar{S}_{EC_{f_2}}}$$

where $\bar{S}_{EC_{\{f_1, f_2\}}}$ is the average weight of the edges connecting vertices of $f_1$ to vertices of $f_2$, and $\bar{S}_{EC_{f_1}}$ (or $\bar{S}_{EC_{f_2}}$) is the average weight of the edges cut by the minimum bisection of $f_1$ (or $f_2$). The similarity between the two sub-clusters is determined from the relative interconnectivity and relative closeness of $f_1$ and $f_2$.
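The graph construction and the two ratios can be sketched as below. Several things here are assumptions: the text does not say which similarity measure weights the edges, so the input is an arbitrary symmetric similarity matrix; and the minimum bisection of a cluster (done with a graph partitioner in the original Chameleon algorithm) is not implemented, so the internal edge-cut quantities are passed in as plain numbers. All function names are illustrative.

```python
import numpy as np

def knn_graph(sim, k):
    """Sparse symmetric weight matrix keeping each vertex's k most similar neighbors."""
    n = sim.shape[0]
    w = np.zeros_like(sim, dtype=float)
    for i in range(n):
        # Most similar vertices first, excluding the vertex itself.
        order = [j for j in np.argsort(-sim[i]) if j != i][:k]
        w[i, order] = sim[i, order]
    # Keep an edge if either endpoint selected it.
    return np.maximum(w, w.T)

def cross_edges(w, a, b):
    """Weights of the edges connecting vertex lists a and b."""
    block = w[np.ix_(a, b)]
    return block[block > 0]

def relative_interconnectivity(ec_ab, ec_a, ec_b):
    """RI: cross edge-cut weight over the mean internal bisection weight."""
    return ec_ab / (0.5 * (ec_a + ec_b))

def relative_closeness(s_ab, s_a, s_b, size_a, size_b):
    """RC: mean cross-edge weight over the size-weighted internal mean weights."""
    total = size_a + size_b
    return s_ab / (size_a / total * s_a + size_b / total * s_b)

# Hypothetical 4-feature similarity matrix, for illustration only.
sim = np.array([[1.0, 0.9, 0.1, 0.2],
                [0.9, 1.0, 0.3, 0.1],
                [0.1, 0.3, 1.0, 0.8],
                [0.2, 0.1, 0.8, 1.0]])
w = knn_graph(sim, k=1)
```

In this small example each feature keeps only its single nearest neighbor, so features 0 and 1 form one connected pair, features 2 and 3 another, and no edge crosses between the pairs.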
S6: Screen out redundant features within clusters using the improved clustering algorithm;
S61: Compute the distances between clusters and sort them, then check whether the number of sub-clusters h equals the initialized desired number of clusters k;
S62: If they are not equal, merge the two sub-clusters with the largest similarity function value; if they are equal, stop;
S63: Recompute the relative closeness RC of the new sub-cluster and traverse all sub-clusters, checking whether every pair of sub-clusters has been tried for merging;
S64: If all sub-clusters have been tried for merging, return to S61; otherwise merge the two sub-clusters with the smallest similarity function value and return to S63;
S65: Select and combine the features with the largest HSIC values.
S7: From each feature cluster, select the feature with the largest HSIC value and combine these features into the optimal feature set.
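The merge-until-k loop and the final per-cluster selection can be sketched as follows. This is a simplified reading of S61 to S7, not the improved algorithm itself: it only implements "merge the most similar pair until k clusters remain, then keep the feature with the largest HSIC value in each cluster". The similarity function is supplied by the caller, and all names and numbers are hypothetical.

```python
def merge_until_k(clusters, similarity, k):
    """Greedily merge the most similar pair of clusters until k remain."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > k:
        # Score every unordered pair of current clusters.
        pairs = [(similarity(a, b), i, j)
                 for i, a in enumerate(clusters)
                 for j, b in enumerate(clusters) if i < j]
        _, i, j = max(pairs)
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters

def select_representatives(clusters, hsic_scores):
    """Pick the feature with the largest HSIC value from each cluster (S7)."""
    return [max(cluster, key=lambda name: hsic_scores[name]) for cluster in clusters]

# Hypothetical pairwise similarities and HSIC scores.
table = {frozenset(("a", "b")): 0.9,
         frozenset(("a", "c")): 0.1,
         frozenset(("b", "c")): 0.2}
sim = lambda x, y: table[frozenset((x[0], y[0]))]
final_clusters = merge_until_k([["a"], ["b"], ["c"]], sim, k=2)
chosen = select_representatives(final_clusters, {"a": 0.3, "b": 0.7, "c": 0.5})
```

A fuller implementation would plug a similarity built from RI and RC into `similarity` and recompute it after every merge.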
By adopting the above technical solution, the invention achieves the following technical effects. Based on the characteristics of human physiological feature parameters, a human feature parameter selection algorithm combining Filter and clustering is proposed. A feature filtering method based on the Hilbert-Schmidt independence criterion removes features that are irrelevant to the category; improved Chameleon clustering is applied to feature selection and optimized, which removes redundant features well. The algorithm effectively selects the optimal feature parameter set for constructing the body composition model, addresses the problem of numerous and redundant physiological feature parameters, and provides a more effective means of detection for body composition research and clinical application.
Brief description of the drawings
Fig. 1 is a schematic diagram of the Chameleon clustering algorithm;
Fig. 2 is a schematic diagram of the improved Chameleon algorithm;
Fig. 3 shows the human feature parameter selection process;
Fig. 4 shows the correlation between the feature parameters obtained by the filter algorithm and BFM in the 1 kHz band;
Fig. 5 shows the correlation between the feature parameters obtained by the filter algorithm and BFM in the 250 kHz band;
Fig. 6 shows the correlation between the feature parameters obtained by the filter algorithm and BFM in the 500 kHz band;
Fig. 7 shows the analysis of the number of parameter clusters after the filter algorithm;
Fig. 8 shows the distances between the feature parameters and the BFM index when different sample sizes are clustered into four classes;
Fig. 9 compares the BFM model predictions with the actual values;
Fig. 10 compares the relative errors of the BFM model predictions.
Detailed description
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in detail below with reference to the drawings and specific embodiments.
The body composition data measured by the INBODY are used as the data set, denoted T = (O, F, C). Parameters that have a major influence on body composition, such as weight, height, age, sex and the impedance of each body segment, are used as first feature parameters; the reciprocal $1/R_i$, the square $R_i^2$ and the pairwise product $R_iR_j$ of the segment impedances are used as second feature parameters. The INBODY measures at three frequencies, 1 kHz, 250 kHz and 500 kHz, and the relationship between body composition and the feature parameters is studied here for each of these three bands and for different sample sizes. The first feature parameters (R1, R2, R3, R4, R5, A, H, W) and the second feature parameters 1/R1, 1/R2, 1/R3, 1/R4, 1/R5, R1R2, R1R3, R1R4, R1R5, R2R3, R2R4, R2R5, R3R4, R3R5, R4R5 are selected as the original feature parameter set, denoted $F = \{f_1, f_2, \ldots, f_m\}$. The body composition category set C includes body fat mass (BFM) and total body water (TBW). The following table lists part of the sample data set.
However, many parameters influence body composition, and they are highly nonlinear, redundant and partly irrelevant. A dimensionality reduction method is therefore needed to address the redundancy and irrelevance of these feature parameters. Clustering methods divide the body composition parameter data into multiple groups, or clusters, so that objects within a cluster are highly similar; judging features by the distance between each cluster and its center point effectively removes redundant features. In addition, before high-dimensional body composition data are analyzed, the number of features should be reduced by removing attributes that are irrelevant to the target.
The filter algorithm is therefore applied to the data first. Given the original feature set $F = \{f_1, f_2, \ldots, f_m\}$, the data sample set $O = \{o_1, o_2, \ldots, o_n\}$ and the body composition indices BFM and TBW, the filter algorithm is run on the first 100 subjects at the three frequencies 1 kHz, 250 kHz and 500 kHz. The figures below show the correlation of the feature parameters obtained after running the algorithm for body composition BFM.
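For each feature, a nonlinear feature mapping $\phi: F \to \mathcal{H}$ into a reproducing kernel Hilbert space $\mathcal{H}$ is defined, with kernel function $k(f_i, f_j) = \langle \phi(f_i), \phi(f_j) \rangle_{\mathcal{H}}$, where $\langle \cdot, \cdot \rangle_{\mathcal{H}}$ is the inner product on $\mathcal{H}$. Similarly, define a body composition category mapping $\psi: C \to \mathcal{G}$ that maps the body composition index space C into a reproducing kernel Hilbert space $\mathcal{G}$, with corresponding kernel function:

$$l(c_i, c_j) = \langle \psi(c_i), \psi(c_j) \rangle_{\mathcal{G}}$$

The kernel function computes the inner product of two feature points after projection into feature space without explicitly computing the specific mapping $\phi$, so the computational cost implied by the dimension of the feature space need not be paid. The cross-covariance operator between a feature and the body composition category can therefore be defined as:

$$C_{fc} = \mathbb{E}_{fc}\big[(\phi(f) - \mu_f) \otimes (\psi(c) - \mu_c)\big], \qquad \mu_f = \mathbb{E}_f[\phi(f)], \quad \mu_c = \mathbb{E}_c[\psi(c)]$$

In the above formula $\otimes$ denotes the tensor product, and $\mathbb{E}_{fc}$, $\mathbb{E}_f$ and $\mathbb{E}_c$ denote expectations [16]. The squared Hilbert-Schmidt norm of this covariance operator is called HSIC; its expression is [14]:

$$\mathrm{HSIC}(f, c) = \| C_{fc} \|_{HS}^2$$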
Running the filter algorithm for body composition BFM at the different impedance bands gives the correlations shown in Fig. 4, Fig. 5 and Fig. 6. These three figures show that as the impedance frequency increases, the impedance values decrease and the BFM information contained in each feature parameter gradually diminishes. Feature parameters are screened using an 80% confidence interval; the features selected after running the filter algorithm at the different bands are summarized in Table 2 below:
Table 2: Features after running the filter algorithm at different frequency bands
As Table 2 shows, the algorithm greatly reduces the number of original features, and the features are most concentrated in the 250 kHz band. The features filtered at the intermediate impedance band, 250 kHz, are therefore chosen for cluster analysis to remove redundancy.
Before clustering, the number of clusters must be determined. The amount of information contained in the screened feature parameters is obtained for different numbers of clusters, as shown in Fig. 7; the analysis shows that dividing the feature parameters, together with the body composition, into 4 classes best represents the selected feature information. For sample sizes of 20, 40, 60, 80 and 100 subjects, Fig. 8 shows that the clustering changes little: $1/R_4$ and $1/R_5$ form one class; A, H, W, $R_5$ and $R_4$ form one class; $R_4R_5$, $R_1R_2$, $R_2^2$, $R_1^2$ and $R_5^2$ form one class; $R_2R_3$ and $R_1R_3$ form one class. After the feature parameters obtained from the Filter algorithm are clustered into 4 classes, the features that lie far from the cluster center BFM, namely $1/R_4$, $R_4$, $R_1^2$ and $R_1R_3$, can be removed. Table 3 lists the feature parameters selected after the Filter and clustering algorithms.
Table 3: Feature parameters after the Filter and clustering algorithms
Table 4 lists the candidate feature sets for body composition BFM prediction obtained by three feature selection methods, together with their time complexity;
Table 4: Comparison of optimal feature sets and complexity
As Table 4 shows, for data sets of the same dimension, the candidate feature set obtained by the algorithm of the invention is smaller, and its time complexity lower, than those of the combined Filter and Wrapper algorithm and the mRMR feature selection algorithm;
To verify the performance of the proposed feature selection algorithm, feature selection for body composition (BFM) is performed with mRMR, with the combined Filter and Wrapper feature selection algorithm, and with the proposed algorithm. To measure accurately the quality of these candidate feature sets for body composition BFM, the first 80 samples of the sample set are used as the training set, denoted $T_1 = \{(x_1, y_1), (x_2, y_2), \ldots, (x_{80}, y_{80})\}$, and the last 20 as the test set, denoted $T_2 = \{(x_{81}, y_{81}), (x_{82}, y_{82}), \ldots, (x_{100}, y_{100})\}$, where $x_i \in \mathbb{R}^l$ is the input feature parameter vector (the independent variable) and $y_i \in \mathbb{R}$ is the actual body composition value (the dependent variable). Multiple linear regression in the SPSS software is used to train on $T_1$. Table 5 summarizes the models obtained by regression modeling of BFM with the above feature sets:
Table 5: Model summary
a. Predictors: (constant), W, S, A, $R_3$, $1/R_2$, $1/R_1$, $1/R_3$, $R_4^2$, $R_4R_5$, $R_5^2$
b. Predictors: (constant), $1/R_3$, W, S, $R_2^2$, $R_4^2$, $R_4R_5$, $R_5^2$, $1/R_1$, $R_5$
c. Predictors: (constant), A, H, W, $R_5$, $R_1R_2$, $R_2R_3$, $R_4R_5$, $1/R_5$, $R_2^2$, $R_5^2$
As Table 5 shows, the correlations between the physiological feature sets and BFM in models 1, 2 and 3 are 0.927, 0.906 and 0.978 respectively; the feature set obtained by the proposed algorithm therefore correlates most strongly with body composition;
From the regression coefficients obtained for each model, the prediction equations are:
$$BFM_1 = 0.041W + 0.126S + 0.523A - 0.212R_3 + \frac{0.171}{R_1} + \frac{0.126}{R_2} + \frac{0.179}{R_3} + 0.132R_4^2 + 0.13R_4R_5 + 0.127R_5^2 - 8.56 \tag{1}$$

$$BFM_2 = 0.313W - 0.044S - \frac{0.125}{R_3} + \frac{0.108}{R_1} + 0.016R_4^2 - 0.01R_2^2 + 0.071R_5^2 + 0.072R_4R_5 - 0.526R_5 + 5.674 \tag{2}$$

$$BFM_3 = -0.464A - 0.15H + 0.122W - 0.143R_5 + 0.129R_1R_2 + 0.122R_2R_3 - 0.134R_4R_5 + \frac{0.145}{R_5} + 0.129R_2^2 - 0.141R_5^2 \tag{3}$$
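The fitting procedure behind such equations, training a multiple linear regression on the first 80 samples and predicting the last 20, can be sketched with ordinary least squares. The text uses SPSS; NumPy's `lstsq` is swapped in here purely for illustration, and the data are synthetic and noise-free, not the patent's measurements. The function names are assumptions.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bl]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Evaluate the fitted linear model on new feature rows."""
    return coef[0] + X @ coef[1:]

# Synthetic stand-in for 100 subjects with 3 selected features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2.0 + X @ np.array([0.5, -1.0, 0.25])

coef = fit_linear(X[:80], y[:80])   # training set T1: first 80 samples
pred = predict(coef, X[80:])        # test set T2: last 20 samples
```

Because the synthetic target is an exact linear function of the features, the fitted coefficients recover the generating ones; real measurements would leave a residual.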
The resulting prediction models are applied to the test set $T_2$ and compared with the actual values, giving the comparison of BFM model predictions and actual values in Fig. 9 and the error analysis in Fig. 10. As Fig. 10 shows, the prediction model built from the features obtained by the proposed feature selection algorithm is more accurate, with a relative prediction error below 0.12. The results show that the feature set obtained by the human physiological feature selection algorithm combining Filter and clustering correlates well with body composition, can improve the fitting accuracy of the body composition prediction model and reduces the prediction error.
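The relative prediction error referred to above can be computed per test sample as |predicted − actual| / |actual|. A minimal sketch; the function name and the numbers are hypothetical and are not taken from the patent's data.

```python
import numpy as np

def relative_error(predicted, actual):
    """Element-wise relative error |predicted - actual| / |actual|."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.abs(predicted - actual) / np.abs(actual)

# Hypothetical BFM values in kilograms, for illustration only.
errors = relative_error([11.0, 18.0], [10.0, 20.0])
within_bound = bool(np.all(errors < 0.12))
```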
Compared with the prior art, the invention provides a human physiological feature selection algorithm combining Filter and clustering. A feature filtering method based on the Hilbert-Schmidt independence criterion removes features that are irrelevant to the category; improved Chameleon clustering is applied to feature selection and optimized, which removes redundant features well. The algorithm effectively selects the optimal feature parameter set for constructing the body composition model and addresses the problem of numerous and redundant physiological feature parameters. A body composition prediction model established in this way can improve prediction accuracy and provides a more effective means of detection for body composition research and clinical application.
The foregoing is only a preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the invention.
Claims (8)
1. A human physiological feature selection algorithm combining filter-based selection and improved clustering, characterized by comprising:
S1: Select an impedance model, collect first feature parameter and second feature parameter data, construct an initial feature set and a final optimal subset, and initialize both to the empty set;
S2: Introduce a filter algorithm: for each feature in the collected data, compute its HSIC value with respect to the body composition category; this value characterizes the strength of the correlation between the physiological feature and the body composition category;
S3: Sort the feature set in descending order of HSIC value;
S4: Add the top K ranked features to the feature set, so that the filter algorithm filters out parameters irrelevant to body composition, and construct the initial data set;
S5: Construct a sparse feature graph from the data set using the clustering algorithm, where RI measures the interconnecting edges between features and RC measures the closeness between features, and initialize the desired number of clusters k;
S6: Screen out redundant features within clusters using the improved clustering algorithm;
S7: From each feature cluster, select the feature with the largest HSIC value and combine these features into the optimal feature set.
2. The human physiological feature selection algorithm combining filter-based selection and improved clustering according to claim 1, characterized in that the body composition data measured by a body composition analyzer are used as the data set, denoted T = (O, F, C), where O is the set of data samples, F is the set of candidate features and C is the body composition category; the parameters that have a major influence on body composition are used as first feature parameters, and the reciprocal $1/R_i$, the square $R_i^2$ and the pairwise product $R_iR_j$ of the segment impedances are used as second feature parameters; the impedance values are the 1 kHz impedance parameters of the body composition analyzer; the first feature parameters R1, R2, R3, R4, R5, A, H, W and the second feature parameters 1/R1, 1/R2, 1/R3, 1/R4, 1/R5, R1R2, R1R3, R1R4, R1R5, R2R3, R2R4, R2R5, R3R4, R3R5, R4R5 form the original feature parameter set, denoted $F = \{f_1, f_2, \ldots, f_m\}$, where $f_1, f_2, \ldots, f_m$ are feature points, A is age, H is height and W is weight; the body composition category C includes body fat mass and total body water.
3. The human physiological feature selection algorithm combining filter-based selection and improved clustering according to claim 1 or 2, characterized in that computing, for each feature in the collected data, the HSIC value with respect to the body composition category C is specifically:
for each feature $f_i \in F = \{f_1, f_2, \ldots, f_m\}$, define a nonlinear feature mapping $\phi: F \to \mathcal{H}$ that maps the feature points $f_1, f_2, \ldots, f_m$ into a reproducing kernel Hilbert space $\mathcal{H}$, and map the body composition index space C into a reproducing kernel Hilbert space $\mathcal{G}$ by $\psi: C \to \mathcal{G}$, with kernel function $l(c_i, c_j) = \langle \psi(c_i), \psi(c_j) \rangle_{\mathcal{G}}$; for each feature $f_i \in F$, compute the HSIC value with respect to the body composition category C.
4. The human physiological feature selection algorithm combining filter-based selection and improved clustering according to claim 3, characterized in that the cross-covariance operator between a feature and the body composition category is defined as $C_{fc} = \mathbb{E}_{fc}\big[(\phi(f) - \mu_f) \otimes (\psi(c) - \mu_c)\big]$, where $\otimes$ denotes the tensor product, $\mu_f = \mathbb{E}_f[\phi(f)]$, $\mu_c = \mathbb{E}_c[\psi(c)]$, and $\mathbb{E}_{fc}$, $\mathbb{E}_f$ and $\mathbb{E}_c$ denote expectations; the HSIC value characterizes the strength of the correlation between the physiological feature and the body composition category:
for a feature f and body composition category c, a larger HSIC value indicates that c depends more strongly on f.
5. The human physiological feature selection algorithm combining filter-based selection and improved clustering according to claim 1, characterized in that Chameleon agglomerative hierarchical clustering is used and the sparse feature graph is constructed by the k-nearest-neighbor graph method; each vertex in the graph represents a data object, an edge exists between two vertices when one is among the k nearest neighbors of the other, and the edge weight reflects the similarity of the objects; the similarity of feature sub-clusters is assessed on two points: 1) the interconnectivity of the objects in the clusters; 2) the closeness of the clusters; the similarity between two feature clusters is determined by their relative interconnectivity RI and relative closeness RC.
6. The human physiological feature selection algorithm combining filter-based selection and improved clustering according to claim 4, characterized in that, given the normalized feature data set $F = \{f_1, f_2, \ldots, f_m\}$ filtered by the filter algorithm, the cluster F is bisected into sub-clusters $f_1$ and $f_2$ so that the total weight of the cut edges is minimal; the larger this cut weight, the greater the interconnectivity between $f_1$ and $f_2$; the relative interconnectivity $RI(f_1, f_2)$ of two feature clusters is defined as the absolute interconnectivity between $f_1$ and $f_2$, normalized with respect to the internal interconnectivity of the two clusters, i.e.:

$$RI(f_1, f_2) = \frac{\left| EC_{\{f_1, f_2\}} \right|}{\frac{1}{2}\left( \left| EC_{f_1} \right| + \left| EC_{f_2} \right| \right)}$$

where $EC_{\{f_1, f_2\}}$ is the edge cut of the cluster containing $f_1$ and $f_2$, and $EC_{f_1}$ or $EC_{f_2}$ is the minimum sum of the cut edges when $f_1$ or $f_2$ is bisected into two roughly equal parts.
7. The human physiological feature selection algorithm combining filter-based selection and improved clustering according to claim 5, characterized in that
the relative closeness $RC(f_1, f_2)$ of two feature clusters is defined as the absolute closeness between $f_1$ and $f_2$, normalized with respect to the internal closeness of the two clusters, i.e.:

$$RC(f_1, f_2) = \frac{\bar{S}_{EC_{\{f_1, f_2\}}}}{\frac{|f_1|}{|f_1| + |f_2|}\,\bar{S}_{EC_{f_1}} + \frac{|f_2|}{|f_1| + |f_2|}\,\bar{S}_{EC_{f_2}}}$$

where $\bar{S}_{EC_{\{f_1, f_2\}}}$ is the average weight of the edges connecting vertices of $f_1$ to vertices of $f_2$, and $\bar{S}_{EC_{f_1}}$ or $\bar{S}_{EC_{f_2}}$ is the average weight of the edges cut by the minimum bisection of $f_1$ or $f_2$;
the similarity between the two sub-clusters is determined from the relative interconnectivity and relative closeness of $f_1$ and $f_2$.
8. The human physiological feature selection algorithm based on the combination of a filter method and improved clustering according to claim 1, characterized in that the improved clustering algorithm traverses all feature sub-clusters and attempts to merge and replace them, evaluates the feature selection quality after sub-clusters are merged, and attempts to merge every pair of existing sub-clusters; its specific steps are:
S61: Calculate the distances between clusters and sort them, and judge whether the number of sample sub-clusters h equals the initialized expected number of clusters k;
S62: If they are not equal, merge the two sub-clusters with the largest similarity function value; if they are equal, terminate;
S63: Recalculate the relative closeness RC of the new sub-cluster, traverse all sub-clusters, and check whether every pair of sub-clusters has been tried for merging;
S64: If all sub-clusters have been tried for merging, return to S61; otherwise merge the two sub-clusters with the smallest similarity function value and return to S63;
S65: Select the features with the largest HSIC values and combine them.
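The merging loop of claim 8 can be sketched in Python under simplifying assumptions. The sketch covers only the core of S61, S62 and S65: it merges the most similar pair until the number of sub-clusters h reaches the expected number k, then keeps the feature with the largest HSIC value from each resulting cluster. The traversal and quality evaluation of S63 and S64 are omitted, picking one feature per cluster is an assumed reading of S65, and `similarity`, `hsic` and both function names are hypothetical.

```python
from itertools import combinations


def merge_until_k(clusters, similarity, k):
    """Repeatedly merge the most similar pair until only k sub-clusters remain.

    `similarity(c1, c2)` is a caller-supplied function, for example one built
    from relative interconnectivity and relative closeness.
    """
    clusters = [frozenset(c) for c in clusters]
    while len(clusters) > max(k, 1):
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: similarity(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def select_representatives(clusters, hsic):
    """Keep, from every cluster, the feature with the largest HSIC score."""
    return [max(cluster, key=lambda feature: hsic[feature])
            for cluster in clusters]


# Toy data: pairwise similarities and HSIC scores for four hypothetical features.
pair_sim = {
    frozenset(("a", "b")): 0.9,
    frozenset(("c", "d")): 0.8,
    frozenset(("b", "c")): 0.2,
}


def average_link(c1, c2):
    """Mean pairwise similarity between two clusters (missing pairs count as 0)."""
    values = [pair_sim.get(frozenset((u, v)), 0.0) for u in c1 for v in c2]
    return sum(values) / len(values)


hsic_scores = {"a": 0.3, "b": 0.7, "c": 0.5, "d": 0.1}
final_clusters = merge_until_k([{"a"}, {"b"}, {"c"}, {"d"}], average_link, k=2)
selected = select_representatives(final_clusters, hsic_scores)
```

With these toy values the loop first merges a with b, then c with d, and the selected features are b and c.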
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710733507.1A CN107845407A (en) | 2017-08-24 | 2017-08-24 | Human physiological feature selection algorithm based on the combination of a filter method and improved clustering |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107845407A (en) | 2018-03-27 |
Family
ID=61683251
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710733507.1A Pending CN107845407A (en) | 2017-08-24 | 2017-08-24 | Human physiological feature selection algorithm based on the combination of a filter method and improved clustering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107845407A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180327 |