CN107845407A - Human body physiological feature selection algorithm combining filter-type selection and improved clustering - Google Patents

Human body physiological feature selection algorithm combining filter-type selection and improved clustering

Info

Publication number
CN107845407A
Authority
CN
China
Prior art keywords
feature
cluster
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710733507.1A
Other languages
Chinese (zh)
Inventor
陈波
俞洁
高秀娥
郑庆国
白旭飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University
Original Assignee
Dalian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University filed Critical Dalian University
Priority to CN201710733507.1A priority Critical patent/CN107845407A/en
Publication of CN107845407A publication Critical patent/CN107845407A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

The invention discloses a human body physiological feature selection algorithm that combines filter-type selection with improved clustering, comprising: S1: selecting an impedance model, collecting first feature parameter and second feature parameter data, and constructing an initial feature set and a final optimal subset; S2: introducing a filter algorithm and computing an HSIC value for each feature in the collected data; S3: sorting the features in descending order of HSIC value; S4: adding the K top-ranked features to the feature set, using the Filter algorithm to remove parameters irrelevant to body composition, and constructing the initial data set; S5: constructing a sparse feature graph from the data set according to a clustering algorithm; S6: screening out redundant features within clusters using an improved clustering algorithm. The feature selection algorithm established by this application can improve the accuracy of body composition prediction and provides a more effective detection means for body composition research and clinical application.

Description

Human body physiological feature selection algorithm combining filter-type selection and improved clustering
Technical Field
The invention belongs to the field of bioinformatics, and more particularly relates to a human body physiological feature selection algorithm that combines filter-type selection with improved clustering.
Background
The balance of human body composition plays an important role in keeping the body's internal environment stable and is an important factor affecting health. When disease occurs, changes in body composition often precede the clinical symptoms. Changes in body composition can therefore be used to make correlation-based predictions for diseases such as hypertension, dyslipidemia and metabolic syndrome. However, many parameters influence body composition, and these parameters are highly nonlinear, redundant and partly irrelevant.
Existing Wrapper algorithms remove redundant features and achieve good generalization performance, but their high computational complexity makes them unsuitable for large data sets. Filter algorithms assign each feature a weight according to an evaluation criterion and are computationally efficient, but they do not fully account for redundancy between features, so the selected feature subset may still contain substantial redundancy. Clustering methods divide the body composition parameter data objects into groups, or clusters, such that objects within a cluster are highly similar; judging by the distance between each object and the cluster center, they remove redundant features effectively but cannot effectively screen out irrelevant features. In view of this, a new dimensionality reduction method is needed before high-dimensional body composition data are analyzed.
Summary of the Invention
In view of the shortcomings of the prior art, the invention proposes a human body physiological feature selection algorithm that combines filter-type selection with improved clustering. It first uses a Filter feature selection algorithm to remove features irrelevant to the body composition class, and then uses M-Chameleon feature clustering to remove redundant features, so that the advantages of both Filter feature selection and feature clustering are exploited to the full. A body composition prediction model built in this way can improve prediction accuracy and provide a more effective detection means for body composition research and clinical application.
To achieve the above object, the invention provides a human body physiological feature selection algorithm combining filter-type selection and improved clustering, comprising:
S1: Select an impedance model, collect first feature parameter and second feature parameter data, construct an initial feature set and a final optimal subset, and initialize the initial feature set and the final optimal subset to the empty set;
Further, the body composition data measured by a body composition analyzer (INBODY) are taken as the data set, denoted T = (O, F, C), where O is the set of data samples, F is the set of selected features and C is the body composition class. Parameters with an important influence on body composition, such as body weight, height, age, sex and the impedance of each body segment, serve as first feature parameters; the reciprocal $1/R_i$, the square $R_i^2$ and the pairwise product $R_iR_j$ of the segment impedances serve as second feature parameters. The impedance values are the 1 kHz impedance parameters from INBODY. The first feature parameters ($R_1, R_2, R_3, R_4, R_5$, A, H, W, where A is age, H is height and W is body weight) and the second feature parameters ($1/R_1, 1/R_2, 1/R_3, 1/R_4, 1/R_5, R_1R_2, R_1R_3, R_1R_4, R_1R_5, R_2R_3, R_2R_4, R_2R_5, R_3R_4, R_3R_5, R_4R_5$, etc.) form the original feature parameter set, denoted $F = \{f_1, f_2, \dots, f_m\}$. The body composition class set C includes body fat mass (BFM) and total body water (TBW).
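The construction of the original feature parameter set in S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_feature_set`, the dictionary layout and the sample impedance values are assumptions, and sex is left out because the enumerated first feature parameters above list only $R_1$ to $R_5$, A, H and W.

```python
from itertools import combinations

def build_feature_set(r, age, height, weight):
    """Build first and second feature parameters for one subject.

    r is a list of the five segment impedances R1..R5 (hypothetical input
    layout); age, height and weight correspond to A, H and W.
    """
    # First feature parameters: the raw impedances plus A, H, W.
    features = {f"R{i}": ri for i, ri in enumerate(r, start=1)}
    features.update({"A": age, "H": height, "W": weight})
    # Second feature parameters: reciprocals and squares of each impedance.
    for i, ri in enumerate(r, start=1):
        features[f"1/R{i}"] = 1.0 / ri
        features[f"R{i}^2"] = ri ** 2
    # Second feature parameters: pairwise products RiRj with i < j.
    for (i, ri), (j, rj) in combinations(enumerate(r, start=1), 2):
        features[f"R{i}R{j}"] = ri * rj
    return features

# Illustrative values only; they are not measurements from the patent.
sample = build_feature_set([300.0, 310.0, 25.0, 250.0, 260.0], 30, 170.0, 65.0)
```

With five impedances this yields 8 first feature parameters and 20 second feature parameters (5 reciprocals, 5 squares, 10 products) per subject.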
S2: Introduce a filter algorithm (Filter) and, for each feature in the collected data, compute its HSIC value under the body composition class C; this value characterizes the strength of the correlation between the physiological feature and the body composition class;
Further, for the features $\{f_1, f_2, \dots, f_m\} \in F$, define a nonlinear feature mapping $\phi: F \to \mathcal{F}$ that maps the feature points $f_1, f_2, \dots, f_m$ into a reproducing kernel Hilbert space (RKHS) $\mathcal{F}$ with kernel function
$$k(f_i, f_j) = \langle \phi(f_i), \phi(f_j) \rangle_{\mathcal{F}},$$
where $\langle \cdot, \cdot \rangle_{\mathcal{F}}$ is the inner product on the space $\mathcal{F}$. Similarly, define a body composition class mapping $\psi: C \to \mathcal{G}$ that maps the body composition index space C into an RKHS denoted $\mathcal{G}$, with kernel function
$$l(c_i, c_j) = \langle \psi(c_i), \psi(c_j) \rangle_{\mathcal{G}}.$$
In addition, the cross-covariance operator between the features and the body composition class is defined as
$$C_{fc} = E_{fc}\left[(\phi(f) - \mu_f) \otimes (\psi(c) - \mu_c)\right], \quad \mu_f = E_f[\phi(f)], \quad \mu_c = E_c[\psi(c)],$$
where $\otimes$ denotes the tensor product and $E_f$ and $E_c$ denote expectations. For each feature in F, the HSIC value under the body composition class c is computed. (HSIC is a kernel-based independence measure: a cross-covariance operator is defined on reproducing kernel Hilbert spaces, and an empirical estimate of the norm of this operator gives the independence criterion. It can be used to measure the similarity between two data distributions and is widely used in feature selection and dimensionality reduction.) The value characterizes the strength of the correlation between the physiological feature and the body composition class:
$$\mathrm{HSIC}(f, c) = \| C_{fc} \|_{HS}^2.$$
For a feature f and a body composition class c, the larger the HSIC value, the more strongly c depends on f.
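The HSIC score in S2 can be estimated from samples with the commonly used biased estimator $\mathrm{tr}(KHLH)/(n-1)^2$, where K and L are the kernel matrices of the feature and of the body composition values and $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$ is the centering matrix. The sketch below is an illustration under assumptions the patent does not state: a Gaussian (RBF) kernel, a median-heuristic bandwidth, and the function names `rbf_kernel` and `hsic`.

```python
import numpy as np

def rbf_kernel(x, sigma=None):
    """Gaussian kernel matrix of a sample (kernel choice is an assumption)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if sigma is None:
        # Median heuristic for the bandwidth (also an assumption).
        med = np.median(sq[sq > 0]) if np.any(sq > 0) else 1.0
        sigma = np.sqrt(0.5 * med)
    return np.exp(-sq / (2.0 * sigma ** 2))

def hsic(f, c):
    """Biased empirical HSIC between a feature f and a target c."""
    n = len(f)
    K = rbf_kernel(f)                      # kernel matrix on feature values
    L = rbf_kernel(c)                      # kernel matrix on target values
    H = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2
```

A feature that carries information about the target, even nonlinearly, should receive a larger score than an unrelated one, which is the property S3 and S4 rely on.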
S3: Sort the features in descending order of HSIC value;
S4: Add the K top-ranked features to the feature set, so that the Filter algorithm removes the parameters irrelevant to body composition, and construct the initial data set;
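Steps S3 and S4 amount to a sort followed by a truncation. A minimal sketch, assuming the HSIC values are already available in a dictionary keyed by feature name (the function name `top_k_features` and the scores in the usage line are hypothetical):

```python
def top_k_features(scores, k):
    """Return the names of the k features with the largest HSIC scores."""
    # S3: sort by score in descending order.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # S4: keep only the K top-ranked features.
    return [name for name, _ in ranked[:k]]

# Hypothetical scores for illustration only.
kept = top_k_features({"W": 0.9, "H": 0.2, "R5": 0.5}, k=2)
```

Features outside the top K are treated as irrelevant to the body composition class and are dropped before clustering.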
S5: Construct a sparse feature graph from the data set according to the clustering algorithm (M-Chameleon), where RI describes the interconnecting edges between features and RC the closeness between features, and initialize the desired number of clusters k;
Further, Chameleon uses agglomerative hierarchical clustering and constructs the sparse feature graph by the k-nearest-neighbor graph method. Each vertex in the graph represents a data object, an edge exists between two vertices when one is among the k nearest neighbors of the other, and the edge weight reflects the similarity of the two objects. The principle of the algorithm is shown in Fig. 1. The similarity of feature sub-clusters is assessed on two points: 1) the interconnectivity of the objects in the clusters; 2) the closeness of the clusters. If two feature clusters are highly interconnected and close to each other, they are merged and replaced by the merged cluster. The similarity between two feature clusters is determined from their relative interconnectivity RI and relative closeness RC. Given the normalized, Filter-filtered feature data set $F = \{f_1, f_2, \dots, f_m\}$, the cluster F is divided into sub-clusters $f_1$ and $f_2$ by the bisection whose cut edges have minimum total weight; the greater the weight of the edges joining them, the greater the relative interconnectivity between the feature sub-clusters $f_1$ and $f_2$. The relative interconnectivity $RI(f_1, f_2)$ of two feature clusters $f_1$ and $f_2$ is defined as the absolute interconnectivity between $f_1$ and $f_2$ normalized by the internal interconnectivity of the two clusters, i.e.:
$$RI(f_1, f_2) = \frac{\left| EC_{\{f_1, f_2\}} \right|}{\frac{1}{2}\left( \left| EC_{f_1} \right| + \left| EC_{f_2} \right| \right)}$$
where $EC_{\{f_1, f_2\}}$ is the edge cut of the cluster containing $f_1$ and $f_2$, and $EC_{f_1}$ (or $EC_{f_2}$) is the minimum edge-cut sum that divides $f_1$ (or $f_2$) into two roughly equal parts.
The relative closeness $RC(f_1, f_2)$ of two feature clusters $f_1$ and $f_2$ is defined as the absolute closeness between $f_1$ and $f_2$ normalized by the internal closeness of the two feature clusters, i.e.:
$$RC(f_1, f_2) = \frac{\bar{S}_{EC_{\{f_1, f_2\}}}}{\frac{|f_1|}{|f_1| + |f_2|}\,\bar{S}_{EC_{f_1}} + \frac{|f_2|}{|f_1| + |f_2|}\,\bar{S}_{EC_{f_2}}}$$
where $\bar{S}_{EC_{\{f_1, f_2\}}}$ is the average weight of the edges connecting vertices of $f_1$ to vertices of $f_2$, and $\bar{S}_{EC_{f_1}}$ (or $\bar{S}_{EC_{f_2}}$) is the average weight of the edges in the minimum bisection of cluster $f_1$ (or $f_2$). The similarity between two sub-clusters is determined from the relative interconnectivity and relative closeness of the feature sub-clusters $f_1$ and $f_2$.
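The two measures can be computed directly from a weighted adjacency matrix once a minimum bisection of each cluster is available. The sketch below is illustrative only: the matrix `W`, the function names, and the brute-force bisection are assumptions made for this example. The brute-force search is a stand-in that is only practical for very small clusters; Chameleon implementations typically use a graph-partitioning heuristic instead.

```python
import numpy as np
from itertools import combinations

def cut_edges(W, a, b):
    """Weights of the edges joining vertex sets a and b (zero means no edge)."""
    block = W[np.ix_(a, b)]
    return block[block > 0]

def min_bisection(W, cluster):
    """Edge weights of the lightest roughly equal bisection (brute force)."""
    cluster = list(cluster)
    if len(cluster) < 2:
        return np.array([])
    best = None
    for half in combinations(cluster, len(cluster) // 2):
        rest = [v for v in cluster if v not in half]
        edges = cut_edges(W, list(half), rest)
        if best is None or edges.sum() < best.sum():
            best = edges
    return best

def _mean(edges):
    # Average edge weight, defined as 0 when there are no edges.
    return float(edges.mean()) if edges.size else 0.0

def relative_interconnectivity(W, f1, f2):
    """RI: edge cut between f1 and f2 over the mean internal edge cut."""
    between = cut_edges(W, f1, f2).sum()
    internal = 0.5 * (min_bisection(W, f1).sum() + min_bisection(W, f2).sum())
    return between / internal if internal > 0 else float("inf")

def relative_closeness(W, f1, f2):
    """RC: mean connecting-edge weight over size-weighted internal closeness."""
    n1, n2 = len(f1), len(f2)
    internal = (n1 * _mean(min_bisection(W, f1))
                + n2 * _mean(min_bisection(W, f2))) / (n1 + n2)
    return _mean(cut_edges(W, f1, f2)) / internal if internal > 0 else float("inf")
```

Both functions take the same symmetric matrix `W` and two lists of vertex indices, so a merge loop can score every candidate pair of sub-clusters with the same inputs.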
S6: Screen out redundant features within clusters using the improved clustering algorithm;
S61: Compute the distances between clusters and sort them, then determine whether the number of sample sub-clusters h equals the initialized desired number of clusters k;
S62: If they are not equal, merge the two sub-clusters with the largest similarity function value; if they are equal, terminate;
S63: Recompute the relative closeness RC of the new sub-cluster and traverse all sub-clusters, checking whether every pair of sub-clusters has been tried for merging;
S64: If all sub-clusters have been tried for merging, return to S61; otherwise merge the two sub-clusters with the smallest similarity function value and return to S63;
S65: Select and combine the features with the largest HSIC values.
S7: From each feature cluster, select the feature with the largest HSIC value and combine the selected features into the optimal feature set.
By adopting the above technical solution, the invention achieves the following technical effects. Based on the characteristics of human body physiological feature parameters, a feature parameter selection algorithm combining Filter and clustering is proposed. A feature filtering method based on the Hilbert-Schmidt independence criterion removes features irrelevant to the class, and the improved Chameleon clustering, applied to feature selection and further optimized, removes redundant features well. The optimal feature parameter set for constructing a body composition model is thus selected effectively, which solves the problem of numerous and redundant human body physiological feature parameters and provides a more effective detection means for body composition research and clinical application.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the Chameleon clustering algorithm;
Fig. 2 is a schematic diagram of the improved Chameleon algorithm;
Fig. 3 shows the human body feature parameter selection process;
Fig. 4 shows the correlation with BFM of the feature parameters obtained by the filter algorithm in the 1 kHz band;
Fig. 5 shows the correlation with BFM of the feature parameters obtained by the filter algorithm in the 250 kHz band;
Fig. 6 shows the correlation with BFM of the feature parameters obtained by the filter algorithm in the 500 kHz band;
Fig. 7 is the analysis of the number of parameter clusters after the filter algorithm;
Fig. 8 shows the distances between the feature parameters and the BFM index when different sample sizes are clustered into four classes;
Fig. 9 compares the BFM model predictions with the actual values;
Fig. 10 compares the relative errors of the BFM model predictions.
Detailed Description
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in detail below with reference to the drawings and specific embodiments.
The body composition data measured by INBODY are taken as the data set, denoted T = (O, F, C). Parameters with an important influence on body composition, such as body weight, height, age, sex and the impedance of each body segment, serve as first feature parameters; the reciprocal $1/R_i$, the square $R_i^2$ and the pairwise product $R_iR_j$ of the segment impedances serve as second feature parameters. INBODY measures in three frequency bands, 1 kHz, 250 kHz and 500 kHz; the relationship between body composition and the feature parameters is studied here for these three bands and for different sample sizes. The first feature parameters ($R_1, R_2, R_3, R_4, R_5$, A, H, W) and the second feature parameters $1/R_1, 1/R_2, 1/R_3, 1/R_4, 1/R_5, R_1R_2, R_1R_3, R_1R_4, R_1R_5, R_2R_3, R_2R_4, R_2R_5, R_3R_4, R_3R_5, R_4R_5$ form the original feature parameter set, denoted $F = \{f_1, f_2, \dots, f_m\}$. The body composition class set C includes body fat mass (BFM) and total body water (TBW). The following table lists part of the sample data set.
However, many parameters influence body composition, and these parameters are highly nonlinear, redundant and partly irrelevant. To address this, a method for reducing the dimensionality of the data is needed that resolves the redundancy and irrelevance of the feature parameters. Clustering methods divide the body composition parameter data objects into groups, or clusters, such that objects within a cluster are highly similar; judging by the distance between each object and the cluster center, they remove redundant features effectively. In addition, a step that reduces the number of features before the high-dimensional body composition data are analyzed removes the attributes irrelevant to the target.
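The filter algorithm is therefore applied to the data first. Given the original feature set $F = \{f_1, f_2, \dots, f_m\}$, the data sample set $O = \{o_1, o_2, \dots, o_n\}$ and the body composition indices BFM and TBW, the filter algorithm is run on the first 100 samples in the three bands 1 kHz, 250 kHz and 500 kHz. The figures below show the feature parameter correlations obtained after running the algorithm for body composition BFM.
The features are mapped by a nonlinear feature mapping $\phi: F \to \mathcal{F}$ into a reproducing kernel Hilbert space (RKHS) $\mathcal{F}$ with kernel function $k(f_i, f_j) = \langle \phi(f_i), \phi(f_j) \rangle_{\mathcal{F}}$, where $\langle \cdot, \cdot \rangle_{\mathcal{F}}$ is the inner product on the space $\mathcal{F}$. Similarly, define a body composition class mapping $\psi: C \to \mathcal{G}$ that maps the body composition index space C into an RKHS denoted $\mathcal{G}$; the corresponding kernel function is:
$$l(c_i, c_j) = \langle \psi(c_i), \psi(c_j) \rangle_{\mathcal{G}}$$
The kernel function computes the inner product of two feature points projected into the feature space without explicitly computing the mapping $\phi$, so the computational cost implied by the dimension of that space need not be paid. The cross-covariance operator between the features and the body composition class can therefore be defined as:
$$C_{fc} = E_{fc}\left[(\phi(f) - \mu_f) \otimes (\psi(c) - \mu_c)\right], \quad \mu_f = E_f[\phi(f)], \quad \mu_c = E_c[\psi(c)]$$
In the above formula, $\otimes$ denotes the tensor product and $E_f$ and $E_c$ denote expectations [16]. The squared norm of this cross-covariance operator is called HSIC; its expression is [14]:
$$\mathrm{HSIC}(f, c) = \| C_{fc} \|_{HS}^2$$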
Running the filter algorithm on body composition BFM under the different impedance bands gives the correlations shown in Figs. 4, 5 and 6. The three figures show that as the impedance frequency band increases, the impedance values decrease and the BFM information contained in each feature parameter gradually decreases. Feature parameters are screened at an 80% confidence interval; the features selected after running the filter algorithm in the different bands are summarized in Table 2 below:
Table 2: Features after running the filter algorithm in the different frequency bands
As Table 2 shows, the algorithm substantially reduces the number of original features, and the 250 kHz band retains the most features. The features filtered in the medium impedance band, 250 kHz, are therefore chosen for cluster analysis to filter out redundancy.
Before clustering, the number of clusters must be determined. The amount of information contained in the screened feature parameters is obtained for different numbers of clusters, as shown in Fig. 7; the analysis shows that dividing the feature parameters into 4 classes best represents the selected feature information with respect to body composition. For sample sizes of 20, 40, 60, 80 and 100 people, the clustering changes little, as shown in Fig. 8: $1/R_4$ and $1/R_5$ form one class; A, H, W, $R_5$ and $R_4$ form one class; $R_4R_5$, $R_1R_2$, $R_2^2$, $R_1^2$ and $R_5^2$ form one class; and $R_2R_3$ and $R_1R_3$ form one class. After the feature parameters obtained by the Filter algorithm are clustered into 4 classes, the features farther from the cluster center BFM, namely $1/R_4$, $R_4$, $R_1^2$ and $R_1R_3$, can be removed. Table 3 lists the feature parameters selected after the Filter and clustering algorithms.
Table 3: Feature parameters after the Filter and clustering algorithms
Table 4 lists the candidate feature sets for body composition BFM prediction obtained with three feature selection methods, together with their time complexity;
Table 4: Comparison of optimal feature sets and complexity
As Table 4 shows, for data sets of the same dimension, both the size of the candidate feature set obtained by the algorithm of the invention and its time complexity are smaller than those of the combined Filter and Wrapper algorithm and of the mRMR feature selection algorithm;
To verify the performance of the present feature selection algorithm, feature selection for body composition (BFM) is carried out with mRMR, with the combined Filter and Wrapper feature selection algorithm, and with the present algorithm. To measure accurately how good each candidate feature set is for the given body composition BFM, the first 80 samples of the sample set are used as the training set, denoted $T_1 = \{(x_1, y_1), (x_2, y_2), \dots, (x_{80}, y_{80})\}$, and the last 20 as the test set, $T_2 = \{(x_{81}, y_{81}), (x_{82}, y_{82}), \dots, (x_{100}, y_{100})\}$, where $x_i \in R^l$ is the vector of input feature parameter values (the independent variables) and $y_i \in R$ is the actual body composition value (the dependent variable). Multiple linear regression in SPSS is used to train on $T_1$. Table 5 summarizes the models obtained by regression modeling of BFM with the above feature sets:
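The 80/20 split and the multiple linear regression can be reproduced outside SPSS with ordinary least squares. The sketch below swaps in NumPy's `numpy.linalg.lstsq` for the SPSS procedure and runs on synthetic data generated only for illustration; the function names and the data are assumptions, not the patent's samples.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [coefs..., intercept]."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply a model returned by fit_linear to new inputs."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

# Synthetic stand-in for 100 samples with 3 selected features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + 2.0 + 0.01 * rng.normal(size=100)

# First 80 samples form the training set T1, last 20 the test set T2.
X_train, y_train = X[:80], y[:80]
X_test, y_test = X[80:], y[80:]

coef = fit_linear(X_train, y_train)
y_pred = predict(coef, X_test)
```

Keeping the split positional (first 80, last 20) mirrors the description above; a shuffled split would be a different experimental design.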
Table 5: Model summary (modified)
a. Predictors: (constant), W, S, A, $R_3$, $1/R_2$, $1/R_1$, $1/R_3$, $R_4^2$, $R_4R_5$, $R_5^2$
b. Predictors: (constant), $1/R_3$, W, S, $R_2^2$, $R_4^2$, $R_4R_5$, $R_5^2$, $1/R_1$, $R_5$
c. Predictors: (constant), A, H, W, $R_5$, $R_1R_2$, $R_2R_3$, $R_4R_5$, $1/R_5$, $R_2^2$, $R_5^2$
Table 5 shows that the correlations between the physiological feature sets and BFM in models 1, 2 and 3 are 0.927, 0.906 and 0.978 respectively; the feature set obtained with the present algorithm therefore correlates most strongly with body composition;
From the regression coefficients obtained for each model, the prediction equations are:
$$BFM_1 = 0.041W + 0.126S + 0.523A - 0.212R_3 + 0.171\frac{1}{R_1} + 0.126\frac{1}{R_2} + 0.179\frac{1}{R_3} + 0.132R_4^2 + 0.13R_4R_5 + 0.127R_5^2 - 8.56 \quad (1)$$
$$BFM_2 = 0.313W - 0.044S - 0.125\frac{1}{R_3} + 0.108\frac{1}{R_1} + 0.016R_4^2 - 0.01R_2^2 + 0.071R_5^2 + 0.072R_4R_5 - 0.526R_5 + 5.674 \quad (2)$$
$$BFM_3 = -0.464A - 0.15H + 0.122W - 0.143R_5 + 0.129R_1R_2 + 0.122R_2R_3 - 0.134R_4R_5 + 0.145\frac{1}{R_5} + 0.129R_2^2 - 0.141R_5^2 \quad (3)$$
The resulting prediction models are applied to the test set $T_2$ and compared with the actual values, giving the comparison of BFM model predictions and actual values in Fig. 9 and the error analysis in Fig. 10. As Fig. 10 shows, the prediction model built from the features obtained by the present feature selection algorithm is more accurate, with a relative prediction error below 0.12. The results show that the feature set obtained by the human body physiological feature selection algorithm combining filter and clustering correlates well with body composition, improves the fitting accuracy of the body composition prediction model and reduces the prediction error.
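The relative prediction error reported above is the absolute difference between prediction and actual value divided by the actual value. A minimal sketch; the function name and the numbers in the usage lines are hypothetical and are not the patent's measurements:

```python
import numpy as np

def relative_errors(y_true, y_pred):
    """Per-sample relative error |prediction - actual| / |actual|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_pred - y_true) / np.abs(y_true)

# Hypothetical BFM values in kg, for illustration only.
errors = relative_errors([10.0, 20.0, 15.0], [11.0, 18.0, 15.0])
within_bound = bool(np.all(errors < 0.12))
```

Checking the maximum of these values against a bound such as 0.12 is one way to express the accuracy claim for a test set.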
Compared with the prior art, the invention provides a human body physiological feature selection algorithm combining Filter and clustering. A feature filtering method based on the Hilbert-Schmidt independence criterion removes features irrelevant to the class, and the improved Chameleon clustering, applied to feature selection and further optimized, removes redundant features well. The optimal feature parameter set for constructing a body composition model is thus selected effectively, which solves the problem of numerous and redundant human body physiological feature parameters. A body composition prediction model established in this way can improve prediction accuracy and provides a more effective detection means for body composition research and clinical application.
The foregoing is only a preferred embodiment of the invention, and the scope of protection of the invention is not limited to it. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the invention.

Claims (8)

1. A human body physiological feature selection algorithm combining filter-type selection and improved clustering, characterized by comprising:
S1: selecting an impedance model, collecting first feature parameter and second feature parameter data, constructing an initial feature set and a final optimal subset, and initializing the initial feature set and the final optimal subset to the empty set;
S2: introducing a filter algorithm and, for each feature in the collected data, computing its HSIC value under the body composition class, the value characterizing the strength of the correlation between the physiological feature and the body composition class;
S3: sorting the features in descending order of HSIC value;
S4: adding the K top-ranked features to the feature set, using the filter algorithm to remove parameters irrelevant to body composition, and constructing the initial data set;
S5: constructing a sparse feature graph from the data set according to a clustering algorithm, where RI describes the interconnecting edges between features and RC the closeness between features, and initializing the desired number of clusters k;
S6: screening out redundant features within clusters using an improved clustering algorithm;
S7: selecting from each feature cluster the feature with the largest HSIC value and combining the selected features into the optimal feature set.
2. The human body physiological feature selection algorithm combining filter-type selection and improved clustering according to claim 1, characterized in that the body composition data measured by a body composition analyzer are taken as the data set, denoted T = (O, F, C), where O is the set of data samples, F is the set of selected features and C is the body composition class; the parameters with an important influence on body composition serve as first feature parameters, and the reciprocal $1/R_i$, the square $R_i^2$ and the pairwise product $R_iR_j$ of the segment impedances serve as second feature parameters; the impedance values are the 1 kHz impedance parameters of the body composition analyzer; the first feature parameters $R_1, R_2, R_3, R_4, R_5$, A, H, W and the second feature parameters $1/R_1, 1/R_2, 1/R_3, 1/R_4, 1/R_5, R_1R_2, R_1R_3, R_1R_4, R_1R_5, R_2R_3, R_2R_4, R_2R_5, R_3R_4, R_3R_5, R_4R_5$ form the original feature parameter set, denoted $F = \{f_1, f_2, \dots, f_m\}$, where $f_1, f_2, \dots, f_m$ are feature points, A is age, H is height and W is body weight; and the body composition class C includes body fat mass and total body water.
3. The human body physiological feature selection algorithm combining filter-type selection and improved clustering according to claim 1 or 2, characterized in that computing, for each feature in the collected data, the HSIC value under the body composition class C is specifically:
for the features $\{f_1, f_2, \dots, f_m\} \in F$, a nonlinear feature mapping $\phi: F \to \mathcal{F}$ is defined that maps the feature points $f_1, f_2, \dots, f_m$ into a reproducing kernel Hilbert space $\mathcal{F}$; the body composition index space C is mapped into a reproducing kernel Hilbert space denoted $\mathcal{G}$; the kernel function is $k(f_i, f_j) = \langle \phi(f_i), \phi(f_j) \rangle_{\mathcal{F}}$; and for each feature in F, the HSIC value under the body composition class C is computed.
4. The human body physiological feature selection algorithm combining filter-type selection and improved clustering according to claim 3, characterized in that the cross-covariance operator between the features and the body composition class is defined as $C_{fc} = E_{fc}\left[(\phi(f) - \mu_f) \otimes (\psi(c) - \mu_c)\right]$ with $\mu_f = E_f[\phi(f)]$ and $\mu_c = E_c[\psi(c)]$, where $\otimes$ denotes the tensor product and $E_f$ and $E_c$ denote expectations; the HSIC value characterizes the strength of the correlation between the physiological feature and the body composition class: $\mathrm{HSIC}(f, c) = \| C_{fc} \|_{HS}^2$;
for a feature f and a body composition class c, the larger the HSIC value, the more strongly c depends on f.
5. The human body physiological feature selection algorithm combining filter-type selection and improved clustering according to claim 1, characterized in that the Chameleon agglomerative hierarchical clustering method is used and the sparse feature graph is constructed by the k-nearest-neighbor graph method; each vertex in the graph represents a data object, an edge exists between two vertices, and the edge weight reflects the similarity of the objects; the similarity of feature sub-clusters is assessed on two points: 1) the interconnectivity of the objects in the clusters; 2) the closeness of the clusters; and the similarity between two feature clusters is determined from their relative interconnectivity RI and relative closeness RC.
6. The human body physiological feature selection algorithm combining filter-type selection and improved clustering according to claim 4, characterized in that, given the normalized feature data set $F = \{f_1, f_2, \dots, f_m\}$ filtered by the filter algorithm, the cluster F is divided into sub-clusters $f_1$ and $f_2$ by the bisection whose cut edges have minimum total weight; the greater the weight of the edges joining them, the greater the relative interconnectivity between the feature sub-clusters $f_1$ and $f_2$; the relative interconnectivity $RI(f_1, f_2)$ of two feature clusters $f_1$ and $f_2$ is defined as the interconnectivity between $f_1$ and $f_2$ normalized by the internal interconnectivity of the two clusters, i.e.:
$$RI(f_1, f_2) = \frac{\left| EC_{\{f_1, f_2\}} \right|}{\frac{1}{2}\left( \left| EC_{f_1} \right| + \left| EC_{f_2} \right| \right)}$$
where $EC_{\{f_1, f_2\}}$ is the edge cut of the cluster containing $f_1$ and $f_2$, and $EC_{f_1}$ or $EC_{f_2}$ is the minimum edge-cut sum that divides $f_1$ or $f_2$ into two roughly equal parts.
7. based on filtering type and improving the human body physiological characteristics selection algorithm for clustering and being combined according to claim 5, it is special Sign is,
Two feature cluster f1And f2Relative closeness RC (f1,f2) it is defined as f1And f2Between relative closeness, on two Feature cluster f1And f2The inside degree of approximation standardization, i.e.,:
<mrow> <mi>R</mi> <mi>C</mi> <mrow> <mo>(</mo> <msub> <mi>f</mi> <mn>1</mn> </msub> <mo>,</mo> <msub> <mi>f</mi> <mn>2</mn> </msub> <mo>)</mo> </mrow> <mo>=</mo> <mfrac> <msub> <mover> <mi>S</mi> <mo>&amp;OverBar;</mo> </mover> <mrow> <msub> <mi>EC</mi> <mrow> <mo>{</mo> <msub> <mi>f</mi> <mn>1</mn> </msub> <mo>,</mo> <msub> <mi>f</mi> <mn>2</mn> </msub> <mo>}</mo> </mrow> </msub> </mrow> </msub> <mrow> <mfrac> <mrow> <mo>|</mo> <msub> <mi>f</mi> <mn>1</mn> </msub> <mo>|</mo> </mrow> <mrow> <mo>|</mo> <msub> <mi>f</mi> <mn>1</mn> </msub> <mo>|</mo> <mo>+</mo> <mo>|</mo> <msub> <mi>f</mi> <mn>2</mn> </msub> <mo>|</mo> </mrow> </mfrac> <msub> <mover> <mi>S</mi> <mo>&amp;OverBar;</mo> </mover> <mrow> <msub> <mi>EC</mi> <msub> <mi>f</mi> <mn>1</mn> </msub> </msub> </mrow> </msub> <mo>+</mo> <mfrac> <mrow> <mo>|</mo> <msub> <mi>f</mi> <mn>1</mn> </msub> <mo>|</mo> </mrow> <mrow> <mo>|</mo> <msub> <mi>f</mi> <mn>1</mn> </msub> <mo>|</mo> <mo>+</mo> <mo>|</mo> <msub> <mi>f</mi> <mn>2</mn> </msub> <mo>|</mo> </mrow> </mfrac> <msub> <mover> <mi>S</mi> <mo>&amp;OverBar;</mo> </mover> <mrow> <msub> <mi>EC</mi> <msub> <mi>f</mi> <mn>2</mn> </msub> </msub> </mrow> </msub> </mrow> </mfrac> </mrow>
where $\bar{S}_{EC_{\{f_1, f_2\}}}$ is the average weight of the edges connecting the vertices of $f_1$ to the vertices of $f_2$, and $\bar{S}_{EC_{f_1}}$ (or $\bar{S}_{EC_{f_2}}$) is the average weight of the edges cut by the minimum bisection of cluster $f_1$ (or $f_2$);
the similarity between two sub-clusters is determined by the relative interconnectivity and the relative closeness of the feature sub-clusters $f_1$ and $f_2$.
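As a rough illustration of the relative closeness, the sketch below takes the average edge weights and cluster sizes as precomputed inputs and weights each cluster's internal closeness by its share of the vertices, as in the original Chameleon algorithm. The claim does not state how RI and RC are combined into one similarity value; the product $RI \cdot RC^{\alpha}$ used here is the form from the original Chameleon algorithm and is an assumption, as are the function names and the parameter `alpha`.

```python
def relative_closeness(s_between, s_f1, s_f2, n1, n2):
    """RC(f1, f2) from average edge weights and cluster sizes.

    s_between: average weight of the edges connecting f1 and f2
    s_f1, s_f2: average weight of the min-bisection edges of f1 and f2
    n1, n2: number of vertices (features) in f1 and f2
    """
    total = n1 + n2
    internal = (n1 / total) * s_f1 + (n2 / total) * s_f2
    if internal == 0:
        raise ValueError("internal closeness is zero; RC is undefined")
    return s_between / internal


def similarity(ri, rc, alpha=1.0):
    """Assumed Chameleon-style combination: RI * RC ** alpha."""
    return ri * rc ** alpha


# Two equally sized clusters with internal closeness 4.0 and 2.0.
rc = relative_closeness(s_between=3.0, s_f1=4.0, s_f2=2.0, n1=2, n2=2)
print(rc, similarity(ri=1.0, rc=rc, alpha=2.0))
```

With `alpha` greater than 1 the closeness term dominates the score, and with `alpha` less than 1 the interconnectivity term dominates.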
8. The human physiological feature selection algorithm combining a filter method with improved clustering according to claim 1, characterized in that the improved clustering algorithm traverses all feature sub-clusters and attempts to merge and substitute them, evaluates the feature selection quality after sub-clusters are merged, and attempts to merge every pair of existing sub-clusters; its specific steps are:
S61: Calculate the distances between clusters and sort them, then determine whether the number of sample sub-clusters h is equal to the initialized expected number of clusters k;
S62: If they are not equal, merge the two sub-clusters with the largest similarity function value; if they are equal, terminate;
S63: Recalculate the relative closeness RC of the new sub-cluster, traverse all sub-clusters, and check whether every pair of sub-clusters has been tried for merging;
S64: If all sub-clusters have been tried for merging, return to S61; otherwise, merge the two sub-clusters with the smallest similarity function value and return to S63;
S65: Select the combination of features with the largest HSIC values.
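The merging loop can be sketched in Python under simplifying assumptions. The sketch keeps only the core of S61 and S62 (merge the most similar pair until the number of sub-clusters h equals k) and omits the traversal bookkeeping of S63 and S64. For S65 it assumes that HSIC scores are precomputed per feature and that one feature is kept per final cluster, which is one possible reading of the step. The names `merge_until_k`, `select_representatives`, and the toy similarity function are hypothetical.

```python
from itertools import combinations


def merge_until_k(clusters, similarity, k):
    """Greedily merge the most similar pair until k sub-clusters remain."""
    clusters = [frozenset(c) for c in clusters]
    while len(clusters) > k:  # S61: compare current count h with k
        # S62: pick the pair with the largest similarity function value
        a, b = max(
            combinations(clusters, 2), key=lambda pair: similarity(*pair)
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def select_representatives(clusters, hsic_score):
    """S65 (assumed reading): keep the highest-HSIC feature of each cluster."""
    return sorted(max(c, key=lambda f: hsic_score[f]) for c in clusters)


# Toy data: pairwise feature similarities and precomputed HSIC scores.
pair_sim = {frozenset("ab"): 0.9, frozenset("cd"): 0.8}


def toy_similarity(x, y):
    # Single-linkage style stand-in for the RI/RC-based similarity.
    return max(pair_sim.get(frozenset((u, v)), 0.1) for u in x for v in y)


final = merge_until_k([{"a"}, {"b"}, {"c"}, {"d"}], toy_similarity, k=2)
hsic = {"a": 0.2, "b": 0.7, "c": 0.5, "d": 0.1}
print(final, select_representatives(final, hsic))
```

In practice the stand-in `toy_similarity` would be replaced by a function built from the relative interconnectivity and relative closeness of the two sub-clusters.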
CN201710733507.1A 2017-08-24 2017-08-24 Based on filtering type and improve the human body physiological characteristics selection algorithm for clustering and being combined Pending CN107845407A (en)

Publications (1)

Publication Number Publication Date
CN107845407A true CN107845407A (en) 2018-03-27

Family

ID=61683251

Country Status (1)

Country Link
CN (1) CN107845407A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007002991A1 (en) * 2005-07-01 2007-01-11 Impedimed Limited Monitoring system
CN106485086A (en) * 2016-10-19 2017-03-08 Dalian University Human body composition prediction method based on AIC and improved entropy weight method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
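Liu Wenfeng et al.: "Weka Implementation of the Chameleon Clustering Algorithm", Computer Systems & Applications *
Wu Jinfeng: "Research on a Human Body Composition Prediction Model Based on Support Vector Machines", China Master's Theses Full-text Database, Information Science and Technology *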

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210559A (en) * 2019-05-31 2019-09-06 Beijing Xiaomi Mobile Software Co., Ltd. Object screening method and device, and storage medium
CN110210559B (en) * 2019-05-31 2021-10-08 Beijing Xiaomi Mobile Software Co., Ltd. Object screening method and device and storage medium
CN110363229A (en) * 2019-06-27 2019-10-22 Lingnan Normal University Human body feature parameter selection method based on improved RReliefF combined with mRMR
WO2020258973A1 (en) * 2019-06-27 2020-12-30 Lingnan Normal University Human body feature parameter selection method based on improved RReliefF in combination with mRMR
CN110363229B (en) * 2019-06-27 2021-07-27 Lingnan Normal University Human body characteristic parameter selection method based on combination of improved RReliefF and mRMR

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180327