CN109102896A - Classification model generation method, data classification method, and device - Google Patents

Classification model generation method, data classification method, and device

Info

Publication number
CN109102896A
Authority
CN
China
Prior art keywords
data
sign data
index value
sign
health
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810712862.5A
Other languages
Chinese (zh)
Inventor
王晓婷
栾欣泽
何光宇
孟健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp
Priority to CN201810712862.5A
Publication of CN109102896A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques

Abstract

The embodiments of the present application disclose a classification model generation method, a data classification method, and corresponding devices. The model generation method comprises: obtaining original sign data, where each piece of original sign data includes at least one index value; searching the original sign data for sign data with missing index values; performing data filling on the missing index values to generate filled sign data; and using the sign data without missing index values together with the filled sign data as training data, training an initial classification model according to the training data and the data classification label corresponding to each piece of training data to generate a sign data classification model. The generated sign data classification model can classify any sign data, and the classification results can assist doctors in diagnosis. The application thus mines the internal relationships in a large volume of original sign data to build a sign data classification model, improving the utilization of the original sign data.

Description

Classification model generation method, data classification method, and device
Technical field
This application relates to the field of data processing, and in particular to a classification model generation method and device and a data classification method and device.
Background
China has a large population and therefore a large number of patients. Patient visits generate many medical records, which contain a large amount of medical data, such as the sign data produced when a patient undergoes a medical examination. In the prior art, medical records are usually kept by the patient or consulted by doctors, but the large volume of medical data is not effectively mined or used, so the utilization of medical data is low.
Summary
In view of this, the embodiments of the present application provide a classification model generation method and device and a data classification method and device, which further analyze and use sign data and improve the utilization of medical data.
To solve the above problems, the embodiments of the present application provide the following technical solutions.
A classification model generation method, comprising:
obtaining original sign data, where each piece of original sign data includes at least one index value;
searching the original sign data for sign data with missing index values;
performing data filling on the missing index values in the sign data with missing index values, to generate filled sign data;
using the sign data without missing index values in the original sign data, together with the filled sign data, as training data, and training an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
In one possible implementation, the method further comprises:
before performing data filling on the missing index values in the sign data with missing index values, normalizing the index values in each piece of original sign data.
In one possible implementation, performing data filling on the missing index values in the sign data with missing index values to generate filled sign data comprises:
for any piece of sign data with a missing index value, determining the index item corresponding to the missing index value;
generating multiple data filling results for that index item by applying multiple data filling algorithms to the index values of that index item in the other original sign data;
calculating the average of the multiple data filling results and filling the missing index value of that index item with the average, to generate the filled sign data.
In one possible implementation, the data filling algorithms include any several of maximum likelihood estimation, mean filling, and approximate completion.
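The averaging of several filling results described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation of the application: the function names `mean_fill`, `nearest_fill`, and `fill_missing` are hypothetical, a missing index value is assumed to be stored as `None`, and a nearest-record lookup stands in for the approximate completion method, whose details the text does not give.

```python
from statistics import mean

def mean_fill(records, target, item):
    """Mean filling: average of the item's values in the other records."""
    return mean(r[item] for r in records
                if r is not target and r.get(item) is not None)

def nearest_fill(records, target, item):
    """Assumed stand-in for approximate completion: copy the item's value
    from the other record closest to the target on their shared items."""
    def distance(r):
        shared = [k for k in target if k != item
                  and target[k] is not None and r.get(k) is not None]
        if not shared:
            return float("inf")
        return sum((target[k] - r[k]) ** 2 for k in shared) / len(shared)
    candidates = [r for r in records
                  if r is not target and r.get(item) is not None]
    return min(candidates, key=distance)[item]

def fill_missing(records, target, item, algorithms=(mean_fill, nearest_fill)):
    """Run every filling algorithm and fill the gap with the mean of their results."""
    results = [algorithm(records, target, item) for algorithm in algorithms]
    filled = dict(target)  # leave the original record untouched
    filled[item] = mean(results)
    return filled

# Hypothetical normalized records; record 3 lacks index item "b".
records = [
    {"a": 0.2, "b": 0.4},
    {"a": 0.8, "b": 1.0},
    {"a": 0.3, "b": None},
]
filled = fill_missing(records, records[2], "b")
```

Passing the algorithms as a tuple keeps the averaging step independent of which filling methods are chosen, so a maximum likelihood estimator could be added without changing `fill_missing`.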
In one possible implementation, the initial classification model is a naive Bayes model or a decision tree model.
A data classification method, comprising:
obtaining sign data to be classified, where the sign data to be classified includes at least one index value;
if the sign data to be classified has a missing index value, performing data filling on the missing index value, inputting the filled sign data to be classified into a sign data classification model, and obtaining the classification result of the sign data to be classified;
if the sign data to be classified has no missing index value, inputting the sign data to be classified into the sign data classification model and obtaining the classification result of the sign data to be classified;
where the sign data classification model is generated according to the classification model generation method described above.
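The two branches of the data classification method can be sketched as a single function. The names below (`classify_sign_data`, `toy_model`, `toy_fill`) are hypothetical stand-ins, and representing a missing index value as `None` is an assumption; a real model and filling routine would replace the toy ones.

```python
def classify_sign_data(record, model, fill):
    """Classify one record, filling missing index values (None) first."""
    if any(value is None for value in record.values()):
        record = fill(record)
    return model(record)

# Hypothetical stand-ins for a trained model and a filling routine.
def toy_model(record):
    return "y1" if record["item1"] < 0.5 else "y2"

def toy_fill(record):
    return {key: (0.5 if value is None else value)
            for key, value in record.items()}

complete = {"item1": 0.2, "item2": 0.9}
incomplete = {"item1": None, "item2": 0.9}
result_complete = classify_sign_data(complete, toy_model, toy_fill)
result_incomplete = classify_sign_data(incomplete, toy_model, toy_fill)
```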
In one possible implementation, the method further comprises:
normalizing the index values in the sign data to be classified.
In one possible implementation, performing data filling on the missing index value if the sign data to be classified has a missing index value comprises:
if the sign data to be classified has a missing index value, determining the index item corresponding to the missing index value in the sign data to be classified;
generating multiple data filling results for that index item by applying multiple data filling algorithms to the index values of that index item in the original sign data;
calculating the average of the multiple data filling results and filling the missing index value of that index item in the sign data to be classified with the average, to obtain the filled sign data to be classified.
A classification model generation device, comprising:
an acquiring unit, configured to obtain original sign data, where each piece of original sign data includes at least one index value;
a searching unit, configured to search the original sign data for sign data with missing index values;
a filling unit, configured to perform data filling on the missing index values in the sign data with missing index values, to generate filled sign data;
a generation unit, configured to use the sign data without missing index values in the original sign data, together with the filled sign data, as training data, and to train an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
In one possible implementation, the device further comprises:
a normalization unit, configured to normalize the index values in each piece of original sign data before data filling is performed on the missing index values in the sign data with missing index values.
In one possible implementation, the filling unit specifically comprises:
a determining subunit, configured to determine, for any piece of sign data with a missing index value, the index item corresponding to the missing index value;
a generating subunit, configured to generate multiple data filling results for that index item by applying multiple data filling algorithms to the index values of that index item in the other original sign data;
a filling subunit, configured to calculate the average of the multiple data filling results and fill the missing index value of that index item with the average, to generate the filled sign data.
In one possible implementation, the data filling algorithms include any several of maximum likelihood estimation, mean filling, and approximate completion.
In one possible implementation, the initial classification model is a naive Bayes model or a decision tree model.
A data classification device, comprising:
an acquiring unit, configured to obtain sign data to be classified, where the sign data to be classified includes at least one index value;
a first obtaining unit, configured to, if the sign data to be classified has a missing index value, perform data filling on the missing index value, input the filled sign data to be classified into a sign data classification model, and obtain the classification result of the sign data to be classified;
a second obtaining unit, configured to, if the sign data to be classified has no missing index value, input the sign data to be classified into the sign data classification model and obtain the classification result of the sign data to be classified;
where the sign data classification model is generated by the classification model generation device described above.
In one possible implementation, the device further comprises:
a normalization unit, configured to normalize the index values in the sign data to be classified.
In one possible implementation, the first obtaining unit specifically comprises:
a determining subunit, configured to determine the index item corresponding to the missing index value in the sign data to be classified;
a generating subunit, configured to generate multiple data filling results for that index item by applying multiple data filling algorithms to the index values of that index item in the original sign data;
a filling subunit, configured to calculate the average of the multiple data filling results and fill the missing index value of that index item in the sign data to be classified with the average, to obtain the filled sign data to be classified.
A computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to execute the classification model generation method or the data classification method described above.
A computer program product which, when run on a terminal device, causes the terminal device to execute the classification model generation method or the data classification method described above.
It can be seen that the embodiments of the present application have the following beneficial effects:
After obtaining the original sign data, the embodiments of the present application fill in the missing index values so that every piece of original sign data is complete, and use the result as training data. An initial classification model is trained with the training data and its classification labels to generate a sign data classification model. The generated model can classify any sign data, and the classification results can assist doctors in diagnosis. The embodiments thus mine the internal relationships in a large volume of original sign data to build a sign data classification model, improving the utilization of the original sign data.
Brief description of the drawings
Fig. 1 is a flow chart of a classification model generation method provided by an embodiment of the present application;
Fig. 2 is a flow chart of a data filling method provided by an embodiment of the present application;
Fig. 3 is a flow chart of classification model training provided by an embodiment of the present application;
Fig. 4 is a flow chart of a data classification method provided by an embodiment of the present application;
Fig. 5 is a flow chart of another data filling method provided by an embodiment of the present application;
Fig. 6 is a flow chart of data classification provided by an embodiment of the present application;
Fig. 7 is a structure diagram of a classification model generation device provided by an embodiment of the present application;
Fig. 8 is a structure diagram of a data classification device provided by an embodiment of the present application.
Detailed description
To make the above objects, features, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings and specific implementations.
To aid understanding of the technical solution provided by the present application, its research background is briefly explained first.
With the continued development of the computer field, data mining has attracted great attention in many domains. Data mining refers to using algorithms to search large amounts of data for the information and knowledge hidden in them, and converting the acquired information and knowledge into useful information that guides subsequent development.
In the existing medical field, however, the large amount of medical data generated after patient visits is not effectively mined to assist doctors in medical diagnosis. Much of this data is left unused, which wastes medical data.
On this basis, the present application proposes a classification model generation method and device and a data classification method and device. A large volume of original sign data is used as training data, its internal relationships are mined, and a sign data classification model is built. The model is then used to classify sign data to be classified, and the classification results are provided to doctors to assist medical diagnosis, improving the utilization of the original sign data.
The classification model generation method provided by the embodiments of the present application is described below with reference to the accompanying drawings.
Fig. 1 shows a flow chart of a classification model generation method provided by an embodiment of the present application. As shown in Fig. 1, the method comprises:
S101: Obtain original sign data, where each piece of original sign data includes at least one index value.
In this embodiment, classifying sign data first requires generating a classification model by training, and generating the classification model first requires obtaining original sign data. Original sign data refers to the sign data generated when a patient undergoes a medical examination. A piece of sign data may include at least one index value, such as blood pressure, blood glucose, body temperature, or heart rate.
In practical applications, a large amount of original sign data can be obtained to ensure the accuracy of the trained classification model. A medical examination can be divided into multiple items, such as a blood routine test, a urine routine test, and biochemical analysis, and a patient may choose one item or several. The original sign data may therefore include one class of sign data corresponding to a single examination item, or multiple classes corresponding to multiple items. For example, the original sign data may include the sign data generated by a urine routine test, a blood routine test, or biochemical analysis alone, or the sign data generated when the patient undergoes both a urine routine test and a blood routine test.
For ease of understanding, take original sign data consisting of index values from a blood routine test as an example. Multiple pieces of original sign data, such as Data1, Data2, Data3, and Data4, can be obtained. Each piece may include blood routine index values such as average hemoglobin amount, mean hemoglobin concentration, platelet distribution width, and erythrocyte distribution width, as shown in Table 1. Note that average hemoglobin amount, mean hemoglobin concentration, platelet distribution width, erythrocyte distribution width, and so on are index items, and the specific value corresponding to an index item is an index value in the original sign data.
Table 1: Original sign data
It can be understood that, in the original sign data generated by medical examinations, a piece of original sign data may include numeric index values, for example an average hemoglobin amount of 30.2 pg or a platelet distribution width of 11.3 fl. It may also include non-numeric index values, for example the urine protein and urine glucose results of a urine routine test, which characterize the patient's signs as negative or positive.
The embodiments of the present application do not limit the index values included in the original sign data; they can be selected according to the actual situation.
S102: Search the original sign data for sign data with missing index values.
In this embodiment, after the original sign data is obtained, each piece must be checked to find the sign data with missing index values, so that S103 can be executed on it. To find this sign data, first collect the index items corresponding to the index values in all of the original sign data, which gives the full set of index items. If a piece of original sign data lacks any of these index items, it is sign data with a missing index value. For example, suppose original sign data A has index values for index items 1, 2, and 3, original sign data B has index values for index items 2, 3, and 4, and original sign data C has index values for index items 3, 4, and 5. Then A, B, and C are all sign data with missing index values: A lacks the index values of index items 4 and 5, B lacks those of index items 1 and 5, and C lacks those of index items 1 and 2.
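The lookup in S102 can be sketched as follows, using the A, B, C example above. The function name `find_incomplete` and the index values themselves are hypothetical; only the index item numbers come from the example.

```python
def find_incomplete(records):
    """Return the full set of index items and, per record, the items it lacks."""
    all_items = set()
    for items in records.values():
        all_items.update(items)  # iterating a dict yields its keys (index items)
    missing = {name: sorted(all_items - set(items))
               for name, items in records.items()
               if all_items - set(items)}
    return all_items, missing

# Index items 1-5 follow the example in the text; the values are made up.
records = {
    "A": {1: 0.5, 2: 0.1, 3: 0.9},
    "B": {2: 0.3, 3: 0.7, 4: 0.2},
    "C": {3: 0.4, 4: 0.6, 5: 0.8},
}
all_items, missing = find_incomplete(records)
# missing == {"A": [4, 5], "B": [1, 5], "C": [1, 2]}
```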
S103: Perform data filling on the missing index values in the sign data with missing index values, to generate filled sign data.
In practical applications, training on sign data with missing index values could make the generated sign data classification model inaccurate. To avoid this, once S102 finds sign data with missing index values, that sign data is filled: the missing index values are supplemented to obtain a complete piece of original sign data, and S104 is then executed. The specific way of filling sign data with missing index values is described in detail in a subsequent embodiment.
S104: Use the sign data without missing index values in the original sign data, together with the filled sign data, as training data, and train an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
In this embodiment, S103 fills the sign data with missing index values and yields the filled sign data. The sign data without missing index values and the filled sign data are then used together as training data, and the initial classification model is trained according to the training data and the data classification label corresponding to each piece of training data, which yields the sign data classification model.
In a specific application, each piece of obtained original sign data can be classified in advance and assigned a data classification label according to the classification result. When the original sign data is used as training data, training is then performed according to the training data and its corresponding data classification labels to generate the sign data classification model.
A data classification label can characterize the patient constitution corresponding to a piece of sign data. Different patients may have different constitutions, and different constitutions may produce different sign data in a medical examination. In a specific implementation, data classification labels can be identified by different characters, for example label 1 for constitution 1, label 2 for constitution 2, label 3 for constitution 3, and so on.
As the above embodiment shows, after obtaining the original sign data, the embodiments of the present application fill in the missing index values to produce complete training data, and train an initial classification model with the training data and its classification labels to generate a sign data classification model. The generated model can classify any sign data, and the classification results can assist doctors in diagnosis. The embodiments thus mine the internal relationships in a large volume of original sign data to build a sign data classification model, improving the utilization of the original sign data.
In one possible implementation of the embodiments of the present application, the initial classification model can be a naive Bayes model or a decision tree model. The processes of training a naive Bayes model and a decision tree model according to the training data and the data classification label corresponding to each piece of training data are described in turn below.
I. Naive Bayes model
In this embodiment, naive Bayes theory means calculating the probability that one event occurs from the probability of an event that has already occurred. Its mathematical expression is given by formula (1):
$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)} \qquad (1)$$
where P(Y) is the prior probability of event Y, and P(Y | X) is the posterior probability of event Y, that is, the probability that event Y occurs after event X has occurred.
On this basis, in the practical application of the present application, Y denotes the data classification label and X denotes the training data. Suppose there are 4 data classification labels, y1, y2, y3, and y4, and 5 pieces of original sign data are obtained, each including 4 index values, with classification labels y1, y2, y3, y3, and y4 respectively. The 1st piece of sign data is [x1 x2 x3 x4] with data classification label y1; the 2nd is [x5 x6 x7 x8] with label y2; the 3rd is [x9 x10 x11 x12] with label y3; the 4th is [x13 x14 x15 x16] with label y3; and the 5th is [x17 x18 x19 x20] with label y4.
Then the training data and labels can be written as:
$$X = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \\ x_5 & x_6 & x_7 & x_8 \\ x_9 & x_{10} & x_{11} & x_{12} \\ x_{13} & x_{14} & x_{15} & x_{16} \\ x_{17} & x_{18} & x_{19} & x_{20} \end{bmatrix}, \qquad Y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_3 \\ y_4 \end{bmatrix}$$
For ease of understanding, let X1, X2, X3, X4, and X5 denote the 1st to 5th pieces of sign data. The purpose of training in this embodiment is to calculate P(y1 | X1), P(y2 | X2), P(y3 | X3), P(y3 | X4), and P(y4 | X5), using formula (2):
$$P(y_k \mid X_i) = \frac{P(X_i \mid y_k)\,P(y_k)}{P(X_i)} \qquad (2)$$
Assuming the index values in Xi are mutually independent, formula (2) can be written as:
$$P(y_k \mid X_i) = \frac{P(x_a \mid y_k)\,P(x_b \mid y_k)\,P(x_c \mid y_k)\,P(x_d \mid y_k)\,P(y_k)}{P(X_i)} \qquad (3)$$
where the specific values of xa, xb, xc, and xd depend on Xi. For example, when Xi = X1 they are x1, x2, x3, and x4, and formula (3) becomes:
$$P(y_k \mid X_1) = \frac{P(x_1 \mid y_k)\,P(x_2 \mid y_k)\,P(x_3 \mid y_k)\,P(x_4 \mid y_k)\,P(y_k)}{P(X_1)}$$
Since the denominator depends only on the input data and is a constant, it can be dropped, and formula (3) becomes:
$$P(y_k \mid X_i) \propto P(y_k)\,P(x_a \mid y_k)\,P(x_b \mid y_k)\,P(x_c \mid y_k)\,P(x_d \mid y_k) \qquad (4)$$
In practical applications, the probability is calculated for every possible value of the data classification label Y, and the result with the largest probability is output. That is, when the data is X1, the probabilities of y1, y2, y3, and y4 are calculated, and the label with the largest probability is taken as the data classification label of X1:
$$\hat{y} = \arg\max_{y_k} \; P(y_k)\,P(x_1 \mid y_k)\,P(x_2 \mid y_k)\,P(x_3 \mid y_k)\,P(x_4 \mid y_k)$$
Taking P(y1 | X1) as an example, formula (4) becomes:
$$P(y_1 \mid X_1) \propto P(y_1)\,P(x_1 \mid y_1)\,P(x_2 \mid y_1)\,P(x_3 \mid y_1)\,P(x_4 \mid y_1)$$
As the above formulas show, obtaining P(y1 | X1) requires P(Y = y1) and P(X1 | y1). How to obtain each probability value is described below.
(1) If no prior probability P(Y = y1) is available, it is obtained as P(Y = y_k) = m_k / m, where m_k is the number of pieces of sign data whose data classification label is y_k, and m is the total number of data classification labels across all of the obtained sign data, that is, the number of pieces of sign data obtained.
(2) When obtaining P(X1 | y1), the attribute type of the sign data X1 must be distinguished. When the sign data takes discrete values, P(X1 | y1) is obtained using the following formula:
where x_j is an index value in the sign data X_i, m_k is the number of pieces of sign data whose data classification label is y_k, n is the number of index values included in each piece of sign data, and δ is a preset positive integer.
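The formula for the discrete case is not reproduced in this text. One form consistent with the symbols defined above is additive (Laplace-style) smoothing, shown here as an assumption rather than as the exact formula of the application. The count $m_{k,j}$ is a symbol introduced only for this illustration: the number of pieces of sign data with label $y_k$ whose index value equals $x_j$.

```latex
P(x_j \mid y_k) = \frac{m_{k,j} + \delta}{m_k + n\,\delta},
\qquad
P(X_i \mid y_k) = \prod_{j} P(x_j \mid y_k)
```

Under this form, a positive δ keeps the estimate above zero for an index value that never appears with label $y_k$ in the training data, so a single unseen value does not force the whole product to zero.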
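When the sign data takes continuous values, P(X1 | y1) is obtained from the likelihood of each index value x_j, using the following formula:
$$P(x_j \mid y_k) = \frac{1}{\sqrt{2\pi\sigma_k^{2}}}\,\exp\!\left(-\frac{(x_j - \mu_k)^{2}}{2\sigma_k^{2}}\right)$$
where μ_k and σ_k² are, respectively, the mean and variance of all X_i when Y = y_k.
Once P(y_k | X_i) has been obtained through the above calculation, the naive Bayes model is trained, which generates the sign data classification model.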
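The continuous case can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation of the application: it assumes a Gaussian likelihood per index item, estimates the prior as m_k / m, works with log probabilities to avoid underflow, and applies a small variance floor (an assumption added here) so that a class with a single record does not produce a zero variance. The names `train` and `predict` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def train(records, labels):
    """Estimate the prior m_k / m and per-index mean and variance for each label."""
    groups = defaultdict(list)
    for record, label in zip(records, labels):
        groups[label].append(record)
    model = {}
    for label, rows in groups.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # Assumed floor: keeps the variance positive for single-record classes.
        variances = [max(sum((v - m) ** 2 for v in c) / len(c), 1e-6)
                     for c, m in zip(columns, means)]
        model[label] = (len(rows) / len(records), means, variances)
    return model

def predict(model, record):
    """Return the label maximizing log P(y_k) + sum_j log P(x_j | y_k)."""
    def log_score(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for x, mu, var in zip(record, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return score
    return max(model, key=log_score)

# Hypothetical normalized sign data with two index values per record.
records = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = ["y1", "y1", "y2", "y2"]
model = train(records, labels)
label_low = predict(model, [0.15, 0.15])
label_high = predict(model, [0.8, 0.8])
```

Dropping the denominator P(X_i), as in the derivation above, is what allows `predict` to compare unnormalized scores directly.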
II. Decision tree model
A decision tree, also called a classification tree, is a common classification method. Its basic principle is to input a large number of training samples, each with attribute values and a predetermined class. By learning from these samples, the decision tree yields a classifier that can correctly classify newly input data.
For ease of understanding, take a binary decision tree as an example. Suppose the obtained original sign data is as shown in Table 2: 4 pieces of original sign data in total, each including 4 index values. This example only illustrates how to train a decision tree model and does not limit the original sign data obtained.
Table 2: Decision tree training data
As Table 2 shows, each piece of sign data obtained has four index values and an already determined data classification label. The training process can then be as follows:
(1) Judge whether the average hemoglobin amount is within threshold range A. If so, determine that the traditional Chinese medicine (TCM) constitution class corresponding to this piece of sign data is y1; if not, go to (2).
(2) Judge whether the mean hemoglobin concentration is within threshold range B. If so, determine that the TCM constitution class corresponding to this piece of sign data is y2; if not, go to (3).
(3) Judge whether the platelet distribution width is within threshold range C. If so, determine that the TCM constitution class corresponding to this piece of sign data is y3; if not, go to (4).
(4) Judge whether the erythrocyte distribution width is within threshold range D. If so, determine that the TCM constitution class corresponding to this piece of sign data is y4; if not, the data can be labeled "other" to distinguish it from the four classes above.
The specific settings of A, B, C, and D can be chosen with reference to the index values in the obtained original sign data. After the above learning and training, the sign data classification model is generated.
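The chain of checks in steps (1) to (4) can be sketched as a rule list. The threshold ranges below are hypothetical placeholders for A, B, C, and D on normalized values, since the text gives no concrete ranges, and the keys `mch`, `mchc`, `pdw`, and `rdw` are shorthand chosen here for the four index items. The sketch shows only how the checks are applied to a record; it does not learn the thresholds.

```python
# Hypothetical threshold ranges A-D as (low, high) pairs on normalized values.
RULES = [
    ("mch", (0.00, 0.25), "y1"),   # (1) average hemoglobin amount, range A
    ("mchc", (0.25, 0.50), "y2"),  # (2) mean hemoglobin concentration, range B
    ("pdw", (0.50, 0.75), "y3"),   # (3) platelet distribution width, range C
    ("rdw", (0.75, 1.00), "y4"),   # (4) erythrocyte distribution width, range D
]

def classify(record, rules=RULES):
    """Apply the checks in order; the first range that matches decides the class."""
    for item, (low, high), label in rules:
        if low <= record[item] <= high:
            return label
    return "other"

first = classify({"mch": 0.1, "mchc": 0.9, "pdw": 0.9, "rdw": 0.9})
third = classify({"mch": 0.9, "mchc": 0.9, "pdw": 0.6, "rdw": 0.5})
none_matched = classify({"mch": 0.9, "mchc": 0.9, "pdw": 0.9, "rdw": 0.5})
```

Reordering the entries of `RULES` changes which index item serves as the first judgment condition.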
Note that the above training process uses the average hemoglobin amount as the first judgment condition. The mean hemoglobin concentration or the platelet distribution width can also serve as the first judgment condition, or the mean hemoglobin concentration and the platelet distribution width can serve jointly as the first judgment condition. This embodiment places no limitation on this.
Through the above embodiments, either of the two initial classification models can be trained with the training data to quickly generate a sign data classification model, which is then used to classify sign data to be classified.
In the original sign data provided in Table 1 and Table 2, the index values contained in each piece of original sign data have different dimensions (units). For example, the average hemoglobin amount is measured in pg and the mean hemoglobin concentration in g/L, so the two index values are distributed over different orders of magnitude, which is unfavorable for the subsequent training of the preliminary classification model. Therefore, in the embodiments of the present application, after the original sign data is obtained, the index values in the original sign data are first normalized, so that data of different dimensions are converted into data of the same kind and the inconvenience caused by differing dimensions is eliminated. In addition, the sign data with missing index values needs to be filled. To make the filled data more accurate, in some possible implementations the index values in each piece of original sign data are normalized before data filling is performed on the index values missing from the sign data with missing index values.
In this example, normalization is performed over the index values corresponding to the same index item in the original sign data. The index item may be, for example, the average hemoglobin amount, the mean hemoglobin concentration, the platelet distribution width or the erythrocyte distribution width.
In a specific implementation, the index values in the original sign data may be normalized using the 0-1 standardization method. 0-1 standardization, also known as deviation standardization, applies a linear transformation to the original data so that the results fall in the interval [0, 1]. The conversion function is:
$$x^{*} = \frac{x - \min}{\max - \min}$$
where x* is the normalized value, x is an index value of a given index item, max is the maximum value of that index item over all the original sign data, and min is the minimum value of that index item over all the original sign data.
Taking the mean hemoglobin concentration as an example, max is 345 and min is 320. After conversion by the above conversion function, the mean hemoglobin concentration in Data1 is normalized to 0.76, that in Data2 to 0, that in Data3 to 0.6, and that in Data4 to 1.
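The 0-1 standardization above can be sketched in a few lines. The raw concentrations used here are hypothetical values chosen only so that, with max 345 and min 320, they reproduce the normalized results quoted above; the function name is likewise an assumption.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Linearly rescale the values of one index item into [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical; map them to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical mean hemoglobin concentrations (g/L) for Data1 to Data4.
concentrations = [339.0, 320.0, 335.0, 345.0]
normalized = min_max_normalize(concentrations)
```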
It should be noted that other standardization methods may also be used for normalization, such as min-max standardization; the embodiments of the present application do not limit the specific manner of normalization.
In addition, normalization can also be performed when the index value corresponding to an index item is of a non-numeric type. In a specific implementation, a value may be assigned to the non-numeric index value, and the assigned index value is then normalized. If the index value corresponding to an index item has only two possible results, for example negative or positive, positive may be set to 1 and negative to 0, and no subsequent normalization is needed.
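As a rough illustration of assigning values to non-numeric index values, the sketch below encodes a two-valued result directly as 0 or 1 and, separately, assigns numbers to a graded result before normalizing it. The graded scale ("-", "+", "++", "+++") and its numeric assignment are hypothetical examples, not taken from the application.

```python
from typing import Dict, List


def encode(values: List[str], assignment: Dict[str, float]) -> List[float]:
    """Replace each non-numeric index value with its assigned number."""
    return [assignment[v] for v in values]


# Two possible results: already in [0, 1], so no further normalization.
binary = encode(["negative", "positive", "negative"],
                {"negative": 0.0, "positive": 1.0})

# Hypothetical graded result: assign numbers first, then normalize to [0, 1].
grades = encode(["-", "++", "+"], {"-": 0.0, "+": 1.0, "++": 2.0, "+++": 3.0})
low, high = min(grades), max(grades)
normalized_grades = [(g - low) / (high - low) for g in grades]
```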
Through this embodiment, the index values in the original sign data can be normalized with a normalization algorithm so that each index value lies in the interval [0, 1], which facilitates subsequent classification and improves processing speed.
As can be seen from the above embodiments, data filling needs to be performed on the index values missing from the sign data with missing index values. The filling algorithm provided by the embodiments of the present application is described below with reference to the accompanying drawings.
Referring to Fig. 2, which shows a flowchart of a data filling method provided by an embodiment of the present application. As shown in Fig. 2, the method may include:
S201: For any piece of sign data with a missing index value, determine the index item corresponding to the index value missing from that piece of sign data.
In this embodiment, the sign data with missing index values can be found through S102. For any piece of sign data with a missing index value, the index item corresponding to the missing index value needs to be determined; the index item may be the name of the inspection item corresponding to the index value. For example, suppose that in Table 1 the index value corresponding to the erythrocyte distribution width is missing from Data2; it can then be determined that the index item corresponding to the missing index value in Data2 is the erythrocyte distribution width. If the index value corresponding to the platelet distribution width is missing from Data3, it can be determined that the index item corresponding to the missing index value in Data3 is the platelet distribution width.
S202: According to the index values of this index item in the other original sign data, generate multiple data filling results for the index item using multiple data filling algorithms.
After S201 determines the index item of the missing index value in each piece of sign data with a missing index value, data filling is performed using the index values of that index item in the other original sign data in which the index value is not missing. For example, in Table 1 the index item corresponding to the missing index value in Data2 is the erythrocyte distribution width, and the index value of this index item is not missing from Data1, Data3 and Data4, so the index values of the erythrocyte distribution width in Data1, Data3 and Data4 can be used for data filling. The index item corresponding to the missing index value in Data3 is the platelet distribution width, and the index value of this index item is not missing from Data1, Data2 and Data4, so the index values of the platelet distribution width in Data1, Data2 and Data4 can be used for filling. In a specific implementation, to ensure the accuracy of the filling result, multiple data filling algorithms may be used; each data filling algorithm generates one data filling result for the index item, so that multiple data filling results are obtained.
It should be noted that when the missing index value is non-numeric, after the index item corresponding to the missing index value is determined through S201, values may be assigned to the index values of that index item in the other original sign data, and data filling is then performed according to the assigned index values of that index item in the other original sign data. In an optional implementation, the data filling algorithms may include any two or more of the maximum likelihood estimation method, the average value filling method and the approximate polishing method. That is, when data filling is performed, any two of the filling algorithms may be selected to generate two data filling results, or all three filling algorithms may be selected to generate three data filling results.
The maximum likelihood estimation method is a statistical method based on the maximum likelihood principle; it provides a way to estimate model parameters from given observation data. In this embodiment, when the missingness type is missing at random and the model is assumed to be valid for the complete sample, maximum likelihood estimation of the unknown parameters can be performed through the marginal distribution of the observed data. In this case, maximum likelihood parameter estimation is computed with the Expectation-Maximization (EM) algorithm, an iterative algorithm for computing the maximum likelihood estimate or the posterior distribution from incomplete data. Each iteration alternates between two steps:
(1) Given the observed data and the parameter estimates obtained in the previous iteration, compute the conditional expectation of the log-likelihood function corresponding to the complete data;
(2) Determine the parameter values by maximizing this expected log-likelihood function, and use them in the next iteration.
The EM algorithm iterates between the above two steps until convergence, that is, the iterative process ends when the change in the parameters between two successive iterations is less than a preset threshold.
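The two alternating steps can be illustrated with a small EM sketch. The application does not specify a model, so this example makes its own assumptions: a missing index item y is modeled jointly with a second, fully observed index item x as a bivariate Gaussian, and the function name, tolerance and sample values are all hypothetical.

```python
from typing import List, Optional


def em_fill(x: List[float], y: List[Optional[float]],
            tol: float = 1e-10, max_iter: int = 1000) -> List[float]:
    """Fill missing y values by EM under a bivariate Gaussian assumption."""
    n = len(x)
    observed = [v for v in y if v is not None]
    # x is fully observed, so its mean and variance are fixed.
    mu_x = sum(x) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    # Initial estimates for y come from its observed values only.
    mu_y = sum(observed) / len(observed)
    var_y = sum((v - mu_y) ** 2 for v in observed) / len(observed)
    cov_xy = 0.0
    for _ in range(max_iter):
        # E-step: expected y and y^2 for each record given x and the
        # current parameter estimates (observed values are kept as-is).
        slope = cov_xy / var_x
        resid_var = var_y - cov_xy ** 2 / var_x
        e_y = [v if v is not None else mu_y + slope * (xi - mu_x)
               for xi, v in zip(x, y)]
        e_y2 = [v ** 2 if v is not None else ey ** 2 + resid_var
                for v, ey in zip(y, e_y)]
        # M-step: parameters that maximize the expected log-likelihood.
        new_mu_y = sum(e_y) / n
        new_cov_xy = sum(xi * ey for xi, ey in zip(x, e_y)) / n - mu_x * new_mu_y
        new_var_y = sum(e_y2) / n - new_mu_y ** 2
        change = max(abs(new_mu_y - mu_y), abs(new_cov_xy - cov_xy),
                     abs(new_var_y - var_y))
        mu_y, cov_xy, var_y = new_mu_y, new_cov_xy, new_var_y
        if change < tol:  # parameter change below the preset threshold
            break
    slope = cov_xy / var_x
    return [v if v is not None else mu_y + slope * (xi - mu_x)
            for xi, v in zip(x, y)]


# Hypothetical data in which the observed pairs follow y = 2x + 1.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_values = [3.0, 5.0, None, 9.0, None, 13.0]
filled_y = em_fill(x_values, y_values)
```

Because the observed pairs in this toy data lie exactly on a line, the filled values approach the values on that line; with real sign data the result depends on how well the Gaussian assumption holds.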
Average value filling method: the attributes of the obtained original sign data may first be divided into numeric attributes and non-numeric attributes. When the missing index value is a numeric attribute, it is filled with the average of the index values of that index item in the other original sign data. When the missing index value is a non-numeric attribute, the value that occurs most frequently among the index values of that index item in the other original sign data is found according to the mode principle in statistics, and that most frequent value is used to fill the missing index value.
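A minimal sketch of the average value filling method follows, assuming the observed values of one index item are passed in as a list; the function name and sample values are illustrative only.

```python
from collections import Counter
from statistics import mean
from typing import List, Union

Value = Union[float, str]


def average_fill(observed: List[Value]) -> Value:
    """Return the mean for numeric values, or the mode for non-numeric ones."""
    if all(isinstance(v, (int, float)) for v in observed):
        return mean(observed)
    # Non-numeric attribute: use the most frequently occurring value.
    return Counter(observed).most_common(1)[0][0]


numeric_fill = average_fill([13.0, 12.0, 14.0])
categorical_fill = average_fill(["negative", "positive", "negative"])
```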
Approximate polishing method: among the other original sign data, the piece of sign data most similar to the sign data with the missing index value is found, and the index value of that index item in the most similar sign data is used to fill the missing index value. Here, two pieces of sign data may be regarded as most similar when their corresponding data classification labels are the same, or when the differences between their index values for the other index items are within a preset threshold range.
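The threshold-based notion of similarity can be sketched as below. The sketch assumes normalized index values and a hypothetical threshold of 0.1; choosing the candidate with the smallest total difference when several records qualify is an added assumption, since the text does not say how ties are resolved.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def approximate_fill(target: Record, others: List[Record], item: str,
                     threshold: float = 0.1) -> Optional[float]:
    """Return the value of `item` from the most similar record, or None."""
    shared = [k for k, v in target.items() if k != item and v is not None]
    best_value, best_gap = None, None
    for other in others:
        if other.get(item) is None or any(other.get(k) is None for k in shared):
            continue  # this record cannot supply a value
        gaps = [abs(other[k] - target[k]) for k in shared]
        if any(g > threshold for g in gaps):
            continue  # not similar enough on the other index items
        total = sum(gaps)
        if best_gap is None or total < best_gap:
            best_value, best_gap = other[item], total
    return best_value


target = {"mch": 0.50, "mchc": 0.60, "pdw": None}
others = [
    {"mch": 0.52, "mchc": 0.58, "pdw": 0.40},
    {"mch": 0.90, "mchc": 0.10, "pdw": 0.95},
    {"mch": 0.45, "mchc": 0.68, "pdw": 0.30},
]
pdw_fill = approximate_fill(target, others, "pdw")
```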
S203: Calculate the average of the multiple data filling results, fill the index value of the index item missing from the sign data with the missing index value with this average, and generate the filled sign data.
Multiple data filling results are obtained through S202. To improve the accuracy of the finally filled index value, the multiple data filling results may be averaged, and the average is used as the index value missing from the sign data with the missing index value, thereby generating the filled sign data.
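Step S203 amounts to averaging the candidate values and writing the average into the record. In the sketch below the three filling results are hypothetical numbers standing in for the outputs of three filling algorithms, and the record layout is an assumption.

```python
from statistics import mean
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def fill_with_average(record: Record, item: str, results: List[float]) -> Record:
    """Return a copy of the record with `item` set to the mean of the results."""
    filled = dict(record)
    filled[item] = mean(results)
    return filled


# Hypothetical record with a missing erythrocyte distribution width.
data2 = {"mean_hemoglobin_concentration": 320.0,
         "erythrocyte_distribution_width": None}
filling_results = [12.8, 13.0, 13.5]  # one result per filling algorithm
filled_data2 = fill_with_average(
    data2, "erythrocyte_distribution_width", filling_results)
```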
With the filling algorithm provided by this embodiment, the required filling data can be generated quickly and accurately, so that the sign data with missing index values is converted into complete sign data. This provides complete training samples for subsequent training and improves the accuracy of training.
To facilitate understanding of the training process of the sign data classification model in the present application, refer to Fig. 3, which shows a flowchart of sign data classification model training provided by an embodiment of the present application. As shown in Fig. 3, in the training process the original sign data is first obtained. The original sign data is then checked for duplicates, and duplicate sign data is removed to reduce redundancy. The deduplicated sign data is normalized to obtain normalized sign data. The sign data with missing index values is then found in the normalized sign data, and multiple data filling algorithms are used to fill it, generating the filled sign data. Finally, the sign data without missing index values in the original sign data and the filled sign data are used as training data to train the preliminary classification model and generate the sign data classification model.
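The data preparation stages of Fig. 3 (deduplicate, normalize, fill) can be sketched end to end as follows. This is a simplified illustration: it uses only average value filling rather than several algorithms, omits the final model training, and all names and sample records are assumptions.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def deduplicate(records: List[Record]) -> List[Record]:
    """Drop exact duplicate records while keeping the original order."""
    seen, unique = set(), []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def normalize(records: List[Record]) -> List[Record]:
    """Min-max normalize each index item, leaving missing values as None."""
    out = [dict(r) for r in records]
    for item in records[0]:
        observed = [r[item] for r in records if r[item] is not None]
        low, span = min(observed), max(observed) - min(observed)
        for r in out:
            if r[item] is not None:
                r[item] = (r[item] - low) / span if span else 0.0
    return out


def fill_missing(records: List[Record]) -> List[Record]:
    """Fill each missing value with the mean of that index item."""
    out = [dict(r) for r in records]
    for item in records[0]:
        observed = [r[item] for r in records if r[item] is not None]
        average = sum(observed) / len(observed)
        for r in out:
            if r[item] is None:
                r[item] = average
    return out


def prepare_training_data(records: List[Record]) -> List[Record]:
    return fill_missing(normalize(deduplicate(records)))


raw = [
    {"a": 1.0, "b": 10.0},
    {"a": 1.0, "b": 10.0},  # duplicate of the first record
    {"a": 3.0, "b": None},  # missing index value
    {"a": 2.0, "b": 20.0},
]
training_data = prepare_training_data(raw)
```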
The above is a specific implementation of the classification model generation method provided by the embodiments of the present application. Based on the sign data classification model in the above embodiments, the embodiments of the present application also provide a data classification method.
Referring to Fig. 4, which shows a flowchart of a data classification method provided by an embodiment of the present application. As shown in Fig. 4, the method may include:
S401: Obtain sign data to be classified, where the sign data to be classified includes at least one index value.
In this embodiment, when the classification result corresponding to a piece of sign data needs to be determined, the sign data to be classified is first obtained. The sign data may include one or more index values, such as a blood pressure value, a blood glucose value or a heart rate.
S402: Judge whether an index value is missing from the obtained sign data to be classified. If so, execute S403; if not, execute S404.
In this example, after the sign data to be classified is obtained, it needs to be checked to judge whether any index value is missing, so as to avoid inputting sign data with a missing index value into the sign data classification model and affecting the classification result. Therefore, when an index value is missing from the obtained sign data to be classified, S403 is executed. If no index value is missing from the obtained sign data to be classified, S404 is executed.
S403: Perform data filling on the missing index value, input the filled sign data to be classified into the sign data classification model, and obtain the classification result of the sign data to be classified.
In this embodiment, when it is determined that an index value is missing from the obtained sign data to be classified, the missing index value is filled; the specific filling method is described later.
In a specific application, the filled sign data to be classified is input into the sign data classification model as input data, so that the sign data classification model obtains a classification result according to the input data. For example, the classification result may characterize the constitution of the patient to whom the sign data to be classified corresponds. The sign data classification model of this embodiment is the classification model generated by training in the above embodiments.
S404: Input the sign data to be classified into the sign data classification model, and obtain the classification result of the sign data to be classified.
After it is determined through S402 that the sign data to be classified is complete, the data to be classified is input into the sign data classification model as input data, so that the sign data classification model can judge the type of the sign data to be classified according to the input data and obtain the classification result.
As can be seen from the above embodiments, the sign data to be classified is first obtained, and it is then judged whether an index value is missing from it. If so, the missing index value is filled and the filled sign data to be classified is input into the sign data classification model; if not, the sign data to be classified is input directly into the sign data classification model. The classification result of the sign data to be classified is thereby obtained, so that the sign data to be classified is classified quickly. The classification result can assist doctors in diagnosis and improves the utilization of the original sign data.
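The branch between S403 and S404 can be sketched as a single function. Here the trained sign data classification model is represented by an arbitrary callable, filling uses only the average of the original data for brevity, and the names, the stand-in model and the sample values are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[float]]


def classify(sample: Record, original: List[Record],
             model: Callable[[Record], str]) -> str:
    """Fill any missing index values (S403), then classify (S403/S404)."""
    filled = dict(sample)
    for item, value in sample.items():
        if value is None:  # S402: an index value is missing
            observed = [r[item] for r in original if r[item] is not None]
            filled[item] = sum(observed) / len(observed)
    return model(filled)


def toy_model(record: Record) -> str:
    """Hypothetical stand-in for the trained sign data classification model."""
    return "y1" if record["a"] < 0.5 else "other"


original_data = [{"a": 0.2, "b": 0.4}, {"a": 0.4, "b": 0.8}]
result_complete = classify({"a": 0.9, "b": 0.1}, original_data, toy_model)
result_filled = classify({"a": None, "b": 0.1}, original_data, toy_model)
```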
In this embodiment, the index values in the sign data to be classified may also be normalized. In practical applications, the index values in the sign data to be classified may be normalized according to the index values of the original sign data, so that index values of different dimensions are converted into data of the same kind. For the specific implementation, refer to the normalization method for the index values of the original sign data, which is not repeated here.
It should be noted that in this embodiment, if the index values in the original sign data were normalized during generation of the sign data classification model, the index values in the data to be classified also need to be normalized when the sign data classification model is used to classify it. If the index values of the original sign data were not normalized, the index values in the sign data to be classified do not need to be normalized. The input data is thereby kept consistent, which ensures that the classification model can accurately recognize the input data and guarantees the accuracy of the classification result.
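One way to normalize the sign data to be classified "according to the index values of the original sign data" is to reuse the minimum and maximum found in the original data. The sketch below assumes this reading; the function names are hypothetical, and no clipping is applied, so a new value outside the original range would fall outside [0, 1].

```python
from typing import Dict, List, Tuple

Bounds = Dict[str, Tuple[float, float]]


def fit_min_max(original: List[Dict[str, float]]) -> Bounds:
    """Record the (min, max) of every index item in the original data."""
    return {item: (min(r[item] for r in original),
                   max(r[item] for r in original))
            for item in original[0]}


def apply_min_max(sample: Dict[str, float], bounds: Bounds) -> Dict[str, float]:
    """Normalize a new sample with the bounds learned from the original data."""
    out = {}
    for item, value in sample.items():
        low, high = bounds[item]
        out[item] = (value - low) / (high - low) if high != low else 0.0
    return out


original_data = [{"mchc": 320.0}, {"mchc": 345.0}, {"mchc": 335.0}]
bounds = fit_min_max(original_data)
normalized_sample = apply_min_max({"mchc": 330.0}, bounds)
```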
For the case in which an index value is missing from the obtained sign data to be classified, the embodiments of the present application provide a method for filling the missing index value, which is described below with reference to the accompanying drawings.
Referring to Fig. 5, which shows another data filling method provided by an embodiment of the present application. As shown in Fig. 5, the method may include:
S501: If an index value is missing from the sign data to be classified, determine the index item corresponding to the index value missing from the sign data to be classified.
In this embodiment, when it is judged through S402 that an index value is missing from the sign data to be classified, the index item corresponding to that index value needs to be determined, so that the index item can be used for subsequent filling.
S502: According to the index values corresponding to this index item in the original sign data, generate multiple data filling results for the index item using multiple data filling algorithms.
After the index item corresponding to the missing index value is determined through S501, multiple filling algorithms can be used for data filling. The data used for filling is the index values corresponding to this index item in the original sign data; these index values serve as the parameters of the multiple filling algorithms, and multiple data filling results for the index item are obtained by calculation. For example, if the average hemoglobin amount is missing from the obtained sign data to be classified, the values of the average hemoglobin amount in the 4 pieces of data in Table 1 can be used to obtain multiple data filling results through multiple filling algorithms.
The multiple filling algorithms may be any two or more of the maximum likelihood estimation method, the average value filling method and the approximate polishing method. For the specific implementation of each algorithm, refer to the above embodiments; it is not repeated here.
S503: Calculate the average of the multiple data filling results, fill the index value of the index item missing from the sign data to be classified with this average, and obtain the filled sign data to be classified.
In this embodiment, multiple data filling results are obtained through S502. The multiple data filling results are summed and averaged, and the average is used as the missing index value, so that complete sign data to be classified is obtained. The sign data to be classified is then input into the sign data classification model as input data to obtain the classification result.
With the filling algorithm provided by this embodiment, data filling can be performed quickly and accurately on data to be classified that has a missing index value, which ensures the completeness of the sign data to be classified and improves the accuracy of the final classification.
To facilitate understanding of the classification process for the sign data to be classified in the present application, refer to Fig. 6, which shows a flowchart of data classification provided by an embodiment of the present application. As shown in Fig. 6, in the data classification process the sign data to be classified is first obtained and then normalized. It is then judged whether an index value is missing from the sign data to be classified. If so, data filling is performed, and the filled sign data to be classified is input into the sign data classification model; if not, the sign data to be classified is input into the sign data classification model. Finally, the classification result is output.
Based on the above method embodiments, the present application also provides a classification model generating device, which is described below with reference to the accompanying drawings.
Referring to Fig. 7, which shows a structural diagram of a classification model generating device provided by an embodiment of the present application. The device may include:
Acquiring unit 701, configured to obtain original sign data, where each piece of original sign data includes at least one index value;
Searching unit 702, configured to search the original sign data for sign data with missing index values;
Filling unit 703, configured to perform data filling on the index values missing from the sign data with missing index values, and generate the filled sign data;
Generation unit 704, configured to use the sign data without missing index values in the original sign data and the filled sign data as training data, and to train a preliminary classification model according to the training data and the data classification label corresponding to each piece of training data, so as to generate a sign data classification model.
In some possible implementations of the present application, the device further includes:
Normalization unit, configured to normalize the index values in each piece of original sign data before data filling is performed on the index values missing from the sign data with missing index values.
In some possible implementations of the present application, the filling unit specifically includes:
Determining subunit, configured to determine, for any piece of sign data with a missing index value, the index item corresponding to the index value missing from that piece of sign data;
Generating subunit, configured to generate multiple data filling results for the index item using multiple data filling algorithms, according to the index values of the index item in the other original sign data;
Filling subunit, configured to calculate the average of the multiple data filling results, fill the index value of the index item missing from the sign data with the missing index value with this average, and generate the filled sign data.
In some possible implementations of the present application, the data filling algorithms include any two or more of the maximum likelihood estimation method, the average value filling method and the approximate polishing method.
In some possible implementations of the present application, the preliminary classification model is a naive Bayes model or a decision tree model.
As can be seen from the above embodiments, after the original sign data is obtained, the index values in the original sign data are filled to completeness to generate training data, and the preliminary classification model is trained using the training data and the classification labels of the training data to generate the sign data classification model. The generated sign data classification model can classify any sign data, and the classification results can assist doctors in diagnosis. The embodiments of the present application thus mine the internal relationships within a large amount of original sign data to establish the sign data classification model, which improves the utilization of the original sign data.
Referring to Fig. 8, which shows a structural diagram of a data classification device provided by an embodiment of the present application. The device may include:
Acquiring unit 801, configured to obtain sign data to be classified, where the sign data to be classified includes at least one index value;
First obtaining unit 802, configured to, if an index value is missing from the sign data to be classified, perform data filling on the missing index value, input the filled sign data to be classified into a sign data classification model, and obtain the classification result of the sign data to be classified;
Second obtaining unit 803, configured to, if no index value is missing from the sign data to be classified, input the sign data to be classified into the sign data classification model, and obtain the classification result of the sign data to be classified;
The sign data classification model is generated by the above classification model generating device.
In some possible implementations of the present application, the device further includes:
Normalization unit, configured to normalize the index values in the sign data to be classified.
In some possible implementations of the present application, the first obtaining unit specifically includes:
Determining subunit, configured to determine the index item corresponding to the index value missing from the sign data to be classified;
Generating subunit, configured to generate multiple data filling results for the index item using multiple data filling algorithms, according to the index values of the index item in the original sign data;
Filling subunit, configured to calculate the average of the multiple data filling results, fill the index value of the index item missing from the sign data to be classified with this average, and obtain the filled sign data to be classified.
In addition, the embodiments of the present application also provide a computer-readable storage medium in which instructions are stored. When the instructions are run on a terminal device, the terminal device is caused to execute the above classification model generation method or the above data classification method.
The embodiments of the present application also provide a computer program product. When the computer program product is run on a terminal device, the terminal device is caused to execute the above classification model generation method or the above data classification method.
As can be seen from the above embodiments, the sign data to be classified is first obtained, and it is then judged whether an index value is missing from it. If so, the missing index value is filled and the filled sign data to be classified is input into the sign data classification model; if not, the sign data to be classified is input directly into the sign data classification model. The classification result of the sign data to be classified is thereby obtained, so that the sign data to be classified is classified quickly. The classification result can assist doctors in diagnosis and improves the utilization of the original sign data.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Since the systems or devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, they are described relatively briefly; for the relevant details, refer to the description of the method part.
It should be understood that in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: only A exists, only B exists, or both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of those items, including any combination of single items or plural items. For example, at least one (item) of a, b or c may indicate: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b and c may be single or multiple.
It should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. a kind of method of generating classification model, which is characterized in that the described method includes:
Original sign data of health is obtained, every original sign data of health includes at least one index value;
There are the sign datas that index value lacks for lookup in the original sign data of health;
To described there are the index value progress data filling lacked in the sign data of index value missing, the body after filling up is generated Levy data;
By in the original sign data of health be not present index value missing sign data and it is described fill up after sign data make For training data, according to the training data and the corresponding data classification label of every training data to preliminary classification mould Type is trained, and generates sign data disaggregated model.
2. the method according to claim 1, wherein the method also includes:
Before the index value lacked in the sign data lacked there are index value carries out data filling, by every institute The index value stated in original sign data of health is normalized.
3. The method according to claim 1 or 2, characterized in that performing data filling on the index value missing from the sign data in which an index value is missing, to generate the filled sign data, comprises:
for any piece of sign data in which an index value is missing, determining the index item corresponding to the index value missing from that sign data;
generating a plurality of data filling results for the index item by applying a plurality of data filling algorithms to the index values of that index item in the other original sign data;
calculating the average of the plurality of data filling results, and filling the missing index value of the index item in that sign data with the average, to generate the filled sign data.
4. The method according to claim 3, characterized in that the data filling algorithms comprise any plurality of a maximum likelihood estimation method, a mean filling method, and an approximate filling method.
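The averaging of several filling results described in claims 3 and 4 might look like the following sketch. It combines only two algorithms: a mean fill and a nearest-record fill, the latter being an assumed interpretation of the "approximate filling method". Maximum likelihood estimation is omitted because the claims do not specify the distribution to fit. All names and data are hypothetical.

```python
from statistics import mean


def mean_fill(records, target, j):
    """Mean of index item j over the other records that have it."""
    return mean(r[j] for r in records if r is not target and r[j] is not None)


def nearest_fill(records, target, j):
    """Value of index item j taken from the most similar other record."""
    def distance(other):
        # Squared distance over index items present in both records.
        return sum(
            (a - b) ** 2
            for k, (a, b) in enumerate(zip(target, other))
            if k != j and a is not None and b is not None
        )

    donors = [r for r in records if r is not target and r[j] is not None]
    return min(donors, key=distance)[j]


def fill_record(records, target, algorithms=(mean_fill, nearest_fill)):
    """Fill each missing index value with the average of all algorithm results."""
    filled = list(target)
    for j, value in enumerate(target):
        if value is None:
            filled[j] = mean(algo(records, target, j) for algo in algorithms)
    return filled


records = [[1.0, 10.0], [5.0, 30.0], [2.0, None]]
filled = fill_record(records, records[2])
```

Here the mean fill proposes 20.0, the nearest-record fill proposes 10.0 (the first record is closest), and the missing value is filled with their average.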
5. The method according to claim 1, characterized in that the initial classification model is a naive Bayes model or a decision tree model.
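As an illustration of the naive Bayes option, the sketch below trains a Gaussian naive Bayes classifier on already filled and normalized sign data. The Gaussian likelihood, the variance smoothing term `eps`, and the sample data are assumptions; the claim names only the model family.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def train_gaussian_nb(samples, labels, eps=1e-9):
    """Estimate a class prior and per-index mean/variance for each label."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        prior = len(rows) / len(samples)
        # eps keeps the variance positive when a column is constant.
        stats = [(mean(col), pvariance(col) + eps) for col in zip(*rows)]
        model[y] = (prior, stats)
    return model


def predict(model, x):
    """Return the label with the highest log posterior for record x."""
    def log_posterior(y):
        prior, stats = model[y]
        score = math.log(prior)
        for value, (mu, var) in zip(x, stats):
            score -= 0.5 * math.log(2 * math.pi * var)
            score -= (value - mu) ** 2 / (2 * var)
        return score

    return max(model, key=log_posterior)


# Hypothetical normalized training data with classification labels.
samples = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
labels = ["normal", "normal", "abnormal", "abnormal"]
nb_model = train_gaussian_nb(samples, labels)
```

Working in log space avoids numerical underflow when many index items are multiplied together.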
6. A data classification method, characterized in that the method comprises:
obtaining sign data to be classified, the sign data to be classified comprising at least one index value;
if an index value is missing from the sign data to be classified, performing data filling on the missing index value, and inputting the filled sign data to be classified into a sign data classification model to obtain a classification result of the sign data to be classified;
if no index value is missing from the sign data to be classified, inputting the sign data to be classified into the sign data classification model to obtain the classification result of the sign data to be classified;
wherein the sign data classification model is generated by the method for generating a classification model according to any one of claims 1-5.
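The two branches of claim 6 can be sketched as a single function that fills a record only when needed and then passes it to the model. The fill values, the threshold rule standing in for a trained sign data classification model, and all names are hypothetical placeholders.

```python
def classify_record(record, model, fill):
    """Fill missing index values if any, then classify the record."""
    if any(value is None for value in record):
        record = fill(record)
    return model(record)


# Hypothetical per-index fill values, e.g. derived from the training data.
COLUMN_DEFAULTS = [0.5, 0.6]


def fill_with_defaults(record):
    """Stand-in fill step: use a fixed value for each missing index item."""
    return [COLUMN_DEFAULTS[j] if v is None else v for j, v in enumerate(record)]


def toy_model(record):
    """Stand-in classifier: a threshold on the mean of normalized values."""
    return "abnormal" if sum(record) / len(record) > 0.5 else "normal"


result_incomplete = classify_record([0.9, None], toy_model, fill_with_defaults)
result_complete = classify_record([0.1, 0.2], toy_model, fill_with_defaults)
```

Passing `model` and `fill` as callables keeps the branch logic independent of which classifier and which filling strategy are actually used.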
7. A classification model generating device, characterized in that the device comprises:
an acquiring unit, configured to obtain original sign data, each piece of original sign data comprising at least one index value;
a searching unit, configured to search the original sign data for sign data in which an index value is missing;
a filling unit, configured to perform data filling on the index value missing from the sign data in which an index value is missing, to generate filled sign data;
a generating unit, configured to use the sign data in the original sign data in which no index value is missing, together with the filled sign data, as training data, and to train an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
8. A data classification device, characterized in that the device comprises:
an acquiring unit, configured to obtain sign data to be classified, the sign data to be classified comprising at least one index value;
a first obtaining unit, configured to, if an index value is missing from the sign data to be classified, perform data filling on the missing index value and input the filled sign data to be classified into a sign data classification model to obtain a classification result of the sign data to be classified;
a second obtaining unit, configured to, if no index value is missing from the sign data to be classified, input the sign data to be classified into the sign data classification model to obtain the classification result of the sign data to be classified;
wherein the sign data classification model is generated by the classification model generating device according to claim 7.
9. A computer-readable storage medium, characterized in that instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device is caused to execute the method for generating a classification model according to any one of claims 1-5 or the data classification method according to claim 6.
10. A computer program product, characterized in that when the computer program product is run on a terminal device, the terminal device is caused to execute the method for generating a classification model according to any one of claims 1-5 or the data classification method according to claim 6.
CN201810712862.5A 2018-06-29 2018-06-29 A kind of method of generating classification model, data classification method and device Pending CN109102896A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810712862.5A CN109102896A (en) 2018-06-29 2018-06-29 A kind of method of generating classification model, data classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810712862.5A CN109102896A (en) 2018-06-29 2018-06-29 A kind of method of generating classification model, data classification method and device

Publications (1)

Publication Number Publication Date
CN109102896A (en) 2018-12-28

Family

ID=64845413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810712862.5A Pending CN109102896A (en) 2018-06-29 2018-06-29 A kind of method of generating classification model, data classification method and device

Country Status (1)

Country Link
CN (1) CN109102896A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597444A (en) * 2020-05-13 2020-08-28 北京达佳互联信息技术有限公司 Searching method, searching device, server and storage medium
CN112052914A (en) * 2020-09-29 2020-12-08 中国银行股份有限公司 Classification model prediction method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090319244A1 (en) * 2002-10-24 2009-12-24 Mike West Binary prediction tree modeling with many predictors and its uses in clinical and genomic applications
CN106156809A (en) * 2015-04-24 2016-11-23 阿里巴巴集团控股有限公司 For updating the method and device of disaggregated model
CN107480721A (en) * 2017-08-21 2017-12-15 上海中信信息发展股份有限公司 A kind of ox only ill data analysing method and device
CN107595243A (en) * 2017-07-28 2018-01-19 深圳和而泰智能控制股份有限公司 A kind of illness appraisal procedure and terminal device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111597444A (en) * 2020-05-13 2020-08-28 北京达佳互联信息技术有限公司 Searching method, searching device, server and storage medium
CN111597444B (en) * 2020-05-13 2024-03-05 北京达佳互联信息技术有限公司 Searching method, searching device, server and storage medium
CN112052914A (en) * 2020-09-29 2020-12-08 中国银行股份有限公司 Classification model prediction method and device
CN112052914B (en) * 2020-09-29 2023-12-01 中国银行股份有限公司 Classification model prediction method and device

Similar Documents

Publication Publication Date Title
Iftikhar et al. An evolution based hybrid approach for heart diseases classification and associated risk factors identification
Ambekar et al. Disease risk prediction by using convolutional neural network
Karthiga et al. Early prediction of heart disease using decision tree algorithm
CN112951413B (en) Asthma diagnosis system based on decision tree and improved SMOTE algorithm
Khezri et al. A fuzzy rule-based expert system for the prognosis of the risk of development of the breast cancer
Higa Diagnosis of breast cancer using decision tree and artificial neural network algorithms
Andreeva Data modelling and specific rule generation via data mining techniques
CN109213871A (en) Patient information knowledge mapping construction method, readable storage medium storing program for executing and terminal
CN109102896A (en) A kind of method of generating classification model, data classification method and device
CN111243753A (en) Medical data-oriented multi-factor correlation interactive analysis method
Sudharson et al. Performance analysis of enhanced adaboost framework in multifacet medical dataset
Azar et al. Inductive learning based on rough set theory for medical decision making
CN110610766A (en) Apparatus and storage medium for deriving probability of disease based on symptom feature weight
Christopher et al. Knowledge-based systems and interestingness measures: Analysis with clinical datasets
CN114048320B (en) Multi-label international disease classification training method based on course learning
Bindushree Prediction of cardiovascular risk analysis and performance evaluation using various data mining techniques: A review
Selvan et al. An Image Processing Approach for Detection of Prenatal Heart Disease
TWI599896B (en) Multiple decision attribute selection and data discretization classification method
Magoev et al. Application of clustering methods for detecting critical acute coronary syndrome patients
Shruthi et al. A Method for Predicting and Classifying Fetus Health Using Machine Learning
Rabiha et al. Diabetes Classification Using Support Vector Machine: Binary Classification Model
Özkan et al. Effect of data preprocessing on ensemble learning for classification in disease diagnosis
Ganesh et al. Diabetes Prediction using Logistic Regression and Feature Normalization
Abd et al. Diagnose of chronic kidney diseases by using Naive Bayes algorithm
Gancheva et al. X-Ray Images Analytics Algorithm based on Machine Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181228