CN110197720A

CN110197720A - Diabetes prediction method and apparatus, storage medium, and computer device

Info

Publication number: CN110197720A
Application number: CN201910185079.2A
Authority: CN
Inventors: 金晓辉; 阮晓雯; 徐亮; 肖京
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-03-12
Filing date: 2019-03-12
Publication date: 2019-09-03
Also published as: WO2020181805A1

Abstract

This application discloses a diabetes prediction method and apparatus, a storage medium, and a computer device. It relates to the field of computer technology and addresses the problem that the prior art can only judge whether a user has diabetes but cannot judge the severity of the illness. The method includes: obtaining sample user data from original health records and electronic medical record data; creating a numeric regression prediction model according to the user features in the sample user data; using the regression prediction model to determine a first physical examination index value for the target user's fasting blood glucose and a second physical examination index value for the target user's blood glucose at a preset duration after a meal; and determining the degree of illness of the target user according to the first physical examination index value and/or the second physical examination index value. The application is suitable for predicting diabetes and for determining the degree of the illness.

Description

Diabetes prediction method and apparatus, storage medium, and computer device

Technical field

This application relates to the field of computer technology, and in particular to a diabetes prediction method and apparatus, a storage medium, and a computer device.

Background

Diabetes is a group of metabolic diseases characterized by hyperglycemia. Once it develops, it damages large blood vessels and microvessels, endangers multiple sites such as the heart, brain, kidneys, peripheral nerves, eyes, and feet, and can be accompanied by multiple complications, so strengthening diabetes prediction work is essential. With the progress of science and technology, disease diagnosis is no longer limited to a doctor's analysis, and predicting diabetes with artificial intelligence is more in keeping with current trends.

At present, the common industry approach to diabetes prediction is to collect diabetes cases, compare the data of diabetic patients with the data of the healthy population, and build a 0-1 classification model from the patients' feature dimensions to judge whether a user has diabetes.

However, the existing approach can only judge whether a patient has diabetes and cannot judge the severity of the illness. The diagnostic result is therefore incomplete, matching control and treatment cannot be given according to the degree of illness, and the patient's condition may worsen as a result.

Summary of the invention

In view of this, this application provides a diabetes prediction method and apparatus, a storage medium, and a computer device. The main purpose is to solve the problem that, when diabetes is predicted with a constructed 0-1 classification model, it is only possible to judge whether the user has diabetes and not the severity of the illness, which leaves the diagnostic result incomplete.

According to one aspect of the application, a diabetes prediction method is provided, the method comprising:

obtaining sample user data from original health records and electronic medical record data;

creating a numeric regression prediction model according to the user features in the sample user data;

using the regression prediction model to determine a first physical examination index value for the target user's fasting blood glucose and a second physical examination index value for the target user's blood glucose at a preset duration after a meal;

determining the degree of illness of the target user according to the first physical examination index value and/or the second physical examination index value.

According to another aspect of the application, a diabetes prediction apparatus is provided, the apparatus comprising:

an acquiring unit, configured to obtain sample user data from original health records and electronic medical record data;

a creating unit, configured to create a numeric regression prediction model according to the user features in the sample user data;

a judging unit, configured to use the regression prediction model to determine a first physical examination index value for the target user's fasting blood glucose and a second physical examination index value for the target user's blood glucose at a preset duration after a meal;

a determining unit, configured to determine the degree of illness of the target user according to the first physical examination index value and/or the second physical examination index value.

According to yet another aspect of the application, a non-volatile readable storage medium is provided, on which a computer program is stored; when the program is executed by a processor, the above diabetes prediction method is implemented.

According to a further aspect of the application, a computer device is provided, comprising a non-volatile readable storage medium, a processor, and a computer program that is stored on the non-volatile readable storage medium and can run on the processor; the processor implements the above diabetes prediction method when executing the program.

Through the above technical solution, and compared with the current practice of predicting diabetes with a constructed 0-1 classification model, the diabetes prediction method and apparatus, storage medium, and computer device provided by this application add regression prediction models for fasting blood glucose and 2-hour postprandial blood glucose on top of the existing diabetes prediction model. The regression prediction models determine the first physical examination index value for the target user's fasting blood glucose and the second physical examination index value for the target user's blood glucose at a preset duration after a meal. These index values can be used to determine whether the target user has diabetes and, further, to judge the target user's degree of illness.

The above is only an overview of the technical solution of this application. So that the technical means of the application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the application are more apparent, specific embodiments of the application are set out below.

Brief description of the drawings

The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:

Fig. 1 shows a flow diagram of a diabetes prediction method provided by an embodiment of the application;

Fig. 2 shows a flow diagram of another diabetes prediction method provided by an embodiment of the application;

Fig. 3 shows a structural diagram of a diabetes prediction apparatus provided by an embodiment of the application;

Fig. 4 shows a structural diagram of another diabetes prediction apparatus provided by an embodiment of the application.

Detailed description of the embodiments

The application is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments of the application and the features in them may be combined with one another.

To address the problem that a constructed 0-1 classification model cannot judge the severity of diabetes from user data, this embodiment provides a diabetes prediction method. As shown in Fig. 1, the method includes:

101. Obtain sample user data from original health records and electronic medical record data.

The sample user data may include patient visit data, physical examination index data, medication data, health disclosure data, and the like.

102. Create a numeric regression prediction model according to the user features in the sample user data.

The user features may include multiple feature dimensions such as fasting blood glucose and 2-hour postprandial blood glucose, blood pressure, stratum corneum lipids, insulin, body mass index (BMI), diabetes hereditary information, age, and diagnostic result.

In a specific embodiment, the regression prediction model can be built from several different decision-tree-based model frameworks. That is, following the idea of ensemble learning, multiple decision-tree-based prediction models are combined to improve the accuracy of the prediction result. A decision tree is one of the simpler supervised learning algorithms in machine learning and is itself a prediction model: it represents a mapping between object attributes and object values. Each node in the tree represents an object, each branch represents a possible attribute value, and each leaf node corresponds to the value of the object represented by the path from the root node to that leaf node. A decision tree has a single output; if multiple outputs are needed, independent decision trees can be built to handle the different outputs. Decision tree algorithms include ID3, C4.5, and CART. What they have in common is that they are all greedy algorithms; they differ in the splitting metric. For example, ID3 uses information gain as its metric, whereas C4.5 uses the gain ratio.
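The difference between the two splitting metrics named above can be illustrated with a minimal sketch. The function names and the toy data (`bmi_band`, `diagnosis`) are assumptions made for illustration and do not come from the application.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature_values, labels):
    """ID3 metric: entropy of the labels minus the entropy left after splitting."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remaining = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remaining


def gain_ratio(feature_values, labels):
    """C4.5 metric: information gain normalized by the entropy of the split itself."""
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # the feature takes a single value, so it cannot split anything
    return information_gain(feature_values, labels) / split_info


# Hypothetical toy data: a categorical feature and a 0/1 diagnostic result.
bmi_band = ["high", "high", "normal", "normal"]
diagnosis = [1, 1, 0, 0]
best_gain = information_gain(bmi_band, diagnosis)
best_ratio = gain_ratio(bmi_band, diagnosis)
```

The gain ratio penalizes features with many distinct values, which is why C4.5 is usually described as less biased toward high-cardinality attributes than ID3.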

The regression prediction model created in this way reflects well the fasting blood glucose values and 2-hour postprandial blood glucose values that correspond to sample users with different blood pressure, stratum corneum lipids, insulin, BMI, diabetes hereditary information, age, diagnostic results, and so on.

103. Use the regression prediction model to determine a first physical examination index value for the target user's fasting blood glucose and a second physical examination index value for the target user's blood glucose at a preset duration after a meal.

The target user is a user whose diabetic condition needs to be predicted. The first physical examination index value corresponds to the test result for the target user's fasting blood glucose, and the second physical examination index value corresponds to the test result for the target user's blood glucose at a preset duration after a meal. The preset duration can be set according to actual needs.

In this embodiment, given the fasting blood glucose values and 2-hour postprandial blood glucose values that the model reflects for sample users with different features, the features of the target user are matched against the features of the sample users, and the fasting blood glucose value and 2-hour postprandial blood glucose value corresponding to the matched sample user features are found.

104. Determine the degree of illness of the target user according to the first physical examination index value and/or the second physical examination index value.

In a specific application scenario, the first physical examination index value can be used to judge whether the target user's fasting blood glucose is normal, and the second physical examination index value can be used to judge whether the target user's blood glucose at the preset duration after a meal is normal. When the first and/or second physical examination index value is abnormal, the user can be judged to have diabetes, and the degree of illness can be judged further by comparing the values with critical values.
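A minimal sketch of this comparison step follows. The application does not state its critical values, so everything numeric here is an assumption: the cut-offs of 7.0 mmol/L (fasting) and 11.1 mmol/L (2-hour postprandial) are commonly cited diagnostic values, and the severity bands are purely hypothetical placeholders.

```python
FASTING_CUTOFF = 7.0        # mmol/L, assumed critical value (not from the application)
POSTPRANDIAL_CUTOFF = 11.1  # mmol/L, assumed critical value (not from the application)

# Hypothetical severity bands, expressed as the ratio of a value to its cut-off.
SEVERITY_BANDS = [(1.5, "severe"), (1.2, "moderate"), (1.0, "mild")]


def grade_illness(fasting=None, postprandial=None):
    """Grade the illness from the first and/or second physical examination index value."""
    ratios = []
    if fasting is not None:
        ratios.append(fasting / FASTING_CUTOFF)
    if postprandial is not None:
        ratios.append(postprandial / POSTPRANDIAL_CUTOFF)
    if not ratios:
        raise ValueError("at least one index value is required")
    worst = max(ratios)  # the more abnormal of the two values decides the grade
    for lower_bound, label in SEVERITY_BANDS:
        if worst >= lower_bound:
            return label
    return "not diabetic"
```

Accepting either value alone mirrors the "and/or" wording of step 104; a real system would take its thresholds and grades from clinical guidance rather than from this sketch.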

With the diabetes prediction method of this embodiment, a numeric regression prediction model can be created from the user features in the sample user data, the model can be used to determine the first physical examination index value for the target user's fasting blood glucose and the second physical examination index value for the blood glucose at a preset duration after a meal, and the first and/or second physical examination index value can then be used to determine whether the target user is ill and how severe the illness is. This makes the prediction of the condition more accurate and the diagnosis more complete, so that matching treatment can be given promptly and effectively according to the stage of the diabetes and the progression of the disease can be contained.

Further, as a refinement and extension of the specific implementation of the above embodiment, and in order to fully describe the implementation process, another diabetes prediction method is provided. As shown in Fig. 2, the method includes:

201. Obtain sample user data from original health records and electronic medical record data.

For example, about 100 sample user records with complete user features are obtained from the original health records and electronic medical record data, and the sample user data is then analyzed and processed further.

Two prediction modes are described below. One predicts using the fasting blood glucose physical examination index value (the process shown in steps 202a to 205a); the other predicts using the 2-hour postprandial blood glucose physical examination index value (the process shown in steps 202b to 205b).

202a. Take the fasting blood glucose value in the sample users' user features as label information Y1, take the sample users' target feature data other than the fasting blood glucose value and the 2-hour postprandial blood glucose value as feature information X1, and create a first model training set.

The user features are extracted from the sample user data using regular expressions. The target feature data includes at least one or more of the sample user's medical history data, hospitalization data, medication data, physical examination data, and health disclosure data; for example, it may include information on the user's medical history, hospitalization records, medication, physical examinations, and health disclosures.

The first model training set created in this way contains each item of feature information X1 and its corresponding label information Y1, that is, the fasting blood glucose values corresponding to sample users with different medical histories, hospitalization records, medication, physical examination results, health disclosures, and so on.
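Step 202a can be sketched as follows. The record layout, the field names, and the regular expression are hypothetical; the application only states that regular expressions are used and that both blood glucose values are excluded from the feature information.

```python
import re

# Hypothetical free-text records; the "name=value" layout is an assumption.
RECORDS = [
    "age=52; bmi=27.4; hospital_days=3; fasting_glucose=7.8; glucose_2h=12.1",
    "age=34; bmi=22.0; hospital_days=0; fasting_glucose=5.1; glucose_2h=6.9",
]

FIELD_PATTERN = re.compile(r"(\w+)=([0-9]+(?:\.[0-9]+)?)")
GLUCOSE_FIELDS = {"fasting_glucose", "glucose_2h"}  # never used as input features


def extract_features(record):
    """Pull every numeric name=value pair out of one record."""
    return {name: float(value) for name, value in FIELD_PATTERN.findall(record)}


def build_training_set(records, label_field):
    """Return (X, Y): features without either glucose value, and the chosen label."""
    features, labels = [], []
    for record in records:
        fields = extract_features(record)
        labels.append(fields[label_field])
        features.append({k: v for k, v in fields.items() if k not in GLUCOSE_FIELDS})
    return features, labels


X1, Y1 = build_training_set(RECORDS, "fasting_glucose")
```

Calling `build_training_set(RECORDS, "glucose_2h")` on the same records would give the corresponding set for the postprandial branch, since only the label field changes.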

203a. Using the first model training set, train a first identification model based on a preset regression prediction algorithm, the first identification model being used to determine the first physical examination index value.

The preset regression prediction algorithm is obtained by fusing four algorithms: random forest (Random Forest), gradient boosting decision tree (Gradient Boosting Decision Tree, GBDT), Xgboost, and LightGBM. The first identification model is evaluated with the mean absolute percentage error (MAPE). When the MAPE value of the first identification model is less than a preset standard comparison threshold, the first identification model is determined to meet the evaluation criterion. MAPE assesses the error between the model's predicted values and the true values. Common evaluation metrics for regression models are MAE, MSE, RMSE, and MAPE; MAE, MSE, and RMSE consider only the magnitude of the error, whereas MAPE also takes into account the ratio of the error to the true value. It is calculated as:

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{X_i - Y_i}{X_i}\right|$$

In the formula above, N is the total number of samples, X_i is the measured value of the i-th sample, and Y_i is the corresponding simulated (predicted) value. The smaller the MAPE value, the smaller the error between the model's predicted values and the true values. In a specific embodiment, the standard comparison threshold can be set according to the actual situation; when the MAPE is less than the standard comparison threshold, the first identification model meets the evaluation criterion. Predicting with an identification model that meets the evaluation criterion ensures the accuracy of the prediction result.
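The MAPE check can be written in a few lines. This is a minimal sketch; the function names and the example threshold are assumptions, since the application leaves the threshold to be set according to the actual situation.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent, over N paired values."""
    if not measured or len(measured) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((x - y) / x) for x, y in zip(measured, predicted))
    return 100.0 * total / len(measured)


def meets_evaluation_criterion(measured, predicted, threshold_percent):
    """True when the MAPE is below the preset standard comparison threshold."""
    return mape(measured, predicted) < threshold_percent


# Two samples, each predicted with a 10% relative error.
example_mape = mape([5.0, 10.0], [5.5, 9.0])
```

Because each error is divided by the measured value, MAPE is undefined when a measured value is zero; that does not arise for blood glucose readings but matters if the metric is reused elsewhere.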

The first identification model that meets the evaluation criterion determines the first mapping relationship between the feature information X1 and the label information Y1.

To illustrate how the first identification model is obtained by training with the preset regression prediction algorithm fused from the four algorithms above, as an optional implementation, the process may specifically include:

(1) Using random sampling, obtain a first, a second, a third, and a fourth training sample set from the first model training set. For example, randomly draw n training samples from the first model training set and repeat the draw for four rounds in total to obtain four training sets. (The four training sets are independent of one another, and their elements may repeat.)
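The four-round draw in step (1) might look like the sketch below. Sampling with replacement is an assumption chosen because the text allows repeated elements; the function name and the `seed` parameter are also illustrative.

```python
import random


def draw_training_subsets(training_set, n, rounds=4, seed=None):
    """Draw `rounds` independent subsets of n samples each, with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(training_set) for _ in range(n)] for _ in range(rounds)]


# Hypothetical training set of 100 sample indices.
subsets = draw_training_subsets(list(range(100)), n=30, seed=0)
```

Each subset is drawn independently of the others, so the same sample can appear in several subsets or more than once within one subset.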

(2) Based on the first training sample set, train a first classifier with the random forest algorithm; based on the second training sample set, train a second classifier with the GBDT algorithm; based on the third training sample set, train a third classifier with the Xgboost algorithm; based on the fourth training sample set, train a fourth classifier with the LightGBM algorithm.

Each training sample set contains different feature information X1 and the corresponding label information Y1. Each of the four classifiers is trained with its corresponding model training algorithm, and each trained classifier can predict a user's diabetes on its own: the feature data of the user under test (whose content corresponds to the feature information X1) is input, and the classifier finds the corresponding label information Y1.

The specific training process of the first classifier is as follows: 1. Draw m samples from the first training sample set by random sampling with replacement (bootstrapping); perform n rounds of sampling in total to generate n training sets. 2. Train n decision tree models, one on each of the n training sets (these can be built with existing algorithms such as ID3, C4.5, and CART). 3. For a single decision tree model, assuming the training samples have n features, choose the best feature for each split according to information gain, information gain ratio, or the Gini index. 4. Each tree keeps splitting in this way until all training samples at a node belong to the same class; no pruning is needed during splitting. 5. The decision trees generated form the random forest. For a regression problem, the final prediction is the mean of the values predicted by the trees, and this is the prediction result of the first classifier.
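The bootstrap-then-average structure of these five steps can be sketched in plain Python. To stay short, the sketch grows one-split regression trees (stumps) chosen by squared error rather than fully grown trees, which is a simplification and not the procedure described above; all names and the toy data are assumptions.

```python
import random
from statistics import mean


def fit_stump(rows):
    """Fit a one-split regression tree on rows of (feature_vector, target)."""
    best = None
    for feature in range(len(rows[0][0])):
        for threshold in sorted({x[feature] for x, _ in rows}):
            left = [y for x, y in rows if x[feature] <= threshold]
            right = [y for x, y in rows if x[feature] > threshold]
            if not left or not right:
                continue
            left_mean, right_mean = mean(left), mean(right)
            sse = (sum((y - left_mean) ** 2 for y in left)
                   + sum((y - right_mean) ** 2 for y in right))
            if best is None or sse < best[0]:
                best = (sse, feature, threshold, left_mean, right_mean)
    if best is None:  # no usable split: fall back to predicting the mean target
        constant = mean(y for _, y in rows)
        return lambda x: constant
    _, feature, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x[feature] <= threshold else right_mean


def fit_forest(rows, n_trees=25, m=None, seed=0):
    """Steps 1-2: draw n_trees bootstrap samples of size m and fit one tree on each."""
    rng = random.Random(seed)
    m = m or len(rows)
    return [fit_stump([rng.choice(rows) for _ in range(m)]) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Step 5 for regression: the mean of the values predicted by the trees."""
    return mean(tree(x) for tree in forest)


# Hypothetical rows: one feature and a fasting blood glucose target.
rows = [([1.0], 5.0), ([2.0], 5.2), ([8.0], 9.0), ([9.0], 9.4)]
forest = fit_forest(rows)
```

In practice a library implementation would normally replace this hand-written version; the sketch only shows where bootstrapping and averaging sit in the pipeline.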

The specific training process of the second classifier is as follows. The input is the second training sample set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, the maximum number of iterations $T$, and the loss function $L$. The output is the strong learner $f(x)$:

A) Initialize the weak learner:

$$f_0(x) = \arg\min_{c} \sum_{i=1}^{m} L(y_i, c)$$

where c is a constant.

B) For iteration rounds t = 1, 2, ..., T:

a) For samples i = 1, 2, ..., m, compute the negative gradient:

$$r_{ti} = -\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x) = f_{t-1}(x)}$$

b) Using $(x_i, r_{ti})$ (i = 1, 2, ..., m), fit a CART regression tree to obtain the t-th regression tree, whose leaf node regions are $R_{tj}$, j = 1, 2, ..., J, where J is the number of leaf nodes of regression tree t.

c) For leaf regions j = 1, 2, ..., J, compute the best-fit value:

$$c_{tj} = \arg\min_{c} \sum_{x_i \in R_{tj}} L(y_i, f_{t-1}(x_i) + c)$$

d) Update the strong learner:

$$f_t(x) = f_{t-1}(x) + \sum_{j=1}^{J} c_{tj}\, I(x \in R_{tj})$$

where $I(x \in R_{tj})$ indicates whether a sample belongs to the training sample set of leaf node region $R_{tj}$.

C) Obtain the expression of the strong learner f(x):

$$f(x) = f_T(x) = f_0(x) + \sum_{t=1}^{T} \sum_{j=1}^{J} c_{tj}\, I(x \in R_{tj})$$

Based on the above strong learner f(x), the second classifier is obtained by training.
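The loop in steps A) to C) can be sketched for one concrete case. The sketch assumes the squared-error loss L(y, f) = (y - f)^2 / 2, for which the negative gradient is simply the residual y - f and the best-fit value of a leaf is the mean residual in that leaf; it also assumes a single input feature and two-leaf trees. None of these choices is stated in the application.

```python
from statistics import mean


def fit_stump(xs, targets):
    """Fit a two-leaf regression tree on one feature: (threshold, left, right)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skipping the largest value keeps both leaves non-empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # only one distinct x value: a single constant leaf
        constant = mean(targets)
        return xs[0], constant, constant
    return best[1:]


def fit_gbdt(xs, ys, rounds=20):
    f0 = mean(ys)  # A) the constant that minimizes the squared-error loss
    predictions = [f0] * len(xs)
    trees = []
    for _ in range(rounds):  # B) iterate over rounds
        residuals = [y - p for y, p in zip(ys, predictions)]  # a) negative gradient
        threshold, left, right = fit_stump(xs, residuals)     # b), c) tree and leaf values
        trees.append((threshold, left, right))
        predictions = [p + (left if x <= threshold else right)  # d) update the learner
                       for x, p in zip(xs, predictions)]
    return f0, trees


def predict_gbdt(model, x):
    """C) the strong learner: the initial value plus every tree's leaf value."""
    f0, trees = model
    return f0 + sum(left if x <= threshold else right for threshold, left, right in trees)


# Hypothetical one-feature data with a fasting blood glucose target.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 5.5, 8.0, 9.0]
model = fit_gbdt(xs, ys)
```

With other loss functions, step c) generally needs its own minimization instead of a leaf mean, which is the main thing this simplified case hides.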

The specific training process of the third classifier is as follows:

A) Build the initial model according to the following formula:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F$$

where K is the number of trees, F represents the structure of each tree built, $x_i$ is the i-th sample, and the sum of the scores of $x_i$ on each tree is the predicted value of $x_i$, denoted $\hat{y}_i$.

The objective function of the initial model is

$$Obj = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

where $y_i$ is the actual sample value corresponding to $x_i$, $l$ is the loss function, and $\Omega(f_k)$ is the regularization term of the k-th tree.

B) As the trees grow, recursion over t rounds gives the final objective function

$$Obj^{(t)} \approx \sum_{j=1}^{T} \left[ \Big( \sum_{i \in I_j} g_i \Big) w_j + \frac{1}{2} \Big( \sum_{i \in I_j} h_i + \lambda \Big) w_j^2 \right] + \gamma T$$

where $I_j$ is the set of all samples contained in the j-th leaf, $w_j$ is the weight of the j-th leaf, and $\gamma T$ is the penalty term corresponding to the number of leaves T. In the usual Xgboost notation, $g_i$ and $h_i$ are the first- and second-order derivatives of the loss with respect to the previous round's prediction, and $\lambda$ is the regularization coefficient on the leaf weights.

C) Using the above initial model, substitute the data of the third training sample set and fit the model, and use the above final objective function to measure how well the model fits the training data (the loss is computed with the objective function; the smaller the loss, the better the model fits the training data), so that the bias and variance of the model meet the required standard. The model finally obtained by training is the third classifier.
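How an objective of this kind is evaluated for a single leaf can be sketched under stated assumptions. The sketch assumes the squared-error loss l = (y - ŷ)^2 / 2, so that the first-order derivative of each sample is ŷ - y and the second-order derivative is 1, and it uses the standard second-order result that the per-leaf term G·w + (H + λ)·w²/2 is minimized at w = -G / (H + λ). The names `reg_lambda` and `gamma` are illustrative parameters, not values from the application.

```python
def leaf_statistics(actual, previous_prediction):
    """Sum the first- and second-order derivatives of the samples in one leaf."""
    g_sum = sum(p - y for y, p in zip(actual, previous_prediction))
    h_sum = float(len(actual))  # each second derivative equals 1 for this loss
    return g_sum, h_sum


def optimal_leaf_weight(g_sum, h_sum, reg_lambda=1.0):
    """Weight that minimizes the quadratic per-leaf objective."""
    return -g_sum / (h_sum + reg_lambda)


def leaf_objective(g_sum, h_sum, weight, reg_lambda=1.0, gamma=0.0):
    """Per-leaf objective: G*w + (H + lambda) * w^2 / 2, plus gamma for the leaf itself."""
    return g_sum * weight + 0.5 * (h_sum + reg_lambda) * weight ** 2 + gamma


# Hypothetical leaf holding two samples that the previous round under-predicted.
g_sum, h_sum = leaf_statistics(actual=[7.0, 8.0], previous_prediction=[6.0, 6.0])
best_weight = optimal_leaf_weight(g_sum, h_sum)
```

A larger `reg_lambda` shrinks the leaf weight toward zero, and a larger `gamma` makes every extra leaf more expensive, which is how the objective trades fit against tree complexity.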

The specific training process of the fourth classifier is as follows:

A) Fit the data in the fourth training sample set with the existing LightGBM algorithm, and test the model obtained after each fit on a test set selected from the fourth training sample set to obtain the corresponding coefficient of determination and mean squared error;

B) When the coefficient of determination is greater than a certain threshold and the mean squared error is less than a certain threshold, the fitted model is determined to meet the standard, and the model that meets the standard is taken as the fourth classifier.
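The acceptance test in step B) can be sketched independently of LightGBM itself. The function names and the two thresholds passed in the example are assumptions; the application only says that each value is compared with "a certain threshold".

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def coefficient_of_determination(actual, predicted):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def meets_standard(actual, predicted, min_r2, max_mse):
    """Accept the fitted model only if both conditions of step B) hold."""
    return (coefficient_of_determination(actual, predicted) > min_r2
            and mean_squared_error(actual, predicted) < max_mse)


# Hypothetical test-set values and the fitted model's predictions for them.
actual = [5.0, 6.0, 7.0, 8.0]
predicted = [5.1, 5.9, 7.2, 7.8]
accepted = meets_standard(actual, predicted, min_r2=0.9, max_mse=0.05)
```

The coefficient of determination is undefined when every actual value is identical, so a test set needs at least two distinct target values.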

(3) Finally, fuse the first, second, third, and fourth classifiers using the bagging method to obtain the first identification model.

The fusion is carried out by voting, following the majority principle in which the minority defers to the majority. For example, after the medical history data, hospitalization data, medication data, physical examination data, and health disclosure data of the user under test are input into the four classifiers, if three of the four classifiers produce prediction results whose fasting blood glucose values meet the standard for diabetes, the user under test can be determined to have diabetes. If only one classifier produces a prediction result whose fasting blood glucose value meets the standard for diabetes and the fasting blood glucose values from the other three do not, the user under test can be determined not to have diabetes.

It should be noted that if the MAPE value of the first identification model is greater than the preset standard comparison threshold, the trained first identification model does not meet the evaluation criterion. In that case the first model training set can be re-partitioned to obtain a new first, second, third, and fourth training sample set. The first classifier is then retrained on the new first training sample set, the second classifier on the new second training sample set, the third classifier on the new third training sample set, and the fourth classifier on the new fourth training sample set. The four newly trained classifiers are fused again, and the MAPE value of the new first identification model is checked against the preset standard comparison threshold. If it is still greater than the threshold, the process of re-partitioning the model training set and retraining the classifiers is repeated until the MAPE value of the latest first identification model is less than the preset standard comparison threshold, that is, until the evaluation criterion is met.
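The re-partition-and-retrain loop can be sketched as below. `train_and_score` is a hypothetical callback standing in for "train four classifiers, fuse them, and compute the MAPE"; `max_attempts` is an added safeguard that the application does not mention, included so that the loop cannot run forever.

```python
import random


def train_until_acceptable(training_set, train_and_score, threshold,
                           max_attempts=10, seed=0):
    """Re-partition and retrain until the fused model's MAPE is below the threshold.

    train_and_score(subsets) must return (model, mape) for four training subsets.
    """
    rng = random.Random(seed)
    size = len(training_set)
    for attempt in range(1, max_attempts + 1):
        # Re-partition: draw four fresh training sample sets.
        subsets = [[rng.choice(training_set) for _ in range(size)] for _ in range(4)]
        model, score = train_and_score(subsets)
        if score < threshold:
            return model, score, attempt
    raise RuntimeError("no model met the MAPE threshold within max_attempts")
```

Raising an error after a bounded number of attempts is one possible design; logging the best model seen so far and returning it would be another.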

204a. Input the feature information of the target user into the first identification model and match it against the feature information X1 by similarity.

The feature information of the target user corresponds to the target user's target feature data other than the fasting blood glucose value and the 2-hour postprandial blood glucose value.

As an optional implementation, step 204a may specifically include: processing the feature information of the target user through data cleaning, feature extraction, missing value filling, and outlier handling to obtain feature information in the form of structured data; and matching the structured feature information against the feature information X1 by similarity.

The feature information of the target user sometimes contains useless data, missing values, and/or outliers; that is, it is unstructured data that is not suitable for direct prediction with the first identification model. Therefore, the target user's feature information can first be cleaned to remove useless data (for example, the user's current residence and registered household location are removed, and only medical history data, hospitalization data, medication data, physical examination data, health disclosure data, and the like are kept). Feature extraction is then performed on the retained data (for example, extracting the medical history data, hospitalization data, medication data, physical examination data, and health disclosure data). If the extracted feature data contains missing values, they can be filled with 0 (for example, if height or weight is missing from the user's physical examination data, it is filled with 0, which keeps the data comparable when it is later matched against the feature information X1 in the model and avoids unmatchable-feature errors). If the extracted feature data contains outliers, they can be corrected with reference to the actual situation (for example, a hospitalization duration of 99999 days is clearly abnormal; the correct duration can be calculated from the hospitalization start and end times and the value corrected accordingly).
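The four preprocessing steps can be sketched on a single record. The field names, the list of kept fields, and the plausibility limit for a hospital stay are all assumptions made for illustration.

```python
from datetime import date

KEPT_FIELDS = ("height_cm", "weight_kg", "hospital_days")  # hypothetical feature names
MAX_PLAUSIBLE_STAY_DAYS = 3650  # assumed limit beyond which a stay counts as an outlier


def preprocess(raw):
    """Clean one raw record into structured feature data."""
    # Data cleaning and feature extraction: keep only the fields used as features.
    cleaned = {field: raw.get(field) for field in KEPT_FIELDS}
    # Missing value filling: replace absent values with 0.
    for field in KEPT_FIELDS:
        if cleaned[field] is None:
            cleaned[field] = 0
    # Outlier handling: recompute an implausible stay from the admission dates.
    stay = cleaned["hospital_days"]
    if stay < 0 or stay > MAX_PLAUSIBLE_STAY_DAYS:
        admitted, discharged = raw.get("admitted"), raw.get("discharged")
        if admitted and discharged:
            cleaned["hospital_days"] = (discharged - admitted).days
    return cleaned


# Hypothetical raw record with a useless field, a missing value, and an outlier.
raw_record = {
    "residence": "Shenzhen",
    "height_cm": None,
    "weight_kg": 70.5,
    "hospital_days": 99999,
    "admitted": date(2019, 1, 3),
    "discharged": date(2019, 1, 10),
}
structured = preprocess(raw_record)
```

Filling with 0 follows the text; it keeps vectors the same length for matching, although a 0 height is not a meaningful measurement and other imputation strategies could be substituted.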

Through the series of processing steps in this optional implementation (data cleaning, feature extraction, missing value filling, and outlier handling), structured data is obtained that is comparable when matched against the feature information X1 in the first identification model. This avoids unmatchable-feature errors, removes outliers, and improves the accuracy of feature matching.

205a. Determine the first physical examination index value of the target user using the first mapping relationship and the feature information X1 whose similarity is greater than a preset threshold and is the highest.

The preset threshold can be set in advance according to actual needs. The larger the preset threshold, the higher the corresponding feature matching precision; a similarity of 100% means the features match exactly.
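Step 205a can be sketched as a thresholded nearest-match lookup. The application does not name a similarity measure, so cosine similarity is an assumption here, as are the function names and the sample vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookup_index_value(target, samples, threshold):
    """Return the label of the most similar sample, or None if none exceeds the threshold.

    samples is a list of (feature_vector, label) pairs, standing in for X1 and Y1.
    """
    best_label, best_similarity = None, threshold
    for features, label in samples:
        similarity = cosine_similarity(target, features)
        if similarity > best_similarity:
            best_label, best_similarity = label, similarity
    return best_label


# Hypothetical samples: (age, BMI, hospital days) mapped to fasting blood glucose.
samples = [([52.0, 27.4, 3.0], 7.8), ([34.0, 22.0, 0.0], 5.1)]
matched_value = lookup_index_value([50.0, 27.0, 3.0], samples, threshold=0.9)
```

Cosine similarity ignores vector magnitude, so features on very different scales would normally be standardized before matching; a distance-based measure would be an equally valid choice.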

For example, after the target user's medical history data, hospitalization data, medication data, physical examination data, and health disclosure data are input into the first identification model, these data are in effect input into each of the four classifiers from step 203a and matched by similarity against each classifier's feature information. In each classifier, the most similar feature information whose similarity exceeds a certain threshold is found, and the four classifiers each yield a corresponding physical examination index value, namely a fasting blood glucose value for the target user. If three of these four fasting blood glucose values meet the standard for diabetes, the target user can be determined to have diabetes, and the average of those three values is taken as the first physical examination index value computed by the first identification model. If two of the four fasting blood glucose values do not meet the standard for diabetes and the other two do, the average of all four values is taken as the first physical examination index value computed by the first identification model, and whether the target user has diabetes is determined from this average.
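The fusion rule in this example can be sketched as follows. The fasting cut-off of 7.0 mmol/L is an assumed stand-in for "the standard for diabetes", and the handling of a minority of positive values (averaging all four) is also an assumption, because the text only spells out the majority and tie cases.

```python
from statistics import mean

FASTING_CUTOFF = 7.0  # mmol/L, assumed stand-in for "the standard for diabetes"


def fuse_predictions(values, cutoff=FASTING_CUTOFF):
    """Fuse the classifiers' predictions into (index value, has_diabetes)."""
    positive = [v for v in values if v >= cutoff]
    if 2 * len(positive) > len(values):
        # Majority meets the standard: average only the agreeing values.
        return mean(positive), True
    average = mean(values)
    if 2 * len(positive) == len(values):
        # Tie: decide from the average of all values.
        return average, average >= cutoff
    # Minority meets the standard: treated as not diabetic (assumed index value).
    return average, False


majority_value, majority_flag = fuse_predictions([7.2, 7.8, 8.1, 6.5])
tie_value, tie_flag = fuse_predictions([7.4, 7.6, 6.0, 6.2])
```

Passing the postprandial cut-off instead of the fasting one would let the same function serve the second identification model.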

202b (in parallel with step 202a). Take the 2-hour postprandial blood glucose value in the sample users' user features as label information Y2, take the sample users' target feature data other than the fasting blood glucose value and the 2-hour postprandial blood glucose value as feature information X2, and create a second model training set.

It should be noted that step 202b is similar to step 202a: the sample users' target feature data includes at least one or more of the sample user's medical history data, hospitalization data, medication data, physical examination data, and health disclosure data. The second model training set created in this way contains each item of feature information X2 and its corresponding label information Y2, that is, the 2-hour postprandial blood glucose values corresponding to sample users with different medical histories, hospitalization records, medication, physical examination results, health disclosures, and so on.

203b. Using the second model training set, train a second identification model based on the preset regression prediction algorithm, the second identification model being used to determine the second physical examination index value.

The second identification model is likewise evaluated with MAPE. When the MAPE value of the second identification model is less than the preset standard comparison threshold, the second identification model is determined to meet the evaluation criterion, and the second identification model that meets the criterion determines the second mapping relationship between the feature information X2 and the label information Y2.

Optionally, step 203b may specifically include:

(1) using random sampling, obtaining a fifth training sample set, a sixth training sample set, a seventh training sample set, and an eighth training sample set from the second model training set;

(2) training a fifth classifier on the fifth training sample set using the random forest algorithm; training a sixth classifier on the sixth training sample set using the GBDT algorithm; training a seventh classifier on the seventh training sample set using the Xgboost algorithm; and training an eighth classifier on the eighth training sample set using the LightGBM algorithm;

(3) fusing the fifth classifier, the sixth classifier, the seventh classifier, and the eighth classifier using the bagging method to obtain the second identification model.
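Sub-step (1) above can be sketched as follows. The text says only that random sampling is used; drawing with replacement (as bagging usually does), the 80% sampling fraction, and the function name are assumptions made for illustration.

```python
import random


def draw_training_subsets(training_set, n_subsets=4, fraction=0.8, seed=0):
    """Draw n_subsets random samples, with replacement, from the training set.

    training_set is a list of (feature_information, label) pairs.
    """
    rng = random.Random(seed)  # fixed seed so the partition is reproducible
    size = max(1, int(len(training_set) * fraction))
    return [
        [rng.choice(training_set) for _ in range(size)]
        for _ in range(n_subsets)
    ]


# Hypothetical (X2, Y2) pairs; each subset would be handed to one algorithm.
pairs = [([float(i)], 7.0 + 0.5 * i) for i in range(10)]
fifth, sixth, seventh, eighth = draw_training_subsets(pairs)
```

Each of the four subsets would then be passed to one of the four algorithms named in sub-step (2); changing the seed gives a different partition.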

As in the optional implementation of step 203a, the fusion is carried out by voting, that is, by majority rule. For example, after the medical history data, hospitalization data, medication data, physical examination data, and health declaration data of the user under test are input into these four classifiers, if the predictions of three of the four classifiers give two-hour postprandial blood glucose values that meet the standard for diabetes, the user under test can be determined to have diabetes; if the prediction of only one classifier meets the standard for diabetes and the predictions of the other three do not, the user under test can be determined not to have diabetes.

It should be noted that if the MAPE index value of the second identification model is greater than the preset standard comparison threshold, the trained second identification model does not meet the evaluation criterion. In that case the second model training set can be re-partitioned to obtain new fifth, sixth, seventh, and eighth training sample sets; the fifth classifier then continues training on the new fifth training sample set, the sixth classifier on the new sixth training sample set, the seventh classifier on the new seventh training sample set, and the eighth classifier on the new eighth training sample set. The four newly trained classifiers are fused again, and it is determined whether the MAPE index value of the new second identification model is less than the preset standard comparison threshold. If it is still greater than the threshold, the process of re-partitioning the model training set and retraining the classifiers is repeated until the MAPE index value of the latest second identification model is less than the preset standard comparison threshold, that is, until the evaluation criterion is met.
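The retraining loop described above can be sketched as the following control flow. The two callables are hypothetical stand-ins for "re-partition, retrain the four classifiers, and fuse them" and "compute the MAPE of the fused model"; the cap on the number of rounds is also an assumption, since the text repeats the process without stating a limit.

```python
def train_until_acceptable(train_and_fuse, evaluate_mape, threshold_percent,
                           max_rounds=10):
    """Repeat partition/train/fuse until the fused model's MAPE is below threshold.

    train_and_fuse(round_index) -> model   (hypothetical callable)
    evaluate_mape(model) -> float          (hypothetical callable)
    Returns (model, mape_value, rounds_used).
    """
    for round_index in range(max_rounds):
        # A new round index stands for a new random partition of the training set.
        model = train_and_fuse(round_index)
        score = evaluate_mape(model)
        if score < threshold_percent:
            return model, score, round_index + 1
    raise RuntimeError("no model met the evaluation criterion within max_rounds")


# Stand-in callables for illustration: the "model" is just the round index,
# and its MAPE is read from a fixed list.
fake_scores = [12.0, 9.0, 4.0]
result = train_until_acceptable(lambda i: i, lambda m: fake_scores[m], 5.0)
```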

204b. Input the feature information of the target user into the second identification model and match it by similarity against feature information X2.

In this step, the feature information of the target user corresponds to the target user's target feature data other than the fasting blood glucose value and the two-hour postprandial blood glucose value.

Optionally, step 204b may specifically include: subjecting the feature information of the target user to data cleaning, feature extraction, missing-value filling, and outlier handling to obtain feature information in the form of structured data; and matching the structured feature information by similarity against feature information X2.

As in the optional implementation of step 204a, the data cleaning, feature extraction, missing-value filling, outlier handling, and related processing in this optional implementation yield structured data that are comparable with feature information X2 in the second identification model. This avoids failed matches during feature matching, removes outliers, and improves the accuracy of feature matching.
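Two of the preprocessing steps named above, missing-value filling and outlier handling, can be sketched as follows. The text does not say how either is done; treating out-of-range values as missing and filling with the median are assumptions, as are the plausibility bounds in the example.

```python
from statistics import median


def mask_outliers(values, lower, upper):
    """Replace values outside [lower, upper] with None so they are treated as missing."""
    return [v if v is not None and lower <= v <= upper else None for v in values]


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot fill a column with no observed values")
    fill_value = median(observed)
    return [fill_value if v is None else v for v in values]


# Hypothetical systolic blood pressure column (mmHg): one gap, one implausible reading.
raw = [120, None, 135, 400, 110]
cleaned = fill_missing(mask_outliers(raw, lower=60, upper=250))
```

Masking outliers before filling keeps the implausible reading from distorting the median used for the fill.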

205b. Using the feature information X2 whose similarity is greater than the predetermined threshold and is the highest, together with the second mapping relationship, determine the second physical examination index value corresponding to the target user.

The predetermined threshold can be set in advance according to actual needs. For example, the larger the predetermined threshold, the higher the feature-matching precision; a similarity of 100% indicates an exact feature match.
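The lookup in step 205b can be sketched as follows. The text does not name a similarity measure; cosine similarity over numeric feature vectors is assumed here, and the function names, feature vectors, and the 0.9 threshold are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def lookup_index_value(target, training_pairs, threshold):
    """Return (label, similarity) for the most similar feature vector whose
    similarity exceeds the threshold, or None when nothing qualifies."""
    best = None
    for features, label in training_pairs:
        similarity = cosine_similarity(target, features)
        if similarity > threshold and (best is None or similarity > best[1]):
            best = (label, similarity)
    return best


# Hypothetical (X2, Y2) pairs: feature vector -> two-hour postprandial value (mmol/L).
pairs = [([1.0, 0.0], 6.2), ([0.0, 1.0], 12.4), ([1.0, 1.0], 9.0)]
match = lookup_index_value([0.1, 1.0], pairs, threshold=0.9)
```

Raising the threshold makes the match stricter, at the cost of returning no result for users who differ from every training sample.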

For example, after the target user's medical history data, hospitalization data, medication data, physical examination data, and health declaration data are input into the second identification model, the data are in effect fed separately into the four classifiers of step 203b and matched by similarity against the feature information corresponding to each classifier. Each classifier finds the most similar feature information whose similarity exceeds a given threshold and then yields a corresponding physical examination index value, namely a two-hour postprandial blood glucose value for the target user. If three of these four two-hour postprandial blood glucose values meet the standard for diabetes, the target user can be determined to have diabetes, and the average of those three values is taken as the second physical examination index value computed by the second identification model. If two of the four values do not meet the standard for diabetes and the other two do, the average of all four values is taken as the second physical examination index value computed by the second identification model, and whether the target user has diabetes is determined from this average.
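The voting-and-averaging rule in this example can be sketched as follows. The text spells out only the three-to-one and two-to-two cases; averaging the majority group in every other split (including when the majority does not meet the standard) is an assumption, as are the function name and the sample values.

```python
def fuse_predictions(values, diabetes_threshold):
    """Fuse the four classifiers' predicted blood glucose values (mmol/L).

    Returns (index_value, has_diabetes).
    """
    meeting = [v for v in values if v >= diabetes_threshold]
    not_meeting = [v for v in values if v < diabetes_threshold]
    if len(meeting) > len(not_meeting):
        # Majority meets the standard: average the values that meet it.
        return sum(meeting) / len(meeting), True
    if len(meeting) < len(not_meeting):
        # Majority does not meet the standard (averaging this group is assumed).
        return sum(not_meeting) / len(not_meeting), False
    # Tie: average all values and judge by the average.
    average = sum(values) / len(values)
    return average, average >= diabetes_threshold


# Hypothetical two-hour postprandial predictions, judged against 11.1 mmol/L.
value, diabetic = fuse_predictions([11.5, 12.0, 12.4, 9.8], 11.1)
tie_value, tie_diabetic = fuse_predictions([12.0, 11.5, 10.0, 10.1], 11.1)
```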

206. Determine the severity of the target user's illness according to the first physical examination index value and/or the second physical examination index value.

Optionally, step 206 may specifically include: if the first physical examination index value corresponding to the target user is greater than or equal to a first preset threshold, and/or the second physical examination index value is greater than or equal to a second preset threshold, determining that the target user has diabetes; and then judging the severity of the target user's illness from the first numerical interval in which the first physical examination index value falls and/or the second numerical interval in which the second physical examination index value falls.

The first preset threshold is determined from the established standard for judging diabetes by fasting blood glucose, for example 7.0 mmol/L; the second preset threshold is determined from the established standard for judging diabetes by two-hour postprandial blood glucose, for example 11.1 mmol/L.

For example, if the target user's first physical examination index value is determined to be 8.0 mmol/L and the second physical examination index value 7.6 mmol/L, the target user can be determined to have diabetes because the first physical examination index value is greater than the first preset threshold. If the first physical examination index value is 5.7 mmol/L and the second physical examination index value is 11.9 mmol/L, the target user can be determined to have diabetes because the second physical examination index value is greater than the second preset threshold. If the first physical examination index value is 8.3 mmol/L and the second physical examination index value is 11.7 mmol/L, the target user can be determined to have diabetes because the first physical examination index value is greater than the first preset threshold and the second physical examination index value is greater than the second preset threshold.
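The threshold test in step 206 can be sketched as follows, using the example thresholds of 7.0 mmol/L and 11.1 mmol/L and reading the three worked values above as predicted index values. The function and constant names are hypothetical.

```python
FASTING_THRESHOLD = 7.0        # mmol/L, example first preset threshold
POSTPRANDIAL_THRESHOLD = 11.1  # mmol/L, example second preset threshold


def has_diabetes(fasting=None, postprandial=None):
    """True when either available index value reaches its preset threshold."""
    if fasting is None and postprandial is None:
        raise ValueError("at least one index value is required")
    fasting_hit = fasting is not None and fasting >= FASTING_THRESHOLD
    postprandial_hit = (postprandial is not None
                        and postprandial >= POSTPRANDIAL_THRESHOLD)
    return fasting_hit or postprandial_hit


# The three worked cases from the text, plus one below both thresholds.
cases = [(8.0, 7.6), (5.7, 11.9), (8.3, 11.7), (5.7, 7.6)]
results = [has_diabetes(f, p) for f, p in cases]
```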

The severity of diabetes is discussed below in three cases:

(1) Judging by the first physical examination index value alone, that is, judging the severity of the target user's illness from the first numerical interval in which the first physical examination index value falls. This may specifically include: dividing the range above the first preset threshold into multiple numerical intervals that increase according to a predetermined numerical rule; creating a third mapping relationship between the numerical intervals and diabetes severity; determining the first numerical interval, among the multiple numerical intervals, that corresponds to the first physical examination index value; and judging the first diabetes severity of the target user according to the third mapping relationship and the first numerical interval.

For example, suppose the numerical intervals above the first preset threshold of 7.0 mmol/L and the corresponding diabetes severities in the third mapping relationship are: mild diabetes, 7.0 to 8.4 mmol/L; moderate diabetes, 8.4 to 10.1 mmol/L; severe diabetes, greater than 10.1 mmol/L. If the first physical examination index value is determined to be 9.6 mmol/L, the first numerical interval in which it falls is 8.4 to 10.1 mmol/L, and according to the third mapping relationship and the first numerical interval the severity of the target user's diabetes can be judged to be moderate.

(2) Judging by the second physical examination index value alone, that is, judging the severity of the target user's illness from the second numerical interval in which the second physical examination index value falls. This specifically includes: dividing the range above the second preset threshold into multiple numerical intervals that increase according to a predetermined numerical rule; creating a fourth mapping relationship between the numerical intervals and diabetes severity; determining the second numerical interval, among the multiple numerical intervals, that corresponds to the second physical examination index value; and judging the second diabetes severity of the target user according to the fourth mapping relationship and the second numerical interval.

For example, suppose the numerical intervals above the second preset threshold of 11.1 mmol/L and the corresponding diabetes severities in the fourth mapping relationship are: moderate diabetes, 11.1 to 16.7 mmol/L; severe diabetes, greater than 16.7 mmol/L (ketoacidosis is prone to occur above 16.7 mmol/L). If the second physical examination index value is determined to be 12.6 mmol/L, the second numerical interval in which it falls is 11.1 to 16.7 mmol/L, and according to the fourth mapping relationship and the second numerical interval the severity of the target user's diabetes can be judged to be moderate.
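The third and fourth mapping relationships in these examples can be sketched as interval tables. The interval bounds follow the examples, taking 10.1 mmol/L as the boundary between moderate and severe for fasting blood glucose; treating each interval as closed on the left and open on the right is an assumption, because the text does not say which interval a boundary value belongs to.

```python
import math

# (lower bound inclusive, upper bound exclusive, severity), in mmol/L.
FASTING_INTERVALS = [
    (7.0, 8.4, "mild"),
    (8.4, 10.1, "moderate"),
    (10.1, math.inf, "severe"),
]
POSTPRANDIAL_INTERVALS = [
    (11.1, 16.7, "moderate"),
    (16.7, math.inf, "severe"),
]


def severity(index_value, intervals):
    """Return the severity label of the interval containing index_value,
    or None when the value lies below the diabetes threshold."""
    for lower, upper, label in intervals:
        if lower <= index_value < upper:
            return label
    return None


first_severity = severity(9.6, FASTING_INTERVALS)         # fasting example
second_severity = severity(12.6, POSTPRANDIAL_INTERVALS)  # postprandial example
```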

(3) Judging comprehensively by combining the first physical examination index value and the second physical examination index value (this approach takes more factors into account and therefore has relatively high prediction precision), that is, judging the severity of the target user's illness from the first numerical interval in which the first physical examination index value falls and the second numerical interval in which the second physical examination index value falls. This specifically includes: if the first diabetes severity and the second diabetes severity are the same, the final severity is that shared severity. If the first diabetes severity and the second diabetes severity differ, a first weight corresponding to the first identification model and a second weight corresponding to the second identification model are obtained according to the accuracy rate or adoption rate fed back by users for the two prediction modes, namely prediction by the first identification model and prediction by the second identification model; when the first weight is greater than the second weight, the first diabetes severity is determined as the severity of the target user's illness; when the second weight is greater than the first weight, the second diabetes severity is determined as the severity of the target user's illness.

In this embodiment, the weights of the two prediction modes can be set according to the accuracy rate or adoption rate fed back by users. Specifically, the weight values corresponding to different accuracy rates or adoption rates can be compiled statistically, and the weight of a prediction mode is then looked up through the mapping relationship so obtained. In this embodiment, the accuracy rate or adoption rate fed back by users reflects which prediction mode is more precise, so the result of the more precise prediction mode can be selected as the final determination, which is more accurate. Alternatively, the respective weights of the two prediction modes can be preset manually according to the actual situation.

For example, if user feedback shows that predicting diabetes severity from the first physical examination index value is more accurate, a weight of 70% can be configured for the prediction mode based on the first physical examination index value and a weight of 30% for the prediction mode based on the second physical examination index value. When the two predictions differ, the result predicted from the first physical examination index value is fed back to the target user as the final diagnosis. Suppose the prediction from the first physical examination index value is moderate diabetes and the prediction from the second physical examination index value is severe diabetes; then, according to the configured weights, the target user's diabetes severity is finally determined to be moderate.
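The weight-based resolution in case (3) can be sketched as follows. The function name is hypothetical, and raising an error on equal weights is an assumption, since the text covers only the cases where one weight is strictly greater.

```python
def final_severity(first_severity, second_severity, first_weight, second_weight):
    """Resolve the two severity predictions using the models' weights."""
    if first_severity == second_severity:
        return first_severity
    if first_weight > second_weight:
        return first_severity
    if second_weight > first_weight:
        return second_severity
    raise ValueError("equal weights: the tie-break is not specified")


# The worked example: weights of 70% and 30%, predictions moderate vs. severe.
decision = final_severity("moderate", "severe", 0.7, 0.3)
```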

Once the target user's actual fasting blood glucose value and two-hour postprandial blood glucose value are obtained later, they can also serve as new training samples for continued training of the two identification models in this embodiment, so as to achieve higher prediction precision. With the above diabetes prediction method, training on the model training sets determines the mapping relationships between feature information and label information; the structured data of the target user are matched against the regression prediction model, and the first physical examination index value for fasting blood glucose and the second physical examination index value for two-hour postprandial blood glucose are then determined through the mapping relationships. By comparing these values with the first preset threshold and the second preset threshold, it can be judged whether the user has diabetes. The method can therefore not only predict from diabetes diagnostic indices whether the user is ill, but also judge the severity of the target user's illness from the first numerical interval in which the first physical examination index value falls and/or the second numerical interval in which the second physical examination index value falls, making the diagnostic result more complete.

Further, as a concrete implementation of the methods shown in Fig. 1 and Fig. 2, an embodiment of the present application provides a diabetes prediction device. As shown in Fig. 3, the device includes: an acquiring unit 31, a creating unit 32, a judging unit 33, and a determining unit 34.

The acquiring unit 31 may be used to obtain sample user data from original health records and electronic medical record data;

the creating unit 32 may be used to create a numeric regression prediction model according to the user features in the sample user data;

the judging unit 33 may be used to judge, using the regression prediction model, a first physical examination index value for the target user's fasting blood glucose and a second physical examination index value for blood glucose at a preset duration after a meal;

the determining unit 34 may be used to determine the severity of the target user's illness according to the first physical examination index value and/or the second physical examination index value.

In a specific application scenario, in order to create the numeric regression prediction model according to the user features in the sample user data, as shown in Fig. 4, the creating unit 32 may specifically include: a creation module 321, a training module 322, and a determining module 323.

The creation module 321 may specifically be used to take the fasting blood glucose value among the user features as label information Y1, take the target feature data of the sample users other than the fasting blood glucose value and the two-hour postprandial blood glucose value as feature information X1, and create a first model training set, the target feature data including at least one or more of the sample users' medical history data, hospitalization data, medication data, physical examination data, and health declaration data;

the training module 322 may specifically be used to train, using the first model training set and based on a preset regression prediction algorithm, a first identification model for judging the first physical examination index value, wherein the preset regression prediction algorithm is obtained by fusing four algorithms, namely random forest, gradient boosting decision tree (GBDT), Xgboost, and LightGBM, and the first identification model is evaluated with the mean absolute percentage error (MAPE) index; when the MAPE index value of the first identification model is less than the preset standard comparison threshold, the first identification model is determined to meet the evaluation criterion;

the determining module 323 may specifically be used to determine, through the first identification model that meets the evaluation criterion, a first mapping relationship between the feature information X1 and the label information Y1;

the creation module 321 may specifically also be used to take the two-hour postprandial blood glucose value among the user features as label information Y2, take the target feature data of the sample users as feature information X2, and create a second model training set;

the training module 322 may specifically also be used to train, using the second model training set and based on the preset regression prediction algorithm, a second identification model for judging the second physical examination index value, wherein the second identification model is evaluated with the MAPE index; when the MAPE index value of the second identification model is less than the preset standard comparison threshold, the second identification model is determined to meet the evaluation criterion;

the determining module 323 may specifically also be used to determine, through the second identification model that meets the evaluation criterion, a second mapping relationship between the feature information X2 and the label information Y2.

Correspondingly, in order to judge the first physical examination index value for the target user's fasting blood glucose and the second physical examination index value for blood glucose at the preset duration after a meal, as shown in Fig. 4, the judging unit 33 may specifically include: a matching module 331 and a determining module 332.

The matching module 331 may specifically be used to input the feature information of the target user into the first identification model and match it by similarity against the feature information X1, the feature information of the target user corresponding to the target user's target feature data other than the fasting blood glucose value and the two-hour postprandial blood glucose value;

the determining module 332 may specifically be used to determine the first physical examination index value corresponding to the target user using the feature information X1 whose similarity is greater than the predetermined threshold and is the highest, together with the first mapping relationship;

the matching module 331 may specifically also be used to input the feature information of the target user into the second identification model and match it by similarity against the feature information X2;

the determining module 332 may specifically also be used to determine the second physical examination index value corresponding to the target user using the feature information X2 whose similarity is greater than the predetermined threshold and is the highest, together with the second mapping relationship.

In a specific application scenario, in order to determine the severity of the target user's illness according to the first physical examination index value and/or the second physical examination index value, as shown in Fig. 4, the determining unit 34 may specifically include: a determining module 341 and a judgment module 342.

The determining module 341 may be used to determine that the target user has diabetes if the first physical examination index value corresponding to the target user is greater than or equal to the first preset threshold and/or the second physical examination index value is greater than or equal to the second preset threshold;

the judgment module 342 may be used to judge the severity of the target user's illness from the first numerical interval in which the first physical examination index value falls and/or the second numerical interval in which the second physical examination index value falls.

In a specific application scenario, in order to judge the severity of the target user's illness accurately, the judgment module 342 is specifically also used to: divide the range above the first preset threshold into multiple numerical intervals that increase according to a predetermined numerical rule; create a third mapping relationship between the numerical intervals and diabetes severity; determine the first numerical interval, among the multiple numerical intervals, that corresponds to the first physical examination index value; and judge the first diabetes severity of the target user according to the third mapping relationship and the first numerical interval. It is likewise used to: divide the range above the second preset threshold into multiple numerical intervals that increase according to a predetermined numerical rule; create a fourth mapping relationship between the numerical intervals and diabetes severity; determine the second numerical interval, among the multiple numerical intervals, that corresponds to the second physical examination index value; and judge the second diabetes severity of the target user according to the fourth mapping relationship and the second numerical interval;

the judgment module 342 is specifically also used to, if the first diabetes severity and the second diabetes severity differ, obtain a first weight corresponding to the first identification model and a second weight corresponding to the second identification model according to the accuracy rate or adoption rate fed back by users for the two prediction modes, namely prediction by the first identification model and prediction by the second identification model; when the first weight is greater than the second weight, determine the first diabetes severity as the severity of the target user's illness; and when the second weight is greater than the first weight, determine the second diabetes severity as the severity of the target user's illness.

In a specific application scenario, the matching module 331 may specifically be used to subject the feature information of the target user to data cleaning, feature extraction, missing-value filling, and outlier handling to obtain feature information in the form of structured data, and to match the structured feature information by similarity against the feature information X1;

the matching module 331 may specifically also be used to subject the feature information of the target user to data cleaning, feature extraction, missing-value filling, and outlier handling to obtain feature information in the form of structured data, and to match the structured feature information by similarity against the feature information X2.

In a specific application scenario, the training module 322 may specifically be used to: obtain, by random sampling, a first training sample set, a second training sample set, a third training sample set, and a fourth training sample set from the first model training set; train a first classifier on the first training sample set using the random forest algorithm; train a second classifier on the second training sample set using the GBDT algorithm; train a third classifier on the third training sample set using the Xgboost algorithm; train a fourth classifier on the fourth training sample set using the LightGBM algorithm; and fuse the first classifier, the second classifier, the third classifier, and the fourth classifier using the bagging method to obtain the first identification model;

the training module 322 may specifically also be used to: obtain, by random sampling, a fifth training sample set, a sixth training sample set, a seventh training sample set, and an eighth training sample set from the second model training set; train a fifth classifier on the fifth training sample set using the random forest algorithm; train a sixth classifier on the sixth training sample set using the GBDT algorithm; train a seventh classifier on the seventh training sample set using the Xgboost algorithm; train an eighth classifier on the eighth training sample set using the LightGBM algorithm; and fuse the fifth classifier, the sixth classifier, the seventh classifier, and the eighth classifier using the bagging method to obtain the second identification model.

It should be noted that, for other descriptions of the functional units of the diabetes prediction device provided in this embodiment, reference may be made to the corresponding descriptions of Fig. 1 and Fig. 2, which are not repeated here.

Based on the methods shown in Fig. 1 and Fig. 2, an embodiment of the present application correspondingly also provides a storage medium on which a computer program is stored; when executed by a processor, the program implements the diabetes prediction method shown in Fig. 1 and Fig. 2.

Based on this understanding, the technical solution of the present application can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes a number of instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the method of each implementation scenario of the present application.

Based on the methods shown in Fig. 1 and Fig. 2 and the virtual device embodiments shown in Fig. 3 and Fig. 4, and in order to achieve the above purpose, an embodiment of the present application also provides a computer device, which may specifically be a personal computer, a server, a network device, or the like. The physical device includes a storage medium and a processor: the storage medium is used to store a computer program, and the processor is used to execute the computer program to implement the diabetes prediction method shown in Fig. 1 and Fig. 2.

Optionally, the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a WI-FI module, and so on. The user interface may include a display and an input unit such as a keyboard; optionally, the user interface may also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface and a wireless interface (such as a Bluetooth interface or a WI-FI interface).

Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.

The non-volatile readable storage medium may also include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the physical device for diabetes prediction and supports the running of the information processing program and other software and/or programs. The network communication module is used to implement communication among the components inside the non-volatile readable storage medium, as well as communication with other hardware and software in the physical device.

From the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus the necessary general-purpose hardware platform, or by hardware. Compared with the prior art, the technical solution of the present application can, on the basis of detecting that the target user has diabetes, further judge the severity of the illness. This makes the diagnostic result more complete, so that the progression of the target user's condition can be tracked in a timely manner and corresponding treatment can be provided.

Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required for implementing the present application. Those skilled in the art will also understand that the modules of the device in an implementation scenario may be distributed in the device of that scenario as described, or may be changed accordingly and located in one or more devices different from that scenario. The modules of the above implementation scenario may be merged into one module or further split into multiple sub-modules.

The serial numbers used in the present application are for description only and do not represent the relative merits of the implementation scenarios. What is disclosed above is only several specific implementation scenarios of the present application; the present application is not limited to them, and any variation that a person skilled in the art can conceive of shall fall within the protection scope of the present application.

Claims

1. A diabetes prediction method, characterized by comprising:

judging, using the regression prediction model, a first physical examination index value for the target user's fasting blood glucose and a second physical examination index value for blood glucose at a preset duration after a meal;

determining the severity of the target user's illness according to the first physical examination index value and/or the second physical examination index value.

2. The method according to claim 1, characterized in that the user features are extracted from the sample user data using regular expressions, and the preset duration is two hours;

creating the numeric regression prediction model according to the user features in the sample user data specifically includes:

Using fasting blood sugar in the user characteristics as label information Y1, and family is mixed the sample with except the fasting blood sugar and institute The target signature data other than postprandial two hours blood glucose values are stated as characteristic information X1, create the first model training collection, the mesh Mark characteristic includes at least suffering from history data, hospitalization data, medical administration data, physical examination data, being good for for the sample of users Health is accused one or more in primary data；

Default regression forecasting algorithm training is based on for judging the first physical examination index value by the first model training collection The first identification model, wherein the default regression forecasting algorithm by random forest, gradient promoted decision tree GBDT, Tetra- kinds of algorithm fusions of Xgboost, LightGBM obtain, and the assessment of first identification model uses mean absolute percentage error MAPE index determines described the when the corresponding MAPE index value of first identification model, which is less than pre-set criteria, compares threshold value One identification model meets evaluation criteria, and first identification model by meeting evaluation criteria can determine the characteristic information X1 The first mapping relations between the label information Y1；

Using two hours blood glucose values after the user characteristics Chinese meal as label information Y2, and by the target of the sample of users Characteristic creates the second model training collection as characteristic information X2；

The default regression forecasting algorithm training is based on for judging that second physical examination refers to by the second model training collection Second identification model of scale value, wherein the assessment of second identification model uses MAPE index, when second identification model When corresponding MAPE index value compares threshold value less than preassigned, determines that second identification model meets evaluation criteria, pass through Meet evaluation criteria second identification model can determine between the characteristic information X2 and the label information Y2 second Mapping relations.
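By way of non-limiting illustration, the construction of the two training sets and the MAPE check recited above might be sketched as follows. This is a minimal sketch and not the applicant's implementation: the record labels `FBG` and `2hPG`, the two regular expressions, the helper names, and the 10% threshold are all assumptions introduced for the example, since the claim specifies only that regular expressions and a preset threshold are used.

```python
import re

# Hypothetical patterns: the claim says only that regular expressions are used.
FBG_RE = re.compile(r"FBG[:=]\s*(\d+(?:\.\d+)?)")
PPG_RE = re.compile(r"2hPG[:=]\s*(\d+(?:\.\d+)?)")


def extract_glucose(text):
    """Return (fasting, two_hour) glucose values from a record, or None."""
    fbg = FBG_RE.search(text)
    ppg = PPG_RE.search(text)
    if fbg is None or ppg is None:
        return None
    return float(fbg.group(1)), float(ppg.group(1))


def build_training_sets(samples):
    """samples: list of (record_text, target_features) pairs.

    target_features must already exclude both glucose values.
    Returns ((X1, Y1), (X2, Y2)).
    """
    X1, Y1, X2, Y2 = [], [], [], []
    for text, features in samples:
        values = extract_glucose(text)
        if values is None:
            continue  # no label available, so the sample cannot be used
        fasting, two_hour = values
        X1.append(list(features))
        Y1.append(fasting)   # label Y1: fasting blood glucose
        X2.append(list(features))
        Y2.append(two_hour)  # label Y2: two-hour postprandial blood glucose
    return (X1, Y1), (X2, Y2)


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def meets_criterion(y_true, y_pred, threshold=0.10):
    """True when the MAPE is below the (assumed) comparison threshold."""
    return mape(y_true, y_pred) < threshold
```

In this sketch a model would be accepted only if `meets_criterion` returns true on held-out labels; the threshold value itself would be chosen by the implementer.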

3. The method according to claim 2, characterized in that the determining, using the regression prediction model, of the first physical examination index value for the fasting blood glucose of the target user and the second physical examination index value for the blood glucose at the preset duration after a meal specifically comprises:

inputting feature information of the target user into the first identification model for similarity matching with the feature information X1, the feature information of the target user being the target feature data of the target user other than the fasting blood glucose value and the two-hour postprandial blood glucose value;

determining the first physical examination index value of the target user using the first mapping relation and the feature information X1 whose similarity is greater than a preset threshold and is the highest;

inputting the feature information of the target user into the second identification model for similarity matching with the feature information X2;

determining the second physical examination index value of the target user using the second mapping relation and the feature information X2 whose similarity is greater than the preset threshold and is the highest.
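By way of non-limiting illustration, the similarity matching recited above might be sketched as follows. The claim does not name a similarity measure, so cosine similarity is an assumption here, as are the function names and the 0.9 threshold. The stored label list `Y` stands in for the mapping relation learned by the identification model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_by_similarity(target, X, Y, threshold=0.9):
    """Return the label mapped to the most similar row of X.

    Only rows whose similarity exceeds the threshold are considered;
    None is returned when no row qualifies.
    """
    best_index, best_score = None, threshold
    for i, row in enumerate(X):
        score = cosine_similarity(target, row)
        if score > best_score:
            best_index, best_score = i, score
    return None if best_index is None else Y[best_index]
```

The same function can be applied to (X1, Y1) to obtain the first physical examination index value and to (X2, Y2) to obtain the second.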

4. The method according to claim 3, characterized in that the determining of the severity of the target user's illness according to the first physical examination index value and/or the second physical examination index value specifically comprises:

if the first physical examination index value of the target user is greater than or equal to a first preset threshold and/or the second physical examination index value is greater than or equal to a second preset threshold, determining that the target user has diabetes;

determining the severity of the target user's illness from the first numerical interval in which the first physical examination index value falls and/or the second numerical interval in which the second physical examination index value falls.
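By way of non-limiting illustration, the threshold test recited above might be sketched as follows. The claim leaves both preset thresholds open; the defaults of 7.0 mmol/L (fasting) and 11.1 mmol/L (two hours after a meal) are assumptions taken from commonly cited diagnostic cut-offs, and the function name is hypothetical.

```python
from typing import Optional


def has_diabetes(
    fasting: Optional[float] = None,
    two_hour: Optional[float] = None,
    fasting_threshold: float = 7.0,    # assumed first preset threshold, mmol/L
    two_hour_threshold: float = 11.1,  # assumed second preset threshold, mmol/L
) -> bool:
    """Apply the "and/or" rule: either index value reaching its threshold suffices."""
    if fasting is None and two_hour is None:
        raise ValueError("at least one index value is required")
    fasting_hit = fasting is not None and fasting >= fasting_threshold
    two_hour_hit = two_hour is not None and two_hour >= two_hour_threshold
    return fasting_hit or two_hour_hit
```

Either argument may be omitted, which reflects the case in which only one of the two index values is available.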

5. The method according to claim 4, characterized in that determining the severity of the target user's illness from the first numerical interval in which the first physical examination index value falls specifically comprises:

dividing the range above the first preset threshold into a plurality of numerical intervals that increase according to a predetermined numerical rule;

creating a third mapping relation between the plurality of numerical intervals and degrees of diabetes severity;

determining the first numerical interval, among the plurality of numerical intervals, to which the first physical examination index value corresponds;

determining a first diabetes severity of the target user according to the third mapping relation and the first numerical interval;

determining the severity of the target user's illness from the second numerical interval in which the second physical examination index value falls specifically comprises:

dividing the range above the second preset threshold into a plurality of numerical intervals that increase according to a predetermined numerical rule;

creating a fourth mapping relation between the plurality of numerical intervals and degrees of diabetes severity;

determining the second numerical interval, among the plurality of numerical intervals, to which the second physical examination index value corresponds;

determining a second diabetes severity of the target user according to the fourth mapping relation and the second numerical interval;

determining the severity of the target user's illness from both the first numerical interval in which the first physical examination index value falls and the second numerical interval in which the second physical examination index value falls specifically comprises:

if the first diabetes severity differs from the second diabetes severity, obtaining a first weight corresponding to the first identification model and a second weight corresponding to the second identification model, respectively, according to the accuracy rate or adoption rate fed back by users for the two prediction modes, namely the first identification model and the second identification model;

when the first weight is greater than the second weight, determining the first diabetes severity as the severity of the target user's illness;

when the second weight is greater than the first weight, determining the second diabetes severity as the severity of the target user's illness.
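By way of non-limiting illustration, the interval division and the weight-based choice recited above might be sketched as follows. The "predetermined numerical rule" is assumed here to be a fixed step width, the severity labels are invented for the example, and the weights are assumed to have been computed elsewhere from user feedback. The claim does not say what happens when the two weights are equal, so this sketch returns None in that case.

```python
import bisect


def severity_level(value, threshold, step, labels):
    """Map an index value to a severity label.

    Intervals start at the threshold and are `step` wide; the last label
    covers everything above the final boundary. Values below the
    threshold map to None (no diabetes detected).
    """
    if value < threshold:
        return None
    boundaries = [threshold + step * (i + 1) for i in range(len(labels) - 1)]
    return labels[bisect.bisect_right(boundaries, value)]


def resolve_severity(first_level, second_level, first_weight, second_weight):
    """Choose between the two severities using the model weights."""
    if first_level == second_level:
        return first_level
    if first_weight > second_weight:
        return first_level
    if second_weight > first_weight:
        return second_level
    return None  # equal weights: not covered by the claim
```

For example, with a threshold of 7.0, a step of 2.0, and three labels, the intervals are [7.0, 9.0), [9.0, 11.0), and 11.0 and above.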

6. The method according to claim 3, characterized in that the inputting of the feature information of the target user into the first identification model for similarity matching with the feature information X1 specifically comprises:

subjecting the feature information of the target user to data cleaning, feature extraction, missing-value filling, and outlier handling, to obtain feature information in the form of structured data;

performing similarity matching between the structured feature information and the feature information X1;

the inputting of the feature information of the target user into the second identification model for similarity matching with the feature information X2 specifically comprises:

performing similarity matching between the structured feature information and the feature information X2.
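By way of non-limiting illustration, the four preprocessing steps recited above might be sketched as follows. The feature names, the use of the column median for filling, and the use of clipping to plausible bounds for outlier handling are all assumptions; the claim names the steps but not the techniques.

```python
import math
import statistics

FEATURES = ("age", "bmi", "systolic_bp")  # hypothetical feature names


def to_float(raw):
    """Data cleaning: parse a raw field, returning None if it is unusable."""
    try:
        value = float(str(raw).strip())
    except ValueError:
        return None
    return value if math.isfinite(value) else None


def structure_records(records, bounds):
    """Turn raw dict records into rows of floats in FEATURES order.

    bounds maps each feature name to a (low, high) plausible range.
    """
    # Feature extraction: keep only the named fields, in a fixed order.
    rows = [[to_float(rec.get(name)) for name in FEATURES] for rec in records]
    for col, name in enumerate(FEATURES):
        observed = [row[col] for row in rows if row[col] is not None]
        fill = statistics.median(observed) if observed else 0.0
        low, high = bounds[name]
        for row in rows:
            # Missing-value filling with the column median (assumed rule).
            value = fill if row[col] is None else row[col]
            # Outlier handling by clipping to the plausible range (assumed rule).
            row[col] = min(max(value, low), high)
    return rows
```

The resulting rows are plain numeric vectors and can be passed directly to a similarity function.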

7. The method according to claim 2, characterized in that the training, on the first model training set and based on the preset regression prediction algorithm, of the first identification model for determining the first physical examination index value specifically comprises:

obtaining a first training sample set, a second training sample set, a third training sample set, and a fourth training sample set from the first model training set by random sampling;

training a first classifier on the first training sample set using the random forest algorithm;

training a second classifier on the second training sample set using the GBDT algorithm;

training a third classifier on the third training sample set using the XGBoost algorithm;

training a fourth classifier on the fourth training sample set using the LightGBM algorithm;

fusing the first classifier, the second classifier, the third classifier, and the fourth classifier by the bagging method to obtain the first identification model;

the training, on the second model training set and based on the preset regression prediction algorithm, of the second identification model for determining the second physical examination index value specifically comprises:

obtaining a fifth training sample set, a sixth training sample set, a seventh training sample set, and an eighth training sample set from the second model training set by random sampling;

training a fifth classifier on the fifth training sample set using the random forest algorithm;

training a sixth classifier on the sixth training sample set using the GBDT algorithm;

training a seventh classifier on the seventh training sample set using the XGBoost algorithm;

training an eighth classifier on the eighth training sample set using the LightGBM algorithm;

fusing the fifth classifier, the sixth classifier, the seventh classifier, and the eighth classifier by the bagging method to obtain the second identification model.
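By way of non-limiting illustration, the random sampling and bagging fusion recited above might be sketched as follows. To keep the example self-contained, the four learners named in the claim (random forest, GBDT, XGBoost, LightGBM) are replaced by two trivial stand-in trainers, which is a deliberate substitution and not the claimed algorithms. Sampling with replacement, the 0.8 sample fraction, averaging as the fusion rule, and all function names are likewise assumptions.

```python
import random
import statistics


def train_fused_model(trainers, X, y, sample_fraction=0.8, seed=0):
    """Train each base learner on its own random sample, then average.

    trainers: functions taking (X_sample, y_sample) and returning a
    callable model(row) -> float. Returns the fused predict function.
    """
    rng = random.Random(seed)
    size = max(1, int(len(X) * sample_fraction))
    models = []
    for train in trainers:
        # Each learner gets its own training sample set, drawn with replacement.
        idx = [rng.randrange(len(X)) for _ in range(size)]
        models.append(train([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        # Bagging-style fusion for regression: average the base predictions.
        return statistics.fmean(model(row) for model in models)

    return predict


def train_mean(X, y):
    """Stand-in learner: always predicts the mean label of its sample."""
    mean = statistics.fmean(y)
    return lambda row: mean


def train_nearest(X, y):
    """Stand-in learner: returns the label of the nearest sampled row."""
    def model(row):
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for x in X]
        return y[dists.index(min(dists))]
    return model
```

In practice each stand-in trainer would be replaced by a wrapper around the corresponding library regressor, with the fusion step left unchanged.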

8. A device for predicting diabetes, characterized by comprising:

a creating unit, configured to create a numeric regression prediction model according to user features in the sample user data;

a judging unit, configured to determine, using the regression prediction model, a first physical examination index value for the fasting blood glucose of a target user and a second physical examination index value for the blood glucose of the target user at a preset duration after a meal;

a determining unit, configured to determine the severity of the target user's illness according to the first physical examination index value and/or the second physical examination index value.

9. A non-volatile readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method for predicting diabetes according to any one of claims 1 to 7.

10. A computer device, comprising a non-volatile readable storage medium, a processor, and a computer program stored on the non-volatile readable storage medium and executable on the processor, characterized in that the processor, when executing the program, implements the method for predicting diabetes according to any one of claims 1 to 7.