CN111340069A - Incomplete data fine modeling and missing value filling method based on alternate learning - Google Patents
- Publication number
- CN111340069A (application CN202010085968.4A)
- Authority
- CN
- China
- Prior art keywords
- model
- filling
- input
- features
- missing
- Prior art date
- Legal status: Withdrawn
Classifications
- G06F18/10—Pattern recognition; Pre-processing; Data cleansing
- G06F18/23—Pattern recognition; Analysing; Clustering techniques
- G06N7/02—Computing arrangements based on specific mathematical models using fuzzy logic
Abstract
The invention discloses a method for fine modeling and missing value filling of incomplete data based on alternate learning, and belongs to the field of data mining. First, the input space is divided into several subsets by a fuzzy clustering algorithm, and a dedicated local linear regression model is established for each subset; a global model is then constructed as the weighted sum of the local linear regression models, which improves the fineness of the model. A stepwise regression algorithm further selects the significant input features for each subset to refine the model. Treating the missing values as variables, the invention proposes a model-solving strategy that alternately learns the selection of significant input features, the model parameters, and the missing-value estimates, so that the filling is completed along with the modeling. The invention improves the fineness of the model established in traditional regression filling, effectively solves the problem of incomplete model input data when modeling incomplete data, and achieves good filling precision.
Description
Technical Field
The invention belongs to the field of data mining, and relates to a method for performing fine modeling and missing value filling on incomplete data based on alternate learning.
Background
Data mining techniques can discover information hidden in large amounts of data and thereby provide sound guidance for decision making. However, in many real-world fields, missing data is an almost unavoidable problem. High-quality data is a prerequisite for high-quality data mining. Because many data mining algorithms cannot handle incomplete data sets on their own, missing value filling has become a research hotspot in incomplete data analysis. Researchers have proposed various missing value filling methods, such as the mean filling method, the hot-deck filling method, clustering-based filling methods, and regression filling methods.
The mean filling method (H.L. Shashirekha, A.H. Wani, Analysis of imputation algorithms for microarray gene expression data, in: 2015 International Conference on Applied and Theoretical Computing and Communication Technology, Davangere, India, 2015) replaces each missing value with the mean of the existing data in the incomplete attribute column. Although this method fills missing values quickly, it reduces the diversity of the filled values, and its filling effect is therefore poor.
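As a minimal sketch, column-mean filling can be written as follows, assuming a NumPy array in which NaN marks the missing entries (the function name is illustrative):

```python
import numpy as np

def mean_fill(X):
    """Replace each NaN with the mean of the existing values in its column."""
    X = X.astype(float).copy()
    col_means = np.nanmean(X, axis=0)      # per-attribute mean over existing values
    rows, cols = np.where(np.isnan(X))     # positions of missing entries
    X[rows, cols] = col_means[cols]
    return X

X = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [np.nan, 8.0]])
print(mean_fill(X))                        # NaNs replaced by 2.0 and 6.0
```

Note how every missing entry in a column receives the same value, which is exactly the loss of diversity the text describes.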
Unlike the mean filling method, the hot-deck filling method (T. Srebotnjak, G. Carr, A. Sherbinin, C. Rickwood, A global Water Quality Index and hot-deck imputation of missing data, Ecological Indicators, 17 (2012) 108–) finds, for each incomplete sample, the most similar complete sample and fills the missing values with that sample's corresponding values. Because the correlation between samples is taken into account, this method generally fills better than the mean filling method.
Similar to the hot-deck filling method, the clustering-based filling method (C.F. Tsai, M.L. Li, W.C. Lin, A class center based approach for missing value imputation, Knowledge-Based Systems, 151 (2018) 124–) first clusters the data set and then fills the missing values of an incomplete sample using information from the cluster to which it belongs.
Unlike the above methods, the regression filling method (C. Crambes, Y. Henchiri, Regression imputation in the functional linear model with missing values in the response, Journal of Statistical Planning and Inference, 201 (2018) 103–119) is a model-based filling method. Its main idea is to build a regression model for the incomplete data according to the dependency relationships between attributes and then to fill in the missing values based on the built regression model. Because the correlation between attributes is taken into account, this filling method generally performs better than the methods above. However, the filling result of the regression filling method is usually strongly influenced by the accuracy of the established regression model, so the modeling of incomplete data has attracted the interest of many researchers. How to handle incomplete model input data and how to properly describe the relationships between attributes are the two major issues facing incomplete data modeling.
Currently, for the incompleteness of the model input data, a simple approach is to delete all incomplete samples containing missing attribute values and to model on the complete-sample part of the incomplete data set (F. Honghai, C. Guoshun, Y. Cheng, Y. Bingru, C. Yumei, A SVM regression based approach to filling in missing values, Lecture Notes in Computer Science, 3683 (2005) 581–587). This approach only suits cases with a low missing rate or few attributes, because when the deletion scale is too large, a great deal of useful information is discarded and the modeling effect deteriorates. A more popular approach is to pre-fill the missing values before modeling and then model on the reconstructed complete data set (H. Kim, G.H. Golub, H. Park, Missing value estimation for DNA microarray gene expression data: local least squares imputation, Bioinformatics, 21 (2005) 187–198). This approach retains the existing values in incomplete samples and thereby improves information utilization, but because the missing values are pre-filled, the quality of the pre-filled values directly affects the model accuracy.
Some researchers build different models for different sample clusters to describe the relationships between attributes more reasonably. A filling method based on clustering and regression models divides the data set into different clusters and establishes a dedicated least squares regression model in each cluster to predict the missing values (P. Keerin, W. Kurutach, T. Boongoen, Cluster-based imputation of missing values in DNA microarray expression data using a cluster-based LLS method, in: International Symposium on Communications and Information Technologies, Surat Thani, Thailand, 2013, pp. 559–). Compared with the traditional regression filling method, this method fills better. A filling approach based on clustering and stacked denoising autoencoders combines the two techniques: it first partitions the samples with the k-means clustering algorithm and then constructs a different stacked-denoising-autoencoder model in each cluster to fill the missing values (W.C. Ku, G.R. Jagadeesh, A. Prakash, T. Srikanthan, A clustering-based approach for data-driven imputation of missing traffic data, in: IEEE Forum on Integrated and Sustainable Transportation Systems (FISTS), Beijing, China, 2016).
In recent years, researchers have applied the Takagi-Sugeno (TS) fuzzy model to the analysis and prediction of incomplete data and achieved good filling performance. The filling method based on incomplete-data fuzzy modeling first pre-fills the missing values with the cluster centers, then models the reconstructed complete data set with a TS fuzzy model, and finally predicts the missing values from the built model (X. Lai, X. Liu, L. Zhang, et al., Missing Value Imputation by Rule-Based Incomplete Data Fuzzy Modeling, in: IEEE International Conference on Communications (ICC), Shanghai, China, 2019). The main idea of the TS fuzzy model is to divide the input space into several subsets, establish a different linear regression equation on each subset, and connect the linear regression equations through membership degrees. The model consists of a series of "IF-THEN" fuzzy rules whose consequents are usually linear functions of the input variables. Given an incomplete data set $X$ with sample size $n$ and $s$ attributes, let $x_k=[x_{1k},x_{2k},\dots,x_{sk}]$ ($1\le k\le n$) denote the $k$-th sample and $x_{jk}$ ($1\le j\le s$) the $j$-th attribute value of $x_k$. Taking the $j$-th attribute as the model output and the remaining attributes as the model input, the $i$-th fuzzy rule has the form

$R^{(i)}$: IF $x_{1k}$ is $A_1^{(i)}$ and $\dots$ and $x_{qk}$ is $A_q^{(i)}$ and $\dots$, THEN $\hat{y}_k^{(i)} = p_0^{(i)} + \sum_q p_q^{(i)} x_{qk}$,   (1)
where $c$ is the number of fuzzy rules; $A_q^{(i)}$ represents the subset to which the $q$-th input feature belongs in the antecedent of the $i$-th fuzzy rule; $p_q^{(i)}$ are the back-part parameters of the $i$-th fuzzy rule; and $\hat{y}_k^{(i)}$ is the output of the $i$-th fuzzy rule. The final output of the model is given in equation (2):

$\hat{y}_k = \sum_{i=1}^{c} \bar{w}_k^{(i)}\,\hat{y}_k^{(i)}$,   (2)
where $\bar{w}_k^{(i)}$ is the contribution weight of the $i$-th fuzzy rule, obtained from equation (3):

$\bar{w}_k^{(i)} = \dfrac{w_k^{(i)}}{\sum_{l=1}^{c} w_k^{(l)}}$, with $w_k^{(i)} = \bigwedge_q \mu_{A_q^{(i)}}(x_{qk})$,   (3)
in the formula, an operator lambada represents a small operation;denotes xqkBelong to a subsetDegree of membership of, characterizing xqkMembership to a subsetTo the extent of (c). Compared with the traditional regression model, the TS fuzzy model considers the difference of the regression relations in different subsets, and further considers the difference of the regression relations in different subsetsSuitable for describing the relationship between attributes.
Disclosure of Invention
The invention provides a method for fine modeling and missing value filling of incomplete data based on alternate learning. The input space is divided on the basis of a TS fuzzy model, significant input features are then selected for each subset to improve the fineness of the model, and an alternate learning strategy is proposed to solve the fine model and fill the missing values. The alternate learning strategy effectively weakens the influence of the pre-filling quality on both the selection of the input features and the model parameters, and thus yields better filling results. Compared with the traditional regression filling method, this filling method can effectively improve the filling precision.
The invention divides the input space into several subsets, establishes a dedicated linear regression equation for each subset, and then uses a stepwise regression algorithm to select the significant input features for each linear regression equation so as to improve the fineness of the model. On this basis, the missing values are treated as variables, and the selection of significant input features, the back-part parameters of the model, and the missing-value estimates are learned alternately until the iteration converges, which solves the problem of incomplete model input data. When the iteration converges, the filling is completed along with the modeling.
The technical scheme of the invention is as follows:
a method for performing fine modeling and missing value filling on incomplete data based on alternate learning specifically comprises the following steps:
(1) modeling
The input space is first partitioned using a fuzzy C-means clustering algorithm based on a partial distance strategy (FCM-PDS). Given an incomplete data set with sample size $n$ and $s$ attributes, the FCM-PDS algorithm divides the input space into $c$ subsets by minimizing the objective function in equation (4),

$J = \sum_{k=1}^{n}\sum_{i=1}^{c} u_{ki}^{m}\, d_{ki}^{2}$,   (4)
where $u_{ki}$ represents the degree to which sample $x_k$ belongs to subset $A^{(i)}$; $m$ is the weighting exponent of the membership degree, $m\in(1,\infty)$; and $d_{ki}$ denotes the partial distance between $x_k$ and the cluster center $v_i=[v_{1i},v_{2i},\dots,v_{si}]$ ($1\le i\le c$), computed as in equation (5):

$d_{ki}^{2} = \dfrac{s}{\sum_{j=1}^{s} I_{jk}} \sum_{j=1}^{s} (x_{jk}-v_{ji})^{2}\, I_{jk}$, with $I_{jk}=\begin{cases}1, & x_{jk}\in X_P\\ 0, & x_{jk}\in X_M\end{cases}$,   (5)

where $v_{ji}$ denotes the $j$-th attribute value of $v_i$; $I_{jk}$ marks whether $x_{jk}$ is missing; and $X_M$ and $X_P$ are the set of all missing values and the set of all existing values, respectively.
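The partial distance of equation (5) can be sketched as follows, assuming missing entries are marked with NaN (the function name is illustrative): the distance is computed over the existing attributes only and rescaled by the total number of attributes.

```python
import numpy as np

def partial_distance_sq(x, v):
    """Squared partial distance between a possibly incomplete sample x and a
    cluster center v: sum of squared differences over the existing attributes,
    rescaled by s / (number of existing attributes)."""
    present = ~np.isnan(x)                  # I_jk = 1 where x_jk exists
    s = x.size
    if not present.any():
        return 0.0                          # degenerate case: fully missing sample
    diff = x[present] - v[present]
    return (s / present.sum()) * float(diff @ diff)

x = np.array([1.0, np.nan, 3.0])
v = np.array([0.0, 5.0, 1.0])
print(partial_distance_sq(x, v))            # (3/2) * (1 + 4) = 7.5
```

The rescaling keeps distances of samples with different numbers of missing entries comparable, which is what allows FCM to cluster incomplete samples directly.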
Then a stepwise regression algorithm is used to select the significant input features of each fuzzy rule. The stepwise regression algorithm introduces, one by one in order of importance, the features that have a significant influence on the output into the regression model, and re-tests the significance of the features already in the model whenever a new feature is introduced. If an existing feature becomes insignificant after a new feature is introduced, the least significant feature is deleted. The algorithm terminates when no new feature can be selected into the regression model and no insignificant feature can be removed from it.
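The forward-backward procedure above can be sketched with partial F-tests as the significance test, which is one common realization; the thresholds, helper names, and test data are illustrative assumptions, not the patent's exact implementation.

```python
import numpy as np
from scipy import stats

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit on the given columns plus intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def partial_f_pvalue(rss_small, rss_big, df2):
    """p-value of the partial F-test for adding/keeping one feature."""
    F = (rss_small - rss_big) / (rss_big / df2)
    return 1.0 - stats.f.cdf(F, 1, df2)

def stepwise_select(X, y, alpha_in=0.05, alpha_out=0.10):
    n, s = X.shape
    sel = []
    while True:
        changed = False
        # forward step: add the most significant feature not yet selected
        base = rss(X, y, sel)
        cand = [(partial_f_pvalue(base, rss(X, y, sel + [j]), n - len(sel) - 2), j)
                for j in range(s) if j not in sel]
        if cand:
            p, j = min(cand)
            if p < alpha_in:
                sel.append(j); changed = True
        # backward step: drop the least significant selected feature
        full = rss(X, y, sel)
        drop = [(partial_f_pvalue(rss(X, y, [k for k in sel if k != j]), full,
                                  n - len(sel) - 1), j) for j in sel]
        if drop:
            p, j = max(drop)
            if p > alpha_out:
                sel.remove(j); changed = True
        if not changed:
            return sorted(sel)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 3.0 * X[:, 2] + 0.1 * rng.normal(size=200)
print(stepwise_select(X, y))   # features 0 and 2 drive y and should be selected
```

Choosing `alpha_out` larger than `alpha_in` is the standard guard against a feature being added and removed in an endless loop.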
After the input space is divided and the significant input features of each fuzzy rule are selected, let $T^{(i)}=\{x_1,x_2,\dots,x_{m_i}\}$ be the significant input feature set of the $i$-th fuzzy rule, where $m_i$ is the number of selected features and each significant input feature is $x_j=[x_{j1},x_{j2},\dots,x_{jn}]^\mathrm{T}$ ($1\le j\le m_i$). The $i$-th fuzzy rule is simplified from equation (1) to equation (6),

$R^{(i)}$: IF $\bar{x}_{1k}$ is $A_1^{(i)}$ and $\dots$ and $\bar{x}_{m_i k}$ is $A_{m_i}^{(i)}$, THEN $\hat{y}_k^{(i)} = p_0^{(i)} + \sum_{q=1}^{m_i} p_q^{(i)} \bar{x}_{qk}$,   (6)
where $c$ is the number of fuzzy rules; $\hat{y}_k^{(i)}$ is the output of the $i$-th fuzzy rule; $\bar{x}_k$ is the reduced $k$-th sample; $A_q^{(i)}$ ($1\le q\le m_i$) is the subset to which the $q$-th input feature belongs in the antecedent of the simplified $i$-th fuzzy rule; and $p_q^{(i)}$ are the back-part parameters of the simplified $i$-th fuzzy rule. Accordingly, the contribution weight of the $i$-th fuzzy rule becomes $\hat{w}_k^{(i)}$, computed as in equation (7):

$\hat{w}_k^{(i)} = \bigwedge_{j=1}^{m_i} \mu_{A_j^{(i)}}(x_{jk})$,   (7)
in the formula, degree of membership of a single variableBy multivariate degree of membershipObtained through Gaussian projection, as shown in formula (8):
where $a_{ji}$ and $b_{ji}$ denote the center and the standard deviation of the Gaussian function, computed as in equation (9):

$a_{ji} = \dfrac{\sum_{k=1}^{n} u_{ki}\,x_{jk}}{\sum_{k=1}^{n} u_{ki}}$, $\qquad b_{ji} = \sqrt{\dfrac{\sum_{k=1}^{n} u_{ki}\,(x_{jk}-a_{ji})^{2}}{\sum_{k=1}^{n} u_{ki}}}$,   (9)
where $u_{ki}$ represents the degree to which sample $x_k$ belongs to the fuzzy subset $A^{(i)}$. The output of the TS fuzzy model, $\hat{y}_k$, can be calculated from equation (10):

$\hat{y}_k = \dfrac{\sum_{i=1}^{c} \hat{w}_k^{(i)}\,\hat{y}_k^{(i)}}{\sum_{i=1}^{c} \hat{w}_k^{(i)}}$.   (10)
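Equations (7), (8), and (10) can be sketched together, assuming the rule parameters (Gaussian centers, standard deviations, and back-part parameters) are already known; all names are illustrative.

```python
import numpy as np

def gaussian_membership(x, a, b):
    """Univariate memberships mu(x_j) of one rule; a and b hold that rule's
    Gaussian centers and standard deviations for each input feature (eq. 8)."""
    return np.exp(-((x - a) ** 2) / (2.0 * b ** 2))

def ts_output(x, rules):
    """TS model output for one sample x: each rule is (a, b, p) with consequent
    y_i = p[0] + p[1:] @ x; the rule weight is the min of its memberships."""
    weights, outputs = [], []
    for a, b, p in rules:
        w = gaussian_membership(x, a, b).min()   # eq. (7): min over input features
        weights.append(w)
        outputs.append(p[0] + p[1:] @ x)
    weights = np.array(weights)
    return float(weights @ np.array(outputs) / weights.sum())   # eq. (10)

# two toy rules over two input features
rules = [(np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([1.0, 1.0, 0.0])),
         (np.array([2.0, 2.0]), np.array([1.0, 1.0]), np.array([3.0, 0.0, 1.0]))]
x = np.array([1.0, 1.0])
print(ts_output(x, rules))   # both rules fire equally; output = (2 + 4) / 2 = 3.0
```

With the sample midway between both rule centers, the weights are equal and the output is the plain average of the two local linear models, which shows how the memberships blend the subset-specific regressions.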
(2) missing value filling
Because a single TS fuzzy model can only fill the missing values of a single incomplete attribute column, each incomplete attribute column is taken as the output in turn, with all remaining attributes as input, to establish multiple TS fuzzy models. To handle the incompleteness of the model input data, the missing values are treated as variables and an alternate learning strategy is proposed for model solving and missing value filling. The alternate learning strategy consists of the following steps:
step 1: the missing values are mean pre-padded to obtain a reconstructed complete data set.
Step 2: significant input features and back-piece parameters of the model are updated based on the reconstructed complete data set.
And step 3: and obtaining model output according to the significant input features and the back-part parameters of the updated model and updating the missing value by using the model output.
And 4, step 4: if the filling error obtained by the existing value and the corresponding model output is larger than or equal to the given threshold value, returning to the step 2; otherwise, the model corresponding to the missing value is used to output the filled missing value and the filled data set is output.
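Steps 1-4 can be sketched as a driver loop. Here `fit_model` and `predict` are illustrative placeholders standing in for the stepwise-regression/least-squares update and the TS model output; the convergence test follows Step 4, using the change in the reconstruction error on the existing values.

```python
import numpy as np

def alternate_fill(X, fit_model, predict, eps=1e-4, max_iter=100):
    """Alternately refit the model on the reconstructed data set and refill
    the missing values with the model output until the reconstruction-error
    change drops below eps."""
    missing = np.isnan(X)
    Xr = X.copy()
    Xr[missing] = np.take(np.nanmean(X, axis=0),          # step 1: mean pre-fill
                          np.where(missing)[1])
    prev_err = np.inf
    for _ in range(max_iter):
        model = fit_model(Xr)                             # step 2: features + parameters
        Xhat = predict(model, Xr)                         # step 3: model output...
        Xr[missing] = Xhat[missing]                       # ...refills the missing values
        err = np.sqrt(np.mean((Xhat[~missing] - X[~missing]) ** 2))
        if abs(prev_err - err) < eps:                     # step 4: convergence check
            break
        prev_err = err
    return Xr

# toy demo: the "model" is just the column means, so the loop converges at once
X = np.array([[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]])
fit = lambda Z: Z.mean(axis=0)
pred = lambda m, Z: np.tile(m, (Z.shape[0], 1))
print(alternate_fill(X, fit, pred))
```

The key design point is that only the missing positions are overwritten in each pass; the existing values stay fixed and serve as the ground truth against which convergence is measured.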
The invention has the following beneficial effects. First, on the basis of regression modeling, the input space is divided and a linear regression equation is established for each subset; significant input features are then selected for each equation. These two steps improve the fineness of the model and enhance the filling performance. Second, to address the incompleteness of the model input, the missing values are treated as variables and an alternate learning strategy is proposed, so that the selection of significant input features, the back-part parameters of the model, and the missing-value estimates are learned alternately until the iteration converges. During the alternate learning, the model structure and parameters become gradually more accurate as the filling accuracy improves, and the more accurate structure and parameters in turn make the filled values more reasonable.
Drawings
Fig. 1 is an overall workflow diagram of the present invention.
FIG. 2 is a workflow diagram of the alternate learning strategy of the present invention.
Detailed Description
The following detailed description of the embodiments of the invention is provided in conjunction with the accompanying drawings.
Fig. 1 is the overall workflow diagram of the present invention. In the figure, the first row 1, 2, …, s of the incomplete data set denotes the attribute indices, black marks denote missing values, and white marks denote existing values. The invention first divides the incomplete data set into several subsets using the FCM-PDS algorithm and uses these subsets as input to the subsequent feature selection. The incomplete data set is then pre-filled with column means to obtain a reconstructed complete data set, and feature selection is performed on each subset with the stepwise regression algorithm based on the reconstructed complete data set to obtain the significant input features of the model. The back-part parameters of the model are then computed by least squares, and the model output is computed from the back-part parameters and the significant input features. Finally, treating the missing values as variables, the reconstructed complete data set is updated from the model output, and the significant input features, back-part parameters, and model output are updated for the next iteration. If the change, between two adjacent iterations, of the reconstruction error computed from the existing values and their corresponding model outputs is smaller than a specified threshold, the iteration has converged, the filling of the missing values is completed along with the modeling, and the filled data set is output. Otherwise, the reconstructed complete data set is updated and the next iteration proceeds.
Examples
The details of the present invention are described using the Blood data set from the UCI machine learning repository as an example. Blood is a complete data set with a sample size of 748 and 4 attributes; part of the data is deleted manually to construct an incomplete data set.
Assuming the sample space of the 748 samples is divided into 2 subsets, the two fuzzy rules of the TS fuzzy model established with the first attribute as output and all remaining attributes as input are shown in equation (11), and the resulting model is denoted TS-1,
similarly, the 2 nd, 3 rd and 4 th dimension attributes are output in sequence, all other attributes are input and are modeled based on the TS fuzzy model, and the built model is expressed by TS-j (j is more than or equal to 1 and less than or equal to 4). And then, carrying out mean value pre-filling on the missing values to obtain a reconstructed complete data set, and carrying out selection of significant input features on each fuzzy rule. Suppose R in formula (11)(1)Is T(1)={x2,…,xm1},R(2)Is T(2)={x2,…,xm2Is reduced to the formula (12) shown in the formula (11)
The output of the model TS-1 can then be represented by equation (13).
Let $P=[P^{(1)},P^{(2)},\dots,P^{(c)}]^\mathrm{T}$, where $P^{(i)}$ is the back-part parameter vector of fuzzy rule $R^{(i)}$. The back-part parameters of the model TS-j can then be obtained from equation (14):
$P=(B^\mathrm{T}B)^{-1}B^\mathrm{T}y$,   (14)
where $y=[x_{j1},x_{j2},\dots,x_{jn}]^\mathrm{T}$ ($1\le j\le 4$) is the desired output vector; $B=[B^{(1)},B^{(2)},\dots,B^{(c)}]$, and $B^{(i)}$ ($1\le i\le 2$) is given by equation (15):
after the back-part parameters of the model TS-j are obtained, the output corresponding to TS-j can be obtained based on the formula (16)
The present invention treats the missing values as variables and designs an alternate learning strategy to weaken the influence of the pre-filling quality on the model accuracy; the implementation details of the strategy are shown in Fig. 2. In Fig. 2, $X_P$ denotes the set of all existing values; $X_M$ the set of all missing values; $\hat{X}_P$ the set of model outputs corresponding to the existing values; and $\hat{X}_M$ the set of model outputs corresponding to the missing values. First, the missing values are updated from $\hat{X}_M$ to adjust the reconstructed complete data set. Then, based on the stepwise regression algorithm, the significant input feature set of each fuzzy rule is adjusted from $T_{t-1}^{(i)}$ to $T_t^{(i)}$, where $T_{t-1}^{(i)}$ and $T_t^{(i)}$ denote the significant input feature set of rule $R^{(i)}$ in the previous and the current iteration, respectively. Next, based on the least squares method, the back-part parameters of each fuzzy rule are adjusted from $P_{t-1}^{(i)}$ to $P_t^{(i)}$, where $P_{t-1}^{(i)}$ and $P_t^{(i)}$ denote the back-part parameters of $R^{(i)}$ in the previous and the current iteration. The output $\hat{y}^{(i)}$ of $R^{(i)}$ is then computed from its significant input feature set and back-part parameters, and the weighted sum of the rule outputs gives the output corresponding to TS-j. Finally, the outputs of the $s$ models TS-j are combined into the model output $\hat{X}$, where $\hat{X}_M$ is used to update the missing values and $\hat{X}_P$ is used, together with the existing values, to compute the reconstruction error $f_e$. If $\Delta f_e<\varepsilon$, the iteration terminates and the filled data set is output; if $\Delta f_e\ge\varepsilon$, the next iteration proceeds. Here $\varepsilon$ denotes the threshold and $\Delta f_e=|f_e^{(t)}-f_e^{(t-1)}|$, where $f_e^{(t)}$ and $f_e^{(t-1)}$ denote the reconstruction errors computed from the existing values and their corresponding model outputs in the current and the previous iteration, respectively.
Comparative example
Three data sets are selected from the UCI machine learning repository to verify the filling performance of the method; the data sets are described in Table 1. To allow computing the error between the estimated and the true missing values, all selected data sets are complete, and the experiment constructs incomplete data sets by manually deleting part of the data at specified missing rates. The specified missing rates are 5%, 10%, 15%, 20%, 25%, and 30%.
Table 1 data set description
The experiment compares six methods, all of which pre-fill the missing values with column means before modeling; the sixth is the filling method proposed by the invention.
(1) Model incomplete data with a linear regression model, using all features as input (REG).
(2) Model incomplete data with a linear regression model, using stepwise regression to select significant features as input (REG-SR).
(3) On the basis of REG-SR, treat the missing values as variables and alternately learn the model structure, model parameters, and missing values until convergence (REG-SR-AL).
(4) Model incomplete data with a TS fuzzy model, using all features as input (TS).
(5) Model incomplete data with a TS fuzzy model, using stepwise regression to select significant features as input for each subset (TS-SR).
(6) On the basis of TS-SR, treat the missing values as variables and alternately learn the model structure, model back-part parameters, and missing values until convergence (TS-SR-AL).
The Root Mean Square Error (RMSE) is used in this experiment to evaluate the filling effect. The RMSE is the square root of the mean squared deviation between the filled values and the corresponding true values; it reflects the modeling accuracy well and is computed as:

$\mathrm{RMSE}=\sqrt{\dfrac{1}{N}\sum_{t=1}^{N}\left(z_t-\hat{z}_t\right)^{2}}$,
where $N$ is the number of missing values in the data set, $z_t$ is the true value at the $t$-th missing position, and $\hat{z}_t$ is the corresponding filled value. Table 2 shows the RMSE results of the six filling methods, where the best results are bolded and underlined and the second-best results are bolded.
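The RMSE criterion can be sketched directly (the function name is illustrative):

```python
import numpy as np

def rmse(z_true, z_fill):
    """Root mean square error between true values and filled values."""
    z_true = np.asarray(z_true, dtype=float)
    z_fill = np.asarray(z_fill, dtype=float)
    return float(np.sqrt(np.mean((z_true - z_fill) ** 2)))

print(rmse([1.0, 2.0, 3.0], [1.0, 4.0, 1.0]))   # sqrt((0 + 4 + 4) / 3) ≈ 1.633
```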
TABLE 2 RMSE indices of six filling methods
Comparing TS with REG, TS-SR with REG-SR, and TS-SR-AL with REG-SR-AL in Table 2 shows that establishing a linear regression model in each subspace after dividing the input space yields smaller filling errors than directly establishing a single linear model. Comparing the results of REG-SR with REG and of TS-SR with TS shows that selecting features for the input of the linear regression model during modeling also reduces the filling errors. Comparing TS-SR-AL with TS-SR and REG-SR-AL with REG-SR shows that the alternate learning strategy markedly improves the filling precision.
In conclusion, the proposed TS-SR-AL achieves the best results, and its filling precision is better than that of all the comparison methods.
Claims (1)
1. A method for performing fine modeling and missing value filling on incomplete data based on alternate learning is characterized by comprising the following steps:
(1) modeling
First, the input space is divided using a fuzzy C-means clustering algorithm based on a partial distance strategy; given an incomplete data set with sample size $n$ and $s$ attributes, the algorithm divides the input space into $c$ subsets by minimizing the objective function in equation (4),

$J = \sum_{k=1}^{n}\sum_{i=1}^{c} u_{ki}^{m}\, d_{ki}^{2}$,   (4)
where $u_{ki}$ represents the degree to which sample $x_k$ belongs to subset $A^{(i)}$; $m$ is the weighting exponent of the membership degree, $m\in(1,\infty)$; and $d_{ki}$ denotes the partial distance between $x_k$ and the cluster center $v_i=[v_{1i},v_{2i},\dots,v_{si}]$, $1\le i\le c$, computed as in equation (5):

$d_{ki}^{2} = \dfrac{s}{\sum_{j=1}^{s} I_{jk}} \sum_{j=1}^{s} (x_{jk}-v_{ji})^{2}\, I_{jk}$, with $I_{jk}=\begin{cases}1, & x_{jk}\in X_P\\ 0, & x_{jk}\in X_M\end{cases}$,   (5)

where $v_{ji}$ denotes the $j$-th attribute value of $v_i$; $I_{jk}$ marks whether $x_{jk}$ is missing; and $X_M$ and $X_P$ are the set of all missing values and the set of all existing values, respectively;
then, a stepwise regression algorithm is used for selecting the significant input features of each fuzzy rule: the step-by-step regression algorithm introduces the characteristics which have obvious influence on the output into the regression model one by one according to the importance, and the characteristics which are selected into the regression model are subjected to significance testing again when a new characteristic is introduced; if the existing features in the regression model become insignificant due to the introduction of new features, deleting the least significant features; terminating the algorithm when neither new features can be selected into the regression model nor insignificant features can be removed from the regression model;
after the input space is divided and the significant input features of each fuzzy rule are selected, the significant input feature set of the ith fuzzy rule is madeAnd m isiFor the number of selected input features, in which the features are significantly inputThe ith fuzzy rule is reduced to equation (6),
wherein c is the number of fuzzy rules;an output representing the ith fuzzy rule;is the reduced kth sample;for simplificationM in the preceding part of the ith fuzzy ruleiA subset to which the dimension input features belong;the back-part parameters of the simplified ith fuzzy rule are obtained; contribution weight of the ith fuzzy ruleIs calculated as shown in equation (7):
in the formula, degree of membership of a single variableBy multivariate degree of membershipObtained through Gaussian projection, as shown in formula (8):
wherein a isjiAnd bjiRespectively representing the center of a Gaussian function and the standard deviation of the Gaussian function, and the calculation formula is shown as formula (9):
wherein u iskiRepresents a sample xkMembership to fuzzy subset A(i)The degree of (d); the output of the TS fuzzy modelCalculated from equation (10):
(2) missing value filling
Taking each incomplete attribute column as output in turn, with all remaining attributes as input, multiple TS fuzzy models are established; the missing values are treated as variables, and an alternate learning strategy is adopted for model solving and missing value filling, with the following steps:
step 1: performing mean pre-filling on the missing values to obtain a reconstructed complete data set;
step 2: updating salient input features and back-piece parameters of the model based on the reconstructed complete data set;
and step 3: obtaining model output according to the significant input features of the updated model and the parameters of the back-part, and updating the missing value by using the model output;
and 4, step 4: if the filling error obtained by the existing value and the corresponding model output is larger than or equal to the given threshold value, returning to the step 2; otherwise, the model corresponding to the missing value is used to output the filled missing value and the filled data set is output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010085968.4A CN111340069A (en) | 2020-02-11 | 2020-02-11 | Incomplete data fine modeling and missing value filling method based on alternate learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111340069A true CN111340069A (en) | 2020-06-26 |
Family
ID=71185286
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010085968.4A Withdrawn CN111340069A (en) | 2020-02-11 | 2020-02-11 | Incomplete data fine modeling and missing value filling method based on alternate learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111340069A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112835884A (en) * | 2021-02-19 | 2021-05-25 | 大连海事大学 | Missing data filling method and system in marine fishing ground fishing situation forecasting system |
CN112835884B (en) * | 2021-02-19 | 2023-05-16 | 大连海事大学 | Missing data filling method and system in ocean fishing ground fish condition forecasting system |
CN113240213A (en) * | 2021-07-09 | 2021-08-10 | 平安科技(深圳)有限公司 | Method, device and equipment for selecting people based on neural network and tree model |
CN115423005A (en) * | 2022-08-22 | 2022-12-02 | 江苏大学 | Big data reconstruction method and device for combine harvester |
CN115423005B (en) * | 2022-08-22 | 2023-10-31 | 江苏大学 | Big data reconstruction method and device for combine harvester |
CN116861042A (en) * | 2023-09-05 | 2023-10-10 | 国家超级计算天津中心 | Information verification method, device, equipment and medium based on material database |
CN116861042B (en) * | 2023-09-05 | 2023-12-05 | 国家超级计算天津中心 | Information verification method, device, equipment and medium based on material database |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111340069A (en) | Incomplete data fine modeling and missing value filling method based on alternate learning | |
CN107992976B (en) | Hot topic early development trend prediction system and prediction method | |
Zhan et al. | A fast kriging-assisted evolutionary algorithm based on incremental learning | |
CN112232413B (en) | High-dimensional data feature selection method based on graph neural network and spectral clustering | |
CN110232434A (en) | A kind of neural network framework appraisal procedure based on attributed graph optimization | |
CN105930862A (en) | Density peak clustering algorithm based on density adaptive distance | |
CN111597760B (en) | Method for obtaining gas path parameter deviation value under small sample condition | |
CN113326731A (en) | Cross-domain pedestrian re-identification algorithm based on momentum network guidance | |
CN108171012B (en) | Gene classification method and device | |
CN111814907A (en) | Quantum generation countermeasure network algorithm based on condition constraint | |
Song et al. | Nonnegative Latent Factor Analysis-Incorporated and Feature-Weighted Fuzzy Double $ c $-Means Clustering for Incomplete Data | |
CN115730635A (en) | Electric vehicle load prediction method | |
CN107240028B (en) | Overlapped community detection method in complex network of Fedora system component | |
CN111832817A (en) | Small world echo state network time sequence prediction method based on MCP penalty function | |
Lu et al. | Robust and scalable Gaussian process regression and its applications | |
CN111353525A (en) | Modeling and missing value filling method for unbalanced incomplete data set | |
CN109934344A (en) | A kind of multiple target Estimation of Distribution Algorithm of improved rule-based model | |
CN112270047B (en) | Urban vehicle path optimization method based on data-driven group intelligent calculation | |
CN113610350B (en) | Complex working condition fault diagnosis method, equipment, storage medium and device | |
CN112465253B (en) | Method and device for predicting links in urban road network | |
CN114529096A (en) | Social network link prediction method and system based on ternary closure graph embedding | |
Hu et al. | Pwsnas: powering weight sharing nas with general search space shrinking framework | |
Ortelli et al. | Faster estimation of discrete choice models via dataset reduction | |
Wu et al. | A training-free neural architecture search algorithm based on search economics | |
Tian et al. | Microbial Network Recovery by Compositional Graphical Lasso |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |
WW01 | Invention patent application withdrawn after publication | | Application publication date: 20200626 |