CN109165664A - Attribute-missing data set completion and prediction method based on a generative adversarial network

Attribute-missing data set completion and prediction method based on a generative adversarial network

Info

Publication number
CN109165664A
Authority
CN
China
Prior art keywords
data
network
prediction
attribute
filling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810722774.3A
Other languages
Chinese (zh)
Other versions
CN109165664B (en)
Inventor
赵跃龙
王禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201810722774.3A
Publication of CN109165664A
Application granted
Publication of CN109165664B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention discloses an attribute-missing data set completion and prediction method based on a generative adversarial network (GAN), comprising the steps of: 1) applying min-max normalization to the data and one-hot encoding to discrete attributes, with missing values marked as 0; 2) building a missing-position encoding vector for each sample in the data set; 3) constructing a generative adversarial network and an auxiliary prediction network to fill the data and predict labels; 4) restoring the filled values to their scale before min-max normalization, using the maximum and minimum of each attribute; 5) choosing suitable hyperparameters by testing. The invention makes full use of the data distribution information and the label information in the data set and can effectively fill high-dimensional data sets with missing values. Once training is complete, the auxiliary prediction network included in the method can directly output a label prediction for input data with missing attributes; the process is simple and achieves higher prediction accuracy.

Description

Attribute-missing data set completion and prediction method based on a generative adversarial network
Technical field
The present invention relates to the technical field of data preprocessing, and in particular to an attribute-missing data set completion and prediction method based on a generative adversarial network.
Background art
Missing attributes are a widespread phenomenon in all kinds of data sets, usually caused by information loss during data acquisition or transmission. When samples in a data set lose one or more attributes, the accuracy of prediction and classification models subsequently built on that data declines. How to complete the missing data, and how to use the information contained in samples with missing attributes to build a high-accuracy prediction model, are key problems in data preprocessing.
Most statistical tools handle missing attributes by deleting the rows or columns that contain missing values, or by filling the missing positions with the column median or mean. These approaches are efficient and convenient, but they do not fully use the distribution information of the sample data, which makes the results inaccurate. In multidimensional data, different attributes are often correlated, and these correlations provide additional information for filling. Filling methods that take such correlations into account estimate missing values with smaller deviation and can therefore exploit more of the information contained in incomplete samples.
On this basis, more advanced filling methods fill missing values by modeling. Regression imputation treats the missing attribute as the dependent variable and fits a regression equation to predict it. The EM algorithm first initializes the missing values and then iterates E steps and M steps to obtain the final filled result. The k-nearest-neighbor (KNN) algorithm computes Euclidean distances over the non-missing attributes, finds the k most similar samples in the data set, and obtains the filled value by weighted averaging. With enough data, these algorithms usually fill more accurately than the mean or median, but they have problems of their own: regression imputation requires a significant linear relationship between attributes; EM-based filling has high computational complexity and easily falls into local optima; KNN-based filling is simple to implement, but on large data sets its computational cost makes it difficult to apply.
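For orientation, the KNN filling described above can be sketched in a few lines. This is a minimal illustration, not code from the patent: the function name `knn_impute`, the use of `None` for a missing value, and the inverse-distance weighting are all assumptions made for the example.

```python
import math

def knn_impute(sample, complete_rows, k=3):
    """Fill None entries of `sample` from its k nearest complete rows."""
    observed = [j for j, v in enumerate(sample) if v is not None]

    def distance(row):
        # Euclidean distance over the attributes that are present in `sample`.
        return math.sqrt(sum((sample[j] - row[j]) ** 2 for j in observed))

    neighbours = sorted(complete_rows, key=distance)[:k]
    # Inverse-distance weights (an assumed weighting scheme);
    # the small constant avoids division by zero for exact matches.
    weights = [1.0 / (distance(row) + 1e-9) for row in neighbours]
    total = sum(weights)

    filled = list(sample)
    for j, value in enumerate(sample):
        if value is None:
            filled[j] = sum(w * row[j] for w, row in zip(weights, neighbours)) / total
    return filled

rows = [[1.0, 2.0, 3.0], [1.1, 2.1, 3.2], [9.0, 9.0, 9.0]]
result = knn_impute([1.0, 2.0, None], rows, k=2)
print(result)
```

Every query sorts all complete rows by distance, which is the cost that grows quickly on large, high-dimensional data sets.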
In addition, the main purpose of data filling is to provide more complete data for subsequent modeling and prediction. The methods above do not involve the modeling process, yet the filled data are often associated with the label to be predicted; combining the prediction model with the filling method lets the filled data yield better predictions. Traditional filling methods therefore have two problems: high computational complexity on high-dimensional data, and failure to exploit label information to correct the filled result. To address them, the present invention learns the data distribution with a generative adversarial network to fill the data, and at the same time builds an auxiliary prediction network to fully mine the association between data and labels, so that their mutual information is maximized.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing an attribute-missing data set completion and prediction method based on a generative adversarial network. The method makes full use of the data distribution information and the label information in the data set and can effectively fill high-dimensional data sets with missing values. Once training is complete, the auxiliary prediction network included in the method can directly output a label prediction for input data with missing attributes; the process is simple and achieves higher prediction accuracy.
To achieve the above object, the technical solution provided by the invention is an attribute-missing data set completion and prediction method based on a generative adversarial network. First, the data set with missing attributes is preprocessed, mainly by min-max normalization and one-hot encoding of discrete variables. Then, for samples with missing attributes, an encoding vector of the missing positions is built to express where values are missing. Next, a filling network for missing data and an auxiliary prediction network are constructed, which complete the filling of the missing data and the label prediction simultaneously. After network training, the output of the generator network in the filling network gives the filled result, which is rescaled using the column maxima and minima recorded during min-max normalization. Finally, the hyperparameter is set by repeatedly modifying it and observing the prediction loss on the validation set. The method comprises the following steps:
1) data preprocessing;
2) constructing the missing-position encoding vector;
3) constructing the missing-data filling network and the auxiliary prediction network;
4) rescaling the filled data;
5) testing and hyperparameter setting.
In step 1), different types of data are preprocessed differently. The data involved are mainly continuous values and discrete values. Continuous values are normalized directly with min-max normalization; discrete values are converted to one-hot encoding and then normalized with min-max normalization. Missing positions are uniformly filled with 0. In addition, the data set is divided, according to whether attributes are missing, into two parts: data with missing attributes and data without missing attributes.
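The preprocessing of step 1) might look like the following sketch. The helper names, the toy columns and the use of `None` for a missing value are assumptions for illustration. One-hot values are already 0 or 1, so min-max normalization leaves them unchanged whenever both values occur in a column; the sketch therefore only scales the continuous column.

```python
def one_hot(value, categories):
    """One-hot encode `value`; a missing value (None) becomes all zeros."""
    return [1.0 if value == c else 0.0 for c in categories]

def minmax_fit(column):
    """Record the min and max of the observed (non-None) values."""
    observed = [v for v in column if v is not None]
    return min(observed), max(observed)

def minmax_transform(column, lo, hi):
    """Scale observed values to [0, 1]; fill missing positions with 0."""
    span = (hi - lo) or 1.0  # guard against constant columns
    return [0.0 if v is None else (v - lo) / span for v in column]

ages = [20.0, None, 40.0, 30.0]          # continuous attribute
colours = ["red", None, "blue", "red"]   # discrete attribute

lo, hi = minmax_fit(ages)                # kept for the later scale restoration
ages_scaled = minmax_transform(ages, lo, hi)
colours_encoded = [one_hot(c, ["red", "blue"]) for c in colours]

# Split row indices by whether any attribute is missing.
rows = list(zip(ages, colours))
incomplete = [i for i, r in enumerate(rows) if any(v is None for v in r)]
complete = [i for i in range(len(rows)) if i not in incomplete]
print(ages_scaled, colours_encoded, incomplete, complete)
```

The recorded `lo` and `hi` per column are what step 4) needs to map generated values back to the original scale.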
In step 2), the missing-position encoding vector is constructed as follows. In data filling, the positions of a sample's missing attributes are themselves important information: when filling with a neural network, only these missing positions need to be filled. To construct the vector, every column of every sample is traversed; if the attribute is missing, the position is marked "1", otherwise "0". After this process, each sample has a corresponding missing-position encoding vector.
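A minimal sketch of the missing-position encoding, assuming missing values are represented as `None` (the representation and the function name `missing_mask` are illustrative choices, not taken from the patent):

```python
def missing_mask(rows):
    """For each row, return a vector with 1 where the attribute is missing, else 0."""
    return [[1 if value is None else 0 for value in row] for row in rows]

data = [
    [0.2, None, 0.7],
    [None, None, 0.1],
    [0.5, 0.4, 0.9],
]
E = missing_mask(data)
for row, mask in zip(data, E):
    print(row, mask)
```

A complete sample simply gets an all-zero vector, so the same encoding works for both parts of the data set.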
In step 3), the missing-data filling network and the auxiliary prediction network are constructed as follows. The network makes the following improvements to the original generative adversarial network: (i) the randomly sampled noise is removed from the generator input; (ii) the filled data are formed from the generated data and the missing-position encoding vector. In addition, introducing the auxiliary prediction network takes fuller account of the relationship between attributes and labels. Prediction is performed at the same time on the data with missing attributes, and the loss between the label predicted by the auxiliary prediction network and the true label is back-propagated with the BP algorithm to update the generator network, so that the generated filled data perform better when a prediction model is built on them. The loss function of the generative adversarial network and the loss function of the auxiliary prediction network are combined, with a hyperparameter controlling their relative weight; this determines whether training favors bringing the distribution of the generated filled data closer to the distribution of the complete data or making the prediction model more accurate. The data filling network and auxiliary prediction network comprise a generator network, a discriminator network and an auxiliary prediction network, whose structures are described in detail below:
Generator network: the input is formed by concatenating the data with missing attributes and the corresponding missing-position encoding vector. Depending on the structure of the data, the hidden layers can be built from fully connected layers or deconvolution layers; in particular, when the input is image data, the filled data are generated with deconvolution operations. Assume here that the input data, denoted I, is a 100-dimensional vector; the corresponding missing-position encoding vector, denoted E, is then also 100-dimensional, and the concatenated input vector has dimension 200. The hidden layers are fully connected layers with the relu activation function. The output layer has 100 output units, denoted O, with the sigmoid activation function. The filled data are finally formed as I(1-E)+OE, with the products taken element-wise.
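The generator's forward pass and the composition I(1-E)+OE can be sketched as follows. This is an untrained, pure-Python illustration: the hidden width of 64, the weight initialization and the helper names are assumptions, and a real implementation would normally use a tensor library.

```python
import math
import random

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, weights, bias, activation):
    """Fully connected layer: activation(W x + b)."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an assumed initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def generator_fill(I, E, hidden, output):
    """Run the generator on concat(I, E) and return I*(1-E) + O*E element-wise."""
    x = I + E                            # list concatenation: 100 + 100 = 200 inputs
    h = dense(x, *hidden, relu)
    O = dense(h, *output, sigmoid)       # 100 generated values in (0, 1)
    return [i * (1 - e) + o * e for i, e, o in zip(I, E, O)]

rng = random.Random(0)
dim, hidden_units = 100, 64              # hidden width is a hypothetical choice
hidden = init_layer(2 * dim, hidden_units, rng)
output = init_layer(hidden_units, dim, rng)

E = [1 if rng.random() < 0.2 else 0 for _ in range(dim)]
I = [0.0 if e else rng.random() for e in E]   # missing positions are zero-filled
filled = generator_fill(I, E, hidden, output)
print(len(filled))
```

Because of the mask, observed attributes pass through unchanged and only the missing positions take generated values, which is why no noise input is needed to keep the observed part intact.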
Discriminator network: the input has two parts. The first is the filled data obtained from the generator output; the second is sample data without missing attributes. The output is a value between 0 and 1, representing the probability the discriminator assigns to the received input coming from data without missing attributes. The network structure depends on the type of input data; when the input is image data, it is built as a convolutional neural network. Assume here that the input is a 100-dimensional vector; the hidden layers can then be fully connected layers with the relu activation function. The output layer contains a single unit with the sigmoid activation function, which represents the probability.
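The patent does not spell out the adversarial loss. A common choice for a discriminator with a single sigmoid output is binary cross-entropy, and the sketch below assumes that choice; the function names are hypothetical.

```python
import math

def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)   # clamp to avoid log(0)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

def discriminator_loss(p_complete, p_filled):
    """Complete samples are labelled 1, filled samples are labelled 0."""
    losses = [bce(p, 1) for p in p_complete] + [bce(p, 0) for p in p_filled]
    return sum(losses) / len(losses)

def generator_adversarial_loss(p_filled):
    """Generator side: filled samples should be scored as complete (label 1)."""
    return sum(bce(p, 1) for p in p_filled) / len(p_filled)

# Discriminator outputs for two complete and two filled samples (made-up numbers).
print(discriminator_loss([0.9, 0.8], [0.1, 0.3]))
print(generator_adversarial_loss([0.1, 0.3]))
```

The two losses pull in opposite directions on the filled samples, which is what drives the filled data toward the distribution of the complete data.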
Auxiliary prediction network: its input is identical to that of the discriminator network, and its output is the predicted label value for the input sample. When the prediction problem is a classification problem, cross entropy is used as the loss function; when it is a regression problem, the L2 norm or L1 norm is used as the loss function. The network structure is set up in the same way as the discriminator network. Assume here that the input is a 100-dimensional vector; the hidden layers can then be fully connected layers with the relu activation function. The output layer contains a single unit, whose activation function is set as described above.
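The task-dependent choice of loss for the auxiliary prediction network can be sketched as below, assuming a single output unit (so the classification case is binary) and averaging over a batch. The function `auxiliary_loss` and its `task` and `norm` parameters are illustrative names only.

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross entropy for a single sigmoid output and a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def l2_loss(pred, target):
    return (pred - target) ** 2

def l1_loss(pred, target):
    return abs(pred - target)

def auxiliary_loss(predictions, labels, task, norm="l2"):
    """Mean loss over a batch: cross entropy for classification, L2 or L1 for regression."""
    if task == "classification":
        per_sample = [binary_cross_entropy(p, y) for p, y in zip(predictions, labels)]
    elif task == "regression":
        loss = l2_loss if norm == "l2" else l1_loss
        per_sample = [loss(p, y) for p, y in zip(predictions, labels)]
    else:
        raise ValueError("task must be 'classification' or 'regression'")
    return sum(per_sample) / len(per_sample)

print(auxiliary_loss([0.8, 0.3], [1, 0], "classification"))
print(auxiliary_loss([2.0, 4.0], [1.0, 1.0], "regression", norm="l1"))
```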
In step 4), the generated filled data are rescaled. Because min-max normalization was applied during preprocessing, the final filled result can be restored from the recorded maximum and minimum of each attribute.
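Restoring the scale is the inverse of min-max normalization, x = x_scaled · (max − min) + min. The sketch below shows that for a continuous attribute. The patent does not say how a generated one-hot block is turned back into a category, so `decode_one_hot` (pick the largest score) is purely an assumed rule.

```python
def minmax_inverse(scaled_column, lo, hi):
    """Undo min-max normalization using the recorded column minimum and maximum."""
    return [v * (hi - lo) + lo for v in scaled_column]

def decode_one_hot(scores, categories):
    """Pick the category with the largest generated score (an assumed decoding rule)."""
    best = max(range(len(scores)), key=lambda j: scores[j])
    return categories[best]

restored = minmax_inverse([0.0, 0.5, 1.0], lo=20.0, hi=40.0)
colour = decode_one_hot([0.2, 0.7], ["red", "blue"])
print(restored, colour)
```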
In step 5), testing and hyperparameter setting proceed as follows. During training, the network's loss comes from two parts: the loss of the generative adversarial network and the prediction loss of the auxiliary prediction network. The two losses are combined in a proportion λ to give the overall loss, and different values of λ affect the training of the model. In practice, the data set is split into a training set and a test set; models are trained on the training set with λ set to 0.1, 0.3, 0.5, 0.7 and 0.9 in turn and evaluated on the test set, and the λ that gives the smallest auxiliary prediction network loss on the test set is selected.
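The selection of λ amounts to a small grid search. The sketch below makes two assumptions that the patent leaves open: the two losses are combined as a convex combination λ·L_adv + (1−λ)·L_pred, and training is hidden behind a hypothetical callback `train_and_evaluate` that returns the auxiliary network's loss on the test set. The stand-in callback returns made-up numbers only so that the example runs.

```python
def combined_loss(adv_loss, pred_loss, lam):
    """Weighted sum of the two losses (this exact form is an assumption)."""
    return lam * adv_loss + (1.0 - lam) * pred_loss

def select_lambda(candidates, train_and_evaluate):
    """Return the candidate with the lowest test-set prediction loss, plus all scores."""
    scores = {lam: train_and_evaluate(lam) for lam in candidates}
    best = min(scores, key=scores.get)
    return best, scores

def fake_train_and_evaluate(lam):
    # Placeholder for real training: an arbitrary curve with its minimum at 0.7.
    return (lam - 0.7) ** 2

best, scores = select_lambda([0.1, 0.3, 0.5, 0.7, 0.9], fake_train_and_evaluate)
print(best, scores)
```

In a real run, `train_and_evaluate` would train the three networks on the training split with the given λ and report the auxiliary prediction loss on the held-out split.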
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. Traditional filling methods such as median or mean filling are simple but fill poorly, while KNN-based and EM-based methods often have high time complexity; on high-dimensional data sets the cost becomes very large, and the data may even be impossible to process. Generative adversarial networks are very effective at learning the distribution of high-dimensional data and can therefore solve the problems posed by high-dimensional data sets. In addition, samples without missing attributes and samples with missing attributes usually follow the same distribution; making the filled data approach the distribution of the data without missing attributes keeps the filled result from deviating from the data distribution and from harming the prediction model.
2. Traditional filling methods do not consider how the filled data affect the results of the prediction model built afterwards. They usually first fill the missing data and then build a prediction model on the filled data, so the prediction performance cannot be used to guide the filling. The present invention introduces an auxiliary prediction network, computes the loss between the value predicted from the data filled each time and the true label, and back-propagates it to guide the generator network's filling, so that the quality of the filled data on the prediction model is taken into account. Combined with the discriminator loss, which limits the difference between the filled data and the true data distribution, the method achieves both good filling and good prediction results. Furthermore, after training, the result is an end-to-end network: once data are input, the prediction of the auxiliary prediction network is obtained directly.
Brief description of the drawings
Fig. 1 is a flow chart of missing-data filling and prediction.
Fig. 2 is a data flow diagram of the generative adversarial network used for data filling and of the prediction network.
Detailed description of the embodiments
The present invention is further described below with reference to a specific embodiment.
As shown in Fig. 1, the attribute-missing data set completion and prediction method based on a generative adversarial network provided by this embodiment proceeds as follows:
1) Data preprocessing: different attributes have different data types and are processed accordingly. The data involved are mainly continuous values and discrete values. Continuous values are normalized directly with min-max normalization; discrete values are converted to one-hot encoding and then normalized with min-max normalization. Missing positions are uniformly filled with 0. In addition, the data set is divided into two parts: data with missing attributes and data without missing attributes.
2) Constructing the missing-position encoding vector: in data filling, the positions of a sample's missing attributes are themselves important information; when filling with a neural network, only these missing positions need to be filled. To construct the vector, every column of every sample is traversed; if the attribute is missing, the position is marked "1", otherwise "0". After this process, each sample has a corresponding missing-position encoding vector.
3) Constructing the missing-data filling network and the auxiliary prediction network: the invention proposes an integrated network that, based on a generative adversarial network combined with an auxiliary prediction network, fills data and performs prediction at the same time. The network makes the following improvements to the original generative adversarial network: (i) the sampled noise is removed from the generator input; (ii) the filled data are formed from the generated data and the missing-position encoding vector. In addition, introducing the auxiliary prediction network takes fuller account of the relationship between attributes and labels. Prediction is performed at the same time on the data with missing attributes, and the loss between the label predicted by the auxiliary prediction network and the true label is back-propagated with the BP algorithm to update the generator network, so that the generated filled data perform better when a prediction model is built on them. The loss function of the generative adversarial network and the loss function of the auxiliary prediction network are combined, with a hyperparameter controlling their relative weight; this determines whether training favors bringing the distribution of the generated filled data closer to the distribution of the complete data or making the prediction model more accurate. Fig. 2 shows the structure of the data filling network and auxiliary prediction network, the most important part of the invention, comprising a generator network, a discriminator network and an auxiliary prediction network. The structures of these three networks are described in detail below:
Generator network: the input is formed by concatenating the data with missing attributes and the corresponding missing-position encoding vector. Depending on the structure of the data, the hidden layers can be built from fully connected layers or deconvolution layers; in particular, when the input is image data, the filled data are generated with deconvolution layers. Assume here that the input data (denoted I) is a 100-dimensional vector; the corresponding missing-position encoding vector (denoted E) is then also 100-dimensional, and the concatenated input vector has dimension 200. The hidden layers are fully connected layers with the relu activation function. The output layer has 100 output units (denoted O) with the sigmoid activation function. The filled data are finally formed as I(1-E)+OE, with the products taken element-wise.
Discriminator network: the input has two parts. The first is the filled data obtained from the generator output; the second is sample data without missing attributes. The output is a value between 0 and 1, representing the probability the discriminator assigns to the received input coming from data without missing attributes. The network structure depends on the type of input data; when the input is image data, it can be built as a convolutional neural network. Assume here that the input is a 100-dimensional vector; the hidden layers can then be fully connected layers with the relu activation function. The output layer contains a single unit with the sigmoid activation function, which represents the probability.
Auxiliary prediction network: its input is identical to that of the discriminator network, and its output is the predicted label value for the input sample. When the prediction problem is a classification problem, cross entropy is used as the loss function; when it is a regression problem, the L2 norm or L1 norm is used as the loss function. The network structure is set up in the same way as the discriminator network. Assume here that the input is a 100-dimensional vector; the hidden layers can then be fully connected layers with the relu activation function. The output layer contains a single unit, whose activation function is set as described above.
4) Rescaling the filled data: because min-max normalization was applied during preprocessing, the final filled result can be restored from the recorded maximum and minimum of each attribute.
5) Testing and hyperparameter setting: during training, the network's loss comes from two parts: the loss of the generative adversarial network and the prediction loss of the auxiliary prediction network. The two losses are combined in a proportion λ to give the overall loss, and different values of λ affect the training of the model. In practice, the data set is split into a training set and a test set; models are trained on the training set with λ set to 0.1, 0.3, 0.5, 0.7 and 0.9 in turn and evaluated on the test set, and the λ that gives the smallest auxiliary prediction network loss on the test set is selected.
The embodiment described above is only a preferred embodiment of the invention and does not limit its scope of implementation; all changes made according to the shape and principle of the invention shall fall within the scope of protection of the invention.

Claims (6)

1. An attribute-missing data set completion and prediction method based on a generative adversarial network, characterized in that: first, the data set with missing attributes is preprocessed, mainly by min-max normalization and one-hot encoding of discrete variables; then, for samples with missing attributes, an encoding vector of the missing positions is built to express where values are missing; next, a filling network for missing data and an auxiliary prediction network are constructed, which complete the filling of the missing data and the label prediction simultaneously; after network training, the output of the generator network in the filling network gives the filled result, which is rescaled using the column maxima and minima recorded during min-max normalization; finally, the hyperparameter is set by repeatedly modifying it and observing the prediction loss on the validation set; the method comprises the following steps:
1) data preprocessing;
2) constructing the missing-position encoding vector;
3) constructing the missing-data filling network and the auxiliary prediction network;
4) rescaling the filled data;
5) testing and hyperparameter setting.
2. The attribute-missing data set completion and prediction method based on a generative adversarial network according to claim 1, characterized in that: in step 1), different types of data are preprocessed differently; the data involved are mainly continuous values and discrete values; continuous values are normalized directly with min-max normalization; discrete values are converted to one-hot encoding and then normalized with min-max normalization; missing positions are uniformly filled with 0; in addition, the data set is divided, according to whether attributes are missing, into two parts: data with missing attributes and data without missing attributes.
3. The attribute-missing data set completion and prediction method based on a generative adversarial network according to claim 1, characterized in that: in step 2), the missing-position encoding vector is constructed as follows: in data filling, the positions of a sample's missing attributes are themselves important information, and when filling with a neural network only these missing positions need to be filled; to construct the vector, every column of every sample is traversed, and if the attribute is missing the position is marked "1", otherwise "0"; after this process, each sample has a corresponding missing-position encoding vector.
4. The attribute-missing data set completion and prediction method based on a generative adversarial network according to claim 1, characterized in that: in step 3), the missing-data filling network and the auxiliary prediction network are constructed as follows: the network makes the following improvements to the original generative adversarial network: (i) the noise is removed from the generator input; (ii) the filled data are formed from the generated data and the missing-position encoding vector; in addition, introducing the auxiliary prediction network takes fuller account of the relationship between attributes and labels; prediction is performed at the same time on the data with missing attributes, and the loss between the label predicted by the auxiliary prediction network and the true label is back-propagated with the BP algorithm to update the generator network, so that the generated filled data perform better when a prediction model is built on them; the loss function of the generative adversarial network and the loss function of the auxiliary prediction network are combined, with a hyperparameter controlling their relative weight, which determines whether training favors bringing the distribution of the generated filled data closer to the distribution of the complete data or making the prediction model more accurate; the data filling network and auxiliary prediction network comprise a generator network, a discriminator network and an auxiliary prediction network, whose structures are described in detail below:
Generator network: the input is formed by concatenating the data with missing attributes and the corresponding missing-position encoding vector; depending on the structure of the data, the hidden layers can be built from fully connected layers or deconvolution layers, and in particular, when the input is image data, the filled data are generated with deconvolution operations; assume here that the input data, denoted I, is a 100-dimensional vector; the corresponding missing-position encoding vector, denoted E, is then also 100-dimensional, and the concatenated input vector has dimension 200; the hidden layers are fully connected layers with the relu activation function; the output layer has 100 output units, denoted O, with the sigmoid activation function; the filled data are finally formed as I(1-E)+OE, with the products taken element-wise;
Discriminator network: the input has two parts, the first being the filled data obtained from the generator output and the second being sample data without missing attributes; the output is a value between 0 and 1, representing the probability the discriminator assigns to the received input coming from data without missing attributes; the network structure depends on the type of input data, and when the input is image data it is built as a convolutional neural network; assume here that the input is a 100-dimensional vector; the hidden layers can then be fully connected layers with the relu activation function; the output layer contains a single unit with the sigmoid activation function, which represents the probability;
Auxiliary prediction network: its input is identical to that of the discriminator network, and its output is the predicted label value for the input sample; when the prediction problem is a classification problem, cross entropy is used as the loss function, and when it is a regression problem, the L2 norm or L1 norm is used as the loss function; the network structure is set up in the same way as the discriminator network; assume here that the input is a 100-dimensional vector; the hidden layers can then be fully connected layers with the relu activation function; the output layer contains a single unit, whose activation function is set as described above.
5. a kind of attribute missing data collection completion and prediction technique based on generation confrontation network according to claim 1, It is characterized by: scale reduction is carried out to the filling data of generation, since pretreatment stage has used minmax in step 4) Data normalization has been carried out, according to the maxima and minima of each attribute of record, can restore to obtain final filling As a result.
6. The attribute-missing data set completion and prediction method based on a generative adversarial network according to claim 1, characterized in that: in step 5), testing and hyperparameter setting proceed as follows: during training, the loss of the network comes from two parts: the adversarial loss of the generative adversarial network and the prediction loss of the auxiliary prediction network; the two parts are combined according to a ratio λ to obtain the overall loss; different values of λ affect the training of the model; in operation, the data set is split into a training set and a test set, the model is trained on the training set with λ at different scales, namely 0.1, 0.3, 0.5, 0.7 and 0.9, and is then tested on the test set, and the minimum loss of the auxiliary prediction network on the test set is used as the criterion for selecting the hyperparameter.
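The λ selection in claim 6 amounts to a small grid search. The sketch below assumes the two losses are mixed as a convex combination, λ · L_adv + (1 − λ) · L_pred; the claim only says they are combined "with ratio λ", so the exact form is an assumption. `train_and_evaluate` is a hypothetical callback standing in for a full training run that returns the auxiliary prediction loss on the test set, and the toy version used here is fabricated purely to make the example runnable.

```python
def combined_loss(adversarial_loss, prediction_loss, lam):
    """Assumed mixing rule: convex combination of the two loss terms."""
    return lam * adversarial_loss + (1.0 - lam) * prediction_loss


def select_lambda(candidates, train_and_evaluate):
    """Train once per candidate and keep the one with the lowest test prediction loss."""
    results = {lam: train_and_evaluate(lam) for lam in candidates}
    best = min(results, key=results.get)
    return best, results


def toy_train_and_evaluate(lam):
    # Stand-in for real training: pretends the test loss is lowest near 0.3.
    return (lam - 0.3) ** 2 + 0.05


candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
best_lambda, test_losses = select_lambda(candidates, toy_train_and_evaluate)
```

Selecting on the auxiliary prediction loss alone, rather than on the overall loss, keeps the comparison fair across candidates, since the overall loss itself changes scale with λ.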
CN201810722774.3A 2018-07-04 2018-07-04 Attribute-missing data set completion and prediction method based on generation of countermeasure network Active CN109165664B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810722774.3A CN109165664B (en) 2018-07-04 2018-07-04 Attribute-missing data set completion and prediction method based on generation of countermeasure network

Publications (2)

Publication Number Publication Date
CN109165664A (en) 2019-01-08
CN109165664B (en) 2020-09-22

Family

ID=64897277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810722774.3A Active CN109165664B (en) 2018-07-04 2018-07-04 Attribute-missing data set completion and prediction method based on generation of countermeasure network

Country Status (1)

Country Link
CN (1) CN109165664B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106952239A (en) * 2017-03-28 2017-07-14 厦门幻世网络科技有限公司 image generating method and device
CN107133934A (en) * 2017-05-18 2017-09-05 北京小米移动软件有限公司 Image completion method and device
AU2017101166A4 (en) * 2017-08-25 2017-11-02 Lai, Haodong MR A Method For Real-Time Image Style Transfer Based On Conditional Generative Adversarial Networks
KR20170137350A (en) * 2016-06-03 2017-12-13 (주)싸이언테크 Apparatus and method for studying pattern of moving objects using adversarial deep generative model
CN107945118A (en) * 2017-10-30 2018-04-20 南京邮电大学 A kind of facial image restorative procedure based on production confrontation network
CN107945140A (en) * 2017-12-20 2018-04-20 中国科学院深圳先进技术研究院 A kind of image repair method, device and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JINSUNG YOON ET AL.: "GAIN: Missing Data Imputation using Generative Adversarial Nets", 《PROCEEDINGS OF THE 35 TH INTERNATIONAL CONFERENCE ON MACHINE》 *

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522973A (en) * 2019-01-17 2019-03-26 云南大学 Medical big data classification method and system based on production confrontation network and semi-supervised learning
CN109978257A (en) * 2019-03-25 2019-07-05 上海赢科信息技术有限公司 The continuation of insurance prediction technique and system of vehicle insurance
CN110046706A (en) * 2019-04-18 2019-07-23 腾讯科技(深圳)有限公司 Model generating method, device and server
CN110046706B (en) * 2019-04-18 2022-12-20 腾讯科技(深圳)有限公司 Model generation method and device and server
CN110175168B (en) * 2019-05-28 2021-06-01 山东大学 Time sequence data filling method and system based on generation of countermeasure network
CN110175168A (en) * 2019-05-28 2019-08-27 山东大学 A kind of time series data complementing method and system based on generation confrontation network
CN110647519A (en) * 2019-08-30 2020-01-03 中国平安人寿保险股份有限公司 Method and device for predicting missing attribute value in test sample
CN110647519B (en) * 2019-08-30 2023-10-03 中国平安人寿保险股份有限公司 Method and device for predicting missing attribute value in test sample
CN110728297A (en) * 2019-09-04 2020-01-24 电子科技大学 Low-cost antagonistic network attack sample generation method based on GAN
CN110728297B (en) * 2019-09-04 2021-08-06 电子科技大学 Low-cost antagonistic network attack sample generation method based on GAN
CN111046027A (en) * 2019-11-25 2020-04-21 北京百度网讯科技有限公司 Missing value filling method and device for time series data
CN113010500A (en) * 2019-12-18 2021-06-22 中国电信股份有限公司 Processing method and processing system for DPI data
CN111037365A (en) * 2019-12-26 2020-04-21 大连理工大学 Cutter state monitoring data set enhancing method based on generative countermeasure network
CN111037365B (en) * 2019-12-26 2021-08-20 大连理工大学 Cutter state monitoring data set enhancing method based on generative countermeasure network
CN111177135B (en) * 2019-12-27 2020-11-10 清华大学 Landmark-based data filling method and device
CN111177135A (en) * 2019-12-27 2020-05-19 清华大学 Landmark-based data filling method and device
CN111259953B (en) * 2020-01-15 2023-10-20 云南电网有限责任公司电力科学研究院 Equipment defect time prediction method based on capacitive equipment defect data
CN111259953A (en) * 2020-01-15 2020-06-09 云南电网有限责任公司电力科学研究院 Equipment defect time prediction method based on capacitive equipment defect data
CN111259916A (en) * 2020-02-12 2020-06-09 东华大学 Low-rank projection feature extraction method under condition of label missing
CN111429605A (en) * 2020-04-10 2020-07-17 郑州大学 Missing value filling method based on generation type countermeasure network
CN111737463B (en) * 2020-06-04 2024-02-09 江苏名通信息科技有限公司 Big data missing value filling method, device and computer readable memory
CN111737463A (en) * 2020-06-04 2020-10-02 江苏名通信息科技有限公司 Big data missing value filling method, device and computer program
CN111738007A (en) * 2020-07-03 2020-10-02 北京邮电大学 Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
CN112036955A (en) * 2020-09-07 2020-12-04 贝壳技术有限公司 User identification method and device, computer readable storage medium and electronic equipment
CN112183723A (en) * 2020-09-17 2021-01-05 西北工业大学 Data processing method for clinical detection data missing problem
CN112465150A (en) * 2020-12-02 2021-03-09 南开大学 Real data enhancement-based multi-element time sequence data filling method
CN112712855A (en) * 2020-12-28 2021-04-27 华南理工大学 Joint training-based clustering method for gene microarray containing deletion value
CN112712855B (en) * 2020-12-28 2022-09-20 华南理工大学 Joint training-based clustering method for gene microarray containing deletion value
CN114826988A (en) * 2021-01-29 2022-07-29 中国电信股份有限公司 Method and device for anomaly detection and parameter filling of time sequence data
WO2022222026A1 (en) * 2021-04-19 2022-10-27 浙江大学 Medical diagnosis missing data completion method and completion apparatus, and electronic device and medium
CN113515896A (en) * 2021-08-06 2021-10-19 红云红河烟草(集团)有限责任公司 Data missing value filling method for real-time cigarette acquisition
CN113515896B (en) * 2021-08-06 2022-08-09 红云红河烟草(集团)有限责任公司 Data missing value filling method for real-time cigarette acquisition
CN114936530A (en) * 2022-06-22 2022-08-23 郑州大学 Multi-element air quality data missing value filling model based on TAM and construction method thereof
CN115145906B (en) * 2022-09-02 2023-01-03 之江实验室 Preprocessing and completion method for structured data
US11841839B1 (en) 2022-09-02 2023-12-12 Zhejiang Lab Preprocessing and imputing method for structural data
CN115145906A (en) * 2022-09-02 2022-10-04 之江实验室 Preprocessing and completion method for structured data
CN115883016B (en) * 2022-10-28 2024-02-02 南京航空航天大学 Flow data enhancement method and device based on federal generation countermeasure network
CN115883016A (en) * 2022-10-28 2023-03-31 南京航空航天大学 Method and device for enhancing flow data based on federal generation countermeasure network
CN115829162A (en) * 2023-01-29 2023-03-21 北京市农林科学院信息技术研究中心 Crop yield prediction method, device, electronic device and medium
CN117034142A (en) * 2023-10-07 2023-11-10 之江实验室 Unbalanced medical data missing value filling method and system
CN117034142B (en) * 2023-10-07 2024-02-09 之江实验室 Unbalanced medical data missing value filling method and system
CN117150231A (en) * 2023-10-27 2023-12-01 国网江苏省电力有限公司苏州供电分公司 Measurement data filling method and system based on correlation and generation countermeasure network
CN117150231B (en) * 2023-10-27 2024-01-26 国网江苏省电力有限公司苏州供电分公司 Measurement data filling method and system based on correlation and generation countermeasure network
CN117421548A (en) * 2023-12-18 2024-01-19 四川互慧软件有限公司 Method and system for treating loss of physiological index data based on convolutional neural network
CN117421548B (en) * 2023-12-18 2024-03-12 四川互慧软件有限公司 Method and system for treating loss of physiological index data based on convolutional neural network
CN117524318A (en) * 2024-01-05 2024-02-06 深圳新合睿恩生物医疗科技有限公司 New antigen heterogeneous data integration method and device, equipment and storage medium
CN117524318B (en) * 2024-01-05 2024-03-22 深圳新合睿恩生物医疗科技有限公司 New antigen heterogeneous data integration method and device, equipment and storage medium
CN117556267A (en) * 2024-01-12 2024-02-13 闪捷信息科技有限公司 Missing sample data filling method and device, storage medium and electronic equipment
CN117556267B (en) * 2024-01-12 2024-04-02 闪捷信息科技有限公司 Missing sample data filling method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN109165664B (en) 2020-09-22

Similar Documents

Publication Publication Date Title
CN109165664A (en) A kind of attribute missing data collection completion and prediction technique based on generation confrontation network
CN109492822B (en) Air pollutant concentration time-space domain correlation prediction method
CN109508360B (en) Geographical multivariate stream data space-time autocorrelation analysis method based on cellular automaton
Vladislavleva et al. Predicting the energy output of wind farms based on weather data: Important variables and their correlation
CN112684379A (en) Transformer fault diagnosis system and method based on digital twinning
AU2021240155A1 (en) Control Pulse Generation Method, Apparatus, System, Device And Storage Medium
CN111079977A (en) Heterogeneous federated learning mine electromagnetic radiation trend tracking method based on SVD algorithm
CN101093559A (en) Method for constructing expert system based on knowledge discovery
CN106503035A (en) A kind of data processing method of knowledge mapping and device
CN113486190B (en) Multi-mode knowledge representation method integrating entity image information and entity category information
CN113255895A (en) Graph neural network representation learning-based structure graph alignment method and multi-graph joint data mining method
CN110705178A (en) Tunnel/subway construction overall process surrounding rock deformation dynamic prediction method based on machine learning
CN103885867B (en) Online evaluation method of performance of analog circuit
CN114707754A (en) Intelligent ammeter fault prediction method and system based on BiLSTM-CNN model
CN109614896A (en) A method of the video content semantic understanding based on recursive convolution neural network
CN104732067A (en) Industrial process modeling forecasting method oriented at flow object
CN115049124A (en) Deep and long tunnel water inrush prediction method based on Bayesian network
Wang et al. Roof pressure prediction in coal mine based on grey neural network
Ballı et al. An application of artificial neural networks for prediction and comparison with statistical methods
CN107729942A (en) A kind of sorting technique of structured view missing data
CN111898746A (en) Deep learning method for association of interrupted flight path continuation
Yang et al. Using genetic algorithms for time series prediction
Zhang et al. RSVRs based on feature extraction: a novel method for prediction of construction projects’ costs
CN110210523A (en) A kind of model based on shape constraint diagram wears clothing image generating method and device
CN113887471A (en) Video time sequence positioning method based on feature decoupling and cross comparison

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant