CN110119394B - Improved data cleaning method for separate layer water injection - Google Patents

Info

Publication number
CN110119394B
CN110119394B CN201910415497.6A CN201910415497A CN110119394B CN 110119394 B CN110119394 B CN 110119394B CN 201910415497 A CN201910415497 A CN 201910415497A CN 110119394 B CN110119394 B CN 110119394B
Authority
CN
China
Prior art keywords
data
water injection
interpolation
representing
association
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910415497.6A
Other languages
Chinese (zh)
Other versions
CN110119394A (en)
Inventor
王海英
赵国堡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology
Priority to CN201910415497.6A
Publication of CN110119394A
Application granted
Publication of CN110119394B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

An improved data cleaning method for separate-layer water injection addresses two problems in the prior art: existing cleaning methods for layered oilfield data perform poorly on high-dimensional data in a big-data environment, and the existing interpolation strategy cannot assess how the data are missing. The method comprises: S1, determining t factors that affect the instantaneous flow from the original data; S2, testing the t factors determined in step S1 and verifying their significance; S3, classifying the original data and performing iterative analysis with an association equation to obtain a complete data set; and S4, verifying the interpolation accuracy of step S3. Applied to separate-layer water injection data, the interpolation completes the cleaning of records with missing values.

Description

Improved data cleaning method for separate layer water injection
Technical Field
The application relates to the field of data analysis preprocessing, and in particular to an improved data cleaning method for separate-layer water injection.
Background
With the continued development of oilfield informatization, a large volume of separate-layer water injection production data has accumulated during separate-layer water injection development. Taking the relationships among the important parameters in these data into account improves the accuracy of key-index prediction and, in turn, the production efficiency and safety of the oilfield. During separate-layer water injection, unstable logging sensors, communication equipment failures, well-site power outages, and similar causes can distort or lose data, which lowers prediction accuracy and affects the formulation of separate-layer injection allocation schemes. This poses a significant challenge for the analysis, diagnosis, and optimization of the water injection system.
The mainstream methods for cleaning layered oilfield data perform poorly on high-dimensional data in a big-data environment, and their slow computation makes them hard to use in practical research. Within data cleaning, interpolation of missing values is particularly important. The original interpolation strategy cannot assess how the data are missing, and in practice the interpolation method is often chosen subjectively from personal experience; an improved interpolation strategy therefore needs to be studied.
Disclosure of Invention
To overcome these technical shortcomings, the application provides a separate-layer water injection data cleaning method with an improved interpolation strategy. The branch steps of the original algorithm are improved so that the improved algorithm can process a mixed-type data set directly, without splitting it into categorical and continuous variables that are handled by different methods, and the running time is greatly reduced while interpolation accuracy is maintained.
The algorithm comprises the following steps:
S1. Determine t factors that affect the instantaneous flow from the original data;
S2. Test the t factors determined in step S1 and verify their significance;
S3. Classify the original data and perform iterative analysis with an association equation to obtain a complete data set;
S4. Verify the interpolation accuracy of step S3.
Further, step S1 includes:
S1-1. Construct the original data matrix, as shown in the following formula:
$$|G| = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1m} \\ g_{21} & g_{22} & \cdots & g_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ g_{n1} & g_{n2} & \cdots & g_{nm} \end{bmatrix}$$
where $|G|$ denotes the original data matrix and $g_{ij}$ ($i = 1, 2, \dots, n$; $j = 1, 2, \dots, m$) is the measured value of the $j$-th factor for the $i$-th water injection well;
S1-2. Apply mean normalization to the original data;
S1-3. Calculate the absolute difference matrix from the mean-normalized data;
S1-4. Solve for the association coefficient matrix;
S1-5. Evaluate the association degree, keep the factors whose association degree meets the requirement, and eliminate irrelevant variables.
Further, the association coefficient matrix in step S1-4 is composed of association coefficient elements obtained by the following formula:
$$r_{ij} = \frac{\Delta_{\min} + \rho\,\Delta_{\max}}{\Delta_{ij} + \rho\,\Delta_{\max}}$$
where $r_{ij}$ is an element of the association coefficient matrix, $\Delta_{\min}$ is the minimum of the absolute difference matrix, $\Delta_{\max}$ is the maximum of the absolute difference matrix, $\Delta_{ij}$ is the element in row $i$ and column $j$ of the absolute difference matrix, and $\rho$ is the resolution coefficient.
Further, step S2 includes:
S2-1. Construct the association equation and determine suitable coefficients by minimizing the residual sum of squares $Q$, where the association equation is:
$$\hat h_i = b_0 + b_1 g_{1i} + \cdots + b_p g_{pi}, \qquad Q = \sum_{i=1}^{n} \left(h_i - \hat h_i\right)^2$$
where $h_i$ is the observed value of the dependent variable, $\hat h_i$ is the value given by the association equation, $Q$ is the residual sum of squares, and $b_0, \dots, b_p$ are the coefficients of the equation;
S2-2. Verify the significance of the association equation to determine whether a functional relationship exists between the variables $g_{pi}$ and the variable $h_i$;
S2-3. Select the independent variables by stepwise analysis.
Further, step S2-2 decomposes the sum of squared deviations by the following formula:
$$E_{\text{total}} = \sum_{i=1}^{n}\left(h_i - \bar h\right)^2 = \sum_{i=1}^{n}\left(h_i - \hat h_i\right)^2 + \sum_{i=1}^{n}\left(\hat h_i - \bar h\right)^2 = E_{\text{res}} + E_{\text{assoc}}$$
where $E_{\text{total}}$ measures the fluctuation of $h_i$ about its mean $\bar h$ and is decomposed into $E_{\text{res}}$ and $E_{\text{assoc}}$; $E_{\text{res}}$ is the residual sum of squares, $E_{\text{assoc}}$ is the association sum of squares, $\bar h$ is the mean, $h_i$ is the observed value of the dependent variable, and $\hat h_i$ is the value given by the association equation.
The association analysis is performed with SPSS software.
Further, step S3 includes:
S3-1. Define the original data set $G = (G_1, G_2, \dots, G_k)$;
S3-2. Call a baseline interpolation to quickly impute an initial data set, and randomly divide the $k$ variables in data set $G$ into mutually exclusive groups of size $\alpha k$, where $0 < \alpha < 1$.
Each group is used in turn as the independent variables for a multiple association analysis with the association equation;
the loop performs the $1/\alpha$ multiple analyses in sequence, which completes one iteration;
if the accuracy of the interpolated data reaches the tolerance $\varepsilon$, the procedure stops; otherwise it repeats until convergence.
Further, step S4 includes:
S4-1. Under each missing-data assumption, the artificially induced missingness of the values of $G_j$ is recorded as a 0-1 vector $(1_{1,j}, \dots, 1_{n,j})$, where $1_{i,j} = 1$ when $G_{i,j}$ is artificially removed and $1_{i,j} = 0$ otherwise; $n_j = \sum_{i=1}^{n} 1_{i,j}$ is the number of artificially induced missing values of $G_j$; $\theta$ and $\eta$ are the sets of continuous and categorical variables, respectively, that have artificially induced missing values;
S4-2. Determine the interpolation error by the following formula:
$$\varepsilon(\tau) = \underbrace{\frac{1}{\#\theta}\sum_{j\in\theta}\sqrt{\frac{\sum_{i=1}^{n} 1_{i,j}\left(G^{\tau}_{i,j} - G_{i,j}\right)^2 / n_j}{\sum_{i=1}^{n} 1_{i,j}\left(G_{i,j} - \bar G_j\right)^2 / n_j}}}_{\varepsilon(\theta)} + \underbrace{\frac{1}{\#\eta}\sum_{j\in\eta}\frac{\sum_{i=1}^{n} 1_{i,j}\,\mathbf{1}\{G^{\tau}_{i,j} \neq G_{i,j}\}}{n_j}}_{\varepsilon(\eta)}$$
where $G^{\tau}_j$ is the $n$-dimensional vector obtained by interpolating $G_j$ with the improved interpolation strategy $\tau$, $G^{\tau}_{i,j}$ is an element interpolated by the improved strategy, $G_{i,j}$ is the original, unprocessed element, $\bar G_j$ is the mean of the original $n$-dimensional vector, $n_j$ is the number of artificially induced missing values of $G_j$, $1_{i,j}$ marks an artificially induced missing value, $\varepsilon(\tau)$ is the error of the improved interpolation strategy, $\varepsilon(\theta)$ is the error for continuous data, and $\varepsilon(\eta)$ is the error for categorical data.
Compared with the prior art, the application has the following beneficial effects. Association analysis is used to determine the important factors that affect the instantaneous flow of water injection; the selected factor variables of the separate-layer water injection data are grouped, and each group is used in turn as the dependent variables in a multiple analysis, which raises the computation speed while maintaining interpolation accuracy. The improved algorithm can process a mixed-type data set without splitting it into categorical and continuous variables handled by different methods, and the running time is greatly reduced while interpolation accuracy is maintained.
The method can be configured and adjusted to the user's own needs, to meet either a time requirement or an accuracy requirement, which improves the flexibility of the algorithm. Applied to separate-layer water injection data, the improved algorithm interpolates the missing values and thereby completes the cleaning of that data.
By analyzing the association degree of the important factors and establishing an association equation, the method improves the original interpolation strategy, raises interpolation efficiency while maintaining accuracy, and performs stably under different missing-data mechanisms, so it meets expectations overall.
Drawings
FIG. 1 is a general flow chart of the present application;
FIG. 2 is a statistical chart of the computed association degrees of the factors that affect the instantaneous flow; the horizontal axis lists the nine factors (temperature, water flow velocity, porosity, gauge pressure, layer depth, layer thickness, tubing pressure, casing pressure, and skin factor), and the vertical axis shows the association degree of each factor.
Detailed Description
The above and further technical features and advantages of the present application are described in more detail below with reference to the accompanying drawings.
S1, determining t factors influencing the instantaneous flow according to original data;
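S1-1. Construct the original data matrix:
The original data matrix formed from the m factors that affect the instantaneous flow of water injection is shown in formula (1):
$$|G| = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1m} \\ g_{21} & g_{22} & \cdots & g_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ g_{n1} & g_{n2} & \cdots & g_{nm} \end{bmatrix} \tag{1}$$
where $|G|$ denotes the original data matrix and $g_{ij}$ ($i = 1, 2, \dots, 18$; $j = 1, 2, \dots, 9$) is the measured value of the $j$-th factor for the $i$-th water injection well;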
S1-2. Eliminate the influence of dimension by mean normalization of the original data:
To remove the influence of units and convert the original data into relative values near 1, the original data are processed by mean normalization according to formula (2):
$$g'_{ij} = \frac{g_{ij}}{\bar g_j}, \qquad \bar g_j = \frac{1}{n}\sum_{i=1}^{n} g_{ij} \tag{2}$$
where $i = 1, 2, \dots, n$; $j = 1, 2, \dots, m+1$; $g_{ij}$ is the measured data and $g'_{ij}$ is the data after mean normalization;
S1-3. Calculate the absolute difference matrix:
The elements of the absolute difference matrix are given by formula (3):
$$\Delta_{ij} = \left| g'_{ij} - g'_{i0} \right| \tag{3}$$
where $i = 1, 2, \dots, n$; $j = 1, 2, \dots, m+1$; $\Delta_{ij}$ is the element in row $i$ and column $j$ of the absolute difference matrix, and $g'_{ij}$ is the data after mean normalization.
S1-4. Solve for the association coefficient matrix according to formula (4):
$$r_{ij} = \frac{\Delta_{\min} + \rho\,\Delta_{\max}}{\Delta_{ij} + \rho\,\Delta_{\max}} \tag{4}$$
where $r_{ij}$ is an element of the association coefficient matrix, $\Delta_{\min}$ is the minimum of the absolute difference matrix, $\Delta_{\max}$ is the maximum of the absolute difference matrix, and $\Delta_{ij}$ is the element in row $i$ and column $j$ of the absolute difference matrix. $\rho$ is the resolution coefficient; its size controls how strongly the maximum difference influences the data conversion. In general $\rho$ takes a value between 0 and 1, and $\rho = 0$ is selected according to the actual engineering background;
S1-5. Calculate the association degree and select the factors whose association degree exceeds 0.7, namely porosity, gauge pressure, tubing pressure, casing pressure, and skin factor, as predictors; the other, irrelevant variables are excluded.
The mean of the association coefficient sequence between the parent factor and a child factor is called the association degree. To compare the association of the parent factor with each child factor, the association degree is calculated according to formula (5):
$$\chi_{0j} = \frac{1}{n}\sum_{i=1}^{n} r_{ij} \tag{5}$$
where $\chi_{0j}$ is the association degree of factor $g_j$ with respect to the parent factor $g_0$; the closer $\chi_{0j}$ is to 1, the stronger the association between the two; and $r_{ij}$ is an element of the association coefficient matrix.
S2, detecting t factors determined in the step S1, and performing significance verification;
S2-1. Parameter estimation:
The t important factors obtained from the association analysis are tested: the coefficients that minimize the residual sum of squares $Q$ are selected according to formula (6), and the corresponding association equation is solved:
$$\hat h_i = b_0 + b_1 g_{1i} + \cdots + b_p g_{pi}, \qquad Q = \sum_{i=1}^{n} \left(h_i - \hat h_i\right)^2 \tag{6}$$
where $h_i$ is the observed value of the dependent variable, $\hat h_i$ is the value given by the association equation, $Q$ is the residual sum of squares, and $b_0, \dots, b_p$ are the coefficients of the equation;
S2-2. Significance verification of the association equation:
In the significance test, the significance value indicates whether the test result is significant. Statistically, a significance value below 0.05 is generally taken to mean that the coefficient is significant: the regression coefficient differs significantly from 0, so the independent variable can effectively predict the variation of the dependent variable. The chance of this conclusion being wrong is 5%, i.e., the conclusion holds at the 95% confidence level.
To determine whether a functional relationship exists between the variables $g_{pi}$ and the variable $h_i$, the sum of squared deviations is first decomposed according to formula (7):
$$E_{\text{total}} = \sum_{i=1}^{n}\left(h_i - \bar h\right)^2 = \sum_{i=1}^{n}\left(h_i - \hat h_i\right)^2 + \sum_{i=1}^{n}\left(\hat h_i - \bar h\right)^2 = E_{\text{res}} + E_{\text{assoc}} \tag{7}$$
where $E_{\text{total}}$ measures the fluctuation of $h_i$ about its mean $\bar h$ and is decomposed into $E_{\text{res}}$ and $E_{\text{assoc}}$; $E_{\text{res}}$ is the residual sum of squares, $E_{\text{assoc}}$ is the association sum of squares, $\bar h$ is the mean, $h_i$ is the observed value of the dependent variable, and $\hat h_i$ is the value given by the association equation;
In SPSS, the factors affecting the instantaneous flow are set as variables and, under "Options", the confidence level is set to 95%; the significance analysis data produced by SPSS are shown in the table.
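The embodiment runs this analysis in SPSS. As a rough illustration of what S2-1 and S2-2 compute, the following stdlib-only Python sketch fits the coefficients by least squares and decomposes the sum of squares. All names (`solve_linear`, `fit_association_equation`, `sum_of_squares`) and the sample data are hypothetical, and the F statistic is the usual regression F ratio, assumed here as the significance measure; comparing it with a critical value or converting it to a p-value is left to a statistics package.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_association_equation(rows, h):
    """Least-squares fit of h_i = b0 + b1*g_1i + ... + bp*g_pi.

    rows: one tuple of predictor values per observation.
    h:    observed values of the dependent variable.
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations (X^T X) b = X^T h minimize the residual sum of squares Q.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xth = [sum(d[i] * y for d, y in zip(design, h)) for i in range(k)]
    coef = solve_linear(xtx, xth)
    fitted = [sum(c * v for c, v in zip(coef, d)) for d in design]
    return coef, fitted

def sum_of_squares(h, fitted, p):
    """Return (E_total, E_res, E_assoc, F) for a fit with p predictors."""
    h_bar = sum(h) / len(h)
    e_total = sum((y - h_bar) ** 2 for y in h)
    e_res = sum((y - f) ** 2 for y, f in zip(h, fitted))
    e_assoc = sum((f - h_bar) ** 2 for f in fitted)
    # Undefined when the fit is exact (e_res == 0).
    f_stat = (e_assoc / p) / (e_res / (len(h) - p - 1))
    return e_total, e_res, e_assoc, f_stat

rows = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5), (6, 8)]
noise = [0.1, -0.2, 0.1, 0.2, -0.1, -0.1]
h = [1 + 2 * a + 3 * b + e for (a, b), e in zip(rows, noise)]
coef, fitted = fit_association_equation(rows, h)
e_total, e_res, e_assoc, f_stat = sum_of_squares(h, fitted, p=2)
```

A large F ratio relative to the critical value at the chosen confidence level supports the conclusion that the predictors explain the dependent variable.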
S2-3. Selection of independent variables:
Independent variables are selected by stepwise analysis. The equation initially contains only a constant term, and independent variables are introduced one at a time in descending order of their contribution to the dependent variable. Each time a variable is introduced, the variables already in the equation are re-tested, and those that meet the removal criterion are eliminated one by one;
Adding the independent variables with larger influence to the model, rather than insignificant ones, yields a good model.
S3, classifying the original data, and performing iterative analysis through an association equation to obtain a complete data set;
S3-1. Classify the original data:
The original data set $G = (G_1, G_2, \dots, G_k)$ is defined as an $n \times k$ matrix;
S3-2. Obtain a complete data set by interpolation:
A baseline interpolation is first called to quickly impute an initial data set, and the $k$ variables in data set $G$ are randomly divided into mutually exclusive groups of size $\alpha k$, where $0 < \alpha < 1$.
Each group is used in turn as the independent variables for a multiple association analysis with the association equation;
the loop performs the $1/\alpha$ multiple analyses in sequence, which completes one iteration;
if the accuracy of the interpolated data reaches the tolerance $\varepsilon$, the procedure stops; otherwise it repeats until convergence. The tolerance $\varepsilon$ is set according to actual requirements and is 0.05 in this embodiment.
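The grouped iteration of S3-2 can be sketched as follows. This is a simplified illustration under several assumptions that the text does not confirm: the baseline interpolation is taken to be a column-mean fill, every variable in a group is predicted by a linear least-squares fit on the variables outside that group, a small ridge term is added for numerical stability, and convergence is measured as the largest change of any imputed value. The names `grouped_impute`, `_solve`, and `_fit_predict` are hypothetical.

```python
import random

def _solve(a, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def _fit_predict(x_train, y_train, x_new, ridge=1e-8):
    """Fit y = b0 + b.x by least squares, then predict for x_new."""
    design = [[1.0] + row for row in x_train]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, y_train)) for i in range(k)]
    coef = _solve(xtx, xty)
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in x_new]

def grouped_impute(data, alpha=0.5, eps=0.05, max_iter=20, seed=0):
    """Impute None entries of data (list of rows); returns a filled copy."""
    n, k = len(data), len(data[0])
    missing = [[value is None for value in row] for row in data]
    filled = [row[:] for row in data]
    for j in range(k):  # baseline fill: mean of the observed values
        observed = [data[i][j] for i in range(n) if not missing[i][j]]
        mu = sum(observed) / len(observed)
        for i in range(n):
            if missing[i][j]:
                filled[i][j] = mu
    rng = random.Random(seed)
    size = max(1, round(alpha * k))
    for _ in range(max_iter):
        order = list(range(k))
        rng.shuffle(order)  # random, mutually exclusive groups of about alpha*k
        groups = [order[s:s + size] for s in range(0, k, size)]
        change = 0.0
        for group in groups:
            rest = [j for j in range(k) if j not in group]
            for j in group:
                target = [i for i in range(n) if missing[i][j]]
                if not rest or not target:
                    continue
                train = [i for i in range(n) if not missing[i][j]]
                preds = _fit_predict([[filled[i][c] for c in rest] for i in train],
                                     [filled[i][j] for i in train],
                                     [[filled[i][c] for c in rest] for i in target])
                for i, p in zip(target, preds):
                    change = max(change, abs(p - filled[i][j]))
                    filled[i][j] = p
        if change < eps:  # tolerance reached
            break
    return filled

# Toy data: column 1 equals 2*column 0 + 1; column 2 is unrelated.
data = [[x, 2 * x + 1, z] for x, z in zip(range(1, 9), [5, 3, 8, 1, 9, 2, 7, 4])]
data[5][0] = None  # the hidden value is 6
result = grouped_impute(data, alpha=0.34, eps=0.05)
```

A smaller α yields more, smaller groups per iteration; a larger α yields fewer passes per iteration, which is the speed and accuracy trade-off that the text attributes to the choice of α.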
And S4, verifying the interpolation precision of the step S3.
S4-1. Define the induced-missingness variables:
Under each missing-data assumption, the artificially induced missingness of the values of $G_j$ is recorded as a 0-1 vector $(1_{1,j}, \dots, 1_{n,j})$, where $1_{i,j} = 1$ when $G_{i,j}$ is artificially removed and $1_{i,j} = 0$ otherwise; $n_j = \sum_{i=1}^{n} 1_{i,j}$ is the number of artificially induced missing values of $G_j$; $\theta$ and $\eta$ are the sets of continuous and categorical variables, respectively, that have artificially induced missing values;
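The 0-1 indicator vector of S4-1 can be generated as in the sketch below for the completely-random case. The function name `induce_mcar` and the `rate` and `seed` parameters are hypothetical; the other missing-data mechanisms would need a selection rule that depends on the data and are not shown.

```python
import random

def induce_mcar(column, rate, seed=0):
    """Hide a random share of a column and return (masked, indicator).

    indicator[i] is 1 where the value was artificially removed, else 0.
    """
    rng = random.Random(seed)
    n = len(column)
    n_missing = max(1, round(rate * n))
    hidden = set(rng.sample(range(n), n_missing))
    indicator = [1 if i in hidden else 0 for i in range(n)]
    masked = [None if flag else value for value, flag in zip(column, indicator)]
    return masked, indicator

masked, indicator = induce_mcar(list(range(20)), rate=0.25)
n_j = sum(indicator)  # number of artificially induced missing values
```

Because the original values are kept, the interpolated values can later be compared with them at exactly the positions flagged by the indicator.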
S4-2. Design of the accuracy test experiment:
Model performance is evaluated with the standardized root mean square error for continuous variables and with the misclassification error for categorical variables. The error $\varepsilon(\tau)$ of the improved interpolation strategy is calculated as shown in formula (8):
$$\varepsilon(\tau) = \underbrace{\frac{1}{\#\theta}\sum_{j\in\theta}\sqrt{\frac{\sum_{i=1}^{n} 1_{i,j}\left(G^{\tau}_{i,j} - G_{i,j}\right)^2 / n_j}{\sum_{i=1}^{n} 1_{i,j}\left(G_{i,j} - \bar G_j\right)^2 / n_j}}}_{\varepsilon(\theta)} + \underbrace{\frac{1}{\#\eta}\sum_{j\in\eta}\frac{\sum_{i=1}^{n} 1_{i,j}\,\mathbf{1}\{G^{\tau}_{i,j} \neq G_{i,j}\}}{n_j}}_{\varepsilon(\eta)} \tag{8}$$
where $G^{\tau}_j$ is the $n$-dimensional vector obtained by interpolating $G_j$ with the improved interpolation strategy $\tau$, $G^{\tau}_{i,j}$ is an element interpolated by the improved strategy, $G_{i,j}$ is the original, unprocessed element, $\bar G_j$ is the mean of the original $n$-dimensional vector, $n_j$ is the number of artificially induced missing values of $G_j$, $1_{i,j}$ marks an artificially induced missing value, $\varepsilon(\tau)$ is the error of the improved interpolation strategy, $\varepsilon(\theta)$ is the error for continuous data, and $\varepsilon(\eta)$ is the error for categorical data;
For the unimproved interpolation strategy $\nu$, the relative interpolation error of strategy $\tau$ with respect to strategy $\nu$ is given by formula (9):
$$X_R(\tau) = 100 \times \frac{\varepsilon(\tau)}{\varepsilon(\nu)} \tag{9}$$
When $X_R(\tau)$ is less than 100, strategy $\tau$ performs better than strategy $\nu$;
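The error measures of S4-2 can be sketched in a few lines of Python. The sketch assumes a normalized root mean square error over the artificially removed entries for continuous variables, the share of wrongly imputed entries for categorical variables, and a relative error expressed as a percentage of the baseline strategy's error; the function names and toy values are hypothetical.

```python
from math import sqrt

def continuous_error(original, imputed, indicator):
    """Normalized RMSE over the artificially removed entries of one column."""
    idx = [i for i, flag in enumerate(indicator) if flag]
    col_mean = sum(original) / len(original)  # mean of the original vector
    num = sum((imputed[i] - original[i]) ** 2 for i in idx) / len(idx)
    den = sum((original[i] - col_mean) ** 2 for i in idx) / len(idx)
    return sqrt(num) / sqrt(den)  # undefined if den == 0

def categorical_error(original, imputed, indicator):
    """Share of artificially removed entries that were imputed wrongly."""
    idx = [i for i, flag in enumerate(indicator) if flag]
    return sum(1 for i in idx if imputed[i] != original[i]) / len(idx)

def strategy_error(continuous_cols, categorical_cols):
    """Sum of the mean continuous and mean categorical errors.

    Each list holds (original, imputed, indicator) triples, one per column.
    """
    total = 0.0
    if continuous_cols:
        total += sum(continuous_error(*c) for c in continuous_cols) / len(continuous_cols)
    if categorical_cols:
        total += sum(categorical_error(*c) for c in categorical_cols) / len(categorical_cols)
    return total

def relative_error(err_tau, err_nu):
    """Error of strategy tau as a percentage of the error of strategy nu."""
    return 100.0 * err_tau / err_nu

cont = ([1, 2, 3, 4], [1, 2.5, 3, 3], [0, 1, 0, 1])
cat = (["a", "b", "a", "c"], ["a", "b", "b", "c"], [0, 1, 1, 1])
err = strategy_error([cont], [cat])
```

A relative error below 100 then indicates that the strategy in the numerator performed better than the baseline on the same induced-missing entries.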
For the interpolation accuracy test, the improved and unimproved algorithms are each run in 10 repeated experiments and the average interpolation accuracy is calculated. The number of variables preselected at each node is denoted $M_j$ and is determined from the number of all variables; the number of nodes in the algorithm, $M_t$, is set to 1000, i.e., each algorithm contains 1000 nodes. The parameter settings for the interpolation accuracy test are listed in Table 1.
TABLE 1
Repeated comparative experiments on the value of $\alpha$ in the association equation show that $\alpha = 0.06$ and $\alpha = 0.20$ give the improved algorithm good calculation accuracy and calculation speed. According to the range of the data correlation, the data are divided into three groups covering the [0,50], [50,75], and [75,100] percentiles. The algorithm achieves its highest interpolation accuracy under the missing-completely-at-random mechanism. The experimental results for the three groups covering the [0,50], [50,75], and [75,100] percentiles are shown in Tables 2, 3, and 4.
TABLE 2
TABLE 3
TABLE 4
In Tables 2, 3, and 4, TRQS denotes data missing completely at random: the missingness is entirely random and depends on neither the observed values nor the missing values. RQS denotes data missing at random: the missingness depends on the observed values but not on the missing values. NRQS denotes data missing not at random: the missingness depends on both the observed values and the missing values. RO denotes the original interpolation strategy and $RG_\alpha$ the improved interpolation strategy, compared for $\alpha = 0.06$ and $\alpha = 0.20$.
In this embodiment, repeated comparative experiments on the value of $\alpha$ in the association equation show that $\alpha = 0.06$ and $\alpha = 0.20$ give the improved algorithm good calculation accuracy and calculation speed. According to the range of the data correlation, the data are divided into three groups covering the [0,50], [50,75], and [75,100] percentiles. The algorithm achieves its highest interpolation accuracy when the data are missing completely at random.
By analyzing the association degree of the important factors and establishing an association equation, the method improves the original interpolation strategy, raises interpolation efficiency while maintaining accuracy, and performs stably under different missing-data mechanisms, so it meets expectations overall.
The foregoing description of the preferred embodiment of the application is merely illustrative of the application and is not intended to be limiting. It will be appreciated by persons skilled in the art that many variations, modifications, and even equivalents may be made thereto without departing from the spirit and scope of the application as defined in the appended claims.

Claims (3)

1. An improved data cleaning method for separate-layer water injection, characterized in that the method comprises the following steps:
S1. determining t factors that affect the instantaneous flow from the original data, comprising:
S1-1. constructing an original data matrix, as shown in the following formula:
$$|G| = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{19} \\ g_{21} & g_{22} & \cdots & g_{29} \\ \vdots & \vdots & \ddots & \vdots \\ g_{n1} & g_{n2} & \cdots & g_{n9} \end{bmatrix}$$
wherein $|G|$ denotes the original data matrix and $g_{ij}$ is the measured value of the $j$-th factor for the $i$-th water injection well, with $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, 9$; $g_{i1}$ is the temperature of the $i$-th water injection well, $g_{i2}$ its water flow velocity, $g_{i3}$ its porosity, $g_{i4}$ its gauge pressure, $g_{i5}$ its layer depth, $g_{i6}$ its layer thickness, $g_{i7}$ its tubing pressure, $g_{i8}$ its casing pressure, and $g_{i9}$ its skin factor;
S1-2. applying mean normalization to the original data;
S1-3. calculating an absolute difference matrix from the mean-normalized data;
S1-4. solving for an association coefficient matrix, wherein the association coefficient matrix is composed of association coefficient elements obtained by the following formula:
$$r_{ij} = \frac{\Delta_{\min} + \rho\,\Delta_{\max}}{\Delta_{ij} + \rho\,\Delta_{\max}}$$
wherein $r_{ij}$ is an element of the association coefficient matrix, $\Delta_{\min}$ is the minimum of the absolute difference matrix, $\Delta_{\max}$ is the maximum of the absolute difference matrix, $\Delta_{ij}$ is the element in row $i$ and column $j$ of the absolute difference matrix, and $\rho$ is the resolution coefficient;
S1-5. calculating the association degree between each factor and the instantaneous flow of water injection, keeping the factors whose association degree meets the requirement, and eliminating irrelevant variables;
S2. testing the t factors determined in step S1 and verifying their significance, comprising:
S2-1. constructing an association equation and determining suitable coefficients by minimizing the residual sum of squares $Q$, wherein the association equation is:
$$\hat h_i = b_0 + b_1 g_{1i} + \cdots + b_p g_{pi}, \qquad Q = \sum_{i=1}^{n} \left(h_i - \hat h_i\right)^2$$
wherein $h_i$ is the observed value of the dependent variable, $\hat h_i$ is the value given by the association equation, $Q$ is the residual sum of squares, and $b_0, \dots, b_p$ are the coefficients of the equation;
S2-2. verifying the significance of the association equation by decomposing the sum of squared deviations with the following formula, to determine whether a functional relationship exists between the variables $g_{pi}$ and the variable $h_i$:
$$E_{\text{total}} = \sum_{i=1}^{n}\left(h_i - \bar h\right)^2 = \sum_{i=1}^{n}\left(h_i - \hat h_i\right)^2 + \sum_{i=1}^{n}\left(\hat h_i - \bar h\right)^2 = E_{\text{res}} + E_{\text{assoc}}$$
wherein $E_{\text{total}}$ measures the fluctuation of $h_i$ about its mean $\bar h$ and is decomposed into $E_{\text{res}}$ and $E_{\text{assoc}}$; $E_{\text{res}}$ is the residual sum of squares, $E_{\text{assoc}}$ is the association sum of squares, $\bar h$ is the mean, $h_i$ is the observed value of the dependent variable, and $\hat h_i$ is the value given by the association equation;
in SPSS, the factors affecting the instantaneous flow are taken as variables, a confidence level is set, and the significance analysis is carried out with the SPSS software;
S3. classifying the original data and performing iterative analysis with the association equation to obtain a complete data set;
S4. verifying the interpolation accuracy of step S3.
2. The improved data cleaning method for separate-layer water injection as claimed in claim 1, wherein:
step S3 comprises the following steps:
S3-1. defining the original data set $G = (G_1, G_2, \dots, G_k)$;
S3-2. obtaining a complete data set by interpolation:
first calling a baseline interpolation to quickly impute an initial data set, and randomly dividing the $k$ variables in data set $G$ into mutually exclusive groups of size $\alpha k$, wherein $0 < \alpha < 1$;
using each group in turn as the independent variables for a multiple association analysis with the association equation;
performing the $1/\alpha$ multiple analyses in sequence in a loop, which completes one iteration;
stopping if the accuracy of the interpolated data reaches the tolerance $\varepsilon$, and otherwise repeating until convergence.
3. The improved data cleaning method for separate-layer water injection as claimed in claim 1, wherein step S4 comprises:
S4-1. under each missing-data assumption, recording the artificially induced missingness of the values of $G_j$ as a 0-1 vector $(1_{1,j}, \dots, 1_{n,j})$, wherein $1_{i,j} = 1$ when $G_{i,j}$ is artificially removed and $1_{i,j} = 0$ otherwise; $n_j = \sum_{i=1}^{n} 1_{i,j}$ is the number of artificially induced missing values of $G_j$; $\theta$ and $\eta$ are the sets of continuous and categorical variables, respectively, that have artificially induced missing values;
S4-2. determining the interpolation error by the following formula:
$$\varepsilon(\tau) = \underbrace{\frac{1}{\#\theta}\sum_{j\in\theta}\sqrt{\frac{\sum_{i=1}^{n} 1_{i,j}\left(G^{\tau}_{i,j} - G_{i,j}\right)^2 / n_j}{\sum_{i=1}^{n} 1_{i,j}\left(G_{i,j} - \bar G_j\right)^2 / n_j}}}_{\varepsilon(\theta)} + \underbrace{\frac{1}{\#\eta}\sum_{j\in\eta}\frac{\sum_{i=1}^{n} 1_{i,j}\,\mathbf{1}\{G^{\tau}_{i,j} \neq G_{i,j}\}}{n_j}}_{\varepsilon(\eta)}$$
wherein $G^{\tau}_j$ is the $n$-dimensional vector obtained by interpolating $G_j$ with the improved interpolation strategy $\tau$, $G^{\tau}_{i,j}$ is an element interpolated by the improved strategy, $G_{i,j}$ is the original, unprocessed element, $\bar G_j$ is the mean of the original $n$-dimensional vector, $n_j$ is the number of artificially induced missing values of $G_j$, $1_{i,j}$ marks an artificially induced missing value, $\varepsilon(\tau)$ is the error of the improved interpolation strategy, $\varepsilon(\theta)$ is the error for continuous data, and $\varepsilon(\eta)$ is the error for categorical data.
CN201910415497.6A 2019-05-18 2019-05-18 Improved data cleaning method for separate layer water injection Active CN110119394B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910415497.6A CN110119394B (en) 2019-05-18 2019-05-18 Improved data cleaning method for separate layer water injection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910415497.6A CN110119394B (en) 2019-05-18 2019-05-18 Improved data cleaning method for separate layer water injection

Publications (2)

Publication Number Publication Date
CN110119394A CN110119394A (en) 2019-08-13
CN110119394B true CN110119394B (en) 2023-10-27

Family

ID=67522900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910415497.6A Active CN110119394B (en) 2019-05-18 2019-05-18 Improved data cleaning method for separate layer water injection

Country Status (1)

Country Link
CN (1) CN110119394B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532518B (en) * 2019-08-30 2023-04-25 中国电力工程顾问集团西北电力设计院有限公司 Air cooling contrast observation data interpolation method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100300682A1 (en) * 2009-05-27 2010-12-02 Ganesh Thakur Computer-implemented systems and methods for screening and predicting the performance of enhanced oil recovery and improved oil recovery methods
CN104536044A (en) * 2015-01-16 2015-04-22 中国石油大学(北京) Interpolation and denoising method and system for seismic data
CN105117988A (en) * 2015-10-14 2015-12-02 国家电网公司 Method for interpolating missing data in electric power system
CN109472343A (en) * 2018-10-16 2019-03-15 上海电机学院 A kind of improvement sample data missing values based on GKNN fill up algorithm

Also Published As

Publication number Publication date
CN110119394A (en) 2019-08-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant