CN110442911B - High-dimensional complex system uncertainty analysis method based on statistical machine learning - Google Patents

Publication number
CN110442911B
Authority
CN
China
Prior art keywords
dimensional
random variable
sample matrix
matrix
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910594968.4A
Other languages
Chinese (zh)
Other versions
CN110442911A (en)
Inventor
付学谦
贾倩倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN201910594968.4A
Publication of CN110442911A
Application granted
Publication of CN110442911B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Algebra (AREA)
  • Operations Research (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a high-dimensional complex system uncertainty analysis method based on statistical machine learning. The method comprises the following steps: selecting the uncertainty factors affecting a high-dimensional complex system and acquiring a high-dimensional random variable input sample matrix; converting the high-dimensional random variable input sample matrix into a low-dimensional random variable sample matrix; evaluating the samples of the high-dimensional random variable input sample matrix one by one to obtain an output response matrix; accurately fitting a random response surface proxy model to obtain a random response surface model that closely approximates the studied high-dimensional complex system; obtaining the mean and variance of the output response of the random response surface model by formula derivation; and analyzing the uncertainty factors according to the mean and variance to obtain an uncertainty analysis result. The method yields accurate results, reduces the amount of computation and improves computational efficiency while preserving accuracy, avoids the curse of dimensionality, and is highly flexible.

Description

High-dimensional complex system uncertainty analysis method based on statistical machine learning
Technical Field
The invention relates to the technical field of dimension reduction for high-dimensional data and statistical machine learning, and in particular to a high-dimensional complex system uncertainty analysis method based on statistical machine learning.
Background
Research in many fields now faces important practical problems that urgently require accurate modeling, and these problems often involve high-dimensional complex systems, for example building a large-span bridge, modeling a land hydrological system, performing remote sensing inversion, optimizing an aircraft design, or analyzing an integrated energy system. However, almost all practical systems exhibit some degree of uncertainty and nonlinearity, which makes accurate modeling challenging.
When uncertainty analysis is carried out on a high-dimensional complex system, the traditional approach builds a numerical analysis model whose parameters are set from the random variables of the actual system and then solves it with a deterministic classical optimization algorithm. Because the uncertainty of the model output is driven by the uncertainty of the parameters, the system can be analyzed quantitatively through the numerical characteristics of the simulation output. However, in some engineering design problems, different design parameters must be tested and optimized many times to determine the optimal parameters. Such problems involve a large number of repeated simulations, and a single simulation with a physical model can take hours or even days, which is computationally expensive and inefficient. Moreover, most models have no explicit form and cannot be solved directly, so high-dimensional problems are both difficult to solve and computationally heavy.
A proxy model is computationally efficient and simple to apply in the uncertainty analysis and optimization of a complex system, so proxy models based on statistical machine learning can be used in practice. However, when a proxy model processes high-dimensional data, the "dimension disaster" (curse of dimensionality) becomes an unavoidable challenge. In the prior art, all proxy models become unstable and overfit when processing high-dimensional data, so it is impractical to look for a proxy model that is robust and tolerates ultra-high dimensions as a way around the curse of dimensionality. One way to address the problem is instead to process the ultra-high-dimensional data set with a dimension reduction algorithm, bringing it down to a dimension the proxy model can tolerate while preserving the characteristics of the original data set.
It must also be considered that the resulting low-dimensional features, although they represent the original high-dimensional data, inevitably lose some information. In addition, the proxy model places requirements on the dimension of its input data: too low a dimension leads to insufficient training and a poor fit, while too high a dimension leads to overfitting and a heavier model. The dimension reduction algorithm therefore needs to reduce the dimension to the vicinity of the intrinsic dimension, so that the features of the original data set are preserved, while keeping the reduced dimension within the tolerance of the proxy model.
Therefore, in view of the shortcomings of the prior art, a new technical scheme is urgently needed that performs quantitative uncertainty analysis of a complex system more accurately and quickly while resolving the curse of dimensionality caused by high dimension.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems in the related art to some extent.
Therefore, the invention aims to provide a high-dimensional complex system uncertainty analysis method based on statistical machine learning that yields accurate results, reduces the amount of computation and improves computational efficiency while preserving accuracy, avoids the curse of dimensionality, and is highly flexible.
To achieve the above objective, an embodiment of the present invention provides a method for analyzing uncertainty of a high-dimensional complex system based on statistical machine learning, comprising the following steps: selecting the uncertainty factors affecting a high-dimensional complex system and acquiring a high-dimensional random variable input sample matrix; converting the high-dimensional random variable input sample matrix into a low-dimensional random variable sample matrix by a dimension reduction method that combines a non-negative matrix factorization dimension reduction algorithm with intrinsic dimension estimation; evaluating the samples of the high-dimensional random variable input sample matrix one by one to obtain an output response matrix; accurately modeling a random response surface proxy model according to the low-dimensional random variable sample matrix and the output response matrix to obtain a random response surface model that closely approximates the studied high-dimensional complex system; obtaining the mean and variance of the output response of the random response surface model by formula derivation; and analyzing the uncertainty factors according to the mean and the variance to obtain an uncertainty analysis result.
According to the high-dimensional complex system uncertainty analysis method based on statistical machine learning of the embodiment of the invention, a proxy model based on statistical machine learning is used to approximate the high-dimensional complex system, and the calculation result is highly accurate. Compared with the traditional method, the uncertainty analysis method based on statistical machine learning reduces the amount of computation while preserving accuracy, shortening the calculation time and improving computational efficiency. The high-dimensional random variable sample data are not used directly in the proxy model; instead, the dimension of the random variables is effectively reduced by a dimension reduction method, which avoids the curse of dimensionality. Cross-validation is used when modeling the random response surface proxy model, which improves the generalization ability of the model. The statistical characteristics of the random response surface model can be derived directly by formula from the known model parameters, which further reduces the amount of computation and improves computational efficiency. The method thereby overcomes the difficulty that existing uncertainty analysis methods have in regressing the high-dimensional nonlinear data of a complex system, and meets the need for accurate and efficient quantitative uncertainty analysis of high-dimensional complex systems.
In addition, the high-dimensional complex system uncertainty analysis method based on statistical machine learning according to the embodiment of the invention may further have the following additional technical features:
Further, in one embodiment of the invention, the variance is proportional to the uncertainty.
Further, in one embodiment of the present invention, selecting the uncertainty factors affecting the high-dimensional complex system and obtaining the high-dimensional random variable input sample matrix includes: collecting real data for each uncertainty factor to obtain the corresponding mean, variance, and correlation coefficients between different uncertainty factors; and simulating the high-dimensional random variable input sample matrix by the Latin hypercube sampling method according to the corresponding means, variances, and correlation coefficients.
Further, in an embodiment of the present invention, converting the high-dimensional random variable input sample matrix into the low-dimensional random variable sample matrix by a dimension reduction method that combines a non-negative matrix factorization dimension reduction algorithm with intrinsic dimension estimation includes: obtaining the intrinsic dimension of the high-dimensional random variable input sample matrix by combining singular value decomposition, principal component analysis, and enumeration; and, according to the intrinsic dimension, obtaining a preset number of low-dimensional random variable sample matrices corresponding to the intrinsic dimension by non-negative matrix factorization.
Further, in an embodiment of the present invention, accurately modeling the random response surface proxy model according to the low-dimensional random variable sample matrix and the output response matrix, to obtain a random response surface model that closely approximates the studied high-dimensional complex system, includes: taking each of the preset number of low-dimensional random variable sample matrices as input variables and the deterministic output response matrix as the output variable, with a preset percentage of the corresponding samples as the training set and the remaining samples as the test set; performing nonlinear regression on the training-set input and output samples by the least squares method in a second-order and a third-order random response surface model, to obtain several groups of undetermined parameters and training errors for each; selecting the candidate dimension for which the sum of the training errors of the second-order and third-order models is smallest as the intrinsic dimension of the high-dimensional random variable input sample matrix, and taking the undetermined parameters of that group as the parameters of the corresponding random response surface models; substituting the two groups of test-set input samples corresponding to that intrinsic dimension into the two final models to obtain the generalization error of each; and selecting the model with the smaller generalization error as the final proxy model.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of a method for high-dimensional complex system uncertainty analysis based on statistical machine learning according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of high-dimensional complex system uncertainty analysis based on statistical machine learning according to one embodiment of the invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
The following describes a high-dimensional complex system uncertainty analysis method based on statistical machine learning according to an embodiment of the present invention with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method of high-dimensional complex system uncertainty analysis based on statistical machine learning in accordance with one embodiment of the present invention.
As shown in FIG. 1, in the method for analyzing uncertainty of a high-dimensional complex system based on statistical machine learning, the high-dimensional complex system includes a power system, whose uncertainty can be represented by the result of a probabilistic power flow calculation. The method comprises the following steps:
In step S101, the uncertainty factors affecting the high-dimensional complex system are selected, and a high-dimensional random variable input sample matrix is acquired.
It will be appreciated that, as shown in FIG. 2, the primary uncertainty factors affecting the high-dimensional complex system are selected to obtain the samples of the high-dimensional input variables, which together form the high-dimensional random variable input sample matrix.
Further, in one embodiment of the present invention, selecting the uncertainty factors affecting the high-dimensional complex system and obtaining the high-dimensional random variable input sample matrix includes: collecting real data for each uncertainty factor to obtain the corresponding mean, variance, and correlation coefficients between different uncertainty factors; and simulating the high-dimensional random variable input sample matrix by the Latin hypercube sampling method according to the corresponding means, variances, and correlation coefficients.
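As an illustration of the sampling in step S101, the following minimal Python sketch draws correlated samples by Latin hypercube sampling. It is not the patented implementation: the Gaussian marginals, the Cholesky step used to impose the correlation, the function name latin_hypercube_normal, and all numerical values are assumptions chosen for the example.

```python
from statistics import NormalDist

import numpy as np


def latin_hypercube_normal(n_samples, means, variances, corr, seed=0):
    """Correlated Gaussian samples: LHS for the marginals, Cholesky for the correlation."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    std = np.sqrt(np.asarray(variances, dtype=float))
    dim = means.size

    # Latin hypercube on (0, 1): one point per equal-probability stratum,
    # with the strata shuffled independently in every column.
    u = np.empty((n_samples, dim))
    for j in range(dim):
        strata = rng.permutation(n_samples)
        u[:, j] = (strata + rng.random(n_samples)) / n_samples
    u = np.clip(u, 1e-12, 1.0 - 1e-12)  # keep the inverse CDF finite

    # Map the uniform points to independent standard normal scores.
    z = np.vectorize(NormalDist().inv_cdf)(u)

    # Impose the target correlation with a Cholesky factor, then rescale.
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    return means + (z @ chol.T) * std


# Hypothetical statistics for temperature (deg C) and illumination (W/m^2).
samples = latin_hypercube_normal(
    n_samples=1000,
    means=[25.0, 600.0],
    variances=[9.0, 10000.0],
    corr=[[1.0, 0.6], [0.6, 1.0]],
)
print(samples.shape)  # (1000, 2)
```

The linear mixing reproduces the target correlation approximately, but only the first column stays exactly stratified afterwards; a rank-based scheme such as Iman-Conover would preserve the stratification of every marginal.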
In step S102, the high-dimensional random variable input sample matrix is converted into a low-dimensional random variable sample matrix by a dimension reduction method that combines a non-negative matrix factorization dimension reduction algorithm with intrinsic dimension estimation.
It will be appreciated that, as shown in FIG. 2, the high-dimensional random variable input sample matrix obtained in step S101 is converted into a low-dimensional random variable sample matrix by a dimension reduction method that combines a non-negative matrix factorization dimension reduction algorithm with intrinsic dimension estimation.
Further, in one embodiment of the present invention, converting the high-dimensional random variable input sample matrix into the low-dimensional random variable sample matrix by a dimension reduction method that combines a non-negative matrix factorization dimension reduction algorithm with intrinsic dimension estimation comprises: obtaining the intrinsic dimension of the high-dimensional random variable input sample matrix by combining singular value decomposition, principal component analysis, and enumeration; and, according to the intrinsic dimension, obtaining a preset number of low-dimensional random variable sample matrices corresponding to the intrinsic dimension by non-negative matrix factorization.
It should be noted that the preset number depends on the enumeration range of the intrinsic dimension, and the enumeration steps through candidate values in increments of one unit above and below the estimated intrinsic dimension.
Specifically, step 21: the intrinsic dimension of the high-dimensional random variable input sample matrix from step S101 is obtained by combining singular value decomposition, principal component analysis, and enumeration, specifically:
step 211: singular value decomposition is performed on the high-dimensional random variable input sample matrix from step S101:
step 212: the principal components are selected from the result of step 211 using the principle of principal component analysis:
wherein,
step 213: candidate values are enumerated within a range and taken as the undetermined intrinsic dimension.
Step 22: based on the intrinsic dimension obtained in step 21, 5 low-dimensional random variable sample matrices corresponding to the candidate intrinsic dimensions are obtained by non-negative matrix factorization.
In step S103, the samples of the high-dimensional random variable input sample matrix are evaluated one by one to obtain an output response matrix.
It will be appreciated that, as shown in FIG. 2, the samples of the high-dimensional random variable input sample matrix from step S101 are evaluated one by one by a conventional method to obtain the output response matrix. Conventional methods may include experimental measurement, physical modeling, and the like.
In step S104, the random response surface proxy model is accurately modeled according to the low-dimensional random variable sample matrix and the output response matrix, to obtain a random response surface model that closely approximates the studied high-dimensional complex system.
It will be appreciated that the low-dimensional random variable sample matrix obtained in step S102 is used as the input variable and the deterministic output response matrix obtained in step S103 as the output variable; the random response surface proxy model is accurately fitted and its model parameters are calculated, giving a random response surface model that closely approximates the studied high-dimensional complex system.
Further, in one embodiment of the present invention, accurately modeling the random response surface proxy model according to the low-dimensional random variable sample matrix and the output response matrix, to obtain a random response surface model that closely approximates the studied high-dimensional complex system, includes: taking each of the preset number of low-dimensional random variable sample matrices as input variables and the deterministic output response matrix as the output variable, with a preset percentage of the corresponding samples as the training set and the remaining samples as the test set; performing nonlinear regression on the training-set input and output samples by the least squares method in a second-order and a third-order random response surface model, to obtain several groups of undetermined parameters and training errors for each; selecting the candidate dimension for which the sum of the training errors of the second-order and third-order models is smallest as the intrinsic dimension of the high-dimensional random variable input sample matrix, and taking the undetermined parameters of that group as the parameters of the corresponding random response surface models; substituting the two groups of test-set input samples corresponding to that intrinsic dimension into the two final models to obtain the generalization error of each; and selecting the model with the smaller generalization error as the final proxy model.
Specifically, step 41: the 5 low-dimensional random variable sample matrices obtained in step S102 are each taken as input variables, the deterministic output response matrix obtained in step S103 is taken as the output variable, and 70% of the corresponding samples are used as the training set and 30% as the test set;
step 42: nonlinear regression is performed on the training-set input and output samples from step 41 by the least squares method in a second-order and a third-order random response surface model, giving 5 groups of undetermined parameters and training errors for each order; the candidate dimension with the smallest training errors for the second-order and third-order models is taken as the intrinsic dimension of the high-dimensional random variable input sample matrix from step S101, and the undetermined parameters of that group are used as the parameters of the corresponding random response surface models;
step 43: the two groups of test-set input samples corresponding to the intrinsic dimension finally determined in step 42 are substituted into the two final models from step 42, and the resulting response values are compared with the test-set response values from step 41 to obtain the generalization error of each model;
step 44: the model with the smaller generalization error is selected as the final proxy model.
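Steps 41 to 44 can be sketched as the selection loop below. Several points are assumptions made for the example rather than details taken from the text: a single response column is fitted, the response surface is written in plain monomials up to the given order instead of a specific orthogonal basis, errors are measured as root-mean-square errors, and the candidate dimension is chosen by the smallest sum of the second-order and third-order training errors. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

import numpy as np


def poly_features(Z, order):
    """All monomials of the columns of Z up to total degree `order`."""
    cols = [np.ones(len(Z))]
    for deg in range(1, order + 1):
        for idx in combinations_with_replacement(range(Z.shape[1]), deg):
            cols.append(np.prod(Z[:, list(idx)], axis=1))
    return np.column_stack(cols)


def fit(Z, y, order):
    """Least-squares estimate of the undetermined polynomial parameters."""
    coef, *_ = np.linalg.lstsq(poly_features(Z, order), y, rcond=None)
    return coef


def rmse(Z, y, order, coef):
    return float(np.sqrt(np.mean((poly_features(Z, order) @ coef - y) ** 2)))


def select_surrogate(candidates, y, train_frac=0.7, seed=0):
    """candidates maps a candidate dimension d to an (N x d) sample matrix."""
    n = len(y)
    perm = np.random.default_rng(seed).permutation(n)
    cut = int(train_frac * n)
    tr, te = perm[:cut], perm[cut:]  # 70% training, 30% test by default

    # Step 42: fit second- and third-order models for every candidate.
    fits = {}
    for d, Z in candidates.items():
        fits[d] = {}
        for order in (2, 3):
            coef = fit(Z[tr], y[tr], order)
            fits[d][order] = (coef, rmse(Z[tr], y[tr], order, coef))
    best_d = min(fits, key=lambda d: fits[d][2][1] + fits[d][3][1])

    # Steps 43 and 44: compare the two final models on the held-out samples.
    Z = candidates[best_d]
    test_err = {o: rmse(Z[te], y[te], o, fits[best_d][o][0]) for o in (2, 3)}
    best_order = min(test_err, key=test_err.get)
    return best_d, best_order, fits[best_d][best_order][0], test_err
```

The number of polynomial parameters grows quickly with the input dimension, so the third-order fit is only well determined when the training set has clearly more samples than monomials.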
In step S105, the mean and variance of the output response of the random response surface model are obtained by formula derivation.
It will be appreciated that the mean and variance of the output response of the random response surface model are obtained by formula derivation, and both consist of the known polynomial parameters of the model obtained in step S104.
In one embodiment of the invention, the variance is proportional to the uncertainty.
Specifically, the mean and variance of the output response of the random response surface model from step S104 are obtained by formula derivation [1]; both derivation results consist of the known polynomial parameters of the model obtained in step S104.
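The derived formulas themselves are not legible in this text. As a hedged illustration only, consider the common case in which a second-order random response surface is written in Hermite polynomials of d independent standard normal inputs; the symbols a_0, a_i, a_ii, a_ij, and ξ_i below are notation introduced here, not the patent's own, and the independence and normality of the inputs are assumptions:

```latex
y = a_0 + \sum_{i=1}^{d} a_i \xi_i + \sum_{i=1}^{d} a_{ii}\,(\xi_i^2 - 1) + \sum_{i<j} a_{ij}\,\xi_i \xi_j

\mathbb{E}[y] = a_0

\operatorname{Var}[y] = \sum_{i=1}^{d} a_i^2 + 2 \sum_{i=1}^{d} a_{ii}^2 + \sum_{i<j} a_{ij}^2
```

Under these assumptions every basis term other than the constant has zero mean and the terms are mutually uncorrelated, so the mean is the constant coefficient and the variance is a weighted sum of squared coefficients; the factor 2 appears because the variance of ξ_i² − 1 is 2. No sampling of the fitted model is needed, which is the computational saving the description refers to.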
In step S106, the uncertainty factor is analyzed according to the mean and the variance, and an uncertainty analysis result is obtained.
It will be appreciated that the uncertainty of the studied high-dimensional complex system is analyzed from the resulting mean and variance: a larger variance indicates larger fluctuation and stronger uncertainty.
The method according to the embodiment of the invention is applied to a specific probabilistic power flow calculation as follows:
Step S1: Illumination and temperature, as meteorological conditions, affect the output of photovoltaic units and air conditioners and thereby indirectly affect the probabilistic power flow result. Real data on illumination and temperature are collected to obtain the corresponding means, variances, and Pearson correlation coefficients between the variables, and the Latin hypercube sampling method is used to simulate the temperature and illumination input variables, giving the high-dimensional random variable input sample matrix.
Step S2: The intrinsic dimension of the high-dimensional random variable input sample matrix from step S1 is obtained by combining singular value decomposition, principal component analysis, and enumeration:
Step S3: The principal components are selected from the result of step S2 using the principle of principal component analysis:
wherein
Step S4: Candidate values are enumerated within a range and taken as the undetermined intrinsic dimension.
Step S5: Based on the intrinsic dimension obtained in steps S2 to S4, 5 low-dimensional random variable sample matrices corresponding to the candidate intrinsic dimensions are obtained by non-negative matrix factorization.
Step S6: Using a physical model method with the MATPOWER software, the samples of the high-dimensional random variable input sample matrix from step S1 are evaluated one by one to obtain the probabilistic power flow output response matrix.
Step S7: The 5 low-dimensional random variable sample matrices obtained in step S5 are each taken as input variables, the deterministic output response matrix obtained in step S6 is taken as the output variable, and 70% of the corresponding samples are used as the training set and 30% as the test set.
Step S8: Nonlinear regression is performed on the training-set input and output samples from step S7 by the least squares method in a second-order and a third-order random response surface model, giving 5 groups of undetermined parameters and training errors for each order. The candidate dimension with the smallest training errors for the second-order and third-order models is taken as the intrinsic dimension of the high-dimensional random variable input sample matrix from step S1, and the undetermined parameters of that group are used as the parameters of the corresponding random response surface models.
Step S9: The two groups of test-set input samples corresponding to the intrinsic dimension finally determined in step S8 are substituted into the two final models from step S8, and the resulting response values are compared with the test-set response values from step S7 to obtain the generalization error of each model.
Step S10: The model with the smaller generalization error is selected as the final proxy model.
Step S11: The mean and variance of the output response of the random response surface model from step S10 are obtained by formula derivation [1]; both derivation results consist of the known polynomial parameters of that model.
Step S12: The uncertainty of the studied high-dimensional complex system is analyzed from the mean and variance obtained in step S11; the larger the variance, the larger the fluctuation and the stronger the uncertainty.
According to the high-dimensional complex system uncertainty analysis method based on statistical machine learning provided by the embodiment of the invention, a proxy model based on statistical machine learning is used to approximate the high-dimensional complex system, and the calculation result is highly accurate. Compared with the traditional method, the uncertainty analysis method based on statistical machine learning reduces the amount of computation while preserving accuracy, shortening the calculation time and improving computational efficiency. The high-dimensional random variable sample data are not used directly in the proxy model; instead, the dimension of the random variables is effectively reduced by a dimension reduction method, which avoids the curse of dimensionality. Cross-validation is used when modeling the random response surface proxy model, which improves the generalization ability of the model. The statistical characteristics of the random response surface model can be derived directly by formula from the known model parameters, which further reduces the amount of computation and improves computational efficiency. The method thereby overcomes the difficulty that existing uncertainty analysis methods have in regressing the high-dimensional nonlinear data of a complex system, and meets the need for accurate and efficient quantitative uncertainty analysis of high-dimensional complex systems.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present invention, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided they do not contradict one another.
While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the invention, and that changes, modifications, substitutions and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.

Claims (4)

1. A high-dimensional complex system uncertainty analysis method based on statistical machine learning, applied to a power system, characterized by comprising the following steps:
selecting uncertainty factors affecting the high-dimensional complex system, the uncertainty factors being illumination and temperature; collecting real data for each uncertainty factor to obtain the corresponding mean, variance, and correlation coefficients among the different uncertainty factors; and simulating by the Latin hypercube sampling method according to the corresponding means, variances and correlation coefficients to obtain a high-dimensional random variable input sample matrix;
converting the high-dimensional random variable input sample matrix into a low-dimensional random variable sample matrix by a high-dimensional reduction method that combines a non-negative matrix factorization dimensionality reduction algorithm with intrinsic dimension estimation;
evaluating the samples of the high-dimensional random variable input sample matrix one by one to obtain an output response matrix;
accurately modeling a random response surface proxy model according to the low-dimensional random variable sample matrix and the output response matrix to obtain a random response surface model that closely approximates the high-dimensional complex system under study;
obtaining the mean and variance of the output response of the random response surface model by formula derivation; and
analyzing the uncertainty factors according to the mean and the variance to obtain an uncertainty analysis result.
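The sampling step of claim 1 can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the patented procedure: both factors are assumed to be normally distributed, the correlation is induced by a simple two-variable linear mixing (which preserves exact Latin hypercube stratification only for the first variable), and the means, standard deviations and correlation coefficient in the usage line are hypothetical.

```python
import math
import random
from statistics import NormalDist

def lhs_standard_normal(n, dims, rng):
    """Latin hypercube sample of size n for `dims` independent N(0, 1) variables.

    Each variable's [0, 1] range is cut into n equal-probability strata,
    one point is drawn per stratum, and the strata are randomly permuted.
    """
    std = NormalDist()
    columns = []
    for _ in range(dims):
        strata = list(range(n))
        rng.shuffle(strata)
        col = []
        for k in strata:
            u = (k + rng.random()) / n
            u = min(max(u, 1e-12), 1.0 - 1e-12)   # keep inv_cdf away from 0 and 1
            col.append(std.inv_cdf(u))
        columns.append(col)
    return [list(row) for row in zip(*columns)]

def correlated_pair(n, mu, sigma, rho, seed=0):
    """n samples of two normal factors with the given means, standard
    deviations and correlation coefficient rho (assumed |rho| < 1)."""
    rng = random.Random(seed)
    z = lhs_standard_normal(n, 2, rng)
    c = math.sqrt(1.0 - rho * rho)
    return [(mu[0] + sigma[0] * z1,
             mu[1] + sigma[1] * (rho * z1 + c * z2)) for z1, z2 in z]

# Hypothetical statistics for (illumination, temperature).
samples = correlated_pair(1000, mu=(600.0, 25.0), sigma=(150.0, 5.0), rho=0.6)
```

Each row of `samples` plays the role of one row of the input sample matrix; with more than two factors, a Cholesky factor of the correlation matrix would replace the two-variable mixing.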
2. The method of claim 1, wherein the variance is proportional to the uncertainty.
3. The method of claim 1, wherein said converting the high-dimensional random variable input sample matrix into a low-dimensional random variable sample matrix by a high-dimensional reduction method that combines a non-negative matrix factorization dimensionality reduction algorithm with intrinsic dimension estimation comprises:
obtaining the intrinsic dimension of the high-dimensional random variable input sample matrix by combining singular value decomposition, principal component analysis and enumeration;
obtaining, according to the intrinsic dimension, a preset number of low-dimensional random variable sample matrices corresponding to the intrinsic dimension by non-negative matrix factorization.
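The two steps of claim 3 can be sketched as follows, assuming NumPy is available. The intrinsic dimension rule used here (the smallest number of singular values of the centered data that carries a chosen share of the total energy) and the Lee-Seung multiplicative updates for the factorization are common textbook choices substituted for illustration; the claim does not specify either, and the function names and the 0.95 threshold are hypothetical.

```python
import numpy as np

def estimate_intrinsic_dim(X, energy=0.95):
    """Smallest k such that the k largest singular values of the centered
    data hold at least `energy` of the total squared singular values."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(ratio, energy) + 1)

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (n x d) as W @ H with W (n x k) and
    H (k x d) non-negative, using multiplicative updates that reduce the
    Frobenius reconstruction error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, d)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic non-negative data that lies in a 3-dimensional subspace.
rng = np.random.default_rng(1)
X = rng.random((200, 3)) @ rng.random((3, 10))
k = estimate_intrinsic_dim(X)
W, H = nmf(X, k)   # W is the low-dimensional sample matrix (200 x k)
```

The input sample matrix must be non-negative for this factorization; samples that can be negative would need shifting or rescaling first.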
4. The method of claim 3, wherein said accurately modeling a random response surface proxy model according to the low-dimensional random variable sample matrix and the output response matrix to obtain a random response surface model that closely approximates the high-dimensional complex system under study comprises:
taking each of the preset number of low-dimensional random variable sample matrices as input variables and the deterministic output response matrix as the output variable, taking a preset percentage of the samples in each case as the training set, and taking the remaining samples as the test set;
performing nonlinear regression on the input and output samples of the training set with a second-order random response surface model and a third-order random response surface model respectively, using the least squares method, to obtain a plurality of groups of undetermined parameters and training errors; taking the intrinsic dimension for which the sum of the training errors of the second-order model and the third-order model is smallest as the intrinsic dimension of the high-dimensional random variable input sample matrix; and selecting the undetermined parameters of that group as the corresponding parameters of the random response surface models;
substituting the test set input samples corresponding to the intrinsic dimension of the high-dimensional random variable input sample matrix into the two final models respectively to obtain the generalization error of each model;
selecting the model with the smaller generalization error as the final proxy model.
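The selection procedure of claim 4 can be sketched as follows, assuming NumPy is available. Plain monomials of total degree up to 2 or 3 stand in for the response surface basis, root-mean-square error stands in for the training and generalization errors, and the 80% training share stands in for the preset percentage; all three are assumptions made for illustration, as are the function names.

```python
import itertools
import numpy as np

def poly_features(Z, order):
    """Constant term plus all monomials of total degree <= order in Z's columns."""
    n, k = Z.shape
    cols = [np.ones(n)]
    for deg in range(1, order + 1):
        for idx in itertools.combinations_with_replacement(range(k), deg):
            cols.append(np.prod(Z[:, list(idx)], axis=1))
    return np.column_stack(cols)

def fit_and_score(Z_tr, y_tr, Z_te, y_te, order):
    """Least-squares fit on the training set; returns (coef, train RMSE, test RMSE)."""
    A = poly_features(Z_tr, order)
    coef, *_ = np.linalg.lstsq(A, y_tr, rcond=None)
    train = np.sqrt(np.mean((A @ coef - y_tr) ** 2))
    test = np.sqrt(np.mean((poly_features(Z_te, order) @ coef - y_te) ** 2))
    return coef, float(train), float(test)

def select_dimension_and_model(candidates, y, train_frac=0.8, seed=0):
    """candidates maps each candidate intrinsic dimension to its low-dimensional
    sample matrix. Returns (dimension, order, coefficients) of the chosen model."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))
    n_tr = int(train_frac * len(y))
    tr, te = perm[:n_tr], perm[n_tr:]
    fits = {k: {p: fit_and_score(Z[tr], y[tr], Z[te], y[te], p) for p in (2, 3)}
            for k, Z in candidates.items()}
    # Dimension with the smallest summed training error of both orders.
    best_k = min(fits, key=lambda k: fits[k][2][1] + fits[k][3][1])
    # Between the two orders at that dimension, keep the smaller test error.
    best_order = min((2, 3), key=lambda p: fits[best_k][p][2])
    return best_k, best_order, fits[best_k][best_order][0]
```

For example, `select_dimension_and_model({1: Z[:, :1], 2: Z}, y)` compares a one-column and a two-column reduction of the same samples. Holding out a test set to compare the two orders is what guards against the third-order model winning only because it has more parameters.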

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910594968.4A CN110442911B (en) 2019-07-03 2019-07-03 High-dimensional complex system uncertainty analysis method based on statistical machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910594968.4A CN110442911B (en) 2019-07-03 2019-07-03 High-dimensional complex system uncertainty analysis method based on statistical machine learning

Publications (2)

Publication Number Publication Date
CN110442911A CN110442911A (en) 2019-11-12
CN110442911B true CN110442911B (en) 2023-11-14

Family

ID=68428501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910594968.4A Active CN110442911B (en) 2019-07-03 2019-07-03 High-dimensional complex system uncertainty analysis method based on statistical machine learning

Country Status (1)

Country Link
CN (1) CN110442911B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353232A (en) * 2020-03-06 2020-06-30 上海交通大学 Core thermal hydraulic program result uncertainty quantitative analysis method and device
CN112733382B (en) * 2021-01-21 2022-06-10 河北工业大学 Global sensitivity analysis method of multi-input multi-output complex system
CN112926233A (en) * 2021-01-25 2021-06-08 北京理工大学 Multi-factor sensitivity analysis method based on spatial interpolation
CN114692529B (en) * 2022-06-02 2022-09-02 中国空气动力研究与发展中心计算空气动力研究所 CFD high-dimensional response uncertainty quantification method and device, and computer equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853239A (en) * 2010-05-06 2010-10-06 复旦大学 Nonnegative matrix factorization-based dimensionality reducing method used for clustering
CN109461091A (en) * 2018-05-25 2019-03-12 中国农业大学 Consider the Calculation of electric charge method and information system of photovoltaic and refrigeration duty correlation
CN109510209A (en) * 2019-01-14 2019-03-22 广东电网有限责任公司 Consider the serial-parallel power grid probability load flow calculation method of the high n-dimensional random variable n containing correlation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090306943A1 (en) * 2006-10-05 2009-12-10 North Carolina State University Methods, systems and computer program products for reduced order model adaptive simulation of complex systems
US20120209575A1 (en) * 2011-02-11 2012-08-16 Ford Global Technologies, Llc Method and System for Model Validation for Dynamic Systems Using Bayesian Principal Component Analysis

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Estimating the failure probability in an integrated energy system considering correlations among failure patterns; Xueqian Fu et al.; Energy; pp. 656-666 *
Estimation of building energy consumption using weather information derived from photovoltaic power plants; Xueqian Fu et al.; Renewable Energy; pp. 130-138 *
Estimation of the failure probability of an integrated energy system based on the first order reliability method; Xueqian Fu et al.; Energy; pp. 1068-1078 *
Probabilistic power flow calculation using the stochastic response surface method considering high-dimensional random variables; Sun Xin et al.; Proceedings of the CSEE; pp. 2551-2560+2823 *

Also Published As

Publication number Publication date
CN110442911A (en) 2019-11-12

Similar Documents

Publication Publication Date Title
CN110442911B (en) High-dimensional complex system uncertainty analysis method based on statistical machine learning
Fu et al. Quantile regression for longitudinal data with a working correlation model
CN104361414B (en) Power transmission line icing prediction method based on correlation vector machine
Kuravsky et al. A numerical technique for the identification of discrete-state continuous-time Markov models
CN110377942B (en) Multi-model space-time modeling method based on finite Gaussian mixture model
CN109033513B (en) Power transformer fault diagnosis method and power transformer fault diagnosis device
CN113205207A (en) XGboost algorithm-based short-term power consumption load fluctuation prediction method and system
CN112101480A (en) Multivariate clustering and fused time sequence combined prediction method
CN107730097B (en) Bus load prediction method and device and computing equipment
CN114583767B (en) Data-driven wind power plant frequency modulation response characteristic modeling method and system
CN112733435A (en) Whole vehicle size matching deviation prediction method based on multi-model fusion
Efendi et al. Maximum-minimum temperature prediction using fuzzy random auto-regression time series model
CN113723541B (en) Slope displacement prediction method based on hybrid intelligent algorithm
Weiß et al. Non-parametric analysis of serial dependence in time series using ordinal patterns
CN114357870A (en) Metering equipment operation performance prediction analysis method based on local weighted partial least squares
CN110909492A (en) Sewage treatment process soft measurement method based on extreme gradient lifting algorithm
CN111061708A (en) Electric energy prediction and restoration method based on LSTM neural network
Qian et al. A new nonlinear risk assessment model based on an improved projection pursuit
CN114814707A (en) Intelligent ammeter stress error analysis method, equipment, terminal and readable medium
CN113539359A (en) Neural induction matrix supplementation-based map convolution network disease related lncRNA gene prediction method
Chong et al. A framework for the continuous calibration of building energy models with uncertainty
Mingoti et al. On capability indices for multivariate autocorrelated processes.
CN110866638A (en) Traffic volume prediction model construction method and device, computer equipment and storage medium
CN111210877A (en) Method and device for deducing physical property parameters
Zhu et al. Surrogating the response PDF of stochastic simulators using generalized lambda distributions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant