WO2022083009A1 - Customized product performance prediction method based on heterogeneous data difference compensation fusion - Google Patents

Customized product performance prediction method based on heterogeneous data difference compensation fusion

Info

Publication number
WO2022083009A1
WO2022083009A1 PCT/CN2021/070983 CN2021070983W WO2022083009A1 WO 2022083009 A1 WO2022083009 A1 WO 2022083009A1 CN 2021070983 W CN2021070983 W CN 2021070983W WO 2022083009 A1 WO2022083009 A1 WO 2022083009A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
sample
eset
data set
training
Prior art date
Application number
PCT/CN2021/070983
Other languages
English (en)
French (fr)
Inventor
张树有
裘乐淼
王阳
周会芳
Original Assignee
浙江大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江大学
Priority to US17/522,921 (US20220122103A1)
Publication of WO2022083009A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • The invention belongs to the field of customized product performance prediction in the "Internet+" environment, and in particular relates to a customized product performance prediction method based on difference-compensation fusion of heterogeneous data.
  • Traditional product performance prediction generally takes one of two forms. The first is deductive prediction based on historical measured performance: starting from accumulated historical measured performance data, it realizes "performance-to-performance" prediction. It is highly credible, but it is costly, has a long cycle and responds slowly.
  • The second is simulation prediction based on structural shape modeling: starting from structural shape modeling and computational simulation, it realizes "shape-to-performance" prediction. It responds quickly and efficiently, but computational simulation has large errors, which makes the performance prediction less credible. Neither method achieves efficient, fast and credible prediction for customized products at the design stage.
  • The purpose of the present invention is to address these deficiencies of the prior art by providing a customized product performance prediction method based on difference-compensation fusion of heterogeneous data.
  • The present invention adopts a BP neural network model as the prediction model, combines it with a deep autoencoder, and performs difference-compensation fusion of heterogeneous data through a neighborhood association method and a similarity difference-compensation method. High-fidelity historical measured performance data are associated with low-fidelity computational simulation performance data and used to correct them, so that variable-fidelity prediction of product performance can be realized. This solves the problem that the performance of customized products is difficult to predict efficiently and credibly at the design stage.
  • The customized product performance prediction method based on difference-compensation fusion of heterogeneous data comprises the following steps:
  • Step (4): select the BP neural network model as the performance prediction model of the customized product, and use the input features and output features selected in step (1) as the input and output of the prediction model.
  • Step (2): first perform data denoising and data supplementation on the historical measured data set; then normalize the historical measured data set and the computational simulation data set separately.
  • In step (3), a neural network model is trained on the historical measured data set and the computational simulation data set, and serves as the deep autoencoder for the data samples.
  • The deep autoencoder consists of an input layer, an encoder, a feature expression layer, a decoder and an output layer; the encoder and the decoder each contain three hidden layers.
  • The input and the output of the deep autoencoder are both the input feature vector of a data sample.
  • Adjacent layers are fully connected.
  • The relu function is used as the activation function between the input layer and a hidden layer, and between one hidden layer and the next.
  • The tanh function is used as the activation function between the hidden layer and the output layer.
  • In step (3), a neighborhood association method is used to associate the encoded computational simulation data set ESet_s with the historical measured training set ESet_htrain. The association proceeds as follows:
  • Initialize an empty label set for each data sample in ESet_s. Select any data sample Sample_k in ESet_htrain; with Sample_k as the center and the neighborhood threshold ε as the radius, add a label to the label set of every data sample of ESet_s located in this neighborhood. The added label is the number of Sample_k.
  • At the same time, set the access attribute of Sample_k to visited. Traverse all data samples in ESet_htrain whose access attribute is unvisited and repeat the labeling of the data samples in their neighborhoods, until every data sample in ESet_htrain has been visited.
  • In step (3), based on the similarity difference-compensation method, the historical measured training set ESet_htrain is used to apply difference-compensation correction to the encoded computational simulation data set ESet_s.
  • The similarity difference-compensation method traverses ESet_s and, for every data sample Sample_l whose label set is not empty, corrects the output feature of the sample according to the following formula:
  • Here M is the number of labels in the label set of Sample_l, that is, the number of data samples in ESet_htrain associated with Sample_l; S_z is the Euclidean distance between Sample_l and the z-th associated data sample in ESet_htrain, which measures the similarity between the two samples; Δy_z is the absolute difference between the output feature vector of Sample_l and the output feature vector of the z-th associated data sample in ESet_htrain; the formula also uses the output feature vector of the z-th associated data sample itself; α and β are hyperparameters.
  • In step (4), the computational simulation data set MSet_s obtained after difference-compensation correction is used as the training sample set, the historical measured validation set ESet_hvalid as the validation sample set, and the historical measured test set ESet_htest as the test sample set. Training is combined with the tabu search algorithm to construct an optimal BP neural network model BPNN_sopt, which serves as the final prediction model.
  • The model consists of an input layer, three hidden layers and an output layer, and adjacent layers are fully connected. The number of neurons in the input layer equals the number of input features of a data sample, and the number of neurons in the output layer equals the number of output features. The three hidden layers have h_1, h_2 and h_3 neurons respectively, and the ranges of h_1, h_2 and h_3 are determined by an empirical formula in which:
  • L is the number of neurons in a hidden layer;
  • n_in is the number of neurons in the input layer;
  • n_out is the number of neurons in the output layer;
  • a is a constant in the interval (1, 10).
  • Different values of h_1, h_2 and h_3 are selected within the corresponding ranges. The model for the current combination of h_1, h_2 and h_3 is trained with the training sample set, and the trained model is verified with the validation sample set to obtain the validation error of that combination.
  • In step (4), the hidden layers are initialized as follows: the weights are all initialized to normally distributed random numbers in [-1, 1], and the biases are all initialized to 0. The activation functions take the form relu-relu-relu-tanh, the loss function is the mean square loss function, and the mini-batch gradient descent method is used to update the weights and biases.
  • The beneficial effects of the present invention are as follows. The method uses a deep autoencoder to encode the input features of the data samples, mapping them from the input space to the feature space, so that the key features of the data are expressed and the correlation and similarity between computational simulation data samples and historical measured data samples become visible.
  • Based on this encoding, the neighborhood association method and the similarity difference-compensation method associate the historical measured data with the computational simulation data, and the high-fidelity historical measured data are used to correct the low-fidelity computational simulation data, so that the two kinds of data are effectively fused.
  • Through this difference-compensation fusion of heterogeneous data, the method realizes variable-fidelity prediction of customized product performance, improves the generalization ability of the performance prediction model, and achieves efficient and credible prediction of customized product performance at the design stage.
  • Fig. 1 is a flow chart of the customized product performance prediction method according to the present invention;
  • Fig. 2 is a topology diagram of the deep autoencoder constructed from the computational simulation data set and the historical measured data set according to an embodiment of the present invention;
  • Fig. 3 is a flow chart of constructing the optimal BP neural network BPNN_sopt from the computational simulation data set after difference-compensation correction according to an embodiment of the present invention.
  • The embodiment takes as its example the prediction of the peak-to-peak value of the car horizontal vibration acceleration of a customized elevator product. A BP neural network model is trained and constructed to establish the mapping between the configuration parameters of the customized elevator product and the peak-to-peak value of the horizontal vibration acceleration of the elevator car, and is used to credibly predict the car horizontal vibration performance of elevator products under different configuration parameters.
  • Fig. 1 is a flow chart of the prediction method constructed according to an embodiment of the present invention. As shown in Fig. 1:
  • The method of the present invention for predicting the performance of a customized product based on difference-compensation fusion of heterogeneous data comprises the following steps:
  • Step 1 Take the configuration parameters of the customized product as the input features and the performance to be predicted as the output feature, and collect data samples.
  • In the embodiment, 15 configuration parameters of the customized elevator product are used as the input features. Among them are the maximum running speed v_max, the maximum running acceleration a_max, the running height H, the density ρ of the hoisting rope, the nominal diameter D of the hoisting rope, the elastic modulus E of the hoisting rope, the mass of the car frame and the damping c_rub. The peak-to-peak value a_hvpp of the car horizontal vibration acceleration of the customized elevator product is used as the output feature, and training data samples are collected accordingly.
  • Step 2 Perform data preprocessing on historical measured data sets and computational simulation data sets, including data denoising, data augmentation, and data normalization. Firstly, for the problems of noise and missing eigenvalues in the measured data samples, data denoising and data supplementation are carried out on the historical measured data set; then the data normalization processing is performed on the historical measured data set and the computational simulation data set respectively.
  • the historical measured data sets and computational simulation data sets before data normalization are shown in Table 1 and Table 2, respectively.
  • Table 1 Historical measured data of the peak-to-peak value of the horizontal vibration acceleration of the elevator car
  • Table 2 The calculated simulation data of the peak-to-peak value of the horizontal vibration acceleration of the elevator car
  • The embodiment of the present invention uses outlier detection to denoise the historical measured data set.
  • A cluster-based outlier detection method organizes the sample points of the data set into "clusters". After clustering is complete, the data samples that cannot be assigned to any cluster are outliers, so outliers are detected while clusters are discovered.
  • The detected outliers are the noise in the data set; removing them from the data set denoises it.
  • The embodiment of the present invention adopts the DBSCAN clustering method.
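  • The denoising step above can be sketched as follows. This is a minimal pure-Python DBSCAN that returns only the indices of noise points; the function name `dbscan_noise` and the parameters `eps` and `min_pts` are illustrative assumptions, since the text gives no parameter values or implementation.

```python
import math

def dbscan_noise(points, eps, min_pts):
    """Return the indices of points that DBSCAN leaves outside every cluster.

    eps and min_pts are illustrative; the source does not state values.
    """
    n = len(points)
    labels = [None] * n  # None = unvisited, -1 = noise, >= 0 = cluster id

    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return [i for i, lab in enumerate(labels) if lab == -1]
```

  • Removing the returned indices from the historical measured data set corresponds to the denoising described above. In practice a library implementation of DBSCAN would normally replace this loop.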
  • During data supplementation of the historical measured data set, a data sample is removed when more than 5 of its feature values are missing; otherwise its missing features are filled with the mean value of the corresponding feature.
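  • A minimal sketch of this supplementation rule, assuming missing values are represented by `None` and that the feature mean is taken over the samples that are kept (the text does not say which samples enter the mean). The name `supplement` is hypothetical.

```python
def supplement(samples, max_missing=5):
    """Drop samples with more than max_missing missing values (None) and
    fill the remaining gaps with the per-feature mean of the kept samples."""
    kept = [s for s in samples if sum(v is None for v in s) <= max_missing]
    if not kept:
        return []
    means = []
    for j in range(len(kept[0])):
        observed = [s[j] for s in kept if s[j] is not None]
        # Assumption: a feature missing in every kept sample stays None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(s)] for s in kept]
```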
  • The input features of the historical measured data set and of the computational simulation data set are normalized separately, so that the input feature values of the data samples all lie in [-1, 1].
  • The data normalization formula is as follows:
  • x_i′ represents the i-th input feature value after normalization;
  • x_i represents the i-th input feature value before normalization;
  • x_i,max represents the maximum value of the i-th input feature, and x_i,min represents the minimum value of the i-th input feature;
  • m represents the number of input features in the data set, with i = 1, ..., m.
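  • The normalization formula itself is not reproduced in this text. Given the symbols defined above and the target range [-1, 1], a natural reading is min-max scaling, x_i′ = 2 (x_i − x_i,min) / (x_i,max − x_i,min) − 1. The sketch below assumes that reading; the function names, and the choice to map a constant feature to 0, are hypothetical.

```python
def fit_min_max(samples):
    """Return the per-feature minima and maxima of a list of feature vectors."""
    cols = list(zip(*samples))
    return [min(c) for c in cols], [max(c) for c in cols]

def normalize(sample, mins, maxs):
    """Scale each feature linearly into [-1, 1] (assumed min-max formula)."""
    out = []
    for x, lo, hi in zip(sample, mins, maxs):
        # A constant feature (hi == lo) carries no information; map it to 0.
        out.append(0.0 if hi == lo else 2.0 * (x - lo) / (hi - lo) - 1.0)
    return out
```

  • Because the two data sets are normalized separately, `fit_min_max` would be called once per data set.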
  • Step 3 Apply difference-compensation correction to the computational simulation data set based on the historical measured data set. The historical measured data set and the computational simulation data set are encoded with the deep autoencoder, which maps the data samples from the input space to the feature space and expresses their key features.
  • The encoded historical measured data set is denoted ESet_h and the encoded computational simulation data set is denoted ESet_s. ESet_h is then divided by random sampling into a training sample set, a validation sample set and a test sample set in the ratio 7:2:1, denoted the historical measured training set ESet_htrain, the historical measured validation set ESet_hvalid and the historical measured test set ESet_htest. Finally, the neighborhood association method associates ESet_s with ESet_htrain, and the similarity difference-compensation method uses ESet_htrain to correct ESet_s; the corrected data set is denoted MSet_s.
  • Step 3.1 Take the historical measured data set and the computational simulation data set as the training sample set, build and train a deep autoencoder, and encode the data samples. The encoded computational simulation data set is denoted ESet_s and the encoded historical measured data set is denoted ESet_h.
  • A BP neural network is used to build the deep autoencoder, whose topology is shown in Fig. 2.
  • The deep autoencoder consists of an input layer, an encoder, a feature expression layer, a decoder and an output layer. The encoder and the decoder each contain three hidden layers, and adjacent layers are fully connected.
  • The input and the output of the deep autoencoder are both the input feature vector of a data sample, so the input layer and the output layer each have as many neurons as there are input features, namely 15. The three hidden layers of the encoder have N_e1, N_e2 and N_e3 neurons respectively; correspondingly, the three hidden layers of the decoder have N_e3, N_e2 and N_e1 neurons.
  • The ranges of N_e1, N_e2 and N_e3 can be determined from a formula in which L_e is the number of neurons in a hidden layer of the encoder/decoder, n_ein is the number of neurons in the input layer, n_eout is the number of neurons in the feature expression layer, and a_e is a constant in the interval (1, 10).
  • Each hidden layer is initialized as follows: the weights are initialized to normally distributed random numbers in [-1, 1], and the biases are initialized to 0. The relu function is used as the activation function between the input layer and a hidden layer and between hidden layers, and the tanh function between the hidden layer and the output layer. The loss function is the mean square loss function, and the mini-batch gradient descent method is used to update the weights and biases.
  • The deep autoencoder is trained on the computational simulation data set and the historical measured data set. After training, the output of the feature expression layer is extracted as the encoded feature vector of a data sample.
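  • The topology described in Step 3.1 can be sketched as below. This is a forward-pass-only illustration with training omitted. The hidden sizes (12, 10, 8) and code size 6 in the usage note are hypothetical, the relu activation on the feature expression layer is an assumption, and clipping normal draws to [-1, 1] is one possible reading of the weight initialization.

```python
import math
import random

def relu(v):
    return max(0.0, v)

def init_layer(n_in, n_out, rng):
    # Assumed reading: normal draws clipped to [-1, 1]; biases start at 0.
    w = [[max(-1.0, min(1.0, rng.gauss(0.0, 0.5))) for _ in range(n_in)]
         for _ in range(n_out)]
    return w, [0.0] * n_out

def dense(x, layer, act):
    w, b = layer
    return [act(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def build_autoencoder(n_features, enc_hidden, n_code, seed=0):
    """Layers: input -> 3 encoder layers -> code -> 3 decoder layers -> output."""
    rng = random.Random(seed)
    sizes = [n_features, *enc_hidden, n_code, *reversed(enc_hidden), n_features]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def encode(x, layers):
    # The first half of the layers ends at the feature expression layer.
    for layer in layers[:len(layers) // 2]:
        x = dense(x, layer, relu)
    return x

def reconstruct(x, layers):
    for layer in layers[:-1]:
        x = dense(x, layer, relu)
    return dense(x, layers[-1], math.tanh)  # tanh on the output layer
```

  • For example, `build_autoencoder(15, (12, 10, 8), 6)` gives a mirrored 15-12-10-8-6-8-10-12-15 network. After training (not shown), `encode` would supply the feature vectors that make up ESet_s and ESet_h.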
  • Step 3.2 Divide the encoded historical measured data set ESet_h.
  • ESet_h is divided by random sampling into a training sample set, a validation sample set and a test sample set, with the numbers of training, validation and test samples in the ratio 7:2:1. The resulting sets are denoted the historical measured training set ESet_htrain, the historical measured validation set ESet_hvalid and the historical measured test set ESet_htest.
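  • A minimal sketch of the 7:2:1 random split. The function name `split_721` and the fixed seed are assumptions, and rounding leftovers are assigned to the test set.

```python
import random

def split_721(samples, seed=0):
    """Shuffle the sample indices and split them 7:2:1 into train/valid/test."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_train = len(idx) * 7 // 10
    n_valid = len(idx) * 2 // 10
    train = [samples[i] for i in idx[:n_train]]
    valid = [samples[i] for i in idx[n_train:n_train + n_valid]]
    test = [samples[i] for i in idx[n_train + n_valid:]]
    return train, valid, test
```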
  • Step 3.3 Use the neighborhood association method to associate the encoded computational simulation data set ESet_s with the historical measured training set ESet_htrain.
  • The association proceeds as follows. Initialize an empty label set for each data sample in ESet_s. Select any data sample Sample_k in ESet_htrain; with Sample_k as the center and the neighborhood threshold ε as the radius, add a label to the label set of every data sample of ESet_s located in this neighborhood (a hypersphere). The added label is the number of Sample_k, and at the same time the access attribute of Sample_k is set to visited.
  • Traverse all data samples in ESet_htrain whose access attribute is unvisited and repeat the above process until every data sample in ESet_htrain has been visited. Consequently, any data sample in ESet_s may be associated with zero, one or several data samples in ESet_htrain.
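  • The association in Step 3.3 can be sketched as follows. Visiting each Sample_k exactly once is expressed as a plain loop over ESet_htrain; the function and argument names are hypothetical, and the label is taken to be the index of Sample_k in the training set.

```python
import math

def associate(sim_codes, train_codes, eps):
    """For each encoded simulation sample, collect the indices of the encoded
    measured training samples whose eps-neighborhood contains it."""
    labels = [set() for _ in sim_codes]            # one empty label set per sample
    for k, center in enumerate(train_codes):       # each Sample_k is visited once
        for l, point in enumerate(sim_codes):
            if math.dist(center, point) <= eps:    # inside the hypersphere
                labels[l].add(k)
    return labels
```

  • An empty set in the result means the simulation sample has no associated measured sample and is left uncorrected in Step 3.4.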
  • Step 3.4 Through the similarity difference-compensation method, use the historical measured training set ESet_htrain to apply difference-compensation correction to the encoded computational simulation data set ESet_s. The corrected computational simulation data set is denoted MSet_s.
  • The similarity between a data sample in ESet_htrain and a data sample in ESet_s is measured by the Euclidean distance between them.
  • The similarity difference-compensation method traverses ESet_s and, for every data sample Sample_l whose label set is not empty, corrects the output feature of the sample according to the following formula:
  • M represents the number of labels in the label set of Sample_l, that is, the number of data samples in ESet_htrain associated with Sample_l.
  • S_z represents the Euclidean distance between Sample_l and the z-th associated data sample in ESet_htrain, which measures the similarity between the two data samples.
  • Δy_z represents the absolute difference between the output feature vector of Sample_l and the output feature vector of the z-th associated data sample in ESet_htrain; the formula also uses the output feature vector of the z-th associated data sample itself; α and β are hyperparameters.
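  • The correction formula is not reproduced in this text, so the sketch below does not implement it. As a stand-in it uses an assumed rule: each associated measured sample receives a weight exp(−α·S_z), the weights are normalized, and the simulated output is moved a fraction β of the weighted signed gap toward the measured outputs. The roles given to α and β, the use of signed rather than absolute differences, the scalar output, and the name `compensate` are all assumptions made for illustration.

```python
import math

def compensate(y_sim, sim_code, neighbors, alpha=1.0, beta=1.0):
    """Correct one simulated output using its associated measured samples.

    neighbors: list of (train_code, y_measured) pairs for this sample.
    ASSUMED rule, not the source's formula: distance-based weights
    exp(-alpha * S_z), normalized; beta scales the step toward the measurements.
    """
    if not neighbors:
        return y_sim  # empty label set: the sample is left uncorrected
    weights = [math.exp(-alpha * math.dist(sim_code, code)) for code, _ in neighbors]
    total = sum(weights)
    gap = sum(w * (y - y_sim) for w, (_, y) in zip(weights, neighbors)) / total
    return y_sim + beta * gap
```

  • Applying such a function to every sample of ESet_s, with neighbors taken from the label sets of Step 3.3, would yield a corrected set in the role of MSet_s.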
  • Step 4 Select the BP neural network model as the prediction model for the peak-to-peak value of the car horizontal vibration acceleration of the customized elevator product. Use the computational simulation data set MSet_s obtained after difference-compensation correction as the training sample set, the historical measured validation set ESet_hvalid as the validation sample set, and the historical measured test set ESet_htest as the test sample set, and combine training with the tabu search algorithm to build an optimal BP neural network model BPNN_sopt. The process of constructing the optimal model BPNN_sopt is shown in Fig. 3.
  • Step 4.1 Construction and initialization of the model BPNN_s:
  • The model consists of an input layer, three hidden layers and an output layer, and adjacent layers are fully connected.
  • The number of neurons in the input layer equals the number of input features of a data sample, which is 15. The number of neurons in the output layer equals the number of output features of a data sample, which is 1. The three hidden layers have h_1, h_2 and h_3 neurons respectively.
  • The ranges of h_1, h_2 and h_3 are determined by an empirical formula, where L is the number of neurons in a hidden layer, n_in is the number of neurons in the input layer, n_out is the number of neurons in the output layer, and a is a constant in the interval (1, 10).
  • The weights of the three hidden layers are all initialized to normally distributed random numbers in [-1, 1], the biases are all initialized to 0, and the activation functions take the form relu-relu-relu-tanh.
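  • The empirical formula is not reproduced in this text. A widely used rule with exactly these symbols is L = sqrt(n_in + n_out) + a, and the sketch below assumes that rule and an open interval for a; both are assumptions, as is the name `hidden_size_range`.

```python
import math

def hidden_size_range(n_in, n_out, a_min=1, a_max=10):
    """Integer hidden-layer sizes L with L = sqrt(n_in + n_out) + a and
    a strictly between a_min and a_max (assumed empirical rule)."""
    base = math.sqrt(n_in + n_out)
    lo = math.floor(base + a_min) + 1  # smallest integer strictly above base + a_min
    hi = math.ceil(base + a_max) - 1   # largest integer strictly below base + a_max
    return lo, hi

# With the embodiment's 15 inputs and 1 output: sqrt(16) = 4, so 5 < L < 14.
print(hidden_size_range(15, 1))  # (6, 13)
```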
  • Step 4.2 Training and validation of the model BPNN_s: the training process uses the mean square loss function as the loss function and the mini-batch gradient descent method to update the weights and biases. The learning rate is 0.002, the batch size is 30, the learning error target is 10^-3, and the maximum number of learning cycles is 10,000.
  • Each training cycle proceeds as follows: 1) randomly sample one batch of training samples; 2) input the samples into the model in turn, perform the forward calculation and compute the corresponding outputs; 3) compute the loss l_batch of the batch according to the loss function; 4) back-propagate the error and update the weights and biases with the mini-batch gradient descent method; 5) repeat steps 1 to 4 until all training samples of the training sample set MSet_s have been traversed, and accumulate the batch losses to obtain the loss l_sum of the entire training sample set MSet_s; 6) check whether the loss l_sum from step 5 meets the learning error target; if so, model training is complete, otherwise go to the next step; 7) check whether the number of iterations exceeds the maximum number of learning cycles; if so, model training is complete, otherwise the current cycle ends, the next cycle begins, and training returns to step 1.
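  • The control flow of steps 1 to 7 can be sketched as below. To keep the example short, a linear model trained by hand-written gradients stands in for the BP network, which is an assumption of this sketch; batches are formed by shuffling and slicing, one of several ways to sample them. The default hyperparameters are the values stated in Step 4.2.

```python
import random

def train(samples, lr=0.002, batch_size=30, target=1e-3, max_epochs=10_000, seed=0):
    """samples: list of (x_vector, y) pairs. Returns (w, b, epochs_run, l_sum)."""
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    w, b = [0.0] * n_features, 0.0
    for epoch in range(1, max_epochs + 1):           # step 7: cycle limit
        order = samples[:]
        rng.shuffle(order)                           # step 1: random batches
        loss_sum = 0.0
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            grad_w, grad_b, loss_batch = [0.0] * n_features, 0.0, 0.0
            for x, y in batch:                       # step 2: forward pass
                err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
                loss_batch += err * err / len(batch) # step 3: mean square loss
                for j, xj in enumerate(x):
                    grad_w[j] += 2 * err * xj / len(batch)
                grad_b += 2 * err / len(batch)
            w = [wi - lr * g for wi, g in zip(w, grad_w)]  # step 4: update
            b -= lr * grad_b
            loss_sum += loss_batch                   # step 5: accumulate l_sum
        if loss_sum <= target:                       # step 6: error target met
            break
    return w, b, epoch, loss_sum
```

  • In the method itself the forward pass and gradients would come from the three-hidden-layer network of Step 4.1, and the validation set ESet_hvalid would be evaluated after training.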
  • Step 4.3 Optimize the model parameters h_1, h_2 and h_3 with the tabu search algorithm: select different combinations of h_1, h_2 and h_3 within the range determined by L. For each combination, first build and initialize the model BPNN_s according to Step 4.1, then train and verify it according to Step 4.2 to obtain the validation error of that combination. With the validation error as the objective, the tabu search algorithm optimizes h_1, h_2 and h_3 and determines the optimal numbers of hidden-layer neurons h_1opt, h_2opt and h_3opt. In this way, with the current number of hidden layers fixed, an optimal model BPNN_sopt is obtained by training on the corrected computational simulation data set MSet_s.
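  • A minimal sketch of a tabu search over (h_1, h_2, h_3). The source does not describe its neighborhood structure or tabu list, so the choices here (change one size by ±1, a fixed-length list of recently visited combinations) are assumptions. `evaluate` is a hypothetical callable that would build, train and validate the network for one combination and return its validation error.

```python
from collections import deque

def tabu_search(evaluate, start, lo, hi, iterations=50, tabu_size=10):
    """Minimise evaluate((h1, h2, h3)) over integer triples with lo <= h_i <= hi."""
    current = best = tuple(start)
    best_err = evaluate(best)
    tabu = deque([current], maxlen=tabu_size)  # recently visited combinations
    for _ in range(iterations):
        moves = []
        for i in range(3):
            for step in (-1, 1):
                cand = current[:i] + (current[i] + step,) + current[i + 1:]
                if lo <= cand[i] <= hi and cand not in tabu:
                    moves.append(cand)
        if not moves:
            break
        # Take the best non-tabu neighbor even if it is worse than the current
        # point; this is what lets tabu search leave a local minimum.
        err, current = min((evaluate(c), c) for c in moves)
        tabu.append(current)
        if err < best_err:
            best, best_err = current, err
    return best, best_err
```

  • Since every call to `evaluate` implies a full training run, caching results per combination would be advisable in practice.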
  • Step 4.4 Test the current optimal model BPNN_sopt: use the test sample set ESet_htest to test the current optimal model BPNN_sopt and calculate the test error. If the test error meets the requirements, the model is taken as the final product performance prediction model; otherwise, reset the number of hidden layers of the BP neural network and repeat Steps 4.1 to 4.3 to build, train and verify the model.
  • The mean square absolute percentage error is used as the indicator for the test error of the prediction model BPNN_sopt; its expression is as follows:
  • N_valid represents the sample size of the historical measured validation set; the predicted value is the prediction by the model Surr_bpmix of the peak-to-peak value of the elevator car horizontal vibration acceleration for the u-th data sample in the historical measured validation set; the measured value is the measured peak-to-peak value of the elevator car horizontal vibration acceleration for the same sample.
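  • The expression of the error indicator is not reproduced in this text. The sketch below assumes it is the standard mean absolute percentage error, the average of |predicted − measured| / |measured| expressed in percent. Whether the source's "mean square absolute percentage error" differs from this is not recoverable here.

```python
def mape(predicted, measured):
    """Mean absolute percentage error in percent (assumed form of the indicator)."""
    ratios = [abs(p - m) / abs(m) for p, m in zip(predicted, measured)]
    return 100.0 * sum(ratios) / len(ratios)
```

  • For instance, predictions of 110 and 90 against measured values of 100 and 100 give an error of about 10%.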
  • The test error of the constructed prediction model BPNN_sopt on the samples of the test sample set ESet_htest is shown in Table 3.
  • The constructed prediction model BPNN_sopt has a mean square absolute percentage error of 2.79% for the peak-to-peak value of the horizontal vibration acceleration of the elevator car.
  • The smaller the mean square absolute percentage error, the higher the prediction accuracy of the model and the better its prediction performance. Moreover, when the mean square absolute percentage error is below 10%, the prediction accuracy of the model is satisfactory. Therefore, the constructed prediction model BPNN_sopt can credibly predict the peak-to-peak value of the horizontal vibration acceleration of the elevator car.
  • Step 5 Prediction for the data samples to be predicted.
  • For a data sample to be predicted, first normalize its input features using the normalization applied to the computational simulation data set in Step 2, then input it into the deep autoencoder constructed in Step 3 for encoding, and finally input the encoded sample into the prediction model BPNN_sopt. This yields the peak-to-peak value of the car horizontal vibration acceleration of the customized elevator product under different configuration parameters.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Hardware Design (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

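A customized product performance prediction method based on difference-compensation fusion of heterogeneous data. Based on a deep autoencoder, a neighborhood association method and a similarity difference-compensation method, the method uses a historical measured data set to apply difference-compensation correction to a computational simulation data set, and trains a BP neural network on the corrected computational simulation data set as the performance prediction model of the customized product. By combining the deep autoencoder with the neighborhood association method and the similarity difference-compensation method, the method associates low-fidelity computational simulation data with high-fidelity historical measured data and corrects the former with the latter. The difference-compensation fusion of heterogeneous data thus enables variable-fidelity prediction of product performance, improves the generalization ability of the performance prediction model, and achieves efficient and credible prediction of customized product performance at the design stage.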

Description

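Customized product performance prediction method based on heterogeneous data difference compensation fusion
Technical Field
The invention belongs to the field of customized product performance prediction in the "Internet+" environment, and in particular relates to a customized product performance prediction method based on difference-compensation fusion of heterogeneous data.
Background
In the "Internet+" environment, users' demand for personalized products is increasingly prominent, and higher requirements are also placed on users' deep participation in the design process. For user-demand-driven design of personalized customized products, performance prediction helps to respond efficiently and quickly to user requirements at the design stage, and it effectively reduces design cost and shortens the design cycle, which in turn promotes deep user participation in the design process.
Traditional product performance prediction generally takes one of two forms. The first is deductive prediction based on historical measured performance: starting from accumulated historical measured performance data, it realizes "performance-to-performance" prediction. It is highly credible, but it is costly, has a long cycle and responds slowly. The second is simulation prediction based on structural shape modeling: starting from structural shape modeling and computational simulation, it realizes "shape-to-performance" prediction. It responds quickly and efficiently, but computational simulation has large errors, so the credibility of the prediction is low. Neither method achieves efficient, fast and credible prediction for customized products at the design stage.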
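Summary of the Invention
The purpose of the present invention is to address the deficiencies of the prior art by providing a customized product performance prediction method based on difference-compensation fusion of heterogeneous data. The invention adopts a BP neural network model as the prediction model, combines it with a deep autoencoder, and performs difference-compensation fusion of heterogeneous data through a neighborhood association method and a similarity difference-compensation method. High-fidelity historical measured performance data are associated with low-fidelity computational simulation performance data and used to correct them, so that variable-fidelity prediction of product performance can be realized. This solves the problem that the performance of customized products is difficult to predict efficiently and credibly at the design stage.
The purpose of the present invention is achieved by the following technical solution: a customized product performance prediction method based on difference-compensation fusion of heterogeneous data, comprising the following steps:
(1) Take the configuration parameters of the customized product as the input features and the performance to be predicted as the output feature, and collect data samples. Collect measured performance data of existing products to construct the historical measured data set for customized product performance prediction; use computer simulation software to build a virtual simulation model of the customized product and obtain performance data by simulation analysis to construct the computational simulation data set for customized product performance prediction;
(2) Preprocess the historical measured data set and the computational simulation data set.
(3) Apply difference-compensation correction to the computational simulation data set based on the historical measured data set: encode the historical measured data set and the computational simulation data set with a deep autoencoder, mapping the data samples from the input space to the feature space so that their key features are expressed; denote the encoded historical measured data set ESet_h and the encoded computational simulation data set ESet_s; by random sampling, divide ESet_h into a training sample set, a validation sample set and a test sample set, denoted the historical measured training set ESet_htrain, the historical measured validation set ESet_hvalid and the historical measured test set ESet_htest; finally, associate ESet_s with ESet_htrain by the neighborhood association method and, through the similarity difference-compensation method, use ESet_htrain to correct ESet_s; the corrected data set is denoted MSet_s;
(4) Select the BP neural network model as the performance prediction model of the customized product, and use the input features and output features selected in step (1) as the input and output of the prediction model. With the corrected computational simulation data set as the training sample set, combine training with the tabu search algorithm to construct an optimal BP neural network model; then test this model with the historical measured test set ESet_htest to obtain the final performance prediction model of the customized product;
(5) For a data sample to be predicted, first preprocess it in the same way as the computational simulation data set in step (2), then input it into the deep autoencoder constructed in step (3) for encoding, and finally input the encoded sample into the prediction model constructed in step (4). This yields the product performance of the customized product under different configuration parameters.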
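Further, step (2) first performs data denoising and data supplementation on the historical measured data set, and then normalizes the historical measured data set and the computational simulation data set separately.
Further, in step (3), a neural network model is trained on the historical measured data set and the computational simulation data set and serves as the deep autoencoder for the data samples. The deep autoencoder consists of an input layer, an encoder, a feature expression layer, a decoder and an output layer, and the encoder and the decoder each contain three hidden layers. The input and the output of the deep autoencoder are both the input feature vector of a data sample. Adjacent layers are fully connected; the relu function is used as the activation function between the input layer and a hidden layer and between hidden layers, and the tanh function between the hidden layer and the output layer.
Further, in step (3), the neighborhood association method is used to associate the encoded computational simulation data set ESet_s with the historical measured training set ESet_htrain. The association proceeds as follows: initialize an empty label set for each data sample in ESet_s; select any data sample Sample_k in ESet_htrain and, with Sample_k as the center and the neighborhood threshold ε as the radius, add a label to the label set of every data sample of ESet_s located in this neighborhood, the added label being the number of Sample_k; at the same time set the access attribute of Sample_k to visited; traverse all data samples in ESet_htrain whose access attribute is unvisited and repeat the labeling of the data samples in their neighborhoods, until every data sample in ESet_htrain has been visited.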
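Further, in step (3), based on the similarity difference compensation method, the historical measured training set ESet_htrain is used to correct the encoded computational simulation data set ESet_s. The method is: traverse ESet_s and, for every data sample Sample_l whose tag set is not empty, correct the output feature of that sample according to the following formulas:
Figure PCTCN2021070983-appb-000001
Figure PCTCN2021070983-appb-000002
where
Figure PCTCN2021070983-appb-000003
denotes the output feature vector of data sample Sample_l after difference compensation correction;
Figure PCTCN2021070983-appb-000004
denotes the output feature vector of Sample_l before correction, i.e. the output feature vector obtained by simulation analysis; M denotes the number of tags in the tag set of Sample_l, i.e. the number of data samples in ESet_htrain associated with Sample_l; S_z denotes the Euclidean distance between Sample_l and the z-th data sample in ESet_htrain associated with it, which measures the similarity between the two samples; Δy_z denotes the absolute difference between the output feature vector of Sample_l and that of the z-th associated data sample in ESet_htrain;
Figure PCTCN2021070983-appb-000005
denotes the output feature vector of the z-th data sample in ESet_htrain associated with Sample_l; α and β are hyperparameters.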
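Further, in step (4), the corrected computational simulation data set MSet_s is used as the training sample set, the historical measured validation set ESet_hvalid as the validation sample set, and the historical measured test set ESet_htest as the test sample set; in combination with the tabu search algorithm, an optimal BP neural network model BPNN_sopt is trained and used as the final prediction model. The model consists of one input layer, three hidden layers and one output layer, all fully connected. The number of input-layer neurons equals the number of input features of a data sample, the number of output-layer neurons equals the number of output features, and the hidden layers have h_1, h_2 and h_3 neurons respectively, whose ranges are determined by
Figure PCTCN2021070983-appb-000006
where L is the number of hidden-layer neurons, n_in is the number of input-layer neurons, n_out is the number of output-layer neurons, and a is a constant in (1, 10). Different h_1, h_2 and h_3 are chosen within the corresponding ranges; the model under the current combination of h_1, h_2 and h_3 is trained with the training sample set and validated with the validation sample set, giving the validation error for that combination. With the validation error as the objective, the tabu search algorithm optimizes h_1, h_2 and h_3 to determine the optimal hidden-layer sizes h_1opt, h_2opt and h_3opt, so that an optimal model BPNN_sopt is obtained for the current, fixed number of hidden layers. Finally, the optimal model BPNN_sopt is tested with the test sample set: if it meets the requirements it is selected as the final prediction model; otherwise the number of hidden layers of the BP neural network is reset and a new network model is trained.
Further, in step (4), the hidden layers are initialized as follows: all weights are initialized to normally distributed random numbers within [-1, 1] and all biases to 0. The activation functions follow the pattern relu-relu-relu-tanh, the loss function is the mean squared error loss, and the weights and biases are updated by mini-batch gradient descent.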
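The beneficial effects of the present invention are as follows. The method encodes the input features of the data samples with a deep autoencoder, mapping the samples from the input space to a feature space that expresses the key features of the data and exposes the correlation and similarity between computational simulation samples and historical measured samples. On the basis of this encoding, the neighborhood association method and the similarity difference compensation method associate the historical measured data with the computational simulation data and let the high-fidelity historical measured data correct the low-fidelity computational simulation data, thereby fusing the two by difference compensation. Through this fusion of heterogeneous data the method achieves variable-fidelity prediction of customized product performance, improves the generalization ability of the performance prediction model, and achieves efficient and credible prediction of customized product performance at the design stage.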
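Brief Description of the Drawings
Fig. 1 is a flowchart of the customized product performance prediction method according to the present invention;
Fig. 2 is a topology diagram of the deep autoencoder built from the computational simulation data set and the historical measured data set according to an embodiment of the present invention;
Fig. 3 is a flowchart of building the optimal BP neural network BPNN_sopt from the corrected computational simulation data set according to an embodiment of the present invention.
Detailed Description of the Embodiments
The present invention is described in further detail below with reference to the drawings and a specific embodiment.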
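The embodiment predicts the peak-to-peak value of the horizontal vibration acceleration of the car of a customized elevator product. A BP neural network model is trained to establish the mapping between the configuration parameters of the customized elevator product and the peak-to-peak value of the car's horizontal vibration acceleration, so that the horizontal vibration performance of the car can be credibly predicted under different configuration parameters. Fig. 1 is a flowchart of the prediction method built according to the embodiment. As shown in Fig. 1:
The customized product performance prediction method based on heterogeneous data difference compensation fusion of the present invention comprises the following steps: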
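Step 1: Taking the configuration parameters of the customized product as input features and the performance to be predicted as output features, collect data samples. Collect from the enterprise measured performance data of existing products to build a historical measured data set for customized product performance prediction; use computer simulation software to build a virtual simulation model of the customized product, obtain performance data through simulation analysis, and build a computational simulation data set for customized product performance prediction.
In the embodiment, the input features are the customized elevator product's maximum running speed v_max, maximum running acceleration a_max, travel height H, traction rope density ρ, traction rope nominal diameter D, traction rope elastic modulus E, car frame mass m_frame and equivalent moment of inertia J_frame, car mass m_car and equivalent moment of inertia J_car, rated load m_load, guide shoe spring equivalent stiffness k_shoe and damping c_shoe, and damping rubber equivalent stiffness k_rub and damping c_rub. The output feature is the peak-to-peak value a_hvpp of the car's horizontal vibration acceleration. Training data samples are collected accordingly.
The measured peak-to-peak values of car horizontal vibration acceleration of existing elevator products are collected from the enterprise to build the historical measured data set. The experiments are designed by Latin hypercube sampling; a virtual simulation model of the elevator product is then built with the computer simulation software ADAMS and the product development software NX, and the peak-to-peak values of car horizontal vibration acceleration are obtained by Ansys simulation analysis to build the computational simulation data set.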
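Step 2: Preprocess the historical measured data set and the computational simulation data set, including data denoising, data supplementation and data normalization. First, to address the noise and missing feature values present in the measured samples, the historical measured data set is denoised and supplemented; then the historical measured data set and the computational simulation data set are each normalized. The two data sets before normalization are shown in Table 1 and Table 2 respectively.
Table 1: Historical measured data of the peak-to-peak value of elevator car horizontal vibration acceleration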
Figure PCTCN2021070983-appb-000007
Table 2: Computational simulation data of the peak-to-peak value of elevator car horizontal vibration acceleration
Figure PCTCN2021070983-appb-000008
Figure PCTCN2021070983-appb-000009
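In the embodiment, the historical measured data set is denoised by outlier detection. A clustering-based outlier detection method groups the sample points of the data set into clusters; after clustering, any data sample that belongs to no cluster is an outlier, so outliers are detected at the same time as the clusters are found. The detected outliers are the noise in the data set, and removing them denoises the data set. The embodiment uses the DBSCAN clustering method.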
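A minimal sketch of the clustering-based outlier test described above, in plain Python. It relies on the DBSCAN definition of noise: a point that is neither a core point nor within ε of a core point belongs to no cluster. The names `dbscan_noise` and `remove_outliers`, and the values of `eps` and `min_pts` in the usage lines, are illustrative assumptions; the embodiment does not state its DBSCAN parameters.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def dbscan_noise(points: Sequence[Point], eps: float, min_pts: int) -> List[int]:
    """Return the indices of the points that DBSCAN would label as noise."""
    n = len(points)
    # eps-neighbourhood of every point (the point itself is included).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_pts points in its neighbourhood.
    is_core = [len(nb) >= min_pts for nb in neighbours]
    # Noise: not a core point and not within eps of any core point.
    return [
        i for i in range(n)
        if not is_core[i] and not any(is_core[j] for j in neighbours[i])
    ]


def remove_outliers(points: Sequence[Point], eps: float, min_pts: int) -> List[Point]:
    """Drop the noise points and keep every clustered sample."""
    noise = set(dbscan_noise(points, eps, min_pts))
    return [p for i, p in enumerate(points) if i not in noise]


# Hypothetical data: four samples form one cluster, the fifth is isolated.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
noise_idx = dbscan_noise(samples, eps=1.5, min_pts=3)
cleaned = remove_outliers(samples, eps=1.5, min_pts=3)
```

The pairwise distance computation is quadratic in the number of samples, which is acceptable for a sketch; a library implementation would normally be used on larger data sets.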
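In the embodiment, when supplementing the historical measured data set, a data sample with more than 5 missing feature values is removed; otherwise its missing features are filled with the mean value of the corresponding feature.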
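The supplementation rule can be sketched as follows. The sketch assumes that a missing value is represented by `None` and that feature means are computed over the samples that are kept; the text does not specify either point, and the function name `impute_or_drop` is hypothetical.

```python
from typing import List, Optional

Row = List[Optional[float]]  # None marks a missing feature value


def impute_or_drop(rows: List[Row], max_missing: int = 5) -> List[List[float]]:
    """Drop rows with more than max_missing gaps; fill the rest with feature means."""
    kept = [r for r in rows if sum(v is None for v in r) <= max_missing]
    if not kept:
        return []
    means = []
    for j in range(len(kept[0])):
        observed = [r[j] for r in kept if r[j] is not None]
        # Assumption: a feature with no observed value at all is filled with 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in kept]


# Hypothetical three-feature data; the threshold is lowered to 1 for the demo.
raw = [[1.0, None, 3.0], [3.0, 4.0, None], [None, None, None]]
filled = impute_or_drop(raw, max_missing=1)
```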
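In the embodiment, the input features of the historical measured data set and of the computational simulation data set are normalized separately so that all input feature values lie in [-1, 1]. The normalization formula is: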
Figure PCTCN2021070983-appb-000010
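In formula (1), x_i′ denotes the i-th input feature value after normalization, x_i the i-th input feature value before normalization, x_i,max the maximum of the i-th input feature, x_i,min the minimum of the i-th input feature, and m the number of input features of the data set.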
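As an illustration, the sketch below maps each feature onto [-1, 1] with the usual min-max form x′ = 2(x − x_min)/(x_max − x_min) − 1. This form is an assumption that matches the symbols defined above; formula (1) itself is not reproduced in this text. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-feature minimum and maximum of a data set."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale_to_unit_range(
    rows: Sequence[Sequence[float]], mins: Sequence[float], maxs: Sequence[float]
) -> List[List[float]]:
    """Map every feature value onto [-1, 1] using the given minima and maxima."""
    scaled = []
    for row in rows:
        out = []
        for x, lo, hi in zip(row, mins, maxs):
            # Assumption: a constant feature (hi == lo) is mapped to 0.0.
            out.append(0.0 if hi == lo else 2.0 * (x - lo) / (hi - lo) - 1.0)
        scaled.append(out)
    return scaled


# Hypothetical two-feature data set.
data = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
mins, maxs = fit_min_max(data)
normalized = scale_to_unit_range(data, mins, maxs)
```

Keeping the fitted minima and maxima separate from the scaling step lets a sample to be predicted be normalized later with the statistics of the computational simulation data set, as Step 5 requires.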
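Step 3: Correct the computational simulation data set by difference compensation based on the historical measured data set. Encode both data sets with a deep autoencoder, mapping the data samples from the input space to a feature space that expresses their key features; denote the encoded historical measured data set as ESet_h and the encoded computational simulation data set as ESet_s. Then, by random sampling with a training : validation : test ratio of 7:2:1, divide ESet_h into a training sample set, a validation sample set and a test sample set, denoted the historical measured training set ESet_htrain, the historical measured validation set ESet_hvalid and the historical measured test set ESet_htest respectively. Finally, associate ESet_s with ESet_htrain using the neighborhood association method and, through the similarity difference compensation method, correct ESet_s with ESet_htrain; denote the corrected data set as MSet_s.
In the embodiment, the process by which the historical measured training set ESet_htrain corrects the encoded computational simulation data set ESet_s is described in detail below:
Step 3.1: Using the historical measured data set and the computational simulation data set as the training sample set, build and train a deep autoencoder and encode the data samples; denote the encoded computational simulation data set as ESet_s and the encoded historical measured data set as ESet_h.
The deep autoencoder is built as a BP neural network; its topology is shown in Fig. 2. It consists of an input layer, an encoder, a feature expression layer, a decoder and an output layer; the encoder and the decoder each contain three hidden layers, and all layers are fully connected. Both the input and the output of the deep autoencoder are the input feature vector of a data sample, so the input layer and the output layer each have 15 neurons, the number of input features of a data sample. The three hidden layers of the encoder have N_e1, N_e2 and N_e3 neurons respectively, and correspondingly the three hidden layers of the decoder have N_e3, N_e2 and N_e1 neurons. The empirical formula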
Figure PCTCN2021070983-appb-000011
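determines the ranges of N_e1, N_e2 and N_e3, where L_e is the number of hidden-layer neurons of the encoder/decoder, n_ein is the number of input-layer neurons, n_eout is the number of feature-expression-layer neurons, and a_e is a constant in (1, 10). Using ten-fold cross-validation, different combinations of N_e1, N_e2 and N_e3 are chosen within the ranges given by the empirical formula; for each combination the deep autoencoder is built and trained and its cross-validation error computed, and the combination with the smallest cross-validation error gives the hidden-layer sizes of the encoder/decoder.
Further, the hidden layers are initialized as follows: all weights are initialized to normally distributed random numbers within [-1, 1] and all biases to 0. The relu function is used as the activation between the input layer and a hidden layer and between hidden layers, and the tanh function between the last hidden layer and the output layer. The loss function is the mean squared error loss, and the weights and biases are updated by mini-batch gradient descent.
The deep autoencoder is trained on the computational simulation data set and the historical measured data set. After training, the output of the feature expression layer is extracted as the encoded feature vector of a data sample.
Step 3.2: Divide the encoded historical measured data set ESet_h. By random sampling with a training : validation : test ratio of 7:2:1, divide ESet_h into a training sample set, a validation sample set and a test sample set, denoted the historical measured training set ESet_htrain, the historical measured validation set ESet_hvalid and the historical measured test set ESet_htest respectively.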
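A small sketch of the 7:2:1 random division in Step 3.2. The function name, the fixed seed and the rule that rounding leftovers go to the test set are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_7_2_1(samples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the samples and cut them into training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = (7 * len(shuffled)) // 10
    n_valid = (2 * len(shuffled)) // 10
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains after rounding
    return train, valid, test


# Hypothetical stand-in for the encoded set ESet_h: 100 sample indices.
eset_htrain, eset_hvalid, eset_htest = split_7_2_1(list(range(100)))
```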
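Step 3.3: Use the neighborhood association method to associate the encoded computational simulation data set ESet_s with the historical measured training set ESet_htrain.
Definition: let ε be the neighborhood size threshold. With data sample A as the center and ε as the radius, if data sample B lies within the neighborhood (a hypersphere) of A, then B is considered associated with A. In other words, B is associated with A when the Euclidean distance between B and A is less than ε, and not associated otherwise.
Based on this definition, the neighborhood association method proceeds as follows: initialize an empty tag set for every data sample in ESet_s; pick any data sample Sample_k in ESet_htrain and, with Sample_k as the center and the neighborhood threshold ε as the radius, add a tag to the tag set of every data sample of ESet_s that lies within this neighborhood (hypersphere), the tag being the index of Sample_k; at the same time set the access attribute of Sample_k to visited. Traverse all data samples in ESet_htrain whose access attribute is unvisited, repeating the above process until every data sample in ESet_htrain is visited. Consequently, for any data sample in ESet_s, the number of associated data samples in ESet_htrain may be zero or may be more than one.
The above process is expressed by the following pseudocode: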
Figure PCTCN2021070983-appb-000012
Figure PCTCN2021070983-appb-000013
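A Python sketch of the neighborhood association procedure described above, offered as an illustration rather than a reproduction of the pseudocode figures. The encoded samples are assumed to be plain coordinate tuples, and the name `associate` is hypothetical. Iterating over ESet_htrain once plays the role of the visited/unvisited attribute.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def associate(
    eset_s: Sequence[Vector], eset_htrain: Sequence[Vector], eps: float
) -> List[List[int]]:
    """For each sample of ESet_s, list the indices of the ESet_htrain samples within eps."""
    tag_sets: List[List[int]] = [[] for _ in eset_s]  # one empty tag set per sample
    for k, center in enumerate(eset_htrain):  # each measured sample is visited once
        for l, sample in enumerate(eset_s):
            # Associated when the Euclidean distance is less than the threshold.
            if math.dist(center, sample) < eps:
                tag_sets[l].append(k)
    return tag_sets


# Hypothetical two-dimensional codes.
simulated = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
measured = [(0.0, 0.1), (0.9, 0.0)]
tags = associate(simulated, measured, eps=0.5)
```

In this example the third simulated sample ends with an empty tag set, so the similarity difference compensation of Step 3.4 would leave it unchanged.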
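Step 3.4: Through the similarity difference compensation method, use the historical measured training set ESet_htrain to correct the encoded computational simulation data set ESet_s; denote the corrected computational simulation data set as MSet_s.
Based on the encoding of the input features in Step 3.1, the similarity between a data sample in ESet_htrain and a data sample in ESet_s can be measured by the Euclidean distance between them: the larger the distance, the smaller the similarity, and vice versa. The similarity can therefore serve as the weight that a data sample in ESet_htrain carries when correcting a data sample in ESet_s.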
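Specifically, the similarity difference compensation method is as follows: traverse ESet_s and, for every data sample Sample_l whose tag set is not empty, correct the output feature of that sample according to the following formulas:
Figure PCTCN2021070983-appb-000014
Figure PCTCN2021070983-appb-000015
where
Figure PCTCN2021070983-appb-000016
denotes the output feature vector of data sample Sample_l after difference compensation correction;
Figure PCTCN2021070983-appb-000017
denotes the output feature vector of Sample_l before correction, i.e. the output feature vector obtained by simulation analysis; M denotes the number of tags in the tag set of Sample_l, i.e. the number of data samples in ESet_htrain associated with Sample_l; S_z denotes the Euclidean distance between Sample_l and the z-th data sample in ESet_htrain associated with it, which measures the similarity between the two samples; Δy_z denotes the absolute difference between the output feature vector of Sample_l and that of the z-th associated data sample in ESet_htrain;
Figure PCTCN2021070983-appb-000018
denotes the output feature vector of the z-th data sample in ESet_htrain associated with Sample_l; α and β are hyperparameters.
The above process is expressed by the following pseudocode: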
Figure PCTCN2021070983-appb-000019
Figure PCTCN2021070983-appb-000020
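Step 4: Select a BP neural network as the prediction model for the peak-to-peak value of car horizontal vibration acceleration of the customized elevator product. With the corrected computational simulation data set MSet_s as the training sample set, the historical measured validation set ESet_hvalid as the validation sample set and the historical measured test set ESet_htest as the test sample set, an optimal BP neural network model BPNN_sopt is trained in combination with the tabu search algorithm. The procedure for building the optimal model BPNN_sopt is shown in Fig. 3.
Step 4.1: Construction and initialization of model BPNN_s: the model consists of one input layer, three hidden layers and one output layer, all fully connected. The input layer has 15 neurons, the number of input features of a data sample; the output layer has 1 neuron, the number of output features; the hidden layers have h_1, h_2 and h_3 neurons respectively. The empirical formula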
Figure PCTCN2021070983-appb-000021
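determines the ranges of h_1, h_2 and h_3, where L is the number of hidden-layer neurons, n_in is the number of input-layer neurons, n_out is the number of output-layer neurons, and a is a constant in (1, 10). Further, the weights of the three hidden layers are initialized to normally distributed random numbers within [-1, 1], the biases are initialized to 0, and the activation functions follow the pattern relu-relu-relu-tanh.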
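For orientation, the sketch below computes an integer range of hidden-layer sizes. It assumes the widely used empirical rule L = sqrt(n_in + n_out) + a, which fits the symbols defined above; the formula itself appears only as a figure and is not reproduced in this text, so this form is an assumption. The function name is hypothetical.

```python
import math
from typing import Tuple


def hidden_size_range(
    n_in: int, n_out: int, a_low: float = 1.0, a_high: float = 10.0
) -> Tuple[int, int]:
    """Smallest and largest integer L with L = sqrt(n_in + n_out) + a, a_low < a < a_high."""
    base = math.sqrt(n_in + n_out)
    # The interval for a is open, so the two end values are excluded.
    smallest = math.floor(base + a_low) + 1
    largest = math.ceil(base + a_high) - 1
    return smallest, largest


# 15 input features and 1 output feature, as in the embodiment.
low, high = hidden_size_range(15, 1)
```

Under this assumption the embodiment's network would search each of h_1, h_2 and h_3 between 6 and 13 neurons.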
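Step 4.2: Training and validation of model BPNN_s: the training process uses the mean squared error loss as the loss function and mini-batch gradient descent to update the weights and biases, with a learning rate of 0.002, a batch size of 30, a learning error target of 10^-3 and a maximum of 10000 training cycles. The model is trained iteratively on the training sample set MSet_s as follows: 1) randomly sample one batch of training samples; 2) feed the samples into the model in turn and compute the corresponding outputs by forward computation; 3) compute the loss l_batch of the batch from the loss function; 4) back-propagate the error and update the weights and biases by mini-batch gradient descent; 5) repeat 1) to 4) until all training samples of MSet_s have been traversed, and accumulate the batch losses to obtain the loss l_sum over the whole training sample set; 6) check whether l_sum from step 5) meets the learning error target: if so, training is complete, otherwise go to the next step; 7) check whether the number of iterations exceeds the maximum number of training cycles: if so, training is complete; otherwise one cycle is finished and the next cycle begins at step 1).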
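The control flow of the seven training steps can be sketched as below. The network itself is left out: `step` is a hypothetical callback that is assumed to run the forward pass, back-propagation and parameter update for one batch and to return that batch's loss. The defaults mirror the batch size, error target and cycle limit given in the text.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")


def train(
    samples: Sequence[S],
    step: Callable[[List[S]], float],
    batch_size: int = 30,
    loss_target: float = 1e-3,
    max_epochs: int = 10000,
    seed: int = 0,
) -> Tuple[int, float]:
    """Run mini-batch epochs until the summed loss meets the target or epochs run out."""
    rng = random.Random(seed)
    order = list(samples)
    loss_sum = float("inf")
    for epoch in range(1, max_epochs + 1):
        rng.shuffle(order)  # random batches without replacement
        loss_sum = 0.0
        for start in range(0, len(order), batch_size):
            # Steps 1) to 4): one batch, forward pass, loss, parameter update.
            loss_sum += step(order[start:start + batch_size])
        # Step 6): stop as soon as the accumulated loss meets the target.
        if loss_sum <= loss_target:
            return epoch, loss_sum
    # Step 7): the maximum number of cycles has been reached.
    return max_epochs, loss_sum


# Stand-in for a real model: every call halves the reported batch loss.
state = {"loss": 1.0}


def fake_step(batch: List[int]) -> float:
    state["loss"] *= 0.5
    return state["loss"]


epochs_used, final_loss = train(list(range(60)), fake_step)
```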
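The trained model is validated with the validation sample set ESet_hvalid: the validation samples are fed into the model in turn, the corresponding outputs are obtained by forward computation, the sample errors are computed from the loss function, and the errors of all validation samples are summed to give the validation error.
Step 4.3: Optimizing the model parameters h_1, h_2 and h_3 with the tabu search algorithm: different combinations of h_1, h_2 and h_3 are chosen within the range determined by L. For each combination, the model BPNN_s is first built and initialized as in Step 4.1 and then trained and validated as in Step 4.2, giving the validation error for that combination. With the validation error as the objective, the tabu search algorithm optimizes h_1, h_2 and h_3 to determine the optimal hidden-layer sizes h_1opt, h_2opt and h_3opt, so that, for the current fixed number of hidden layers, an optimal model BPNN_sopt is trained on the corrected computational simulation data set MSet_s.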
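A compact tabu search over the triple (h_1, h_2, h_3) might look like the sketch below. The neighborhood (changing one layer size by ±1), the tabu list of recently visited triples, the iteration budget and the stand-in objective are all assumptions for illustration; the text does not specify these details. In practice `error` would build, train and validate BPNN_s as in Steps 4.1 and 4.2 and return the validation error.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Triple = Tuple[int, int, int]


def tabu_search(
    error: Callable[[Triple], float],
    low: int,
    high: int,
    start: Triple,
    iterations: int = 50,
    tabu_size: int = 10,
) -> Tuple[Triple, float]:
    """Minimise error over integer triples in [low, high]^3 with a short-term tabu list."""
    current = start
    best, best_err = start, error(start)
    tabu: Deque[Triple] = deque([start], maxlen=tabu_size)
    for _ in range(iterations):
        scored: List[Tuple[float, Triple]] = []
        for i in range(3):
            for delta in (-1, 1):
                cand = list(current)
                cand[i] += delta
                triple = (cand[0], cand[1], cand[2])
                if low <= cand[i] <= high and triple not in tabu:
                    scored.append((error(triple), triple))
        if not scored:
            break  # every neighbour is tabu or out of range
        # Move to the best admissible neighbour, even if it is worse than current.
        err, current = min(scored)
        tabu.append(current)
        if err < best_err:
            best, best_err = current, err
    return best, best_err


# Hypothetical stand-in for the validation error of a trained network.
def toy_error(h: Triple) -> float:
    return float((h[0] - 8) ** 2 + (h[1] - 10) ** 2 + (h[2] - 7) ** 2)


best_sizes, best_error = tabu_search(toy_error, low=6, high=13, start=(6, 6, 6))
```

Because each evaluation of the real objective means training a network, caching the errors of triples that have already been evaluated would be a natural addition.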
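Step 4.4: Testing the current optimal model BPNN_sopt: test the current optimal model with the test sample set ESet_htest and compute the test error. If the test error meets the requirements, the model is set as the final product performance prediction model; otherwise the number of hidden layers of the BP neural network is reset and Steps 4.1 to 4.3 are repeated to build, train and validate the model.
The mean absolute percentage error (MAPE) is used as the metric for the test error of the prediction model BPNN_sopt. It is expressed as follows: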
Figure PCTCN2021070983-appb-000022
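where N_valid denotes the number of samples in the historical measured sample set on which the error is evaluated (the test set ESet_htest in Step 4.4);
Figure PCTCN2021070983-appb-000023
denotes the value predicted by the model BPNN_sopt for the peak-to-peak value of elevator car horizontal vibration acceleration of the u-th data sample in that set;
Figure PCTCN2021070983-appb-000024
denotes the measured peak-to-peak value of elevator car horizontal vibration acceleration of the u-th data sample in that set.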
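A short sketch of the error metric, assuming the standard definition of the mean absolute percentage error, i.e. the average of |predicted − measured| / |measured| expressed in percent. The sample values are invented for the example and are not taken from Table 3.

```python
from typing import Sequence


def mape(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - m) / abs(m) for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


# Hypothetical predictions and measurements.
error_percent = mape([1.1, 1.8], [1.0, 2.0])
```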
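The test errors of the prediction model BPNN_sopt on the samples of the test sample set ESet_htest are shown in Table 3.
Table 3: Prediction errors of the prediction model BPNN_sopt on the test sample set ESet_htest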
Figure PCTCN2021070983-appb-000025
Figure PCTCN2021070983-appb-000026
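As Table 3 shows, for the samples in the test sample set ESet_htest, the mean absolute percentage error of the prediction model BPNN_sopt for the peak-to-peak value of elevator car horizontal vibration acceleration is 2.79%. The smaller the mean absolute percentage error, the higher the prediction accuracy and the better the predictive performance of the model. Further, a mean absolute percentage error below 10% means that the prediction accuracy meets the requirements. The prediction model BPNN_sopt can therefore credibly predict the peak-to-peak value of elevator car horizontal vibration acceleration.
Step 5: Prediction for a data sample to be predicted. First normalize its input features in the same way as the computational simulation data set in Step 2, then feed it into the deep autoencoder built in Step 3 for encoding, and finally feed the encoded sample into the prediction model BPNN_sopt. This yields the peak-to-peak value of car horizontal vibration acceleration of the customized elevator product under different configuration parameters.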

Claims (4)

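  1. A customized product performance prediction method based on heterogeneous data difference compensation fusion, characterized in that the method comprises the following steps:
    (1) taking the configuration parameters of the customized product as input features and the performance to be predicted as output features, collecting data samples; collecting measured performance data of existing products to build a historical measured data set for customized product performance prediction; using computer simulation software to build a virtual simulation model of the customized product, obtaining performance data through simulation analysis, and building a computational simulation data set for customized product performance prediction;
    (2) preprocessing the historical measured data set and the computational simulation data set;
    (3) correcting the computational simulation data set by difference compensation based on the historical measured data set: encoding both data sets with a deep autoencoder, mapping the data samples from the input space to a feature space that expresses their key features, and denoting the encoded historical measured data set as ESet_h and the encoded computational simulation data set as ESet_s; by random sampling, dividing ESet_h into a training sample set, a validation sample set and a test sample set, denoted the historical measured training set ESet_htrain, the historical measured validation set ESet_hvalid and the historical measured test set ESet_htest respectively; finally, associating ESet_s with ESet_htrain using the neighborhood association method and, through the similarity difference compensation method, correcting ESet_s with ESet_htrain, the corrected data set being denoted MSet_s;
    (4) selecting a BP neural network as the performance prediction model of the customized product, with the input and output features chosen in step (1) as the model's input and output; using the corrected computational simulation data set as the training sample set, training and building an optimal BP neural network model in combination with the tabu search algorithm; then testing the model with the historical measured test set ESet_htest to obtain the final performance prediction model of the customized product;
    (5) for a data sample to be predicted, first preprocessing it in the same way as the computational simulation data set in step (2), then feeding it into the deep autoencoder built in step (3) for encoding, and finally feeding the encoded sample into the prediction model built in step (4) to obtain the performance of the customized product under different configuration parameters;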
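    in step (3), a neural network model is trained on the historical measured data set and the computational simulation data set to serve as the deep autoencoder for the data samples; the deep autoencoder consists of an input layer, an encoder, a feature expression layer, a decoder and an output layer, and the encoder and the decoder each contain three hidden layers; both the input and the output of the deep autoencoder are the input feature vector of a data sample, adjacent layers are fully connected, the relu function is used as the activation between the input layer and a hidden layer and between hidden layers, and the tanh function between the last hidden layer and the output layer;
    in step (3), the neighborhood association method is used to associate the encoded computational simulation data set ESet_s with the historical measured training set ESet_htrain, as follows: initialize an empty tag set for every data sample in ESet_s; pick any data sample Sample_k in ESet_htrain and, with Sample_k as the center and the neighborhood threshold ε as the radius, add a tag to the tag set of every data sample of ESet_s that lies within this neighborhood, the tag being the index of Sample_k; at the same time set the access attribute of Sample_k to visited; traverse all data samples in ESet_htrain whose access attribute is unvisited, repeating the tagging of the samples within their neighborhoods, until every data sample in ESet_htrain is visited;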
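    in step (3), based on the similarity difference compensation method, the historical measured training set ESet_htrain is used to correct the encoded computational simulation data set ESet_s; the method is: traverse ESet_s and, for every data sample Sample_l whose tag set is not empty, correct the output feature of that sample according to the following formulas:
    Figure PCTCN2021070983-appb-100001
    Figure PCTCN2021070983-appb-100002
    where
    Figure PCTCN2021070983-appb-100003
    denotes the output feature vector of data sample Sample_l after difference compensation correction;
    Figure PCTCN2021070983-appb-100004
    denotes the output feature vector of Sample_l before correction, i.e. the output feature vector obtained by simulation analysis; M denotes the number of tags in the tag set of Sample_l, i.e. the number of data samples in ESet_htrain associated with Sample_l; S_z denotes the Euclidean distance between Sample_l and the z-th data sample in ESet_htrain associated with it, which measures the similarity between the two samples; Δy_z denotes the absolute difference between the output feature vector of Sample_l and that of the z-th associated data sample in ESet_htrain;
    Figure PCTCN2021070983-appb-100005
    denotes the output feature vector of the z-th data sample in ESet_htrain associated with Sample_l; α and β are hyperparameters.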
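  2. The customized product performance prediction method based on heterogeneous data difference compensation fusion according to claim 1, characterized in that, in step (2), the historical measured data set is first denoised and its missing data supplemented; then the historical measured data set and the computational simulation data set are each normalized.
  3. The customized product performance prediction method based on heterogeneous data difference compensation fusion according to claim 1, characterized in that, in step (4), the corrected computational simulation data set MSet_s is used as the training sample set, the historical measured validation set ESet_hvalid as the validation sample set, and the historical measured test set ESet_htest as the test sample set; in combination with the tabu search algorithm, an optimal BP neural network model BPNN_sopt is trained and used as the final prediction model; the model consists of one input layer, three hidden layers and one output layer, all fully connected; the number of input-layer neurons equals the number of input features of a data sample, the number of output-layer neurons equals the number of output features, and the hidden layers have h_1, h_2 and h_3 neurons respectively, whose ranges are determined by
    Figure PCTCN2021070983-appb-100006
    where L is the number of hidden-layer neurons, n_in is the number of input-layer neurons, n_out is the number of output-layer neurons, and a is a constant in (1, 10); different h_1, h_2 and h_3 are chosen within the corresponding ranges, the model under the current combination of h_1, h_2 and h_3 is trained with the training sample set and validated with the validation sample set, giving the validation error for that combination; with the validation error as the objective, the tabu search algorithm optimizes h_1, h_2 and h_3 to determine the optimal hidden-layer sizes h_1opt, h_2opt and h_3opt, so that an optimal model BPNN_sopt is obtained for the current, fixed number of hidden layers; finally, the optimal model BPNN_sopt is tested with the test sample set: if it meets the requirements it is selected as the final prediction model; otherwise the number of hidden layers of the BP neural network is reset and a new network model is trained.
  4. The customized product performance prediction method based on heterogeneous data difference compensation fusion according to claim 3, characterized in that, in step (4), the hidden layers are initialized as follows: all weights are initialized to normally distributed random numbers within [-1, 1] and all biases to 0; the activation functions follow the pattern relu-relu-relu-tanh, the loss function is the mean squared error loss, and the weights and biases are updated by mini-batch gradient descent.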
PCT/CN2021/070983 2020-10-20 2021-01-08 Customized product performance prediction method based on heterogeneous data difference compensation fusion WO2022083009A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/522,921 US20220122103A1 (en) 2020-10-20 2021-11-10 Customized product performance prediction method based on heterogeneous data difference compensation fusion

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011124136.5 2020-10-20
CN202011124136.5A CN112257341B (zh) 2020-10-20 2020-10-20 Customized product performance prediction method based on heterogeneous data difference compensation fusion

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/522,921 Continuation US20220122103A1 (en) 2020-10-20 2021-11-10 Customized product performance prediction method based on heterogeneous data difference compensation fusion

Publications (1)

Publication Number Publication Date
WO2022083009A1 true WO2022083009A1 (zh) 2022-04-28

Family

ID=74244369

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/070983 WO2022083009A1 (zh) 2020-10-20 2021-01-08 Customized product performance prediction method based on heterogeneous data difference compensation fusion

Country Status (2)

Country Link
CN (1) CN112257341B (zh)
WO (1) WO2022083009A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115062513A (zh) * 2022-06-20 2022-09-16 嘉兴学院 基于adnn的永磁同步直线电机定位力计算模型构建方法
CN116502544A (zh) * 2023-06-26 2023-07-28 武汉新威奇科技有限公司 一种基于数据融合的电动螺旋压力机寿命预测方法及系统
CN116839783A (zh) * 2023-09-01 2023-10-03 华东交通大学 一种基于机器学习的汽车板簧受力值及变形量的测量方法
CN116861347A (zh) * 2023-05-22 2023-10-10 青岛海洋地质研究所 一种基于深度学习模型的磁力异常数据计算方法
CN117349710A (zh) * 2023-12-04 2024-01-05 七七七电气科技有限公司 一种真空灭弧室开断能力的预测方法、电子设备及介质
CN117369425A (zh) * 2023-12-08 2024-01-09 南昌华翔汽车内外饰件有限公司 汽车仪表总成故障诊断方法、系统、存储介质及计算机

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113189589B (zh) * 2021-05-08 2024-05-17 南京航空航天大学 基于卷积神经网络的多通道合成孔径雷达动目标检测方法
CN113297527B (zh) * 2021-06-09 2022-07-26 四川大学 基于多源城市大数据的pm2.5全面域时空计算推断方法
CN113779910B (zh) * 2021-11-10 2022-02-22 海光信息技术股份有限公司 产品性能分布预测方法及装置、电子设备及存储介质
CN114861498B (zh) * 2022-05-18 2022-11-18 上海交通大学 融合多传感时序信号机理模型的电阻点焊质量在线检测方法
WO2024059965A1 (zh) * 2022-09-19 2024-03-28 浙江大学 基于双通道信息互补融合堆叠自编码器的产品质量预测方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100023307A1 (en) * 2008-07-24 2010-01-28 University Of Cincinnati Methods for prognosing mechanical systems
CN104504292A (zh) * 2015-01-14 2015-04-08 济南大学 基于bp神经网络预测循环流化床锅炉最佳工作温度的方法
CN104636800A (zh) * 2013-11-06 2015-05-20 上海思控电气设备有限公司 基于最小二乘加权的冷冻站系统神经网络优化单元及其方法
CN108153982A (zh) * 2017-12-26 2018-06-12 哈尔滨工业大学 基于堆叠自编码深度学习网络的航空发动机修后性能预测方法
CN110634082A (zh) * 2019-09-24 2019-12-31 云南电网有限责任公司 一种基于深度学习的低频减载系统运行阶段预测方法

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104951836A (zh) * 2014-03-25 2015-09-30 上海市玻森数据科技有限公司 基于神经网络技术的发帖预测系统
CN105426889A (zh) * 2015-11-13 2016-03-23 浙江大学 基于pca混合特征融合的气液两相流流型识别方法
CN105893669A (zh) * 2016-03-30 2016-08-24 浙江大学 一种基于数据挖掘的全局仿真性能预测方法
KR20190117849A (ko) * 2018-03-28 2019-10-17 주식회사 미래이씨피 Iptv 방송의 지역별 송출에서 방송 및 자막영역에서 노출되는 추천 제품 선정시 지역별 판매데이터 분석에 따른 맞춤형 제품 추천 장치
US20200050982A1 (en) * 2018-08-10 2020-02-13 Adp, Llc Method and System for Predictive Modeling for Dynamically Scheduling Resource Allocation
CN109145434A (zh) * 2018-08-16 2019-01-04 北京理工大学 一种利用改进bp神经网络预测广播星历轨道误差的方法
CN111079891A (zh) * 2019-01-18 2020-04-28 兰州理工大学 一种基于双隐含层bp神经网络的离心泵性能预测方法
CN110322933A (zh) * 2019-06-20 2019-10-11 浙江工业大学 一种基于动态误差补偿机制的聚丙烯熔融指数混合建模方法
CN111445010B (zh) * 2020-03-26 2021-03-19 南京工程学院 一种基于证据理论融合量子网络的配网电压趋势预警方法
CN111612029B (zh) * 2020-03-30 2023-08-04 西南电子技术研究所(中国电子科技集团公司第十研究所) 机载电子产品故障预测方法

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115062513A (zh) * 2022-06-20 2022-09-16 嘉兴学院 基于adnn的永磁同步直线电机定位力计算模型构建方法
CN115062513B (zh) * 2022-06-20 2024-05-17 嘉兴学院 基于adnn的永磁同步直线电机定位力计算模型构建方法
CN116861347A (zh) * 2023-05-22 2023-10-10 青岛海洋地质研究所 一种基于深度学习模型的磁力异常数据计算方法
CN116861347B (zh) * 2023-05-22 2024-06-11 青岛海洋地质研究所 一种基于深度学习模型的磁力异常数据计算方法
CN116502544A (zh) * 2023-06-26 2023-07-28 武汉新威奇科技有限公司 一种基于数据融合的电动螺旋压力机寿命预测方法及系统
CN116502544B (zh) * 2023-06-26 2023-09-12 武汉新威奇科技有限公司 一种基于数据融合的电动螺旋压力机寿命预测方法及系统
CN116839783A (zh) * 2023-09-01 2023-10-03 华东交通大学 一种基于机器学习的汽车板簧受力值及变形量的测量方法
CN116839783B (zh) * 2023-09-01 2023-12-08 华东交通大学 一种基于机器学习的汽车板簧受力值及变形量的测量方法
CN117349710A (zh) * 2023-12-04 2024-01-05 七七七电气科技有限公司 一种真空灭弧室开断能力的预测方法、电子设备及介质
CN117349710B (zh) * 2023-12-04 2024-03-22 七七七电气科技有限公司 一种真空灭弧室开断能力的预测方法、电子设备及介质
CN117369425A (zh) * 2023-12-08 2024-01-09 南昌华翔汽车内外饰件有限公司 汽车仪表总成故障诊断方法、系统、存储介质及计算机
CN117369425B (zh) * 2023-12-08 2024-02-27 南昌华翔汽车内外饰件有限公司 汽车仪表总成故障诊断方法、系统、存储介质及计算机

Also Published As

Publication number Publication date
CN112257341B (zh) 2022-04-26
CN112257341A (zh) 2021-01-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21881435

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21881435

Country of ref document: EP

Kind code of ref document: A1