CN113065704A

CN113065704A - Hyper-parameter optimization and post-processing method of non-invasive load decomposition model

Info

Publication number: CN113065704A
Application number: CN202110351229.XA
Authority: CN
Inventors: 谈竹奎; 刘斌; 张秋雁; 唐赛秋; 徐长宝; 林呈辉; 王冕; 高吉普; 欧家祥; 胡厚鹏; 王宇; 古庭赟; 汪明媚; 顾威; 孟令雯
Original assignee: Guizhou Power Grid Co Ltd
Current assignee: Guizhou Power Grid Co Ltd
Priority date: 2021-03-31
Filing date: 2021-03-31
Publication date: 2021-07-02
Anticipated expiration: 2041-03-31
Also published as: CN113065704B

Abstract

The invention discloses a hyper-parameter optimization and post-processing method of a non-intrusive load decomposition model, which comprises the following steps: S1, collecting electrical operation data of the household bus and the target electrical appliance, and establishing a data set of the model; S2, constructing, based on deep learning theory, a non-intrusive load decomposition model based on a deep residual network for each target electrical appliance; S3, optimizing the hyper-parameters of the load decomposition model by using cluster Bayesian optimization; and S4, after the optimal decomposition model is established for each target electrical appliance, training the trainable parameters in the model by using a training set, a validation set and a test set until convergence. The invention realizes load decomposition and introduces a Bayesian optimization method into the optimization of the hyper-parameters of the load decomposition model, which solves the problems of poor effect and low efficiency caused by blind selection of hyper-parameters in traditional methods; the efficiency of the hyper-parameter optimization is achieved through the search behavior of the cluster and the information interaction within the cluster.

Description

Hyper-parameter optimization and post-processing method of non-invasive load decomposition model

Technical Field

The invention relates to the field of non-invasive load decomposition, in particular to a hyper-parameter optimization and post-processing method of a non-invasive load decomposition model.

Background

One of the urgent needs of the smart grid and the energy internet is to acquire the power consumption data of individual electrical appliances, so that users can understand the consumption pattern of each appliance and reduce energy consumption accordingly. This is an important step towards grid transparency and intelligence. Current metering techniques can only read the total power consumption data automatically, and it is difficult to obtain the internal load information of the user. Load decomposition technology has therefore become a major bottleneck in the development of smart grids.

Non-intrusive load disaggregation (NILD), first proposed by Professor Hart in the 1980s, is a technique that estimates the power consumption of each appliance of a user when only the total power demand at the user's bus is known. Compared with intrusive load disaggregation (ILD), it is simple and convenient to install, remove and maintain and requires simpler hardware, and therefore has a wider development prospect.

At present, scholars at home and abroad have studied NILD and obtained certain results. The document (Liruyi, Huangming mountain, Zhou Dong nations, flood, Huwenshan. A non-invasive power load decomposition method based on particle swarm algorithm search [J]. Power System Protection and Control, 2016, 44(08): 30-36.) realizes load decomposition by using particle swarm optimization to search for the best match between the harmonic current and power of each electrical appliance and the total harmonic current and total power; the document (Kolter J Z, Jaakkola T. Approximate inference in additive factorial HMMs with application to energy disaggregation [C]. La Palma, Spain: Microtome Publishing, 2012.) constructs a hidden Markov model for load decomposition, but the performance of the algorithm degrades as the number of appliances increases; the document (WuXin, Hanxiao. A resident user non-intrusive load decomposition algorithm based on the signal sparsity underdetermined solution [J]. Power Grid Technology, 2017, 41(09): 3033-3040.) performs load decomposition with a method based on the signal sparsity underdetermined solution, but places higher requirements on the computing performance of the hardware; the document (Venerun, san gang. Non-invasive load decomposition method based on a depth sequence translation model [J]. Power Grid Technology, 2020, 44(1): 27-34.) combines a sequence translation model to construct a mapping between the signal to be decomposed and the state code of the electrical appliance, but the model is complex and not very practical.

Although the above methods achieve good results, their models are complex and have numerous hyper-parameters, so an optimal combination is difficult to determine; if a non-intrusive monitoring terminal is introduced, hardware performance must also be considered comprehensively. In addition, these methods output the model decomposition result directly without further processing, lack a fine correction strategy, and leave the model accuracy to be improved. Therefore, a non-intrusive load decomposition method is needed whose algorithm is relatively simple, can be converted into a programming language, and has a high recognition speed.

Disclosure of Invention

The technical problem to be solved by the invention is as follows: the hyper-parameter optimization and post-processing method of the non-intrusive load decomposition model is provided to solve the technical problems in the prior art.

The technical scheme adopted by the invention is as follows: a hyper-parameter optimization and post-processing method of a non-intrusive load decomposition model comprises the following steps:

Step 1: collecting electrical operation data of the home bus and the target electrical appliance by using a voltage transformer and a current transformer, and establishing the data set of the model;

Step 2: constructing, based on deep learning theory, a non-intrusive load decomposition model based on a deep residual network for each target electrical appliance;

Step 3: optimizing the hyper-parameters of the load decomposition model by using cluster Bayesian optimization;

Step 4: after establishing an optimal decomposition model for each target electrical appliance, training the trainable parameters in the model by using a training set, a validation set and a test set until convergence;

Step 5: post-processing the decomposition result of the model, and correcting the prediction of the model with higher precision based on the rationality of the model prediction result.

Further, in step 1, electrical operation data of the home bus and the target electrical appliance are collected. Considering acquisition hardware limitations and cost requirements, high-frequency transient characteristics are not collected; only steady-state characteristics, including active power, reactive power and the effective (RMS) value of current, are considered. The target electrical appliance is the appliance whose operating state is to be decomposed from the bus data, such as a refrigerator, a kettle or an electric rice cooker. In the data set of the model, the electrical operation data at the bus and the separately measured electrical operation data of the target electrical appliance at the same moment serve as the sample and the label, respectively, and the data set is divided into a training set, a validation set and a test set in proportion.

Further, in step 2, constructing a load decomposition model for each target electrical appliance means that one target electrical appliance corresponds to one decomposition model, and the load decomposition models of different target electrical appliances do not affect each other. The non-intrusive load decomposition model based on the deep residual network is specifically described as follows:

In the non-intrusive load decomposition task, extracting the operating characteristics of the target appliance from the home bus is the key to achieving load decomposition. For this purpose, the power sequence at the home bus can be regarded as a two-dimensional image of the operating characteristics of the target electrical appliance, and a deep residual convolutional network is used to extract these characteristics to realize load decomposition. The basic unit of the deep residual network is the residual block. It differs from a plain convolutional layer in that it introduces a skip connection, which lets the deep residual network extract original characteristics directly from the input, so that problems such as vanishing and exploding gradients do not arise as the number of network layers increases. The residual block is divided into two parts: the identity mapping and the residual part. The identity mapping raises or reduces the dimension of the input and then connects it to the output through the skip connection; the residual part is composed of a convolutional layer, a batch normalization layer and an activation function. The output of the residual block is the sum of the two, and the mathematical expression is:

$$y_l = F(x_l, \Theta_l) + h(x_l)$$

$$x_{l+1} = f(y_l)$$

In the formula: $x_l$ is the input of the residual block; $x_{l+1}$ is the output of the residual block; $\Theta_l$ is the weight matrix of the convolutional layers; $h(\cdot)$ is the 1×1 convolution operation that raises or reduces the dimension of the input; $f(\cdot)$ is the activation function.
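The residual-block expression can be illustrated with a minimal pure-Python sketch. It is not the patented network: it assumes a single-channel sequence, an odd-length kernel, ReLU as the activation $f$, a scalar `shortcut_weight` standing in for the 1×1 convolution $h$, and it omits batch normalization. The function names are hypothetical.

```python
def conv1d_same(x, kernel):
    """1-D convolution with zero padding (odd kernel length assumed)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(x))]


def relu(values):
    """Element-wise ReLU, used here as the assumed activation f."""
    return [max(0.0, v) for v in values]


def residual_block(x, kernel, shortcut_weight=1.0):
    """Compute x_{l+1} = f(F(x_l, Theta_l) + h(x_l)) for one channel."""
    residual = conv1d_same(x, kernel)               # F(x_l, Theta_l)
    shortcut = [shortcut_weight * v for v in x]     # h(x_l), a 1x1 convolution
    y = [r + s for r, s in zip(residual, shortcut)] # y_l
    return relu(y)                                  # x_{l+1} = f(y_l)
```

With an all-zero kernel the residual part vanishes and the block reduces to the activation applied to the shortcut, which is the property that lets deeper stacks fall back to passing the input through.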

A non-intrusive load decomposition model based on the deep residual network can then be constructed from residual blocks. The total load power sequence of length $T$ is segmented with a preset sliding-window length $k$ to obtain the first segment sequence $P_1$. Each slide moves the window forward by one sampling point, forming $T-k+1$ power sequences $P = [P_1, P_2, \ldots, P_{T-k+1}]^T$. Each power sequence is used as the input of the model, and the output of the model is the power value of the target electrical appliance at the midpoint moment of that sequence. First, a first convolutional layer is constructed to preliminarily extract the original characteristics of the load; then several residual blocks are stacked to extract higher-level, more abstract characteristics; finally, several fully connected layers realize the nonlinear mapping from the total power sequence to the power value of the target electrical appliance at the midpoint moment of the sequence.
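The sliding-window segmentation described above can be sketched as follows. The function name and the choice of `i + k // 2` as the midpoint index (exact for odd $k$) are assumptions made for illustration.

```python
def sliding_windows(total_power, appliance_power, k):
    """Cut a length-T sequence into T-k+1 windows with midpoint labels."""
    if len(total_power) != len(appliance_power):
        raise ValueError("bus and appliance sequences must be aligned")
    T = len(total_power)
    if not 1 <= k <= T:
        raise ValueError("window length k must satisfy 1 <= k <= T")
    # Each slide moves the window forward by one sampling point.
    windows = [total_power[i:i + k] for i in range(T - k + 1)]
    # Label: appliance power at the midpoint moment of each window.
    labels = [appliance_power[i + k // 2] for i in range(T - k + 1)]
    return windows, labels
```

For a sequence of five samples and $k = 3$, this yields three windows, each paired with the appliance power at the window centre.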

The non-intrusive load decomposition model based on the residual network applies a linear transformation to the data using min-max standardization, mapping the data to the interval [0,1], which is defined as follows:

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

In the formula: $x$ is the original data value and $x^{*}$ is the standardized value; $x_{\max}$ is the maximum value in the data; $x_{\min}$ is the minimum value in the data.
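Min-max standardization onto [0,1] is commonly computed as $(x - x_{\min}) / (x_{\max} - x_{\min})$. The short sketch below assumes that form; returning zeros for constant data is an added assumption to avoid division by zero and is not taken from the source.

```python
def min_max_normalize(data):
    """Map data linearly onto [0, 1] using its own minimum and maximum."""
    x_min, x_max = min(data), max(data)
    if x_max == x_min:
        # Assumption: constant data carries no scale, so map it to zeros.
        return [0.0 for _ in data]
    return [(x - x_min) / (x_max - x_min) for x in data]
```

In practice the minimum and maximum would be taken from the training set and reused for the validation and test sets, so that all three are scaled identically.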

Further, in step 3, Bayesian optimization is used to optimize the hyper-parameters of the load decomposition model, which include the convolution kernel size, the number of convolution kernels, the number of neurons in the fully connected layers, the sliding window size, etc. Bayesian optimization is described in detail as follows:

Bayesian optimization makes use of the Gaussian process. For the load decomposition model $Y = f(x)$, the next search point is to be determined from the known hyper-parameter combinations. If $t$ sets of hyper-parameters $x_1, x_2, \ldots, x_t$ have been obtained, and the index values (such as error values) of the model under these $t$ sets form the vector $f(x_{1:t}) = [f(x_1), f(x_2), \ldots, f(x_t)]$, then the Gaussian process assumes that this vector obeys a $t$-dimensional Gaussian distribution $f(x_{1:t}) \sim N(\mu(x_{1:t}), \Sigma(x_{1:t}, x_{1:t}))$, where the mean vector is $\mu(x_{1:t}) = [\mu(x_1), \mu(x_2), \ldots, \mu(x_t)]$ and the covariance matrix is expressed as:

$$\Sigma(x_{1:t}, x_{1:t}) = \begin{bmatrix} k(x_1, x_1) & \cdots & k(x_1, x_t) \\ \vdots & \ddots & \vdots \\ k(x_t, x_1) & \cdots & k(x_t, x_t) \end{bmatrix}$$

The covariance is calculated by a Gaussian kernel function $k(\cdot,\cdot)$, in which $\alpha_0$ is a parameter of the kernel function.

After $t$ sets of candidate solutions are obtained, the corresponding Gaussian regression model is established, the posterior probability of the model index value at any point is obtained, and an acquisition function is constructed from the posterior probability to determine the next hyper-parameter combination to be searched. The acquisition function is expressed as:

$$EI_t(x) = E\left[\max\left(0,\ f(x) - \max\big(f(x_1), f(x_2), \ldots, f(x_t)\big)\right)\right]$$

$$x_{t+1} = \arg\max_x EI_t(x)$$

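For the next point to be searched $x_{t+1}$ and its value $f(x_{t+1})$, assume that after the new sample point is added, $f(x_{1:t+1})$ obeys a $(t+1)$-dimensional normal distribution. Partitioning the mean vector and the covariance matrix, this can be written as:

$$\begin{bmatrix} f(x_{1:t}) \\ f(x_{t+1}) \end{bmatrix} \sim N\left( \begin{bmatrix} \mu(x_{1:t}) \\ \mu(x_{t+1}) \end{bmatrix},\ \begin{bmatrix} \Sigma(x_{1:t}, x_{1:t}) & \kappa^T \\ \kappa & k(x_{t+1}, x_{t+1}) \end{bmatrix} \right)$$

In the formula: $\kappa = [k(x_{t+1}, x_1), k(x_{t+1}, x_2), \ldots, k(x_{t+1}, x_t)]$ is a row vector.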

Thus, the search process for the model hyper-parameters can be described as obtaining the conditional distribution of $f(x_{t+1})$ given the known $f(x_{1:t})$. According to the properties of the multidimensional normal distribution, this conditional distribution obeys a one-dimensional normal distribution $f(x_{t+1}) \mid f(x_{1:t}) \sim N(\mu, \sigma^2)$, calculated as follows:

$$\mu = \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\big(f(x_{1:t}) - \mu(x_{1:t})\big) + \mu(x_{t+1})$$

$$\sigma^2 = k(x_{t+1}, x_{t+1}) - \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\kappa^T$$

In the formula: $\mu$ depends on $f(x_{1:t})$, whereas $\sigma^2$ depends only on the covariance values calculated by the Gaussian kernel function and is independent of $f(x_{1:t})$.

In summary, the algorithm flow of the Bayesian optimization hyperparameter is as follows:

S3.1, selecting $t$ groups of hyper-parameters and calculating the model index value of each;

S3.2, according to the current sampling data $x_{1:t}$ and $f(x_{1:t})$, updating $\mu$ and $\sigma^2$ of $f(x_{t+1}) \mid f(x_{1:t})$;

S3.3, determining the next sampling point $x_{t+1}$ according to the maximum value of the acquisition function;

S3.4, calculating the model index value $f(x_{t+1})$ at the next sampling point;

S3.5, returning to step S3.1 for recalculation, and ending the algorithm after the specified number of iterations is reached or the model performance reaches the expected effect.
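The flow S3.1 to S3.5 can be sketched in pure Python under several stated assumptions: a single scalar hyper-parameter, a finite candidate grid, a zero prior mean, and a Gaussian kernel of the assumed form $\alpha_0 \exp(-(a-b)^2 / (2\ell^2))$ (the source names only the parameter $\alpha_0$). The index value is treated as a score to be maximized, matching the expected-improvement expression; an error metric would be negated first. All function names are hypothetical.

```python
import math


def kernel(a, b, alpha0=1.0, length=1.0):
    """Assumed Gaussian kernel; alpha0 scales the covariance."""
    return alpha0 * math.exp(-((a - b) ** 2) / (2.0 * length ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def posterior(xs, fs, x_new, jitter=1e-6):
    """S3.2: mean and variance of f(x_new) given f(x_1:t), zero prior mean."""
    t = len(xs)
    cov = [[kernel(xs[i], xs[j]) + (jitter if i == j else 0.0)
            for j in range(t)] for i in range(t)]
    kappa = [kernel(x_new, xi) for xi in xs]
    mu = sum(k * w for k, w in zip(kappa, solve(cov, fs)))
    var = kernel(x_new, x_new) - sum(
        k * w for k, w in zip(kappa, solve(cov, kappa)))
    return mu, max(var, 0.0)


def expected_improvement(mu, var, best):
    """Closed form of E[max(0, f - best)] for f ~ N(mu, var)."""
    sigma = math.sqrt(var)
    if sigma < 1e-12:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf


def bayes_opt(score, candidates, initial, iterations):
    xs = list(initial)                        # S3.1: initial hyper-parameters
    fs = [score(x) for x in xs]
    for _ in range(iterations):               # S3.5: repeat until budget is used
        pool = [c for c in candidates if c not in xs]
        if not pool:
            break
        best = max(fs)

        def ei(c):
            mu, var = posterior(xs, fs, c)    # S3.2: update mu and sigma^2
            return expected_improvement(mu, var, best)

        x_next = max(pool, key=ei)            # S3.3: maximize the acquisition
        xs.append(x_next)
        fs.append(score(x_next))              # S3.4: evaluate the new point
    i_best = max(range(len(xs)), key=lambda i: fs[i])
    return xs[i_best], fs[i_best]
```

In a real setting `score` would train the decomposition model with the given hyper-parameter and return, for example, the negative validation error, so each call is expensive and the surrogate is what keeps the number of calls small.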

Cluster Bayesian optimization introduces the idea of cluster optimization on the basis of Bayesian optimization, and solves the problem intelligently through the search behavior of the cluster and the information interaction within the cluster. The specific steps of the cluster Bayesian optimization method provided by the invention are as follows:

S3.1, initializing the maximum number of searches and the total number of searching individuals, and assigning each individual an initial hyper-parameter combination within a reasonable range;

S3.2, constructing a model for the hyper-parameter combination of each individual, training the models on the same data set, and obtaining the model index value under the hyper-parameter combination of each individual;

S3.3, comparing the hyper-parameter combination of each individual with the hyper-parameter combination of the individual with the best model index value in the cluster, and updating according to the following formula:

$$x_{id} = x_{id} + \mathrm{rand}(0,1)\,(p_{id} - x_{id}) + \mathrm{rand}(0,1)\,(p_{gd} - x_{id})$$

In the formula: $x_{id}$ is the value of the $d$-th dimension of the $i$-th individual; $p_{id}$ is the optimal value of the $d$-th dimension of the $i$-th individual; $p_{gd}$ is the optimal value of the $d$-th dimension in the cluster; $\mathrm{rand}(0,1)$ is a random number from 0 to 1.

S3.4, if the end condition is not met, returning to step S3.2; otherwise, ending the algorithm, and taking the hyper-parameter combination of the model with the best index value in the cluster as the optimal solution found.
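The cluster update rule can be sketched as below. This covers only the position update and the bookkeeping of individual and cluster bests; how the Gaussian-process step is combined with the cluster search is not detailed in the text and is not modelled here. Treating a larger score as better, clamping to the bounds, the fixed seed and the function name are all assumptions.

```python
import random


def cluster_search(score, lower, upper, n_individuals=8, n_iterations=20, seed=0):
    rng = random.Random(seed)
    dims = range(len(lower))
    # S3.1: give every individual an initial combination inside the bounds.
    xs = [[rng.uniform(lower[d], upper[d]) for d in dims]
          for _ in range(n_individuals)]
    p_best = [x[:] for x in xs]                 # best position of each individual
    p_score = [score(x) for x in xs]            # S3.2: index value per individual
    for _ in range(n_iterations):
        g = max(range(n_individuals), key=lambda i: p_score[i])
        g_best = p_best[g][:]                   # best position in the cluster
        for i in range(n_individuals):
            for d in dims:                      # S3.3: update every dimension
                step = (rng.random() * (p_best[i][d] - xs[i][d])
                        + rng.random() * (g_best[d] - xs[i][d]))
                # Assumption: clamp to keep values in the allowed range.
                xs[i][d] = min(max(xs[i][d] + step, lower[d]), upper[d])
            s = score(xs[i])                    # S3.2: re-evaluate the individual
            if s > p_score[i]:
                p_best[i], p_score[i] = xs[i][:], s
    g = max(range(n_individuals), key=lambda i: p_score[i])
    return p_best[g], p_score[g]                # S3.4: best combination found
```

Integer hyper-parameters such as the kernel size or the number of kernels would additionally need rounding before a model is built from them.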

Further, in step 4, training the trainable parameters in the model by using the training set, the validation set and the test set until convergence means updating the trainable parameters with an Adam optimizer and the gradient back-propagation algorithm; meanwhile, an early stopping mechanism is added, that is, training is forcibly ended once the mean square error on the validation set has stopped decreasing for a certain number of iterations.
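The early stopping rule can be sketched independently of any training framework. In this illustration the list of per-epoch validation errors stands in for a real training loop, and `patience` is a hypothetical name for the number of non-improving iterations tolerated.

```python
def early_stopping_epoch(val_mse_per_epoch, patience):
    """Return (stop_epoch, best_epoch) for a recorded validation-MSE curve."""
    best = float("inf")
    best_epoch = -1
    stale = 0                       # epochs since the last improvement
    for epoch, mse in enumerate(val_mse_per_epoch):
        if mse < best:
            best, best_epoch, stale = mse, epoch, 0
        else:
            stale += 1
            if stale >= patience:   # validation MSE stopped decreasing
                return epoch, best_epoch
    return len(val_mse_per_epoch) - 1, best_epoch
```

The parameters saved at `best_epoch` are the ones usually kept, since later epochs did not improve the validation error.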

Further, in step 5, the main idea of the post-processing method is as follows:

The information in the model training data is fully utilized: the operation rules and trends of the target electrical appliance are mined and recorded in a template feature library; on this basis, reasonable activations are corrected with higher precision and unreasonable activations are eliminated, so that the decomposition effect of the model is comprehensively improved. The method comprises the following steps:

S5.1, recording, by a threshold method, the shortest activation time of the target electrical appliance in the training data;

S5.2, recording, by a threshold method, the duration of each activation in the power decomposition value of the target electrical appliance;

S5.3, eliminating from the power decomposition value of the target electrical appliance any activation whose duration is less than the shortest activation time;

S5.4, examining the total load power segment corresponding to each remaining activation: if a corresponding power rise and power drop exist in the total load power in that segment, the activation is considered reasonable; otherwise, the activation is considered unreasonable and is rejected.
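Steps S5.1 to S5.4 can be sketched as below. The text does not specify how a power rise or drop is detected, so this illustration assumes a simple test: the total power must jump by at least `edge` at the first sample of the activation and fall by at least `edge` right after its last sample. Activations touching the ends of the series are rejected under this assumption. Function and parameter names are hypothetical.

```python
def find_activations(power, threshold):
    """Threshold method: return (start, end) runs, end exclusive, above threshold."""
    runs, start = [], None
    for i, p in enumerate(power):
        if p > threshold and start is None:
            start = i
        elif p <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(power)))
    return runs


def post_process(predicted, total, train_appliance, threshold, edge):
    train_runs = find_activations(train_appliance, threshold)
    if not train_runs:
        raise ValueError("no activation found in the training data")
    min_len = min(e - s for s, e in train_runs)            # S5.1
    cleaned = list(predicted)
    for s, e in find_activations(predicted, threshold):    # S5.2
        too_short = (e - s) < min_len                      # S5.3
        # S5.4 (assumed test): matching rise and drop in the total power.
        rise = s > 0 and total[s] - total[s - 1] >= edge
        drop = e < len(total) and total[e - 1] - total[e] >= edge
        if too_short or not (rise and drop):
            for i in range(s, e):
                cleaned[i] = 0.0                           # reject the activation
    return cleaned
```

A usage example: with a minimum training activation of three samples, a one-sample spike in the prediction is removed, and a three-sample activation is kept only if the bus power rises and falls around it.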

The invention has the following beneficial effects. Compared with the prior art, the invention is the first to provide a hyper-parameter optimization and post-processing method of a non-intrusive load decomposition model that introduces the concept of cluster optimization into the optimization of the hyper-parameters of the load decomposition model. First, the total power sequence extracted from the home bus is treated as a two-dimensional image, and a deep residual network is constructed to extract the operating characteristics of the target electrical appliance from it so as to realize load decomposition. Second, cluster Bayesian optimization is introduced into the optimization of the hyper-parameters of the load decomposition model, which overcomes the problems of poor effect and low efficiency caused by blind selection of hyper-parameters in traditional methods; the efficiency of the hyper-parameter optimization is achieved through the search behavior of the cluster and the information interaction within the cluster. Finally, a post-processing method further corrects the decomposition result of the model and eliminates irrelevant and wrong decomposition results, thereby comprehensively improving the decomposition performance of the model. In conclusion, the hyper-parameter optimization and predicted-value post-processing method for the non-intrusive load decomposition model can provide technical and theoretical support for non-intrusive load recognition devices, and has certain practical application value.

Drawings

Fig. 1 is a schematic diagram of a residual block structure in an embodiment of the present invention.

Fig. 2 is a schematic diagram of a ResNet-based load decomposition model in an embodiment of the present invention.

Fig. 3 is a schematic diagram of an optimization result of the bayesian optimization of clusters in the embodiment of the present invention.

Fig. 4 is a schematic view of the overall flow of load decomposition in the embodiment of the present invention.

Detailed Description

The invention is further described with reference to the accompanying drawings and specific embodiments.

Example 1: as shown in fig. 1 to 4, in order to solve the problem that most of non-invasive load decomposition algorithms are complex, low in recognition speed and low in operation efficiency, and thus cannot be applied to practical engineering, the invention provides a hyper-parameter optimization and post-processing method of a non-invasive load decomposition model, which is used for realizing non-invasive load decomposition and can achieve an algorithm effect of fast, accurate and efficient recognition.

A method for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, as shown in FIG. 4, comprises the following steps:

Electrical operation data of the home bus and the target electrical appliance are collected. Considering acquisition hardware limitations and cost requirements, high-frequency transient characteristics are not collected; only steady-state characteristics, including active power, reactive power and the effective (RMS) value of current, are considered. The target electrical appliance is the appliance whose operating state is to be decomposed from the bus data, such as a refrigerator, a kettle or an electric rice cooker. In the data set of the model, the electrical operation data at the bus and the separately measured electrical operation data of the target electrical appliance at the same moment serve as the sample and the label, respectively, and the data set is divided into a training set, a validation set and a test set in the proportion 3:1:1.
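A 3:1:1 division of aligned samples and labels can be sketched as follows. The embodiment does not state whether the data are shuffled; this illustration assumes a chronological split, which avoids leaking future samples into training for time-series data. The function name is hypothetical.

```python
def split_dataset(samples, labels, ratio=(3, 1, 1)):
    """Split aligned (sample, label) pairs into train/validation/test parts."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    n = len(samples)
    total = sum(ratio)
    n_train = n * ratio[0] // total
    n_val = n * ratio[1] // total
    pairs = list(zip(samples, labels))
    # Assumption: keep chronological order, no shuffling.
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])
```

Any remainder from the integer division falls into the test part, so no sample is dropped.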

Constructing a load decomposition model for each target electrical appliance means that one target electrical appliance corresponds to one decomposition model, and the load decomposition models of different target electrical appliances do not affect each other. The non-intrusive load decomposition model based on the deep residual network is specifically described as follows:

In the non-intrusive load decomposition task, extracting the operating characteristics of the target appliance from the home bus is the key to achieving load decomposition. For this purpose, the power sequence at the home bus can be regarded as a two-dimensional image of the operating characteristics of the target electrical appliance, and a deep residual convolutional network is used to extract these characteristics to realize load decomposition. The basic unit of the deep residual network is the residual block. It differs from a plain convolutional layer in that it introduces a skip connection, which lets the deep residual network extract original characteristics directly from the input, so that problems such as vanishing and exploding gradients do not arise as the number of network layers increases. The residual block is divided into two parts, the identity mapping and the residual part, as shown in fig. 1. The identity mapping raises or reduces the dimension of the input and then connects it to the output through the skip connection; the residual part is composed of a convolutional layer, a batch normalization layer and an activation function. The output of the residual block is the sum of the two, and the mathematical expression is:

$$y_l = F(x_l, \Theta_l) + h(x_l)$$

$$x_{l+1} = f(y_l)$$

A non-intrusive load decomposition model based on the deep residual network can then be constructed from residual blocks. The total load power sequence is segmented with a preset sliding-window length $k$.

Each slide moves the window forward by one sampling point, forming $T-k+1$ power sequences $P = [P_1, P_2, \ldots, P_{T-k+1}]^T$. Each power sequence is used as the input of the model, and the output of the model is the power value of the target electrical appliance at the midpoint moment of that sequence. First, a first convolutional layer is constructed to preliminarily extract the original characteristics of the load; then four residual blocks are stacked to extract higher-level, more abstract characteristics; finally, two fully connected layers realize the nonlinear mapping from the total power sequence to the power value of the target electrical appliance at the midpoint moment of the sequence, as shown in fig. 2.

The data are mapped to the interval [0,1] by min-max standardization:

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

In the formula: $x$ is the original data value and $x^{*}$ is the standardized value; $x_{\max}$ is the maximum value in the data; $x_{\min}$ is the minimum value in the data.

Bayesian optimization is used to optimize the hyper-parameters of the load decomposition model, which include the convolution kernel size, the number of convolution kernels, the number of neurons in the fully connected layers, and the sliding window size. Bayesian optimization is described in detail as follows:

The covariance is calculated by a Gaussian kernel function, in which $\alpha_0$ is a parameter of the kernel function.

$$EI_t(x) = E\left[\max\left(0,\ f(x) - \max\big(f(x_1), f(x_2), \ldots, f(x_t)\big)\right)\right]$$

$$x_{t+1} = \arg\max_x EI_t(x)$$

In the formula: $\kappa = [k(x_{t+1}, x_1), k(x_{t+1}, x_2), \ldots, k(x_{t+1}, x_t)]$ is a row vector.

$$\mu = \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\big(f(x_{1:t}) - \mu(x_{1:t})\big) + \mu(x_{t+1})$$

$$\sigma^2 = k(x_{t+1}, x_{t+1}) - \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\kappa^T$$

S3.4, calculating the model index value $f(x_{t+1})$ at the next sampling point;

S3.5, returning to step S3.1 for recalculation, and ending the algorithm after the specified number of iterations is reached or the model performance reaches the expected effect.

Cluster Bayesian optimization introduces the idea of cluster optimization on the basis of Bayesian optimization, and solves the problem intelligently through the search behavior of the cluster and the information interaction within the cluster. The specific steps of the cluster Bayesian optimization method provided by the invention are as follows:

$$x_{id} = x_{id} + \mathrm{rand}(0,1)\,(p_{id} - x_{id}) + \mathrm{rand}(0,1)\,(p_{gd} - x_{id})$$

S3.4, if the end condition is not met, returning to step S3.2; otherwise, ending the algorithm, and taking the hyper-parameter combination of the model with the best index value in the cluster as the optimal solution found.

The optimization results of the cluster bayes optimization are shown in fig. 3.

Training the trainable parameters in the model by using the training set, the validation set and the test set until convergence means updating the trainable parameters with an Adam optimizer and the gradient back-propagation algorithm; meanwhile, an early stopping mechanism is added, that is, training is forcibly ended once the mean square error on the validation set has stopped decreasing for a certain number of iterations.

The main idea of the post-processing method is as follows:

The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present invention, and therefore, the scope of the present invention should be determined by the scope of the claims.

Claims

1. A hyper-parameter optimization and post-processing method of a non-intrusive load decomposition model, characterized in that the method comprises the following steps:

Step 3: optimizing the hyper-parameters of the non-intrusive load decomposition model by using cluster Bayesian optimization;

Step 5: post-processing the decomposition result of the model, and correcting the prediction of the model based on the model prediction result.

2. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 1, the collected electrical operation data of the home bus and the target electrical appliance comprise active power, reactive power and the effective value of current; the target electrical appliance is the appliance whose operating state is to be decomposed from the bus data; in the data set of the model, the electrical operation data at the bus and the separately measured electrical operation data of the target electrical appliance at the same moment serve as the sample and the label, respectively, and the data set is divided into a training set, a validation set and a test set in proportion.

3. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in the step 2, the non-invasive load decomposition model of the target electrical appliance is a decomposition model corresponding to the target electrical appliance, and the non-invasive load decomposition models of different target electrical appliances are not influenced by each other.

4. The method of claim 3, wherein the method of constructing the non-intrusive load decomposition model based on the deep residual network comprises:

in the non-intrusive load decomposition task, the power sequence at the home bus is regarded as a two-dimensional image of the operating characteristics of the target electrical appliance, and a deep residual network is used to extract the operating characteristics of the target electrical appliance; the basic unit of the deep residual network is the residual block, which lets the deep residual network extract original characteristics directly from the input; the residual block consists of a convolutional layer, a batch normalization layer and an activation function, and the output of the residual block is expressed as:

$$y_l = F(x_l, \Theta_l) + h(x_l)$$

$$x_{l+1} = f(y_l)$$

a non-intrusive load decomposition model based on the deep residual network is constructed from residual blocks, and the total load power sequence is segmented with a sliding window;

each slide moves the window forward by one sampling point to form $T-k+1$ power sequences $P = [P_1, P_2, \ldots, P_{T-k+1}]^T$; first, a first convolutional layer is constructed to preliminarily extract the original characteristics of the load; then several residual blocks are stacked to extract higher-level, more abstract characteristics; finally, several fully connected layers realize the nonlinear mapping from the total power sequence to the power value of the target electrical appliance at the midpoint moment of the sequence;

5. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 3, the hyper-parameters include: the convolution kernel size, the number of convolution kernels, the number of neurons in the fully connected layers, and the sliding window size.

6. The method of claim 1 or 5, wherein: in step 3, the cluster Bayesian optimization method introduces the idea of cluster optimization on the basis of Bayesian optimization, and solves the problem intelligently through the search behavior of the cluster and the information interaction within the cluster; the specific steps are as follows:

$$x_{id} = x_{id} + \mathrm{rand}(0,1)\,(p_{id} - x_{id}) + \mathrm{rand}(0,1)\,(p_{gd} - x_{id})$$

in the formula: $x_{id}$ is the value of the $d$-th dimension of the $i$-th individual; $p_{id}$ is the optimal value of the $d$-th dimension of the $i$-th individual; $p_{gd}$ is the optimal value of the $d$-th dimension in the cluster; $\mathrm{rand}(0,1)$ is a random number from 0 to 1;

7. The method of claim 6, wherein the algorithm flow of optimizing the hyper-parameters by Bayesian optimization is as follows:

S3.4, calculating the model index value $f(x_{t+1})$ at the next sampling point;

8. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 4, training the trainable parameters in the model by using the training set, the validation set and the test set until convergence means updating the trainable parameters with an Adam optimizer and the gradient back-propagation algorithm; meanwhile, an early stopping mechanism is added, that is, training is forcibly ended once the mean square error on the validation set has stopped decreasing for a certain number of iterations.

9. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 5, the post-processing method comprises the following steps: and fully mining the operation rule and trend of the target electrical appliance by using the information of the model training data, recording the operation rule and trend in a template feature library, correcting reasonable activation with higher precision on the basis of the operation rule and trend, and removing unreasonable activation.

10. The method of claim 1 or 9 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: the post-treatment method comprises the following specific steps: