CN113065704A - Hyper-parameter optimization and post-processing method of non-invasive load decomposition model - Google Patents

Hyper-parameter optimization and post-processing method of non-invasive load decomposition model

Info

Publication number
CN113065704A
Authority
CN
China
Prior art keywords
model
hyper
electrical appliance
target electrical
load decomposition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110351229.XA
Other languages
Chinese (zh)
Inventor
谈竹奎
刘斌
张秋雁
唐赛秋
徐长宝
林呈辉
王冕
高吉普
欧家祥
胡厚鹏
王宇
古庭赟
汪明媚
顾威
孟令雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd
Priority to CN202110351229.XA
Publication of CN113065704A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention discloses a hyper-parameter optimization and post-processing method for a non-intrusive load decomposition model, comprising the following steps: S1, collecting electrical operation data of the household bus and of the target electrical appliance, and establishing the data set of the model; S2, constructing, for each target electrical appliance, a non-intrusive load decomposition model based on a deep residual network, following deep learning theory; S3, optimizing the hyper-parameters of the load decomposition model by cluster Bayesian optimization; and S4, after the optimal decomposition model has been established for each target electrical appliance, training the trainable parameters of the model with the training set, validation set and test set until convergence. The invention achieves load decomposition and introduces Bayesian optimization into the hyper-parameter optimization of the load decomposition model, which addresses the poor results and low efficiency caused by blind hyper-parameter selection in traditional methods; the search behavior of the group and the information exchange within the group make the hyper-parameter optimization efficient.

Description

Hyper-parameter optimization and post-processing method of non-invasive load decomposition model
Technical Field
The invention relates to the field of non-invasive load decomposition, in particular to a hyper-parameter optimization and post-processing method of a non-invasive load decomposition model.
Background
One of the urgent needs of the smart grid and the energy internet is to acquire the power consumption data of individual electrical appliances, so that users can understand how each appliance consumes power and reduce energy consumption accordingly. This is an important step towards a transparent and intelligent grid. Current metering techniques can only read total power consumption automatically, and it is difficult to obtain further information about the loads inside a user's premises. Load decomposition technology has therefore become a major bottleneck in the development of smart grids.
Non-intrusive load disaggregation (NILD), first proposed by Professor Hart in the 1980s, is a technique that estimates the power consumption of each of a user's appliances from the total power demand measured at the user's bus. Compared with intrusive load disaggregation (ILD), it is easier to install, remove and maintain and needs simpler hardware, and therefore has broader development prospects.
At present, scholars at home and abroad have studied NILD and obtained certain results. The document (Liruyi, Huangming mountain, Zhou Dong nations, flood, Huwenshan. A non-invasive power load decomposition method based on particle swarm algorithm search [J]. Power System Protection and Control, 2016, 44(08): 30-36.) uses particle swarm optimization to search for the best match between the harmonic current and power of each appliance and the total harmonic current and total power, thereby realizing load decomposition; the document (Kolter J Z, Jaakkola T. Approximate inference in additive factorial HMMs with application to energy disaggregation [C]. La Palma, Spain: Microtome Publishing, 2012.) constructs a hidden Markov model for load decomposition, but the performance of the algorithm degrades as the number of appliances increases; the document (WuXin, Hanxiao. A resident user non-intrusive load decomposition algorithm based on the signal sparsity underdetermined solution [J]. Power Grid Technology, 2017, 41(09): 3033-3040.) performs load decomposition by solving an underdetermined system based on signal sparsity, but places high demands on the computing performance of the hardware; the document (Venerun, san gang. Non-invasive load decomposition method based on a depth sequence translation model [J]. Power Grid Technology, 2020, 44(1): 27-34.) uses a sequence translation model to build a mapping between the signal to be decomposed and the state codes of the appliances, but the model is complex and not very practical.
Although these methods achieve good results, their models are complex and have many hyper-parameters, so an optimal set is hard to determine; if they are to run on a non-intrusive monitoring terminal, hardware performance must also be taken into account. In addition, these methods output the model decomposition result directly, without post-processing or a fine correction strategy, so model accuracy leaves room for improvement. A non-intrusive load decomposition method is therefore needed whose algorithm is relatively simple, can be readily implemented in a programming language, and recognizes loads quickly.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: a hyper-parameter optimization and post-processing method for a non-intrusive load decomposition model is provided to solve the technical problems in the prior art.
The technical scheme adopted by the invention is as follows: a hyper-parameter optimization and post-processing method for a non-intrusive load decomposition model comprises the following steps:
Step 1: collecting electrical operation data of the home bus and of the target electrical appliance by using a voltage transformer and a current transformer, and establishing the data set of the model;
Step 2: constructing, for each target electrical appliance, a non-intrusive load decomposition model based on a deep residual network, following deep learning theory;
Step 3: optimizing the hyper-parameters of the load decomposition model by using cluster Bayesian optimization;
Step 4: after establishing the optimal decomposition model for each target electrical appliance, training the trainable parameters of the model by using the training set, validation set and test set until convergence;
Step 5: post-processing the decomposition result of the model, correcting the model's predictions to higher precision based on whether each prediction is reasonable.
Further, in step 1, when collecting the electrical operation data of the home bus and the target electrical appliance, the limitations and cost of the acquisition hardware are taken into account: high-frequency transient features are not collected, and only steady-state features are used, including active power, reactive power and the effective (RMS) value of the current. The target electrical appliance is the appliance whose operating state is to be decomposed from the bus data, such as a refrigerator, an electric kettle or an electric rice cooker. In the data set of the model, the electrical operation data of the bus and the separately measured electrical operation data of the target electrical appliance at the same moment serve as the sample and the label respectively, and the data set is divided into a training set, a validation set and a test set in a given proportion.
Further, in step 2, constructing load decomposition models for the target electrical appliances respectively means that each target electrical appliance has its own decomposition model, and the load decomposition models of different target electrical appliances do not affect one another. The non-intrusive load decomposition model based on the deep residual network is described as follows:
In the non-intrusive load decomposition task, extracting the operating characteristics of the target appliance from the home bus is the key to achieving load decomposition. To this end, the power sequence at the home bus can be regarded as a two-dimensional image containing the operating characteristics of the target electrical appliance, and a deep residual convolutional network is used to extract these characteristics and realize load decomposition. The basic unit of the deep residual network is the residual block. It differs from a plain convolutional layer in that it introduces a skip connection, which lets the network extract original features directly from the input and avoids problems such as vanishing or exploding gradients as the number of layers grows. The residual block has two parts: the identity mapping and the residual part. The identity mapping raises or reduces the dimension of the input and then connects it to the output through the skip connection; the residual part is composed of a convolutional layer, a batch normalization layer and an activation function. The output of the residual block is the sum of the two parts, expressed mathematically as:
$y_l = F(x_l, \theta_l) + h(x_l)$
$x_{l+1} = f(y_l)$
where: $x_l$ is the input of the residual block; $x_{l+1}$ is the output of the residual block; $\theta_l$ is the weight matrix of the convolutional layers; $h(\cdot)$ is the 1×1 convolution that raises or reduces the dimension of the input; $f(\cdot)$ is the activation function.
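As a rough illustration of the two expressions for the residual block, the following NumPy sketch computes one forward pass on a 1-D signal. It is a simplified sketch under stated assumptions, not the patented network: batch normalization is omitted, the activation f is assumed to be ReLU, and the names `conv1d_same`, `residual_block`, `theta` and `w_skip` are hypothetical.

```python
import numpy as np

def conv1d_same(x, w):
    """Zero-padded 1-D cross-correlation (the 'convolution' of deep learning).

    x: array of shape (channels_in, length)
    w: array of shape (channels_out, channels_in, kernel), kernel assumed odd
    Returns an array of shape (channels_out, length).
    """
    c_out, c_in, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    out = np.zeros((c_out, x.shape[1]))
    for o in range(c_out):
        for i in range(c_in):
            for t in range(x.shape[1]):
                out[o, t] += np.dot(xp[i, t:t + k], w[o, i])
    return out

def residual_block(x, theta, w_skip):
    """x_{l+1} = f(F(x_l, theta_l) + h(x_l)), with f assumed to be ReLU."""
    residual = conv1d_same(x, theta)    # F(x_l, theta_l); batch norm omitted here
    shortcut = conv1d_same(x, w_skip)   # h(x_l): 1x1 convolution changing the channel count
    y = residual + shortcut             # y_l
    return np.maximum(y, 0.0)           # f(y_l)
```

With `theta` set to zeros, the block reduces to the activated shortcut, which makes the role of the skip connection easy to check by hand.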
A non-intrusive load decomposition model based on the deep residual network can then be built from residual blocks. The total load power sequence
[Formula image BDA0003002210390000041: total load power sequence of length T]
is segmented with a preset sliding window of length k, giving the first segment sequence
[Formula image BDA0003002210390000042: first segment sequence of length k]
Each slide moves the window forward by one sample point, forming T-k+1 power sequences $P = [P_1, P_2, \dots, P_{T-k+1}]^T$. Each power sequence is used as an input to the model, and the output of the model is the power value of the target electrical appliance at the midpoint time of that sequence. The model first uses a convolutional layer to extract the original features of the load, then stacks several residual blocks to extract higher-level, more abstract features, and finally uses several fully connected layers to implement the nonlinear mapping from the total power sequence to the power value of the target electrical appliance at the midpoint time of the sequence.
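The sliding-window segmentation can be sketched as follows. This is an illustrative helper, not code from the patent: the name `make_windows` is hypothetical, and the midpoint of a window is assumed to be index `k // 2`, which is exact only for odd k.

```python
import numpy as np

def make_windows(total_power, target_power, k):
    """Cut a length-T total power series into T-k+1 windows of length k.

    Each window is paired with the target appliance power at the window midpoint.
    """
    total_power = np.asarray(total_power, dtype=float)
    target_power = np.asarray(target_power, dtype=float)
    n_windows = len(total_power) - k + 1
    # The window moves forward one sample point at a time.
    windows = np.stack([total_power[i:i + k] for i in range(n_windows)])
    mid = k // 2  # assumed midpoint convention
    labels = target_power[mid:mid + n_windows]
    return windows, labels
```

For a series of T = 10 samples and k = 5 this yields 6 windows, and the first label is the target power at sample index 2.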
The non-intrusive load decomposition model based on the residual network applies max-min normalization, a linear transformation that maps the data to the interval [0,1], defined as follows:
$x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
where: $x$ is the raw value and $x'$ the normalized value; $x_{\max}$ is the maximum value in the data; $x_{\min}$ is the minimum value in the data.
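Max-min normalization in its usual form maps a value x to (x - x_min) / (x_max - x_min). A minimal sketch follows; the function name `min_max_scale` and the option to pass in statistics taken from the training set are assumptions added for illustration.

```python
import numpy as np

def min_max_scale(x, x_min=None, x_max=None):
    """Map data linearly to [0, 1] using (x - x_min) / (x_max - x_min)."""
    x = np.asarray(x, dtype=float)
    x_min = x.min() if x_min is None else x_min
    x_max = x.max() if x_max is None else x_max
    if x_max == x_min:
        # Constant data: avoid division by zero (a choice made for this sketch).
        return np.zeros_like(x)
    return (x - x_min) / (x_max - x_min)
```

In practice one would typically compute `x_min` and `x_max` on the training set only and reuse them for the validation and test sets, which is why they are exposed as arguments here.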
Further, in step 3, Bayesian optimization optimizes the hyper-parameters of the load decomposition model, which include the convolution kernel size, the number of convolution kernels, the number of neurons in the fully connected layers, the sliding window size, and so on. Bayesian optimization is described in detail as follows:
Bayesian optimization uses a Gaussian process. For the load decomposition model $Y = f(x)$, the next search point is to be determined from the hyper-parameter combinations already evaluated. Suppose t sets of hyper-parameters $x_1, x_2, \dots, x_t$ have been obtained, together with the vector $f(x_{1:t}) = [f(x_1), f(x_2), \dots, f(x_t)]$ formed by the model index values (for example error values) corresponding to these t sets. The Gaussian process assumes that this vector follows a t-dimensional Gaussian distribution $f(x_{1:t}) \sim N(\mu(x_{1:t}), \Sigma(x_{1:t}, x_{1:t}))$, where the mean vector is $\mu(x_{1:t}) = [\mu(x_1), \mu(x_2), \dots, \mu(x_t)]$ and the covariance matrix is:
$\Sigma(x_{1:t}, x_{1:t}) = \begin{bmatrix} k(x_1, x_1) & \cdots & k(x_1, x_t) \\ \vdots & \ddots & \vdots \\ k(x_t, x_1) & \cdots & k(x_t, x_t) \end{bmatrix}$
The covariance is calculated by a Gaussian kernel function:
[Formula image BDA0003002210390000045: Gaussian kernel function $k(\cdot,\cdot)$]
where: $\alpha_0$ is a parameter of the kernel function.
After t sets of candidate solutions have been obtained, the corresponding Gaussian process regression model is established and the posterior probability of the model index value at any point is obtained; an acquisition function is then built from this posterior to determine the next hyper-parameter combination to be searched. The acquisition function is:
$EI_t(x) = E[\max(0, f(x) - \max(f(x_1), f(x_2), \dots, f(x_t)))]$
$x_{t+1} = \arg\max_x EI_t(x)$
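When the posterior of f(x) at a candidate point is Gaussian with mean mu and standard deviation sigma, the expected improvement over the best value seen so far has a well-known closed form, (mu - f_best)·Φ(z) + sigma·φ(z) with z = (mu - f_best) / sigma. The sketch below evaluates it with the standard library only. It follows the maximization form written above; if the index value is an error to be minimized, one would apply it to the negated value. The function name is hypothetical.

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Closed-form E[max(0, f - f_best)] for f ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No uncertainty left: the improvement is deterministic.
        return max(mu - f_best, 0.0)
    z = (mu - f_best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    return (mu - f_best) * cdf + sigma * pdf
```

The next search point is then the candidate with the largest value of this function, for example over a grid or a random sample of hyper-parameter combinations.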
For the next point to be searched, $x_{t+1}$, and its value $f(x_{t+1})$, assume that after the new sample point is added $f(x_{1:t+1})$ follows a (t+1)-dimensional normal distribution. Partitioning the mean vector and covariance matrix, this can be written as:
$\begin{bmatrix} f(x_{1:t}) \\ f(x_{t+1}) \end{bmatrix} \sim N\left( \begin{bmatrix} \mu(x_{1:t}) \\ \mu(x_{t+1}) \end{bmatrix}, \begin{bmatrix} \Sigma(x_{1:t}, x_{1:t}) & \kappa^T \\ \kappa & k(x_{t+1}, x_{t+1}) \end{bmatrix} \right)$
where: $\kappa = [k(x_{t+1}, x_1), k(x_{t+1}, x_2), \dots, k(x_{t+1}, x_t)]$
Thus, the search for the model hyper-parameters can be described as finding the conditional distribution of $f(x_{t+1})$ given the known $f(x_{1:t})$. By the properties of the multivariate normal distribution, this conditional distribution is a one-dimensional normal distribution $f(x_{t+1}) \mid f(x_{1:t}) \sim N(\mu, \sigma^2)$, computed as follows:
$\mu = \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\,(f(x_{1:t}) - \mu(x_{1:t})) + \mu(x_{t+1})$
$\sigma^2 = k(x_{t+1}, x_{t+1}) - \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\,\kappa^T$
where: $\mu$ depends on $f(x_{1:t})$, whereas $\sigma^2$ depends only on the covariance values computed by the Gaussian kernel function and is independent of $f(x_{1:t})$.
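The conditional mean and variance can be computed directly with NumPy, as in the sketch below. Several things here are assumptions made for illustration: the kernel is taken to be a squared-exponential kernel with amplitude `alpha0` and a hypothetical length scale `length` (the exact kernel form in the source is not reproduced in the text), the prior mean is a constant, and a small `jitter` term is added to the diagonal for numerical stability.

```python
import numpy as np

def rbf_kernel(a, b, alpha0=1.0, length=1.0):
    """Assumed Gaussian kernel: alpha0 * exp(-||a - b||^2 / (2 * length^2))."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return alpha0 * np.exp(-0.5 * d2 / length ** 2)

def gp_posterior(x_obs, f_obs, x_new, mean=0.0, jitter=1e-8):
    """Mean and variance of f(x_new) given observed f(x_1), ..., f(x_t)."""
    x_obs = np.asarray(x_obs, dtype=float).reshape(len(x_obs), -1)
    x_new = np.asarray(x_new, dtype=float).reshape(1, -1)
    f_obs = np.asarray(f_obs, dtype=float)
    sigma_mat = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    kappa = rbf_kernel(x_new, x_obs)  # row vector of shape (1, t)
    # mu = kappa Sigma^{-1} (f - mean) + mean; solve() avoids an explicit inverse.
    mu = (kappa @ np.linalg.solve(sigma_mat, f_obs - mean)).item() + mean
    var = (rbf_kernel(x_new, x_new)
           - kappa @ np.linalg.solve(sigma_mat, kappa.T)).item()
    return mu, max(var, 0.0)
```

At a point that has already been evaluated, the posterior mean reproduces the observed value and the variance collapses towards zero; far from all observations, the mean returns to the prior mean and the variance to `alpha0`.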
In summary, the algorithm flow of Bayesian hyper-parameter optimization is as follows:
S3.1, selecting t sets of hyper-parameters and calculating the model index value for each of them;
S3.2, updating $\mu$ and $\sigma^2$ of $f(x_{t+1}) \mid f(x_{1:t})$ according to the current sampled data $x_{1:t}$ and $f(x_{1:t})$;
S3.3, determining the next sampling point $x_{t+1}$ as the maximizer of the acquisition function;
S3.4, calculating the model index value $f(x_{t+1})$ at the next sampling point;
S3.5, returning to step S3.1 for recalculation, and ending the algorithm after the specified number of iterations or once the model performance reaches the expected level.
Cluster Bayesian optimization builds on Bayesian optimization by introducing the idea of cluster optimization: the problem is solved intelligently through the search behavior of the group and the information exchange within the group. The specific steps of the cluster Bayesian optimization method provided by the invention are as follows:
S3.1, initializing the maximum number of searches and the total number of searching individuals, and assigning each individual an initial hyper-parameter combination within a reasonable range;
S3.2, constructing a model from the hyper-parameter combination of each individual, training all models on the same data set, and obtaining the model index value for each individual's hyper-parameter combination;
S3.3, comparing the hyper-parameter combination of each individual with that of the individual with the best model index value in the group, and updating according to the following formula:
$x_{id} = x_{id} + \mathrm{rand}(0,1)\,(p_{id} - x_{id}) + \mathrm{rand}(0,1)\,(p_{gd} - x_{id})$
where: $x_{id}$ is the value of the d-th dimension of the i-th individual; $p_{id}$ is the best value found so far for the d-th dimension of the i-th individual; $p_{gd}$ is the best value of the d-th dimension in the group; rand(0,1) is a random number between 0 and 1.
S3.4, if the end condition is not met, returning to step S3.2; otherwise ending the algorithm, the hyper-parameter combination of the model with the best index value in the group being the optimal solution found.
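The group update rule in steps S3.1 to S3.4 can be sketched in plain Python as below. This covers only the position update and the bookkeeping of individual and group bests; it does not show how the Gaussian-process step is combined with the group search. Further assumptions: the index value is an error (lower is better), positions are clamped to their bounds, continuous values stand in for hyper-parameters that would in practice be rounded to integers, and `swarm_search` and its parameters are hypothetical names.

```python
import random

def swarm_search(objective, bounds, n_individuals=6, n_iterations=20, seed=0):
    """Group search using x_id += rand*(p_id - x_id) + rand*(p_gd - x_id)."""
    rng = random.Random(seed)
    dims = len(bounds)
    # S3.1: give every individual an initial combination inside its range.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_individuals)]
    # S3.2: evaluate each individual (here: call the objective directly).
    p_best = [xi[:] for xi in x]
    p_score = [objective(xi) for xi in x]
    g = min(range(n_individuals), key=lambda i: p_score[i])
    g_best, g_score = p_best[g][:], p_score[g]
    for _ in range(n_iterations):
        for i in range(n_individuals):
            # S3.3: move towards the individual best and the group best.
            for d in range(dims):
                lo, hi = bounds[d]
                step = (rng.random() * (p_best[i][d] - x[i][d])
                        + rng.random() * (g_best[d] - x[i][d]))
                x[i][d] = min(max(x[i][d] + step, lo), hi)  # clamping is an assumption
            score = objective(x[i])
            if score < p_score[i]:
                p_best[i], p_score[i] = x[i][:], score
                if score < g_score:
                    g_best, g_score = x[i][:], score
    # S3.4: the best combination in the group is the solution found.
    return g_best, g_score
```

In the setting of the patent, `objective` would build and train a decomposition model for the given hyper-parameter combination and return its validation error, which is far more expensive than the toy functions one would use to try this sketch.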
Further, in step 4, training the trainable parameters of the model with the training set, validation set and test set until convergence means updating the trainable parameters with the Adam optimizer and gradient back-propagation; in addition, an early stopping mechanism is added, that is, training is forcibly ended once the mean square error on the validation set has stopped decreasing for a certain number of iterations.
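The early stopping rule can be sketched as a small helper that is fed the validation mean square error after every epoch. The class name, the `patience` parameter and the `min_delta` tolerance are hypothetical; the source only states that training ends after the validation error has stopped decreasing for a certain number of iterations.

```python
class EarlyStopping:
    """Signal a stop when the validation MSE has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale_epochs = 0

    def update(self, val_mse):
        """Record one epoch; return True when training should end."""
        if val_mse < self.best - self.min_delta:
            self.best = val_mse
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience
```

A training loop would call `update` once per epoch and break out of the loop when it returns True, usually restoring the weights that gave the best validation error.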
Further, in step 5, the main idea of the post-processing method is as follows:
The information in the model training data is fully exploited: the operating rules and trends of the target electrical appliance are mined and recorded in a template feature library. On this basis, reasonable activations are corrected to higher precision and unreasonable activations are eliminated, which improves the decomposition performance of the model overall. The method comprises the following steps:
S5.1, recording the shortest activation time of the target electrical appliance in the training data by a threshold method;
S5.2, recording the duration of each activation in the decomposed power values of the target electrical appliance by a threshold method;
S5.3, eliminating, from the decomposed power values of the target electrical appliance, the activations whose duration is less than the shortest activation time;
S5.4, examining the total load power segment corresponding to each remaining activation: if the total load power in that segment shows a corresponding power rise and power drop, the activation is considered reasonable; otherwise the activation is considered unreasonable and is eliminated.
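Steps S5.1 to S5.4 can be sketched as follows. The sketch makes several assumptions that the source does not specify: an activation is a run of consecutive samples above `threshold`, a "corresponding" rise or drop is any sample-to-sample change of at least `min_step` in the total power within the activation and its immediate neighbours, and eliminated activations are set to zero. All function and parameter names are hypothetical.

```python
def find_activations(power, threshold):
    """Return (start, end) index pairs, end exclusive, where power > threshold."""
    runs, start = [], None
    for i, p in enumerate(power):
        if p > threshold and start is None:
            start = i
        elif p <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(power)))
    return runs

def shortest_activation(train_power, threshold):
    """S5.1: shortest activation length of the appliance in the training data."""
    runs = find_activations(train_power, threshold)
    return min(end - start for start, end in runs) if runs else 0

def post_process(predicted, total, threshold, min_duration, min_step):
    """S5.2 to S5.4: drop activations that are too short or lack a rise and a drop."""
    cleaned = list(predicted)
    for start, end in find_activations(predicted, threshold):
        too_short = (end - start) < min_duration                      # S5.3
        lo, hi = max(start - 1, 0), min(end + 1, len(total))
        steps = [total[i + 1] - total[i] for i in range(lo, hi - 1)]
        has_rise = any(s >= min_step for s in steps)
        has_drop = any(s <= -min_step for s in steps)
        if too_short or not (has_rise and has_drop):                  # S5.4
            for i in range(start, end):
                cleaned[i] = 0.0
    return cleaned
```

For example, a one-sample spike in the decomposed power is removed when the shortest activation seen in training lasts two samples, while a three-sample activation that coincides with a matching step up and step down in the total power is kept.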
The invention has the following beneficial effects. Compared with the prior art, the invention is the first to provide a hyper-parameter optimization and post-processing method for a non-intrusive load decomposition model, and it introduces the concept of cluster optimization into the hyper-parameter optimization of the load decomposition model. First, the total power sequence extracted from the home bus is treated as a two-dimensional image, and a deep residual network is constructed to extract the operating characteristics of the target electrical appliance from it, realizing load decomposition. Second, cluster Bayesian optimization is applied to the hyper-parameters of the load decomposition model, which overcomes the poor results and low efficiency caused by blind hyper-parameter selection in traditional methods; the search behavior of the group and the information exchange within the group make the hyper-parameter optimization efficient. Finally, the decomposition result of the model is further corrected by post-processing, which removes irrelevant and wrong decomposition results and improves the decomposition performance of the model overall. In conclusion, the hyper-parameter optimization and predicted-value post-processing method for the non-intrusive load decomposition model can provide technical and theoretical support for non-intrusive load recognition devices and has practical application value.
Drawings
Fig. 1 is a schematic diagram of the residual block structure in an embodiment of the present invention.
Fig. 2 is a schematic diagram of the ResNet-based load decomposition model in an embodiment of the present invention.
Fig. 3 is a schematic diagram of an optimization result of cluster Bayesian optimization in an embodiment of the present invention.
Fig. 4 is a schematic diagram of the overall load decomposition flow in an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the accompanying drawings and specific embodiments.
Example 1: as shown in Figs. 1 to 4, most non-intrusive load decomposition algorithms are complex, slow at recognition and inefficient in operation, and therefore cannot be applied in practical engineering. To solve this problem, the invention provides a hyper-parameter optimization and post-processing method for a non-intrusive load decomposition model, which realizes non-intrusive load decomposition with fast, accurate and efficient recognition.
A hyper-parameter optimization and post-processing method for a non-intrusive load decomposition model, as shown in Fig. 4, comprises the following steps:
Step 1: collecting electrical operation data of the home bus and of the target electrical appliance by using a voltage transformer and a current transformer, and establishing the data set of the model;
When collecting the electrical operation data of the home bus and the target electrical appliance, the limitations and cost of the acquisition hardware are taken into account: high-frequency transient features are not collected, and only steady-state features are used, including active power, reactive power and the effective (RMS) value of the current. The target electrical appliance is the appliance whose operating state is to be decomposed from the bus data, such as a refrigerator, an electric kettle or an electric rice cooker. In the data set of the model, the electrical operation data of the bus and the separately measured electrical operation data of the target electrical appliance at the same moment serve as the sample and the label respectively, and the data set is divided into a training set, a validation set and a test set in the proportion 3:1:1.
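A 3:1:1 division of the samples and labels can be sketched as below. The source does not say whether the data are shuffled before splitting; this sketch assumes a chronological split, which is a common choice for time series, and the function name `split_3_1_1` is hypothetical.

```python
def split_3_1_1(samples, labels):
    """Split paired samples and labels into training, validation and test sets (3:1:1)."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    n = len(samples)
    n_train = n * 3 // 5
    n_val = n // 5
    train = (samples[:n_train], labels[:n_train])
    val = (samples[n_train:n_train + n_val], labels[n_train:n_train + n_val])
    test = (samples[n_train + n_val:], labels[n_train + n_val:])  # remainder goes to test
    return train, val, test
```

With 10 samples this gives 6 for training, 2 for validation and 2 for testing.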
Step 2: constructing, for each target electrical appliance, a non-intrusive load decomposition model based on a deep residual network, following deep learning theory;
Constructing load decomposition models for the target electrical appliances respectively means that each target electrical appliance has its own decomposition model, and the load decomposition models of different target electrical appliances do not affect one another. The non-intrusive load decomposition model based on the deep residual network is described as follows:
In the non-intrusive load decomposition task, extracting the operating characteristics of the target appliance from the home bus is the key to achieving load decomposition. To this end, the power sequence at the home bus can be regarded as a two-dimensional image containing the operating characteristics of the target electrical appliance, and a deep residual convolutional network is used to extract these characteristics and realize load decomposition. The basic unit of the deep residual network is the residual block. It differs from a plain convolutional layer in that it introduces a skip connection, which lets the network extract original features directly from the input and avoids problems such as vanishing or exploding gradients as the number of layers grows. The residual block has two parts, the identity mapping and the residual part, as shown in Fig. 1. The identity mapping raises or reduces the dimension of the input and then connects it to the output through the skip connection; the residual part is composed of a convolutional layer, a batch normalization layer and an activation function. The output of the residual block is the sum of the two parts, expressed mathematically as:
$y_l = F(x_l, \theta_l) + h(x_l)$
$x_{l+1} = f(y_l)$
where: $x_l$ is the input of the residual block; $x_{l+1}$ is the output of the residual block; $\theta_l$ is the weight matrix of the convolutional layers; $h(\cdot)$ is the 1×1 convolution that raises or reduces the dimension of the input; $f(\cdot)$ is the activation function.
A non-intrusive load decomposition model based on the deep residual network can then be built from residual blocks. The total load power sequence
[Formula image BDA0003002210390000091: total load power sequence of length T]
is segmented with a preset sliding window of length k, giving the first segment sequence
[Formula image BDA0003002210390000092: first segment sequence of length k]
Each slide moves the window forward by one sample point, forming T-k+1 power sequences $P = [P_1, P_2, \dots, P_{T-k+1}]^T$. Each power sequence is used as an input to the model, and the output of the model is the power value of the target electrical appliance at the midpoint time of that sequence. The model first uses a convolutional layer to extract the original features of the load, then stacks four residual blocks to extract higher-level, more abstract features, and finally uses two fully connected layers to implement the nonlinear mapping from the total power sequence to the power value of the target electrical appliance at the midpoint time of the sequence, as shown in Fig. 2.
The non-intrusive load decomposition model based on the residual network applies max-min normalization, a linear transformation that maps the data to the interval [0,1], defined as follows:
$x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
where: $x$ is the raw value and $x'$ the normalized value; $x_{\max}$ is the maximum value in the data; $x_{\min}$ is the minimum value in the data.
Step 3: optimizing the hyper-parameters of the load decomposition model by using cluster Bayesian optimization;
Bayesian optimization optimizes the hyper-parameters of the load decomposition model, which include the convolution kernel size, the number of convolution kernels, the number of neurons in the fully connected layers, and the sliding window size. Bayesian optimization is described in detail as follows:
Bayesian optimization uses a Gaussian process. For the load decomposition model $Y = f(x)$, the next search point is to be determined from the hyper-parameter combinations already evaluated. Suppose t sets of hyper-parameters $x_1, x_2, \dots, x_t$ have been obtained, together with the vector $f(x_{1:t}) = [f(x_1), f(x_2), \dots, f(x_t)]$ formed by the model index values (for example error values) corresponding to these t sets. The Gaussian process assumes that this vector follows a t-dimensional Gaussian distribution $f(x_{1:t}) \sim N(\mu(x_{1:t}), \Sigma(x_{1:t}, x_{1:t}))$, where the mean vector is $\mu(x_{1:t}) = [\mu(x_1), \mu(x_2), \dots, \mu(x_t)]$ and the covariance matrix is:
$\Sigma(x_{1:t}, x_{1:t}) = \begin{bmatrix} k(x_1, x_1) & \cdots & k(x_1, x_t) \\ \vdots & \ddots & \vdots \\ k(x_t, x_1) & \cdots & k(x_t, x_t) \end{bmatrix}$
The covariance is calculated by a Gaussian kernel function:
[Formula image BDA0003002210390000102: Gaussian kernel function $k(\cdot,\cdot)$]
where: $\alpha_0$ is a parameter of the kernel function.
After t sets of candidate solutions have been obtained, the corresponding Gaussian process regression model is established and the posterior probability of the model index value at any point is obtained; an acquisition function is then built from this posterior to determine the next hyper-parameter combination to be searched. The acquisition function is:
$EI_t(x) = E[\max(0, f(x) - \max(f(x_1), f(x_2), \dots, f(x_t)))]$
$x_{t+1} = \arg\max_x EI_t(x)$
for the point x to be searched nextt+1And f (x)t+1) Assume f (x) after adding a new sample point1:t+1) The mean vector and covariance matrix are partitioned, subject to a t + 1-dimensional normal distribution, and can be written as:
Figure BDA0003002210390000103
in the formula: k ═ k (x)t+1,x1),k(xt+1,x2),…k(xt+1,xt)]T
Thus, the search for the model hyper-parameters can be described as obtaining the conditional distribution of $f(x_{t+1})$ given the known $f(x_{1:t})$. By the properties of the multivariate normal distribution, this conditional distribution is a one-dimensional normal distribution $f(x_{t+1}) \mid f(x_{1:t}) \sim N(\mu, \sigma^2)$, calculated as follows:
$\mu = \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\,(f(x_{1:t}) - \mu(x_{1:t})) + \mu(x_{t+1})$
$\sigma^2 = k(x_{t+1}, x_{t+1}) - \kappa\,\Sigma(x_{1:t}, x_{1:t})^{-1}\,\kappa^T$
In the formula: $\mu$ depends on $f(x_{1:t})$, whereas $\sigma^2$ depends only on the covariance values calculated by the Gaussian kernel function and is independent of $f(x_{1:t})$.
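As a worked special case (an illustration only, assuming a single previous observation, $t = 1$), the covariance matrix and $\kappa$ become scalars and the two formulas for $\mu$ and $\sigma^2$ reduce to:

```latex
\begin{aligned}
\kappa &= k(x_2, x_1), \qquad \Sigma(x_{1:1}, x_{1:1}) = k(x_1, x_1), \\
\mu &= \frac{k(x_2, x_1)}{k(x_1, x_1)}\,\bigl(f(x_1) - \mu(x_1)\bigr) + \mu(x_2), \\
\sigma^2 &= k(x_2, x_2) - \frac{k(x_2, x_1)^2}{k(x_1, x_1)} .
\end{aligned}
```

If the new point coincides with the observed one ($x_2 = x_1$), this gives $\mu = f(x_1)$ and $\sigma^2 = 0$, so the prediction reproduces the observation with no uncertainty. As $k(x_2, x_1)$ falls toward zero, $\mu$ returns to the prior mean $\mu(x_2)$ and $\sigma^2$ to the prior variance $k(x_2, x_2)$.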
In summary, the algorithm flow of Bayesian hyper-parameter optimization is as follows:
S3.1, selecting $t$ groups of hyper-parameters and calculating the model index value of each group;
S3.2, according to the current sampled data $x_{1:t}$ and $f(x_{1:t})$, updating $\mu$ and $\sigma^2$ of $f(x_{t+1}) \mid f(x_{1:t})$;
S3.3, determining the next sampling point $x_{t+1}$ by maximizing the acquisition function;
S3.4, calculating the model index value $f(x_{t+1})$ at the next sampling point;
S3.5, returning to step S3.1 for recalculation, and ending the algorithm after the specified number of iterations is reached or the model performance reaches the expected effect.
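The flow S3.1 to S3.5 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a single hyper-parameter is searched over a finite candidate list, the prior mean is taken as zero, the model index is treated as a score to be maximized (for example a negated error), and the kernel parameterization (`alpha0`, `length`), the `jitter` term and all function names are hypothetical.

```python
import math

def kernel(a, b, alpha0=1.0, length=0.25):
    # Gaussian kernel; this parameterization (alpha0, length) is an assumption.
    return alpha0 * math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    # Solve A z = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z

def posterior(xs, fs, x_new, jitter=1e-6):
    # S3.2: mean and variance of f(x_new) | f(xs), zero prior mean assumed.
    # The small jitter on the diagonal keeps the covariance matrix invertible.
    S = [[kernel(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    kappa = [kernel(x_new, a) for a in xs]
    w = solve(S, kappa)  # Sigma^{-1} kappa^T, valid because Sigma is symmetric
    mu = sum(wi * fi for wi, fi in zip(w, fs))
    var = kernel(x_new, x_new) - sum(wi * ki for wi, ki in zip(w, kappa))
    return mu, max(var, 0.0)

def expected_improvement(mu, var, best):
    # Closed form of E[max(f(x) - best, 0)] when f(x) ~ N(mu, var).
    sigma = math.sqrt(var)
    if sigma == 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(objective, candidates, init_xs, n_iter):
    xs = list(init_xs)                     # S3.1: initial hyper-parameter sets
    fs = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(fs)

        def acquisition(x):
            mu, var = posterior(xs, fs, x)  # S3.2: update mu and sigma^2
            return expected_improvement(mu, var, best)

        x_next = max(candidates, key=acquisition)  # S3.3: maximize acquisition
        xs.append(x_next)
        fs.append(objective(x_next))       # S3.4: evaluate the model index
    return xs, fs                          # S3.5: stop after n_iter rounds

# Hypothetical toy objective standing in for "train the model, return its score".
history_x, history_f = bayes_opt(lambda x: -(x - 0.3) ** 2,
                                 [i / 100 for i in range(101)], [0.0, 1.0], 8)
```

In practice `objective` would train the decomposition model with the given hyper-parameters and return its validation score, which is why each evaluation is expensive and the acquisition function is used to choose evaluations sparingly.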
Cluster Bayesian optimization introduces the idea of cluster optimization on the basis of Bayesian optimization, and realizes intelligent problem solving through the search behavior of the cluster and the information interaction within it. The cluster Bayesian optimization method provided by the invention comprises the following specific steps:
S3.1, initializing the maximum number of searches and the total number of search individuals, and assigning each individual an initial hyper-parameter combination within a reasonable range;
S3.2, constructing a model from the hyper-parameter combination of each individual, training the models on the same data set, and obtaining the model index value for the hyper-parameter combination of each individual;
S3.3, comparing the hyper-parameter combination of each individual with that of the individual having the best model index value in the population, and updating according to the following formula:
$x_{id} = x_{id} + \mathrm{rand}(0,1)\,(p_{id} - x_{id}) + \mathrm{rand}(0,1)\,(p_{gd} - x_{id})$
In the formula: $x_{id}$ is the value of the $d$-th dimension of the $i$-th individual; $p_{id}$ is the optimal value of the $d$-th dimension of the $i$-th individual; $p_{gd}$ is the optimal value of the $d$-th dimension in the population; $\mathrm{rand}(0,1)$ is a random number between 0 and 1.
S3.4, if the end condition is not met, returning to step S3.2; otherwise, ending the algorithm and taking the hyper-parameter combination of the model with the best index value in the population as the optimal solution found.
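The population update in steps S3.1 to S3.4 can be sketched as below. This is an illustrative reading rather than the inventors' code: `evaluate` is a hypothetical stand-in for "build and train the model, return its index value", higher index values are assumed to be better, and clamping each value to its bounds is an assumption about what "a reasonable range" means.

```python
import random

def cluster_search(evaluate, bounds, n_individuals=6, max_iter=10, seed=0):
    rng = random.Random(seed)
    dims = len(bounds)
    # S3.1: give every individual an initial hyper-parameter combination.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_individuals)]
    # S3.2: evaluate each individual; p holds each individual's own best so far.
    p = [xi[:] for xi in x]
    p_score = [evaluate(xi) for xi in x]
    g = max(range(n_individuals), key=lambda i: p_score[i])
    g_best, g_score = p[g][:], p_score[g]
    for _ in range(max_iter):
        for i in range(n_individuals):
            for d in range(dims):
                lo, hi = bounds[d]
                # S3.3: x_id = x_id + rand*(p_id - x_id) + rand*(p_gd - x_id)
                x[i][d] += (rng.random() * (p[i][d] - x[i][d])
                            + rng.random() * (g_best[d] - x[i][d]))
                x[i][d] = min(max(x[i][d], lo), hi)  # assumed range clamp
            score = evaluate(x[i])
            if score > p_score[i]:
                p[i], p_score[i] = x[i][:], score
            if score > g_score:
                g_best, g_score = x[i][:], score
    # S3.4: after the search budget is spent, return the best combination found.
    return g_best, g_score

def toy_index(h):
    # Hypothetical score surface with its maximum at kernel size 3, ratio 0.5.
    return -((h[0] - 3.0) ** 2 + (h[1] - 0.5) ** 2)

best_combo, best_score = cluster_search(toy_index, [(1.0, 7.0), (0.0, 1.0)])
```

Integer hyper-parameters such as the number of convolution kernels would need rounding before the model is built; that step is omitted here for brevity.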
The optimization results of the cluster Bayesian optimization are shown in Fig. 3.
Step 4: respectively establishing an optimal decomposition model for each target electrical appliance, and then training the trainable parameters in the model by using the training set, validation set and test set until convergence;
Training the trainable parameters until convergence means updating them with the Adam optimizer and the gradient back-propagation algorithm. In addition, an early-stopping mechanism is added: once the mean square error on the validation set has stopped decreasing for a certain number of iterations, training is forcibly ended.
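The early-stopping rule can be sketched independently of any deep learning framework. In the sketch below, `train_one_epoch` and `validation_mse` are hypothetical callables (the first would perform the Adam and back-propagation updates, the second would return the current validation mean square error), and `patience` is an assumed name for the "certain number of iterations".

```python
def train_with_early_stopping(train_one_epoch, validation_mse,
                              max_epochs=100, patience=5):
    best_mse, best_epoch, epochs_run = float("inf"), -1, 0
    for epoch in range(max_epochs):
        train_one_epoch()                  # parameter update (e.g. Adam step)
        epochs_run += 1
        mse = validation_mse()
        if mse < best_mse:
            best_mse, best_epoch = mse, epoch   # validation error still falling
        elif epoch - best_epoch >= patience:
            break                          # no improvement for `patience` epochs
    return best_epoch, best_mse, epochs_run

# Scripted validation errors: the minimum so far is reached at epoch 2.
scripted = iter([5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 3.3, 2.0])
result = train_with_early_stopping(lambda: None, lambda: next(scripted),
                                   max_epochs=8, patience=3)
```

With this scripted sequence the loop stops after the sixth epoch, so the later value 2.0 is never seen; that is the usual trade-off of early stopping and the reason the patience value is chosen with some margin.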
Step 5: post-processing the decomposition result of the model, and correcting the prediction of the model with higher precision based on the rationality of the model prediction result.
The main idea of the post-processing method is as follows:
The information in the model training data is fully utilized: the operating rules and trends of the target electrical appliance are mined and recorded in a template feature library. On this basis, reasonable activations are corrected with higher precision and unreasonable activations are eliminated, thereby comprehensively improving the decomposition effect of the model. The method comprises the following steps:
S5.1, recording the shortest activation duration of the target electrical appliance in the training data by a threshold method;
S5.2, recording the duration of each activation segment in the power decomposition values of the target electrical appliance by a threshold method;
S5.3, eliminating the activations in the power decomposition values of the target electrical appliance whose duration is shorter than the shortest activation duration;
S5.4, examining the total load power segment corresponding to each remaining activation: if a corresponding power rise and power drop exist in the total load power within the segment, the activation is considered reasonable; otherwise, the activation is considered unreasonable and eliminated.
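Steps S5.1 to S5.4 can be sketched as below. Several details are assumptions made for illustration, since the text does not fix them: an activation is a run of samples above `threshold`; a "corresponding rise and drop" is read as a step of at least `edge` watts in the total load at the start and just after the end of the activation; and an activation touching either end of the sequence, where no step can be observed, is kept.

```python
def find_activations(power, threshold):
    # Return (start, end) index pairs, end exclusive, where power > threshold.
    runs, start = [], None
    for i, p in enumerate(power):
        if p > threshold and start is None:
            start = i
        elif p <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(power)))
    return runs

def shortest_activation(train_power, threshold):
    # S5.1: shortest activation length seen in the training data.
    return min((e - s for s, e in find_activations(train_power, threshold)),
               default=0)

def post_process(pred, total, min_len, threshold, edge):
    cleaned = list(pred)
    n = len(pred)
    for s, e in find_activations(pred, threshold):        # S5.2
        too_short = (e - s) < min_len                      # S5.3
        rise = s == 0 or total[s] - total[s - 1] >= edge   # S5.4: step up
        drop = e == n or total[e - 1] - total[e] >= edge   # S5.4: step down
        if too_short or not (rise and drop):
            for i in range(s, e):
                cleaned[i] = 0.0
    return cleaned

# Hypothetical data: appliance ground truth, model prediction, total load.
train = [0, 0, 50, 60, 55, 0, 0, 70, 65, 60, 58, 0]
pred = [0, 40, 0, 0, 50, 55, 52, 0, 0, 45, 50, 48, 0, 0]
total = [100, 100, 100, 100, 160, 165, 160, 100, 100, 100, 100, 100, 100, 100]
min_len = shortest_activation(train, threshold=10)
cleaned = post_process(pred, total, min_len, threshold=10, edge=30)
```

In this example the one-sample spike at index 1 is removed for being too short, the activation at indices 4 to 6 is kept because the total load steps up and down around it, and the activation at indices 9 to 11 is removed because the total load shows no matching change.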
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present invention, and therefore, the scope of the present invention should be determined by the scope of the claims.

Claims (10)

1. A hyper-parameter optimization and post-processing method of a non-invasive load decomposition model, characterized in that the method comprises the following steps:
step 1: collecting electrical operation data of a home bus and a target electrical appliance by using a voltage transformer and a current transformer, and establishing a data set of a model;
step 2: respectively constructing, based on deep learning theory, a non-invasive load decomposition model based on a deep residual network for each target electrical appliance;
step 3: optimizing hyper-parameters of the non-invasive load decomposition model by using cluster Bayesian optimization;
step 4: respectively establishing an optimal decomposition model for each target electrical appliance, and then training the trainable parameters in the model by using a training set, a validation set and a test set until convergence;
step 5: post-processing the decomposition result of the model, and correcting the prediction of the model based on the model prediction result.
2. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 1, the collected electrical operation data of the home bus and the target electrical appliance comprise active power, reactive power and the effective (RMS) value of the current; the target electrical appliance is the appliance whose operating state is to be decomposed from the bus data; the data set of the model uses the electrical operation data at the bus as samples and the separately measured electrical operation data of the target electrical appliance at the same moments as labels, and is divided proportionally into a training set, a validation set and a test set.
3. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 2, the non-invasive load decomposition model of a target electrical appliance is the decomposition model corresponding to that appliance, and the non-invasive load decomposition models of different target electrical appliances do not affect one another.
4. The method of claim 3, wherein the non-invasive load decomposition model based on the deep residual network is constructed as follows:
in a non-invasive load decomposition task, the power sequence at the home bus is regarded as a two-dimensional image containing the operating characteristics of the target electrical appliance, and a deep residual network is used to extract those operating characteristics. The basic building unit of the deep residual network is the residual block, so that the network can extract original features directly from the input. The residual block consists of convolution layers, batch normalization layers and an activation function, and its output is expressed mathematically as:
$y_l = F(x_l, \theta_l) + h(x_l)$
$x_{l+1} = f(y_l)$
In the formula: $x_l$ is the input of the residual block; $x_{l+1}$ is the output of the residual block; $\theta_l$ is the weight matrix of the convolution layers; $h(\cdot)$ is a 1×1 convolution operation that raises or reduces the dimension of the input; $f(\cdot)$ is the activation function.
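The two residual-block equations can be illustrated with a deliberately small sketch. It is a simplification under stated assumptions, not the claimed network: the input is a single-channel 1-D sequence, $F$ is one zero-padded convolution with an odd-length kernel `theta`, batch normalization is omitted, $h$ is a 1×1 convolution that for one channel reduces to a scalar `shortcut_weight`, and $f$ is taken to be ReLU.

```python
def conv1d_same(x, kernel):
    # Zero-padded 1-D convolution; an odd kernel length keeps the output length.
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(x))]

def residual_block(x, theta, shortcut_weight=1.0):
    residual = conv1d_same(x, theta)                 # F(x_l, theta_l)
    shortcut = [shortcut_weight * v for v in x]      # h(x_l), a 1x1 convolution
    y = [r + s for r, s in zip(residual, shortcut)]  # y_l = F + h
    return [max(0.0, v) for v in y]                  # x_{l+1} = f(y_l), ReLU

out_identity = residual_block([1.0, -2.0, 3.0], theta=[0.0, 1.0, 0.0])
out_zero = residual_block([1.0, -2.0, 3.0], theta=[0.0, 0.0, 0.0])
```

With an all-zero `theta` the block passes its input straight through the shortcut (apart from the activation), which is the property that lets stacked residual blocks fall back to an identity mapping.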
A non-invasive load decomposition model based on the deep residual network is constructed from residual blocks. The total load power sequence
Figure FDA0003002210380000021
is segmented according to a preset sliding window length $k$ to obtain the first segment sequence
Figure FDA0003002210380000022
Each slide moves the window forward by one sampling point, forming $T-k+1$ power sequences $P = [P_1, P_2, \ldots, P_{T-k+1}]^T$. A first convolution layer is constructed to preliminarily extract the original features of the load; several residual blocks are then stacked to extract higher-level, more abstract features; finally, several fully connected layers realize the nonlinear mapping from each total power segment to the power value of the target electrical appliance at the midpoint moment of that segment;
the non-invasive load decomposition model based on the residual network applies a linear transformation to the data by min-max normalization, mapping the data to the interval [0, 1], which is defined as follows:
Figure FDA0003002210380000023
In the formula: $x_{max}$ is the maximum value in the data; $x_{min}$ is the minimum value in the data.
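The data preparation described in this claim can be sketched as follows. The normalization shown is the standard min-max form $(x - x_{min}) / (x_{max} - x_{min})$, assumed here to match the formula in the figure; the function names, the handling of constant data, and the choice of index `k // 2` as the "midpoint" (exact only for odd `k`) are likewise assumptions.

```python
def min_max_normalize(values):
    # Map values linearly onto [0, 1]; constant data is mapped to all zeros.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def sliding_windows(total_power, k):
    # T - k + 1 segments of length k, each shifted by one sampling point.
    T = len(total_power)
    return [total_power[i:i + k] for i in range(T - k + 1)]

def midpoint_targets(appliance_power, k):
    # Label for each segment: appliance power at the segment's midpoint.
    T = len(appliance_power)
    return [appliance_power[i + k // 2] for i in range(T - k + 1)]

# Hypothetical five-sample example with window length k = 3.
total = [100, 150, 300, 250, 100]
appliance = [0, 50, 200, 150, 0]
windows = sliding_windows(total, 3)
targets = midpoint_targets(appliance, 3)
scaled = min_max_normalize(total)
```

Here `windows` holds three segments and `targets` the three matching midpoint labels, so each (segment, label) pair is one training sample for the network.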
5. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 3, the hyper-parameters include: the convolution kernel size, the number of convolution kernels, the number of neurons in the fully connected layers, and the sliding window size.
6. The method of claim 1 or 5, wherein: in step 3, the cluster Bayesian optimization method introduces the idea of cluster optimization on the basis of Bayesian optimization, and realizes intelligent problem solving through the search behavior of the cluster and the information interaction within it, with the following specific steps:
S3.1, initializing the maximum number of searches and the total number of search individuals, and assigning each individual an initial hyper-parameter combination within a reasonable range;
S3.2, constructing a model from the hyper-parameter combination of each individual, training the models on the same data set, and obtaining the model index value for the hyper-parameter combination of each individual;
S3.3, comparing the hyper-parameter combination of each individual with that of the individual having the best model index value in the population, and updating according to the following formula:
$x_{id} = x_{id} + \mathrm{rand}(0,1)\,(p_{id} - x_{id}) + \mathrm{rand}(0,1)\,(p_{gd} - x_{id})$
In the formula: $x_{id}$ is the value of the $d$-th dimension of the $i$-th individual; $p_{id}$ is the optimal value of the $d$-th dimension of the $i$-th individual; $p_{gd}$ is the optimal value of the $d$-th dimension in the population; $\mathrm{rand}(0,1)$ is a random number between 0 and 1;
S3.4, if the end condition is not met, returning to step S3.2; otherwise, ending the algorithm and taking the hyper-parameter combination of the model with the best index value in the population as the optimal solution found.
7. The method of claim 6, wherein the algorithm flow of Bayesian hyper-parameter optimization is as follows:
S3.1, selecting $t$ groups of hyper-parameters and calculating the model index value of each group;
S3.2, according to the current sampled data $x_{1:t}$ and $f(x_{1:t})$, updating $\mu$ and $\sigma^2$ of $f(x_{t+1}) \mid f(x_{1:t})$;
S3.3, determining the next sampling point $x_{t+1}$ by maximizing the acquisition function;
S3.4, calculating the model index value $f(x_{t+1})$ at the next sampling point;
S3.5, returning to step S3.1 for recalculation, and ending the algorithm after the specified number of iterations is reached or the model performance reaches the expected effect.
8. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 4, training the trainable parameters in the model by using the training set, validation set and test set until convergence means updating the trainable parameters with the Adam optimizer and the gradient back-propagation algorithm; in addition, an early-stopping mechanism is added, namely, once the mean square error on the validation set has stopped decreasing for a certain number of iterations, training is forcibly ended.
9. The method of claim 1 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: in step 5, the post-processing method comprises: fully mining the operating rules and trends of the target electrical appliance from the information in the model training data, recording them in a template feature library, and on that basis correcting reasonable activations with higher precision and eliminating unreasonable activations.
10. The method of claim 1 or 9 for hyper-parametric optimization and post-processing of a non-intrusive load decomposition model, wherein: the post-processing method comprises the following specific steps:
S5.1, recording the shortest activation duration of the target electrical appliance in the training data by a threshold method;
S5.2, recording the duration of each activation segment in the power decomposition values of the target electrical appliance by a threshold method;
S5.3, eliminating the activations in the power decomposition values of the target electrical appliance whose duration is shorter than the shortest activation duration;
S5.4, examining the total load power segment corresponding to each remaining activation: if a corresponding power rise and power drop exist in the total load power within the segment, the activation is considered reasonable; otherwise, the activation is considered unreasonable and eliminated.
CN202110351229.XA 2021-03-31 2021-03-31 Hyper-parameter optimization and post-processing method of non-invasive load decomposition model Pending CN113065704A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110351229.XA CN113065704A (en) 2021-03-31 2021-03-31 Hyper-parameter optimization and post-processing method of non-invasive load decomposition model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110351229.XA CN113065704A (en) 2021-03-31 2021-03-31 Hyper-parameter optimization and post-processing method of non-invasive load decomposition model

Publications (1)

Publication Number Publication Date
CN113065704A true CN113065704A (en) 2021-07-02

Family

ID=76564940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110351229.XA Pending CN113065704A (en) 2021-03-31 2021-03-31 Hyper-parameter optimization and post-processing method of non-invasive load decomposition model

Country Status (1)

Country Link
CN (1) CN113065704A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085764A (en) * 2017-04-12 2017-08-22 西安交通大学 A kind of load decomposition method and device based on improvement DFHMM models
CN108573281A (en) * 2018-04-11 2018-09-25 中科弘云科技(北京)有限公司 A kind of tuning improved method of the deep learning hyper parameter based on Bayes's optimization
CN109409614A (en) * 2018-11-16 2019-03-01 国网浙江瑞安市供电有限责任公司 A kind of Methods of electric load forecasting based on BR neural network
CN110135621A (en) * 2019-04-10 2019-08-16 国网江苏省电力有限公司南通供电分公司 A kind of Short-Term Load Forecasting Method based on PSO optimization model parameter
CN110445126A (en) * 2019-06-25 2019-11-12 中国电力科学研究院有限公司 A kind of non-intrusion type load decomposition method and system
CN110598842A (en) * 2019-07-17 2019-12-20 深圳大学 Deep neural network hyper-parameter optimization method, electronic device and storage medium
CN111144581A (en) * 2019-12-31 2020-05-12 杭州雅拓信息技术有限公司 Machine learning hyper-parameter adjusting method and system
CN111428816A (en) * 2020-04-17 2020-07-17 贵州电网有限责任公司 Non-invasive load decomposition method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
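罗平 (Luo Ping) et al.: "Non-intrusive load decomposition based on appliance operating states and deep learning", 电力系统自动化 (Automation of Electric Power Systems), pages 49 - 56 *
邓舒迟 (Deng Shuchi) et al.: "Research on non-intrusive load decomposition for residential users based on time series", 电子设计工程 (Electronic Design Engineering), pages 40 - 45 *
陈春玲 (Chen Chunling): "Non-intrusive load decomposition based on deep learning algorithms", 中国优秀硕士学位论文全文数据库 (China Master's Theses Full-text Database) *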

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673830A (en) * 2021-07-23 2021-11-19 华南理工大学 Self-adaptive household energy management method based on non-invasive load monitoring technology
CN113673830B (en) * 2021-07-23 2024-03-05 华南理工大学 Self-adaptive household energy management method based on non-invasive load monitoring technology
CN113837894A (en) * 2021-08-06 2021-12-24 国网江苏省电力有限公司南京供电分公司 Non-invasive resident user load decomposition method based on residual convolution module
CN113837894B (en) * 2021-08-06 2023-12-19 国网江苏省电力有限公司南京供电分公司 Non-invasive resident user load decomposition method based on residual convolution module
CN115130830A (en) * 2022-06-08 2022-09-30 山东科技大学 Non-intrusive load decomposition method based on cascade width learning and sparrow algorithm
CN115130830B (en) * 2022-06-08 2024-05-14 山东科技大学 Non-invasive load decomposition method based on cascade width learning and sparrow algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination