CN114511058B - Load element construction method and device for electric power user portrait - Google Patents
- Publication number
- CN114511058B (application number CN202210101090.8A)
- Authority
- CN
- China
- Prior art keywords
- user
- active power
- feature vector
- load element
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F18/214 — Pattern recognition; design or setup of recognition systems; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F16/325 — Information retrieval; indexing structures; hash tables
- G06N3/02 — Computing arrangements based on biological models; neural networks
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a load element construction method and device for an electric power user portrait. The method comprises: collecting the total power of a household user and the active power of individual load elements as a training set; one-hot encoding the total power of the home user and the active power of each load element; hashing the user feature vector and the active power feature vector; inputting the low-dimensional user feature vector and the low-dimensional active power feature vector into a first and a second multi-layer perceptron network, respectively, for training, and outputting each user embedded feature vector and each load element embedded feature vector; calculating the correlation between each user embedded feature vector and each load element embedded feature vector by cosine similarity, and converting the correlation into a posterior probability with a logistic-regression (softmax) activation function to obtain the load element most strongly associated with the user. The method improves the efficiency of building the user portrait model and is highly reliable.
Description
Technical Field
The invention relates to the technical field of electric power, and in particular to a method and a device for constructing load elements for an electric power user portrait.
Background
With the continuous development of electronic and big-data technology, household electricity-consumption information is increasingly captured in data form, so that usage information can be obtained intuitively and conveniently, reducing time consumption and improving efficiency. To locate target data quickly and accurately, the attribute information of household electricity consumption must be classified with appropriate techniques. Building a portrait of the home user reveals supply and demand information, accurately pinpoints behavioural habits, and provides an overall view of the user's information.
In the user-portrait field, deep-learning methods generate embedded features from manually constructed features and feed them into a neural network that predicts which load element a user will switch on, ultimately revealing the household's electricity-usage habits. Predicting load-element switch-on events is the most important measure of model quality, and the user's historical switch-on behaviour is crucial to it. However, most current power-consumer portrait models feed hand-crafted raw features, derived from the consumer profile, the consumer's historical behaviour, load-element properties, and so on, into a multi-layer neural network.
For example, patent document CN112417308A discloses a user-portrait tag generation method based on electric-power big data. It generates user feature tags with big-data processing: a base database is built around customer appeals, opinions, and consultation data streams from channels such as power-company hotlines, the power intranet and extranet, mobile-phone apps, and WeChat public accounts; business-hall opinion books are imported into the base database as original tag-data sources; and customers are tagged through data analysis, so that a user portrait is established.
That method can integrate data from multiple sources, build a multi-dimensional customer portrait through big-data analysis, and describe deep behavioural characteristics of users through labels. However, it leans on consultation opinions supplied by users to describe user characteristics, lacks direct electricity-consumption measurements, and fails to integrate the structural and node features of complex recommendation-network scenarios; its model reliability is therefore poor, its operation speed low, and the user-portrait model is built inefficiently.
Disclosure of Invention
The invention provides a method and a device for constructing load elements for an electric power user portrait. Model training is based on the total power of a household user and the active power of individual load elements, and the structural and node features of complex recommendation-network scenarios are integrated, which improves the efficiency of building the user-portrait model and yields high reliability.
A load element construction method for an electric power user portrait comprises:
collecting the total power of a household user and the active power of individual load elements as a training set;
one-hot encoding the total power of the home user and the active power of each load element to obtain a user feature vector and an active power feature vector, respectively;
hashing the user feature vector and the active power feature vector to obtain a low-dimensional user feature vector and a low-dimensional active power feature vector, respectively;
inputting the low-dimensional user feature vector and the low-dimensional active power feature vector into a first and a second multi-layer perceptron network, respectively, for training, and outputting each user embedded feature vector and each load element embedded feature vector; and
calculating the correlation between each user embedded feature vector and each load element embedded feature vector by cosine similarity, and converting the correlation into a posterior probability with a logistic-regression (softmax) activation function to obtain the load element most strongly associated with the user.
Further, one-hot encoding the total power of the home user comprises:
grouping identical household total-power values into user similarity features;
representing each user similarity feature in binary to obtain the user feature vector.
One-hot encoding the active power of a single load element to obtain its active power feature vector comprises:
grouping identical active-power values of the same load element into active-power similarity features;
representing each active-power similarity feature in binary to obtain the active power feature vector.
Further, hashing the user feature vector to obtain a low-dimensional user feature vector comprises:
adding a start mark before and an end mark after the user feature vector;
setting a step length and a sliding window based on an N-gram algorithm, recording the user feature-vector slice captured at each step, and recording all slices as vector representations composed of several digits to obtain the low-dimensional user feature vector.
Hashing the active power feature vector to obtain a low-dimensional active power feature vector comprises:
adding a start mark before and an end mark after the active power feature vector;
setting a step length and a sliding window based on the N-gram algorithm, recording the active power feature-vector slice captured at each step, and recording all slices as vector representations composed of several digits to obtain the low-dimensional active power feature vector.
Further, the low-dimensional user feature vector and the low-dimensional active power feature vector are input into the first and second multi-layer perceptron networks, respectively, for training, and the parameters of both networks are obtained by minimizing a loss function.
Further, the parameters of the first and second multi-layer perceptron networks comprise the weight matrices and bias terms of all hidden layers.
Further, the load element embedded feature vector is calculated by the following formulas:

l_i = f(W_i · l_{i-1} + b_i), i = 2, …, N-1;
y_L = f(W_N · l_{N-1} + b_N);

where y_L is the load element embedded feature vector, l_i is the i-th hidden layer of the first multi-layer perceptron network, W_i is the weight matrix of the i-th hidden layer of the first multi-layer perceptron network, b_i is the bias term of the i-th hidden layer of the first multi-layer perceptron network, N is the index of the output layer of the first multi-layer perceptron network, and b_N is the bias term of that output layer.
The user embedded feature vector is calculated by the following formulas:

l_m = f(W_m · l_{m-1} + b_m), m = 2, …, M-1;
y_U = f(W_M · l_{M-1} + b_M);

where y_U is the user embedded feature vector, l_m is the m-th hidden layer of the second multi-layer perceptron network, W_m is the weight matrix of the m-th hidden layer of the second multi-layer perceptron network, b_m is the bias term of the m-th hidden layer of the second multi-layer perceptron network, M is the index of the output layer of the second multi-layer perceptron network, and b_M is the bias term of that output layer.
Further, the correlation between the user embedded feature vector and the load element embedded feature vector is calculated by the following formula:

R(U, L) = cos(y_U, y_L) = (y_U^T · y_L) / (‖y_U‖ · ‖y_L‖);

where y_U is the user embedded feature vector, y_L is the load element embedded feature vector, y_U^T is the transpose of the user embedded feature vector, and R(U, L) is the correlation between the user embedded feature vector and the load element embedded feature vector.
Further, the posterior probability is calculated by the following formula:

P(L_j | U) = exp(γ · R(U, L_j)) / Σ_{L'∈L} exp(γ · R(U, L'));

where P(L_j|U) is the posterior probability, γ is the smoothing factor of the logistic-regression (softmax) activation function, L is the set of load elements, U is the set of users, L' ranges over all load elements in the set, j is "+" or "-", L+ is a load element positive sample, and L- is a load element negative sample.
Further, the minimized loss function is expressed by the following formula:

L(Λ) = -log Π_{(U, L+)} P(L+ | U);

where Λ denotes the parameters of the multi-layer perceptron networks, P(L+|U) is the posterior probability, U is the set of users, and L+ is a load element positive sample.
A load element construction apparatus for an electric power user portrait comprises:
the acquisition module is used for acquiring the total power of the household users and the active power of the single load element as a training set;
the encoding module, used for one-hot encoding the total power of the home user and the active power of each load element to obtain a user feature vector and an active power feature vector, respectively;
the processing module is used for respectively carrying out hash processing on the user characteristic vector and the active power characteristic vector to respectively obtain a low-dimensional user characteristic vector and a low-dimensional active power characteristic vector;
the training module is used for respectively inputting the low-dimensional user characteristic vector and the low-dimensional active power characteristic vector into a first multi-layer perceptron network and a second multi-layer perceptron network for training, and respectively outputting each user embedded characteristic vector and each load element embedded characteristic vector;
and the calculating module is used for calculating the correlation between each user embedded feature vector and the load element embedded feature vector based on the cosine similarity, converting the correlation calculation result into posterior probability by utilizing a logistic regression activation function, and obtaining the load element with the highest degree of correlation with the user.
The load element construction method and device for an electric power user portrait provided by the invention offer at least the following beneficial effects:
model training is based on the total power of household users and the active power of individual load elements; the one-hot encoded features are hashed and then trained in multi-layer perceptron networks; a two-tower model produces the similarity between users and loads, finally yielding the load element most strongly associated with the user. The structural and node features of complex recommendation-network scenarios are fused, manual prior bias is avoided to a certain extent, the operation speed of online prediction is effectively improved, and the user portrait is built efficiently and reliably.
Drawings
FIG. 1 is a flow chart of an embodiment of the load element construction method for an electric power user portrait provided by the invention.
FIG. 2 is a schematic diagram of an embodiment of the load element construction apparatus for an electric power user portrait provided by the invention.
Reference numerals: 1 — acquisition module; 2 — encoding module; 3 — processing module; 4 — training module; 5 — calculation module.
Detailed Description
In order to better understand the above technical solutions, the following detailed description will be given with reference to the accompanying drawings and specific embodiments.
Referring to FIG. 1, some embodiments provide a load element construction method for an electric power user portrait, comprising:
S1, collecting the total power of a household user and the active power of individual load elements as a training set;
S2, one-hot encoding the total power of the home user and the active power of each load element to obtain a user feature vector and an active power feature vector, respectively;
S3, hashing the user feature vector and the active power feature vector to obtain a low-dimensional user feature vector and a low-dimensional active power feature vector, respectively;
S4, inputting the low-dimensional user feature vector and the low-dimensional active power feature vector into a first and a second multi-layer perceptron network, respectively, for training, and outputting each user embedded feature vector and each load element embedded feature vector;
S5, calculating the correlation between each user embedded feature vector and each load element embedded feature vector by cosine similarity, and converting the correlation into a posterior probability with a logistic-regression (softmax) activation function to obtain the load element most strongly associated with the user.
Specifically, in step S2, one-hot encoding the total power of the home user comprises:
S211, grouping identical household total-power values into user similarity features;
S212, representing each user similarity feature in binary to obtain the user feature vector.
One-hot encoding the active power of a single load element to obtain its active power feature vector comprises:
S221, grouping identical active-power values of the same load element into active-power similarity features;
S222, representing each active-power similarity feature in binary to obtain the active power feature vector.
In a specific embodiment, the total power of the home user and the active power of each load element are processed separately: the first multi-layer perceptron network processes the total power of the home user, the second multi-layer perceptron network processes the active power of each load element, and together the two networks form a two-tower model.
Specifically, the observed active-power values are first collected into a dictionary; the dictionary size is the dimension of the one-hot encoding. Each encoding is a zero vector of dictionary-size × 1 dimensions, with a 1 only at the position representing the current active-power value. For example, if the power dictionary of a refrigerator is [0, 1, 2, 3, 4, …] and the current power is 0, the one-hot encoding is [1, 0, …].
When the active-power values of the same load element are grouped into similarity features, only one bit of each feature in each sample is enabled and the remaining bits are disabled.
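The dictionary-based one-hot scheme described above can be sketched as follows; the `one_hot` helper and the refrigerator dictionary values are illustrative, not taken from the patent:

```python
def one_hot(value, dictionary):
    """Encode a power reading as a one-hot vector over a value dictionary."""
    vec = [0] * len(dictionary)          # zero vector of dictionary-size dimensions
    vec[dictionary.index(value)] = 1     # exactly one bit enabled, the rest disabled
    return vec

# Example: a refrigerator whose power dictionary is [0, 1, 2, 3, 4, ...]
fridge_dict = [0, 1, 2, 3, 4]
print(one_hot(0, fridge_dict))  # -> [1, 0, 0, 0, 0]
```

The encoding dimension equals the dictionary size, which is why the hashing step below is needed to reduce it.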
In step S3, hashing the user feature vector to obtain a low-dimensional user feature vector comprises:
S311, adding a start mark before and an end mark after the user feature vector;
S312, setting a step length and a sliding window based on an N-gram algorithm, recording the user feature-vector slice captured at each step, and recording all slices as vector representations composed of several digits to obtain the low-dimensional user feature vector.
Hashing the active power feature vector to obtain a low-dimensional active power feature vector comprises:
S321, adding a start mark before and an end mark after the active power feature vector;
S322, setting a step length and a sliding window based on the N-gram algorithm, recording the active power feature-vector slice captured at each step, and recording all slices as vector representations composed of several digits to obtain the low-dimensional active power feature vector. The algorithm may be represented by the following formula:
l_1 = W_1 · x;

where x is the one-hot encoded input feature vector, l_1 is the first hidden layer, and W_1 is the weight matrix of the first layer.
In a specific embodiment, taking a refrigerator as the load element: when its active power at some moment is 258 W, the marks are added to give #258#, which is then split into n-digit slices; with n = 3 these are #25, 258, and 58#. The dictionary obtained by one-hot encoding can therefore be represented by vectors built from many 3-digit slices, which greatly reduces its dimension. For example, when the active power is 259 W and the n-gram dictionary is [… #25, … 258, 259, 260, … 59# …], the hashed encoding is [… 1, … 0, 1, 0 … 1 …].
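A minimal sketch of this letter-n-gram word hashing, assuming a toy n-gram vocabulary (the names `letter_ngrams` and `hash_encode` are illustrative):

```python
def letter_ngrams(value, n=3):
    """Wrap a value with start/end marks and slide an n-wide window over it."""
    s = "#" + str(value) + "#"                      # e.g. 258 -> "#258#"
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def hash_encode(value, vocab, n=3):
    """Low-dimensional multi-hot vector over the n-gram vocabulary."""
    vec = [0] * len(vocab)
    for g in letter_ngrams(value, n):
        if g in vocab:
            vec[vocab.index(g)] = 1                 # mark each captured slice
    return vec

print(letter_ngrams(258))  # -> ['#25', '258', '58#']
```

Because many power values share n-gram slices, the vocabulary of slices is far smaller than the one-hot dictionary, which is the dimensionality reduction the patent relies on.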
In step S4, the low-dimensional user feature vector and the low-dimensional active power feature vector are respectively input into a first multi-layer perceptron network and a second multi-layer perceptron network for training, and parameters of the first multi-layer perceptron network and the second multi-layer perceptron network are respectively obtained according to a minimized loss function.
The parameters of the first multi-layer perceptron network and the second multi-layer perceptron network comprise weight matrixes and bias items of all hidden layers.
The load element embedded feature vector is calculated by the following formulas:

l_i = f(W_i · l_{i-1} + b_i), i = 2, …, N-1;
y_L = f(W_N · l_{N-1} + b_N);

where y_L is the load element embedded feature vector, l_i is the i-th hidden layer of the first multi-layer perceptron network, W_i is the weight matrix of the i-th hidden layer of the first multi-layer perceptron network, b_i is the bias term of the i-th hidden layer of the first multi-layer perceptron network, N is the index of the output layer of the first multi-layer perceptron network, and b_N is the bias term of that output layer.
The user embedded feature vector is calculated by the following formulas:

l_m = f(W_m · l_{m-1} + b_m), m = 2, …, M-1;
y_U = f(W_M · l_{M-1} + b_M);

where y_U is the user embedded feature vector, l_m is the m-th hidden layer of the second multi-layer perceptron network, W_m is the weight matrix of the m-th hidden layer of the second multi-layer perceptron network, b_m is the bias term of the m-th hidden layer of the second multi-layer perceptron network, M is the index of the output layer of the second multi-layer perceptron network, and b_M is the bias term of that output layer.
The activation function f of the hidden layers and the output layer is the tanh function, expressed as:

f(x) = (1 − e^(−2x)) / (1 + e^(−2x)).
in a specific embodiment, the first multi-layer perceptron network and the second multi-layer perceptron network each comprise an input layer, a hidden layer and an output layer, after the low-dimensional user feature vector and the low-dimensional active power feature vector are input to the input layer, activating and embedding feature vector calculation is performed in the hidden layer and the output layer, and then the finally obtained 128-dimensional vector is output as an embedding feature vector.
In step S5, the correlation between the user embedded feature vector and the load element embedded feature vector is calculated by the following formula:

R(U, L) = cos(y_U, y_L) = (y_U^T · y_L) / (‖y_U‖ · ‖y_L‖);

where y_U is the user embedded feature vector, y_L is the load element embedded feature vector, y_U^T is the transpose of the user embedded feature vector, and R(U, L) is the correlation between the user embedded feature vector and the load element embedded feature vector.
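The cosine correlation R(U, L) can be computed directly from the two embeddings (a straightforward sketch; `cosine` is an illustrative name):

```python
import math

def cosine(y_u, y_l):
    """R(U, L) = y_U^T y_L / (||y_U|| * ||y_L||)."""
    dot = sum(a * b for a, b in zip(y_u, y_l))
    norm_u = math.sqrt(sum(a * a for a in y_u))
    norm_l = math.sqrt(sum(b * b for b in y_l))
    return dot / (norm_u * norm_l)

print(cosine([1.0, 0.0], [1.0, 0.0]))  # -> 1.0
```

Identical embeddings score 1.0 and orthogonal ones 0.0, so the score directly ranks which load element best matches a user.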
To obtain the parameters of the multi-layer perceptron networks and the embedded-feature output layers, comprising the weight matrices and bias terms, maximum likelihood estimation is used: the conditional probability that the observed loads are switched on, given the household's total power-consumption data, is maximized.
The posterior probability is calculated by the following formula:

P(L_j | U) = exp(γ · R(U, L_j)) / Σ_{L'∈L} exp(γ · R(U, L'));

where P(L_j|U) is the posterior probability, γ is the smoothing factor of the logistic-regression (softmax) activation function, L is the set of load elements, U is the set of users, L' ranges over all load elements in the set, j is "+" or "-", L+ is a load element positive sample, and L- is a load element negative sample.
In one embodiment, the activation function may be a softmax function that converts the cosine similarity between the user and the load element into a posterior probability; the value of γ is typically set empirically.
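A sketch of the softmax conversion from cosine similarities to posterior probabilities; the value γ = 10 is an illustrative choice, since the patent only says γ is set empirically:

```python
import math

def posterior(similarities, gamma=10.0):
    """P(L_j|U) = exp(g·R(U,L_j)) / sum_L' exp(g·R(U,L')), softmax with smoothing factor gamma."""
    exps = [math.exp(gamma * r) for r in similarities]
    total = sum(exps)
    return [e / total for e in exps]

# One positive and four sampled negatives, as in the patent's sampling scheme
probs = posterior([0.9, 0.2, 0.1, 0.05, 0.0])
print(round(sum(probs), 6))  # -> 1.0
```

The probabilities sum to 1 over the candidate set, and a larger γ sharpens the distribution toward the most similar load element.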
Ideally, L should contain every load element that may participate in matching. For a scenario in which a home-user query occurs together with a load-element switch-on event, L+ denotes the load that was switched on, i.e. the positive sample, and negative sampling randomly selects 4 other loads, for which no switch-on occurred, as negative samples.
During training, the model parameters are estimated so as to maximize the probability of the observed load-element switch-on behaviour over the whole training set of home-user queries. The minimized loss function therefore follows from maximum likelihood estimation and is expressed as:

L(Λ) = -log Π_{(U, L+)} P(L+ | U);

where Λ denotes the parameters of the multi-layer perceptron networks, P(L+|U) is the posterior probability, U is the set of users, and L+ is a load element positive sample.
Because the posterior probability obtained with the initial parameters of the two-tower model has low accuracy, the model parameters are optimized by stochastic gradient descent, maximizing the posterior probability and updating the parameters of every layer of the two-tower model.
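A toy numerical sketch of the loss L(Λ) and a gradient-descent step; it uses a finite-difference gradient on a single similarity parameter rather than the patent's full backpropagation through both towers, and all values (γ, learning rate, similarities) are illustrative:

```python
import math

def log_loss(pos_sims, neg_sims_list, gamma=10.0):
    """-log prod P(L+|U): negative log-likelihood over (user, positive) pairs."""
    total = 0.0
    for pos, negs in zip(pos_sims, neg_sims_list):
        exps = [math.exp(gamma * r) for r in [pos] + negs]
        total -= math.log(exps[0] / sum(exps))   # softmax probability of the positive
    return total

# Toy gradient descent: nudge the positive similarity to reduce the loss numerically.
theta, lr, eps = 0.1, 0.05, 1e-4
negs = [[0.2, 0.1, 0.05, 0.0]]                   # one positive vs. 4 sampled negatives
for _ in range(100):
    grad = (log_loss([theta + eps], negs) - log_loss([theta - eps], negs)) / (2 * eps)
    theta -= lr * grad
print(log_loss([theta], negs) < log_loss([0.1], negs))  # -> True
```

In the real model the descent updates the weight matrices and bias terms of every layer, which in turn move the embeddings so that the positive pair's cosine similarity rises.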
Finally, the load element most strongly associated with the user is obtained. The resulting embedded feature vectors can be used to compare the similarity between a user and the load elements, so as to find the load likely to be switched on at the next moment, and to compare the similarity between different households, so as to find households of the same type and aggregate their information features.
Referring to FIG. 2, some embodiments provide a load element construction apparatus for an electric power user portrait, comprising:
the acquisition module 1 is used for acquiring the total power of the household users and the active power of a single load element as a training set;
the encoding module 2, used for one-hot encoding the total power of the home user and the active power of each load element to obtain a user feature vector and an active power feature vector, respectively;
the processing module 3 is used for respectively carrying out hash processing on the user characteristic vector and the active power characteristic vector to respectively obtain a low-dimensional user characteristic vector and a low-dimensional active power characteristic vector;
the training module 4 is used for respectively inputting the low-dimensional user feature vector and the low-dimensional active power feature vector into a first multi-layer perceptron network and a second multi-layer perceptron network for training, and respectively outputting each user embedded feature vector and each load element embedded feature vector;
and the calculating module 5 is used for calculating the correlation between each user embedded feature vector and the load element embedded feature vector based on cosine similarity, and converting the correlation calculation result into posterior probability by utilizing a logistic regression activation function to obtain the load element with the highest degree of correlation with the user.
Specifically, the encoding module 2 is further configured to perform one-hot encoding on the total power of the household user, including:
dividing the same household user total power into user similar characteristics;
representing each user similar characteristic in binary to obtain the user feature vector;
performing one-hot encoding on the active power of the single load element to obtain the active power feature vector of the single load element, including:
dividing the active power of the same single load element into active power similar characteristics;
and representing each active power similar characteristic in binary to obtain the active power feature vector.
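As a minimal sketch of this binning-and-binary-representation step (the 500 W bin edges below are illustrative assumptions, not values from the patent), the one-hot encoding might look like:

```python
from bisect import bisect_right

def one_hot_encode(power_readings, bin_edges):
    """Quantize each power reading into a bin (a "similar characteristic")
    and represent the bin index as a one-hot binary vector."""
    n_bins = len(bin_edges) + 1
    vectors = []
    for p in power_readings:
        idx = bisect_right(bin_edges, p)  # bin this reading falls into
        vec = [0] * n_bins
        vec[idx] = 1                      # a single 1 marks the bin
        vectors.append(vec)
    return vectors

# hypothetical bin edges in watts for total household power
vecs = one_hot_encode([120.0, 830.0, 1490.0], bin_edges=[500.0, 1000.0])
# vecs → [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Each reading thus maps to exactly one active position, which is what makes the subsequent hashing step worthwhile: the raw one-hot vectors grow with the number of bins.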
The processing module 3 is further configured to hash the user feature vector to obtain a low-dimensional user feature vector, including:
adding a start mark and an end mark before and after the user feature vector, respectively;
setting a step length and a sliding window based on the N-GRAM algorithm, recording the user feature vector slice captured at each slide of the window, and recording all user feature vector slices as vector representations composed of a plurality of numbers, to obtain the low-dimensional user feature vector;
hashing the active power feature vector to obtain a low-dimensional active power feature vector, including:
adding a start mark and an end mark before and after the active power feature vector, respectively;
and setting a step length and a sliding window based on the N-GRAM algorithm, recording the active power feature vector slice captured at each slide of the window, and recording all active power feature vector slices as vector representations composed of a plurality of numbers, to obtain the low-dimensional active power feature vector.
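A rough sketch of this N-gram slicing-and-hashing step. The start/end marks, the 16-bucket size, and the toy character-sum hash are all illustrative assumptions; the patent specifies only the sliding window, the step length, and the dimension reduction:

```python
def ngram_hash(feature_vector, n=3, step=1, n_buckets=16):
    """Add start/end marks, slide an n-wide window with the given step,
    record each slice, then hash the slices into a small number of
    buckets to obtain a low-dimensional multi-hot vector."""
    marked = ["<s>"] + [str(v) for v in feature_vector] + ["</s>"]
    slices = ["".join(marked[i:i + n])
              for i in range(0, len(marked) - n + 1, step)]
    low_dim = [0] * n_buckets
    for s in slices:
        # sum of character codes as a toy deterministic hash function
        low_dim[sum(map(ord, s)) % n_buckets] = 1
    return low_dim

low = ngram_hash([0, 1, 0, 0, 1, 0, 0, 0], n=3, step=1)
```

Regardless of how long the one-hot input grows, the output stays at the fixed bucket count, which is what keeps the perceptron input layers small.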
The training module 4 is further configured to input the low-dimensional user feature vector and the low-dimensional active power feature vector to the first multi-layer perceptron network and the second multi-layer perceptron network, respectively, to perform training, and obtain parameters of the first multi-layer perceptron network and the second multi-layer perceptron network, respectively, according to a minimized loss function.
The parameters of the first multi-layer perceptron network and the second multi-layer perceptron network comprise weight matrixes and bias items of all hidden layers.
The load element embedded feature vector is calculated by the following formulas:

l_i = f(W_i · l_{i-1} + b_i), i = 2, …, N−1;

y_L = f(W_N · l_{N-1} + b_N);

where y_L is the load element embedded feature vector, l_i is the i-th hidden layer of the first multi-layer perceptron network, W_i is the weight matrix of the i-th hidden layer of the first multi-layer perceptron network, b_i is the bias term of the i-th hidden layer of the first multi-layer perceptron network, and N is the number of hidden layers of the first multi-layer perceptron network;
the user embedded feature vector is calculated by the following formulas:

l_m = f(W_m · l_{m-1} + b_m), m = 2, …, M−1;

y_U = f(W_M · l_{M-1} + b_M);

where y_U is the user embedded feature vector, l_m is the m-th hidden layer of the second multi-layer perceptron network, W_m is the weight matrix of the m-th hidden layer of the second multi-layer perceptron network, b_m is the bias term of the m-th hidden layer of the second multi-layer perceptron network, and M is the number of hidden layers of the second multi-layer perceptron network.
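The layer recursion above can be sketched as a plain forward pass. The activation f is assumed here to be tanh and the layer sizes are toy values; neither is specified by the patent:

```python
import math
import random

def mlp_forward(x, weights, biases):
    """Compute l_i = f(W_i · l_{i-1} + b_i) layer by layer; the output
    of the last layer is the embedded feature vector (y_L or y_U)."""
    layer = x
    for W, b in zip(weights, biases):
        layer = [math.tanh(sum(w * v for w, v in zip(row, layer)) + bi)
                 for row, bi in zip(W, b)]
    return layer

random.seed(0)
# toy first tower: 4-dim hashed input -> 3-dim hidden -> 2-dim embedding
W2 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
W3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
y_L = mlp_forward([1, 0, 0, 1], [W2, W3], [[0.0] * 3, [0.0] * 2])
```

The user tower runs the same forward pass with its own weights, so the two embeddings land in a shared low-dimensional space and can be compared directly.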
The calculating module 5 is further configured to calculate the correlation between the user embedded feature vector and the load element embedded feature vector by the following formula:

R(U, L) = cos(y_U, y_L) = (y_U^T · y_L) / (‖y_U‖ · ‖y_L‖);

where y_U is the user embedded feature vector, y_L is the load element embedded feature vector, y_U^T is the transpose of the user embedded feature vector, and R(U, L) is the correlation between the user embedded feature vector and the load element embedded feature vector.
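The cosine-similarity correlation is a direct computation; a minimal sketch:

```python
import math

def correlation(y_u, y_l):
    """R(U, L): cosine similarity between the two embedded vectors."""
    dot = sum(a * b for a, b in zip(y_u, y_l))
    norm_u = math.sqrt(sum(a * a for a in y_u))
    norm_l = math.sqrt(sum(b * b for b in y_l))
    return dot / (norm_u * norm_l)

r = correlation([1.0, 0.0], [1.0, 1.0])
# r → 0.7071... (cosine of the 45-degree angle between the vectors)
```

Because the result depends only on the angle between the embeddings, not their magnitudes, R(U, L) is bounded in [−1, 1] and can be compared across users and load elements.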
The posterior probability is calculated by the following formula:
wherein ,P(Lj U) is a posterior probability, γ is a smoothing factor of a logistic regression activation function, L represents a set of load elements, U represents a set of users, L' represents a sum of all load elements in the set of load elements, j is "+" or "-", L + Representing a positive sample of the load cell, L - Representing a load cell negative sample.
Summing over L′ traverses all of the load elements within the set L.
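This posterior is a softmax over γ-scaled similarities; a minimal sketch (γ = 10 is an arbitrary illustrative value, not one given in the patent):

```python
import math

def posterior(similarities, gamma=10.0):
    """Softmax over scaled similarities:
    P(L_j|U) = exp(gamma*R(U,L_j)) / sum over L' of exp(gamma*R(U,L'))."""
    exps = [math.exp(gamma * r) for r in similarities]
    total = sum(exps)
    return [e / total for e in exps]

# similarities of one user to three candidate load elements
probs = posterior([0.9, 0.2, -0.5], gamma=10.0)
```

A larger γ sharpens the distribution toward the best-matching load element, while a smaller γ smooths it; the probabilities always sum to 1 over the set L.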
The minimized loss function is represented by the following formula:

L(Λ) = −log Π_{(U, L_+)} P(L_+ | U);

where Λ denotes the parameters of the multi-layer perceptron networks, P(L_+ | U) represents the posterior probability, U represents the set of users, and L_+ represents a load element positive sample.
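Minimizing −log of a product of posteriors is the usual negative log-likelihood; a minimal sketch of the objective value for a batch of (user, positive load element) pairs:

```python
import math

def loss(positive_posteriors):
    """Minimized loss: -log of the product of P(L+|U) over all
    (user, positive load element) pairs, i.e. -sum of log P(L+|U)."""
    return -sum(math.log(p) for p in positive_posteriors)

# posteriors of the true load element for two training pairs
l_val = loss([0.8, 0.9])
```

The loss is zero only when every positive load element receives posterior probability 1, so gradient descent on Λ pushes each user embedding toward the embeddings of its actually-observed loads.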
According to the load element construction method and apparatus for the electric power user portrait, model training is performed based on the total power of the household user and the active power of each load element: the features obtained by one-hot encoding are hashed, the multi-layer perceptron networks are trained, and the similarity between the user and each load is generated by means of a twin-tower model, finally obtaining the load element most strongly associated with the user. The structural features and node features of a complex recommendation network scene are fused, manual prior bias is avoided to a certain extent, the operation speed during online prediction is effectively improved, and the user portrait model is constructed efficiently and reliably.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention. It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (9)
1. A load element construction method for an electric power user portrait, comprising:
collecting total power of a household user and active power of a single load element as a training set;
performing one-hot encoding on the total power of the household user and the active power of a single load element to obtain a user feature vector and an active power feature vector, respectively;
respectively carrying out hash processing on the user characteristic vector and the active power characteristic vector to respectively obtain a low-dimensional user characteristic vector and a low-dimensional active power characteristic vector;
the low-dimensional user feature vector and the low-dimensional active power feature vector are respectively input into a first multi-layer perceptron network and a second multi-layer perceptron network for training, and each user embedded feature vector and each load element embedded feature vector are respectively output;
calculating the relevance of each user embedded feature vector and the load element embedded feature vector based on cosine similarity, and converting a relevance calculation result into posterior probability by utilizing a logistic regression activation function to obtain a load element with highest relevance to the user;
performing one-hot encoding on the total power of the household user, including:
dividing the same household user total power into user similar characteristics;
representing each user similar characteristic in binary to obtain the user feature vector;
performing one-hot encoding on the active power of the single load element to obtain the active power feature vector of the single load element, including:
dividing the active power of the same single load element into active power similar characteristics;
and representing each active power similar characteristic in binary to obtain the active power feature vector.
2. The method of claim 1, wherein hashing the user feature vector to obtain a low-dimensional user feature vector comprises:
adding a start mark and an end mark before and after the user feature vector, respectively;
setting a step length and a sliding window based on the N-GRAM algorithm, recording the user feature vector slice captured at each slide of the window, and recording all user feature vector slices as vector representations composed of a plurality of numbers, to obtain the low-dimensional user feature vector;
and hashing the active power feature vector to obtain a low-dimensional active power feature vector comprises:
adding a start mark and an end mark before and after the active power feature vector, respectively;
and setting a step length and a sliding window based on the N-GRAM algorithm, recording the active power feature vector slice captured at each slide of the window, and recording all active power feature vector slices as vector representations composed of a plurality of numbers, to obtain the low-dimensional active power feature vector.
3. The method of claim 1, wherein the low-dimensional user feature vector and the low-dimensional active power feature vector are input to a first multi-layer perceptron network and a second multi-layer perceptron network, respectively, for training, and parameters of the first multi-layer perceptron network and the second multi-layer perceptron network are obtained, respectively, according to a minimization loss function.
4. The method of claim 3, wherein the parameters of the first and second multi-layer perceptron networks comprise a weight matrix and bias terms for each hidden layer.
5. The method of claim 4, wherein the load element embedded feature vector is calculated by the following formulas:

l_1 = W_1 · x;

l_i = f(W_i · l_{i-1} + b_i), i = 2, …, N−1;

y_L = f(W_N · l_{N-1} + b_N);

where y_L is the load element embedded feature vector, l_i is the i-th hidden layer of the first multi-layer perceptron network, W_i is the weight matrix of the i-th hidden layer of the first multi-layer perceptron network, b_i is the bias term of the i-th hidden layer of the first multi-layer perceptron network, the number of hidden layers of the first multi-layer perceptron network is N−2, b_N is the bias term of the output layer of the first multi-layer perceptron network, x represents the one-hot input feature vector, l_1 represents the input layer, W_1 represents the weight matrix of the input layer, and W_N is the weight matrix of the output layer of the first multi-layer perceptron network;
the user embedded feature vector is calculated by the following formulas:

l_1 = W_1 · x;

l_m = f(W_m · l_{m-1} + b_m), m = 2, …, M−1;

y_U = f(W_M · l_{M-1} + b_M);

where y_U is the user embedded feature vector, l_m is the m-th hidden layer of the second multi-layer perceptron network, W_m is the weight matrix of the m-th hidden layer of the second multi-layer perceptron network, b_m is the bias term of the m-th hidden layer of the second multi-layer perceptron network, the number of hidden layers of the second multi-layer perceptron network is M−2, b_M is the bias term of the output layer of the second multi-layer perceptron network, and W_M is the weight matrix of the output layer of the second multi-layer perceptron network.
6. The method of claim 5, wherein the correlation of the user embedded feature vector with the load element embedded feature vector is calculated by the following formula:

R(U, L) = cos(y_U, y_L) = (y_U^T · y_L) / (‖y_U‖ · ‖y_L‖).
7. The method of claim 6, wherein the posterior probability is calculated by the following formula:

P(L_j | U) = exp(γ · R(U, L_j)) / Σ_{L′ ∈ L} exp(γ · R(U, L′));

where P(L_j | U) is the posterior probability, γ is the smoothing factor of the logistic regression activation function, L represents the set of load elements, U represents the set of users, L′ traverses all load elements in the set of load elements, j is "+" or "−", L_+ represents a load element positive sample, and L_− represents a load element negative sample.
9. A load element construction apparatus for an electric power user portrait, comprising:
the acquisition module is used for acquiring the total power of the household users and the active power of the single load element as a training set;
the encoding module is used for performing one-hot encoding on the total power of the household user and the active power of the single load element to obtain a user feature vector and an active power feature vector, respectively;
the processing module is used for respectively carrying out hash processing on the user characteristic vector and the active power characteristic vector to respectively obtain a low-dimensional user characteristic vector and a low-dimensional active power characteristic vector;
the training module is used for respectively inputting the low-dimensional user characteristic vector and the low-dimensional active power characteristic vector into a first multi-layer perceptron network and a second multi-layer perceptron network for training, and respectively outputting each user embedded characteristic vector and each load element embedded characteristic vector;
the computing module is used for computing the relevance of each user embedded feature vector and the load element embedded feature vector based on cosine similarity, converting a relevance computing result into posterior probability by utilizing a logistic regression activation function, and obtaining the load element with the highest relevance to the user;
the encoding module is further configured to perform one-hot encoding on the total power of the household user, including:
dividing the same household user total power into user similar characteristics;
representing each user similar characteristic in binary to obtain the user feature vector;
performing one-hot encoding on the active power of the single load element to obtain the active power feature vector of the single load element, including:
dividing the active power of the same single load element into active power similar characteristics;
and representing each active power similar characteristic in binary to obtain the active power feature vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210101090.8A CN114511058B (en) | 2022-01-27 | 2022-01-27 | Load element construction method and device for electric power user portrait |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114511058A CN114511058A (en) | 2022-05-17 |
CN114511058B true CN114511058B (en) | 2023-06-02 |
Family
ID=81549435
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210101090.8A Active CN114511058B (en) | 2022-01-27 | 2022-01-27 | Load element construction method and device for electric power user portrait |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114511058B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115017333A (en) * | 2022-06-13 | 2022-09-06 | 四川大学 | Method for converting modeless data of material genetic engineering into knowledge map |
CN114881538B (en) * | 2022-06-24 | 2022-12-13 | 东南大学溧阳研究院 | Demand response user selection method based on perceptron |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109598446A (en) * | 2018-12-09 | 2019-04-09 | 国网江苏省电力有限公司扬州供电分公司 | A kind of tariff recovery Warning System based on machine learning algorithm |
WO2019141040A1 (en) * | 2018-01-22 | 2019-07-25 | 佛山科学技术学院 | Short term electrical load predication method |
WO2019238096A1 (en) * | 2018-06-13 | 2019-12-19 | 国网江苏省电力有限公司 | Method and apparatus for estimating weather-sensitive load power |
CN110928990A (en) * | 2019-10-31 | 2020-03-27 | 南方电网调峰调频发电有限公司 | Method special for recommending standing book data of power equipment based on user portrait |
CN111428355A (en) * | 2020-03-18 | 2020-07-17 | 东南大学 | Modeling method for power load digital statistics intelligent synthesis |
CN111651668A (en) * | 2020-05-06 | 2020-09-11 | 上海晶赞融宣科技有限公司 | User portrait label generation method and device, storage medium and terminal |
CN112598303A (en) * | 2020-12-28 | 2021-04-02 | 宁波迦南智能电气股份有限公司 | Non-invasive load decomposition method based on combination of 1D convolutional neural network and LSTM |
CN113554241A (en) * | 2021-09-02 | 2021-10-26 | 国网山东省电力公司泰安供电公司 | User layering method and prediction method based on user electricity complaint behaviors |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101297318B (en) * | 2005-08-23 | 2013-01-23 | 株式会社理光 | Data organization and access for mixed media document system |
JP5690296B2 (en) * | 2012-03-01 | 2015-03-25 | 日本電信電話株式会社 | Load balancing program and load balancing apparatus |
CN104777383B (en) * | 2015-04-16 | 2018-01-05 | 武汉阿帕科技有限公司 | A kind of non-intrusion type electrical load monitoring and load decomposition device |
CN106936129B (en) * | 2017-03-23 | 2020-04-24 | 东北大学 | Power load identification method and system based on multi-feature fusion |
JP6832783B2 (en) * | 2017-04-20 | 2021-02-24 | 株式会社日立製作所 | Data analyzers, data analysis methods, and data analysis programs |
CN107121210B (en) * | 2017-05-19 | 2019-12-03 | 四川成瑞科技有限公司 | Temp monitoring alarm, method and system |
CN107578288B (en) * | 2017-09-08 | 2020-09-18 | 东南大学 | Non-invasive load decomposition method considering user power consumption mode difference |
US20190325293A1 (en) * | 2018-04-19 | 2019-10-24 | National University Of Singapore | Tree enhanced embedding model predictive analysis methods and systems |
CN109101620B (en) * | 2018-08-08 | 2022-07-05 | 阿里巴巴(中国)有限公司 | Similarity calculation method, clustering method, device, storage medium and electronic equipment |
CN109523057A (en) * | 2018-10-18 | 2019-03-26 | 国网山东省电力公司经济技术研究院 | A kind of regional power grid Methods of electric load forecasting considering economic transition background |
CN110212542A (en) * | 2019-05-28 | 2019-09-06 | 国网江苏省电力有限公司 | A kind of accurate cutting load method and device |
CN110472041B (en) * | 2019-07-01 | 2021-08-03 | 浙江工业大学 | Text classification method for customer service online quality inspection |
JP6691280B1 (en) * | 2019-07-30 | 2020-04-28 | 特許庁長官 | Management system and management method |
CN110635833B (en) * | 2019-09-25 | 2020-12-15 | 北京邮电大学 | Power distribution method and device based on deep learning |
CN110866471A (en) * | 2019-10-31 | 2020-03-06 | Oppo广东移动通信有限公司 | Face image quality evaluation method and device, computer readable medium and communication terminal |
CN110991263B (en) * | 2019-11-12 | 2022-03-18 | 华中科技大学 | Non-invasive load identification method and system for resisting background load interference |
CN113128494A (en) * | 2019-12-30 | 2021-07-16 | 华为技术有限公司 | Method, device and system for recognizing text in image |
CN111489036B (en) * | 2020-04-14 | 2023-06-09 | 天津相和电气科技有限公司 | Resident load prediction method and device based on electrical load characteristics and deep learning |
CN111428816B (en) * | 2020-04-17 | 2023-01-20 | 贵州电网有限责任公司 | Non-invasive load decomposition method |
CN111612319A (en) * | 2020-05-11 | 2020-09-01 | 上海电力大学 | Load curve depth embedding clustering method based on one-dimensional convolution self-encoder |
CN111799782A (en) * | 2020-06-29 | 2020-10-20 | 中国电力科学研究院有限公司 | Power equipment power failure window period correction method and system based on machine learning |
CN111860977B (en) * | 2020-06-30 | 2022-07-01 | 清华大学 | Probability prediction method and probability prediction device for short-term load |
CN111753058B (en) * | 2020-06-30 | 2023-06-02 | 北京信息科技大学 | Text viewpoint mining method and system |
CN111949707B (en) * | 2020-08-06 | 2022-03-22 | 杭州电子科技大学 | Shadow field-based hidden Markov model non-invasive load decomposition method |
CN112379159B (en) * | 2020-11-09 | 2021-10-26 | 北华航天工业学院 | Non-invasive household load decomposition method based on electric appliance operation mode |
CN112364203B (en) * | 2020-11-30 | 2023-04-28 | 未来电视有限公司 | Television video recommendation method, device, server and storage medium |
AU2020104000A4 (en) * | 2020-12-10 | 2021-02-18 | Guangxi University | Short-term Load Forecasting Method Based on TCN and IPSO-LSSVM Combined Model |
CN113094198A (en) * | 2021-04-13 | 2021-07-09 | 中国工商银行股份有限公司 | Service fault positioning method and device based on machine learning and text classification |
CN113761250A (en) * | 2021-04-25 | 2021-12-07 | 腾讯科技(深圳)有限公司 | Model training method, merchant classification method and device |
CN113687182A (en) * | 2021-08-12 | 2021-11-23 | 湖南工学院 | Household load identification method, program and system based on noise reduction automatic encoder |
CN113935401A (en) * | 2021-09-18 | 2022-01-14 | 北京三快在线科技有限公司 | Article information processing method, article information processing device, article information processing server and storage medium |
CN113887833A (en) * | 2021-10-28 | 2022-01-04 | 西安热工研究院有限公司 | Distributed energy user side time-by-time load prediction method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | Improved deep hashing with soft pairwise similarity for multi-label image retrieval | |
CN112214685B (en) | Knowledge graph-based personalized recommendation method | |
CN111291836B (en) | Method for generating student network model | |
CN114511058B (en) | Load element construction method and device for electric power user portrait | |
CN111414461B (en) | Intelligent question-answering method and system fusing knowledge base and user modeling | |
CN111897913A (en) | Semantic tree enhancement based cross-modal retrieval method for searching video from complex text | |
CN101346718A (en) | Method for providing user of chosen content item | |
CN109033107A (en) | Image search method and device, computer equipment and storage medium | |
CN113177141B (en) | Multi-label video hash retrieval method and device based on semantic embedded soft similarity | |
CN113343125B (en) | Academic accurate recommendation-oriented heterogeneous scientific research information integration method and system | |
CN114693397B (en) | Attention neural network-based multi-view multi-mode commodity recommendation method | |
CN109272332B (en) | Client loss prediction method based on recurrent neural network | |
CN111737578A (en) | Recommendation method and system | |
CN111159485A (en) | Tail entity linking method, device, server and storage medium | |
CN112800344B (en) | Deep neural network-based movie recommendation method | |
CN112860930B (en) | Text-to-commodity image retrieval method based on hierarchical similarity learning | |
CN113569001A (en) | Text processing method and device, computer equipment and computer readable storage medium | |
CN114462420A (en) | False news detection method based on feature fusion model | |
CN115408603A (en) | Online question-answer community expert recommendation method based on multi-head self-attention mechanism | |
CN114781503A (en) | Click rate estimation method based on depth feature fusion | |
CN114004220A (en) | Text emotion reason identification method based on CPC-ANN | |
CN116244484B (en) | Federal cross-modal retrieval method and system for unbalanced data | |
CN116108836B (en) | Text emotion recognition method and device, computer equipment and readable storage medium | |
CN111859979A (en) | Ironic text collaborative recognition method, ironic text collaborative recognition device, ironic text collaborative recognition equipment and computer readable medium | |
CN115952360A (en) | Domain-adaptive cross-domain recommendation method and system based on user and article commonality modeling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||