CN113435938B - Distributed characteristic data selection method in electric power spot market


Info

Publication number: CN113435938B
Application number: CN202110763209.3A
Authority: CN (China)
Other versions: CN113435938A (Chinese)
Inventors: 李俊, 胡本然, 关心, 胡妤飞
Assignees: Mudanjiang University, State Grid Heilongjiang Electric Power Co., Ltd., Heilongjiang University
Legal status: Active (granted)

Classifications

    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The distributed feature data selection method in the power spot market solves the problem that, when data are selected in the prior art, the cost of the data and the accuracy of the learning model built from them cannot be considered simultaneously, and belongs to the field of power data analysis. The method comprises the following steps: the data buyer end determines a user-side learning model, a sample data set, and the type and quantity of the missing power data, forms a query, and sends it to the data seller end; the data seller end returns a corresponding given data set to the data buyer end. The data buyer end jointly optimizes the accuracy of the user-side learning model, the user payment, the edge-server task processing delay and the blockchain uploading delay, and establishes an objective function that maximizes the accuracy of the user-side learning model while minimizing payment and delay. The objective function is solved, feature data conforming to the objective function are selected from the given data set, the data seller end uploads the selected feature data to a blockchain, and the data buyer end pays for and acquires the data through the blockchain and adds them to the sample data set.

Description

Distributed characteristic data selection method in electric power spot market
Technical Field
The invention relates to a distributed characteristic data selection method in an electric power spot market, and belongs to the field of electric power data analysis.
Background
With the rapid development of information technology, the various edge devices in the energy internet have generated large amounts of power data describing meaningful information. Sophisticated techniques can be used to obtain business and financial insights from such power data, making power data resources an essential production element and strategic resource for human society. Knowledge discovery, which aims to extract basic knowledge from large amounts of power data, is a hot topic in both academia and industry because of the important financial and social value hidden in various power data.
However, for the large amounts of power data in the energy internet, financial knowledge discovery in the power spot market is very difficult. Data-driven decision making is popular in financial knowledge discovery and is changing scientific, business and social activities. Such an approach can provide knowledge quickly and accurately provided that the feature data in a given dataset are abundant; it will not provide convincing, valuable knowledge if the feature data are insufficient. Missing feature data, however, is a common phenomenon in financial knowledge discovery, since financial activities often involve complex features that are owned by different organizations. Although some feature data can be generated from historical data to complete a learning task, other feature data remain difficult to generate accurately, which makes acquiring features a major obstacle to learning financial knowledge. It is therefore necessary to purchase feature data from different feature data vendors. However, since budgets are typically limited, it is critical to design strategies that trade off the cost of feature data against the accuracy of the user-side learning model.
Determining the importance of features is also challenging for purchasers, yet it determines the strategy for obtaining the best learning-model performance under a budget limit. Power feature data selection is an effective technique to address this problem. Its purpose is to find the most appropriate subset of power features from the original feature set so that the learning model built on it is better and faster. In addition, feature data selection accelerates data processing and saves computation cost.
Power feature data selection has been widely studied in data mining and machine learning. Despite this extensive research, most existing feature selection studies do not take data cost into account, and collecting all of the training data can be very expensive. Furthermore, owing to the ever-increasing amount of data and number of data dimensions, a single machine faces bottlenecks in both storage and computation.
Disclosure of Invention
Aiming at the problem that a feature data subset found from the original power feature data set of the existing power spot market cannot simultaneously balance data cost and the accuracy of the learning model built from it, the invention provides a distributed feature data selection method in the power spot market.
The invention relates to a distributed feature data selection method in the electric power spot market, which is implemented based on a blockchain system, the system comprising a data seller end, a data buyer end, an edge computing server and a blockchain;
the method comprises the following steps:
S1, the data buyer end determines a user-side learning model, a sample data set, and the type and quantity of power data missing from the sample data set, forms a query according to the type and quantity of the missing power data, and sends the query to the data seller end; the data seller end returns a corresponding given data set to the data buyer end;
S2, the data buyer end jointly optimizes the accuracy of the user-side learning model, the user payment, the task processing delay and the blockchain uploading delay, with the goal of maximizing the accuracy of the user-side learning model while minimizing payment and delay, and establishes an objective function:
max_{x,λ} [ α·Acc(x) - β·φ_d(x) - ξ·max_a T_a - η·max_b T_b^BC ]

s.t. φ_d(x) ≤ budget

T_a ≤ T_a^max, a = 1, 2, …, N_E

T_b^BC ≤ T_BC^max, b = 1, 2, …, N_B

Num_Query ≤ Num_Query^max, Num_Buy ≤ Num_Buy^max

0 ≤ |x| ≤ Size

wherein φ_d(x) = φ_upd(x)·(1 + k·Acc(x)) + userpr; beha = Num_Buy/Num_Query; when beha ≤ 0.1, userpr ≥ 0; when beha > 0.1, userpr < 0; x and λ are the decision variables: x = (x_1, x_2, …, x_n), x_i ∈ {0,1}, i = 1, 2, …, n, where x_i represents the feature data of the i-th type of the given data set and n is the number of feature types in the given data set; λ = (λ_1, λ_2, …, λ_{N_E}), where λ_a is the ratio of the part of the a-th task of the data buyer end offloaded to the edge server to the total task, 0 ≤ λ_a ≤ 1; d_a is the size of the input data of the a-th task; λ_a·d_a is the computing task required on the edge computing node EN of the edge server; Acc(x) is the accuracy of the user-side learning model, 0 < Acc(x) < 1; α is the accuracy parameter of the user-side learning model; β is the payment parameter; ξ is the parameter of the data-buyer-end task processing delay; η is the parameter of the blockchain delay; φ_d(x) is the price; φ_upd(x) is the static price that does not consider the user-side learning model; userpr is the user-behavior reward/penalty variable, with pena ≤ userpr ≤ rewa, where pena is the lower limit and rewa the upper limit of the user-behavior variable; k is the price adjustment parameter; budget is the budget; T_a is the processing delay of the a-th task and T_a^max the maximum limit of the task processing delay; T_a^local is the local computation time of the local task (1 - λ_a)·d_a; T_a^trans is the transmission time for offloading the computing task λ_a·d_a from the user data end U_a to the edge computing node EN of the edge server; T_a^EN is the computation time of the wirelessly offloaded computing task λ_a·d_a on the edge computing node EN; a = 1, 2, …, N_E, where N_E is the number of tasks; T_b^BC is the time for the b-th transaction between the data buyer end and the data seller end to be completed, b = 1, 2, …, N_B, where N_B is the number of blocks in the blockchain; T_BC^max is the maximum limit of the transaction completion time; T_b^p is the packing time of the b-th block of the blockchain; T_b^C is the consensus time of the b-th block; T_b^s is the commit time of the b-th block; Num_Query is the number of queries and Num_Buy the number of purchases; Num_Query^max is the maximum limit of the number of queries and Num_Buy^max the maximum limit of the number of purchases; Size is the maximum limit of the amount of data that a user queries and purchases in real time;
S3, the objective function is solved, feature data conforming to the objective function are selected from the given data set, the data seller end uploads the selected feature data to the blockchain, and the data buyer end pays for and acquires the data through the blockchain and adds them to the sample data set.
Preferably,

φ_upd(x) = Δ·σ(Q, D)·Σ_{t_j ∈ T_now} p(t_j)

p(t_j) = p_total·(ζ·h(t_j)/h + ϑ·θ_j/Σ_{j'=1}^m θ_{j'})

D represents the given data set, comprising m tuples, each tuple having n types of features; the query issued by the data buyer end is Q, and t_j is a tuple of the result of query Q on D, j = 1, 2, …, m; D contains tables T_1, T_2, …, T_tn, where tn is the number of sub-tables in D; L_{T_i}(t, D) denotes the set of lineage tuples of query Q over table T_i of D; UL_{T_j}(t, D) denotes the set of uncertain lineage tuples of query Q over table T_j of D; τ(t_j) denotes the data quality of tuple t_j and σ(Q, D) the quality of query Q; sen (0 < sen < 1) denotes the sensitivity of the user to quality; Δ is the price coefficient controlling the user price range; T_now is the set of lineage tuples of the currently non-purchased data; p_total is the overall price of the given data set; ζ is the coefficient of the information entropy and ϑ the coefficient of the integrity rate, with ζ + ϑ = 1; the integrity of the j-th tuple is θ_j = (1/n)·Σ_{i=1}^n index_ij, where index_ij = 1 indicates that the element in row i and column j of the n-row, m-column feature data of the given data set is present, and index_ij = 0 otherwise; h is the information entropy of the given data set and h(t_j) the information entropy of the j-th tuple; W = (w_1, w_2, …, w_n), where w_j is the weight vector of the j-th tuple of feature data; w_min and w_max are the minimum and maximum of the weight vector, used to normalize the tuple weights.
Preferably, when α ≠ 1, β ≠ 0, ξ ≠ 0, η ≠ 0, step S3, when solving the objective function, calculates the SU value of each feature datum in the given data set with the SUFS algorithm, uses the SU values as the weight vector of the tuple feature data, and solves the objective function by means of solver programming.
Preferably, α = 1, β = 0, ξ = 0, η = 0;

the SUFS algorithm is adopted to search for a set of main features S_best among the n features of the given data set; the SU value of each feature datum in S_best is calculated, and based on the threshold δ the feature data S'_best are selected from the main feature set S_best; the feature data in S'_best are sorted in descending order of their SU values and redundant feature data are deleted; the pruned S'_best constitutes the feature data, conforming to the objective function, that the data buyer end selects from the given data set;

SU(X, Y) = 2·IG(X|Y) / (H(X) + H(Y))

H(X|Y) = -Σ_q p(y_q)·Σ_p p(x_p|y_q)·log2 p(x_p|y_q)

IG(X|Y) = H(X) - H(X|Y)

where SU(X, Y) is the SU value; H(X|Y) is the conditional entropy and IG(X|Y) the information gain; X is the random event of selecting one type of feature data and Y the random event of selecting another type of feature data; H(X) and H(Y) are the information entropies of events X and Y; p(y_q) is the probability that random event Y takes the value y_q; p(x_p|y_q) is the conditional probability that random event X takes the value x_p given that Y takes the value y_q; x_p is one of the classes of X and y_q one of the classes of Y.
As a preferred alternative,

w_j = 10·SU_{j,C} + n - rank_j

where SU_{j,C} is the SU value between the j-th feature and the class label C, and rank_j is the rank of the j-th feature in the descending SU ordering.
preferably, in the step S1, the method for determining the sample data set at the data buyer side includes:
data buyer side owns local data set D own
Data buyer end-to-local data set D own Repairing to obtain data set
Figure BDA0003149768890000047
Data buyer end-to-local data set D own Predicting to obtain data set
Figure BDA0003149768890000051
In dataset D own
Figure BDA0003149768890000052
And->
Figure BDA0003149768890000053
And training the user side learning model, determining the type and the data quantity of the power data which are lack in the sample data set when the accuracy of the user side learning model is lower than the required accuracy, forming a query according to the type and the data quantity of the power data which are lack, and sending the query to the data seller terminal, and returning the corresponding given data set to the data buyer terminal by the data seller terminal.
Preferably, the data buyer end pairs the local data set D own Repairing to obtain data set
Figure BDA0003149768890000054
Or the data buyer end to the local data set D own Prediction is carried out to obtain a data set +.>
Figure BDA0003149768890000055
The method of (1) comprises:
segmentation dataset D own Obtaining a training data set D train And test dataset D test Establishing a deep learning model,
with training dataset D train For deep learning modelTraining the model, outputting the parameters of the trained deep learning model and the loss value of each iteration, and using the test data set D test Predicting to obtain an error value of the deep learning model prediction, adjusting parameters of the deep learning model, and repairing or predicting by using the deep learning model;
the deep learning model comprises two bidirectional LSTM layers, a multi-head attention layer, a maximum pooling layer, an average pooling layer and two fully connected layers, and a training data set D train The input data of the two-way LSTM layers are input to the two-way LSTM layers at the same time, the output of the two-way LSTM layers is input to the multi-head attention layer, the output of the multi-head attention layer is input to the maximum pooling layer and the average pooling layer at the same time, the output of the maximum pooling layer and the average pooling layer is input to one full-connection layer at the same time, the output of the full-connection layer is input to the other full-connection layer, and the output of the full-connection layer is output through the output layer.
The invention provides a blockchain-based framework for purchasing features under a limited budget, with computation performed by edge servers in the energy internet; the buying and selling parties can upload transaction data and transaction information to a consortium blockchain, realizing secure data transactions and data sharing. The objective function established by the invention jointly optimizes the accuracy of the user-side learning model, the user payment, the task processing delay and the blockchain uploading delay, selects data from a given data set of the data seller end, and maximizes the accuracy of the user-side learning model while minimizing payment and delay.
Drawings
FIG. 1 is a schematic diagram of a framework constructed in accordance with the present invention;
FIG. 2 is a diagram illustrating the communication between a data seller and a data buyer;
FIG. 3 is a schematic diagram of an edge server;
FIG. 4 is a schematic diagram of a blockchain;
FIG. 5 is a schematic diagram of an atten-LSTM predictive model;
FIG. 6 is a comparison of the predictive effect of an atten-LSTM predictive model with real data;
FIG. 7 is a graph showing loss values for an atten-LSTM prediction model;
FIG. 8 is a graph comparing the prediction effects of LSTM, GRU and the atten-LSTM prediction model of the present invention against real data.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention is further described below with reference to the drawings and specific examples, which are not intended to be limiting.
The distributed feature data selection method in the electric power spot market of the embodiment is implemented by using a blockchain-based system, wherein the system comprises a data seller terminal, a data buyer terminal, an edge server and a blockchain;
1) Data seller ends. They own the data and determine the data pricing mechanism. When a feature query request is received from the data buyer end, they provide a static price for the feature. The data seller end can also dynamically adjust the price of a feature, in units of feature sets, according to the model accuracy given by the data buyer end. They offer appropriate prices for the features they own so as to maximize their revenue.
2) Data buyer ends. For some learning tasks, the data buyer end can use a predictive model to generate new data from existing historical data. However, for features that are difficult to generate, a query request for such features must be sent to the data seller end. The data buyer end chooses which features to purchase based on a limited budget and the accuracy of the model.
3) Edge servers. There are servers in the edge computing network with sufficient computing power to run time-consuming algorithms, such as reinforcement learning or deep learning algorithms. The data seller end or the data buyer end may offload some computing tasks to the edge servers.
4) Blockchain. To secure the data, the selected feature data are uploaded to the blockchain for sharing between seller and buyer. Both the data seller end and the data buyer end are members of the permissioned blockchain.
The distributed feature data selection method in the electric power spot market of this embodiment comprises the following steps:
Step one, the data buyer end determines a user-side learning model, a sample data set, and the type and quantity of power data missing from the sample data set, forms a query according to the type and quantity of the missing power data, and sends the query to the data seller end; the data seller end returns a corresponding given data set to the data buyer end.
Step two, the data buyer end jointly optimizes the accuracy of the user-side learning model, the user payment, the task processing delay and the blockchain uploading delay, with the goal of maximizing the accuracy of the user-side learning model while minimizing payment and delay, and establishes an objective function:
max_{x,λ} [ α·Acc(x) - β·φ_d(x) - ξ·max_a T_a - η·max_b T_b^BC ]

s.t. φ_d(x) ≤ budget

T_a ≤ T_a^max, a = 1, 2, …, N_E

T_b^BC ≤ T_BC^max, b = 1, 2, …, N_B

Num_Query ≤ Num_Query^max, Num_Buy ≤ Num_Buy^max

0 ≤ |x| ≤ Size

wherein φ_d(x) = φ_upd(x)·(1 + k·Acc(x)) + userpr; beha = Num_Buy/Num_Query; when beha ≤ 0.1, userpr ≥ 0; when beha > 0.1, userpr < 0; x and λ are the decision variables: x = (x_1, x_2, …, x_n), x_i ∈ {0,1}, i = 1, 2, …, n, where x_i represents the feature data of the i-th type of the given data set and n is the number of feature types in the given data set; λ = (λ_1, λ_2, …, λ_{N_E}), where λ_a is the ratio of the part of the a-th task of the data buyer end offloaded to the edge server to the total task, 0 ≤ λ_a ≤ 1; d_a is the size of the input data of the a-th task; λ_a·d_a is the computing task required on the edge computing node EN of the edge server; Acc(x) is the accuracy of the user-side learning model, 0 < Acc(x) < 1; α is the accuracy parameter of the user-side learning model; β is the payment parameter; ξ is the parameter of the data-buyer-end task processing delay; η is the parameter of the blockchain delay; φ_d(x) is the price; φ_upd(x) is the static price that does not consider the user-side learning model; userpr is the user-behavior reward/penalty variable, with pena ≤ userpr ≤ rewa, where pena is the lower limit and rewa the upper limit of the user-behavior variable; k is the price adjustment parameter; budget is the budget; T_a is the processing delay of the a-th task and T_a^max the maximum limit of the task processing delay; T_a^local is the local computation time of the local task (1 - λ_a)·d_a; T_a^trans is the transmission time for offloading the computing task λ_a·d_a from the user data end U_a to the edge computing node EN of the edge server; T_a^EN is the computation time of the wirelessly offloaded computing task λ_a·d_a on the edge computing node EN; a = 1, 2, …, N_E, where N_E is the number of tasks; T_b^BC is the time for the b-th transaction between the data buyer end and the data seller end to be completed, b = 1, 2, …, N_B, where N_B is the number of blocks in the blockchain; T_BC^max is the maximum limit of the transaction completion time; T_b^p is the packing time of the b-th block of the blockchain; T_b^C is the consensus time of the b-th block; T_b^s is the commit time of the b-th block; Num_Query is the number of queries and Num_Buy the number of purchases; Num_Query^max is the maximum limit of the number of queries and Num_Buy^max the maximum limit of the number of purchases; Size is the maximum limit of the amount of data that a user queries and purchases in real time;
Step three, the objective function is solved, feature data conforming to the objective function are selected from the given data set, the data seller end uploads the selected feature data to the blockchain, and the data buyer end pays for and acquires the data through the blockchain and adds them to the sample data set.
The goal of the problem in this embodiment is to maximize the buyer's benefit, i.e. to maximize accuracy and minimize payment. The decision variables are {x, λ}; the weights α, β, ξ and η can be set and adjusted through experiments.
Step two of this embodiment includes the following constraints on data pricing:
To maximize the benefit of the data seller end, pricing comprises static pricing, based on horizontal and vertical pricing, and dynamic pricing, based on data content and query counts. The static part takes into account data incompleteness, query incompleteness, repeated charges for historical queries, and data updates; horizontal pricing takes the tuple as the minimum pricing unit, while vertical pricing takes the feature as the minimum pricing unit. The dynamic part considers factors such as user behavior and user-side model accuracy.
Consider a data analyst (the data buyer end) of a power plant performing a knowledge discovery task who wants to study power trading trends in different regions. The data buyer end has only partially incomplete power transaction history data; data sets with the needed information exist partly in its own power plant and partly in other power plants, and purchasing all of them would exceed the budget. By purchasing as little data as possible, the data buyer end can learn the trend of the historical data and use local historical data to generate new data for its tasks. Because some data are difficult to generate, the data buyer end can issue several queries on the data sets owned by the data sellers and purchase at an appropriate price. In this way, the data buyer end does not pay a high price for all the data sets used in the task.
As shown in Table 1, the sample data set has three relations: power plants (G), prices (P) and consumers (C). In particular, the generation data and price data in the sample data set are incomplete owing to privacy protection or equipment failure.

TABLE 1 (table image not reproduced: sample data set over relations G, P and C with missing values)
1) Static pricing:
The static pricing component of this embodiment is determined by the information entropy and the integrity of the data. For horizontal pricing, an incomplete sample data set D_{m×n} is given, with m tuples, each having n features, and an overall price p_total. Because of different task preferences, the tuple weight vector is W = (w_1, w_2, …, w_m), where w_j (1 ≤ j ≤ m) is the weight of the j-th tuple. The integrity of the j-th tuple t_j is defined as follows:

θ_j = (1/n)·Σ_{i=1}^n index_ij    (1)

where index_ij = 1 indicates that the element in row i and column j is present, and index_ij = 0 otherwise.
Entropy is a measure of the uncertainty of a random variable, and information entropy is often used as a quantitative index of information content. Let A be a discrete random variable; its entropy is defined as

h(A) = -Σ_a p(a)·log2 p(a)    (2)

The price of the j-th tuple t_j is then

p(t_j) = p_total·(ζ·h(t_j)/h + ϑ·θ_j/Σ_{j'=1}^m θ_{j'})    (3)

ζ + ϑ = 1    (4)

where h(t_j) is the information entropy of the j-th tuple and h the information entropy of the whole data set; ζ and ϑ are the coefficients of the information entropy and the integrity rate, respectively, satisfying the constraint ζ + ϑ = 1. The basic idea of the static price is to scale the overall price according to the integrity of the tuples and the amount of information they carry.
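As an illustration only, the following Python sketch computes the static horizontal prices of equations (1)-(3) for a small incomplete data set; the helper names (tuple_integrity, static_prices) and the toy data are hypothetical, and treating the whole-set entropy h as the sum of per-tuple entropies is a simplifying assumption.

```python
import math
from collections import Counter

def tuple_integrity(row):
    """Equation (1): fraction of non-missing cells in a tuple."""
    return sum(v is not None for v in row) / len(row)

def entropy(values):
    """Equation (2): Shannon entropy of the observed (non-missing) values."""
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def static_prices(dataset, p_total, zeta=0.5):
    """Equation (3): split p_total over tuples by entropy and integrity.

    zeta weights information entropy; (1 - zeta) weights integrity,
    mirroring the constraint zeta + vartheta = 1.
    """
    h_all = sum(entropy(row) for row in dataset)  # assumption: h additive over tuples
    thetas = [tuple_integrity(row) for row in dataset]
    theta_sum = sum(thetas)
    return [p_total * (zeta * entropy(row) / h_all + (1 - zeta) * th / theta_sum)
            for row, th in zip(dataset, thetas)]

# Toy incomplete data set: 3 tuples, 4 features, None marks a missing value.
data = [[1, 2, 3, None], [4, 4, 4, 4], [5, None, None, 1]]
print(static_prices(data, p_total=100.0))
```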
Initially, the data buyer end issues a query Q_1 to the data set D = (G, C, P), investigating the power load and total annual load data of city A for December 2020, as shown in Table 2. The data buyer end then issues a query Q_2 investigating the electricity price of market A. When the data buyer end issues query Q_3, the GID of tuple p_3 is missing, as shown in Table 1, which makes the query result inaccurate. For a data seller to profit from the data buyer end, important data such as features can be sold; a key issue is how to price the data quickly and reasonably.

TABLE 2 SQL statements (table image not reproduced)

TABLE 3 (table image not reproduced)
First, this embodiment uses the concept of data lineage to determine which data are used. Given a tuple t, the exact underlying data that yield t are called its lineage. In other words, for each tuple t appearing in the output of a query, the group of input tuples that produce it is called the lineage of t. Intuitively, the lineage of t collects all of the input data that "contribute to" or help "produce" t, as described in Definition 1.
Definition 1 (tuple lineage). Given a data set D with tables T_1, T_2, …, T_tn and a query Q, let Q(D) = Q(T_1, T_2, …, T_tn) be the result set of query Q over the tables T_1, T_2, …, T_tn. For a tuple t ∈ Q(D), the lineage set of t, written L(t ∈ Q(D), D) (abbreviated L(t, D)), is defined by equation (5):

L(t, D) = ⟨L_{T_1}(t, D), L_{T_2}(t, D), …, L_{T_tn}(t, D)⟩    (5)

Equation (5) is a vector form of the lineage set of t; each element L_{T_j}(t, D) consists of tuples from T_j. For j = 1, …, tn, L_{T_j}(t, D) is the lineage of t in T_j, i.e. the part of T_j that helps generate the result tuple t. Formally, L_{T_j}(t, D) is a subset of T_j:

L_{T_j}(t, D) ⊆ T_j, j = 1, 2, …, tn    (6)

Then the lineage set of a query result set Q(D), denoted M(Q, D), is the union of the lineage sets L(t, D) of each result tuple t ∈ Q(D), i.e. M(Q, D) = ∪_{t∈Q(D)} L(t, D). The data usage of query Q is thus evaluated by M(Q, D).
When the data are incomplete the query result is ambiguous, and the root cause of an ambiguous query result is a missing key. For a tuple t_j, the parameter miss_j indicates the degree of key absence, i.e. the number of missing keys of t_j. The quality τ(t_j) of tuple t_j is then expressed as

τ(t_j) = sen^{miss_j}    (9)

where sen (0 < sen < 1) represents the user's sensitivity to quality; the parameter sen can be adjusted dynamically based on the historical purchases of the data consumer. To some extent it controls how fast the tuple quality degrades: the smaller the value of sen, the faster the quality value changes (i.e. the more sensitive τ(t_j) is to miss_j).
Given an incomplete data set D and a query Q, the quality σ(Q, D) of the query Q is expressed as

σ(Q, D) = (1/n)·Σ_{t_j ∈ Q(D)} τ(t_j)    (10)

The price function of a query (Q, D) on an incomplete data set is accordingly defined as

φ(Q, D) = Δ·σ(Q, D)·Σ_{t ∈ M(Q,D)} p(t)    (11)

where n is the number of result tuples, i.e. n = |Q(D)|, Δ is the price coefficient used to control the user price range, and M(Q, D) is the set of lineage tuples of the query result Q(D).
After the data buyer end purchases a query, it may issue a new query whose result overlaps with data already purchased. The pricing mechanism should therefore take historical queries into account and prevent the buyer from paying excessive fees. When the information of t has not been updated, the buyer can reuse the repeated lineage tuples for free, so pricing becomes

φ_upd(Q, D) = Δ·σ(Q, D)·Σ_{t ∈ T_now} p(t)    (12)

T_now = M(Q, D) - T_buy    (13)

where T_buy is the set of lineage tuples that have already been purchased; this function avoids repeated charges for historical queries.
When the data are updated, part of the data in T_buy changes, so an additional attribute ver is added to t to represent the version number of the tuple. The version number is initialized to 0 and incremented each time the tuple is updated. The system retains only the latest version of each tuple, and pricing uses

T_now = M(Q, D) - T_buy + T_upd    (14)

where T_upd is the set of already purchased tuples whose ver has changed.
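A minimal sketch, assuming the reconstructed forms of equations (12)-(14) above: it charges only for lineage tuples that are new or whose version number has changed. The set-based bookkeeping (purchased, updated) and the per-tuple price table are illustrative assumptions.

```python
def query_price_upd(lineage, purchased, updated, tuple_price, delta, sigma):
    """Equations (12)-(14): price a query over an incomplete data set,
    charging nothing for lineage tuples already bought and unchanged.

    lineage   -- set of lineage tuple ids M(Q, D) of the current query
    purchased -- set of tuple ids bought before (T_buy)
    updated   -- subset of purchased ids whose version ver changed (T_upd)
    """
    t_now = (lineage - purchased) | (lineage & updated)  # T_now = M - T_buy + T_upd
    return delta * sigma * sum(tuple_price[t] for t in t_now)

prices = {"t1": 3.0, "t2": 5.0, "t3": 2.0}
print(query_price_upd({"t1", "t2", "t3"}, purchased={"t2"}, updated=set(),
                      tuple_price=prices, delta=1.0, sigma=0.9))  # t2 is free
```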
For vertical pricing, similarly to equation (3), the price of the i-th feature f_i is defined as follows:

p(f_i) = p_total·(μ·h(f_i)/h + η·θ_i^f/Σ_{i'=1}^n θ_{i'}^f)    (15)

θ_i^f = (m - mnum_i)/m    (16)

μ + η = 1

where mnum_i is the number of missing values of the i-th feature, θ_i^f its integrity and h(f_i) its information entropy.
2) Dynamic pricing: user behavior indicators are considered in the pricing method. The data buyer end issues a query to the data seller end; the data seller end returns the query price φ(Q, D), and after receiving the offer the data buyer end chooses to pay and purchase the data or to abandon the purchase. Let the number of queries of the data buyer end be Num_Query and the number of purchases be Num_Buy; beha = Num_Buy/Num_Query is the user behavior index (0 ≤ beha ≤ 1). When beha ≤ 0.1, the data buyer end queries often but rarely purchases; if the data set is large, the query delay is high, the computation is complex, and system resources are wasted. This embodiment therefore adds a user-behavior reward/penalty term userpr: when beha ≤ 0.1, userpr ≥ 0, meaning the user must pay a higher fee; otherwise the user pays a lower fee. The pricing function is

φ_user(Q, D) = φ_upd(Q, D) + userpr    (19)

where pena ≤ userpr ≤ rewa; when beha ≤ 0.1, userpr ∈ [0, rewa], and when beha > 0.1, userpr ∈ [pena, 0)    (20)

In addition, this embodiment also considers the impact of user-side model accuracy on data pricing. The data buyer end wants to investigate power trading in different regions in 2020 and to test the ability of the generated energy of some new energy sources to predict power exchange. Initially, having no model, the data buyer end needs to send a query Q_1 to the data seller end, buy data and train a model; after training, the accuracy of the model stabilizes, and a query Q_2 is sent to the data seller end to fine-tune the model. Because the two queries affect the model accuracy differently, their prices also differ; in this embodiment the price of query Q_1 should be greater than that of Q_2. Therefore

φ_d(Q, D) = φ_upd(Q, D)·(1 + k·Acc(Q, D)) + userpr    (21)

Let x = (Q, D), i.e.

φ_d(x) = φ_upd(x)·(1 + k·Acc(x)) + userpr    (22)

where Acc(x) is the model accuracy, 0 < Acc(x) < 1. The formula applies equally to vertical pricing.
For example, the price adjustment parameter k can be set piecewise in the accuracy (23), so that when the accuracy is less than 0.7 the price is higher than the static price, when the accuracy is greater than 0.85 the price is lower than the static price, and when the accuracy is between 0.7 and 0.85 the price is slightly higher than the static price. The price is adjusted in units of feature sets.
At the same time, after the data buyer end purchases data once, the weights w of the features change, and the price of the data therefore changes as well.
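The following sketch ties equations (19)-(22) together; the piecewise values chosen for k and the bounds pena/rewa are illustrative assumptions consistent with the example above, not values fixed by the patent.

```python
def price_adjustment_k(acc):
    """Illustrative piecewise k: dearer below 0.7 accuracy, cheaper above 0.85."""
    if acc < 0.7:
        return 0.5       # assumption: strong markup while the model is poor
    if acc <= 0.85:
        return 0.1       # assumption: slight markup in the stable range
    return -0.3          # assumption: discount once the model is accurate

def dynamic_price(static_price, acc, num_buy, num_query, pena=-1.0, rewa=1.0):
    """Equation (22): phi_d = phi_upd * (1 + k * Acc) + userpr."""
    beha = num_buy / num_query if num_query else 0.0
    userpr = rewa if beha <= 0.1 else pena  # penalty/reward within [pena, rewa]
    return static_price * (1 + price_adjustment_k(acc) * acc) + userpr

print(dynamic_price(static_price=15.2, acc=0.83, num_buy=1, num_query=5))
```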
This embodiment's constraint on the edge-server computation delay:
Each data end (data buyer ends and data seller ends alike) has a computing task: the computing task of the data seller end is pricing the data, and that of the buyer is knowledge discovery. A computing task can be described as Task_a(d_a, s_a), a ∈ N_E, where d_a is the size of the input data and s_a the computing resource required by Task_a. Each task can be divided into two parts, one computed locally and one computed on the edge computing node EN. Specifically, λ_a (0 ≤ λ_a ≤ 1) is the offloading-ratio variable, i.e. the ratio of the offloaded task to the total task. The user end U_a offloads λ_a·d_a of the data to EN and locally computes the remaining (1 - λ_a)·d_a of the data.
The task processing delay can accordingly be divided into two parts. The first part is the transmission time of the task from node U_a to EN over the wireless channel. The second part is the computation time, which depends on the allocated computing resources and the task size.
1) Transmission time: the transmission time for offloading λ_a·d_a from U_a to EN can be written as

T_a^trans = λ_a·d_a / R_a    (24)

where R_a is the throughput. The throughput represents the amount of data that enters and passes through a system in a time slot, and can also be expressed as a data rate.
2) Computation time: for each Task_a, U_a can execute the (1 - λ_a)·d_a part of Task_a locally on its own computing resources and offload the rest to the edge computing node EN for processing. Let f_a denote the computing capability of U_a, which varies between users and can be obtained by offline measurement. The local computation time T_a^local can be expressed as

T_a^local = (1 - λ_a)·s_a / f_a    (25)

For computation on the edge computing node, U_a offloads λ_a·d_a to EN over the wireless access. The computation time on EN is

T_a^EN = λ_a·s_a / f_EN    (26)

where f_EN is the computing capability allocated to the task on EN.
3) Task processing delay: since each task can be split into two parts executed in parallel, locally and on EN, the task processing delay is determined by the larger of the two parallel parts. If the delay is determined by the locally executed part, it equals the local computation time, because the local part undergoes no wireless transmission. If the delay is determined by the part executed on EN, it comprises two parts: 1) the transmission time, and 2) the computation time on EN. Thus the task processing delay is

T_a = max{T_a^local, T_a^trans + T_a^EN}    (27)
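A small numeric sketch of equations (24)-(27) under the reconstructed forms above; the capability and throughput figures (f_a, f_EN, R_a) and the task sizes are made-up illustration values.

```python
def task_delay(d_a, s_a, lam, r_a, f_a, f_en):
    """Equation (27): delay is the max of the local part and the offloaded part.

    d_a -- input data size (bits),     s_a  -- required compute (cycles)
    lam -- offloading ratio lambda_a,  r_a  -- wireless throughput (bits/s)
    f_a -- local capability (cycles/s), f_en -- capability on EN (cycles/s)
    """
    t_local = (1 - lam) * s_a / f_a          # equation (25)
    t_trans = lam * d_a / r_a                # equation (24)
    t_en = lam * s_a / f_en                  # equation (26)
    return max(t_local, t_trans + t_en)

# Sweep the offloading ratio to find the best split for one task.
best = min((task_delay(d_a=2e6, s_a=1e9, lam=l / 10, r_a=1e7, f_a=1e9, f_en=5e9), l / 10)
           for l in range(11))
print(f"min delay {best[0]:.3f}s at lambda={best[1]}")
```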
This embodiment's constraint on the blockchain delay:
Blockchain transaction time: the block packing time of the blockchain is T^p, which is related to the data content of the transaction; the consensus time of a block is T^C, and the commit time of a block is T^s, both related to the block size and the consensus mechanism. The time for a transaction to be completed is

T^BC = T^p + T^C + T^s    (28)
This embodiment formulates the feature selection problem as follows:
A solution of the feature selection problem is represented by a binary coded vector x, described as

x = (x_1, x_2, …, x_n), x_i ∈ {0,1}, i = 1, 2, …, n    (29)

where x_i = 1 indicates that the i-th feature is selected and x_i = 0 that it is not. Combining the pricing, delay and accuracy models above yields the joint optimization problem of step two (equations (30)-(39)).
In this embodiment, when α ≠ 1, β ≠ 0, ξ ≠ 0, η ≠ 0, step S3 calculates the SU value of each feature datum in the given data set with the SUFS algorithm when solving the objective function, uses the SU values as the weight vector of the tuple feature data, and solves the objective function by means of solver programming.
In this embodiment, feature selection uses the symmetric uncertainty as the measurement index. Two aspects are involved: (1) how to determine whether a feature is relevant to the label; (2) how to determine whether such a relevant feature is redundant when other relevant features are considered.
When α = 1, β = 0, ξ = 0, η = 0: the SUFS algorithm is adopted to search for a set of main features S_best among the n features of the given data set; the SU value of each feature datum in S_best is calculated, and based on the threshold δ the feature data S'_best are selected from the main feature set S_best; the feature data in S'_best are sorted in descending order of their SU values and redundant feature data are deleted; the pruned S'_best constitutes the feature data, conforming to the objective function, that the data buyer end selects from the given data set;

SU(X, Y) = 2·IG(X|Y) / (H(X) + H(Y))

H(X|Y) = -Σ_q p(y_q)·Σ_p p(x_p|y_q)·log2 p(x_p|y_q)

IG(X|Y) = H(X) - H(X|Y)

where SU(X, Y) is the SU value; H(X|Y) is the conditional entropy and IG(X|Y) the information gain; X is the random event of selecting one type of feature data and Y the random event of selecting another type of feature data; H(X) and H(Y) are the information entropies of events X and Y; p(y_q) is the probability that random event Y takes the value y_q; p(x_p|y_q) is the conditional probability that random event X takes the value x_p given that Y takes the value y_q; x_p is one of the classes of X and y_q one of the classes of Y.
The information gain is symmetric for two random variables X and Y, and symmetry is a desirable property for measuring correlation between features. However, the information gain is biased in favor of features with more values, and the values must be normalized to ensure that they are comparable and have the same effect; the symmetric uncertainty SU provides this normalization.
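A compact sketch of the symmetric uncertainty measure defined above, computed from paired empirical feature/label samples; the discretization into plain Python lists is an illustrative choice.

```python
import math
from collections import Counter

def entropy(xs):
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def conditional_entropy(xs, ys):
    """H(X|Y) over paired samples."""
    n = len(xs)
    by_y = {}
    for x, y in zip(xs, ys):
        by_y.setdefault(y, []).append(x)
    return sum(len(g) / n * entropy(g) for g in by_y.values())

def symmetric_uncertainty(xs, ys):
    """SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)), normalized to [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    ig = hx - conditional_entropy(xs, ys)   # IG(X|Y) = H(X) - H(X|Y)
    return 2 * ig / (hx + hy) if hx + hy else 0.0

feature = [0, 0, 1, 1, 2, 2]
label   = [0, 0, 1, 1, 1, 1]
print(symmetric_uncertainty(feature, label))
```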
Step one of this embodiment further includes recovery of historical data, generation of historical data, and feature selection. This embodiment performs data repair and generation based on an attention mechanism and uses attention-based data prediction. The data buyer end has a local data set with missing values, which is time-series data, as shown in Table 3. To better discover important information from the existing data, the data buyer end can repair the missing values and generate new data from the existing data. Time-series prediction is an effective method for such data repair and generation.
Self-attention, also known as internal attention, is an attention mechanism that relates different positions of a single sequence in order to compute a representation of the sequence. Attention mechanisms have become an integral part of sequence modelling tasks such as reading comprehension, textual entailment and sequence prediction. They allow modelling of dependencies in an input or output sequence regardless of the distance between them. This embodiment combines a self-attention mechanism with a recurrent neural network.
in the first step, the method for determining the sample data set by the data buyer side comprises the following steps:
data buyer side owns local data set D own
Data buyer end-to-local data set D own Repairing to obtain data set
Figure BDA0003149768890000161
Data buyer end-to-local data set D own Predicting to obtain data set
Figure BDA0003149768890000162
In dataset D own
Figure BDA0003149768890000163
And->
Figure BDA0003149768890000164
When the accuracy of the user side learning model is lower than the required accuracy, determining the type and the data quantity of the power data lacking in the sample data set, forming a query according to the type and the data quantity of the power data lacking, and sending the query to a data seller terminal, and returning the corresponding given data set to the data buyer terminal by the data seller terminal.
Data buyer end-to-local data set D own Repairing to obtain data set
Figure BDA0003149768890000165
Or the data buyer end to the local data set D own Prediction is carried out to obtain a data set +.>
Figure BDA0003149768890000166
The method of (1) comprises:
segmentation dataset D own Obtaining a training data set D train And test dataset D test Establishing a deep learning model,
with training dataset D train Training the deep learning model, outputting the parameters of the trained deep learning model and the loss value of each iteration, and using the test data set D test Predicting to obtain an error value of the deep learning model prediction, adjusting parameters of the deep learning model, and repairing or predicting by using the deep learning model;
the deep learning model comprises two bidirectional LSTM layers, a multi-head attention layer, a maximum pooling layer, an average pooling layer and two fully connected layers, and a training data set D train The input data of the two-way LSTM layers are input to the two-way LSTM layers at the same time, the output of the two-way LSTM layers is input to the multi-head attention layer, the output of the multi-head attention layer is input to the maximum pooling layer and the average pooling layer at the same time, the output of the maximum pooling layer and the average pooling layer is input to one full-connection layer at the same time, the output of the full-connection layer is input to the other full-connection layer, and the output of the full-connection layer is output through the output layer.
The idea of the attention mechanism, as its name implies, is to pay attention to different features when predicting the result. The self-attention function can be described as mapping a query and a set of key-value pairs to an output, where the query, keys, values and output are all vectors. The output is computed as a weighted sum of the values, where the weight assigned to each value is computed by a compatibility function of the query with the corresponding key. For example, when a word is encoded, the representations (value vectors) of all words are weighted and summed; the weights are obtained from the dot product of each word representation (key vector) with the representation of the word being encoded (query vector), passed through a softmax.
For scaled dot-product attention, the input consists of queries and keys of dimension d_k and values of dimension d_v. In practice, this embodiment computes the attention function on a set of queries packed into a matrix Q, with the keys and values packed into matrices K and V respectively. The output matrix is:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V    (40)
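A minimal NumPy sketch of equation (40); the matrix shapes in the example are arbitrary illustration values.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Equation (40): softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                  # compatibility of queries with keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(5, 64)), rng.normal(size=(7, 64)), rng.normal(size=(7, 16))
print(scaled_dot_product_attention(q, k, v).shape)   # (5, 16)
```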
the multi-headed attention mechanism expands the ability of the model to focus on different locations. d, d model The key, value and attention function of the query of the dimension are projected linearly h times. After the attention function is executed in parallel, the results are connected and projected to obtain a final value.
Figure BDA0003149768890000172
Wherein the projection is a parameter matrix
Figure BDA0003149768890000173
And->
Figure BDA0003149768890000174
This embodiment employs h=3 parallel attention headers. For each of these, d is used k =d v =64,d m o del =300. In the atten-LSTM predictive algorithm model, the bi-directional LSTM layer is located before the multi-headed attention layer, as shown in FIG. 5.
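As an illustration of the architecture of FIG. 5, here is a Keras sketch of an atten-LSTM model using the layer types and sizes named in this description (two bidirectional LSTM layers of 128 units, 3-head attention with key dimension 64, parallel max/average pooling, two fully connected layers); the exact wiring, input shape and output size are assumptions, not the patent's reference implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_atten_lstm(timesteps=30, features=1):
    inp = layers.Input(shape=(timesteps, features))
    # Two bidirectional LSTM layers (128 units each), stacked here as one plausible wiring.
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(inp)
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
    # Multi-head self-attention: h = 3 heads, d_k = 64.
    x = layers.MultiHeadAttention(num_heads=3, key_dim=64)(x, x)
    # Max pooling and average pooling applied in parallel, then merged.
    pooled = layers.Concatenate()([layers.GlobalMaxPooling1D()(x),
                                   layers.GlobalAveragePooling1D()(x)])
    x = layers.Dense(64, activation="relu")(pooled)   # first fully connected layer
    x = layers.Dense(16, activation="relu")(x)        # second fully connected layer
    out = layers.Dense(1)(x)                          # output layer: next-step prediction
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")       # MSE loss, as in the experiments
    return model

build_atten_lstm().summary()
```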
The LSTM comprises four components: an input gate, a forget gate, the cell state and an output gate. The input gate i_t involves the current input x_t, the last hidden state h_{t-1}, the last cell state c_{t-1} and the weights W_xi, W_hi, W_ci, b_i, and determines how much new information is added:

i_t = σ(W_xi·x_t + W_hi·h_{t-1} + W_ci·c_{t-1} + b_i)    (42)

The forget gate f_t involves the current input x_t, the last hidden state h_{t-1}, the last cell state c_{t-1} and the weights W_xf, W_hf, W_cf, b_f, and determines how much old information is discarded:

f_t = σ(W_xf·x_t + W_hf·h_{t-1} + W_cf·c_{t-1} + b_f)    (43)

The cell state is updated as follows:

c_t = i_t·g_t + f_t·c_{t-1}    (44)

g_t = tanh(W_xc·x_t + W_hc·h_{t-1} + W_cc·c_{t-1} + b_c)    (45)

The output gate involves the current input x_t, the last hidden state h_{t-1}, the current cell state c_t and the weights W_xo, W_ho, W_co, b_o, and determines which information is output:

o_t = σ(W_xo·x_t + W_ho·h_{t-1} + W_co·c_t + b_o)    (46)

h_t = o_t·tanh(c_t)    (47)
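A NumPy sketch of one step of equations (42)-(47) (a peephole-style LSTM cell, since the gates also see the cell state); the weight shapes and random initialization are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W):
    """One LSTM step per equations (42)-(47); W maps names to weight arrays."""
    i = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + W["ci"] @ c_prev + W["bi"])  # (42)
    f = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + W["cf"] @ c_prev + W["bf"])  # (43)
    g = np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + W["cc"] @ c_prev + W["bc"])  # (45)
    c = i * g + f * c_prev                                                      # (44)
    o = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + W["co"] @ c + W["bo"])       # (46)
    return o * np.tanh(c), c                                                    # (47)

rng = np.random.default_rng(1)
d_in, d_h = 4, 8
W = {k: rng.normal(scale=0.1, size=(d_h, d_in if k[0] == "x" else d_h))
     for k in ["xi", "hi", "ci", "xf", "hf", "cf", "xc", "hc", "cc", "xo", "ho", "co"]}
W.update({b: np.zeros(d_h) for b in ["bi", "bf", "bc", "bo"]})
h, c = np.zeros(d_h), np.zeros(d_h)
h, c = lstm_step(rng.normal(size=d_in), h, c, W)
print(h.shape, c.shape)
```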
and (3) experimental verification:
this embodiment conducted extensive experimentation to evaluate the performance of the proposed algorithm. First, the present embodiment tested the atten-LSTM predictive model using the underlying dataset in the UCI machine learning store. The atten-LSTM predictive model is then compared to existing temporal data predictive algorithms (LSTM, GRU) to demonstrate the superiority of the LSTM algorithm based on the self-attention mechanism used. Next, the present embodiment uses several benchmark data sets of the UCI machine learning store to evaluate the performance of the SUFS algorithm feature selection and give a pricing table. In the experiment, one benchmark was first selected from the dataset as the test sample, the remainder constituting the training sample. All experiments were performed on a personal computer using the Intel Core i79750H CPU 2.6ghz,16gb RAM and Windows 10 64bit of python 3.7.
Performance evaluation:
1) Performance of the atten-LSTM prediction model:
the present embodiment uses a time series data set to test the predictive performance of the atten-LSTM predictive model, with training set 7148 and training set 893. In this embodiment, the number of nodes in the bidirectional LSTM layer is 128, and the number of parallel heads in the multi-head attention mechanism is h=3, and d k =d v =64,d model =300. The prediction result is shown in fig. 6, and it can be seen from fig. 6 that the time-ordered data can be predicted well by the atten-LSTM algorithm.
The number of iterations of the atten-LSTM prediction model is 200, the mean square error (Mean Squared Error, MSE) is the loss, and the loss value for each iteration is shown in fig. 7. As can be seen from fig. 7, the algorithm reaches convergence at 50 iterations.
Further, in this embodiment, a conventional time-series data prediction algorithm, i.e., LSTM and GRU, is selected and compared with the algorithm proposed in this embodiment, where both LSTM and GRU use 128 node numbers. The comparison is shown in fig. 8, where the Root Mean Square Error (RMSE) of the atten_lstm prediction, LSTM prediction and GRU prediction are 0.0645,0.05473, 0.05582, respectively; the Mean Square Error (MSE) is 0.00396,0.00337,0.00327, respectively. It can be seen that the algorithm proposed in this embodiment has the same ability to predict time series data as the existing algorithm, and the average time per iteration is 1s 105us, which is less than 1s 125us and 1s 120us of LSTM and GRU, when training. Compared with RNN, the Attention mechanism has the advantage of parallel computation, and the training time is greatly reduced. In addition, RNNs themselves have some capture capability for long-range dependencies, but since the sequence model is made to flow through the gating unit, information is kept flowing and is selectively delivered. However, under the condition that the time sequence length is longer and longer, the capability of capturing the dependency relationship is lower and lower, and each recursion is accompanied by information loss, so that an Attention mechanism is added to enhance capturing of the part of the dependency relationship focused by the embodiment.
2) Performance of feature selection based on symmetric uncertainty:
the present embodiment uses the BreastCancer benchmark dataset of the UCI machine learning store to evaluate the performance of feature selection based on symmetry uncertainty and to give a pricing table. Using Symmetry Uncertainty (SU) as a measure, good features for classification are selected based on a correlation analysis of the features, including tags. Among them, two aspects are considered: (1) how to determine whether a feature is associated with a tag; (2) How to determine whether such related features are redundant when considering other related features. And sorting the feature data according to the feature correlation analysis based on the symmetry uncertainty, and deleting redundant features, wherein a threshold value of 0.05 is set. The main features are 1,2,4,5,6,7,8,9 features, the most main feature is 2 nd feature, the redundant feature is 3 rd feature, and the redundant feature is deleted.
After feature ordering is obtained, the present embodiment sets feature weights w=10×su+n-rank according to SU of the feature, i.e. (w) l ,w 2 ,…,w n )=10(SU 1,C ,SU 2,C ,…,SU n,C ) +n-rank as shown in Table 4.
TABLE 4 SU value, weight and number of missing values of the features

Feature  SU        Weight    Number of missing values
1        0.233938  3.33938   1
2        0.419128  12.19128  4
3        0.386141  9.86141   5
4        0.287368  4.87368   2
5        0.319727  8.19727   1
6        0.391315  10.91315  15
7        0.296451  5.96451   1
8        0.319566  7.19566   2
9        0.206039  2.06039   3
Total    2.473536  64.39677  34
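The weights in Table 4 follow directly from the rule w = 10·SU + n - rank; the short check below reproduces them (the SU list is copied from the table).

```python
su = [0.233938, 0.419128, 0.386141, 0.287368, 0.319727,
      0.391315, 0.296451, 0.319566, 0.206039]
n = len(su)
# rank 1 = largest SU; weight = 10 * SU + n - rank
order = sorted(range(n), key=lambda i: su[i], reverse=True)
rank = {i: r + 1 for r, i in enumerate(order)}
weights = [10 * su[i] + n - rank[i] for i in range(n)]
print([round(w, 5) for w in weights])  # matches the Weight column of Table 4
```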
3) Influence of the pricing mechanism:
Static vertical pricing is performed according to equation (15), with μ = η = 0.5. Note also that, since feature 3 is redundant with feature 2, only one of the two needs to be purchased. The static vertical prices of the features are shown in Table 5.
TABLE 5 Data pricing

Feature  Static vertical pricing  Dynamic pricing
1        10.10491                 9.77227
2        15.20727                 14.95914
3        13.75983                 13.48772
4        9.20084                  8.85323
5        12.01888                 11.71795
6        13.24595                 12.96534
7        11.47236                 11.16237
8        10.63601                 10.31217
9        4.353939                 3.926051
Total    100                      97.15626
According to the greedy algorithm, the data buyer end first purchases the feature with the greatest influence on the label, i.e. feature 2. After obtaining this feature, the data buyer end trains the model with it to obtain the model accuracy Acc(x). With the trained classification model set to a logistic regression model and 200 training runs on the column-2 feature data, the average accuracy is 0.82813, which is lower than the data buyer end's target accuracy of 0.95, as shown in Table 6; the data buyer end therefore issues a further query-purchase application to the data seller. Because the data buyer end has already purchased the column-2 feature and trained the model, the feature ordering and feature weights provided by the seller change, and the data pricing changes dynamically.
Table 6. Model accuracy for different feature sets

Feature set              Accuracy
{1}                      0.79625730994152
{2}                      0.82812865497076
……                       ……
{2,3,4,5,6,7,8,9}        0.982456140350877
{1,2,3,4,5,6,7,8,9}      0.964912280701754
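The evaluation behind Table 6 can be sketched as follows, assuming scikit-learn, a fresh random train/test split on each of the 200 runs, and a 70/30 split ratio; the embodiment does not state the split protocol:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def average_accuracy(X, y, cols, runs=200):
    """Mean test accuracy of logistic regression over `runs` random
    splits, using only the purchased feature columns `cols`."""
    accs = []
    for seed in range(runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[:, cols], y, test_size=0.3, random_state=seed)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        accs.append(model.score(X_te, y_te))
    return float(np.mean(accs))

# e.g. average_accuracy(X, y, cols=[1]) for feature set {2}
# (0-based column index) should land near the 0.82813 of Table 6.
```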
According to the feature selection based on symmetry uncertainty and formulas (19)-(23), the data buyer side is set to Num_Query = 5 and Num_Buy = 1, with userpr = -0.5, so the price changes are as shown in Table 5. In the objective function, let α = 1, β = 0, and budget = 50; that is, the data buyer must obtain the data yielding the highest model accuracy within a budget of 50. According to the greedy algorithm, the feature set purchased by the data buyer is {2, 6, 5, 8}. The experiments show that the model and method proposed in this embodiment allow a data buyer to select more effective features under a limited budget, while the data seller can price the data dynamically according to user demand.
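A sketch of the greedy purchase loop under these settings (budget 50, target accuracy 0.95), assuming the buyer walks down the weight ranking, re-quotes the dynamic price each round, and stops when the budget or the target accuracy is reached; the helper names are hypothetical:

```python
def greedy_purchase(weights, price_quote, budget, target_acc,
                    train_and_score):
    """Greedy feature purchasing under a budget, a sketch of the scheme
    in this embodiment. `price_quote(f, owned)` returns the seller's
    current dynamic price for feature f (it depends on purchase history);
    `train_and_score(owned)` retrains the buyer's model on the features
    owned so far and returns its accuracy. Redundant features (here,
    feature 3) are assumed to be removed from `weights` beforehand."""
    owned, spent, acc = [], 0.0, 0.0
    for f in sorted(weights, key=weights.get, reverse=True):
        price = price_quote(f, owned)
        if spent + price > budget:
            continue                  # cannot afford this feature now
        owned.append(f)
        spent += price
        acc = train_and_score(owned)
        if acc >= target_acc:
            break                     # target accuracy reached
    return owned, spent, acc

# With the Table 5 prices and Table 4 weights, this walk yields the
# feature set {2, 6, 5, 8} reported above.
```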
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that the different dependent claims and the features described herein may be combined in ways other than as described in the original claims. It is also to be understood that features described in connection with separate embodiments may be used in other described embodiments.

Claims (5)

1. A distributed feature data selection method in the electric power spot market, characterized in that the method is implemented based on a blockchain system comprising a data seller terminal, a data buyer terminal, an edge computing server, and a blockchain;
The method comprises the following steps:
s1, a data buyer determines a user side learning model, a sample data set and the type and the data quantity of power data lacking in the sample data set, forms a query according to the type and the data quantity of the power data lacking, and sends the query to the data seller, and the data seller returns a corresponding given data set to the data buyer;
s2, the data buyer end jointly optimizes the accuracy of a user side learning model, user payment, task processing delay and uploading delay of a block chain, maximizes the accuracy of the user side learning model, aims at minimizing payment and delay, and establishes an objective function:
$$\max_{x,\lambda}\ \alpha\,Acc(x)-\beta\,\phi_d(x)-\xi\sum_{a=1}^{N_E}T_a^{EC}-\eta\sum_{b=1}^{N_B}T_b^{BC}$$

$$\text{s.t.}\quad \phi_d(x)\le budget,\qquad T_a^{EC}\le T_{\max}^{EC},\qquad T_b^{BC}\le T_{\max}^{BC},\qquad 0\le |x|\le Size$$

wherein $\phi_d(x)=\phi_{upd}(x)\cdot(1+k\,Acc(x))+userpr$ and $beha=Num_{Buy}/Num_{Query}$; when $beha\le 0.1$, $userpr\ge 0$, and when $beha>0.1$, $userpr<0$; $x$ and $\lambda$ represent the inputs, $x=(x_1,x_2,\ldots,x_n)$, $x_i\in\{0,1\}$, $i=1,2,\ldots,n$, where $x_i$ represents the feature data of the $i$-th type of the given data set and $n$ represents the number of feature types in the given data set; $\lambda=(\lambda_1,\lambda_2,\ldots,\lambda_{N_E})$, where $\lambda_a$ represents the ratio of the $a$-th task offloaded by the data buyer terminal to the edge server to the total task, $0\le\lambda_a\le 1$; $d_a$ represents the size of the input data of the $a$-th task; $\lambda_a d_a$ represents the computational task required on the edge computing node EN of the edge server; $Acc(x)$ represents the accuracy of the user-side learning model, $0<Acc(x)<1$; $\alpha$ represents the accuracy parameter of the user-side learning model; $\beta$ represents the payment parameter; $\xi$ represents the parameter of the data-buyer-side task processing delay; $\eta$ represents the parameter of the blockchain delay; $\phi_d(x)$ represents the price; $\phi_{upd}(x)$ represents the static pricing that does not consider the user-side learning model;

$userpr$ represents the user behavior reward/penalty variable; $pena$ represents the lower limit and $rew$ the upper limit of this variable, $pena\le userpr\le rew$; $k$ represents a price adjustment parameter; $budget$ represents the budget; $T_a^{EC}$ represents the processing delay of the $a$-th task and $T_{\max}^{EC}$ its maximum limit; $T_a^{loc}$ represents the local computation time of the local task $(1-\lambda_a)d_a$; $T_a^{tr}$ represents the transmission time for offloading the computation task $\lambda_a d_a$ from the user data end $U_a$ to the edge computing node EN of the edge server; $T_a^{edge}$ represents the computation time of the task $\lambda_a d_a$ offloaded via wireless access onto the edge computing node EN of the edge server; $a=1,2,\ldots,N_E$, where $N_E$ represents the number of tasks; $T_b^{BC}$ represents the conclusion time of the $b$-th block transaction between the data buyer terminal and the data seller terminal, $b=1,2,\ldots,N_B$, where $N_B$ represents the number of blocks in the blockchain; $T_{\max}^{BC}$ represents the maximum limit of the transaction conclusion time; $T_b^{pack}$ represents the packing time of the $b$-th block of the blockchain; $T_b^{cons}$ represents the consensus time of the $b$-th block; $T_b^{sub}$ represents the submission time of the $b$-th block; $Num_{Query}$ represents the number of queries and $Num_{Buy}$ the number of purchases, with $Num_{Query}\le Num_{Query}^{\max}$ and $Num_{Buy}\le Num_{Buy}^{\max}$, where $Num_{Query}^{\max}$ and $Num_{Buy}^{\max}$ represent the maximum limits of the number of queries and purchases, respectively; $Size$ represents the maximum limit of the amount of data that a user queries and purchases in real time;
S3, the objective function is solved, feature data in the given data set that satisfy the objective function are selected, the data seller terminal uploads the selected feature data to the blockchain, and the data buyer terminal pays for and acquires the data through the blockchain and adds them to the sample data set.
2. The distributed feature data selection method in the electric power spot market according to claim 1, wherein the static pricing $\phi_{upd}(x)$ and the tuple data quality are given by formulas that appear only as equation images in the original publication, defined over the following quantities:

$D$ represents the given data set, comprising $m$ tuples, each tuple having $n$ types of features; $Q$ is the query issued by the data buyer, and $t_j$ represents a tuple of the result of query $Q$ on $D$, $j=1,2,\ldots,m$; $D$ contains sub-tables $T_1,T_2,\ldots,T_{tn}$, where $tn$ represents the number of sub-tables in $D$; the claim further refers to the set of lineage tuples of query $Q$ with respect to sub-table $T_i$, the set of uncertain lineage tuples of query $Q$ with respect to sub-table $T_j$, and the data quality of tuple $t_j$; $sen$ represents the sensitivity of the user to quality, $0<sen<1$; $miss_j$ indicates the degree of missing key values of the $j$-th tuple; $\delta$ represents a price coefficient that controls the user price range; $T_{now}$ represents the set of lineage tuples of the currently unpurchased data; $p_{total}$ represents the overall price of the given data set; $\zeta$ represents the coefficient of the information entropy, and a further coefficient weights the integrity rate; the integrity of the $j$-th tuple is $\frac{1}{n}\sum_{i=1}^{n} index_{ij}$, where $index_{ij}=1$ indicates that the corresponding element of the feature data of the given data set is present and $index_{ij}=0$ that it is absent; $H$ is the information entropy of the given data set and $H(t_j)$ the information entropy of the $j$-th tuple; $w=(w_1,w_2,\ldots,w_n)$, where $w_j$ represents the weight of the $j$-th type of feature data; $w_{\min}$ and $w_{\max}$ represent the minimum and maximum of the weight vector.
3. The distributed feature data selection method in the electric power spot market according to claim 1, characterized by a further defining formula that appears only as an equation image in the original publication and is not recoverable from the text.
4. The distributed feature data selection method in the electric power spot market according to claim 1, wherein in S1 the method for determining the sample data set at the data buyer terminal comprises:
the data buyer terminal owns a local data set $D_{own}$;
the data buyer terminal repairs the local data set $D_{own}$ to obtain a repaired data set;
the data buyer terminal predicts on the local data set $D_{own}$ to obtain a predicted data set;
the user-side learning model is trained on the data set $D_{own}$ together with the repaired and predicted data sets; when the accuracy of the user-side learning model is lower than the required accuracy, the type and quantity of the power data lacking in the sample data set are determined, a query is formed according to the type and quantity of the lacking power data and sent to the data seller terminal, and the data seller terminal returns the corresponding given data set to the data buyer terminal.
5. The distributed feature data selection method in the electric power spot market according to claim 4, wherein the method by which the data buyer terminal repairs the local data set $D_{own}$ to obtain the repaired data set, or predicts on the local data set $D_{own}$ to obtain the predicted data set, comprises:
splitting the data set $D_{own}$ into a training data set $D_{train}$ and a test data set $D_{test}$; establishing a deep learning model; training the deep learning model with the training data set $D_{train}$ and outputting the trained model parameters and the loss value of each iteration; predicting with the test data set $D_{test}$ to obtain the prediction error of the deep learning model; adjusting the parameters of the deep learning model; and performing repair or prediction with the deep learning model;
the deep learning model comprises two bidirectional LSTM layers, a multi-head attention layer, a max-pooling layer, an average-pooling layer, and two fully connected layers; the input data of the training data set $D_{train}$ are fed to the two bidirectional LSTM layers simultaneously; the output of the bidirectional LSTM layers is fed to the multi-head attention layer; the output of the multi-head attention layer is fed to the max-pooling layer and the average-pooling layer simultaneously; the outputs of the max-pooling layer and the average-pooling layer are fed together to one fully connected layer; the output of that fully connected layer is fed to the other fully connected layer, whose output is produced through the output layer.
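By way of illustration, a minimal Keras sketch of the deep learning model of claim 5 might look as follows, reading the claim as feeding the input to the two bidirectional LSTM layers in parallel; the layer widths, head count, and output size are assumptions, since the claim fixes only the layer types and their wiring.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_repair_model(timesteps: int, n_features: int) -> keras.Model:
    # Widths, head count, and output size are illustrative assumptions;
    # the claim specifies only the layer types and their connections.
    inputs = keras.Input(shape=(timesteps, n_features))
    # Two bidirectional LSTM layers receiving the input simultaneously.
    a = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inputs)
    b = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inputs)
    x = layers.Concatenate()([a, b])
    # Multi-head self-attention over the bidirectional LSTM outputs.
    x = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)
    # Max pooling and average pooling applied to the same attention output.
    p_max = layers.GlobalMaxPooling1D()(x)
    p_avg = layers.GlobalAveragePooling1D()(x)
    x = layers.Concatenate()([p_max, p_avg])
    x = layers.Dense(64, activation="relu")(x)   # first fully connected layer
    outputs = layers.Dense(1)(x)                 # second fully connected layer
    return keras.Model(inputs, outputs)
```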