CN115659165A - Method, device, equipment and storage medium for constructing park load data sample - Google Patents

Method, device, equipment and storage medium for constructing park load data sample

Info

Publication number
CN115659165A
Authority
CN
China
Prior art keywords
load
sample
target
subsamples
load sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210978081.7A
Other languages
Chinese (zh)
Inventor
郇嘉嘉
梁正
张璇
沈欣炜
乔百豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.) 2022-08-15
Filing date 2022-08-15
Publication date 2023-01-31
Application filed by Guangdong Power Grid Co Ltd filed Critical Guangdong Power Grid Co Ltd
Priority to CN202210978081.7A
Publication of CN115659165A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method, an apparatus, equipment and a storage medium for constructing a park load data sample. The method includes: acquiring a load data set comprising a plurality of load curves and a plurality of load attributes; selecting a reference attribute from the plurality of load attributes and taking the remaining load attributes as its complement; selecting the load curve corresponding to the reference attribute as a first load sample, and selecting a plurality of load curves corresponding to the complement as subsamples to form a second load sample; calculating the warped path distance between each subsample and the first load sample; analyzing the plurality of target subsamples with the smallest warped path distance to obtain distribution coefficients; and weighting the first load sample and the plurality of target subsamples according to the distribution coefficients to generate a new load sample, thereby generating a campus load data sample set. This realizes sample expansion of load-type time-series data and provides a data basis for planning and evaluating a campus data-driven model.

Description

Method, device, equipment and storage medium for constructing park load data sample
Technical Field
The invention relates to the technical field of data processing, in particular to a construction method, a construction device, construction equipment and a storage medium of a park load data sample.
Background
In the park planning stage, establishing a reasonable data-driven park load prediction model is an important premise and a powerful guarantee for reasonable park planning. At present, however, measured data samples on the user side of a park are often scarce, and load data such as electricity, cooling and heat are especially difficult to obtain. Therefore, how to generate data samples that meet planning requirements from a small amount of load data is a problem to be solved urgently.
At present, load data sample expansion methods for a campus have been little studied. Most existing sample expansion methods adopt the Euclidean distance as the similarity measure, which is not suitable for similarity analysis between two load curves that are similar in overall morphological characteristics but not strictly aligned on the time axis; other methods rely on machine learning models such as generative adversarial networks, but suffer from complex models and difficult parameter tuning.
Disclosure of Invention
The invention provides a construction method, a construction device, construction equipment and a storage medium for a campus load data sample, which can expand samples of load-type time-series data in a campus, thereby providing a data basis for planning and evaluating a campus data-driven model and promoting the development of the energy Internet.
In order to solve the technical problem, in a first aspect, the present invention provides a method for constructing a campus load data sample, including:
acquiring a load data set, wherein the load data set comprises a plurality of load curves of a target park and other parks and load attributes corresponding to the load curves, one load attribute is selected from the load attributes as a reference attribute, the other load attributes are used as a complement of the reference attribute, one load curve corresponding to the reference attribute is selected from the target park as a first load sample, and a plurality of load curves corresponding to the complement are selected from the other parks as subsamples to form a second load sample;
calculating a warped path distance between each subsample in the second load sample and the first load sample;
analyzing a plurality of target subsamples having the minimum warped path distance to the first load sample to obtain distribution coefficients of the plurality of target subsamples;
weighting the first load sample and the plurality of target subsamples according to the distribution coefficient to generate a new load sample;
and generating a park load data sample set based on the first load sample and the new load sample.
Preferably, calculating the warped path distance between each subsample in the second load sample and the first load sample comprises:
calculating the Euclidean distance between each subsample in the second load sample and the first load sample;
and calculating the warped path distance between the first load sample and each subsample from the Euclidean distance by using a dynamic time warping algorithm.
Preferably, the expression of the dynamic time warping algorithm is as follows:
L(i, j) = D(i, j) + min{L(i-1, j-1), L(i-1, j), L(i, j-1)}
where L(i, j) represents the warped path distance, D(i, j) represents the Euclidean distance, and min{L(i-1, j-1), L(i-1, j), L(i, j-1)} represents the minimum cumulative distance between element i of the first load sample and element j of the subsample in the second load sample when a preset constraint condition is satisfied.
Preferably, the analyzing the plurality of target subsamples having the minimum warped path distance to the first load sample to obtain the distribution coefficients of the plurality of target subsamples includes:
generating a judgment matrix based on load attributes corresponding to a plurality of target subsamples;
calculating the maximum eigenvalue of the judgment matrix, and determining the eigenvector corresponding to the maximum eigenvalue;
normalizing the characteristic vector to obtain a weight matrix;
and according to the weight matrix, scoring the plurality of target sub-samples to obtain the distribution coefficient of each target sub-sample.
Preferably, the weighting the first load sample and the plurality of target subsamples according to the distribution coefficient to generate a new load sample includes:
based on the distribution coefficients, carrying out weight distribution on the plurality of target subsamples to obtain a target weight of each target subsample;
and according to the target weight and the preset weight of the first load sample, carrying out weighted summation on the first load sample and the plurality of target subsamples to obtain a new load sample.
Preferably, before generating the campus load data sample set based on the first load sample and the new load sample, the method includes:
if the sum of the data volumes of the first load sample and the new load sample is not larger than a preset value, adding the new load sample to the first load sample to obtain a new load data set;
and generating a new load sample again by using the new load data set until the sum of the data amount is greater than a preset value, and then performing a step of generating a park load data sample set based on the first load sample and the new load sample.
In a second aspect, the present application provides a construction apparatus for a campus load data sample, including:
the system comprises an acquisition module, a data processing module and a data processing module, wherein the acquisition module is used for acquiring a load data set, the load data set comprises a plurality of load curves of a target park and other parks and load attributes corresponding to the load curves, one load attribute is selected from the load attributes as a reference attribute, the other load attributes are used as a complement of the reference attribute, one load curve corresponding to the reference attribute is selected from the target park as a first load sample, and a plurality of load curves corresponding to the complement are selected from the other parks as sub-samples to form a second load sample;
a calculating module for calculating a warped path distance between each subsample in the second load sample and the first load sample;
the analysis module is used for analyzing the plurality of target subsamples having the minimum warped path distance to the first load sample to obtain the distribution coefficients of the plurality of target subsamples;
the weighting module is used for weighting the first load sample and the plurality of target subsamples according to the distribution coefficient to generate a new load sample;
and the generating module is used for generating a park load data sample set based on the first load sample and the new load sample.
Preferably, the calculation module is specifically configured to:
calculating the Euclidean distance between each subsample in the second load sample and the first load sample;
and calculating the warped path distance between the first load sample and each subsample from the Euclidean distance by using a dynamic time warping algorithm.
In a third aspect, the present application provides a computer device comprising a processor and a memory for storing a computer program which, when executed by the processor, implements the method of constructing a campus load data sample according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method for constructing a campus load data sample according to the first aspect.
Compared with the prior art, the invention has the following beneficial effects:
the method comprises the steps that a load data set is obtained, wherein the load data set comprises a plurality of load curves of a target park and other parks and load attributes corresponding to the load curves, one load attribute is selected from the load attributes as a reference attribute, the other load attributes are used as a complement of the reference attribute, one load curve corresponding to the reference attribute is selected from the target park as a first load sample, a plurality of load curves corresponding to the complement are selected from the other parks as sub-samples to form a second load sample, and a regular path distance between each sub-sample in the second load sample and the first load sample is calculated, so that the regular path distance is used as a similarity measuring basis, and the regular path distance is calculated based on time sequence elements of daily load data one by one to calculate an accumulated distance, and therefore the method can be effectively applied to load data with time characteristics; then analyzing the plurality of target subsamples with the minimum regular path distance with the first load sample to obtain the distribution coefficients of the plurality of target subsamples so as to obtain the objective weight of each target subsample and improve the calculation accuracy of the weight; and finally, weighting the first load sample and the plurality of target subsamples according to the distribution coefficient to generate a new load sample, and generating a park load data sample set based on the first load sample and the new load sample, so that sample expansion is performed on time sequence data represented by loads in a target park, and a data basis is provided for planning and evaluating a park data driving model.
Drawings
Fig. 1 is a flowchart illustrating a method for constructing a campus load data sample according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a building apparatus for a campus load data sample according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for constructing a campus load data sample according to an embodiment of the present invention. The construction method of the campus load data sample in the embodiment of the invention can be applied to computer equipment, wherein the computer equipment comprises but is not limited to equipment such as a smart phone, a notebook computer, a tablet computer, a desktop computer, a physical server and a cloud server. As shown in fig. 1, the method for constructing a campus load data sample according to this embodiment includes steps S101 to S105, which are detailed as follows:
step S101, a load data set is obtained, the load data set comprises a plurality of load curves of a target park and other parks and load attributes corresponding to the load curves, one load attribute is selected from the load attributes as a reference attribute, the rest load attributes are used as a complement of the reference attribute, one load curve corresponding to the reference attribute is selected from the target park as a first load sample, and a plurality of load curves corresponding to the complement are selected from the other parks as sub-samples to form a second load sample.
In this step, each load curve in the load data set corresponds to a load attribute, which is exemplified as follows:
L = {Load | District, Function, Season, Weather};
where Load represents daily load data (i.e., a load curve), specifically a time series X = {x_1, x_2, ..., x_n}, where n is the number of sampling points in the day; District, Function, Season and Weather denote load attributes. District represents the administrative district attribute, recorded as Dis = 1, 2, ..., k; Function denotes the functional area attribute, including residential (1), commercial (2), industrial (3), municipal (4) and others (5), recorded as Fun = 1, 2, 3, 4, 5; Season represents the quarter attribute, including the first quarter (1), second quarter (2), third quarter (3) and fourth quarter (4), recorded as S = 1, 2, 3, 4; Weather denotes the weather attribute, including sunny (1), rainy and snowy (2) and extreme weather (3), recorded as W = 1, 2, 3.
Illustratively, if the load attribute of the first load sample is {Dis = a, Fun = b, S = c, W = d}, then the first load sample represents the load of functional area b of the target campus, located in administrative district a, under weather d in quarter c. The remaining load attributes are {Dis = i, Fun = j, S = p, W = q | i ∈ [1, k], i ≠ a, j ∈ [1, 5], j ≠ b, p ∈ [1, 4], p ≠ c, q ∈ [1, 3], q ≠ d}, i.e., the load attributes of the second load sample form the complement of the load attribute of the first load sample.
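For illustration only, the load data set and the selection of the first and second load samples could be sketched in code as follows; the record fields, the target-campus marker and the filtering logic are assumptions of this sketch rather than part of the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class LoadRecord:
    """One daily load curve together with its load attributes (illustrative fields)."""
    campus: str          # which park/campus the curve was measured in
    curve: List[float]   # daily load curve, one value per sampling point
    dis: int             # administrative district attribute Dis
    fun: int             # functional area attribute Fun (1..5)
    s: int               # quarter attribute S (1..4)
    w: int               # weather attribute W (1..3)

def split_samples(dataset: List[LoadRecord],
                  ref: Tuple[int, int, int, int],
                  target_campus: str = "target"):
    """Pick the first load sample (reference attribute, from the target campus)
    and the second load sample (complement attributes, from other campuses)."""
    a, b, c, d = ref
    first = next(r for r in dataset
                 if r.campus == target_campus
                 and (r.dis, r.fun, r.s, r.w) == ref)
    second = [r for r in dataset
              if r.campus != target_campus
              and r.dis != a and r.fun != b and r.s != c and r.w != d]
    return first, second
```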
Step S102, calculating a warped path distance between each subsample in the second load sample and the first load sample.
In this step, the load curve corresponding to the reference attribute is selected from the target campus as the first load sample: {X = {x_1, x_2, ..., x_n} | Dis = a, Fun = b, S = c, W = d}; a plurality of load curves corresponding to the complement are selected from the other campuses as subsamples to form the second load sample: {Y = {y_1, y_2, ..., y_m} | Dis = i, Fun = j, S = p, W = q, i ∈ [1, k], i ≠ a, j ∈ [1, 5], j ≠ b, p ∈ [1, 4], p ≠ c, q ∈ [1, 3], q ≠ d}; and the warped path distance between the first load sample and each subsample in the second load sample is calculated based on a dynamic time warping algorithm.
In an embodiment, the step S102 includes: calculating the Euclidean distance between each subsample in the second load sample and the first load sample; and calculating the warped path distance between the first load sample and each subsample from the Euclidean distance by using a dynamic time warping algorithm.
In this embodiment, based on the daily load data X = {x_1, x_2, ..., x_n} of the first load sample and the daily load data Y = {y_1, y_2, ..., y_m} of a subsample in the second load sample, a distance matrix D ∈ R^(n×m) of size n × m is constructed to measure the distance between each pair of time-series sampling points of the two daily load curves, where element D(i, j) is the Euclidean distance between sampling point i of the first load sample and sampling point j of the subsample, expressed as:
D(i, j) = ||x_i - y_j||_2
The warping path is defined as a set of adjacent elements P = {p_1, p_2, ..., p_s, ..., p_K}, where each element p_s = (i, j) pairs one sampling point of each of the two daily load curves. The warping path must satisfy the following preset constraints:
Boundary condition: the start point and the end point are fixed; the start point is the element formed by the first sampling points of the two daily load curves, i.e., p_1 = (1, 1), and the end point is the element formed by the last sampling points, i.e., p_K = (n, m).
Continuity: for any point p_s = (i, j) in the path and its adjacent point p_(s+1) = (g, h), g - i ≤ 1 and h - j ≤ 1 must hold.
Monotonicity: for any point p_s = (i, j) in the path and its adjacent point p_(s+1) = (g, h), g - i ≥ 0 and h - j ≥ 0 must hold.
The objective of the dynamic time warping algorithm is to find an optimal warping path from p_1 = (1, 1) to p_K = (n, m) such that the cumulative distance is minimal, i.e.:
DTW(X, Y) = min_P { D(p_1) + D(p_2) + ... + D(p_K) }
further, fill the accumulated distance matrix L, whose expression is:
Figure BDA0003797458210000072
where L(i, j) represents the warped path distance, D(i, j) represents the Euclidean distance, and min{L(i-1, j-1), L(i-1, j), L(i, j-1)} represents the minimum cumulative distance between element i of the first load sample and element j of the subsample in the second load sample when the preset constraints are satisfied.
Backtracking from p_K = (n, m) to p_1 = (1, 1) yields the shortest path, and the shortest path distance L(n, m) is the warped path distance between the daily load data X and the daily load data Y, denoted DTW(X, Y). The smaller the warped path distance, the higher the similarity between the data.
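A minimal sketch of the dynamic time warping calculation described above, assuming scalar sampling points and the plain Euclidean distance between them; it is an illustration, not the patent's own implementation.

```python
import numpy as np

def dtw_distance(x, y):
    """Warped path distance DTW(X, Y) between two daily load curves.

    D(i, j) is the Euclidean distance between sampling points, and the
    cumulative distance matrix follows
    L(i, j) = D(i, j) + min{L(i-1, j-1), L(i-1, j), L(i, j-1)}.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    D = np.abs(x[:, None] - y[None, :])            # pairwise distances of sampling points
    L = np.full((n + 1, m + 1), np.inf)
    L[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            L[i, j] = D[i - 1, j - 1] + min(L[i - 1, j - 1],   # diagonal step
                                            L[i - 1, j],       # step in x only
                                            L[i, j - 1])       # step in y only
    return L[n, m]                                  # cumulative distance of the optimal path
```

Each subsample of the second load sample can then be compared with the first load sample through dtw_distance, and the subsamples with the smallest distances are kept as target subsamples.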
Step S103, analyzing a plurality of target subsamples having the minimum warped path distance to the first load sample to obtain distribution coefficients of the plurality of target subsamples.
In this step, optionally, a judgment matrix is generated based on load attributes corresponding to a plurality of target subsamples; calculating the maximum eigenvalue of the judgment matrix, and determining the eigenvector corresponding to the maximum eigenvalue; normalizing the characteristic vector to obtain a weight matrix; and according to the weight matrix, scoring the plurality of target subsamples to obtain the distribution coefficient of each target subsample.
In this embodiment, 5 subsamples in the second load sample are taken as an example. The analytic hierarchy process requires establishing a target layer, a criterion layer and a scheme layer, and determines subjective weights through hierarchical decomposition. The target layer is set to score the daily load data of the 5 subsamples; the criterion layer takes the administrative district, functional area, quarter and weather attributes of the samples as criteria, according to the influence factors of the daily load data of the target park; and the scheme layer consists of the daily load data of the 5 subsamples.
First, relative importance values between the indexes of the criterion layer are set by an expert review method, and a judgment matrix is constructed to determine the relative importance of each element of the criterion layer, denoted J = (u_ij)_(n×n), where J is the judgment matrix, u_ij is the relative importance value of the i-th index with respect to the j-th index, and n is the number of indexes, here n = 4. The judgment matrix is constructed as follows:
Load attribute             Administrative district   Functional area   Quarter   Weather
Administrative district    1                          1/3               1/2       1
Functional area            3                          1                 4         5
Quarter                    2                          1/4               1         2
Weather                    1                          1/5               1/2       1

That is,
J = [ 1, 1/3, 1/2, 1; 3, 1, 4, 5; 2, 1/4, 1, 2; 1, 1/5, 1/2, 1 ]
The maximum eigenvalue λ_max of the judgment matrix and the eigenvector W_max corresponding to λ_max are calculated, and W_max is normalized to obtain the subjective weight vector J_W corresponding to the criterion-layer indexes.
In this case λ_max = 4.0916, W_max = [0.2074, 0.9063, 0.3225, 0.1779]^T, and the normalized subjective weight vector is J_W = [0.1285, 0.5615, 0.1998, 0.1102]^T.
A consistency check is performed on the judgment matrix by calculating the consistency ratio CR; the smaller CR is, the better the consistency, and in general a judgment matrix with CR < 0.1 passes the consistency check. CR is calculated as:
CR = CI / RI, with CI = (λ_max - n) / (n - 1)
where CR is the consistency ratio, CI is the consistency index, and RI is the average random consistency index, whose value depends on the order n of the judgment matrix according to a standard lookup table. [Table: average random consistency index RI versus judgment matrix order n]
For the criterion layer, CR = 0.0343 < 0.1 is obtained, so the judgment matrix J of the criterion layer relative to the target layer passes the consistency check and is reasonably constructed.
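A brief sketch of computing the criterion-layer weights and the consistency ratio with NumPy, for illustration; the RI lookup values are the commonly used analytic hierarchy process constants and are an assumption here rather than taken from the patent.

```python
import numpy as np

# Criterion-layer judgment matrix from the table above
# (administrative district, functional area, quarter, weather).
J = np.array([[1.0, 1/3, 1/2, 1.0],
              [3.0, 1.0, 4.0, 5.0],
              [2.0, 1/4, 1.0, 2.0],
              [1.0, 1/5, 1/2, 1.0]])

RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}   # average random consistency index (assumed standard values)

def ahp_weights(M):
    """Return the maximum eigenvalue, normalized weight vector and consistency ratio of M."""
    eigvals, eigvecs = np.linalg.eig(M)
    k = int(np.argmax(eigvals.real))
    lam_max = eigvals[k].real
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                          # normalized weight vector
    n = M.shape[0]
    CI = (lam_max - n) / (n - 1)             # consistency index
    CR = CI / RI[n]                          # consistency ratio, should be below 0.1
    return lam_max, w, CR

lam_max, J_W, CR = ahp_weights(J)
# lam_max ≈ 4.09, J_W ≈ [0.13, 0.56, 0.20, 0.11], CR ≈ 0.03, in line with the values above
```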
Then, the matrices of the scheme layer relative to the criterion layer are constructed; that is, one judgment criterion is fixed at a time and the relative importance of the 5 subsamples under that criterion is evaluated. The load curves and load attributes of the 5 obtained subsamples are recorded as L_1, L_2, L_3, L_4, L_5. Since the criterion layer has four elements, four scheme-layer judgment matrices relative to the criterion layer need to be constructed, denoted J_1, J_2, J_3, J_4. When setting the relative importance values of the schemes by the expert review method, the differences between administrative districts, functional areas, quarters and weather conditions are fully considered. For the administrative district attribute, the geographic location, population, regional function positioning and economic development level of each administrative district need to be considered; for the functional area attribute, the degree of load difference among different functional areas needs to be considered; for the quarter attribute, the difference between adjacent quarters being smaller than the difference between non-adjacent quarters needs to be considered; and for the weather attribute, the difference between extreme weather and normal weather being greater than the difference between sunny days and rainy days needs to be considered. Based on the above considerations, the following general principles are given:
(1) For different administrative district attributes, the load sample differences are smaller between administrative districts with comparable economic development levels and similar regional function positioning.
(2) For different functional area attributes, the difference between the commercial area and the municipal area is small, the degree of difference between the residential area and the commercial and municipal areas is medium, and the industrial area differs most from the other four functional areas.
(3) For different quarter attributes, the difference between adjacent quarters is smaller than the difference between non-adjacent quarters.
(4) For different weather attributes, the difference between extreme weather and normal weather is greater than the difference between sunny days and rainy days.
The maximum eigenvalue λ_max of each scheme-layer judgment matrix, the corresponding eigenvector W_max, and the normalized weight vector are calculated for J_1, J_2, J_3 and J_4. A consistency check is then carried out; after passing the consistency check, the complete weight matrix is constructed, and L_1, L_2, L_3, L_4, L_5 are scored respectively to obtain the distribution coefficients of the 5 subsamples, Score_1, Score_2, Score_3, Score_4, Score_5.
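The combination of scheme-layer weight vectors and criterion-layer weights into distribution coefficients can be sketched as follows, assuming five subsamples and four criteria; this is an illustration, not the patent's own code.

```python
import numpy as np

def distribution_coefficients(criterion_weights, scheme_matrices):
    """Score the 5 subsamples from the scheme-layer judgment matrices.

    criterion_weights: normalized criterion-layer weight vector J_W, shape (4,)
    scheme_matrices:   four 5x5 scheme-layer judgment matrices J_1..J_4,
                       one per criterion
    """
    columns = []
    for Jr in scheme_matrices:
        eigvals, eigvecs = np.linalg.eig(Jr)
        k = int(np.argmax(eigvals.real))
        w = np.abs(eigvecs[:, k].real)
        columns.append(w / w.sum())              # weight vector of the 5 subsamples
    W = np.column_stack(columns)                 # complete 5x4 weight matrix
    scores = W @ np.asarray(criterion_weights)   # Score_1 ... Score_5
    return scores                                # the distribution coefficients sum to 1
```

For the worked example further below, these scores come out to roughly 0.30, 0.29, 0.22, 0.10 and 0.08.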
And step S104, weighting the first load sample and the plurality of target subsamples according to the distribution coefficient to generate a new load sample.
In this step, optionally, based on the distribution coefficient, performing weight distribution on a plurality of target subsamples to obtain a target weight of each target subsample; and according to the target weight and a preset weight of the first load sample, carrying out weighted summation on the first load sample and the plurality of target subsamples to obtain the new load sample.
In this embodiment, taking 5 subsamples as an example, the first load sample X is given a weight of a, and a weight of b is distributed among Y_1, Y_2, Y_3, Y_4, Y_5 according to their distribution coefficients, with a + b = 1, giving the weight vector:
W_new = [a, b·Score_1, b·Score_2, b·Score_3, b·Score_4, b·Score_5]^T
The new sample is generated by this weighting:
X_new = a·X + b·(Score_1·Y_1 + Score_2·Y_2 + Score_3·Y_3 + Score_4·Y_4 + Score_5·Y_5)
where X = {x_1, x_2, ..., x_n} is the first load sample, Y_1, ..., Y_5 are the daily load curves of the 5 target subsamples, and the weighted summation is carried out over the sampling points.
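A small sketch of the weighted summation, under the assumption that the subsample curves have been aligned to the same number of sampling points as the first load sample:

```python
import numpy as np

def generate_new_sample(x, target_subsamples, scores, a=0.3):
    """Weighted summation of the first load sample and the target subsamples.

    x:                 first load sample, shape (n,)
    target_subsamples: target subsample curves, shape (r, n)
    scores:            distribution coefficients Score_1..Score_r (sum to 1)
    a:                 preset weight of the first load sample; b = 1 - a
    """
    b = 1.0 - a
    w_new = np.concatenate(([a], b * np.asarray(scores)))   # W_new = [a, b*Score_1, ..., b*Score_r]
    curves = np.vstack([np.asarray(x), np.asarray(target_subsamples)])
    return w_new @ curves                                    # new daily load curve, shape (n,)
```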
And step S105, generating a park load data sample set based on the first load sample and the new load sample.
In this step, the new load sample is added to the load data set to obtain a campus load data sample set.
Optionally, the load data set is traversed, and the traversing process includes: if the sum of the data volumes of the first load sample and the new load sample is not larger than a preset value, adding the new load sample to the first load sample to obtain a new load data set; and generating a new load sample again by using the new load data set until the sum of the data volumes is greater than a preset value, and then performing the step of generating the campus load data sample set based on the first load sample and the new load sample.
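The traversal can be pictured roughly as follows; the stopping criterion and the generate_new callable stand in for steps S101 to S104 and are assumptions of this illustration:

```python
def build_sample_set(load_data_set, generate_new, preset_value):
    """Expand the load data set until its size exceeds the preset value.

    load_data_set: list of existing load samples (including the first load sample)
    generate_new:  callable producing one new load sample from the current data set
                   (warped path distances, distribution coefficients, weighting)
    preset_value:  required total number of samples in the campus load data sample set
    """
    dataset = list(load_data_set)
    while len(dataset) <= preset_value:
        dataset.append(generate_new(dataset))   # the new sample joins the data set and
                                                # takes part in the next round of generation
    return dataset                              # campus load data sample set
```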
By way of example and not limitation, assume the reference load attribute of the first load sample is {X | Dis = 1, Fun = 2, S = 4, W = 2}, i.e., the load of the commercial area of administrative district 1 in the fourth quarter on a rainy and snowy day. The load curves and load attributes of the 5 subsamples are:
L_1 = {Y_1 | Dis = 2, Fun = 2, S = 4, W = 2}, i.e., the load of the commercial area of administrative district 2 in the fourth quarter on a rainy and snowy day;
L_2 = {Y_2 | Dis = 3, Fun = 2, S = 4, W = 2}, i.e., the load of the commercial area of administrative district 3 in the fourth quarter on a rainy and snowy day;
L_3 = {Y_3 | Dis = 2, Fun = 2, S = 1, W = 1}, i.e., the load of the commercial area of administrative district 2 in the first quarter on a sunny day;
L_4 = {Y_4 | Dis = 2, Fun = 5, S = 3, W = 1}, i.e., the load of the other functional areas of administrative district 2 in the third quarter on a sunny day;
L_5 = {Y_5 | Dis = 5, Fun = 4, S = 1, W = 3}, i.e., the load of the municipal area of administrative district 5 in the first quarter in extreme weather.
The matrices of the scheme layer with respect to the criterion layer are constructed as follows:
[Table: judgment matrix of L_1 to L_5 under the administrative district criterion]
functional region L 1 L 2 L 3 L 4 L 5
L 1 1 1 1 5 3
L 2 1 1 1 5 3
L 3 1 1 1 5 3
L 4 1/5 1/5 1/5 1 2
L 5 1/3 1/3 1/3 1/2 1
Quarterly L 1 L 2 L 3 L 4 L 5
L 1 1 1 3 5 3
L 2 1 1 3 5 3
L 3 1/3 1/3 1 2 1
L 4 1/5 1/5 1/2 1 1/2
L 5 1/3 1/3 1 2 1
Weather (weather) L 1 L 2 L 3 L 4 L 5
L 1 1 1 3 3 5
L 2 1 1 3 3 5
L 3 1/3 1/3 1 1 2
L 4 1/3 1/3 1 1 2
L 5 1/5 1/5 1/2 1/2 1
That is, the scheme-layer judgment matrices J_1, J_2, J_3 and J_4 under the administrative district, functional area, quarter and weather criteria are constructed accordingly (J_2, J_3 and J_4 correspond to the three tables above).
The maximum eigenvalue λ_max of each judgment matrix, the eigenvector W_max corresponding to λ_max, and the normalized weight vector are calculated for J_1, J_2, J_3 and J_4. [Formulas: numerical maximum eigenvalues, eigenvectors and weight vectors of J_1 to J_4]
Consistency check: CR_1 = 0.0682, CR_2 = 0.0407, CR_3 = 0.0012, CR_4 = 0.0012, all less than 0.1, so the four judgment matrices pass the consistency check. The complete weight matrix is then constructed: [Table: complete weight matrix formed by the weight vectors of the 5 subsamples under the four criteria]
L_1, L_2, L_3, L_4, L_5 are scored by multiplying the criterion-layer weights J_W with the corresponding entries of the complete weight matrix and summing. Taking L_1 as an example, its score is:
Score_1 = 0.1285 × 0.2957 + 0.5615 × 0.2821 + 0.1998 × 0.3475 + 0.1102 × 0.3475 = 0.3041;
similarly, the distribution coefficients of the other subsamples are calculated as Score_2 = 0.2899, Score_3 = 0.2224, Score_4 = 0.1030, Score_5 = 0.0806.
For example, taking a = 0.3 and b = 0.7, the first load sample X is given a weight of 0.3, and the weight of 0.7 is distributed among Y_1, Y_2, Y_3, Y_4, Y_5 according to their distribution coefficients, yielding the weight vector:
W_new = [0.3, 0.2129, 0.2029, 0.1557, 0.0721, 0.0564]^T
A new sample is then generated:
X_new = 0.3·X + 0.2129·Y_1 + 0.2029·Y_2 + 0.1557·Y_3 + 0.0721·Y_4 + 0.0564·Y_5
where X is the daily load curve of the first load sample and Y_1, ..., Y_5 are the daily load curves of the 5 target subsamples.
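A quick numeric check of the weight vector above (illustrative only):

```python
import numpy as np

scores = np.array([0.3041, 0.2899, 0.2224, 0.1030, 0.0806])   # Score_1 .. Score_5
a, b = 0.3, 0.7
w_new = np.concatenate(([a], b * scores))
print(np.round(w_new, 4))    # approximately [0.3, 0.2129, 0.2029, 0.1557, 0.0721, 0.0564]
# the new daily load curve is then w_new @ np.vstack([x, y1, y2, y3, y4, y5])
```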
and traversing the load data set to generate a park load data sample set.
To execute the method for constructing a campus load data sample in the above method embodiment and achieve the corresponding functions and technical effects, an embodiment of the present invention further provides a corresponding construction apparatus. Referring to fig. 2, fig. 2 is a block diagram illustrating the structure of a device for constructing a campus load data sample according to an embodiment of the present invention. For convenience of explanation, only the parts related to this embodiment are shown. The apparatus for constructing a campus load data sample according to the embodiment of the present invention includes:
an obtaining module 201, configured to obtain a load data set, where the load data set includes multiple load curves of a target campus and other campuses, and load attributes corresponding to the load curves, and select one load attribute from the multiple load attributes as a reference attribute, and the remaining load attributes as a complement of the reference attribute, select a load curve corresponding to the reference attribute from the target campus as a first load sample, and select multiple load curves corresponding to the complement from the other campuses as sub-samples to form a second load sample;
a calculating module 202, configured to calculate a warped path distance between each subsample in the second load sample and the first load sample;
the analysis module 203 is configured to analyze a plurality of target subsamples having the minimum warped path distance to the first load sample to obtain distribution coefficients of the plurality of target subsamples;
a weighting module 204, configured to weight the first load sample and the plurality of target subsamples according to the distribution coefficient, and generate a new load sample;
a generating module 205, configured to generate a campus load data sample set based on the first load sample and the new load sample.
In an embodiment, the calculating module 202 is specifically configured to:
calculating a Euclidean distance between each subsample in the second load sample and the first load sample;
and calculating the warped path distance between the first load sample and each subsample from the Euclidean distance by using a dynamic time warping algorithm.
In one embodiment, the expression of the dynamic time warping algorithm is:
L(i, j) = D(i, j) + min{L(i-1, j-1), L(i-1, j), L(i, j-1)}
where L(i, j) represents the warped path distance, D(i, j) represents the Euclidean distance, and min{L(i-1, j-1), L(i-1, j), L(i, j-1)} represents the minimum cumulative distance between element i of the first load sample and element j of the subsample in the second load sample when a preset constraint condition is satisfied.
In an embodiment, the analysis module 203 is specifically configured to:
generating a judgment matrix based on load attributes corresponding to the target subsamples;
calculating the maximum eigenvalue of the judgment matrix, and determining the eigenvector corresponding to the maximum eigenvalue;
normalizing the characteristic vector to obtain a weight matrix;
and according to the weight matrix, scoring the plurality of target subsamples to obtain the distribution coefficient of each target subsample.
In an embodiment, the weighting module 204 is specifically configured to:
based on the distribution coefficient, carrying out weight distribution on a plurality of target subsamples to obtain a target weight of each target subsample;
and according to the target weight and a preset weight of the first load sample, carrying out weighted summation on the first load sample and the plurality of target subsamples to obtain the new load sample.
In one embodiment, the apparatus further comprises:
the adding module is used for adding the new load sample to the first load sample to obtain a new load data set if the sum of the data volumes of the first load sample and the new load sample is not larger than a preset value;
and a regeneration module, configured to regenerate a new load sample with the new load data set until the sum of the data amounts is greater than a preset value, and perform the step of generating a campus load data sample set based on the first load sample and the new load sample.
The above-described apparatus for constructing a campus load data sample can implement the method for constructing a campus load data sample according to the above-described method embodiment. The alternatives in the above-described method embodiments are also applicable to this embodiment and will not be described in detail here. The rest of the embodiments of the present invention may refer to the contents of the above method embodiments, and in this embodiment, details are not repeated.
Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in fig. 3, the computer device 3 of this embodiment includes: at least one processor 30 (only one shown in fig. 3), a memory 31, and a computer program 32 stored in the memory 31 and executable on the at least one processor 30, the processor 30 implementing the steps of any of the above-described method embodiments when executing the computer program 32.
The computer device 3 may be a computing device such as a smart phone, a tablet computer, a desktop computer, and a cloud server. The computer device may include, but is not limited to, a processor 30, a memory 31. Those skilled in the art will appreciate that fig. 3 is merely an example of the computer device 3, and does not constitute a limitation of the computer device 3, and may include more or less components than those shown, or combine some of the components, or different components, such as input output devices, network access devices, etc.
The Processor 30 may be a Central Processing Unit (CPU), or another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 31 may in some embodiments be an internal storage unit of the computer device 3, such as a hard disk or a memory of the computer device 3. The memory 31 may also be an external storage device of the computer device 3 in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the computer device 3. Further, the memory 31 may also include both an internal storage unit and an external storage device of the computer device 3. The memory 31 is used for storing an operating system, an application program, a BootLoader (BootLoader), data, and other programs, such as program codes of the computer programs. The memory 31 may also be used to temporarily store data that has been output or is to be output.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in any of the method embodiments described above.
Embodiments of the present invention provide a computer program product, which when running on a computer device, enables the computer device to implement the steps in the above method embodiments when executed.
In several embodiments provided by the present invention, it will be understood that each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above-mentioned embodiments are provided to further explain the objects, technical solutions and advantages of the present invention in detail, and it should be understood that the above-mentioned embodiments are only examples of the present invention and are not intended to limit the scope of the present invention. It should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A construction method of a park load data sample is characterized by comprising the following steps:
acquiring a load data set, wherein the load data set comprises a plurality of load curves of a target park and other parks and load attributes corresponding to the load curves, one load attribute is selected from the load attributes as a reference attribute, the remaining load attributes are used as a complement of the reference attribute, one load curve corresponding to the reference attribute is selected from the target park as a first load sample, and a plurality of load curves corresponding to the complement are selected from the other parks as subsamples to form a second load sample;
calculating a warped path distance between each subsample in the second load sample and the first load sample;
analyzing a plurality of target subsamples having the minimum warped path distance to the first load sample to obtain distribution coefficients of the plurality of target subsamples;
weighting the first load sample and a plurality of target subsamples according to the distribution coefficient to generate a new load sample;
generating a campus load data sample set based on the first load sample and the new load sample.
2. The method of building a campus load data sample according to claim 1, wherein the calculating a warped path distance between each subsample of the second load samples and the first load samples includes:
calculating a Euclidean distance between each subsample in the second load sample and the first load sample;
and calculating the warped path distance between the first load sample and each subsample from the Euclidean distance by using a dynamic time warping algorithm.
3. The method of constructing load data samples for a campus of claim 2 wherein the dynamic time warping algorithm is expressed as:
L(i, j) = D(i, j) + min{L(i-1, j-1), L(i-1, j), L(i, j-1)}
wherein L(i, j) represents the warped path distance, D(i, j) represents the Euclidean distance, and min{L(i-1, j-1), L(i-1, j), L(i, j-1)} represents the minimum cumulative distance between element i of the first load sample and element j of a subsample in the second load sample when a preset constraint condition is satisfied.
4. The method according to claim 1, wherein the analyzing the plurality of target subsamples having the minimum warped path distance to the first load sample to obtain the distribution coefficients of the plurality of target subsamples comprises:
generating a judgment matrix based on load attributes corresponding to the target subsamples;
calculating the maximum eigenvalue of the judgment matrix, and determining the eigenvector corresponding to the maximum eigenvalue;
normalizing the characteristic vectors to obtain a weight matrix;
and according to the weight matrix, scoring the plurality of target subsamples to obtain the distribution coefficient of each target subsample.
5. The method according to claim 1, wherein the step of weighting the first load sample and a plurality of the target subsamples according to the distribution coefficient to generate a new load sample comprises:
based on the distribution coefficient, carrying out weight distribution on a plurality of target subsamples to obtain a target weight of each target subsample;
and according to the target weight and a preset weight of the first load sample, carrying out weighted summation on the first load sample and the plurality of target subsamples to obtain the new load sample.
6. The method of constructing a campus load data sample set according to claim 1, wherein before generating the campus load data sample set based on the first load sample and the new load sample, the method comprises:
if the sum of the data volumes of the first load sample and the new load sample is not larger than a preset value, adding the new load sample to the first load sample to obtain a new load data set;
and generating a new load sample again by the new load data set until the sum of the data amount is larger than a preset value, and then entering the step of generating the campus load data sample set based on the first load sample and the new load sample.
7. A building device of a park load data sample is characterized by comprising:
an obtaining module, configured to obtain a load data set, where the load data set includes multiple load curves of a target campus and other campuses, and load attributes corresponding to the load curves, and select one load attribute from the multiple load attributes as a reference attribute, and the remaining load attributes as a complement of the reference attribute, select one load curve corresponding to the reference attribute from the target campus as a first load sample, and select multiple load curves corresponding to the complement as sub-samples from the other campuses to form a second load sample;
a calculation module for calculating a warped path distance between each subsample in the second load sample and the first load sample;
the analysis module is used for analyzing a plurality of target subsamples having the minimum warped path distance to the first load sample to obtain distribution coefficients of the plurality of target subsamples;
the weighting module is used for weighting the first load sample and the plurality of target subsamples according to the distribution coefficient to generate a new load sample;
and the generating module is used for generating a park load data sample set based on the first load sample and the new load sample.
8. The campus load data sample construction apparatus of claim 7, wherein the calculation module is specifically configured to:
calculating a Euclidean distance between each subsample in the second load sample and the first load sample;
and calculating the warped path distance between the first load sample and each subsample from the Euclidean distance by using a dynamic time warping algorithm.
9. A computer device comprising a processor and a memory, the memory for storing a computer program which, when executed by the processor, implements a method of building a campus load data sample as claimed in any one of claims 1 to 6.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out a method of constructing a campus load data sample as claimed in any one of claims 1 to 6.
CN202210978081.7A 2022-08-15 2022-08-15 Method, device, equipment and storage medium for constructing park load data sample Pending CN115659165A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210978081.7A CN115659165A (en) 2022-08-15 2022-08-15 Method, device, equipment and storage medium for constructing park load data sample

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210978081.7A CN115659165A (en) 2022-08-15 2022-08-15 Method, device, equipment and storage medium for constructing park load data sample

Publications (1)

Publication Number Publication Date
CN115659165A true CN115659165A (en) 2023-01-31

Family

ID=85024093

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210978081.7A Pending CN115659165A (en) 2022-08-15 2022-08-15 Method, device, equipment and storage medium for constructing park load data sample

Country Status (1)

Country Link
CN (1) CN115659165A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117149099A (en) * 2023-10-31 2023-12-01 江苏华鲲振宇智能科技有限责任公司 Calculation and storage split server system and control method
CN117149099B (en) * 2023-10-31 2024-03-12 江苏华鲲振宇智能科技有限责任公司 Calculation and storage split server system and control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination