CN113011472B

CN113011472B - Multi-section electric power quotation curve similarity judging method and device

Info

Publication number: CN113011472B
Application number: CN202110222513.7A
Authority: CN
Inventors: 赵越; 刘思捷; 白杨; 林少华; 蔡秋娜; 高海翔; 龚超
Original assignee: Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Current assignee: Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Priority date: 2021-02-26
Filing date: 2021-02-26
Publication date: 2023-09-01
Anticipated expiration: 2041-02-26
Also published as: CN113011472A

Abstract

The invention provides a method and a device for judging the similarity of multi-section power quotation curves. The method comprises the following steps: acquiring spot quotation data of market subjects and converting the spot quotation data to per-unit values to form a point set P; performing density clustering on the quotation points in the point set P according to a preset optimization variable constraint condition, and outputting a density clustering result; optimizing a preset objective function based on the density clustering result to obtain the neighborhood radius of the optimal clustering and the corresponding clusters, core points, classified points and unclassified points; and screening out unit clusters with similar quotations according to the neighborhood radius of the optimal clustering and the corresponding clusters, core points, classified points and unclassified points. By optimizing the clustering parameters, the invention avoids an excessive number of unclassified points when the quotation point set is optimally clustered, thereby effectively improving the accuracy of similarity judgment for multi-section power quotation curves.

Description

Multi-section electric power quotation curve similarity judging method and device

Technical Field

The invention relates to the technical field of power data analysis, in particular to a method and a device for judging similarity of multi-section power quotation curves.

Background

With the advance of the new round of power market reform, trial operation of the spot market has been carried out in China. In the power spot market, a market subject declares a multi-segment quotation according to the costs and characteristics of its own units, combined with the market boundary conditions and its prejudgment of the competitive landscape, so spot quotations are often highly complex. As market subjects adapt to the market over a long period, the spot quotations of the units gradually change from high differentiation to local differentiation, and collusion may occur, in which several market subjects form a de facto quotation alliance with the intention of controlling the market price. Quotation behavior analysis and collusion identification will therefore play an important role in power spot market monitoring.

Quotation curve similarity analysis in the power spot market is the basic work for judging whether collusive quotation exists. It mainly uses cluster analysis to determine which unit clusters have similar quotations and how similar the quotations within each cluster are. In collusive bidding analysis based on density clustering, the clustering parameters are the neighborhood radius and the minimum number of quotation points in the neighborhood. The setting of these parameters affects the number of clusters and the similarity within them after density clustering. The similarity of a density clustering is usually measured by the sum of the distances from all classified points to their core points, but this approach may leave too many points unclassified, so that similar quotation behavior among certain units is ignored. How to set the clustering parameters reasonably and classify the power quotation points better is the key to using cluster analysis to analyze quotation patterns in the power spot market.

Disclosure of Invention

The invention aims to provide a method and a device for judging the similarity of a multi-section power quotation curve, which are used for solving the technical problems, so that the accuracy of judging the similarity of the multi-section power quotation curve can be effectively improved.

In order to solve the technical problems, the invention provides a method for judging similarity of multi-section power quotation curves, which comprises the following steps:

acquiring spot quotation data of market subjects and converting the spot quotation data to per-unit values to form a point set P;

carrying out density clustering on the quotation points in the point set P according to a preset optimization variable constraint condition, and outputting a density clustering result; the density clustering result comprises all neighborhood radiuses conforming to the constraint conditions of the optimized variables and corresponding cluster types, core points, classified points and unclassified points;

optimizing and calculating a preset objective function based on the density clustering result, and obtaining the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified point and the unclassified point corresponding to the optimal cluster;

and screening out quotation similar unit clusters according to the neighborhood radius of the optimal clusters and the cluster class, the core point, the classified points and the unclassified points corresponding to the optimal clusters.

Performing density clustering on the quotation points in the point set P according to the preset optimization variable constraint condition and outputting a density clustering result specifically comprises the following steps:

core point judgment is carried out on the quotation points in the point set P according to the preset density clustering neighborhood radius and the preset minimum quotation point number in the neighborhood;

performing density clustering on a point set P based on each core point, calculating Hausdorff distance between the core point and unprocessed points in the point set P, classifying the points meeting preset conditions into the same cluster according to each Hausdorff distance, and marking the points of the same cluster as processed points;

repeating the core point judgment and density clustering steps until all points in the point set P are classified into clusters or are judged to be non-core points, and classifying points which do not belong to any clusters as unclassified points;

and outputting all the neighborhood radiuses conforming to the constraint conditions of the optimized variables and the cluster class, the core point, the classified points and the unclassified points corresponding to the neighborhood radiuses.

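The objective function is:

$$\min_{d}\ \sum_{m=1}^{M} H(P^m, C^m) + (N - M)\cdot\frac{1}{M}\sum_{m=1}^{M} H(P^m, C^m);$$

wherein $P^m$ is the m-th classified point, $P^m = [(p_1^m, q_1^m), \ldots, (p_r^m, q_r^m), \ldots]$; $p_r^m$ and $q_r^m$ are respectively the declared price and the declared quantity of the r-th declared segment of the market subject represented by the m-th classified point; $C^m$ is the core point of the cluster in which the m-th classified point is located; $H(P^m, C^m)$ represents the Hausdorff distance from $P^m$ to $C^m$; M represents the number of classified quotation points, and N is the total number of quotation points;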

the constraint condition of the optimization variable d of the optimal cluster is as follows:

；

wherein $d$ is the neighborhood radius of the density clustering, $P$ is the set of all N points to be optimally clustered, k denotes the number of significant digits, and $H(P^i, P^j)$ is the Hausdorff distance between the i-th quotation multidimensional vector $P^i$ and the j-th quotation multidimensional vector $P^j$.

Further, the classifying the points meeting the preset condition into the same cluster according to each hausdorff distance specifically includes:

if the Hausdorff distance between an unprocessed point and the core point is judged to satisfy the condition of being directly density-reachable, density-reachable or density-connected, classifying the core point and the corresponding unprocessed points into the same cluster.

Further, when the optimal clustering is performed, only the quotation points other than those of the first several declared segments are optimally clustered.

Further, screening out a quotation similar unit cluster according to the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified points and the unclassified points corresponding to the optimal cluster, specifically:

and selecting a unit corresponding to the core point of each cluster and units corresponding to other quotation points in the neighborhood of the core point according to the neighborhood radius of the optimal cluster and the clusters, the core points, the classified points and the unclassified points corresponding to the optimal cluster, so as to obtain the quotation similar unit cluster.

In order to solve the same technical problem, the invention also provides a device for judging the similarity of multi-section power quotation curves, comprising:

the point set acquisition module is used for acquiring spot quotation data of market subjects and converting the spot quotation data to per-unit values to form a point set P;

the density clustering module is used for carrying out density clustering on the quotation points in the point set P according to a preset optimization variable constraint condition and outputting a density clustering result; the density clustering result comprises all neighborhood radiuses conforming to the constraint conditions of the optimized variables and corresponding cluster types, core points, classified points and unclassified points;

the optimization calculation module is used for carrying out optimization calculation on a preset objective function based on the density clustering result and obtaining the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified point and the unclassified point corresponding to the neighborhood radius;

and the similar quotation screening module is used for screening quotation similar unit clusters according to the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified points and the unclassified points corresponding to the optimal cluster.

The density clustering module is specifically used for:

The objective function is:

；

Further, the similar offer screening module is specifically configured to: and selecting a unit corresponding to the core point of each cluster and units corresponding to other quotation points in the neighborhood of the core point according to the neighborhood radius of the optimal cluster and the clusters, the core points, the classified points and the unclassified points corresponding to the optimal cluster, so as to obtain the quotation similar unit cluster.

Compared with the prior art, the invention has the following beneficial effects:

The invention provides a method and a device for judging the similarity of multi-section power quotation curves. The method comprises the following steps: acquiring spot quotation data of market subjects and converting the spot quotation data to per-unit values to form a point set P; performing density clustering on the quotation points in the point set P according to a preset optimization variable constraint condition, and outputting a density clustering result, the density clustering result comprising all neighborhood radii that satisfy the optimization variable constraint condition and the corresponding clusters, core points, classified points and unclassified points; optimizing a preset objective function based on the density clustering result to obtain the neighborhood radius of the optimal clustering and the corresponding clusters, core points, classified points and unclassified points; and screening out unit clusters with similar quotations according to the neighborhood radius of the optimal clustering and the corresponding clusters, core points, classified points and unclassified points. By optimizing the clustering parameters, the invention avoids an excessive number of unclassified points when the quotation point set is optimally clustered, thereby effectively improving the accuracy of similarity judgment for multi-section power quotation curves.

Drawings

FIG. 1 is a flow chart of a method for determining similarity of multi-segment power quotation curves according to one embodiment of the invention;

FIG. 2 is another flow chart of the method for determining similarity of multi-segment power quotation curves according to one embodiment of the invention;

FIG. 3 is a schematic structural diagram of a device for determining similarity of multi-segment power quotation curves according to one embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1, an embodiment of the present invention provides a method for determining similarity of a multi-segment power quotation curve, including the steps of:

S1, acquiring spot quotation data of market subjects and converting the spot quotation data to per-unit values to form a point set P.

S2, performing density clustering on the quotation points in the point set P according to a preset optimization variable constraint condition, and outputting a density clustering result; the density clustering result comprises all neighborhood radiuses conforming to the constraint conditions of the optimization variables and corresponding cluster types, core points, classified points and unclassified points.

Further, step S2 specifically includes:

S201, performing core point judgment on the quotation points in the point set P according to a preset density clustering neighborhood radius and a preset minimum number of quotation points in the neighborhood.

S202, carrying out density clustering on a point set P based on each core point, calculating Hausdorff distance between the core point and an unprocessed point in the point set P, classifying the points meeting the preset condition into the same cluster according to each Hausdorff distance, and marking the points of the same cluster as processed points.

S203, repeating the steps S201-S202 until all points in the point set P are classified into clusters or judged as non-core points, and classifying points which do not belong to any clusters as unclassified points.

S204, outputting all neighborhood radiuses conforming to the constraint conditions of the optimized variables and corresponding cluster types, core points, classified points and unclassified points.

And S3, carrying out optimization calculation on a preset objective function based on the density clustering result, and obtaining the neighborhood radius of the optimal cluster, the cluster class, the core point, the classified point and the unclassified point corresponding to the optimal cluster.

S4, screening out quotation similar unit clusters according to the neighborhood radius of the optimal clusters and the cluster class, the core point, the classified points and the unclassified points corresponding to the optimal clusters.

Further, step S4 specifically includes:

Referring to fig. 2, based on the above scheme, in order to better understand the multi-segment power quotation curve similarity determination method provided by the embodiment of the invention, the following details are described:

1. Read the spot quotation data of the market subjects in the statistical period, convert it to per-unit values to form quotation point data (each multi-segment quotation forms one multidimensional vector point), and record the points in a point set $P = \{P^1, P^2, \ldots, P^N\}$, where $P^n = [(p_1^n, q_1^n), \ldots, (p_r^n, q_r^n), \ldots]$ and $p_r^n$, $q_r^n$ are the declared price and the declared quantity of the r-th declared segment of the market subject represented by the n-th point. Price games in the spot market are mainly concentrated on the later segments of the quotation, and a generator may declare its first few segments below its generation cost in order to win the bid for start-up. Therefore, the quotation points other than those of the first two segments are selected for optimal clustering.
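The per-unit conversion and the removal of the first two segments can be sketched as follows. This is a minimal illustration, not the method as specified: the text does not state the per-unit bases, so dividing prices by a price base (for example a market price cap) and quantities by the unit's rated capacity is an assumption, and the names `to_per_unit`, `price_base` and `capacity_base` are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (declared price, declared quantity)


def to_per_unit(
    bid: List[Segment],
    price_base: float,
    capacity_base: float,
    skip_segments: int = 2,
) -> List[Segment]:
    """Drop the first `skip_segments` segments and normalize the rest.

    Assumption: prices are divided by `price_base` and quantities by
    `capacity_base`; the source text does not specify the bases.
    """
    if price_base <= 0 or capacity_base <= 0:
        raise ValueError("per-unit bases must be positive")
    kept = bid[skip_segments:]
    return [(price / price_base, qty / capacity_base) for price, qty in kept]


# A hypothetical five-segment bid: (price, cumulative quantity in MW).
raw_bid = [
    (0.0, 300.0),
    (150.0, 360.0),
    (300.0, 420.0),
    (450.0, 510.0),
    (600.0, 600.0),
]
point = to_per_unit(raw_bid, price_base=1500.0, capacity_base=600.0)
print(point)  # three (price, quantity) pairs in per-unit values
```

Each resulting list of (price, quantity) pairs plays the role of one multidimensional vector point in the point set P.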

2. Optimal clustering is performed on the N quotation points, excluding the quotation points of the first two segments. The optimization variable is the neighborhood radius d of the density clustering, and the objective function is:

$$\min_{d}\ \sum_{m=1}^{M} H(P^m, C^m) + (N - M)\cdot\frac{1}{M}\sum_{m=1}^{M} H(P^m, C^m) \tag{1}$$

where M represents the number of classified quotation points; $P^m$ is the m-th classified point, $P^m = [(p_1^m, q_1^m), \ldots, (p_r^m, q_r^m), \ldots]$; $p_r^m$ and $q_r^m$ are respectively the declared price and the declared quantity of the r-th declared segment of the market subject represented by the m-th classified point; $C^m$ is the core point of the cluster in which the m-th classified point is located; and $H(P^m, C^m)$ represents the Hausdorff distance from $P^m$ to $C^m$, calculated as shown in formulas (3) to (6). The optimization variable d of the optimal clustering model is subject to the following constraint:

(2)

In the formula, $P$ is the set of all N points to be optimally clustered; $d$ is the neighborhood radius of the density clustering, which is retained to k significant digits so that the objective function is evaluated on a discrete set of values and the optimization model is guaranteed to have a solution; and $H(P^i, P^j)$ is the Hausdorff distance between the quotation multidimensional vectors $P^i$ and $P^j$.
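As an illustration only, a constraint consistent with the symbols defined for formula (2) could take the following form. The lower and upper bounds on d are an assumption made here for concreteness and should not be read as the formula of the patent.

```latex
% Assumed form: d lies between the smallest and largest pairwise
% Hausdorff distances and is restricted to k significant digits.
\min_{\substack{P^i, P^j \in P \\ i \neq j}} H(P^i, P^j)
\;\le\; d \;\le\;
\max_{P^i, P^j \in P} H(P^i, P^j),
\qquad d \text{ retained to } k \text{ significant digits}
```

Under this assumed form the feasible set of d is finite, which is what allows the optimization to be carried out by enumerating candidate radii.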

3. When the objective function of the optimization model in step 2 is calculated, density clustering must be performed on all N points in the point set P to obtain, for each neighborhood radius d that satisfies the constraint of formula (2), the corresponding clusters, core points, classified points and unclassified points. The specific process of density clustering is as follows:

3.1 Input the density clustering neighborhood radius d of the current optimization iteration, and set the minimum number MinNum of quotation points in the neighborhood. MinNum can be set according to the number of units in the spot market and the requirements of the density clustering. For example, if there are 50 units in the market, MinNum can be set to 5; that is, in each cluster obtained by density clustering, 5 other units are judged to have quotations similar to that of the unit corresponding to the core point.

3.2 Take a point $P^i$ from the point set P and judge whether it is a core point.

If the d-neighborhood of the quotation point contains at least MinNum quotation points, the quotation point is a core point. All unprocessed points in the point set P are then gathered, and the Hausdorff distance between each unprocessed point and the core point $P^i$ is calculated.

If the quotation point is not a core point, skip it and perform the core point judgment on the next point in the point set P.

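The Hausdorff distance is calculated as follows:

$$H(A, B) = \max\{h(A, B),\ h(B, A)\} \tag{3}$$

In the formula, $h(A, B)$ is the one-way Hausdorff distance from point set $A$ to point set $B$, and $h(B, A)$ is the one-way Hausdorff distance from point set $B$ to point set $A$, calculated as follows:

$$h(A, B) = \max_{a \in A}\ \min_{b \in B}\ \lVert a - b \rVert \tag{4}$$

$$h(B, A) = \max_{b \in B}\ \min_{a \in A}\ \lVert b - a \rVert \tag{5}$$

In the formula, $\lVert a - b \rVert$ represents the Euclidean distance between the two-dimensional points $a = (p_a, q_a)$ and $b = (p_b, q_b)$:

$$\lVert a - b \rVert = \sqrt{(p_a - p_b)^2 + (q_a - q_b)^2} \tag{6}$$

The meaning of $h(A, B)$ is as follows: for each point of point set $A$ (r points in total), the Euclidean distances to all points of point set $B$ (s points in total) are calculated and the minimum of these s distances is taken; the maximum of the r minimum Euclidean distances is then taken.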
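The calculation described above can be sketched in a few lines. The function names `euclid`, `directed_hausdorff` and `hausdorff` are chosen here for illustration, and each quotation curve is assumed to be given as a list of (price, quantity) pairs in per-unit values.

```python
import math
from typing import Sequence, Tuple

Point2D = Tuple[float, float]  # (price, quantity)


def euclid(a: Point2D, b: Point2D) -> float:
    """Euclidean distance between two (price, quantity) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def directed_hausdorff(A: Sequence[Point2D], B: Sequence[Point2D]) -> float:
    """One-way distance: the largest of the nearest-neighbour distances from A to B."""
    return max(min(euclid(a, b) for b in B) for a in A)


def hausdorff(A: Sequence[Point2D], B: Sequence[Point2D]) -> float:
    """Two-way Hausdorff distance: the larger of the two one-way distances."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
print(directed_hausdorff(A, B))  # 0.0: every point of A also appears in B
print(directed_hausdorff(B, A))  # 2.0: (1.0, 2.0) is 2.0 away from its nearest point in A
print(hausdorff(A, B))           # 2.0
```

The example shows why both directions are needed: the one-way distance from A to B is zero even though B contains a point far from A.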

3.3 Judge, from the Hausdorff distance between each unprocessed point and the core point, whether the two points are directly density-reachable, density-reachable or density-connected. If so, classify the points that satisfy the condition into the same cluster and mark the points in that cluster (including the core point) as processed. The three conditions are judged as follows:

1) Directly density-reachable: if the quotation point x lies within the d-neighborhood of a core point y (boundary points included), x and y are directly density-reachable.

2) Density-reachable: suppose there are quotation points x, y and z such that x and y are directly density-reachable and y and z are directly density-reachable, but z is not in the d-neighborhood of x. In this case x and z are not directly density-reachable, but the point y in the d-neighborhood of x is directly density-reachable with z, so x and z are defined as density-reachable.

3) Density-connected: if the quotation point w is neither directly density-reachable nor density-reachable with the core point x, but w is density-reachable with a point in the d-neighborhood of x, then x and w are defined as density-connected.

3.4 Repeat steps 3.2 and 3.3 until all N points in the point set P have been classified into clusters or judged to be non-core points, and classify the points that do not belong to any cluster as unclassified points. Output all clusters, core points, classified points and unclassified points corresponding to the density clustering neighborhood radius d of the current optimization iteration.
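Steps 3.1 to 3.4 resemble a DBSCAN-style procedure that uses the Hausdorff distance between curves. The sketch below is one possible reading, not the implementation of the invention. It assumes that a point's neighborhood count excludes the point itself and that the first core point found for a cluster is recorded as that cluster's core point; the names `density_cluster`, `labels` and `seeds` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Curve = Sequence[Tuple[float, float]]  # one bid: list of (price, quantity)


def hausdorff(a: Curve, b: Curve) -> float:
    """Two-way Hausdorff distance between two curves."""
    def one_way(u: Curve, v: Curve) -> float:
        return max(min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in v) for p in u)
    return max(one_way(a, b), one_way(b, a))


def density_cluster(
    points: List[Curve], d: float, min_num: int
) -> Tuple[List[int], Dict[int, int]]:
    """Cluster curves for one radius d.

    Returns (labels, seeds): labels[i] is the cluster id of point i, or -1
    if the point is unclassified; seeds maps a cluster id to the index of
    the core point that started the cluster (an assumption of this sketch).
    """
    n = len(points)
    dist = [[hausdorff(points[i], points[j]) for j in range(n)] for i in range(n)]
    # Neighborhood of i: other points within distance d (boundary included).
    neighbors = [[j for j in range(n) if j != i and dist[i][j] <= d] for i in range(n)]
    labels = [-1] * n
    seeds: Dict[int, int] = {}
    cluster_id = 0
    for i in range(n):
        # Skip points already processed and points that are not core points.
        if labels[i] != -1 or len(neighbors[i]) < min_num:
            continue
        labels[i] = cluster_id
        seeds[cluster_id] = i
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] != -1:
                continue
            labels[j] = cluster_id
            # Expand only through core points (density-reachability).
            if len(neighbors[j]) >= min_num:
                queue.extend(neighbors[j])
        cluster_id += 1
    return labels, seeds


curves = [
    [(0.20, 0.5), (0.40, 1.0)],
    [(0.21, 0.5), (0.41, 1.0)],
    [(0.19, 0.5), (0.39, 1.0)],
    [(0.80, 0.5), (0.90, 1.0)],  # a bid far from the other three
]
labels, seeds = density_cluster(curves, d=0.05, min_num=2)
print(labels, seeds)  # [0, 0, 0, -1] {0: 0}
```

In this small example the first three curves form one cluster seeded by curve 0, and the fourth curve remains unclassified.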

4. Optimize the objective function to obtain the neighborhood radius d of the optimal clustering and the corresponding clusters, core points, classified points and unclassified points. When the method is applied to collusion analysis, the unit corresponding to the core point of each cluster, together with the units corresponding to the other quotation points in the d-neighborhood of that core point, forms a screened unit cluster whose spot quotations are highly similar and for which collusive quotation is possible. In addition, the method can be used to analyze the quotation pattern of a specific market subject over a certain period, or to summarize the price and capacity characteristics of the quotation points of each cluster to form a library of typical quotation strategies.
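The selection of the optimal radius can be sketched as an enumeration over precomputed clustering results. This is a hedged illustration: the objective below counts each unclassified point at the average distance of the classified points, as the description states, and it assumes that core points count as classified points with distance 0. The names `objective`, `best_radius`, `results` and `dist` are hypothetical.

```python
from typing import Dict, List, Tuple

Result = Tuple[List[int], Dict[int, int]]  # (labels, seeds) for one radius


def objective(labels: List[int], seeds: Dict[int, int], dist: List[List[float]]) -> float:
    """Sum of distances to core points, with unclassified points set to the average."""
    n = len(labels)
    classified = [i for i, c in enumerate(labels) if c != -1]
    m = len(classified)
    if m == 0:
        return float("inf")  # no cluster at all: treat as the worst case
    total = sum(dist[i][seeds[labels[i]]] for i in classified)
    return total + (n - m) * total / m


def best_radius(results: Dict[float, Result], dist: List[List[float]]) -> float:
    """Return the candidate radius with the smallest objective (ties: smaller radius)."""
    return min(results, key=lambda d: (objective(*results[d], dist), d))


# Pairwise Hausdorff distances for four hypothetical quotation points.
dist = [
    [0.00, 0.01, 0.01, 0.60],
    [0.01, 0.00, 0.02, 0.60],
    [0.01, 0.02, 0.00, 0.60],
    [0.60, 0.60, 0.60, 0.00],
]
# Clustering results for two candidate radii; -1 marks an unclassified point.
results = {
    0.05: ([0, 0, 0, -1], {0: 0}),
    0.70: ([0, 0, 0, 0], {0: 0}),
}
best = best_radius(results, dist)
print(best)  # 0.05
```

With the small radius the objective is 0.02 + 0.02/3, while the large radius pulls the distant point into the cluster and raises the objective to 0.62, so the small radius is selected.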

It should be noted that, for simplicity of description, the above method or flow embodiments are all described as a series of combinations of acts, but it should be understood by those skilled in the art that the embodiments of the present invention are not limited by the order of acts described, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are all alternative embodiments and that the actions involved are not necessarily required for the embodiments of the present invention.

It should be noted that, in the embodiment of the invention, similarity judgment of multi-section power quotation curves is performed with an optimal clustering model on the basis of per-unit spot quotations. When classifying power quotation curves by density clustering, the method of the invention optimizes the clustering parameters: the neighborhood radius of the density clustering is the optimization variable of the optimization model, and minimizing the sum of the distances from all points to their core points is the objective function. The neighborhood radius takes only k significant digits, which converts the optimal clustering model into a discrete optimization model, guarantees that the optimization model has an extremum, and avoids an endless optimization process. The method thus obtains an optimized solution at a given precision, which meets engineering requirements.

The method sets the distance from each unclassified point to a core point equal to the average of the distances from all classified points to the core points of their clusters, instead of 0 as in traditional density clustering. This setting balances the numbers of classified and unclassified points and avoids leaving too many points unclassified merely to minimize the sum of the distances from all points to their core points. In addition, the distances from the core point to other points are calculated with the Hausdorff distance instead of the Euclidean distance, which allows the method to judge quotation curve similarity more accurately for multi-section quotations (five sections or more).
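A small worked comparison illustrates the trade-off. The numbers are hypothetical, and the averaged objective is written in the form implied by the description (each unclassified point is counted at the average distance of the classified points).

```latex
% Hypothetical data: N = 10 quotation points and two candidate radii.
%   d_1: M = 4 classified points, sum of distances S = 0.040
%   d_2: M = 9 classified points, sum of distances S = 0.081
%
% Unclassified points counted as 0 (traditional density clustering):
f_0(d_1) = 0.040 \;<\; f_0(d_2) = 0.081
\quad\Rightarrow\quad d_1 \text{ is chosen, leaving 6 points unclassified.}

% Unclassified points counted at the average distance S / M:
f(d_1) = 0.040 + (10 - 4)\cdot\frac{0.040}{4} = 0.100, \qquad
f(d_2) = 0.081 + (10 - 9)\cdot\frac{0.081}{9} = 0.090
\quad\Rightarrow\quad d_2 \text{ is chosen, leaving 1 point unclassified.}
```

In this example the zero-distance convention favors the radius that leaves most points unclassified, while the averaged convention favors the radius that classifies nearly all of them.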

Referring to fig. 3, in order to solve the same technical problem, the present invention further provides a device for determining similarity of multi-segment power quotation curves, including:

the point set acquisition module 1 is used for acquiring spot quotation data of market subjects and converting the spot quotation data to per-unit values to form a point set P;

the density clustering module 2 is used for carrying out density clustering on the quotation points in the point set P according to a preset optimization variable constraint condition and outputting a density clustering result; the density clustering result comprises all neighborhood radiuses conforming to the constraint conditions of the optimized variables and corresponding cluster types, core points, classified points and unclassified points;

the optimization calculation module 3 is used for carrying out optimization calculation on a preset objective function based on the density clustering result and obtaining the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified point and the unclassified point corresponding to the neighborhood radius;

and the similar quotation screening module 4 is used for screening quotation similar unit clusters according to the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified points and the unclassified points corresponding to the optimal cluster.

Further, the density clustering module 2 is specifically configured to:

Further, the similar offer screening module 4 is specifically configured to: and selecting a unit corresponding to the core point of each cluster and units corresponding to other quotation points in the neighborhood of the core point according to the neighborhood radius of the optimal cluster and the clusters, the core points, the classified points and the unclassified points corresponding to the optimal cluster, so as to obtain the quotation similar unit cluster.
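The screening performed by module 4 can be sketched as follows, assuming the optimal radius d, the core point of each cluster and the pairwise Hausdorff distances are already available. The function name `screen_similar_units`, the unit identifiers and the `seeds` mapping (cluster id to core point index) are hypothetical.

```python
from typing import Dict, List


def screen_similar_units(
    unit_ids: List[str],
    dist: List[List[float]],
    seeds: Dict[int, int],
    d: float,
) -> Dict[str, List[str]]:
    """Map the unit of each core point to the units within its d-neighborhood."""
    groups: Dict[str, List[str]] = {}
    for core in seeds.values():
        members = [
            unit_ids[j]
            for j in range(len(unit_ids))
            if j != core and dist[core][j] <= d  # boundary points included
        ]
        groups[unit_ids[core]] = members
    return groups


unit_ids = ["G1", "G2", "G3", "G4"]
dist = [
    [0.00, 0.01, 0.01, 0.60],
    [0.01, 0.00, 0.02, 0.60],
    [0.01, 0.02, 0.00, 0.60],
    [0.60, 0.60, 0.60, 0.00],
]
groups = screen_similar_units(unit_ids, dist, seeds={0: 0}, d=0.05)
print(groups)  # {'G1': ['G2', 'G3']}
```

Here units G2 and G3 are reported as having quotations similar to that of the core unit G1, while G4 is left out.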

It can be understood that the embodiment of the device item corresponds to the embodiment of the method item of the present invention, and the device for judging the similarity of the multi-segment power quotation curves provided by the embodiment of the present invention can implement the method for judging the similarity of the multi-segment power quotation curves provided by any one of the embodiments of the method item of the present invention.

While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, and such changes and modifications are also intended to be within the scope of the invention.

Claims

1. A multi-section electric power quotation curve similarity judging method is characterized by comprising the following steps:

screening out quotation similar unit clusters according to the neighborhood radius of the optimal clusters and the cluster class, the core point, the classified points and the unclassified points corresponding to the optimal clusters;

outputting all neighborhood radiuses conforming to the constraint conditions of the optimized variables and corresponding cluster types, core points, classified points and unclassified points;

the objective function is:

；

2. The method for determining similarity of multiple segments of power quotation curves according to claim 1, wherein the classifying points satisfying the preset condition into the same cluster according to each hausdorff distance comprises:

3. The method for determining similarity of multi-segment power quotation curves according to claim 1, wherein, when the optimal clustering is performed, only the quotation points other than those of the first several declared segments are optimally clustered.

4. The method for judging similarity of multi-segment power quotation curves according to claim 1, wherein the screening of the quotation similar unit clusters according to the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified points and the unclassified points corresponding to the neighborhood radius, specifically comprises:

5. A multi-segment power quotation curve similarity determination device, comprising:

the similar quotation screening module is used for screening quotation similar unit clusters according to the neighborhood radius of the optimal cluster and the cluster class, the core point, the classified points and the unclassified points corresponding to the optimal cluster;

the density clustering module is specifically used for:

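the objective function is:

$$\min_{d}\ \sum_{m=1}^{M} H(P^m, C^m) + (N - M)\cdot\frac{1}{M}\sum_{m=1}^{M} H(P^m, C^m);$$

wherein $P^m$ is the m-th classified point, $P^m = [(p_1^m, q_1^m), \ldots, (p_r^m, q_r^m), \ldots]$; $p_r^m$ and $q_r^m$ are respectively the declared price and the declared quantity of the r-th declared segment of the market subject represented by the m-th classified point; $C^m$ is the core point of the cluster in which the m-th classified point is located; $H(P^m, C^m)$ represents the Hausdorff distance from $P^m$ to $C^m$; M represents the number of classified quotation points, and N is the total number of quotation points;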

；

6. The apparatus for determining similarity of multiple segments of power quotation curves according to claim 5, wherein the classifying points satisfying a predetermined condition into the same cluster according to each hausdorff distance comprises:

7. The multi-segment power quotation curve similarity judging device according to claim 5, wherein, when the optimal clustering is performed, only the quotation points other than those of the first several declared segments are optimally clustered.

8. The multi-segment power quotation curve similarity determination device of claim 5, wherein the similarity quotation screening module is specifically configured to: and selecting a unit corresponding to the core point of each cluster and units corresponding to other quotation points in the neighborhood of the core point according to the neighborhood radius of the optimal cluster and the clusters, the core points, the classified points and the unclassified points corresponding to the optimal cluster, so as to obtain the quotation similar unit cluster.