CN111144705A

CN111144705A - Whole-network same-section data processing method based on information acquisition with time scale

Info

Publication number: CN111144705A
Application number: CN201911231245.4A
Authority: CN
Inventors: 魏凯; 王顺江; 王强; 刘云松; 邱鹏; 管文; 刘杨; 赵琰; 王东来; 葛维春; 刘前卫; 常乃超; 葛延峰; 曹丽娜; 刘金波; 王永福; 胡博; 陈晓东; 高凯; 周桂平
Original assignee: Jinzhou Electric Power Supply Co Of State Grid Liaoning Electric Power Supply Co ltd; State Grid Corp of China SGCC; State Grid Liaoning Electric Power Co Ltd; Shenyang Institute of Engineering
Current assignee: Jinzhou Electric Power Supply Co Of State Grid Liaoning Electric Power Supply Co ltd; State Grid Corp of China SGCC; State Grid Liaoning Electric Power Co Ltd; Shenyang Institute of Engineering
Priority date: 2019-12-05
Filing date: 2019-12-05
Publication date: 2020-05-12
Anticipated expiration: 2039-12-05
Also published as: CN111144705B

Abstract

Same-section data processing is carried out at two levels, the substation and the dispatching master station; same-section processing inside each substation provides the basis on which the dispatching master station builds its same-section database. Within a station, the maximum and minimum values of each acquired quantity under normal operation are calculated from the historical data of the acquisition device, and an improved cubic B-spline fitting algorithm converts the acquired data that stay within this range into a continuous curve. By selecting a time section and setting a time interval, the fitted-curve values at common time points are extracted and stored to obtain the same-section data of the station; the stations then upload their same-section data to the dispatching master station to obtain same-section information for the whole network. The method improves the accuracy and same-section consistency of power grid data, forms a whole-network same-section data chain for the grid, and improves data availability.

Description

Whole-network same-section data processing method based on information acquisition with time scale

Technical Field

The invention belongs to the field of power system automation and information technology, and in particular relates to a whole-network same-section data processing method based on information acquisition with time scale.

Background

As power grid automation and informatization advance, the grid places ever higher demands on the accuracy and real-time performance of power data. Once the data collected in the power system have been uploaded to the master station, they carry no reliable time information, the stations are poorly time-aligned with one another, and the data have poor same-section consistency, so they cannot be applied directly to the grid. In addition, measurement deviations may make some collected values wrong, and using them directly degrades the accuracy of dispatching decisions. Little technical research currently addresses processing collected grid data into same-section data. The state estimation methods used to verify power system data can only identify bad data; because the time references of the collection points at different stations may differ, the data used limit the accuracy of the overall estimate, and the final result cannot form a whole-network same-section data chain, so data availability is low.

Disclosure of Invention

The invention aims to solve the above technical problems by providing a whole-network same-section data processing method based on time-stamped information acquisition. The method applies weighted interpolation to the discrete-time data acquired in the stations and the master station to obtain continuous-time data. On this basis, it takes into account the influence of time synchronization accuracy on the time section and selects a suitable time section and time interval to form whole-network same-section data. This greatly improves the accuracy and same-section consistency of power grid data, forms a whole-network same-section data chain for the grid, and improves data availability.

The technical solution of the invention is as follows:

a whole-network same-section data processing method based on time-stamped information acquisition, characterized by comprising the following steps:

step 1: the implementation is divided into two stages: the first stage forms same-section data within each substation, and the second stage forms whole-network same-section data across substations;

first, a station contains many data acquisition and measurement points, and the alternating current sampling devices have a time-checking function that keeps the time of the acquisition points synchronized with the station master clock; the acquired data are stamped with their acquisition time, and the time-stamped samples are sent to the station background server for storage and processing;

step 2: determining the maximum and minimum limits of each acquired quantity from the historical operating data of the acquisition point, covering the most recent year; let the upper limit of the normal operation interval of the acquired quantity be $X_{\max}$ and the lower limit be $X_{\min}$; from the maximum and minimum values of the acquired quantity in each of the last 12 months, the upper and lower limits of the normal operation interval are calculated by a weighting algorithm, and the calculation formula is as follows:

where $i$ denotes the $i$-th month counting back from the sampling time, $k_i$ is the weight factor of the upper limit for month $i$, $h_i$ is the weight factor of the lower limit for month $i$, $x_{i\max}$ is the maximum value of the acquired quantity $X$ in month $i$, and $x_{i\min}$ is the minimum value of the acquired quantity $X$ in month $i$; when an acquired value exceeds the upper or lower operating limit, the out-of-limit data do not participate in subsequent processing;
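The limit calculation and the out-of-limit filter of step 2 can be sketched in Python. The sketch assumes that the weighting algorithm is a plain weighted sum, $X_{\max} = \sum_i k_i x_{i\max}$ and $X_{\min} = \sum_i h_i x_{i\min}$, with weights that sum to 1; this form, the function names, and the sample numbers are illustrative assumptions rather than values taken from the method.

```python
def weighted_limits(monthly_max, monthly_min, k, h):
    """Return (X_max, X_min) as weighted sums of the monthly extremes.

    Index 0 is the month closest to the sampling time. The weighted-sum
    form is an assumption made for illustration.
    """
    if not (len(monthly_max) == len(monthly_min) == len(k) == len(h)):
        raise ValueError("all inputs must cover the same months")
    x_max = sum(w * x for w, x in zip(k, monthly_max))
    x_min = sum(w * x for w, x in zip(h, monthly_min))
    return x_max, x_min


def within_limits(samples, x_max, x_min):
    """Keep only the (time, value) samples that lie inside [x_min, x_max]."""
    return [(t, x) for t, x in samples if x_min <= x <= x_max]


# Hypothetical example: 12 identical months with equal weights.
weights = [1 / 12] * 12
x_max, x_min = weighted_limits([240.0] * 12, [-240.0] * 12, weights, weights)
kept = within_limits([(0.0, 100.0), (0.1, 300.0), (0.2, -250.0)], x_max, x_min)
```

Larger weights for recent months would make the limits track the latest operating conditions more closely; the equal weights above are only a placeholder.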

step 3: out-of-range data are divided into two cases according to the reason the normal operation range was exceeded; in the first case, only a single point or two points exceed the normal operation range, so the number of erroneous acquisition points is very small; this case is attributed to measurement error of the acquisition device, and the data are deleted; in the second case, the acquired quantity exceeds the normal operation range at multiple points, with more than two erroneous acquisition points; this case is attributed to an accident that has driven the acquired quantity out of the normal operation range, and fault recording is started to record the relevant operating data for subsequent analysis of the cause of the accident;
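The two cases of step 3 can be expressed as a small decision function. This is a minimal sketch: it assumes the out-of-range points are counted over one analysis window (for example one sampling cycle), and the function name, return labels, and the `max_isolated` parameter are hypothetical.

```python
def classify_out_of_range(values, x_min, x_max, max_isolated=2):
    """Classify one window of samples against the operating limits.

    Returns a label and the indices of the out-of-range samples:
    - "normal": no sample is out of range
    - "measurement_error": at most `max_isolated` samples are out of range,
      so they are treated as device errors and should be deleted
    - "fault": more than `max_isolated` samples are out of range, so fault
      recording should be started
    """
    bad = [i for i, x in enumerate(values) if not (x_min <= x <= x_max)]
    if not bad:
        return "normal", bad
    if len(bad) <= max_isolated:
        return "measurement_error", bad
    return "fault", bad


# Hypothetical windows checked against limits [0, 5].
label_a, bad_a = classify_out_of_range([1, 9, 2, 3], 0, 5)
label_b, bad_b = classify_out_of_range([9, 9, 9, 1], 0, 5)
```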

step 4: performing interpolation curve fitting on the acquired data within the normal operation range, turning the discrete data points into a continuous, smooth time curve; the method calculates the curvature at the discrete points, sets a reasonable curvature threshold to extract feature points, parameterizes the feature points by chord length, constructs the knot vector from the feature point parameters, and solves for the control vertices by the least squares method to fit the discrete acquired data;

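step 4.1: given the data point vectors $P_k$ of the acquired quantity, where $k = 0, 1, 2, \ldots, n$, on the interval $[A, B]$, every 4 consecutive points are connected in sequence into a polygon called a B characteristic polygon; the curve obtained by fitting the B characteristic polygon with cubic spline functions is a cubic B-spline curve;

the matrix representation of the cubic B-spline curve is:

$$B_i(t) = \frac{1}{6}\begin{bmatrix} t^3 & t^2 & t & 1 \end{bmatrix}\begin{bmatrix} -1 & 3 & -3 & 1 \\ 3 & -6 & 3 & 0 \\ -3 & 0 & 3 & 0 \\ 1 & 4 & 1 & 0 \end{bmatrix}\begin{bmatrix} P_i \\ P_{i+1} \\ P_{i+2} \\ P_{i+3} \end{bmatrix}, \quad t \in [0, 1]$$

the $i$-th cubic B-spline curve segment is therefore expressed as:

$$B_i(t) = \sum_{j=0}^{3} N_{j,3}(t)\, P_{i+j}$$

where $t$ is the parameter, $P_i, P_{i+1}, P_{i+2}, P_{i+3}$ ($i = 0, 1, 2, \ldots, n$) are four adjacent vertices of the characteristic polygon, and $N_{j,3}(t)$ ($j = 0, 1, 2, 3$) are the basis functions of the cubic B-spline curve, given by:

$$N_{0,3}(t) = \tfrac{1}{6}(-t^3 + 3t^2 - 3t + 1), \quad N_{1,3}(t) = \tfrac{1}{6}(3t^3 - 6t^2 + 4), \quad N_{2,3}(t) = \tfrac{1}{6}(-3t^3 + 3t^2 + 3t + 1), \quad N_{3,3}(t) = \tfrac{1}{6}t^3$$

because the four basis functions lie in the same knot interval, they can be spliced into a complete B-spline;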
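One segment of a uniform cubic B-spline can be evaluated directly from four adjacent vertices. The sketch below uses the standard textbook basis matrix for a uniform cubic B-spline, which is assumed here to correspond to the matrix representation of step 4.1; the names `M` and `bspline_segment` and the sample vertices are illustrative.

```python
import numpy as np

# Standard basis matrix of the uniform cubic B-spline (assumed form).
M = np.array([[-1.0, 3.0, -3.0, 1.0],
              [3.0, -6.0, 3.0, 0.0],
              [-3.0, 0.0, 3.0, 0.0],
              [1.0, 4.0, 1.0, 0.0]]) / 6.0


def bspline_segment(ctrl, t):
    """Evaluate one cubic B-spline segment at parameter t in [0, 1].

    ctrl holds the four adjacent vertices P_i .. P_{i+3}, shape (4, dim).
    """
    ctrl = np.asarray(ctrl, dtype=float)
    powers = np.array([t**3, t**2, t, 1.0])
    return powers @ M @ ctrl


# Hypothetical characteristic polygon of four vertices.
vertices = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [3.0, 1.0]]
start = bspline_segment(vertices, 0.0)  # equals (P_i + 4 P_{i+1} + P_{i+2}) / 6
end = bspline_segment(vertices, 1.0)    # equals (P_{i+1} + 4 P_{i+2} + P_{i+3}) / 6
```

The segment does not pass through the vertices themselves; consecutive segments share their end points and first two derivatives, which is what makes the spliced curve smooth.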

step 4.2: in practice, the sampled data usually contain many points, and if all of them take part in the fitting, the fitting efficiency may be low and the number of iterations large; extracting the feature points of the curve therefore becomes the key step, because the feature points directly determine the shape of the fitted curve; since curvature reflects both the overall and the local shape of a curve, the feature points are selected by calculating curvature;

for a cubic spline $B(t)$, the curvature at the parameter value $t_i$ is obtained by differentiation and calculated by the formula:

$$k_i = \frac{\lvert B'(t_i) \times B''(t_i) \rvert}{\lvert B'(t_i) \rvert^{3}}$$

where $B'(t_i)$ is the first derivative of the curve $B(t)$ at $t_i$, $B''(t_i)$ is the second derivative, and $k_i$ is the curvature;
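The curvature calculation can be sketched for a planar curve as follows. The sketch assumes the usual formula $k = \lvert x'y'' - y'x'' \rvert / (x'^2 + y'^2)^{3/2}$; the helper `discrete_curvature`, which estimates the derivatives of raw sample points by finite differences, is an illustrative stand-in and not part of the method as stated.

```python
import numpy as np


def curvature(d1, d2):
    """Curvature of a planar curve from first and second derivatives.

    d1 and d2 have shape (n, 2). The first derivative must be non-zero
    at every point, otherwise the division is undefined.
    """
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    cross = d1[:, 0] * d2[:, 1] - d1[:, 1] * d2[:, 0]
    speed = np.hypot(d1[:, 0], d1[:, 1])
    return np.abs(cross) / speed**3


def discrete_curvature(points):
    """Estimate curvature at sample points with finite differences."""
    pts = np.asarray(points, dtype=float)
    d1 = np.gradient(pts, axis=0)
    d2 = np.gradient(d1, axis=0)
    return curvature(d1, d2)


# Check on a circle of radius 2, whose curvature is 1/2 everywhere.
s = np.linspace(0.0, 1.0, 5)
circle_d1 = np.column_stack([-2.0 * np.sin(s), 2.0 * np.cos(s)])
circle_d2 = np.column_stack([-2.0 * np.cos(s), -2.0 * np.sin(s)])
k_circle = curvature(circle_d1, circle_d2)
k_line = discrete_curvature([(i, 2.0 * i) for i in range(5)])
```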

step 4.3: extracting feature points; using the curvatures $K = (k_0, k_1, \ldots, k_n)$ of the discrete points, the feature points are extracted according to the following principles:

(1) for a curve that is not closed, the two end points must be selected;

(2) a curvature threshold is set for selecting feature points, and every point whose curvature exceeds the threshold is taken as an initial feature point; with the mean curvature denoted $K_{avg}$, the curvature threshold is set to $K_{ths} = \alpha \times K_{avg}$, where $\alpha$ is a scaling factor;

the threshold must be chosen so that the feature points reflect the overall shape of the curve while remaining as few as possible; if the threshold is too small, too many control points may result; if it is too large, the overall shape of the curve may not be represented;
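The two selection principles translate into a few lines of Python. This is a minimal sketch; the function name, the default $\alpha = 1$, and the sample curvatures are assumptions chosen only for illustration.

```python
def select_feature_indices(curvatures, alpha=1.0, closed=False):
    """Return sorted indices of the initial feature points.

    A point is selected when its curvature exceeds K_ths = alpha * K_avg.
    For an open curve, both end points are always selected.
    """
    k_avg = sum(curvatures) / len(curvatures)
    k_ths = alpha * k_avg
    selected = {i for i, k in enumerate(curvatures) if k > k_ths}
    if not closed:
        selected.update({0, len(curvatures) - 1})
    return sorted(selected)


# Hypothetical curvature values; their mean is 0.35.
features = select_feature_indices([0.1, 0.1, 0.9, 0.1, 0.8, 0.1], alpha=1.0)
```

Raising `alpha` keeps fewer points, which matches the trade-off described above between the number of control points and how well the overall shape is captured.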

step 4.4: the feature points are parameterized by the chord length method, so that the data points $Q_0, Q_1, \ldots, Q_n$ correspond one-to-one with parameter values in the domain $t \in [0, 1]$; let the total chord length be $d$:

$$d = \sum_{k=1}^{n} \lvert Q_k - Q_{k-1} \rvert$$

then

$$t_0 = 0, \qquad t_n = 1, \qquad t_k = t_{k-1} + \frac{\lvert Q_k - Q_{k-1} \rvert}{d}, \quad k = 1, 2, \ldots, n-1$$

where $Q_k - Q_{k-1}$ is the chord edge vector; this parameterization faithfully reflects how the data points are distributed along the chord length and handles data whose chord lengths are unevenly distributed;
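Chord length parameterization can be sketched as below, assuming the standard form $t_0 = 0$, $t_k = t_{k-1} + \lvert Q_k - Q_{k-1} \rvert / d$, $t_n = 1$. The function name and the sample points are illustrative.

```python
import math


def chord_length_params(points):
    """Map data points Q_0..Q_n to parameters in [0, 1] by chord length."""
    chords = [math.dist(p, q) for p, q in zip(points, points[1:])]
    d = sum(chords)  # total chord length
    if d == 0:
        raise ValueError("need at least two distinct points")
    params = [0.0]
    for c in chords[:-1]:
        params.append(params[-1] + c / d)
    params.append(1.0)  # set exactly to avoid round-off at the last point
    return params


# Hypothetical points with chords of length 5 and 6.
params = chord_length_params([(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)])
```

Points that are far apart receive a proportionally larger share of the parameter range, which is why this scheme copes with unevenly spaced data.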

step 4.5: constructing the knot vector; the knot vector is constructed by the averaging method so that the knots are evenly distributed, namely:

$$t_0 = t_1 = \cdots = t_p = 0$$

$$t_{m-p} = t_{m-p+1} = \cdots = t_m = 1$$

where $m + 1$ is the total number of knots; each end of the knot vector carries $p + 1$ identical knots so as to control tangency at the first and last end points, and $j$ is the ordinal number of an interior knot; a knot vector built by the averaging method reflects the distribution of the parameter point vector well;
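A clamped knot vector with averaged interior knots can be built as follows. The expression for the interior knots is not given above, so this sketch assumes the averaging rule commonly used in B-spline interpolation, in which interior knot $j$ is the mean of $p$ consecutive parameter values, $t_{j+p} = \frac{1}{p}\sum_{i=j}^{j+p-1} \bar{t}_i$; the function name and sample parameters are hypothetical.

```python
def averaged_knot_vector(params, p=3):
    """Build a clamped knot vector from parameter values by averaging.

    params are the parameter values of the n + 1 points (params[0] = 0,
    params[-1] = 1). The result has n + p + 2 knots: p + 1 zeros,
    n - p averaged interior knots, and p + 1 ones.
    """
    n = len(params) - 1
    if n < p:
        raise ValueError("need at least p + 1 parameter values")
    interior = [sum(params[j:j + p]) / p for j in range(1, n - p + 1)]
    return [0.0] * (p + 1) + interior + [1.0] * (p + 1)


# Hypothetical uniformly spaced parameters for five points.
knots = averaged_knot_vector([0.0, 0.25, 0.5, 0.75, 1.0], p=3)
```

Because each interior knot is an average of neighbouring parameters, the knots cluster wherever the parameter values cluster, which is the sense in which the knot vector reflects the parameter distribution.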

step 4.6: least squares cubic B-spline curve fitting; a cubic B-spline curve approximating the feature points is constructed from the knot vector by the least squares method; from the cubic B-spline expression, the feature points $D_k$ are approximated in the least squares sense, namely:

$$\sum_{k=1}^{r-1} \lvert D_k - B(t_k) \rvert^{2} \to \min$$

where $D_0 = B(0)$, $D_r = B(1)$, and $r$ is the number of feature points;

let

$$R_k = D_k - N_{0,3}(t_k)\, D_0 - N_{n,3}(t_k)\, D_r, \quad k = 1, 2, \ldots, r-1$$

$$f = \sum_{k=1}^{r-1} \Bigl\lvert R_k - \sum_{i=1}^{n-1} N_{i,3}(t_k)\, P_i \Bigr\rvert^{2}$$

$f$ is a scalar-valued function of the $n - 1$ variables $P_1, \ldots, P_{n-1}$; to minimize $f$, the partial derivative of $f$ with respect to each control point $P_l$ is set to zero, which gives the following equation:

$$\sum_{i=1}^{n-1} \Bigl( \sum_{k=1}^{r-1} N_{l,3}(t_k)\, N_{i,3}(t_k) \Bigr) P_i = \sum_{k=1}^{r-1} N_{l,3}(t_k)\, R_k$$

this is a linear equation in the unknowns $P_1, \ldots, P_{n-1}$; letting $l = 1, 2, \ldots, n-1$ gives a system of $n - 1$ linear equations in $n - 1$ unknowns:

$$(N^T N) P = R$$

solving this system of linear equations gives the unknown $P$, which determines the fitted cubic B-spline curve;
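The least squares fit can be sketched with NumPy. The sketch assumes the common formulation in which the first and last control points are fixed to the first and last data points and the remaining control points solve the normal equations $(N^T N)P = R$; the basis functions are evaluated with the Cox-de Boor recursion. All function names, the knot vector, and the test data are illustrative assumptions.

```python
import numpy as np


def basis(i, k, t, knots):
    """Cox-de Boor recursion: value of the degree-k basis function N_{i,k}(t)."""
    if k == 0:
        if knots[i] <= t < knots[i + 1]:
            return 1.0
        # Treat the last non-empty span as closed so t = knots[-1] is covered.
        if t == knots[-1] and knots[i] < knots[i + 1] == knots[-1]:
            return 1.0
        return 0.0
    value = 0.0
    if knots[i + k] > knots[i]:
        value += (t - knots[i]) / (knots[i + k] - knots[i]) * basis(i, k - 1, t, knots)
    if knots[i + k + 1] > knots[i + 1]:
        value += ((knots[i + k + 1] - t) / (knots[i + k + 1] - knots[i + 1])
                  * basis(i + 1, k - 1, t, knots))
    return value


def basis_matrix(params, knots, degree=3):
    """Matrix whose row k holds all basis functions evaluated at params[k]."""
    n_ctrl = len(knots) - degree - 1
    return np.array([[basis(i, degree, t, knots) for i in range(n_ctrl)]
                     for t in params])


def lsq_bspline_fit(points, params, knots, degree=3):
    """Return control points of the least squares B-spline through the end points."""
    D = np.asarray(points, dtype=float)
    N = basis_matrix(params, knots, degree)
    # Move the fixed end control points to the right-hand side.
    rhs = D[1:-1] - np.outer(N[1:-1, 0], D[0]) - np.outer(N[1:-1, -1], D[-1])
    Ni = N[1:-1, 1:-1]
    inner = np.linalg.solve(Ni.T @ Ni, Ni.T @ rhs)  # normal equations
    return np.vstack([D[0], inner, D[-1]])


# Hypothetical data on a straight line, which a cubic B-spline reproduces exactly.
t = np.linspace(0.0, 1.0, 9)
data = np.column_stack([t, 2.0 * t + 1.0])
knots = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
ctrl = lsq_bspline_fit(data, t, knots)
fitted = basis_matrix(t, knots) @ ctrl
```

The normal equations are solvable only when every knot span contains enough parameter values; with too few feature points per span the matrix $N^T N$ becomes singular.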

step 5: refining the fitted curve; a curve fitted from the feature points alone generally cannot meet the allowable approximation error, so the number of feature points must be increased to obtain a high-quality curve; the deviation between each data point and the initial fitted curve is calculated, every point whose deviation exceeds a set deviation threshold is taken as a new feature point, and the new feature points are inserted for re-fitting;

step 5.1: calculating the approximation deviation; the approximation deviation between the data points and the curve is calculated with the Hausdorff distance; given two point sets $A = \{a_1, a_2, \ldots, a_p\}$ and $B = \{b_1, b_2, \ldots, b_q\}$, the Hausdorff distance between $A$ and $B$ is:

$$H(A, B) = \max\bigl(h(A, B),\, h(B, A)\bigr)$$

$$h(A, B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert, \qquad h(B, A) = \max_{b \in B} \min_{a \in A} \lVert b - a \rVert$$

where $\lVert \cdot \rVert$ is a distance norm between points of set $A$ and set $B$, such as the Euclidean distance; $H(A, B)$ is the bidirectional Hausdorff distance, and $h(A, B)$ and $h(B, A)$ are the one-way Hausdorff distances from set $A$ to set $B$ and from set $B$ to set $A$, respectively; the bidirectional Hausdorff distance $H(A, B)$ is the larger of the two one-way distances $h(A, B)$ and $h(B, A)$ and measures the maximum degree of mismatch between the two point sets;
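The one-way and bidirectional Hausdorff distances can be computed directly from their definitions, $h(A, B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert$ and $H(A, B) = \max(h(A, B), h(B, A))$. The sketch below uses the Euclidean norm and a brute-force distance table, which is adequate for the modest point counts of one sampling cycle; the function names and sample sets are illustrative.

```python
import numpy as np


def one_way_hausdorff(A, B):
    """h(A, B): largest distance from a point of A to its nearest point of B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Pairwise Euclidean distances, shape (len(A), len(B)).
    dist = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return dist.min(axis=1).max()


def hausdorff(A, B):
    """H(A, B): the larger of the two one-way distances."""
    return max(one_way_hausdorff(A, B), one_way_hausdorff(B, A))


# Hypothetical point sets: h(A, B) = 1, h(B, A) = 3, so H(A, B) = 3.
set_a = [[0.0, 0.0], [1.0, 0.0]]
set_b = [[0.0, 0.0], [4.0, 0.0]]
h_ab = one_way_hausdorff(set_a, set_b)
h_ba = one_way_hausdorff(set_b, set_a)
H = hausdorff(set_a, set_b)
```

The example shows why the two one-way distances must both be computed: they are generally not equal.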

step 5.2: refining the curve locally; the deviations between the initial curve and all corresponding acquisition points are calculated, a deviation threshold $e_{ths}$ is set, and all maximum points exceeding the deviation threshold are found; the specific operations are as follows:

(1) calculating the approximation deviation between the data points and the curve according to the Hausdorff distance formula;

(2) finding the maximum approximation deviation $h(Q, B)$;

(3) setting the deviation threshold $e_{ths} = \beta \times h(Q, B)$, where $\beta$ is a scaling factor;

(4) extracting all points on the deviation curve whose deviation exceeds the deviation threshold $e_{ths}$;

(5) re-fitting with these points as new feature points, where at most one key point may be added between two initial key points in each pass;

(6) repeating the above steps until the fitting accuracy is met, at which point the calculation ends;

because the cubic B-spline has the local support property, a newly inserted feature point changes only the local shape of the fitted curve without affecting the whole; the fitting method greatly reduces the number of control points and improves computational efficiency while maintaining accuracy;
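One refinement pass, operations (2) to (5), can be sketched as follows. The sketch assumes the deviation of every data point from the current fit has already been computed, and it interprets the one-point rule as adding the worst point of each interval between neighbouring feature points when that point exceeds the threshold. The function name, the default $\beta$, and the sample deviations are hypothetical.

```python
def refine_feature_indices(deviations, feature_idx, beta=0.5):
    """Add at most one new feature point per interval between feature points.

    deviations[i] is the approximation deviation of data point i from the
    current fitted curve; feature_idx is the sorted list of current
    feature point indices.
    """
    e_ths = beta * max(deviations)  # threshold from the maximum deviation
    refined = set(feature_idx)
    for lo, hi in zip(feature_idx, feature_idx[1:]):
        inner = range(lo + 1, hi)
        if not inner:
            continue  # no data points between these two feature points
        worst = max(inner, key=lambda i: deviations[i])
        if deviations[worst] > e_ths:
            refined.add(worst)
    return sorted(refined)


# Hypothetical deviations: the threshold is 0.45, so only index 2 is added.
refined = refine_feature_indices(
    [0.0, 0.1, 0.9, 0.2, 0.0, 0.3, 0.1, 0.0], [0, 4, 7], beta=0.5)
```

In a full loop, the curve would be re-fitted with the refined feature points and the deviations recomputed until the required accuracy is reached.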

step 5.3: evaluating the fitting accuracy; after the feature points of the previous stage have been extracted, let two adjacent feature points be $(x_k, y_k)$ and $(x_{k+1}, y_{k+1})$, and let $(x_i, y_i)$ be the point on the curve segment between them that deviates most from the two feature points; its $h$ value is the perpendicular distance from the point $(x_i, y_i)$ to the straight line through $(x_k, y_k)$ and $(x_{k+1}, y_{k+1})$, calculated by the following formula:

$$h_i = \frac{\lvert (y_{k+1} - y_k)\, x_i - (x_{k+1} - x_k)\, y_i + x_{k+1} y_k - y_{k+1} x_k \rvert}{\sqrt{(x_{k+1} - x_k)^2 + (y_{k+1} - y_k)^2}}$$

the $h$ values of all points except the end points are calculated and combined into the integral square error, which is used as the index for measuring the fitting accuracy of the curve and ensures that the original shape of the curve is preserved;
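The accuracy index can be sketched as below. The sketch assumes that the integral square error is the sum of the squared perpendicular distances $h$ of all non-feature points to the chord joining their neighbouring feature points; this definition, the function names, and the sample points are assumptions for illustration.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through points a and b."""
    (x, y), (xa, ya), (xb, yb) = p, a, b
    numerator = abs((yb - ya) * x - (xb - xa) * y + xb * ya - yb * xa)
    return numerator / math.hypot(xb - xa, yb - ya)


def integral_square_error(points, feature_idx):
    """Sum of squared h values of all points that are not feature points."""
    total = 0.0
    for lo, hi in zip(feature_idx, feature_idx[1:]):
        for i in range(lo + 1, hi):
            total += perpendicular_distance(points[i], points[lo], points[hi]) ** 2
    return total


# Hypothetical triangle: the middle point lies 1 unit above the chord.
h_mid = perpendicular_distance((1.0, 1.0), (0.0, 0.0), (2.0, 0.0))
ise = integral_square_error([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)], [0, 2])
```

A smaller value means the retained feature points follow the original shape more closely.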

step 6: after the data acquired in the station have been fitted with cubic B-splines, an initial time section $t_0$ and a time interval $\Delta t$ are selected; the sampled data are time-stamped, and the acquired quantity as a function of time is written $X_i(t)$, where $i$ is the index of the acquired quantity and $X$ is a grid operating parameter such as voltage or current; after fitting, the discrete samples become a continuous curve, so the value of $X_i(t)$ is available at any time; once the time section $t_0$ and the time interval $\Delta t$ have been determined, the section data of all acquired quantities at the time points $t_0, t_0 + \Delta t, t_0 + 2\Delta t, \ldots, t_0 + n\Delta t$ are formed in the station and stored with the time axis as the scale to obtain the in-station same-section database;
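Forming the in-station same-section data amounts to evaluating every fitted curve on a common time grid. In this minimal sketch the fitted curves are represented by ordinary Python callables; the function name, the dictionary layout, and the two sample curves are hypothetical stand-ins for real fitted B-spline curves.

```python
def same_section(curves, t0, dt, n):
    """Sample every fitted curve X_i(t) at t0, t0 + dt, ..., t0 + n * dt.

    curves maps a quantity name to a callable returning its value at
    time t. The result maps each section time to the values of all
    quantities at that time.
    """
    times = [t0 + j * dt for j in range(n + 1)]
    return {t: {name: f(t) for name, f in curves.items()} for t in times}


# Hypothetical fitted curves for one voltage and one current.
curves = {"U": lambda t: 2.0 * t, "I": lambda t: t + 1.0}
sections = same_section(curves, t0=0.0, dt=0.5, n=2)
```

Because all quantities are read at exactly the same instants, each entry of `sections` is one time section of the station.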

step 7: the other stations process their acquired data in the same way to obtain their own same-section databases; each station then uploads its same-section data to the dispatching master station, where the whole-network same-section database is formed. Establishing the same-section database improves the accuracy of real-time power grid data and provides reliable data support for grid dispatching and operation decisions.
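At the master station, the uploaded station data can be combined by section time. The sketch below assumes that every station uses the same $t_0$ and $\Delta t$, so that section times match exactly, and it keeps only the times reported by all stations; the function name, station names, and voltage values are hypothetical.

```python
def merge_sections(station_sections):
    """Combine per-station same-section data into whole-network sections.

    station_sections maps a station name to its {time: values} data.
    Only section times present at every station are kept.
    """
    common = set.intersection(*(set(sec) for sec in station_sections.values()))
    return {t: {name: sec[t] for name, sec in station_sections.items()}
            for t in sorted(common)}


# Hypothetical uploads from two stations; only t = 0.5 is common to both.
uploads = {
    "station_A": {0.0: {"U": 220.1}, 0.5: {"U": 220.3}},
    "station_B": {0.5: {"U": 219.8}, 1.0: {"U": 219.9}},
}
network = merge_sections(uploads)
```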

The invention has the beneficial effects that:

The invention divides the formation of the same-section database into two stages. By analyzing the limits of the normal operation interval of the time-stamped acquired data, it provides a method for calculating the operating limits of an acquired quantity and a method for handling out-of-limit data, which improves the same-section consistency of the data and the accuracy of data processing. Cubic B-spline fitting by the least squares method turns discrete power acquisition data into a continuous time curve, and the approximation deviation and deviation threshold are introduced to refine the fitted curve, so that the number of control points is greatly reduced and computational efficiency is improved while accuracy is maintained. Finally, a whole-network same-section database is established at the dispatching master station, which improves the accuracy of real-time grid data, provides data support for advanced grid application software and for grid dispatching and operation decisions, and raises the level at which real-time grid information is applied. The invention is suited to supplying the various systems with high-precision, real-time, same-section power data and is an important reference for building a digital, information-based smart grid.

Drawings

FIG. 1 is a flow chart of the whole-network same-section data processing method of the present invention;

FIG. 2 is a least squares curve fit plot of the present invention;

FIG. 3 is a Bezier curve fit of the present invention;

FIG. 4 is a Lagrangian fitting graph of the present invention;

FIG. 5 is a graph of a cubic B-spline curve fit according to the present invention;

FIG. 6 is a graph of the results of a cubic B-spline initial fit of a period of sampled data according to the present invention;

FIG. 7 is a graph of a cubic B-spline multiple optimization fit of one cycle of sampled data according to the present invention;

Detailed Description

As shown in FIGS. 1 to 7, the whole-network same-section data processing method based on time-stamped information acquisition comprises the following steps:

Step 1: the implementation is divided into two stages: the first stage forms same-section data within each substation, and the second stage forms whole-network same-section data across substations.

First, a station contains many data acquisition and measurement points, and the alternating current sampling devices have a time-checking function that keeps the time of the acquisition points synchronized with the station master clock. The acquired data are stamped with their acquisition time, and the time-stamped samples are sent to the station background server.

The method is verified on a 220 kV substation. The in-station time synchronization device uses dual Beidou/GPS time synchronization, which ensures that the time synchronization of all devices in the station meets the requirements of same-section data measurement. The voltage on the high-voltage side of the station transformer is selected, and the measurement data of the sampling device are read while the substation operates normally. The acquisition device samples at 80 points per cycle, and the 80 samples in one 0.02 s cycle are analyzed as an example. The collected data are shown in Table 1:

Table 1: one-cycle voltage sampling data of the 220 kV station

In the limit calculation of step 2, $i$ denotes the $i$-th month counting back from the sampling time, $k_i$ is the weight factor of the upper limit for month $i$, $h_i$ is the weight factor of the lower limit for month $i$, $x_{i\max}$ is the maximum value of the acquired quantity $X$ in month $i$, and $x_{i\min}$ is the minimum value of the acquired quantity $X$ in month $i$. When an acquired value exceeds the upper or lower operating limit, the out-of-limit data do not participate in subsequent processing. Big data analysis of the most recent year gives voltage operating limits of $U_{\max} = 245$ kV and $U_{\min} = -245$ kV.

Step 3: out-of-range data are divided into two cases according to the reason the operation range was exceeded. In the first case, only a single point or two points exceed the normal operation range, so the number of erroneous acquisition points is very small; this case is attributed to measurement error of the acquisition device, and the data are deleted. In the second case, the acquired quantity exceeds the normal operation range at multiple points, with more than two erroneous acquisition points; this case is attributed to an accident that has driven the acquired quantity out of the normal operation range, and fault recording is started to record the relevant operating data so that the cause of the accident can be analyzed afterwards. Comparing the sampled voltage data with the upper and lower operating limits shows that the abnormal points are numbers 34 and 75 (see Table 2); they are caused by measurement error, and the data of these two points are removed.

Table 2: outlier data

Step 4: interpolation curve fitting is performed on the acquired data within the normal operation range, turning the discrete data points into a continuous, smooth time curve. The method calculates the curvature at the discrete points, sets a reasonable curvature threshold to extract feature points, parameterizes the feature points by chord length, constructs the knot vector from the feature point parameters, and solves for the control vertices by the least squares method to fit the discrete acquired data.

Step 4.1: the acquired quantities are processed with a B-spline curve. B-spline curve fitting usually uses $k + 1$ vertices to define a polynomial of degree $k$, with the mathematical expression:

$$P(t) = \sum_{i=0}^{n} P_i\, N_{i,k}(t)$$

where $P_i$ is the position vector of each vertex and $N_{i,k}(t)$ is the B-spline basis function of degree $k$; each basis function is called a B-spline. To define the spline functions efficiently on a computer, they are given by the following recursion:

$$N_{i,0}(t) = \begin{cases} 1, & t_i \le t < t_{i+1} \\ 0, & \text{otherwise} \end{cases}$$

$$N_{i,k}(t) = \frac{t - t_i}{t_{i+k} - t_i}\, N_{i,k-1}(t) + \frac{t_{i+k+1} - t}{t_{i+k+1} - t_{i+1}}\, N_{i+1,k-1}(t), \quad \text{with } \tfrac{0}{0} = 0$$

The subscript $i$ denotes the index and $k$ the degree. The recursion shows that determining the $i$-th B-spline of degree $k$ requires the $k + 2$ knots $t_i, t_{i+1}, \ldots, t_{i+k+1}$, and the interval $[t_i, t_{i+k+1}]$ is the support of $N_{i,k}(t)$. In the curve equation, the $n + 1$ control vertices $P_i$ ($i = 0, 1, 2, \ldots, n$) correspond to the $n + 1$ B-spline basis functions $N_{i,k}(t)$ ($i = 0, 1, 2, \ldots, n$) of degree $k$. A higher-degree B-spline can therefore be derived from lower-degree B-splines.

The effect of cubic B-spline fitting is illustrated with a set of discrete data points (see Table 3); curve fitting by the least squares method, the Bezier method, the Lagrange method, and the cubic B-spline method is compared.

Table 3: discrete point data

The curve fitting results are shown in FIGS. 2 to 5. Comparing the methods shows that the least squares method reflects the general trend of the data points and smooths out local fluctuations, but fits poorly the discrete points that mark key changes in the trend; the Bezier curve reflects only the overall trend of the curve and fits poorly locally; the Lagrange method exhibits the Runge phenomenon when there are many data points, giving a large fitting error. Cubic B-spline fitting overcomes the shortcomings of the Bezier method while retaining its advantages: moving one control point changes only the neighbouring curve segments and leaves the shape of the other segments unaffected, the local fit is good, and the fitted curve is smoother, so the method is better suited to fitting station sampling data, which involve large amounts of data.

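Step 4.2: given the data point vectors $P_k$ of the acquired quantity, where $k = 0, 1, 2, \ldots, n$, on the interval $[A, B]$, every 4 consecutive points are connected in sequence into a polygon called a B characteristic polygon. The curve obtained by fitting the B characteristic polygon with cubic spline functions is a cubic B-spline curve.

The matrix representation of the cubic B-spline curve is:

$$B_i(t) = \frac{1}{6}\begin{bmatrix} t^3 & t^2 & t & 1 \end{bmatrix}\begin{bmatrix} -1 & 3 & -3 & 1 \\ 3 & -6 & 3 & 0 \\ -3 & 0 & 3 & 0 \\ 1 & 4 & 1 & 0 \end{bmatrix}\begin{bmatrix} P_i \\ P_{i+1} \\ P_{i+2} \\ P_{i+3} \end{bmatrix}, \quad t \in [0, 1]$$

The $i$-th cubic B-spline curve segment can be expressed as:

$$B_i(t) = \sum_{j=0}^{3} N_{j,3}(t)\, P_{i+j}$$

where $N_{j,3}(t)$ ($j = 0, 1, 2, 3$) are the four cubic B-spline basis functions. Because the four basis functions lie in the same knot interval, they can be spliced into a complete B-spline.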

Step 4.3: the feature points are critical to the accuracy of the data fit and directly affect the shape of the fitted curve. Since curvature reflects both the overall and the local shape of a curve, the feature points are selected by calculating curvature. The curvature at the parameter value $t_i$ is:

$$k_i = \frac{\lvert B'(t_i) \times B''(t_i) \rvert}{\lvert B'(t_i) \rvert^{3}}$$

where $B'(t_i)$ is the first derivative of the curve $B(t)$ at $t_i$, $B''(t_i)$ is the second derivative, and $k_i$ is the curvature.

Step 4.4: in practice, the sampled data usually contain many points, and if all of them take part in the fitting, the fitting efficiency may be low and the number of iterations large. It is therefore very important to achieve the curve fit with fewer points while maintaining accuracy, and extracting the feature points of the curve becomes the key.

Using the curvatures $K = (k_0, k_1, \ldots, k_n)$ of the discrete points, the feature points are extracted according to the following principles:

(1) for a curve that is not closed, the two end points must be selected.

(2) A curvature threshold is set for selecting feature points, and every point whose curvature exceeds the threshold is taken as an initial feature point. With the mean curvature denoted $K_{avg}$, the curvature threshold is set to $K_{ths} = \alpha \times K_{avg}$, where $\alpha$ is a scaling factor.

The choice of threshold is crucial: the feature points must reflect the overall shape of the curve while remaining as few as possible. If the threshold is too small, too many control points may result; if it is too large, the overall shape of the curve may not be represented.

For the sampled data of one cycle, the curvature threshold is 0.18 and 16 feature points are selected; Table 4 lists the feature points extracted from the sampled data.

Table 4: feature points extracted from the sampled data

It should be noted that, because the choice of threshold introduces approximation error, the curve is subsequently optimized locally to achieve an accurate fit.

Step 4.5: the feature points are parameterized by the chord length method, so that the data points $Q_0, Q_1, \ldots, Q_n$ correspond one-to-one with parameter values in the domain $t \in [0, 1]$. Let the total chord length be $d$:

$$d = \sum_{k=1}^{n} \lvert Q_k - Q_{k-1} \rvert$$

then

$$t_0 = 0, \qquad t_n = 1, \qquad t_k = t_{k-1} + \frac{\lvert Q_k - Q_{k-1} \rvert}{d}, \quad k = 1, 2, \ldots, n-1$$

where $Q_k - Q_{k-1}$ is the chord edge vector and the values $t_k$ form the parameter point vector. This method faithfully reflects how the data points are distributed along the chord length and handles data whose chord lengths are unevenly distributed.

Step 4.6: constructing the knot vector. The knot vector is constructed by the averaging method so that the knots are evenly distributed. The knot vector is $[t_0, t_1, \ldots, t_p, \ldots, t_{m-p}, t_{m-p+1}, \ldots, t_m]$, where:

$$t_0 = t_1 = \cdots = t_p = 0$$

$$t_{m-p} = t_{m-p+1} = \cdots = t_m = 1$$

Here $m + 1$ is the total number of knots, each end of the knot vector carries $p + 1$ identical knots so as to control tangency at the first and last end points, and $j$ is the ordinal number of an interior knot. A knot vector built by the averaging method reflects the distribution of the parameter point vector well.

Step 4.7: least squares cubic B-spline curve fitting. A cubic B-spline curve approximating the feature points is constructed from the knot vector by the least squares method. From the cubic B-spline expression, the feature points $D_k$ are approximated in the least squares sense, namely:

$$\sum_{k=1}^{r-1} \lvert D_k - B(t_k) \rvert^{2} \to \min$$

where $D_0 = B(0)$, $D_r = B(1)$, and $r$ is the number of feature points. Let:

$$R_k = D_k - N_{0,3}(t_k)\, D_0 - N_{n,3}(t_k)\, D_r, \quad k = 1, 2, \ldots, r-1$$

$$f = \sum_{k=1}^{r-1} \Bigl\lvert R_k - \sum_{i=1}^{n-1} N_{i,3}(t_k)\, P_i \Bigr\rvert^{2}$$

$f$ is a scalar-valued function of the $n - 1$ variables $P_1, \ldots, P_{n-1}$. To minimize $f$, the partial derivative of $f$ with respect to each control point $P_l$ is set to zero.

Namely:

$$\sum_{i=1}^{n-1} \Bigl( \sum_{k=1}^{r-1} N_{l,3}(t_k)\, N_{i,3}(t_k) \Bigr) P_i = \sum_{k=1}^{r-1} N_{l,3}(t_k)\, R_k$$

This is a linear equation in the unknowns $P_1, \ldots, P_{n-1}$; letting $l = 1, 2, \ldots, n-1$ gives a system of $n - 1$ linear equations in $n - 1$ unknowns:

$$(N^T N) P = R$$

Solving this system of linear equations gives the unknown $P$, which determines the fitted cubic B-spline curve. Extracting the feature points according to step 4.4 and performing the initial cubic B-spline curve fit gives the result for one cycle of sampled data shown in FIG. 6. The feature points essentially reproduce the shape of the acquired data points, but local refinement is still required.

Step 5: a curve fitted from the feature points alone generally cannot meet the allowable approximation error, so the number of feature points must be increased to obtain a high-quality curve. The deviation between each data point and the initial fitted curve is calculated, every point whose deviation exceeds a set deviation threshold is taken as a new feature point, and the new feature points are inserted for re-fitting.

Step 5.1: calculating the approximation deviation. The approximation deviation between the data points and the curve is calculated with the Hausdorff distance, which is a measure of the degree of similarity between two point sets.

Given two point sets $A = \{a_1, a_2, \ldots, a_p\}$ and $B = \{b_1, b_2, \ldots, b_q\}$, the Hausdorff distance between $A$ and $B$ is:

$$H(A, B) = \max\bigl(h(A, B),\, h(B, A)\bigr)$$

$$h(A, B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert, \qquad h(B, A) = \max_{b \in B} \min_{a \in A} \lVert b - a \rVert$$

where $\lVert \cdot \rVert$ is a distance norm between points of set $A$ and set $B$ (for example the Euclidean distance), $H(A, B)$ is the bidirectional Hausdorff distance, and $h(A, B)$ and $h(B, A)$ are the one-way Hausdorff distances from set $A$ to set $B$ and from set $B$ to set $A$, respectively. The bidirectional Hausdorff distance $H(A, B)$ is the larger of the two one-way distances $h(A, B)$ and $h(B, A)$ and measures the maximum degree of mismatch between the two point sets.

(2) Solve for the maximum approximation deviation h(Q, B);

(3) Set the deviation threshold e_ths = β × h(Q, B), where β is a scaling factor;

(5) Add these points as new feature points and refit (one more key point is added between two initial key points);

(6) Repeat the above steps until the fitting accuracy is met, then end the calculation.
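The thresholding part of this procedure can be sketched as follows. This is only an illustration under stated assumptions: the fitted curve B is represented by a dense list of sampled points, the deviation of each data point in Q is its Euclidean distance to the nearest sampled curve point, and points whose deviation exceeds e_ths are returned as candidates for new feature points. The function name `refine_candidates` and the default β = 0.5 are hypothetical.

```python
import math


def refine_candidates(data, curve, beta=0.5):
    """Return (e_ths, indices of data points whose deviation exceeds e_ths).

    data  -- list of (t, value) data points Q
    curve -- list of (t, value) samples of the current fitted curve B (assumed dense)
    beta  -- scaling factor for the threshold (illustrative default)
    """
    # Deviation of each data point: distance to the nearest sampled curve point.
    dev = [min(math.dist(q, b) for b in curve) for q in data]
    h_qb = max(dev)        # maximum approximation deviation h(Q, B)
    e_ths = beta * h_qb    # deviation threshold e_ths = beta * h(Q, B)
    picked = [k for k, d in enumerate(dev) if d > e_ths]
    return e_ths, picked


data = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 0.2)]
curve = [(x / 10, 0.0) for x in range(31)]  # stand-in for a fitted curve
e_ths, picked = refine_candidates(data, curve)
print(e_ths, picked)
```

In a full refinement loop, the returned points would be inserted as feature points, the curve refitted, and the check repeated until the deviation is within tolerance.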

Because cubic B-splines have local support, a newly inserted feature point changes only the local shape of the fitted curve without affecting the rest of the curve.

The result of repeated optimization fitting of the cubic B-spline for one period of sampled data is shown in Figure 7. Table 5 compares the approximation deviation of the initial fit with that of the optimized fit.

Table 5: Comparison of the initial fitting deviation and the optimized fitting deviation of the sampled data

The comparison in the table shows that the approximation deviation after optimization is clearly smaller than that of the initial fit, which indicates that the method proposed by the invention fits better than the traditional method.

Step 5.3: Evaluate the fitting accuracy. The fitted data are irregular discrete points, the equation of the curve between them is unknown, and the curve is formed by fitting the feature points. After the feature points of the previous level are extracted, let two adjacent feature points be (x_k, y_k) and (x_{k+1}, y_{k+1}), and let (x_i, y_i) be the point on the curve segment between them that deviates most from the two feature points. The h value of this point, i.e. the perpendicular distance from the point (x_i, y_i) to the straight line through the points (x_k, y_k) and (x_{k+1}, y_{k+1}), is calculated by the following formula:

The h values of all points except the end points are calculated, and

is taken as the integral square error of the curve, which serves as the index for measuring the fitting accuracy of the curve. The integral square error measures how faithfully the extracted feature points represent the curve, so the original shape of the curve can be preserved.
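The accuracy measure can be illustrated with a short sketch. The formulas are not reproduced in this text, so two assumptions are made here and should be checked against the original: h is the standard perpendicular point-to-line distance, and the integral square error is the sum of the squared h values, taking one h (the largest) per segment between adjacent feature points. The function names `perp_distance` and `integral_square_error` are hypothetical.

```python
import math


def perp_distance(pt, a, b):
    """Perpendicular distance from pt to the straight line through a and b."""
    (xi, yi), (xk, yk), (xk1, yk1) = pt, a, b
    num = abs((yk1 - yk) * (xi - xk) - (xk1 - xk) * (yi - yk))
    return num / math.hypot(xk1 - xk, yk1 - yk)


def integral_square_error(points, feature_idx):
    """Sum of squared h over the segments between adjacent feature points.

    Assumed form: for each segment, h is taken at the in-between point that
    deviates most from the chord joining the two feature points.
    """
    total = 0.0
    for k0, k1 in zip(feature_idx, feature_idx[1:]):
        inner = points[k0 + 1:k1]
        if inner:
            h = max(perp_distance(p, points[k0], points[k1]) for p in inner)
            total += h * h
    return total


points = [(0, 0), (1, 1), (2, 0), (3, 0), (4, 0)]
print(integral_square_error(points, [0, 2, 4]))
```

A smaller value means the chords between feature points stay closer to the data, so adding a feature point at a sharp peak reduces the error for that segment.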

Because a curve fitted from the initial feature points alone generally cannot meet the approximation tolerance, the Hausdorff distance is used to calculate the approximation deviation, and points exceeding the deviation threshold are added as new feature points until the fitting accuracy is met. While ensuring accuracy, this fitting method greatly reduces the number of control points and improves computational efficiency.

Step 6: selecting an initial time section t after the collection amount of a plurality of collection points in the plant station is subjected to the fitting processing curve₀And a time interval. Since the collected data is data with time mark, the relation of the collected quantity and the time is set as X_iAnd (t), wherein i is the ith collection quantity number, and the collection quantity X is collected power grid operation parameters such as voltage, current and the like. The discrete points after fitting become a continuous curve, and X at any time can be obtained_i(t) numerical value, cross-section through time t₀And determination of the time interval Δ t, which may ultimately be formed within the plant at t₀,t_0+Δt,t_0+2Δt,…t_0+nΔtAnd storing the section data of the plant side according to the time axis as scales to obtain the same-section database of the plant side by using all the acquisition amount section data at the moment.

Step 7: Different plant stations process their acquired data in the same way to obtain their own same-section databases. Each plant station then uploads its same-section data to the dispatching master station, where the same-section database of the whole network is formed. This improves the accuracy of real-time power grid data and provides reliable data support for grid dispatching and operation decisions.
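The aggregation at the master station can be pictured with the sketch below. It assumes each station uploads a table keyed by time point, as in step 6, and that only time points reported by every station are kept in the whole-network table. That retention rule, the function name `merge_stations`, and the sample values are illustrative assumptions.

```python
def merge_stations(station_tables):
    """Combine per-station section tables into one whole-network table.

    station_tables -- mapping: station name -> {time point -> {quantity -> value}}
    Only time points present at every station are kept (an assumed policy).
    """
    common = set.intersection(*(set(tbl) for tbl in station_tables.values()))
    return {
        t: {(station, q): v
            for station, tbl in station_tables.items()
            for q, v in tbl[t].items()}
        for t in sorted(common)
    }


stations = {
    "station_A": {0.0: {"voltage": 220.0}, 0.5: {"voltage": 220.4}},
    "station_B": {0.0: {"current": 5.1}, 0.5: {"current": 5.0}},
}
network = merge_stations(stations)
print(network[0.5])
```

Keying each value by station and quantity keeps the origin of every measurement visible in the merged section.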

Claims

1. A full-network same-section data processing method based on information acquisition with time marks is characterized by comprising the following steps:

step 1: the realization process is divided into two stages, wherein the first stage is the formation of data with the same section in the transformer substation, and the second stage is the formation of data with the same section in the whole network between the transformer substations;

step 2: according to historical operation data of the acquisition points, determining the maximum value and the minimum value of the acquisition quantity under normal operation, and setting the upper limit of the normal operation interval of the acquisition quantity as X_max and the lower limit of the normal operation interval as X_min; according to the maximum value and the minimum value of the acquisition quantity within the last 12 months, the upper limit and the lower limit of the normal operation interval are calculated by a weighting algorithm, and the calculation formula is as follows:

and step 3: dividing the collected data beyond the normal operation range into two situations; one is that a single point or two points exceed the normal operation range, and the situation is caused by the measurement error of the acquisition device, and the processing mode is to delete data; the other is that the acquisition amount exceeds the normal operation range at multiple points, the condition is an accident, and the processing mode is to start fault filtering to record related operation data;

step 4: for the acquisition quantity data within the normal operation range, adopting a cubic B-spline curve fitting method suitable for dense data, the method comprising calculating the curvature of the discrete points, setting a reasonable curvature threshold to extract feature points, parameterizing the feature points by chord length, constructing the node vector from the feature point parameters, and solving for the control vertices by the least squares method to fit the discrete acquisition quantity data;

step 5: to obtain a high-quality curve, increasing the number of feature points, namely calculating the deviation between each data point and the initial fitting curve, and taking points whose deviation exceeds a set deviation threshold as feature points and inserting them into the initial fitting curve for re-fitting;

step 6: by selecting the time section t₀ and the time interval Δt, finally forming in the plant station the section data of all acquisition quantities at the time points t₀, t₀+Δt, t₀+2Δt, …, t₀+nΔt; the sampled data are time-stamped data, and the relation of an acquisition quantity X to time is denoted X_i(t), wherein i is the index of the acquisition quantity, and the acquisition quantity X is a power grid operating parameter such as voltage or current; after fitting, the discrete points of the acquired data become a continuous curve, so that the value of X_i(t) at any time can be obtained; through the determination of the time section t₀ and the time interval Δt, the section data of all acquisition quantities at the time points t₀, t₀+Δt, t₀+2Δt, …, t₀+nΔt are finally formed in the plant station and stored with the time axis as the scale to obtain the in-station same-section database;

step 7: each plant station processes its acquired data to obtain an in-station same-section database and uploads it to the dispatching master station, where the whole-network same-section database is formed.

2. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 1, wherein the step 4 comprises the following steps:

step 4.1: given the acquisition quantity data point vectors P_k, where k = 0, 1, 2, …, n, on the interval [a, b], every 4 points are connected in sequence into a polygon called a B characteristic polygon, and the curve obtained by fitting the B characteristic polygon with a cubic spline function is a cubic B-spline curve;

the matrix representation of the cubic B-spline curve is:

the expression of the i-th cubic B-spline curve segment is therefore:

wherein t is the parameter; P_i, P_{i+1}, P_{i+2}, P_{i+3} (i = 0, 1, 2, …, n) are four adjacent vertices of the characteristic polygon; taking j = 3, N_{j,3}(t) is the basis function of the cubic B-spline curve, and the recursion formula is as follows:

3. The method for processing the whole-network same-section data based on the time-scale information acquisition as claimed in claim 2, wherein the step 4 comprises the following steps:

step 4.2: for the cubic spline B(t), the curvature at the parameter value t_i is obtained by differentiation and is calculated by the formula:

4. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 3, wherein the step 4 comprises the following steps:

step 4.3: extracting the feature points; using the obtained curvatures of the discrete points K = (k₀, k₁, …, k_n), the principles for extracting the feature points are as follows:

(1) for curves that are not closed, the two end points must be selected;

(2) feature points are selected by a set curvature threshold, and a point whose curvature is larger than the threshold is taken as an initial feature point; with the mean curvature denoted K_avg, the curvature threshold is set to K_ths = α × K_avg, where α is a scaling factor.

5. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 4, wherein the step 4 comprises the following steps:

step 4.4: parameterizing the feature points by the chord length parameterization method, so that the data points Q₀, Q₁, …, Q_n correspond one-to-one with the nodes in the parameter domain t ∈ [0, 1]; let d be the total chord length:

then

wherein Q_k − Q_{k-1} is the chord edge vector.

6. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 5, wherein the step 4 comprises the following steps:

step 4.5: constructing the node vector by the averaging method so that the nodes are evenly distributed, namely:

t₀ = t₁ = … = t_p = 0

t_{m-p} = t_{m-p+1} = … = t_m = 1

with the interior nodes distributed by the averaging method.

7. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 6, wherein the step 4 comprises the following steps:

step 4.6: constructing a cubic B-spline curve approximating the feature points by using the node vector and the least squares method; from the cubic B-spline expression, the feature points D are approximated in the least-squares sense, i.e.:

wherein D₀ = B(0), D_r = B(1), and r is the number of feature points;

let

f is a function of the n-1 variables P₁, …, P_{n-1}; to minimize f, the partial derivative of f with respect to each control point P_l is set to zero, which gives the following equation:

this is a linear equation in the unknowns P₁, …, P_{n-1}; letting l = 1, 2, …, n-1 gives a linear system of n-1 equations in n-1 unknowns:

(N^T N) P = R

solving this system of linear equations gives the value of the unknown P, so that the cubic B-spline fitting curve is determined.

8. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 1, wherein the step 5 comprises the following steps:

step 5.1: calculating the approximation deviation, namely calculating the approximation deviation between the data points and the curve by using the Hausdorff distance; given two point sets A = {a₁, a₂, …, a_p} and B = {b₁, b₂, …, b_q}, the Hausdorff distance between A and B is:

H(A, B) = max(h(A, B), h(B, A))

wherein ‖·‖ is a distance norm between points of set A and set B, such as the Euclidean distance, H(A, B) is the bidirectional Hausdorff distance, and h(A, B) and h(B, A) are the one-way Hausdorff distances from set A to set B and from set B to set A, respectively; the bidirectional Hausdorff distance H(A, B) is the larger of the two one-way distances h(A, B) and h(B, A), and it measures the maximum degree of mismatch between the two point sets.

9. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 8, wherein the step 5 comprises the following steps:

(2) solving for the maximum approximation deviation h(Q, B);

(3) setting the deviation threshold e_ths = β × h(Q, B), where β is a scaling factor;

10. The method for processing the whole-network co-section data based on the time-scale information acquisition as claimed in claim 1, wherein the step 5 comprises the following steps:

step 5.3: evaluating the fitting accuracy; after the feature points of the previous level are extracted, two adjacent feature points are (x_k, y_k) and (x_{k+1}, y_{k+1}), and (x_i, y_i) is the point on the curve segment between them that deviates most from the two feature points; the h value of this point, i.e. the perpendicular distance from the point (x_i, y_i) to the straight line through the points (x_k, y_k) and (x_{k+1}, y_{k+1}), is calculated by the following formula:

the h values of all points except the end points are calculated, and

is taken as the integral square error of the curve, which is used as the index for measuring the fitting accuracy of the curve.