Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In an embodiment, as shown in fig. 1, an electric quantity variation factor analysis method is provided. This embodiment is described with the method applied to a terminal as an example; it is understood that the method may also be applied to a server, or to a system including a terminal and a server, in which case it is implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:
Step 102, acquiring electric quantity related original data in a target time sequence.
The electric quantity related original data are all the original data related to power consumption change in a target area within a target time interval, and are obtained by arranging all the original data in the preset time interval in chronological order.
Specifically, the electric quantity related original data in the target time sequence are acquired. For example, if the factors behind the power consumption change in the five southern provinces (Guangdong, Guangxi, Yunnan, Guizhou and Hainan) over the last two years need to be analyzed, the daily total social power consumption data, the air temperature data and the power-consumption-related policy data of the five provinces for the last two years are acquired as the original data, and the original data are arranged in chronological order to obtain the electric quantity related original data.
In one embodiment, the air temperature data may be a collection of daily average air temperature data.
Step 104, preprocessing the electric quantity related original data to obtain key factor characteristic data.
The key factor characteristic data is characteristic data corresponding to each factor which has a key influence on the electric quantity change. As can be understood, the key factor feature data includes feature data corresponding to a plurality of influence factors of a preset dimension.
Specifically, after the electric quantity related original data in the target time sequence are acquired, the electric quantity related original data are preprocessed, and feature data corresponding to each factor which has a key influence on electric quantity change are acquired. It is understood that the preprocessing of the raw data related to the electric quantity may be processing missing values in the data, performing feature coding on the data, or performing normalization processing on the data.
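As a minimal sketch of the missing-value handling mentioned above, the following Python fragment forward-fills gaps in a daily series. The function name `fill_missing` and the forward-fill strategy are assumptions made for illustration; the text does not state which imputation method is used.

```python
from typing import List, Optional, Sequence


def fill_missing(values: Sequence[Optional[float]]) -> List[float]:
    """Forward-fill missing daily readings (represented as None).

    Leading gaps are filled with the first observed value. Forward fill is
    an assumed strategy; the source does not name a specific method.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("series contains no observed values")
    filled: List[float] = []
    last = observed[0]
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical daily average air temperatures with two gaps.
daily_temps = fill_missing([None, 20.0, None, 24.0])
```

Feature coding and normalization would follow this step in the same preprocessing pass.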
Step 106, inputting the key factor characteristic data into a pre-trained electric quantity change factor analysis model, and performing fitting calculation on the key factor characteristic data to obtain electric quantity components corresponding to influence factors of all preset dimensions in the key factor characteristic data; the electric quantity change factor analysis model comprises a first characteristic input layer and a first hidden layer, and the first hidden layer is provided with fitting functions corresponding to all influence factors and used for calculating and outputting electric quantity components corresponding to all the influence factors.
The electric quantity variation factor analysis model is obtained by training an initial analysis model in advance, and may be a deep learning model that performs calculation on the input key factor characteristic data and outputs the electric quantity component corresponding to each influence factor. The electric quantity variation factor analysis model comprises a first characteristic input layer and a first hidden layer. The first hidden layer is provided with a fitting function for each influence factor, and performs fitting calculation on the input feature data of the influence factor of each preset dimension to obtain and output the corresponding electric quantity component. It will be appreciated that the type of the initial analysis model may be selected based on actual circumstances.
The influence factors of each preset dimension refer to factors of different dimensions which influence the electric quantity change. It can be understood that the preset dimension can be selected according to the actual situation of the target area and the target time interval. For example, if the reason for the change of the electric quantity in the last three months in Guangdong province needs to be analyzed, the preset dimension at this time may be a temperature dimension, a policy dimension, an economic dimension, and the like; if the reason for the change of the electric quantity of a certain factory in the last year needs to be analyzed, the preset dimension can be a factory benefit dimension, a holiday dimension and the like.
The fitting function is the final expression obtained by fitting an initial function in the initial analysis model. Specifically, iterative computation is performed on the initial function in the initial model, and the parameters of the initial function are updated according to the output result until the final expression of the fitting function is obtained. It can be understood that, because the influence factors belong to different preset dimensions, the influence factor of each preset dimension has its own fitting function in the electric quantity variation factor analysis model.
Specifically, the key factor characteristic data is input into a first characteristic input layer of a pre-trained electric quantity change factor analysis model, the first characteristic input layer inputs the characteristic data of each influence factor into a corresponding fitting function of a first hidden layer according to the dimension to which the influence factor belongs, the fitting function performs fitting calculation on the characteristic data of each influence factor, and electric quantity components corresponding to each influence factor are output.
In one embodiment, the initial analysis model may be a BP (back propagation) neural network model. The basic principle of the BP neural network model is that an input vector passes through a series of transformations in the hidden layer to produce an output vector, thereby realizing a mapping between input data and output data. The forward propagation of the input information and the backward propagation of the output error together constitute the information loop of the BP network.
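The routing from the first characteristic input layer to the per-factor fitting functions of the first hidden layer can be sketched as follows. This is an illustrative sketch only: the names `first_hidden_layer` and `fit_fns` are hypothetical, and the two placeholder fitting functions use made-up coefficients rather than trained ones.

```python
from typing import Callable, Dict

FitFn = Callable[[float], float]


def first_hidden_layer(features: Dict[str, float],
                       fit_fns: Dict[str, FitFn]) -> Dict[str, float]:
    """Apply each factor's own fitting function to that factor's feature."""
    unknown = set(features) - set(fit_fns)
    if unknown:
        raise KeyError(f"no fitting function for: {sorted(unknown)}")
    # One electric quantity component per influence factor.
    return {name: fit_fns[name](x) for name, x in features.items()}


# Placeholder fitting functions with made-up coefficients (not trained values).
fit_fns: Dict[str, FitFn] = {
    "temperature": lambda x: 2.0 * x,
    "trend": lambda x: x + 1.0,
}
components = first_hidden_layer({"temperature": 0.5, "trend": 0.25}, fit_fns)
```

Keeping the components in a dictionary keyed by factor preserves the per-factor decomposition that the later steps rely on.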
Step 108, determining the influence of each influence factor in the target time sequence on the electric quantity change based on the electric quantity component corresponding to each influence factor.
Specifically, based on the electric quantity component corresponding to each influence factor output by the first hidden layer, the influence degree of each influence factor in the target time series on the electric quantity change can be determined.
In one embodiment, determining the influence of each influence factor on the change of the electric quantity in the target time sequence based on the electric quantity component corresponding to each influence factor includes: for each influence factor, summing its electric quantity components over the target time sequence to obtain the first total electric quantity component of that factor; and calculating the difference between each first total electric quantity component and the second total electric quantity component of the same influence factor in a comparison time sequence, and determining the degree of influence of each influence factor in the target time sequence on the electric quantity change according to the difference. The comparison time sequence is the time sequence against which the target time sequence is compared to determine the electric quantity change value. It is understood that the second total electric quantity component may be pre-stored, or may be obtained by inputting the key factor characteristic data of the comparison time sequence into the electric quantity variation factor analysis model.
For example, to determine how much each influence factor contributed to the change in electric quantity in Guangdong province between April 2021 and April 2020, the target time sequence is the whole of April 2021 and the comparison time sequence is the whole of April 2020. The key factor characteristic data for each day of April 2021 are input into the electric quantity change factor analysis model to obtain, for each day, the component of the daily electric quantity attributable to each influence factor; the daily components of each influence factor are then summed to obtain that factor's first total electric quantity component for the whole month. The second total electric quantity components of each influence factor for the whole of April 2020 are then obtained, and the difference between each first total electric quantity component and the corresponding second total electric quantity component is calculated. If the first total air temperature electric quantity component is 200 thousand and the second total air temperature electric quantity component is 250 thousand, the year-on-year effect of the air temperature change on the electricity consumption of Guangdong province is quantified as 200 thousand minus 250 thousand, that is, a change of 50 thousand in magnitude; with these figures the first total is the smaller one, so the air temperature component fell by 50 thousand.
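The summation and year-on-year difference described above can be sketched as follows. The function names and the daily figures are made up for illustration; the daily values are chosen only so that the air temperature totals match the 200 and 250 figures of the example (in thousands).

```python
from typing import Dict, List

DailyComponents = List[Dict[str, float]]


def total_components(daily: DailyComponents) -> Dict[str, float]:
    """Sum each factor's daily components over a whole time sequence."""
    totals: Dict[str, float] = {}
    for day in daily:
        for factor, value in day.items():
            totals[factor] = totals.get(factor, 0.0) + value
    return totals


def component_change(target: DailyComponents,
                     comparison: DailyComponents) -> Dict[str, float]:
    """First total minus second total, per influence factor."""
    first = total_components(target)
    second = total_components(comparison)
    if set(first) != set(second):
        raise ValueError("target and comparison must cover the same factors")
    return {factor: first[factor] - second[factor] for factor in first}


# Hypothetical two-day sequences, values in thousands.
target = [{"temperature": 120.0, "holiday": -10.0},
          {"temperature": 80.0, "holiday": -5.0}]
comparison = [{"temperature": 150.0, "holiday": -10.0},
              {"temperature": 100.0, "holiday": 0.0}]
change = component_change(target, comparison)
```

A negative entry means the factor contributed less electric quantity in the target sequence than in the comparison sequence.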
In one embodiment, determining the influence of each influence factor on the change of the electric quantity in the target time series based on the electric quantity component corresponding to each influence factor includes: and calculating the proportion of each influence factor in the electric quantity change based on the electric quantity component corresponding to each influence factor, determining the proportion as the influence proportion of each influence factor in the target time sequence on the electric quantity change, and determining the influence of each influence factor in the target time sequence on the electric quantity change according to the influence proportion.
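One possible reading of the proportion calculation is sketched below, assuming that the influence proportion of a factor is its total electric quantity component divided by the sum of all factors' components. This definition and the name `influence_shares` are assumptions; the text does not fix the exact formula.

```python
from typing import Dict


def influence_shares(totals: Dict[str, float]) -> Dict[str, float]:
    """Share of each factor's component in the sum of all components.

    Assumed definition: share = component / sum(components).
    """
    overall = sum(totals.values())
    if overall == 0:
        raise ValueError("components sum to zero; shares are undefined")
    return {factor: value / overall for factor, value in totals.items()}


# Hypothetical total components for two factors.
shares = influence_shares({"temperature": 50.0, "trend": 150.0})
```

If components can have opposite signs, a definition based on absolute values may be preferable; that choice is left open here.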
In the above electric quantity change factor analysis method, electric quantity related original data in a target time sequence are acquired and preprocessed to obtain key factor characteristic data that influence electric quantity change; the key factor characteristic data are input into a pre-trained electric quantity change factor analysis model, which performs fitting calculation to obtain the electric quantity component corresponding to the influence factor of each preset dimension. The influence of each influence factor on the electric quantity change in the target time sequence is then determined based on its electric quantity component. When the reason for the electric quantity change in a certain period needs to be analyzed, the contribution of each influence factor in that period can be determined from its electric quantity component. Because multiple influence factors are considered together, the influence of each factor on the electric quantity change value is quantified more accurately, which gives a more useful reference for subsequent analysis of regional and industrial power consumption.
In one embodiment, the electricity-related raw data includes electricity usage data, air temperature data, and electricity-related policy data, and the key factor characteristic data includes air temperature characteristic data, vacation characteristic data, trend characteristic data, and policy characteristic data. As shown in fig. 2, preprocessing the raw data related to the electric quantity to obtain the characteristic data of the key factors includes the following steps:
Step 202, encoding the electric quantity related original data based on a preset encoding mapping relation to obtain feature encoding data, wherein the feature encoding data comprises air temperature encoding data, first vacation encoding data, trend encoding data and policy encoding data.
The preset coding mapping relation is the corresponding relation between the original data and the characteristic coded data, and the corresponding characteristic coded data can be obtained according to the original data through the preset coding mapping relation. Specifically, the preset coding mapping relationship is a mapping relationship obtained by coding training data when an initial model is trained, the preset coding mapping relationship is stored in the preprocessing module in advance, and when the original data needs to be preprocessed in practical application, the preprocessing module processes the original data according to the preset coding mapping relationship. It can be understood that the preset coding mapping relationship may be a preset coding rule, a preset coding correspondence table, and the like, and the influence factor of each dimension has a corresponding preset coding mapping relationship.
The temperature coded data is coded data obtained by processing temperature data in the original data related to the electric quantity according to a preset coding mapping relation.
The first vacation encoding data is encoding data generated according to the dates corresponding to the electric quantity related original data, and is obtained from the dates according to the vacation coding mapping relation in the preset coding mapping relation. Specifically, the dates in the target time interval are sorted in chronological order, and the holiday type of each date is determined according to the holiday to which the date corresponds. Each date is then coded according to its holiday type and the preset coding mapping relation to obtain the first vacation encoding data. For example, holiday types may include weekdays, weekends, short holidays, long holidays, the Spring Festival, and the like.
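A minimal sketch of such a holiday coding is given below. The integer codes in `HOLIDAY_TYPE_CODES`, the holiday calendar passed in, and the rule that unlisted Saturdays and Sundays count as weekends are all hypothetical; the actual mapping is whatever the preset coding mapping relation defines.

```python
from datetime import date
from typing import Dict, Iterable, List

# Hypothetical code table for the holiday types named in the text.
HOLIDAY_TYPE_CODES: Dict[str, int] = {
    "weekday": 0,
    "weekend": 1,
    "short_holiday": 2,
    "long_holiday": 3,
    "spring_festival": 4,
}


def holiday_type(day: date, calendar: Dict[date, str]) -> str:
    """Look the date up in a holiday calendar, else fall back to the weekday."""
    if day in calendar:
        return calendar[day]
    return "weekend" if day.weekday() >= 5 else "weekday"


def encode_holidays(days: Iterable[date],
                    calendar: Dict[date, str]) -> List[int]:
    """Sort the dates chronologically, then code each by its holiday type."""
    return [HOLIDAY_TYPE_CODES[holiday_type(d, calendar)] for d in sorted(days)]


# Made-up calendar entry used only for this example.
calendar = {date(2021, 4, 5): "short_holiday"}
codes = encode_holidays(
    [date(2021, 4, 5), date(2021, 4, 2), date(2021, 4, 3)], calendar)
```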
The trend coded data is coded data generated according to the dates corresponding to the electric quantity related original data. The dates in the target time interval are sorted in chronological order, and the trend coded data are obtained from the dates according to the trend coding mapping relation in the preset coding mapping relation. For example, as China's economy and society develop steadily, power consumption shows an overall rising trend over time; therefore the dates in the target time interval are sorted in chronological order, the date with the earliest year and month is selected as the base date code, and the dates in the target time interval are coded in ascending order from it to obtain the trend coded data.
The policy coding data is generated according to power consumption related policy data in the power consumption related original data and the corresponding date. And after arranging the dates in the target time interval according to the time sequence, obtaining policy coding data according to the policy data corresponding to the dates and the policy coding mapping relation in the preset coding mapping relation.
Specifically, the original data related to the electric quantity is encoded based on a preset encoding mapping relation, and corresponding air temperature encoding data, first vacation encoding data, trend encoding data and policy encoding data are obtained.
Step 204, normalizing each item of feature coded data according to its corresponding normalization processing parameters to obtain air temperature feature data, vacation feature data, trend feature data and policy feature data.
Normalization limits the processed data to a certain range, which simplifies subsequent data processing and speeds up convergence when the program runs. Its purpose is to unify the statistical distribution of the samples: normalization to the interval between 0 and 1 yields a statistical probability distribution, while normalization to some other interval yields a statistical coordinate distribution.
Specifically, normalization processing is performed on each feature coded data according to a normalization processing parameter corresponding to each feature coded data, so that air temperature feature data, vacation feature data, trend feature data and policy feature data are obtained.
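The use of stored normalization parameters can be sketched as follows, assuming min-max scaling to the interval from 0 to 1. The `NormParams` type and the choice of min-max scaling are assumptions; the text only says that each kind of feature coded data has its own normalization processing parameters.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class NormParams:
    """Normalization parameters fixed at training time (assumed min-max)."""
    minimum: float
    maximum: float


def normalize(values: Sequence[float], params: NormParams) -> List[float]:
    """Scale values with stored parameters so training and use stay consistent."""
    span = params.maximum - params.minimum
    if span <= 0:
        raise ValueError("maximum must be greater than minimum")
    return [(v - params.minimum) / span for v in values]


# Hypothetical parameters for air temperature coded data.
temp_features = normalize([10.0, 20.0, 30.0], NormParams(10.0, 30.0))
```

Reusing the parameters from training, rather than recomputing them on new data, keeps the inputs on the scale the fitting functions were trained on.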
In this embodiment, the obtained electric quantity-related raw data is encoded based on a preset encoding mapping relationship to obtain each feature encoding data corresponding to the influence factor, and then normalization processing is performed on each encoding data to obtain air temperature feature data, holiday feature data, trend feature data and policy feature data corresponding to the influence factor. The data classification after the preprocessing is clearer, the subsequent input into the model is more convenient to process, and a data basis is provided for obtaining the electric quantity components corresponding to the influence factors through the electric quantity change factor analysis model.
In conventional methods, the air temperature component is fitted with a polynomial function (generally a linear or quadratic function). However, repeated visual analysis of daily electric quantity against daily average air temperature in experiments shows that the relationship cannot be fitted adequately by a single linear or quadratic function. For example, the daily electric quantity in Guangdong, Guangxi and Hainan is sensitive to the high temperature range: in Guangdong province, the daily electric quantity in the low temperature range (below 22 ℃) has low sensitivity to the daily average air temperature and decreases slightly as the daily average air temperature decreases, whereas in the high temperature range (above 22 ℃) it has high sensitivity and increases greatly as the daily average air temperature increases, as shown in FIG. 3. The daily electric quantity in Yunnan and Guizhou is sensitive to the low temperature range: in Yunnan, the daily electric quantity in the low temperature range (below 22 ℃) has high sensitivity to the daily average air temperature and rises greatly as the daily average air temperature decreases, whereas in the high temperature range (above 22 ℃) it has low sensitivity and rises only slightly as the daily average air temperature rises, as shown in FIG. 4.
Therefore, in one embodiment, the fitting function corresponding to the temperature characteristic data uses a piecewise quadratic fitting function obtained by training based on the daily temperature and the daily electric quantity.
The piecewise quadratic fitting function is a function that has different analytic expressions over different value ranges of the independent variable. The formula is as follows:
$$T_t = \begin{cases} a\_l_{temp}\,\left(x^{temp}_t - h_{temp}\right)^2 + k_{temp}, & x^{temp}_t \le h_{temp} \\ a\_r_{temp}\,\left(x^{temp}_t - h_{temp}\right)^2 + k_{temp}, & x^{temp}_t > h_{temp} \end{cases}$$
wherein $T_t$ is the electric quantity component influenced by the air temperature factor; $x^{temp}_t$ is the preprocessed air temperature characteristic data; $(h_{temp}, k_{temp})$ is the common vertex of the two quadratic function curves of the piecewise function, wherein $h_{temp}$ is the boundary of the piecewise function and is trained and optimized through the initial analysis model; $(a\_l_{temp}, a\_r_{temp})$ are the parameters of the two piecewise quadratic function curves, which control the sensitivity of the daily electric quantity to the daily air temperature and are optimized through the BP neural network. The daily air temperature may be the daily average air temperature.
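A piecewise quadratic with a common vertex can be sketched as below, assuming the vertex form a(x - h)^2 + k on each side of the boundary h, which is the form implied by the two curves sharing the vertex (h_temp, k_temp). The class name and the parameter values are made up for illustration; in the method they would be learned during training.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PiecewiseQuadratic:
    h: float    # boundary between the two branches (abscissa of the vertex)
    k: float    # component value at the common vertex
    a_l: float  # quadratic coefficient left of the boundary
    a_r: float  # quadratic coefficient right of the boundary

    def __call__(self, x: float) -> float:
        # Both branches pass through (h, k), so the curve is continuous there.
        a = self.a_l if x <= self.h else self.a_r
        return a * (x - self.h) ** 2 + self.k


# Made-up parameters: smaller coefficient below the boundary, larger above.
temp_fit = PiecewiseQuadratic(h=22.0, k=100.0, a_l=0.5, a_r=2.0)
```

Using separate coefficients on each side lets the same function express a weak response in one temperature range and a strong response in the other.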
The Guangdong air temperature data in FIG. 3 are fitted using the piecewise quadratic fitting function, and the resulting fitted curve is shown in FIG. 5; the Yunnan air temperature data in FIG. 4 are fitted in the same way, and the resulting fitted curve is shown in FIG. 6. As can be seen from the figures, fitting the air temperature characteristic data with the piecewise quadratic fitting function gives higher fitting precision and a more accurate fitting result.
In one embodiment, the fitting function to which the vacation feature data corresponds is a quadratic function.
Specifically, the formula of the vacation fitting function is as follows:
$$H_t = a_{holiday}\,\left(x^{holiday}_t\right)^2 + b_{holiday}\,x^{holiday}_t + c_{holiday}$$
wherein $H_t$ is the electric quantity component influenced by the vacation factor; $x^{holiday}_t$ is the preprocessed vacation characteristic data; $a_{holiday}$, $b_{holiday}$ and $c_{holiday}$ are the parameters of the fitting function, which are optimized through the initial analysis model.
In one embodiment, the fitting function corresponding to the trend characteristic data is a linear function.
Specifically, the formula of the trend fitting function is as follows:
$$C_t = a_{trend}\,x^{trend}_t + b_{trend}$$
wherein $C_t$ is the electric quantity component influenced by the trend factor; $x^{trend}_t$ is the preprocessed trend characteristic data; $a_{trend}$ and $b_{trend}$ are the parameters of the fitting function, which are optimized through the initial analysis model.
In one embodiment, the fitting function to which the policy characteristics data corresponds is a quadratic function.
Specifically, the formula of the policy fitting function is as follows:
$$P_t = a_{policy}\,\left(x^{policy}_t\right)^2 + b_{policy}\,x^{policy}_t + c_{policy}$$
wherein $P_t$ is the electric quantity component influenced by the policy factor; $x^{policy}_t$ is the preprocessed policy characteristic data; $a_{policy}$, $b_{policy}$ and $c_{policy}$ are the parameters of the fitting function, which are optimized through the initial analysis model.
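The quadratic and linear fitting functions for the vacation, trend and policy factors can be sketched as small callable objects. The standard forms a*x^2 + b*x + c and a*x + b are assumed here, and all coefficient values are placeholders rather than trained parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quadratic:
    """Assumed form: a*x^2 + b*x + c (vacation and policy factors)."""
    a: float
    b: float
    c: float

    def __call__(self, x: float) -> float:
        return self.a * x * x + self.b * x + self.c


@dataclass(frozen=True)
class Linear:
    """Assumed form: a*x + b (trend factor)."""
    a: float
    b: float

    def __call__(self, x: float) -> float:
        return self.a * x + self.b


# Placeholder coefficients for illustration only.
holiday_fit = Quadratic(a=1.0, b=2.0, c=3.0)
trend_fit = Linear(a=2.0, b=1.0)
policy_fit = Quadratic(a=-1.0, b=0.0, c=4.0)
```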
In one embodiment, as shown in fig. 7, the training process of the electricity quantity variation factor analysis model includes the following steps:
Step 702, acquiring an original training data set.
Specifically, historical data related to electricity usage in different regions within a preset time sequence is acquired as the original training data set.
Step 704, preprocessing the original training data set to obtain a training feature data set.
Specifically, after an original training data set is obtained, the original training data set is preprocessed, and training feature data corresponding to each factor which has a key influence on electric quantity change is obtained. It will be appreciated that the pre-processing of the raw training data set may be to process missing values in the data, to feature code or normalize the data, etc.
Step 706, inputting the training feature data set into an initial analysis model for fitting calculation, and obtaining a fitting electric quantity value through an output layer in the initial analysis model, wherein the initial analysis model comprises a second feature input layer, a second hidden layer and an output layer.
The initial analysis model is a built neural network model and comprises a second characteristic input layer, a second hidden layer and an output layer.
Specifically, a second characteristic input layer of the initial analysis model is connected with a second hidden layer, after receiving input training characteristic data, the second characteristic input layer inputs the training characteristic data into a corresponding fitting function arranged in the second hidden layer according to the influence factor dimensionality corresponding to the training characteristic data, after the second hidden layer performs fitting calculation on each training characteristic data, electric quantity components corresponding to each influence factor are output, each electric quantity component is input into an output layer, and the output layer obtains a fitting electric quantity value according to the sum of each electric quantity component.
Step 708, adjusting parameters of a second characteristic input layer, a second hidden layer and an output layer in the initial analysis model according to the deviation between the fitted electric quantity value and the real electric quantity value in the historical electric quantity data, updating the initial analysis model, and determining whether the current initial analysis model meets the requirements or not based on the fitted electric quantity value and the real electric quantity value in the historical electric quantity data.
The real electric quantity value is the actual electricity consumption of the training area. Specifically, the real electric quantity value corresponding to a fitted electric quantity value can be looked up in the historical electric quantity data according to the date of the fitted electric quantity value.
Specifically, a fitting electric quantity value output by the initial analysis model is compared with a real electric quantity value corresponding to the fitting electric quantity value, parameters of a second characteristic input layer, a second hidden layer and an output layer in the initial analysis model are adjusted according to the deviation of the fitting electric quantity value and the real electric quantity value, and the initial analysis model is updated.
In one embodiment, adjusting the parameters of the second characteristic input layer, the second hidden layer and the output layer in the initial analysis model according to the deviation of the fitted electric quantity value from the real electric quantity value comprises: inputting the fitting electric quantity value and the real electric quantity value into an objective function to obtain an objective function value, and if the objective function value does not meet a preset objective function difference value, determining that the current initial analysis model does not meet the requirements; and adjusting parameters of a second characteristic input layer, a second hidden layer and an output layer in the initial analysis model by using an error back propagation mechanism of the initial analysis model and combining a gradient descent optimization algorithm.
In one embodiment, adjusting the parameters of the second characteristic input layer, the second hidden layer and the output layer in the initial analysis model according to the deviation of the fitted electric quantity value from the real electric quantity value comprises: calculating a difference value between the fitting electric quantity value and the real electric quantity value, comparing the difference value with a preset difference value, and if the difference value does not meet the preset difference value, determining that the current initial analysis model does not meet the requirements; and adjusting parameters of a second characteristic input layer, a second hidden layer and an output layer in the initial analysis model by utilizing an initial analysis model error back propagation mechanism and combining a gradient descent optimization algorithm.
Step 710, iteratively returning to the step of inputting the training characteristic data set into the initial analysis model for fitting calculation to obtain a fitted electric quantity value, and continuing training until a training end condition is met, to obtain a trained electric quantity change factor analysis model.
Specifically, after the parameters of the second feature input layer, the second hidden layer and the output layer in the initial analysis model are adjusted, the process returns to the step of inputting the training feature data set into the initial analysis model for fitting calculation to obtain a fitted electric quantity value, and training continues until the training end condition is met. When the training end condition is met, fitting calculation on the input characteristic data using the parameters of the fitting functions in the second hidden layer yields accurate electric quantity component values, and the fitted electric quantity value output by the output layer is close to the real electric quantity value. The current initial analysis model is then determined to meet the requirements, its output layer is removed, and the initial analysis model without the output layer is determined as the electric quantity change factor analysis model.
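The training loop can be illustrated with a deliberately reduced sketch that keeps only a linear trend component and a quadratic holiday component. The mean squared error objective, the plain gradient descent update, the learning rate, the tolerance and the synthetic data are all assumptions made for this example; the text only requires an objective function, error back propagation and a gradient descent optimization algorithm.

```python
from typing import Dict, List, Tuple

# (trend feature, holiday feature, real electric quantity), features in [0, 1]
Sample = Tuple[float, float, float]
Params = Dict[str, float]


def components(p: Params, x_trend: float, x_hol: float) -> Dict[str, float]:
    """Hidden layer: one fitting function per influence factor."""
    return {
        "trend": p["a_trend"] * x_trend + p["b_trend"],
        "holiday": p["a_hol"] * x_hol ** 2 + p["b_hol"] * x_hol + p["c_hol"],
    }


def fitted_value(p: Params, x_trend: float, x_hol: float) -> float:
    """Output layer: the fitted electric quantity is the sum of the components."""
    return sum(components(p, x_trend, x_hol).values())


def mse(p: Params, data: List[Sample]) -> float:
    """Assumed objective function: mean squared error."""
    return sum((fitted_value(p, xt, xh) - y) ** 2 for xt, xh, y in data) / len(data)


def train(data: List[Sample], lr: float = 0.05, tol: float = 1e-4,
          max_epochs: int = 5000) -> Params:
    p: Params = {k: 0.0 for k in ("a_trend", "b_trend", "a_hol", "b_hol", "c_hol")}
    for _ in range(max_epochs):
        if mse(p, data) <= tol:  # training end condition
            break
        grad = {k: 0.0 for k in p}
        for xt, xh, y in data:
            # Derivative of the mean squared error w.r.t. the fitted value.
            err = 2.0 * (fitted_value(p, xt, xh) - y) / len(data)
            grad["a_trend"] += err * xt
            grad["b_trend"] += err
            grad["a_hol"] += err * xh ** 2
            grad["b_hol"] += err * xh
            grad["c_hol"] += err
        for k in p:  # gradient descent update
            p[k] -= lr * grad[k]
    return p


# Synthetic, made-up training samples.
data: List[Sample] = [
    (i / 9, (i % 3) / 2, 1.0 + 0.5 * (i / 9) - 0.3 * ((i % 3) / 2))
    for i in range(10)
]
trained = train(data)
```

After training, calling `components(trained, ...)` instead of `fitted_value(trained, ...)` corresponds to removing the output layer: the model then returns the per-factor electric quantity components rather than their sum.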
In one embodiment, the fitting electric quantity value and the real electric quantity value are input into the objective function to obtain an objective function value, and if the objective function value meets a preset objective function difference value, it is determined that the current initial analysis model meets the training end condition.
In one embodiment, a difference value between the fitting electric quantity value and the real electric quantity value is calculated, the difference value is compared with a preset difference value, and if the difference value meets the preset difference value, it is determined that the current initial analysis model meets the training end condition.
In this embodiment, the obtained original training data set is preprocessed to obtain the training characteristic data set, the training characteristic data set is input into the initial analysis model, and the initial analysis model is trained iteratively to obtain an electric quantity change factor analysis model that can accurately calculate the electric quantity component corresponding to each influence factor, which effectively improves the accuracy of quantifying the influence of each factor on the electric quantity change value.
In one embodiment, the raw training data set includes historical power consumption data, historical air temperature data, and historical policy data related to power consumption over a preset time sequence, and the training feature data set includes training air temperature feature data, training holiday feature data, training trend feature data, and training policy feature data. As shown in fig. 8, preprocessing the original training data set to obtain a training feature data set includes the following steps:
Step 802, performing exponentially weighted moving average temperature calculation on the historical temperature data to obtain historical temperature coded data.
The exponentially weighted moving average is used for estimating a local mean of the variable, so that the updating of the variable is related to historical values in a period of time. The calculation of the exponentially weighted moving average temperature is carried out on the historical temperature data, so that the robustness of the model can be effectively increased, and the temperature characteristic information can be fully mined.
Specifically, the exponentially weighted moving average air temperature calculation formula is as follows:
wherein $temp\_ewm_t$ is the exponentially weighted moving average air temperature on day t; $temp_t$ is the actual average air temperature on day t; β is the decay factor, whose specific magnitude can be set according to actual conditions, and in the present embodiment β is set to 2/3.
The historical air temperature data are processed with the exponentially weighted moving average air temperature calculation formula to obtain the historical air temperature coded data.
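The exponentially weighted moving average calculation of step 802 can be sketched as follows. This is a minimal illustration only: it assumes the standard recurrence temp_ewm_t = β·temp_ewm_(t-1) + (1 − β)·temp_t with β = 2/3, and it assumes that the first day is seeded with its own actual temperature. The function name `ewm_temperature` and the sample temperatures are hypothetical.

```python
def ewm_temperature(temps, beta=2 / 3):
    """Return the exponentially weighted moving average of daily temperatures.

    Assumed recurrence: ewm[t] = beta * ewm[t-1] + (1 - beta) * temps[t].
    Assumption: the first value is seeded with the first observation.
    """
    if not temps:
        return []
    smoothed = [float(temps[0])]
    for temp in temps[1:]:
        smoothed.append(beta * smoothed[-1] + (1 - beta) * temp)
    return smoothed


# Hypothetical daily average air temperatures in degrees Celsius.
daily_mean_temps = [12.0, 15.0, 9.0, 9.0]
temp_ewm = ewm_temperature(daily_mean_temps)
print(temp_ewm)
```

A larger β gives more weight to the history and produces a smoother series; a smaller β tracks the daily temperature more closely.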
Step 804, obtaining historical date data according to the preset time sequence, and encoding the historical date data to obtain historical vacation coded data and historical trend coded data.
The historical date data is obtained according to the preset time sequence. It is understood that the historical date data is obtained by collecting the dates in the preset time sequence.
Specifically, the dates in the preset time sequence are collected to obtain the historical date data, and the historical vacation coded data and the historical trend coded data are obtained by encoding the historical date data.
In one embodiment, encoding the historical date data to obtain the historical trend coded data includes: encoding the historical date data by using an ordinal encoding (Ordinal Encoder) rule to obtain the historical trend coded data. Specifically, the earliest year and month in the historical date data is used as the base date, whose code value is 1; the code value of the next month is 2, and so on, thereby obtaining the historical trend coded data.
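As an illustration of the ordinal encoding rule described above, the following sketch assigns code 1 to every date in the earliest year and month and increases the code by 1 for each later month. The function name `trend_codes` and the sample dates are hypothetical and are not part of the original method.

```python
from datetime import date


def trend_codes(dates):
    """Ordinal month encoding: the earliest year-month gets 1, the next month 2, ..."""
    base = min(dates)
    base_index = base.year * 12 + base.month
    return [d.year * 12 + d.month - base_index + 1 for d in dates]


# Hypothetical dates spanning a little more than one year.
sample_dates = [date(2020, 1, 1), date(2020, 1, 31), date(2020, 2, 1), date(2021, 1, 15)]
print(trend_codes(sample_dates))  # [1, 1, 2, 13]
```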
Step 806, encoding the historical policy data to obtain historical policy encoded data.
The historical policy data is policy data related to historical electricity consumption in a preset time sequence. It will be appreciated that the policies in the historical policy data correspond to a certain date or dates in the historical date.
Specifically, historical policy data is encoded to obtain historical policy encoded data. The encoding can be automatic encoding by using a preset encoding function or a preset encoding table, or manual encoding by using manual labeling.
For example, in some specific periods the effect of policies on power consumption is very significant. In the second half of 2021, the power supply and demand situation and the dual control of energy consumption in Yunnan were both very tight, and government departments, together with power grid enterprises, carried out two rounds of power limiting and production limiting measures in total. The first round was orderly power consumption concentrated in May and June, mainly addressing the power gap caused by insufficient startup of thermal power plants and the shortage of hydropower in the dry season. The second round, the dual control of energy consumption, was executed from September, aiming to restrain the blind development of 'two high' projects and to strengthen the control of key industries such as steel, cement, yellow phosphorus, electrolytic aluminum, industrial silicon and coal. During the two rounds of power limiting and production limiting, the daily electric quantity was remarkably reduced. Thus, for a region within the scope of the policy, such as Yunnan, manual encoding may be used: dates unaffected by the policy are encoded as 1, the first power limiting period as 0.75, and the second power limiting period as 0.8.
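The manual policy encoding in this example could be held in a preset encoding table and applied by date, as in the following sketch. The code values 1, 0.75 and 0.8 are taken from the example above; the exact start and end dates of the two periods, the table name `POLICY_WINDOWS` and the function name `policy_code` are assumptions made purely for illustration.

```python
from datetime import date

# Hypothetical policy windows (inclusive start, inclusive end, code value).
# The end date of the second round is not stated in the text and is assumed here.
POLICY_WINDOWS = [
    (date(2021, 5, 1), date(2021, 6, 30), 0.75),   # first round: orderly power consumption
    (date(2021, 9, 1), date(2021, 12, 31), 0.8),   # second round: dual control of energy consumption
]


def policy_code(day, windows=POLICY_WINDOWS, default=1.0):
    """Return the policy code of a date; dates outside every window get the default 1."""
    for start, end, code in windows:
        if start <= day <= end:
            return code
    return default


print(policy_code(date(2021, 5, 15)), policy_code(date(2021, 7, 1)), policy_code(date(2021, 9, 1)))
```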
Step 808, respectively carrying out normalization processing on each coded data to obtain training air temperature characteristic data, training vacation characteristic data, training trend characteristic data and training policy characteristic data.
Normalization processing limits the data to be processed to a certain range, which facilitates subsequent data processing and accelerates convergence when the program runs. Its specific function is to unify the statistical distribution of the samples. Normalization to the interval between 0 and 1 gives a statistical probability distribution, and normalization to some other interval gives a statistical coordinate distribution.
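Specifically, the historical air temperature coded data is normalized by the following formula:
$$\widetilde{temp\_ewm}_t = \frac{temp\_ewm_t - \overline{temp\_ewm}}{\sigma_{temp\_ewm}}$$
wherein $\widetilde{temp\_ewm}_t$ is the normalized training air temperature characteristic data for day t; $temp\_ewm_t$ is the historical air temperature coded data for day t; $\overline{temp\_ewm}$ is the average value of all $temp\_ewm_t$ for the area; and $\sigma_{temp\_ewm}$ is the standard deviation of all $temp\_ewm_t$ for the area.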
The historical vacation coded data is normalized by the following formula:
$$\widetilde{holiday}_t = \frac{holiday_t - \overline{holiday}}{\sigma_{holiday}}$$
wherein $\widetilde{holiday}_t$ is the normalized training vacation characteristic data for day t; $holiday_t$ is the historical vacation coded data for day t; $\overline{holiday}$ is the average value of all $holiday_t$ for the area; and $\sigma_{holiday}$ is the standard deviation of all $holiday_t$ for the area.
The historical trend coded data is normalized by the following formula:
$$\widetilde{trend}_t = \frac{trend_t - \overline{trend}}{\sigma_{trend}}$$
wherein $\widetilde{trend}_t$ is the normalized training trend characteristic data for day t; $trend_t$ is the historical trend coded data for day t; $\overline{trend}$ is the average value of all $trend_t$ for the area; and $\sigma_{trend}$ is the standard deviation of all $trend_t$ for the area.
The historical policy coded data is normalized by the following formula:
$$\widetilde{policy}_t = \frac{policy_t - \overline{policy}}{\sigma_{policy}}$$
wherein $\widetilde{policy}_t$ is the normalized training policy characteristic data for day t; $policy_t$ is the historical policy coded data for day t; $\overline{policy}$ is the average value of all $policy_t$ for the area; and $\sigma_{policy}$ is the standard deviation of all $policy_t$ for the area.
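The four normalizations above share the same form: subtract the average value for the area and divide by the standard deviation for the area. A minimal sketch follows, assuming the population standard deviation is intended and that the mean and standard deviation are kept as normalization parameters for later reuse; the function names `fit_normalizer` and `normalize` are hypothetical.

```python
import statistics


def fit_normalizer(values):
    """Return the (mean, standard deviation) normalization parameters of a coded series.

    Assumption: the population standard deviation is used.
    """
    return statistics.fmean(values), statistics.pstdev(values)


def normalize(values, mean, std):
    """Apply (value - mean) / std; a constant series (std == 0) maps to zeros."""
    if std == 0:
        return [0.0 for _ in values]
    return [(value - mean) / std for value in values]


# Hypothetical historical trend coded data (ordinal month codes).
trend_coded = [1, 1, 2, 2, 3, 3]
mean, std = fit_normalizer(trend_coded)
print(normalize(trend_coded, mean, std))
```

Keeping the fitted parameters separate from the transformation allows the same mean and standard deviation to be applied to data outside the training set.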
In this embodiment, the original training data set is encoded to obtain the coded data corresponding to each influence factor, and the coded data is then normalized to obtain the training air temperature characteristic data, the training vacation characteristic data, the training trend characteristic data and the training policy characteristic data corresponding to the influence factors. The preprocessed data is classified more clearly and is more convenient to input into the model for processing, which provides a data basis for obtaining the final electric quantity change factor analysis model by training the initial analysis model.
When the historical vacation coded data is obtained, directly using the electric quantity corresponding to the historical date data as the target for encoding easily produces extreme coded values. For example, the annual electricity consumption peak in Yunnan and Guizhou occurs in winter, when heating causes the electric quantity to rise sharply, so the electric quantity during the Spring Festival travel rush and the Spring Festival is generally larger than that in non-winter periods. Target encoding directly on the electric quantity would therefore make the coded values for the Spring Festival travel rush and the Spring Festival too large.
Therefore, in one embodiment, as shown in fig. 9, encoding the historical date data to obtain the historical vacation encoding data comprises the following steps:
Step 902, classifying the historical date data according to the holiday types corresponding to the dates, and setting holiday labels for the dates.
Specifically, each date has a corresponding holiday type; each date is bound to its holiday type, and a holiday label is set for each date. The holiday labels include working days, Saturdays, Sundays, short holidays (New Year's Day, Qingming Festival, Dragon Boat Festival, Mid-Autumn Festival), long holidays (Labor Day, National Day), the Spring Festival (two weeks before and two weeks after the Spring Festival), the Spring Festival travel rush, and the like. A short holiday is divided into short holiday first day, second day and third day; a long holiday is divided into long holiday first day, second day … seventh day; the Spring Festival travel rush is divided into travel rush first day, second day … thirty-fifth day, and the like.
Step 904, according to the holiday label corresponding to each date, determining a second holiday code value corresponding to each date from the preset code value table, and collecting to obtain second holiday code data.
The preset coding value table is a coding value table preset according to actual experience and conditions, and each holiday label can be inquired into a corresponding preset coding value in the preset coding value table.
Specifically, according to the holiday label corresponding to each date, the second holiday code value corresponding to each date is inquired and determined from the preset code value table, and the second holiday code data is obtained by the second holiday code value set.
For example, in the preset coding value table, the coding value corresponding to a working day is 1; the coding value corresponding to Saturday is 0.75; the coding value corresponding to Sunday is 0.5; the coding value of the first day of a short holiday or long holiday is 0; the dates from the first day of the holiday to the first working day after the holiday are coded as an arithmetic sequence over the interval [0, 1]; the dividing date within the Spring Festival holiday is coded as 0, the dates from the last working day before the Spring Festival travel rush to the dividing date are coded as an arithmetic sequence over the interval [1, 0], and the dates from the dividing date to the first working day after the Spring Festival travel rush are coded as an arithmetic sequence over the interval [0, 1].
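The arithmetic sequence coding in this example can be sketched as follows. The values for working days, Saturdays and Sundays come from the example above; the names `PRESET_CODES`, `arithmetic_codes` and `holiday_ramp` are hypothetical, and the sketch assumes that the end points of each interval are included in the sequence.

```python
# Preset coding values stated in the example for ordinary days.
PRESET_CODES = {"working_day": 1.0, "saturday": 0.75, "sunday": 0.5}


def arithmetic_codes(n_dates, start, stop):
    """Return n_dates evenly spaced code values from start to stop, end points included."""
    if n_dates == 1:
        return [float(start)]
    step = (stop - start) / (n_dates - 1)
    return [start + i * step for i in range(n_dates)]


def holiday_ramp(n_holiday_days):
    """Codes for the holiday days plus the first working day after the holiday.

    The first holiday day is coded 0 and the first working day afterwards is coded 1.
    """
    return arithmetic_codes(n_holiday_days + 1, 0.0, 1.0)


# A three-day short holiday followed by the first working day.
print(holiday_ramp(3))
# A descending sequence over [1, 0], as used before the dividing date of the Spring Festival.
print(arithmetic_codes(5, 1.0, 0.0))
```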
Step 906, inputting the second vacation coded data, the historical temperature coded data, the historical trend coded data and the historical policy coded data into the initial analysis model, and acquiring a first vacation electric quantity component output by a hidden layer of the initial analysis model.
Specifically, the second vacation encoding data, the historical air temperature encoding data, the historical trend encoding data, and the historical policy encoding data are input into the initial analysis model. And fitting and calculating the coded data by using a fitting function arranged in the hidden layer of the initial analysis model to obtain the electric quantity component corresponding to each influence factor. And acquiring a first vacation electric quantity component corresponding to the vacation factor.
Step 908, performing target coding on the first vacation electric quantity component to obtain a third vacation coding value, inputting the third vacation coding value into the initial analysis model, and performing iterative training to obtain historical vacation coding data when a finishing condition is met.
Target encoding, also called mean encoding, is an efficient method for representing categorical columns that occupies only one feature space. Each value in the column is replaced by the average target value of its category, which represents the relationship between the categorical variable and the target variable more directly.
Specifically, a first vacation electric quantity component output by a hidden layer of the initial analysis model is encoded by using a target encoding method to obtain a third vacation encoding value, the third vacation encoding value and encoding values corresponding to other influence factors are input into the initial analysis model again, iterative training is carried out, and when a finishing condition is met, historical vacation encoding data are obtained. And the ending condition is that the numerical difference value between the vacation electric quantity component output by the initial analysis model and the vacation electric quantity component output in the previous training is smaller than a preset threshold value. At this point it can be assumed that the encoded value has already stabilized.
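Steps 906 and 908 can be illustrated with the following sketch. It is not the actual training procedure: the model fit is replaced by a caller-supplied function `fit_holiday_component`, which is a hypothetical stand-in for training the initial analysis model and reading the vacation electric quantity component from its hidden layer. The names `target_encode` and `refine_holiday_codes`, the threshold and the round limit are likewise assumptions.

```python
from collections import defaultdict


def target_encode(labels, targets):
    """Mean (target) encoding: replace each label by the average target of that label."""
    sums, counts = defaultdict(float), defaultdict(int)
    for label, target in zip(labels, targets):
        sums[label] += target
        counts[label] += 1
    means = {label: sums[label] / counts[label] for label in sums}
    return [means[label] for label in labels], means


def refine_holiday_codes(labels, initial_codes, fit_holiday_component,
                         threshold=1e-3, max_rounds=50):
    """Alternate model fitting and target encoding until the component stabilizes.

    fit_holiday_component(codes) is a hypothetical callable returning the per-day
    vacation electric quantity component for the given vacation codes.
    """
    codes, previous = list(initial_codes), None
    for _round in range(max_rounds):
        component = fit_holiday_component(codes)
        if previous is not None and max(
            abs(new - old) for new, old in zip(component, previous)
        ) < threshold:
            break  # ending condition: the component changed by less than the threshold
        codes, _means = target_encode(labels, component)
        previous = component
    return codes


# Toy stand-in for the model, used only so that the sketch runs end to end.
labels = ["working_day", "saturday", "working_day"]
print(refine_holiday_codes(labels, [1.0, 0.75, 1.0], lambda c: [0.5 * x + 1.0 for x in c]))
```

Because the encoding target is the vacation component rather than the total electric quantity, the effects of air temperature, trend and policy are excluded from the coded values.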
Taking the preset time sequence from January 1, 2020 to October 8, 2020 as an example, the following is a second vacation coding data table:

| Date | Vacation name | Vacation label | Vacation coding |
|---|---|---|---|
| 2020-01-01 | New Year's Day | Short holiday_1 | -0.287160 |
| 2020-01-02 | Working day | Working day | 0.488444 |
| 2020-01-04 | Saturday | Saturday | 0.026795 |
| 2020-01-05 | Sunday | Sunday | -0.100472 |
| 2020-01-10 | Working day | Spring travel rush_1 | 1.325628 |
| 2020-01-11 | Saturday | Spring travel rush_2 | 1.055234 |
| 2020-02-12 | Working day | Spring travel rush_34 | -1.456229 |
| 2020-02-13 | Working day | Spring travel rush_35 | -1.510380 |
| 2020-04-04 | Qingming Festival | Short holiday_1 | -0.116852 |
| 2020-04-05 | Qingming Festival | Short holiday_2 | -0.116852 |
| 2020-04-06 | Qingming Festival | Short holiday_3 | 0.025990 |
| 2020-10-01 | National Day | Long holiday_1 | -1.343969 |
| 2020-10-02 | National Day | Long holiday_2 | -1.223036 |
| 2020-10-08 | National Day | Long holiday_8 | -0.035673 |
In this embodiment, each date is first encoded by using the preset coding value table to obtain the second vacation coded data; the second vacation coded data is then input into the initial analysis model, and target encoding is performed after the electric quantity affected by factors such as air temperature, trend and policy has been removed. The vacation coded data obtained in this way is more accurate, and provides a data basis for subsequently training the initial analysis model to obtain the electric quantity change factor analysis model.
In one embodiment, as shown in fig. 10, there is provided a method for analyzing a power variation factor, including the steps of:
Firstly, a BP neural network model is established as the initial analysis model. As shown in fig. 11, the model generally consists of three layers: an input layer, a hidden layer and an output layer. The entire model can be represented by the following formula:
$$\hat{y}_t = T_t + H_t + C_t + P_t$$
wherein $\hat{y}_t$ is the electric quantity on day t fitted by the model; $T_t$ is the electric quantity component influenced by the air temperature factor as fitted by the model, hereinafter referred to as the air temperature component; $H_t$ is the electric quantity component influenced by the vacation factor as fitted by the model, hereinafter referred to as the vacation component; $C_t$ is the electric quantity component influenced by the trend factor as fitted by the model, hereinafter referred to as the trend component; and $P_t$ is the electric quantity component influenced by the policy factor as fitted by the model, hereinafter referred to as the policy component.
The optimized objective function is MAPE:
$$MAPE = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\hat{y}_t - y_t}{y_t}\right| \times 100\%$$
wherein t is the t-th day, and n is the total number of days in the training set; $\hat{y}_t$ is the electric quantity on day t fitted by the model; and $y_t$ is the actual electric quantity on day t.
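The additive structure of the model and the MAPE objective can be illustrated as follows. This sketch assumes that the fitted daily electric quantity is the sum of the four components and that MAPE is expressed as a percentage; the function names and the sample values are hypothetical.

```python
def fitted_quantity(temperature, holiday, trend, policy):
    """Fitted daily electric quantity as the sum of the four fitted components."""
    return [t + h + c + p for t, h, c, p in zip(temperature, holiday, trend, policy)]


def mape(actual, fitted):
    """Mean absolute percentage error, in percent, between actual and fitted values."""
    if not actual or len(actual) != len(fitted):
        raise ValueError("actual and fitted must be non-empty and of equal length")
    return 100.0 * sum(abs((f - a) / a) for a, f in zip(actual, fitted)) / len(actual)


# Hypothetical component values for two days.
fitted = fitted_quantity([30.0, 40.0], [10.0, -10.0], [60.0, 150.0], [10.0, 10.0])
print(fitted)                        # [110.0, 190.0]
print(mape([100.0, 200.0], fitted))  # approximately 7.5
```

MAPE is undefined when an actual value is zero, so the sketch presumes strictly positive daily electric quantities.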
Then, an original training data set is obtained, which comprises historical electricity consumption data, historical air temperature data, historical date data and historical policy data in a preset time sequence. The original training data set is encoded to obtain historical air temperature coded data, historical vacation coded data, historical trend coded data and historical policy coded data. Each coded data is normalized to obtain training air temperature characteristic data, training vacation characteristic data, training trend characteristic data and training policy characteristic data. Each training characteristic data is input into the corresponding input layer of the initial analysis model, and the input layer passes it to the hidden layer, where fitting calculation is performed by the fitting functions arranged in the hidden layer to obtain a training air temperature fitted component, a training vacation fitted component, a training trend fitted component and a training policy fitted component; the components are added to obtain the fitted daily electric quantity. The fitted daily electric quantity and the corresponding actual daily electric quantity are input into the objective function. If the calculated objective function value does not meet the preset objective function threshold, the model parameters, namely the parameters of each component fitting function, are updated through repeated iteration using the error back propagation mechanism of the BP neural network combined with a gradient descent optimization algorithm, so as to minimize the objective function. When the obtained objective function value meets the objective function threshold, the output layer of the initial analysis model is removed, and the resulting model is determined as the electric quantity change factor analysis model.
The electric quantity change factor analysis model is shown in fig. 12. The preprocessing data, namely the parameters and tables used for encoding and normalizing the data, are stored in a preprocessing module.
In actual application, the electric quantity related original data in the target time sequence is obtained, which comprises electricity consumption data, air temperature data and policy data related to electricity consumption. The electric quantity related original data is preprocessed by the preprocessing module: it is first encoded, and the obtained coded data is then normalized to obtain key factor characteristic data. The key factor characteristic data comprises air temperature characteristic data, vacation characteristic data, trend characteristic data and policy characteristic data. The key factor characteristic data is input into the electric quantity change factor analysis model to obtain an air temperature fitted electric quantity component, a vacation fitted electric quantity component, a trend fitted electric quantity component and a policy fitted electric quantity component. The influence of each influence factor on the electric quantity change in the target time sequence is determined based on the electric quantity components. Specifically, the sum of the electric quantity components corresponding to each influence factor in the target time sequence is calculated to obtain a first total electric quantity component for each influence factor; the difference between each first total electric quantity component and the second total electric quantity component of the same influence factor in the comparison time sequence is calculated, and the degree of influence of each influence factor on the electric quantity change in the target time sequence is determined according to the difference. The comparison time sequence is the time sequence against which the electric quantity change value of the target time sequence is determined.
It is understood that the second total electric quantity component may be pre-stored, or may be obtained by inputting the key factor characteristic data in the comparison time series into the electric quantity variation factor analysis model.
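The comparison of total electric quantity components can be sketched as below. The dictionary layout (influence factor name mapped to a list of daily components), the function names and the sample numbers are assumptions for illustration; in practice the daily components would be output by the electric quantity change factor analysis model.

```python
def total_components(daily_components):
    """Sum the daily electric quantity components of each influence factor."""
    return {factor: sum(values) for factor, values in daily_components.items()}


def factor_influence(target_components, comparison_components):
    """Difference between first (target) and second (comparison) total components."""
    first_totals = total_components(target_components)
    second_totals = total_components(comparison_components)
    return {factor: first_totals[factor] - second_totals[factor] for factor in first_totals}


# Hypothetical daily components for a two-day target and comparison time sequence.
target = {"temperature": [5.0, 7.0], "holiday": [-2.0, 0.0], "trend": [3.0, 3.0], "policy": [0.0, 0.0]}
comparison = {"temperature": [4.0, 4.0], "holiday": [0.0, 0.0], "trend": [2.0, 2.0], "policy": [-1.0, -1.0]}
print(factor_influence(target, comparison))
# {'temperature': 4.0, 'holiday': -2.0, 'trend': 2.0, 'policy': 2.0}
```

A positive difference indicates that the factor contributed more electric quantity in the target time sequence than in the comparison time sequence, and a negative difference indicates that it contributed less.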
It should be understood that, although the steps in the flowcharts of the above embodiments are displayed sequentially as indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least a part of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides an electric quantity variation factor analysis device for realizing the electric quantity variation factor analysis method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme recorded in the method, so that specific limitations in one or more embodiments of the power variation factor analysis device provided below can be referred to the limitations of the power variation factor analysis method in the above, and details are not repeated herein.
In one embodiment, as shown in fig. 13, there is provided an electricity quantity variation factor analysis apparatus 1300 including: an original data obtaining module 1301, a preprocessing module 1302, an electric quantity component calculating module 1303 and an influence determining module 1304, wherein:
the raw data obtaining module 1301 is configured to obtain raw data related to electric quantity in the target time sequence.
The preprocessing module 1302 is configured to preprocess the raw data related to the electric quantity to obtain key factor characteristic data.
The electric quantity component calculation module 1303 is used for inputting the key factor characteristic data into a pre-trained electric quantity change factor analysis model, and performing fitting calculation on the key factor characteristic data to obtain electric quantity components corresponding to influence factors of all preset dimensions in the key factor characteristic data; the electric quantity change factor analysis model comprises a first characteristic input layer and a first hidden layer, wherein the first hidden layer is provided with a fitting function corresponding to each influence factor and used for calculating and outputting electric quantity components corresponding to each influence factor.
And an influence determining module 1304, configured to determine, based on the electric quantity component corresponding to each influence factor, an influence of each influence factor on electric quantity change in the target time interval.
The electric quantity change factor analysis device obtains electric quantity related original data in a target time sequence, key factor characteristic data influencing electric quantity change are obtained by processing the original data, the key factor characteristic data are input into a pre-trained electric quantity change factor analysis model, fitting calculation is carried out on the key factor characteristic data, and electric quantity components corresponding to influence factors of all preset dimensions in the key factor characteristic data are obtained. When the reason for the electric quantity change in a certain period needs to be analyzed, the influence of each influence factor on the electric quantity change in the target time sequence can be determined based on the electric quantity component corresponding to each influence factor. By considering the influence of a plurality of influence factors on the electric quantity change, the accuracy of the influence of the quantitative influence factors on the electric quantity change value is improved, and the method has higher reference value for subsequent analysis of areas, industrial power utilization states and the like.
In one embodiment, the pre-processing module is further to: coding the original data related to the electric quantity based on a preset coding mapping relation to obtain feature coded data, wherein the feature coded data comprise air temperature coded data, first vacation coded data, trend coded data and policy coded data; and respectively carrying out normalization processing on the feature coded data according to the normalization processing parameters corresponding to the feature coded data to obtain air temperature feature data, vacation feature data, trend feature data and policy feature data.
In one embodiment, the electric quantity variation factor analyzing device further includes: the model training module is used for acquiring an original training data set; preprocessing the original training data set to obtain a training characteristic data set; inputting the training characteristic data set into an initial analysis model for fitting calculation to obtain a fitting electric quantity value, wherein the initial analysis model comprises a second characteristic input layer, a second hidden layer and an output layer; determining whether the current initial analysis model meets the requirements or not based on the fitting electric quantity value and a real electric quantity value in the historical electric quantity data; and if the initial analysis model meets the requirements, determining the initial analysis model without the output layer as an electric quantity variation factor analysis model.
In one embodiment, the model training module is further to: carrying out exponential weighting moving average temperature calculation on the historical temperature data to obtain historical temperature coded data; obtaining historical date data according to the preset time sequence, and coding the historical date data to obtain historical vacation coded data and historical trend coded data; encoding the historical policy data to obtain historical policy encoded data; and respectively carrying out normalization processing on each coded data to obtain training air temperature characteristic data, training vacation characteristic data, training trend characteristic data and training policy characteristic data.
In one embodiment, the model training module is further to: classifying the historical date data according to the holiday types corresponding to the dates, and setting holiday labels for the dates; according to the holiday label corresponding to each date, determining a second holiday code value corresponding to each date from a preset code value table, and collecting to obtain second holiday code data; inputting the second vacation coded data, the historical temperature coded data, the historical trend coded data and the historical policy coded data into the initial analysis model, and acquiring a first vacation electric quantity component output by a hidden layer of the initial analysis model; and performing target coding on the first vacation electric quantity component to obtain a third vacation coding value, inputting the third vacation coding value into the initial analysis model, performing iterative training, and obtaining historical vacation coding data when a finishing condition is met.
All or part of each module in the electric quantity variation factor analysis device can be realized by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in the form of hardware, or can be stored in a memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 14. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operating system and the computer program to run on the non-volatile storage medium. The database of the computer device is used for storing data such as key factor characteristics, fitting functions, preprocessing and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a power variation factor analysis method.
Those skilled in the art will appreciate that the architecture shown in fig. 14 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the electric quantity variation factor analysis method of the above embodiments when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the steps of the electric quantity variation factor analysis method of the above embodiments.
In one embodiment, a computer program product is provided, comprising a computer program that when executed by a processor implements the steps of the electric quantity variation factor analysis method of the above embodiments.
It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, displayed data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high-density embedded nonvolatile Memory, resistive Random Access Memory (ReRAM), Magnetic Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene Memory, and the like. Volatile Memory can include Random Access Memory (RAM), external cache Memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), for example. The databases referred to in various embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a block chain based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, quantum computing based data processing logic devices, etc., without limitation.
All possible combinations of the technical features in the above embodiments may not be described for the sake of brevity, but should be considered as being within the scope of the present disclosure as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and their description is relatively specific and detailed, but should not be construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present application should be subject to the appended claims.