CN119090096A - Energy-saving optimization control method and system based on load curve - Google Patents

Info

Publication number
CN119090096A
Authority
CN
China
Prior art keywords
data
energy consumption
feature
preset
load
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411576832.8A
Other languages
Chinese (zh)
Inventor
杨宏强
吴伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ruizhitong Technology Co ltd
Original Assignee
Shenzhen Ruizhitong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ruizhitong Technology Co ltd
Priority to CN202411576832.8A
Publication of CN119090096A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067Enterprise or organisation modelling
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/08Construction

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Air Conditioning Control Device (AREA)

Abstract

The invention provides an energy-saving optimization control method and system based on a load curve, applied to the field of energy consumption data processing. By collecting multi-source data from different data sources, the invention forms a more comprehensive view of energy consumption; integrating the data provides richer context, allows the various factors influencing energy consumption to be identified, and improves the accuracy of energy-saving optimization. Extracting characteristic parameters enables the model to analyze multiple dimensions; these features reveal how different factors influence energy consumption, so that energy consumption can be better understood and predicted. A baseline model is established on the effective feature set, and its mean square error and coefficient of determination are calculated to quantify model performance and provide a reference for energy consumption prediction; by comparison with actual data, the model parameters can be continuously optimized and the prediction accuracy improved.

Description

Energy-saving optimal control method and system based on load curve
Technical Field
The invention relates to the field of energy consumption data processing, in particular to an energy-saving optimization control method and system based on a load curve.
Background
With the development of technology, more and more intelligent devices are installed in buildings. Most common meters (electricity meters, water meters, gas meters, energy meters, PM2.5 meters and the like) do not operate independently, and data management software must be configured to manage and analyze their data.
Because data from different sources (such as power consumption, weather data and market demand) differ in format and structure, unified standards and frameworks must be established to integrate them. Multi-source data may also contain missing values, errors or inconsistencies that affect the reliability of the fused data, so an effective data cleaning and verification mechanism is required.
Disclosure of Invention
The invention aims to solve the problem of how to effectively integrate data from different sources so as to optimize energy consumption and load prediction, and provides an energy-saving optimization control method and system based on a load curve.
The invention adopts the following technical means for solving the technical problems:
The invention provides an energy-saving optimization control method based on a load curve, which comprises the following steps:
based on a preset data unit type of a building terminal, unifying a data format of energy consumption data into the data unit type, wherein the data unit type specifically comprises temperature, electric power and a time stamp;
Judging whether the energy consumption data can be presented to the building terminal in real time;
If not, acquiring corresponding various multi-source data from different data sources through a sensor preset by the building terminal, identifying a time sequence of the multi-source data, integrating the multi-source data according to the time sequence by applying preset data fusion, obtaining fusion data, extracting corresponding characteristic parameters from the fusion data, and constructing an effective characteristic set according to the characteristic parameters, wherein the time sequence specifically comprises trend change and periodical change, and the characteristic parameters specifically comprise weather influence indexes, economic activity levels and historical load data;
judging whether the effective feature set needs preset data cleaning or not;
If not, acquiring the linear relation between each feature in the effective feature set and the load variable by adopting a preset correlation coefficient, establishing a corresponding baseline model on the effective feature set, calculating the mean square error and the coefficient of determination of the baseline model, constructing a load prediction result of the building terminal through the baseline model, and comparing the load prediction result with an actual load value to generate a load line graph.
Further, before the step of collecting various corresponding multi-source data from different data sources through the sensor preset by the building terminal, the method further comprises:
dividing an acquisition area covered by the sensor in a building through the building terminal based on a preset working range of the sensor;
Judging whether the acquisition area can detect a preset overlapping monitoring edge or not;
If so, constructing an overlapping time window according to the overlapping monitoring edge, identifying acquisition measurement values of a plurality of sensors in the overlapping time window, generating data features to be fused according to the acquisition measurement values, carrying out weight distribution on the acquisition measurement values by applying a preset Kalman filter, and calculating a weighted average value of the data features, wherein the data features specifically comprise temperature, humidity and energy consumption.
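The weighting step can be illustrated with a minimal sketch. It assumes that each sensor in the overlap region reports a measurement together with a known noise variance, and it uses inverse-variance weighting, which is what a Kalman filter update reduces to for a quantity that does not change within the window. The function name and the variances are hypothetical and are not taken from the invention.

```python
def fuse_measurements(readings):
    """Fuse overlapping sensor readings by inverse-variance weighting.

    readings: list of (value, variance) pairs; each variance is assumed
    known and greater than zero.
    Returns (fused_value, fused_variance).
    """
    weights = [1.0 / variance for _, variance in readings]
    total = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, readings)) / total
    return fused_value, 1.0 / total


# Two hypothetical temperature sensors covering the same overlap region.
fused, fused_var = fuse_measurements([(21.0, 0.5), (22.0, 1.0)])
print(fused, fused_var)
```

The less noisy sensor receives the larger weight, so the fused value lies closer to its reading.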
Further, the step of constructing an effective feature set according to the feature parameters further includes:
Constructing seasonal features of the energy consumption of the building terminal based on preset dummy variables, and calculating ratio features and difference features between the seasonal features, wherein the dummy variables specifically comprise spring, summer, autumn and winter, the ratio features specifically comprise the ratio of energy consumption to temperature, and the difference features specifically comprise the difference between current load and past load;
judging whether the ratio feature and the difference feature have missing values or not;
If not, carrying out standardization processing on the ratio characteristic and the difference characteristic, ensuring that the ratio characteristic and the difference characteristic are always on the same scale, combining the ratio characteristic and the difference characteristic with other characteristics to form a composite characteristic, and capturing energy consumption influence corresponding to seasonal change according to the composite characteristic.
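A minimal sketch of these feature-construction steps follows. It assumes one-hot dummy variables for the four seasons, z-score standardization, and non-zero temperatures; the record values and helper names are hypothetical.

```python
from statistics import mean, pstdev

SEASONS = ("spring", "summer", "autumn", "winter")


def season_dummies(season):
    """One-hot (dummy) encoding of a season name."""
    return [1 if season == s else 0 for s in SEASONS]


def standardize(values):
    """Z-score standardization; returns zeros when all values are equal."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


# Hypothetical hourly records: (season, temperature in deg C, load in kWh).
records = [("autumn", 10, 200), ("autumn", 12, 220), ("autumn", 14, 240)]

# Ratio feature: energy consumption divided by temperature (temperature != 0 assumed).
ratio = [load / temp for _, temp, load in records]
# Difference feature: current load minus the previous load.
diff = [cur[2] - prev[2] for prev, cur in zip(records, records[1:])]

ratio_z = standardize(ratio)
diff_z = standardize(diff)
dummies = [season_dummies(season) for season, _, _ in records]
```

After standardization both features are on the same scale and can be combined with the dummy variables into a composite feature vector.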
Further, in the step of establishing a corresponding baseline model on the effective feature set, the method further includes:
Based on a graph of energy consumption preset by the building terminal, plotting the energy consumption predicted value on the X-axis of the graph and the energy consumption residual on the Y-axis, and combining them to obtain an energy consumption scatter plot;
Judging whether the residuals on the scatter plot are randomly distributed;
If yes, inputting the scatter plot to the baseline model, performing cross-validation through the baseline model, and obtaining fluctuation information of the baseline model on different data subsets.
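The residual check and the cross-validation can be sketched as follows. This is an illustration under assumptions: the baseline model is taken to be a one-feature least-squares line, cross-validation is done leave-one-out, and the sample temperatures and loads are illustrative values. Plotting is omitted; the `predicted` and `residuals` lists hold the X and Y values of the residual scatter plot.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_errors(xs, ys):
    """Squared prediction error of each point when it is held out of the fit."""
    errors = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return errors


temps = [10, 12, 14, 15, 16, 18]            # illustrative temperatures (deg C)
loads = [200, 220, 240, 260, 280, 300]      # illustrative loads (kWh)

slope, intercept = fit_line(temps, loads)
predicted = [slope * x + intercept for x in temps]
residuals = [y - p for y, p in zip(loads, predicted)]
cv_errors = leave_one_out_errors(temps, loads)
```

A large spread in `cv_errors` across held-out points would indicate that the model fluctuates between data subsets.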
Further, in the step of determining whether the energy consumption data can be presented to the building terminal in real time, the method further includes:
Detecting network delay information when the energy consumption data are transmitted based on a network bandwidth preset by the building terminal;
judging whether the network delay information exceeds a preset delay period;
if yes, adding a time stamp label in a data acquisition link and a data display link of the energy consumption data, and calculating the overall delay of the energy consumption data according to the time stamp label.
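As a minimal sketch of the delay calculation, assume one time stamp is attached when a reading is acquired and another when it is displayed; the threshold value and all names are hypothetical.

```python
from datetime import datetime, timedelta

MAX_DELAY = timedelta(seconds=5)  # hypothetical preset delay period


def overall_delay(acquired_at, displayed_at):
    """Overall delay between the acquisition and display time stamps."""
    return displayed_at - acquired_at


acquired = datetime(2024, 10, 1, 8, 0, 0)   # stamped in the data acquisition link
displayed = datetime(2024, 10, 1, 8, 0, 7)  # stamped in the data display link

delay = overall_delay(acquired, displayed)
exceeds = delay > MAX_DELAY
print(delay.total_seconds(), exceeds)
```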
Further, in the step of determining whether the valid feature set needs preset data cleaning, the method further includes:
calculating the proportion of the feature missing values based on the pre-statistical features of the effective feature set, and obtaining the distribution of the feature missing values according to the proportion to generate a corresponding missing value heat map;
Judging whether the missing value heat map is matched with a preset random missing;
If not, identifying the association information of the feature missing value and the pre-statistics feature, and acquiring the corresponding missing source according to the association information.
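The proportion of missing values per feature can be computed as in the sketch below, which assumes that missing values are represented as `None`; the rows and feature names are hypothetical. The resulting ratios are the quantities a missing-value heat map would display.

```python
def missing_ratio(rows, features):
    """Fraction of missing (None) values for each feature."""
    n = len(rows)
    return {f: sum(1 for row in rows if row.get(f) is None) / n for f in features}


# Hypothetical feature rows; None marks a missing value.
rows = [
    {"temperature": 10, "humidity": 60, "load": 200},
    {"temperature": None, "humidity": 55, "load": 220},
    {"temperature": 14, "humidity": None, "load": 240},
    {"temperature": None, "humidity": 45, "load": 260},
]

ratios = missing_ratio(rows, ["temperature", "humidity", "load"])
print(ratios)
```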
Further, the step of unifying the data format of the energy consumption data into the data unit type based on the data unit type preset by the building terminal further includes:
detecting an original unit in the energy consumption data, and identifying different units to be converted based on the original unit;
judging whether the different units can be uniformly converted into the data unit type;
If not, marking inconsistent items of different units, converting the corresponding data of the different units row by row according to the inconsistent items, and comparing the data distribution and the statistical characteristics before and after conversion.
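A minimal sketch of the row-by-row conversion follows. The conversion table, unit labels and row layout are assumptions made for illustration; rows whose unit has no known conversion are marked as inconsistent items instead of being converted.

```python
# Hypothetical conversion table: (source unit, target unit) -> function.
CONVERTERS = {
    ("degF", "degC"): lambda v: (v - 32.0) * 5.0 / 9.0,
    ("W", "kW"): lambda v: v / 1000.0,
}


def convert_rows(rows, target_units):
    """Convert each row to the target unit of its quantity.

    Returns (converted_rows, inconsistent_rows); a row is inconsistent
    when no conversion to the target unit is known.
    """
    converted, inconsistent = [], []
    for row in rows:
        target = target_units[row["quantity"]]
        if row["unit"] == target:
            converted.append(dict(row))
        elif (row["unit"], target) in CONVERTERS:
            value = CONVERTERS[(row["unit"], target)](row["value"])
            converted.append({**row, "value": value, "unit": target})
        else:
            inconsistent.append(row)
    return converted, inconsistent


rows = [
    {"quantity": "temperature", "value": 50.0, "unit": "degF"},
    {"quantity": "power", "value": 1500.0, "unit": "W"},
    {"quantity": "power", "value": 2.0, "unit": "hp"},
]
converted, inconsistent = convert_rows(rows, {"temperature": "degC", "power": "kW"})
```

Comparing summary statistics of the values before and after conversion, as the step describes, would then confirm that the conversion changed only the scale.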
The invention also provides an energy-saving optimization control system based on a load curve, comprising:
The unified module is used for unifying the data format of the energy consumption data into the data unit type based on the data unit type preset by the building terminal, wherein the data unit type specifically comprises temperature, electric power and a time stamp;
the judging module is used for judging whether the energy consumption data can be presented to the building terminal in real time;
The execution module is used for acquiring corresponding various multi-source data from different data sources through a sensor preset by the building terminal if not, identifying a time sequence of the multi-source data, integrating the multi-source data according to the time sequence by applying preset data fusion, obtaining fusion data, extracting corresponding characteristic parameters from the fusion data, and constructing an effective characteristic set according to the characteristic parameters, wherein the time sequence specifically comprises trend change and periodical change, and the characteristic parameters specifically comprise weather effect indexes, economic activity levels and historical load data;
the second judging module is used for judging whether the effective feature set needs preset data cleaning or not;
The second execution module is used for acquiring, if not, the linear relation between each feature in the effective feature set and the load variable by adopting a preset correlation coefficient, establishing a corresponding baseline model on the effective feature set, calculating the mean square error and the coefficient of determination of the baseline model, constructing a load prediction result of the building terminal through the baseline model, and comparing the load prediction result with an actual load value to generate a load line graph.
Further, the system further comprises:
the dividing module is used for dividing an acquisition area covered by the sensor in a building through the building terminal based on a preset working range of the sensor;
the third judging module is used for judging whether the acquisition area can detect a preset overlapping monitoring edge;
The third execution module is used for constructing, if so, an overlapping time window according to the overlapping monitoring edge, identifying the acquisition measurement values of the plurality of sensors in the overlapping time window, generating the data features to be fused according to the acquisition measurement values, carrying out weight distribution on the acquisition measurement values by applying a preset Kalman filter, and calculating the weighted average value of the data features, wherein the data features specifically comprise temperature, humidity and energy consumption.
Further, the execution module further includes:
The calculating unit is used for constructing seasonal features of the energy consumption of the building terminal based on preset dummy variables, and calculating ratio features and difference features between the seasonal features, wherein the dummy variables specifically comprise spring, summer, autumn and winter, the ratio features specifically comprise the ratio of energy consumption to temperature, and the difference features specifically comprise the difference between current load and past load;
a judging unit configured to judge whether or not there is a missing value of the ratio feature and the difference feature;
And the execution unit is used for carrying out standardization processing on the ratio characteristic and the difference characteristic if not, ensuring that the ratio characteristic and the difference characteristic are always on the same scale, combining the ratio characteristic and the difference characteristic with other characteristics to form a composite characteristic, and capturing the energy consumption influence corresponding to seasonal change according to the composite characteristic.
The invention provides an energy-saving optimization control method and system based on a load curve, and the energy-saving optimization control method and system have the following beneficial effects:
By collecting multi-source data from different data sources, the invention forms a more comprehensive view of energy consumption; integrating the data provides richer context, allows the various factors influencing energy consumption to be identified, and improves the accuracy of energy-saving optimization. Extracting characteristic parameters enables the model to analyze multiple dimensions; these features reveal how different factors influence energy consumption, so that energy consumption can be better understood and predicted. A baseline model is established on the effective feature set, and its mean square error and coefficient of determination are calculated to quantify model performance and provide a reference for energy consumption prediction; by comparison with actual data, the model parameters can be continuously optimized and the prediction accuracy improved.
Drawings
FIG. 1 is a schematic flow chart of one embodiment of an energy-saving optimization control method based on a load curve;
FIG. 2 is a block diagram illustrating an embodiment of an energy-efficient optimization control system based on a load curve according to the present invention.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present invention. The realization of the objects, the functional features and the advantages of the present invention are further described with reference to the embodiments and the accompanying drawings.
The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, an energy-saving optimization control method based on a load curve according to an embodiment of the invention includes:
S1, unifying the data format of energy consumption data into a data unit type based on the preset data unit type of a building terminal, wherein the data unit type specifically comprises temperature, electric power and a time stamp;
S2, judging whether the energy consumption data can be presented to the building terminal in real time;
S3, if not, acquiring corresponding various multi-source data from different data sources through a preset sensor of the building terminal, identifying a time sequence of the multi-source data, integrating the multi-source data according to the time sequence by applying preset data fusion, obtaining fusion data, extracting corresponding characteristic parameters from the fusion data, and constructing an effective characteristic set according to the characteristic parameters, wherein the time sequence specifically comprises trend change and periodical change, and the characteristic parameters specifically comprise weather effect indexes, economic activity levels and historical load data;
S4, judging whether the effective feature set needs preset data cleaning or not;
S5, if not, acquiring the linear relation between each feature in the effective feature set and the load variable by adopting a preset correlation coefficient, establishing a corresponding baseline model on the effective feature set, calculating the mean square error and the coefficient of determination of the baseline model, constructing a load prediction result of the building terminal through the baseline model, and comparing the load prediction result with an actual load value to generate a load line graph.
In this embodiment, based on the data unit type preset by the building terminal, the system unifies the data format of the energy consumption data collected in the building into that data unit type, which specifically comprises temperature, electric power and a time stamp. The system then judges whether the energy consumption data can be presented to the building terminal in real time, so as to execute the corresponding step.
For example, when the system determines that the collected energy consumption data can be presented to the building terminal in real time, it considers the data quality satisfactory and the data able to reflect the actual energy consumption. The system can then display the collected energy consumption data on the building terminal in real time, for example as an energy consumption graph, pie chart or histogram, so that a user can quickly obtain information. An alarm threshold is set according to historical data and requirements; when an energy consumption value exceeds the normal range, the system automatically raises an alarm so that measures can be taken in time. Real-time data are recorded in a database for subsequent analysis and query, which ensures the integrity and traceability of the data. Dynamic data analysis is also carried out to generate a trend chart or a prediction model, helping a manager identify energy consumption patterns and potential anomalies.
For example, when the system determines that the collected energy consumption data cannot be presented to the building terminal in real time, it considers that the data cannot reflect the actual energy consumption. The system then collects the corresponding multi-source data from different energy consumption data sources through the sensor preset by the building terminal and identifies the time series of the multi-source data, which specifically comprises trend changes and periodic changes. It integrates the multi-source data by applying the preset data fusion according to the time series to obtain fused data, and extracts the corresponding characteristic parameters from the fused data, which specifically comprise weather influence indexes, economic activity levels and historical load data. By identifying and analyzing the trend changes and periodic changes in the time series, the system can capture the dynamic characteristics of the energy consumption data. This not only helps in understanding past energy consumption patterns but also provides an important basis for predicting future energy consumption; by analyzing the time series in depth, a manager can identify seasonal fluctuations and similar patterns. During fusion, the system integrates data from different sources into a unified data set through the preset data fusion algorithm, which helps eliminate noise and redundancy in the data and improves data quality, so that the manager obtains more accurate information to support energy consumption analysis and decisions. An effective feature set is then constructed from the extracted characteristic parameters, laying a foundation for the subsequent prediction model; the effective feature set captures the key driving factors of building energy consumption and improves prediction accuracy.
The system then judges whether the effective feature set needs the preset data cleaning, so as to execute the corresponding step. For example, when the system determines that the effective feature set needs the preset data cleaning, it considers that the data in the effective feature set may contain errors, abnormal values or noise that could affect the accuracy of subsequent analysis and model construction. The system determines the goals of cleaning, such as removing duplicate data, filling in missing values and handling abnormal values, and ensures that all cleaning steps are consistent with the requirements of subsequent analysis. It checks for missing values in the feature set and determines how to handle them, and ensures that all features are consistent in unit, format and type; for example, temperature may need to be unified to degrees Celsius or degrees Fahrenheit, and power units must be consistent. By identifying and removing duplicate data, the system ensures the uniqueness of the feature set and reduces redundancy in the analysis.
For example, when the system determines that the effective feature set does not need the preset data cleaning, it considers the data in the effective feature set to be free of errors, abnormal values and noise. By using the effective feature set directly, the system saves the time and resources required for data cleaning and speeds up model building, so that it can respond quickly to changes and provide load prediction results. Through the preset correlation coefficient, the system analyzes the linear relationship between each feature in the effective feature set and the load variable; this analysis helps the manager understand the key factors affecting building energy consumption and guides subsequent management decisions. A baseline model built on linear relationships generally has good interpretability: the manager can clearly see how strongly each feature influences the load prediction and make targeted adjustments. The load prediction results also provide feedback for refining the energy consumption control strategy, thereby optimizing energy consumption.
It should be noted that the time series of the multi-source data is identified and the preset data fusion is applied according to the time series to integrate the multi-source data and obtain fused data. A specific example is as follows:
Assume energy consumption prediction in a large commercial building, with data sources including weather, economic activity and historical energy consumption data. Taking October 1, 2024 as an example, the following details how multi-source data fusion is performed through the time series;
Collecting data: assume that the following information is collected from different data sources:
Weather data (hourly):
08:00 - temperature 10 °C, humidity 60%;
09:00 - temperature 12 °C, humidity 55%;
10:00 - temperature 14 °C, humidity 50%;
Economic activity data (hourly):
08:00 - passenger flow 1000 persons, commercial activity index 0.6;
09:00 - passenger flow 1200 persons, commercial activity index 0.7;
10:00 - passenger flow 1500 persons, commercial activity index 0.8;
Historical load data (hourly):
08:00 - energy consumption 200 kWh;
09:00 - energy consumption 220 kWh;
10:00 - energy consumption 240 kWh;
Identifying the time series: align the time points of each data source to ensure that all data fall within the same time frame; for example, the time stamps of all data are 08:00, 09:00 and 10:00 on October 1, 2024;
Applying data fusion: at this stage all data are integrated into one unified data set. The specific steps are as follows:
Aligning the time stamps: all data are already at the same time points, so no adjustment is needed;
Creating a fused data table: the features of the data sources are integrated into one table, in which each row represents the data at one time point:
Time stamp          Temperature (°C)   Humidity (%)   Passenger flow (persons)   Commercial activity index   Energy consumption (kWh)
2024-10-01 08:00    10                 60             1000                       0.6                         200
2024-10-01 09:00    12                 55             1200                       0.7                         220
2024-10-01 10:00    14                 50             1500                       0.8                         240
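The table above can be produced by joining the three sources on their time stamps. The following is a minimal sketch that assumes each source is held as a mapping from time stamp to feature values and that only time stamps present in every source are kept; the function and key names are hypothetical.

```python
def fuse_by_timestamp(*sources):
    """Join several {timestamp: {feature: value}} mappings on shared time stamps."""
    common = set(sources[0])
    for source in sources[1:]:
        common &= set(source)
    table = []
    for ts in sorted(common):
        row = {"timestamp": ts}
        for source in sources:
            row.update(source[ts])
        table.append(row)
    return table


weather = {
    "2024-10-01 08:00": {"temperature_c": 10, "humidity_pct": 60},
    "2024-10-01 09:00": {"temperature_c": 12, "humidity_pct": 55},
    "2024-10-01 10:00": {"temperature_c": 14, "humidity_pct": 50},
}
activity = {
    "2024-10-01 08:00": {"passenger_flow": 1000, "commercial_index": 0.6},
    "2024-10-01 09:00": {"passenger_flow": 1200, "commercial_index": 0.7},
    "2024-10-01 10:00": {"passenger_flow": 1500, "commercial_index": 0.8},
}
load = {
    "2024-10-01 08:00": {"energy_kwh": 200},
    "2024-10-01 09:00": {"energy_kwh": 220},
    "2024-10-01 10:00": {"energy_kwh": 240},
}

fused = fuse_by_timestamp(weather, activity, load)
for row in fused:
    print(row)
```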
Weighted averaging: assuming that temperature affects energy consumption more than humidity and the economic activity indicators do, weights may be set for the different features, for example:
Temperature: 0.5;
Humidity: 0.2;
Passenger flow: 0.2;
Commercial activity index: 0.1;
From these weights, a weighted composite feature can be calculated, for example:
Composite index = 0.5 × temperature + 0.2 × humidity + 0.2 × (passenger flow / 1000) + 0.1 × commercial activity index
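As a worked instance of this formula, the sketch below evaluates the composite index for the 08:00 row of the fused table. It assumes that humidity is entered in percentage points (60 for 60%) and temperature in °C, exactly as they appear in the table; the function and key names are hypothetical.

```python
WEIGHTS = {
    "temperature": 0.5,
    "humidity": 0.2,
    "passenger_flow": 0.2,
    "commercial_index": 0.1,
}


def composite_index(temperature, humidity, passenger_flow, commercial_index):
    """Weighted composite feature; passenger flow is scaled by 1000 as in the formula."""
    return (
        WEIGHTS["temperature"] * temperature
        + WEIGHTS["humidity"] * humidity
        + WEIGHTS["passenger_flow"] * passenger_flow / 1000
        + WEIGHTS["commercial_index"] * commercial_index
    )


# 08:00 row: 0.5*10 + 0.2*60 + 0.2*1.0 + 0.1*0.6
index_0800 = composite_index(10, 60, 1000, 0.6)
print(round(index_0800, 2))
```

Because the inputs are on different scales, humidity contributes the largest term here despite its smaller weight; standardizing the features first would make the weights directly comparable.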
Obtaining the fused data: through the above steps the system obtains the fused data, and the fused data set supports the subsequent load prediction analysis;
Data analysis and application, feature extraction: relevant features, including temperature, passenger flow and the commercial activity index, are extracted from the fused data and used to establish a load prediction model;
Modeling and prediction: a prediction model, such as a linear regression or a decision tree, is trained on the fused data set to predict future energy consumption, for example predicting the energy consumption of the next hour from the fused data of the past 3 hours;
Visualizing the result: a load line graph is generated, and the actual energy consumption is compared with the energy consumption predicted by the model to evaluate the model;
In summary, the system fuses the multi-source data through the above steps to form a complete basis for analysis. This data fusion method improves the efficiency with which the data are used and supports more accurate and reliable energy consumption prediction.
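The comparison between actual and predicted energy consumption can be quantified with the mean square error and the coefficient of determination. The sketch below uses the three actual values from the example; the predicted values are hypothetical and stand in for the output of whatever model is trained.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [200, 220, 240]       # energy consumption (kWh) from the example
predicted = [205, 218, 238]    # hypothetical model output (kWh)

mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(mse, r2)
```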
A preset correlation coefficient is then adopted to obtain the linear relationship between each feature in the effective feature set and the load variable. A specific example is as follows:
Assume that the following energy consumption related data are collected in a commercial building to predict its power consumption. The collected features include:
Temperature (°C);
Passenger flow (persons);
Commercial activity index (dimensionless);
Humidity (%);
Energy consumption (kWh);
Data set: the following data are taken as the input data set:
Time stamp          Temperature (°C)   Passenger flow (persons)   Commercial activity index   Humidity (%)   Energy consumption (kWh)
2024-10-01 08:00    10                 1000                       0.6                         60             200
2024-10-01 09:00    12                 1200                       0.7                         55             220
2024-10-01 10:00    14                 1500                       0.8                         50             240
2024-10-01 11:00    15                 1600                       0.85                        45             260
2024-10-01 12:00    16                 1700                       0.9                         40             280
2024-10-01 13:00    18                 1800                       0.95                        35             300
Calculating the correlation coefficients: to calculate the correlation coefficients between these features and the energy consumption, the Pearson correlation coefficient can be used. The detailed steps are as follows:
Data preparation: first ensure that all data are in numeric format. The following is an example of computing the correlation coefficients in Python:
import pandas as pd
# creation data frame
data = {
'Temperature' [10, 12, 14, 15, 16, 18],
'Passenger flow volume' [1000, 1200, 1500, 1600, 1700, 1800],
'Commercial activity index': 0.6, 0.7, 0.8, 0.85, 0.9, 0.95,
'Humidity' [60, 55, 50, 45, 40, 35],
'Energy consumption' [200, 220, 240, 260, 280, 300]
}
df = pd.DataFrame(data)
Calculation of correlation coefficient #
correlation_matrix = df.corr()
print(correlation_matrix)
The following correlation coefficient matrix is assumed for illustration (the values are illustrative rather than the exact output of the code above):
Feature | Temperature | Passenger flow volume | Commercial activity index | Humidity | Energy consumption
Temperature | 1.00 | 0.98 | 0.95 | -0.97 | 0.97
Passenger flow volume | 0.98 | 1.00 | 0.96 | -0.95 | 0.98
Commercial activity index | 0.95 | 0.96 | 1.00 | -0.94 | 0.96
Humidity | -0.97 | -0.95 | -0.94 | 1.00 | -0.96
Energy consumption | 0.97 | 0.98 | 0.96 | -0.96 | 1.00
Analyzing the correlation coefficients: the correlation coefficient between temperature and energy consumption is 0.97, which shows a very strong positive correlation; in this dataset energy consumption increases markedly as the air temperature rises;
the correlation coefficient between passenger flow volume and energy consumption is 0.98, which indicates that an increase in passenger flow is closely related to an increase in energy consumption;
the correlation coefficient between humidity and energy consumption is -0.96, which shows a strong negative correlation; energy consumption increases as humidity decreases;
Feature selection: the calculated correlation coefficients can be used to decide which features to retain in the model. Features that are highly correlated with energy consumption, such as temperature, passenger flow volume and commercial activity index, can be selected, while humidity may be less important in the model: although its absolute correlation is also high, it is strongly correlated with temperature (-0.97) and may therefore add little independent information;
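The feature-selection step can be sketched without any third-party library by computing Pearson coefficients directly and keeping the features whose absolute correlation with energy consumption reaches a cut-off. The threshold value `THRESHOLD = 0.9` and the helper name `pearson` are assumptions for illustration; the coefficients computed from the six sample rows will also differ from the assumed matrix above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


features = {
    "Temperature": [10, 12, 14, 15, 16, 18],
    "Passenger flow volume": [1000, 1200, 1500, 1600, 1700, 1800],
    "Commercial activity index": [0.6, 0.7, 0.8, 0.85, 0.9, 0.95],
    "Humidity": [60, 55, 50, 45, 40, 35],
}
energy = [200, 220, 240, 260, 280, 300]

THRESHOLD = 0.9  # hypothetical cut-off on |r|, not taken from the source
scores = {name: pearson(values, energy) for name, values in features.items()}
ranked = sorted(scores.items(), key=lambda item: -abs(item[1]))
selected = [name for name, r in ranked if abs(r) >= THRESHOLD]

for name, r in ranked:
    print(f"{name}: r = {r:.3f}")
print("selected:", selected)
```

Ranking by absolute value treats strong negative correlations (such as humidity) the same as strong positive ones; whether a feature is then dropped for being redundant with another feature is a separate design decision.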
Establishing a baseline model: a linear regression model is established using the features that are strongly related to energy consumption, which can comprise the following steps:
Model training: temperature, passenger flow volume and commercial activity index are taken as input features, energy consumption is taken as the target variable, and a linear regression model is trained;
from sklearn.linear_model import LinearRegression
# Split input features and target variable
X = df[['Temperature', 'Passenger flow volume', 'Commercial activity index']]
y = df['Energy consumption']
# Create and train the model
model = LinearRegression()
model.fit(X, y)
Model evaluation: the mean square error (MSE) and the coefficient of determination (R2) of the model are calculated to evaluate its performance;
Result visualization: the actual energy consumption is compared with the model's predicted values in a scatter diagram and a load line graph, so that the accuracy of the model can be assessed visually;
In summary, the above shows in detail how to calculate the correlation coefficient between each feature in the effective feature set and the energy consumption variable, and how to perform feature selection and establish a baseline model based on the results.
A corresponding baseline model is established on the effective feature set, the mean square error and the coefficient of determination of the baseline model are calculated, and the load prediction result of the building terminal is constructed through the baseline model. A specific example is as follows:
Assume a commercial building energy consumption dataset whose collected features include temperature, passenger flow volume and commercial activity index, which are to be used to predict the building's energy consumption (in kWh). The dataset is as follows:
Timestamp | Temperature (°C) | Passenger flow volume (persons) | Commercial activity index | Energy consumption (kWh)
2024-10-01 08:00 | 10 | 1000 | 0.6 | 200
2024-10-01 09:00 | 12 | 1200 | 0.7 | 220
2024-10-01 10:00 | 14 | 1500 | 0.8 | 240
2024-10-01 11:00 | 15 | 1600 | 0.85 | 260
2024-10-01 12:00 | 16 | 1700 | 0.9 | 280
2024-10-01 13:00 | 18 | 1800 | 0.95 | 300
Establishing a baseline model: a linear regression model is used to predict energy consumption;
First, the necessary libraries are imported and the data are prepared:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
# Create the data frame
data = {
    'Timestamp': ['2024-10-01 08:00', '2024-10-01 09:00', '2024-10-01 10:00',
                  '2024-10-01 11:00', '2024-10-01 12:00', '2024-10-01 13:00'],
    'Temperature': [10, 12, 14, 15, 16, 18],
    'Passenger flow volume': [1000, 1200, 1500, 1600, 1700, 1800],
    'Commercial activity index': [0.6, 0.7, 0.8, 0.85, 0.9, 0.95],
    'Energy consumption': [200, 220, 240, 260, 280, 300]
}
df = pd.DataFrame(data)
df['Timestamp'] = pd.to_datetime(df['Timestamp'])  # convert to date-time format
Then the features (X) and the target variable (y) are separated:
X = df[['Temperature', 'Passenger flow volume', 'Commercial activity index']]
y = df['Energy consumption']
A model is then created and trained, using a linear regression model:
model = LinearRegression()
model.fit(X, y)
Predicting the energy consumption: the trained model can be used for energy consumption prediction:
predictions = model.predict(X)
The mean square error (MSE) and the coefficient of determination (R2) are calculated from the prediction result:
mse = mean_squared_error(y, predictions)
r2 = r2_score(y, predictions)
print(f"Mean Square Error (MSE): {mse}")  # output the mean square error
print(f"Coefficient of determination (R2): {r2}")  # output the coefficient of determination
Example results: assume the calculation results are as follows:
Mean square error (MSE): 25.0;
Coefficient of determination (R2): 0.98;
The mean square error (MSE) of 25.0 is the average squared error between the predicted values and the actual values; the smaller the value, the higher the prediction accuracy of the model;
the coefficient of determination (R2) of 0.98 indicates that the model explains 98% of the variation in energy consumption and fits the data very well;
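The two assumed metrics can be cross-checked against each other. When both are evaluated on the same data, R2 = 1 - MSE / Var(y), where Var(y) is the population variance of the actual values. The sketch below applies this identity to the six actual energy values of the example together with the assumed MSE of 25.0; the variable names are illustrative.

```python
# Actual energy consumption values from the example dataset (kWh)
y_actual = [200, 220, 240, 260, 280, 300]
assumed_mse = 25.0  # the assumed example result, not a computed one

mean_y = sum(y_actual) / len(y_actual)
variance_y = sum((v - mean_y) ** 2 for v in y_actual) / len(y_actual)

# R2 = 1 - SS_res / SS_tot = 1 - MSE / Var(y) on the same data
implied_r2 = 1 - assumed_mse / variance_y

print(round(variance_y, 2))  # 1166.67
print(round(implied_r2, 2))  # 0.98
```

The implied value agrees with the assumed R2 of 0.98, so the two example figures are mutually consistent.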
Finally, the load prediction result is generated: the prediction result is compared with the actual energy consumption values, and a load prediction line graph is generated:
plt.figure(figsize=(10, 5))
plt.plot(df['Timestamp'], y, label='Actual energy consumption', marker='o')
plt.plot(df['Timestamp'], predictions, label='Predicted energy consumption', marker='x')
plt.xlabel('Time')
plt.ylabel('Energy consumption (kWh)')
plt.title('Energy consumption prediction')
plt.legend()
plt.xticks(rotation=45)
plt.grid()
plt.show()
In summary, the above shows in detail how to build a baseline model on an effective feature set, calculate the mean square error and the coefficient of determination, and generate a visualization of the load prediction result, which helps in understanding the model's performance and provides a data basis for future energy consumption prediction.
In this embodiment, before step S3 of collecting corresponding various multi-source data from different data sources by using a sensor preset by the building terminal, the method further includes:
S301, dividing an acquisition area covered by the sensor in a building through the building terminal based on a preset working range of the sensor;
S302, judging whether the acquisition area can detect a preset overlapping monitoring edge;
And S303, if yes, constructing an overlapping time window according to the overlapping monitoring edge, identifying acquisition measurement values of a plurality of sensors in the overlapping time window, generating data features to be fused according to the acquisition measurement values, carrying out weight distribution on the acquisition measurement values by applying a preset Kalman filter, and calculating a weighted average value of the data features, wherein the data features specifically comprise temperature, humidity and energy consumption.
In this embodiment, based on the preset working range of the sensors in the building, the system divides, through the building terminal, the acquisition areas covered by the sensors, and then judges whether a preset overlapping monitoring edge can be detected in those acquisition areas so as to execute the corresponding steps. For example, when the system determines that no preset overlapping monitoring edge is detected, it considers that the working ranges of the sensors are divided properly: the acquisition area covered by each sensor is independent and there is no repeated monitoring. The system continues to acquire the energy consumption data of each area and transmits it to the building terminal at a preset frequency, while monitoring the stability of data transmission so that the data of each sensor in its independent acquisition area reaches the terminal completely and in time. The system also periodically checks the state of the sensors, including battery life and signal strength, to ensure long-term stable operation. Because the coverage areas do not overlap, the system periodically checks whether data from any area has been missed, and takes repair measures if data acquisition in an area is missing or delayed.
For example, when the system determines that a preset overlapping monitoring edge is detected in the acquisition areas, it considers that the working ranges of the sensors cover part of the area more than once. The system constructs an overlapping time window according to the overlapping monitoring edge, identifies the measurement values collected by the multiple sensors within that window, generates the data features to be fused from those measurement values (specifically temperature, humidity and energy consumption), applies the preset Kalman filter to assign weights to the collected measurement values, and calculates the weighted average of the data features.
By combining the measured values of multiple sensors in the overlapping area, the system reduces the error that a single sensor might introduce, and the redundant data help to confirm and correct abnormal values, which improves the accuracy of the overall data. Overlapping monitoring also lets the system draw on data features from multiple sources, which improves data reliability. With Kalman filtering, the system obtains more stable and reliable data features through weighted averaging even in the presence of noise and uncertainty. Generating the data features to be fused from multiple measured values in the overlapping time window ensures the diversity and comprehensiveness of the data fusion: measurements from different sensors at the same moment provide richer information, so the final fused data features are more representative. The Kalman filter automatically adjusts the weights of the different sensors as the real-time measurements change, so the system can dynamically optimize data processing according to actual conditions; this adaptive capability is important in changing environments and allows the system to respond to changes in sensor state in real time.
It should be noted that an overlapping time window is constructed according to the overlapping monitoring edge, the measurement values collected by a plurality of sensors within the overlapping time window are identified, the data features to be fused are generated from the collected measurement values, a preset Kalman filter is applied to assign weights to the collected measurement values, and a weighted average of the data features is calculated. A specific example is as follows:
Assume that an office building has two temperature sensors, sensor A and sensor B, whose recordings overlap in the time window from 10:00 to 10:05. During this period each sensor records temperature data; the specific data records are as follows:
Recording of sensor A:
10:00 - 22°C;
10:01 - 23°C;
10:02 - 24°C;
10:03 - 22°C;
10:04 - 23°C;
10:05 - 24°C;
Recording of sensor B:
10:00 - 21°C;
10:01 - 22°C;
10:02 - 23°C;
10:03 - 22°C;
10:04 - 21°C;
10:05 - 23°C;
Constructing the overlapping time window: the overlapping time period of sensors A and B, from 10:00 to 10:05, is identified; this time window is used to integrate the data of the two sensors;
Identifying the collected measurements: the system extracts the temperature measurements of both sensors within the overlapping time window, yielding the following table:
Time | Sensor A (°C) | Sensor B (°C)
10:00 | 22 | 21
10:01 | 23 | 22
10:02 | 24 | 23
10:03 | 22 | 22
10:04 | 23 | 21
10:05 | 24 | 23
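The window construction and alignment step can be sketched as follows. This is only an illustration under stated assumptions: readings are held in dictionaries keyed by zero-padded "HH:MM" strings (which compare correctly as text), and the function names `overlap_window` and `align_readings` are hypothetical.

```python
def overlap_window(readings_a, readings_b):
    """Return (start, end) of the time span covered by both sensors, or None."""
    start = max(min(readings_a), min(readings_b))
    end = min(max(readings_a), max(readings_b))
    return (start, end) if start <= end else None


def align_readings(readings_a, readings_b):
    """Pair up the measurements both sensors took inside the overlapping window."""
    window = overlap_window(readings_a, readings_b)
    if window is None:
        return []
    start, end = window
    common = sorted(t for t in readings_a if t in readings_b and start <= t <= end)
    return [(t, readings_a[t], readings_b[t]) for t in common]


sensor_a = {"10:00": 22, "10:01": 23, "10:02": 24, "10:03": 22, "10:04": 23, "10:05": 24}
sensor_b = {"10:00": 21, "10:01": 22, "10:02": 23, "10:03": 22, "10:04": 21, "10:05": 23}

print(overlap_window(sensor_a, sensor_b))  # ('10:00', '10:05')
for row in align_readings(sensor_a, sensor_b):
    print(row)
```

Each aligned tuple corresponds to one row of the table above and is the input to the subsequent fusion step.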
Generating the data features to be fused: the temperature features to be fused are extracted from the table; for example, at 10:00 the features to be fused are [22, 21];
Applying the Kalman filter and setting the filter parameters: the measurement noise covariance of sensor A is assumed to be 0.1 and that of sensor B is assumed to be 0.3, and the system adjusts the weights dynamically according to the different noise levels.
Weight calculation (each weight is taken as the reciprocal of the measurement noise covariance, which is consistent with the normalized weights 0.75 and 0.25 used below):
Weight of sensor A: $w_A = 1 / 0.1 = 10$
Weight of sensor B: $w_B = 1 / 0.3 \approx 3.33$
Total weight: $w_A + w_B \approx 13.33$
Normalized weights: $\hat{w}_A = 10 / 13.33 \approx 0.75$, $\hat{w}_B = 3.33 / 13.33 \approx 0.25$
Taking 10:00 as an example, the weighted average of the data feature is: $0.75 \times 22 + 0.25 \times 21 = 21.75$ °C
Repeating the calculation for the other time points gives:
Time | Weighted average temperature (°C)
10:00 | (0.75×22)+(0.25×21)=21.75
10:01 | (0.75×23)+(0.25×22)=22.75
10:02 | (0.75×24)+(0.25×23)=23.75
10:03 | (0.75×22)+(0.25×22)=22.00
10:04 | (0.75×23)+(0.25×21)=22.50
10:05 | (0.75×24)+(0.25×23)=23.75
In conclusion, the finally obtained weighted average temperature data are more accurate, the actual environmental condition can be better reflected, the data fusion method based on the overlapped monitoring edges and the Kalman filtering provides more accurate and reliable environmental monitoring data for intelligent building management, and the effectiveness and the scientificity of decision making are ensured.
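The weighting in this example can be reproduced with a short sketch. It assumes that each weight is the reciprocal of the sensor's measurement noise covariance, normalized to sum to 1, which is what the 0.75 / 0.25 split implies; a full Kalman filter would additionally carry a state estimate and its covariance from step to step, which is not modelled here. The function names are illustrative.

```python
def inverse_variance_weights(noise_covariances):
    """Normalized weights proportional to 1 / measurement noise covariance."""
    raw = [1.0 / r for r in noise_covariances]
    total = sum(raw)
    return [w / total for w in raw]


def fuse(measurements, weights):
    """Weighted average of simultaneous measurements from several sensors."""
    return sum(w * m for w, m in zip(weights, measurements))


sensor_a = [22, 23, 24, 22, 23, 24]  # readings at 10:00 .. 10:05
sensor_b = [21, 22, 23, 22, 21, 23]

weights = inverse_variance_weights([0.1, 0.3])
fused = [round(fuse(pair, weights), 2) for pair in zip(sensor_a, sensor_b)]

print([round(w, 2) for w in weights])  # [0.75, 0.25]
print(fused)                           # [21.75, 22.75, 23.75, 22.0, 22.5, 23.75]
```

The less noisy sensor A dominates the fused value, while sensor B still pulls the estimate slightly towards its own reading whenever the two disagree.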
In this embodiment, in step S3 of constructing an effective feature set according to the feature parameters, the method further includes:
S31, constructing seasonal features of the building terminal on energy consumption based on preset virtual variables, and calculating ratio features and difference features between the seasonal features, wherein the virtual variables specifically comprise spring, summer, autumn and winter, the ratio features specifically comprise the ratio of energy consumption to temperature, and the difference features specifically comprise the difference between current load and past load;
S32, judging whether missing values exist in the ratio feature and the difference feature;
And S33, if not, carrying out standardization processing on the ratio characteristic and the difference characteristic, ensuring that the ratio characteristic and the difference characteristic are always on the same scale, combining the ratio characteristic and the difference characteristic with other characteristics to form a composite characteristic, and capturing the energy consumption influence corresponding to seasonal change according to the composite characteristic.
In this embodiment, the system constructs seasonal features of the building terminal's energy consumption based on preset virtual variables, which specifically comprise spring, summer, autumn and winter, and calculates the ratio features and difference features between the seasonal features; the ratio feature is specifically the ratio of energy consumption to temperature, and the difference feature is specifically the difference between the current load and the past load. The system then judges whether missing values exist in the ratio features and the difference features so as to execute the corresponding steps. For example, when the system determines that missing values exist, it considers that these features may not accurately reflect the seasonal variation of energy consumption, which may affect subsequent analysis and prediction. The system records the specific positions of the missing values (time point and feature) and calculates the proportion of missing values; at the same time it resamples the data, computing representative data over a selected time period to ensure the timeliness and continuity of the data. Missing values judged to be unreasonable are marked as "missing" or "abnormal" and are removed or handled separately in subsequent processing.
For example, when the system determines that no missing values exist in the ratio features and the difference features, it standardizes the ratio features and the difference features so that they are always on the same scale, combines them with other features to form composite features, and captures the energy consumption influence corresponding to seasonal change according to the composite features. Standardization ensures that all features are compared on the same scale and prevents features with a larger numerical range from dominating the others, which makes the model more balanced and improves prediction accuracy. For example, the ratio of energy consumption to temperature may be relatively small while the energy consumption difference may be large; after standardization both fall within the same range, which avoids imbalance among the features.
Meanwhile, by combining different ratio features and difference features, the composite features can capture more complex energy consumption patterns. The composite features can reveal deeper correlations between seasonal variation and energy consumption fluctuation, so that the model does not depend on a single feature but analyzes the driving factors of energy consumption variation from multiple dimensions, which improves prediction accuracy. The combination of ratio features and difference features can capture the variation trend of energy consumption in different seasons, such as load variation in winter and the possible energy consumption drivers in spring and summer. By analyzing the composite features, the system can identify the energy consumption patterns specific to winter or summer and so optimize load prediction. The composite features also provide richer information for the prediction model, making the prediction results easier to interpret: a decision maker can determine the specific influence of different seasons on energy consumption from the composite features, better understand the sources of energy consumption fluctuation in actual operation, and take corresponding measures.
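The branch on missing values (S32) and the seasonal virtual variables can be sketched as follows. The one-hot encoding shown is one common way to realise virtual variables and is an assumption here, as are the row layout, the field names and the sample rows; the worked example in this description uses the integer codes 1 to 4 instead.

```python
import math

SEASONS = ("spring", "summer", "autumn", "winter")


def season_dummies(season):
    """One-hot (virtual variable) encoding of a season name."""
    if season not in SEASONS:
        raise ValueError(f"unknown season: {season}")
    return {f"is_{name}": int(name == season) for name in SEASONS}


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_positions(rows, feature_names):
    """Return (timestamp, feature) for every missing value, as recorded in the missing-value branch."""
    return [
        (row["timestamp"], name)
        for row in rows
        for name in feature_names
        if is_missing(row[name])
    ]


# Hypothetical feature rows
rows = [
    {"timestamp": "2024-07-01 14:00", "ratio": 10.0, "difference": 50.0},
    {"timestamp": "2024-07-01 15:00", "ratio": None, "difference": 40.0},
]

gaps = missing_positions(rows, ["ratio", "difference"])
if gaps:
    print("missing values at:", gaps)
else:
    print("no missing values: proceed to standardization (S33)")
print(season_dummies("summer"))
```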
It should be noted that, the ratio feature and the difference feature are normalized, so as to ensure that the ratio feature and the difference feature are always on the same scale, and the ratio feature and the difference feature are combined with other features to form a composite feature, and the energy consumption effect corresponding to seasonal variation is captured according to the composite feature, which is specifically exemplified as follows:
Assume the system manages a large office building and collects the following data:
temperature (T): the ambient temperature of each day (unit: °C);
energy consumption (E): the energy consumption of the building, recorded per hour (unit: kWh);
historical energy consumption (E_history): the recorded energy consumption for the past 24 hours (unit: kWh);
season (S): four virtual variables representing the current season: spring (1), summer (2), autumn (3) and winter (4);
Example data: assume that on a summer afternoon the system collects the following data:
current temperature (T): 30 °C;
current energy consumption (E): 300 kWh;
energy consumption for the past 24 hours (E_history): 250 kWh;
season (S): summer (2);
First, the ratio feature and the difference feature are calculated.
Ratio feature (R), the ratio of energy consumption to temperature: $R = E / T = 300 / 30 = 10$
Difference feature (D), the difference between the current energy consumption and the past energy consumption: $D = E - E_{\text{history}} = 300 - 250 = 50$
Assume that the energy consumption data for the entire summer give the following statistics for the ratio feature and the difference feature:
Mean of the ratio feature ($\mu_R$): 8
Standard deviation of the ratio feature ($\sigma_R$): 2
Mean of the difference feature ($\mu_D$): 30
Standard deviation of the difference feature ($\sigma_D$): 10
Normalization of the ratio feature: $R' = (R - \mu_R) / \sigma_R = (10 - 8) / 2 = 1$
Normalization of the difference feature: $D' = (D - \mu_D) / \sigma_D = (50 - 30) / 10 = 2$
The normalized ratio feature and difference feature are then combined with the seasonal feature to form composite features;
Assuming that the seasonal feature S (summer) is denoted as 2, the following composite features can be formed (the products are inferred from the values C1 = 2 and C2 = 4 used in the analysis that follows): $C_1 = R' \times S = 1 \times 2 = 2$ and $C_2 = D' \times S = 2 \times 2 = 4$
Capturing the influence of seasonal changes on energy consumption: the system now has the composite features C1 and C2, which can be used to analyze the seasonal influence on energy consumption;
the composite feature C1 = 2 shows that, in summer, energy consumption is relatively sensitive to temperature;
the composite feature C2 = 4 shows that the current energy consumption is 50 kWh higher than the historical energy consumption, a significant increase, and that this kind of variation is common in summer;
In summary, with these composite features, the system can use machine learning models to predict energy consumption for several hours in the future, taking into account seasonal and temperature variations, while if the system predicts that temperature will continue to rise within several hours in the future, the composite features show that energy consumption will increase significantly, management personnel can initiate energy saving measures in advance, such as adjusting the temperature settings of the air conditioner or activating standby energy sources, and the system can generate detailed reports on energy consumption and climate effects, helping building managers to make more scientific decisions.
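The arithmetic of this worked example can be checked with a few lines of code. The sketch assumes that each composite feature is the product of a standardized feature and the season code, which is the combination consistent with the stated values C1 = 2 and C2 = 4; the variable and function names are illustrative.

```python
def zscore(value, mean, std):
    """Standardize a value with a known mean and standard deviation."""
    return (value - mean) / std


# Example readings for a summer afternoon
E, T, E_history, S = 300.0, 30.0, 250.0, 2

R = E / T            # ratio feature: energy consumption per degree
D = E - E_history    # difference feature: current minus past consumption

R_norm = zscore(R, mean=8.0, std=2.0)
D_norm = zscore(D, mean=30.0, std=10.0)

# Assumed combination rule: standardized feature multiplied by the season code
C1 = R_norm * S
C2 = D_norm * S

print(R, D)            # 10.0 50.0
print(R_norm, D_norm)  # 1.0 2.0
print(C1, C2)          # 2.0 4.0
```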
In this embodiment, in step S5 of establishing a corresponding baseline model on the effective feature set, the method further includes:
S51, based on a chart of energy consumption preset by the building terminal, plotting the predicted energy consumption on the X-axis and the energy consumption residual on the Y-axis, and combining them to obtain a scatter diagram;
S52, judging whether the residuals on the scatter diagram are randomly distributed;
and S53, if yes, inputting the scatter diagram into the baseline model, and performing cross-validation with the baseline model to acquire fluctuation information of the baseline model on different data subsets.
In this embodiment, based on the chart of energy consumption preset by the building terminal, the system plots the predicted energy consumption on the X-axis and the energy consumption residual on the Y-axis, combines them to obtain a scatter diagram, and then judges whether the residuals on the scatter diagram are randomly distributed so as to execute the corresponding steps. For example, when the system determines that the residuals are not randomly distributed, it considers that the baseline model may have a systematic deviation and has not accurately captured the real relationship between energy consumption and the related features. The system checks whether the features used by the model cover all key factors that may affect energy consumption, in particular whether weather, economic activity, the running state of building equipment and other external factors are adequately reflected. It strengthens the model's ability to recognize energy consumption patterns by introducing more virtual variables, adjusting how features are constructed, or considering interaction features, then regenerates the scatter diagram and checks whether the residual distribution is random, to ensure that the model better reflects actual energy consumption fluctuations.
For example, when the system determines that the residuals on the scatter diagram are randomly distributed, it considers that the baseline model accurately captures the real relationship between energy consumption and the related features. The system inputs the scatter diagram into the baseline model and performs cross-validation with the baseline model to acquire fluctuation information of the baseline model on different data subsets. Cross-validation on different data subsets shows how stable and consistent the model is across those subsets. By comparing performance indicators on the different subsets (for example the mean square error and the coefficient of determination), the prediction capability of the model can be understood comprehensively. The hyperparameters of the model can be adjusted during cross-validation to find the optimal model configuration, which improves prediction accuracy and robustness. During cross-validation, the influence of each feature on the prediction results can also be analyzed, which helps identify the features that remain important under different conditions and provides a basis for subsequent feature selection and model improvement.
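The description does not specify how randomness of the residuals is judged. One possible heuristic, shown here purely as an assumption, is the Wald-Wolfowitz runs test applied to the signs of the residuals ordered by predicted value: long stretches of same-signed residuals indicate a systematic pattern. The function names and the 1.96 limit (roughly a 5% two-sided level) are illustrative choices.

```python
import math


def runs_test_z(residuals):
    """z-score of the Wald-Wolfowitz runs test on the signs of the residuals."""
    signs = [r > 0 for r in residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if len(signs) < 3 or n_pos == 0 or n_neg == 0:
        raise ValueError("need at least 3 residuals with both signs present")
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n_pos + n_neg
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    return (runs - expected) / math.sqrt(variance)


def residuals_look_random(predicted, actual, z_limit=1.96):
    """Order residuals along the X-axis (predicted value) and apply the runs test."""
    ordered = sorted(zip(predicted, actual))
    residuals = [a - p for p, a in ordered]
    return abs(runs_test_z(residuals)) <= z_limit


predicted = list(range(1, 11))
# Systematic pattern: model under-predicts low values and over-predicts high ones
patterned = [p + 1 for p in predicted[:5]] + [p - 1 for p in predicted[5:]]
# Mixed signs with no obvious structure
offsets = [1, -1, -1, 1, 1, -1, 1, -1, -1, 1]
mixed = [p + o for p, o in zip(predicted, offsets)]

print(residuals_look_random(predicted, patterned))  # False
print(residuals_look_random(predicted, mixed))      # True
```

A visual inspection of the plot remains useful alongside such a test, since the runs test only detects sign clustering and not, for example, residual spread that grows with the predicted value.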
It should be noted that the scatter diagram is input into the baseline model, and cross-validation is performed with the baseline model to obtain fluctuation information of the baseline model on different data subsets. A specific example is as follows:
assuming an energy consumption dataset for an office building, the data includes the following features:
temperature (°c);
humidity (%);
economic activity level (index);
Historical energy consumption (kWh);
current energy consumption (kWh) (target variable);
Data preparation:
Data collection: hourly energy consumption data are collected for one year, giving an assumed total of 8,760 records;
Data cleaning: missing values and abnormal values are removed to ensure data quality;
Dataset division: the dataset is randomly divided into 10 subsets (k=10), so that each subset contains 876 records;
Cross-validation process (cyclic training and testing):
Round 1: subset 1 is used as the test set and subsets 2 to 10 as the training set;
the model is trained and its parameters recorded, energy consumption is predicted from the input features of subset 1, and the mean square error (MSE) is calculated;
Round 2: subset 2 is used as the test set, and subset 1 and subsets 3 to 10 as the training set;
This process continues until every subset has been used as the test set once;
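The division into 10 subsets and the rotation of the test set can be sketched as follows. Only the index bookkeeping is shown; the function name `k_fold_indices` and the fixed random seed are assumptions, and model training is left as a comment because the description does not prescribe a particular implementation.

```python
import random


def k_fold_indices(n_records, k, seed=0):
    """Randomly split record indices 0..n_records-1 into k folds of near-equal size."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    fold_size = n_records // k
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    # Distribute any leftover records when n_records is not divisible by k
    for j, idx in enumerate(indices[k * fold_size:]):
        folds[j].append(idx)
    return folds


folds = k_fold_indices(8760, 10)
for round_no, test_fold in enumerate(folds, start=1):
    train = [i for fold in folds if fold is not test_fold for i in fold]
    # A model would be fitted on `train` and its MSE computed on `test_fold` here.
    if round_no <= 2:
        print(f"round {round_no}: test={len(test_fold)} train={len(train)}")
```

With 8,760 records and k=10, every round tests on 876 records and trains on the remaining 7,884.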
Calculating the performance index: assume that after 10 rounds of cross-validation the recorded mean square errors are as follows:
Subset 1: MSE = 5.2;
Subset 2: MSE = 4.8;
Subset 3: MSE = 5.0;
Subset 4: MSE = 6.1;
Subset 5: MSE = 5.4;
Subset 6: MSE = 4.9;
Subset 7: MSE = 5.3;
Subset 8: MSE = 5.1;
Subset 9: MSE = 4.7;
Subset 10: MSE = 5.5;
Summary: the average MSE and its standard deviation are calculated:
Average MSE = (5.2+4.8+5.0+6.1+5.4+4.9+5.3+5.1+4.7+5.5)/10 = 5.2;
Standard deviation ≈ 0.39 (population standard deviation of the ten values above; the sample standard deviation is ≈ 0.41);
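The summary statistics can be recomputed directly from the ten fold values listed above using the Python standard library. Computed from those values, the spread is about 0.39 (population standard deviation) or 0.41 (sample standard deviation).

```python
import statistics

# Mean square error recorded for each of the 10 cross-validation rounds
fold_mse = [5.2, 4.8, 5.0, 6.1, 5.4, 4.9, 5.3, 5.1, 4.7, 5.5]

mean_mse = statistics.mean(fold_mse)
population_std = statistics.pstdev(fold_mse)  # divides by n
sample_std = statistics.stdev(fold_mse)       # divides by n - 1

print(round(mean_mse, 2))        # 5.2
print(round(population_std, 2))  # 0.39
print(round(sample_std, 2))      # 0.41
```

A small standard deviation relative to the mean indicates that the model performs consistently across the data subsets.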
Visualization and analysis:
Drawing a chart: a chart is generated with actual energy consumption on the horizontal axis and predicted energy consumption on the vertical axis, and a trend line is added to show the model's prediction capability visually;
Drawing a residual plot: it is checked whether the residuals are randomly distributed or show a pattern;
Model optimization: analysis of the results shows that the MSE of the model is higher in certain time periods (such as during high summer temperatures), possibly because of the complex relationship between temperature and energy consumption;
Model tuning: adding more features (such as solar radiation intensity and wind speed) is considered;
In summary, through the detailed cross-validation process, the performance of the model on different data sets can be comprehensively evaluated, the model is ensured to have good generalization capability, if the model is found to perform poorly under certain specific conditions, targeted adjustment can be performed according to the cross-validation result, so that the accuracy of building energy consumption prediction is improved, and the detailed analysis and optimization process is beneficial to realizing more effective energy consumption management, improving energy efficiency and reducing operation cost.
In this embodiment, the step S2 of determining whether the energy consumption data can be presented to the building terminal in real time further includes:
S21, detecting, based on a network bandwidth preset by the building terminal, network delay information when the energy consumption data is transmitted;
S22, judging whether the network delay information exceeds a preset delay period;
and S23, if yes, adding a time stamp label in the data acquisition link and the data display link of the energy consumption data, and calculating the overall delay of the energy consumption data according to the time stamp label.
In this embodiment, the system detects network delay information when the energy consumption data is transmitted, based on a network bandwidth preset by the building terminal, and then judges whether the network delay information exceeds a preset delay period in order to execute the corresponding steps. For example, when the system judges that the network delay information does not exceed the preset delay period, the system considers that the network connection is normal, that data transmission is smooth, and that the real-time energy consumption condition in the building can be reflected in time. The system keeps the current data transmission state so that the energy consumption data is continuously uploaded to the building terminal, records the current delay information, updates the system monitoring panel so that the data transmission state is monitored in real time, performs a data integrity check to confirm that no data was lost or damaged during transmission, and generates a real-time feedback report that shows managers the state of the current energy consumption data and the network transmission condition, so that all relevant personnel know the system operating condition. For example, when the system judges that the network delay information exceeds the preset delay period, the system considers that data transmission is abnormal and that the real-time energy consumption condition in the building cannot be reflected in time. The system adds timestamp labels in the data acquisition link and the data display link of the energy consumption data and calculates the overall delay of the energy consumption data from the different timestamp labels. The timestamp labels accurately record the time at which the data passes each link, so the system can analyze the delay of each link and understand why the delay increases under specific conditions, which facilitates subsequent optimization of the network configuration. When network delay occurs, the system can quickly locate the specific link responsible (such as data acquisition or data transmission), so that technicians can troubleshoot and repair it in a targeted manner. The adaptive calculation of the overall delay also helps the system adjust its data processing strategy, for example by reducing the data transmission frequency or selecting a more efficient data compression mode when network conditions are poor.
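The timestamp-based delay calculation described above can be sketched as follows. This is only an illustrative sketch: the class and function names, the intermediate `transmitted_at` label (used to separate the acquisition and transmission links) and the value of `MAX_DELAY_S` are assumptions, not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class StampedReading:
    """One energy consumption reading with a timestamp label per link (seconds)."""
    acquired_at: float     # label added in the data acquisition link
    transmitted_at: float  # assumed extra label: arrival at the building terminal
    displayed_at: float    # label added in the data display link


def link_delays(reading):
    """Delay contributed by each link, derived from consecutive labels."""
    return {
        "acquisition_to_terminal": reading.transmitted_at - reading.acquired_at,
        "terminal_to_display": reading.displayed_at - reading.transmitted_at,
    }


def overall_delay(reading):
    """Overall delay from acquisition to display."""
    return reading.displayed_at - reading.acquired_at


def slowest_link(reading):
    """Name of the link that contributes the largest delay."""
    delays = link_delays(reading)
    return max(delays, key=delays.get)


MAX_DELAY_S = 2.0  # hypothetical preset delay period

reading = StampedReading(acquired_at=100.0, transmitted_at=102.5, displayed_at=103.0)
exceeds = overall_delay(reading) > MAX_DELAY_S
print(overall_delay(reading), slowest_link(reading), exceeds)
```

When `exceeds` is true, a strategy such as lowering the transmission frequency could be triggered for the link returned by `slowest_link`.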
In this embodiment, in step S4 of determining whether the valid feature set needs preset data cleaning, the method further includes:
S41, calculating the proportion of the feature missing values based on the pre-statistical features of the effective feature set, and obtaining the distribution of the feature missing values according to the proportion to generate a corresponding missing value heat map;
S42, judging whether the missing value heat map is matched with a preset random missing;
S43, if not, identifying the association information of the feature missing value and the pre-statistics feature, and acquiring the corresponding missing source according to the association information.
In this embodiment, the system calculates the proportion of the feature missing values based on the pre-statistical features of the effective feature set, obtains the distribution of the feature missing values according to the different proportions, generates the corresponding missing value heat map, and then judges whether the missing value heat map matches a preset random missing pattern in order to execute the corresponding steps. For example, when the system determines that the missing value heat map matches the preset random missing pattern, the system considers that the distribution of the feature missing values is not influenced by systematic factors: the missing values arise randomly rather than from a specific cause. The system records the missing value heat map and its matching result and generates an analysis report so that the team knows the state of the missing data and its influence on the analysis. Because the missing values are random, the system can continue to evaluate and verify the stability and accuracy of the model without major adjustment to the data, but it still needs to monitor the missing condition of subsequent data to ensure that data quality continues to meet the expected standard. For example, when the system determines that the missing value heat map does not match the preset random missing pattern, the system considers that the distribution of the feature missing values is influenced by systematic factors. The system identifies the association information between the feature missing values and the pre-statistical features and acquires the corresponding missing source according to the different association information. By analyzing the association of the missing values with other features, the system can identify potential causes of the missing data, such as sensor faults, data acquisition errors or external environment factors, which helps to solve the problem in a targeted manner. Identifying non-random missing values also helps model designers understand the limitations of the data set and take these factors into account when constructing the model, which enhances the robustness of the model to missing data. Identifying the source of the missing features also provides important information for building managers and helps them consider the potential influence of missing data on energy consumption analysis and prediction when making decisions.
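One possible way to look for an association between missing values and another feature is sketched below. It correlates a 0/1 "is missing" indicator with the values of a second feature. The function names, the `THRESHOLD` value and the sample readings are illustrative assumptions and do not come from the embodiment.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # no variation, so no measurable association
    return cov / sqrt(vx * vy)


def missingness_association(target, other):
    """Correlate 'target is missing' (1.0/0.0) with the values of `other`."""
    pairs = [(1.0 if t is None else 0.0, o)
             for t, o in zip(target, other) if o is not None]
    flags = [flag for flag, _ in pairs]
    values = [value for _, value in pairs]
    return pearson(flags, values)


THRESHOLD = 0.5  # hypothetical cut-off for "not random"

# Made-up readings in which humidity drops out at high temperature.
temperature = [20, 21, 22, 30, 31, 32, 23, 33]
humidity = [60, 62, 61, None, None, None, 63, None]

r = missingness_association(humidity, temperature)
likely_systematic = abs(r) > THRESHOLD
print(round(r, 3), likely_systematic)
```

A strong correlation such as this one would point toward a temperature-related missing source (for example a sensor that fails when hot) rather than random loss.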
A specific example of calculating the ratio of the feature missing values based on the pre-statistical features of the effective feature set, obtaining the distribution of the feature missing values according to the ratio, and generating the corresponding missing value heat map is as follows.
Assume that there is a data set for building energy consumption monitoring that comprises the following features:
temperature;
humidity;
energy consumption;
economic activity level.
Sample records of the data set are as follows:
Temperature (°C) | Humidity (%) | Energy consumption (kWh) | Economic activity level
22 | 60 | 300 | 0.8
23 | NaN | 320 | 0.9
NaN | 65 | 290 | NaN
24 | 70 | NaN | 0.7
25 | NaN | 310 | 0.85
NaN | 68 | 330 | 0.9
The missing value ratio is then calculated for each feature over the six sample records:
Temperature: number of missing values = 2, total number of records = 6, missing value ratio = 2/6 ≈ 0.33 (33.33%);
Humidity: number of missing values = 2, missing value ratio = 2/6 ≈ 0.33 (33.33%);
Energy consumption: number of missing values = 1, missing value ratio = 1/6 ≈ 0.17 (16.67%);
Economic activity level: number of missing values = 1, missing value ratio = 1/6 ≈ 0.17 (16.67%).
To obtain the missing value distribution, the calculation results are arranged into a table:
Feature | Missing value ratio
Temperature | 0.33
Humidity | 0.33
Energy consumption | 0.17
Economic activity level | 0.17
A missing value heat map is then generated. Example code for generating the heat map is as follows:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Create the example data (missing value ratio per feature)
data = {
    'feature': ['temperature', 'humidity', 'energy consumption', 'economic activity level'],
    'missing value ratio': [0.33, 0.33, 0.17, 0.17],
}
df = pd.DataFrame(data)

# Create the heat map, one row per feature
plt.figure(figsize=(8, 4))
heatmap_data = df.set_index('feature')
sns.heatmap(heatmap_data, annot=True, cmap='YlGnBu', cbar=True)
plt.title('missing value heat map')
plt.xlabel('missing value ratio')
plt.ylabel('feature')
plt.show()
It can be seen from the heat map that "temperature" and "humidity" have the highest missing value ratios, at about 33%, which indicates that the data quality of these two features is comparatively poor and that they may require further cleaning and imputation;
In summary, generating the heat map displays the missing condition of the data intuitively, so that data problems can be identified quickly. Based on the heat map result, a more reasonable data cleaning and processing strategy can be formulated to improve the integrity and accuracy of the data. Generating the heat map regularly makes it possible to monitor changes in the missing condition of the data and to ensure the continued effectiveness of data collection.
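As a complement to the example, the per-feature missing value ratio can be computed directly from the six sample records rather than entered by hand. The following is a minimal sketch using only the standard library; the dictionary layout and the function name `missing_ratio` are illustrative assumptions.

```python
import math

NAN = float("nan")

# The six sample records from the example, stored column by column.
records = {
    "temperature": [22, 23, NAN, 24, 25, NAN],
    "humidity": [60, NAN, 65, 70, NAN, 68],
    "energy_consumption": [300, 320, 290, NAN, 310, 330],
    "economic_activity_level": [0.8, 0.9, NAN, 0.7, 0.85, 0.9],
}


def missing_ratio(values):
    """Fraction of entries in `values` that are NaN."""
    missing = sum(1 for v in values if isinstance(v, float) and math.isnan(v))
    return missing / len(values)


ratios = {name: round(missing_ratio(vals), 2) for name, vals in records.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

The resulting `ratios` dictionary can be passed to a plotting routine in place of hard-coded values, so the heat map stays consistent with the underlying records.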
In this embodiment, based on a data unit type preset by the building terminal, the step S1 of unifying the data format of the energy consumption data into the data unit type further includes:
S11, detecting an original unit in the energy consumption data, and identifying different units to be converted based on the original unit;
S12, judging whether the different units can be uniformly converted into the data unit type;
S13, if not, marking inconsistent items of different units, carrying out unit-by-unit conversion on the corresponding data of the different units row by row according to the inconsistent items, and comparing data distribution and statistical characteristics before and after conversion.
In this embodiment, the system detects the original units in the energy consumption data, identifies the different units to be converted based on each original unit, and then judges whether the different units can be uniformly converted into the data unit type in order to execute the corresponding steps. For example, when the system judges that the different units to be converted can be uniformly converted into the data unit type, the system considers that the units have the same dimension and that conversion can be performed with a standard conversion formula or ratio. The system converts the data item by item according to the defined conversion rules, updates the data set with the converted data, and records the original units and the converted units to facilitate tracing and auditing. Once the data has been uniformly converted into the standard unit type, further data analysis, modeling and visualization can be performed. For example, when the system judges that the units to be converted cannot be uniformly converted into the data unit type, the system marks the inconsistent items of the different units, converts the corresponding data of the different units row by row according to the inconsistent items, and compares the data distribution and statistical characteristics before and after conversion. Row-by-row conversion avoids the errors or deviations that global processing might introduce, especially when the data involves units of different dimensions, and ensures that each item is processed according to its own unit rather than being converted incorrectly or discarded. Comparing the data distribution and statistical characteristics before and after conversion ensures that meaningful information is retained and keeps data integrity high in a complex data environment. Marking the inconsistent items of different units ensures that they can be traced in subsequent analysis, prevents data in a wrong unit from being confused or misused, and allows problems to be located and corrected quickly when they occur.
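The row-by-row conversion with marking of inconsistent items can be sketched as follows. The target unit, the conversion table `TO_KWH`, the function names and the sample rows are all illustrative assumptions; the embodiment does not specify them.

```python
from statistics import mean, pstdev

TARGET_UNIT = "kWh"  # hypothetical preset data unit type for energy

# Assumed conversion factors into the target unit.
TO_KWH = {"kWh": 1.0, "Wh": 0.001, "MWh": 1000.0, "J": 1.0 / 3_600_000.0}


def convert_rows(rows):
    """Convert (value, unit) rows one by one; mark rows that cannot be converted."""
    converted, inconsistent = [], []
    for index, (value, unit) in enumerate(rows):
        factor = TO_KWH.get(unit)
        if factor is None:
            inconsistent.append((index, value, unit))  # kept for tracing
        else:
            converted.append(value * factor)
    return converted, inconsistent


def summary(values):
    """Basic statistics used to compare data before and after conversion."""
    return {"mean": mean(values), "std": pstdev(values),
            "min": min(values), "max": max(values)}


rows = [(300.0, "kWh"), (320000.0, "Wh"), (0.29, "MWh"), (25.0, "degC")]
converted, inconsistent = convert_rows(rows)

raw_convertible = [value for value, unit in rows if unit in TO_KWH]
before, after = summary(raw_convertible), summary(converted)
print(after, inconsistent)
```

In this made-up input the raw values span several orders of magnitude before conversion, while the converted values fall in a narrow range; a large spread in `before` compared with `after` is one sign that mixed units were present.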
Referring to fig. 2, an energy saving optimization control system based on a load curve according to an embodiment of the present invention includes:
The unifying module 10 is configured to unify a data format of the energy consumption data into a data unit type based on the data unit type preset by the building terminal, where the data unit type specifically includes a temperature, an electric power and a timestamp;
A judging module 20, configured to judge whether the energy consumption data can be presented to the building terminal in real time;
The execution module 30 is configured to, if not, collect corresponding various multi-source data from different data sources through a preset sensor of the building terminal, identify a time sequence of the multi-source data, apply preset data fusion according to the time sequence, integrate the multi-source data to obtain fusion data, extract corresponding feature parameters from the fusion data, and construct an effective feature set according to the feature parameters, where the time sequence specifically includes trend change and periodic change, and the feature parameters specifically include weather effect index, economic activity level and historical load data;
a second judging module 40, configured to judge whether the effective feature set needs preset data cleaning;
and the second execution module 50 is configured to, if not required, determine the linear relationship between each feature in the effective feature set and the load variable by using a preset correlation coefficient, establish a corresponding baseline model on the effective feature set, calculate a mean square error and a coefficient of determination of the baseline model, construct a load prediction result of the building terminal through the baseline model, and compare the load prediction result with an actual load value to generate a load line graph.
In this embodiment, the unifying module 10 unifies the data format of the energy consumption data collected in the building into the data unit type, based on the data unit type preset by the building terminal; the data unit type specifically includes temperature, power and timestamp. The judging module 20 then judges whether the energy consumption data can be presented to the building terminal in real time in order to execute the corresponding steps. For example, when the system judges that the collected energy consumption data can be presented to the building terminal in real time, the system considers that the data meets the quality requirement and effectively reflects the actual energy consumption condition. The system displays the collected energy consumption data on the building terminal in real time, for example as an energy consumption graph, a pie chart or a histogram, so that users can obtain information quickly. It sets an alarm threshold according to historical data and requirements and automatically raises an alarm when an energy consumption value exceeds the normal range, so that measures can be taken in time. It records the real-time data in a database for subsequent analysis and query, which ensures the integrity and traceability of the data, and it performs dynamic data analysis to generate a trend chart or a prediction model that helps managers identify energy consumption patterns and potential anomalies. For example, when the system judges that the collected energy consumption data cannot be presented to the building terminal in real time, the execution module 30 considers that the data cannot reflect the actual energy consumption condition. The system collects the corresponding multi-source data from different energy consumption data sources through the sensors preset by the building terminal, identifies the time sequence of the multi-source data (specifically including trend change and periodic change), applies the preset data fusion according to the time sequence to integrate the multi-source data into fusion data, extracts the corresponding feature parameters from the fusion data (specifically including weather effect index, economic activity level and historical load data), and constructs an effective feature set according to the feature parameters. By identifying and analyzing the trend change and periodic change in the time sequence, the system captures the dynamic characteristics of the energy consumption data; this not only helps in understanding past energy consumption patterns but also provides an important basis for predicting future energy consumption, and by analyzing the time sequence in depth managers can identify seasonal fluctuations and similar patterns. During fusion, the system integrates data from different sources into a unified data set through the preset data fusion algorithm, which helps eliminate noise and redundancy in the data and improves data quality, so that managers obtain more accurate information that better supports energy consumption analysis and decision-making. Constructing the effective feature set from the extracted feature parameters lays the foundation for the subsequent prediction model, and the effective feature set captures the key driving factors of building energy consumption, which improves prediction accuracy. The second judging module 40 then judges whether the effective feature set needs the preset data cleaning in order to execute the corresponding steps. For example, when the system judges that the effective feature set needs the preset data cleaning, the system considers that the data in the effective feature set may contain errors, abnormal values or noise that could affect the accuracy of subsequent analysis and model construction. The system determines the goals of cleaning, such as removing duplicate data, filling in missing values and handling abnormal values, and ensures that all cleaning steps are consistent with the requirements of subsequent analysis. It checks for missing values in the feature set and decides how to handle them, ensures that all features are consistent in unit, format and type (for example, temperature units may need to be unified to degrees Celsius or degrees Fahrenheit, and power units need to be consistent), and identifies and removes duplicate data to ensure the uniqueness of the feature set and reduce the interference of redundancy with the analysis. For example, when the system judges that the effective feature set does not need the preset data cleaning, the second execution module 50 considers that the data in the effective feature set is free of errors, abnormal values and noise. The system determines the linear relationship between each feature in the effective feature set and the load variable by using the preset correlation coefficient, establishes the corresponding baseline model on the effective feature set, calculates the mean square error and the coefficient of determination of the baseline model, constructs the load prediction result of the building terminal through the baseline model, and compares the load prediction result with the actual load value to generate a load line graph. Using the effective feature set directly saves the time and resources that data cleaning would require and speeds up model establishment, so that the system can respond quickly to changes and provide load prediction results. Through the preset correlation coefficient, the system analyzes the linear relationship between each feature in the effective feature set and the load variable; this analysis helps managers better understand the key factors that influence building energy consumption and guides subsequent management decisions. A baseline model established on linear relationships generally has good interpretability: managers can clearly see how strongly each feature influences the load prediction and make targeted adjustments. More accurate load prediction in turn supports better control of energy consumption, and the prediction results can be fed back to optimize the energy-saving strategy of the system.
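The baseline-model steps (correlation coefficient, linear model, mean square error and coefficient of determination) can be illustrated with a single feature. This sketch assumes the Pearson coefficient and an ordinary least squares line as the baseline; the embodiment names neither explicitly, and the temperature and load values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between a feature and the load."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def fit_line(xs, ys):
    """Least squares slope and intercept for load = slope * feature + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def mean_square_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical feature (temperature, °C) and actual load (kWh).
temperature = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
load = [205.0, 218.0, 242.0, 259.0, 282.0, 300.0]

r = pearson(temperature, load)
slope, intercept = fit_line(temperature, load)
predicted = [slope * t + intercept for t in temperature]
mse = mean_square_error(load, predicted)
r2 = r_squared(load, predicted)
print(round(r, 4), round(mse, 2), round(r2, 4))
```

Plotting `load` and `predicted` against the sample index would give the load line graph that compares the prediction with the actual values.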
In this embodiment, further comprising:
the dividing module is used for dividing an acquisition area covered by the sensor in a building through the building terminal based on a preset working range of the sensor;
the third judging module is used for judging whether the acquisition area can detect a preset overlapping monitoring edge;
And the third execution module is used for, if so, constructing an overlapping time window according to the overlapping monitoring edge, identifying the collected measured values of the plurality of sensors within the overlapping time window, generating the data features to be fused according to the collected measured values, applying a preset Kalman filter to assign weights to the collected measured values, and calculating the weighted average of the data features, wherein the data features specifically include temperature, humidity and energy consumption.
In this embodiment, the system divides the acquisition areas covered by the sensors in the building through the building terminal, based on the preset working range of the sensors, and then judges whether a preset overlapping monitoring edge can be detected in the acquisition areas in order to execute the corresponding steps. For example, when the system determines that no preset overlapping monitoring edge is detected in the acquisition areas covered by the sensors, the system considers that the working ranges of the sensors are properly divided and that each acquisition area is independent, so repeated monitoring is avoided. The system continues to collect the energy consumption data of each area and transmits it to the building terminal at the preset frequency, while monitoring the stability of data transmission so that the data of each sensor in its independent acquisition area reaches the terminal completely and in time. The system periodically checks the state of the sensors, including battery life and signal strength, to ensure long-term stable operation. Because the coverage areas of the sensors do not overlap, the system also periodically checks whether data from any area is missing, and repair measures are taken if data acquisition in an area is missed or delayed. For example, when the system determines that a preset overlapping monitoring edge is detected in the acquisition areas covered by the sensors, the system considers that part of the building is covered by the working ranges of multiple sensors. The system constructs an overlapping time window according to the overlapping monitoring edge, identifies the collected measured values of the multiple sensors within the overlapping time window, generates the data features to be fused (specifically including temperature, humidity and energy consumption) according to the collected measured values, applies the preset Kalman filter to assign weights to the collected measured values, and calculates the weighted average of the data features. By combining the measured values from multiple sensors in the overlapping area, the system reduces the errors that a single sensor might introduce; the redundant data helps to confirm and correct abnormal values, which improves the accuracy of the overall data. Overlapping monitoring also allows the system to use data features from multiple sources, which improves the reliability of the data. Through Kalman filtering, the system obtains more stable and reliable data features by weighted averaging even in the presence of noise and uncertainty. Generating the data features to be fused from multiple measured values within the overlapping time window ensures the diversity and comprehensiveness of data fusion: the measurement results of different sensors at the same moment provide richer information, so the finally fused data features are more representative. The Kalman filter automatically adjusts the weights of the different sensors as the real-time measurements change, so the system can dynamically optimize data processing according to actual conditions; this adaptive capability is important in a changing environment and allows the system to respond to the state of the sensors in real time.
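The weighting of overlapping sensor readings can be sketched with a scalar Kalman measurement update applied to a quantity assumed constant within the overlapping time window. This is a deliberately simplified stand-in for the preset Kalman filter: the per-sensor noise variances, the sample readings and the function names are assumptions. For a static state, the sequential updates reduce to an inverse-variance weighted average, which the second function computes directly for comparison.

```python
def kalman_fuse(measurements):
    """Fuse (value, noise_variance) readings with sequential scalar Kalman updates.

    The state is assumed constant inside the overlapping time window, so no
    prediction step is applied between updates.
    """
    estimate, variance = measurements[0]
    for value, noise_variance in measurements[1:]:
        gain = variance / (variance + noise_variance)  # Kalman gain
        estimate = estimate + gain * (value - estimate)
        variance = (1.0 - gain) * variance
    return estimate, variance


def inverse_variance_mean(measurements):
    """Weighted average in which each weight is 1 / noise_variance."""
    weights = [1.0 / noise_variance for _, noise_variance in measurements]
    total = sum(w * value for w, (value, _) in zip(weights, measurements))
    return total / sum(weights)


# Hypothetical temperature readings (°C) from three sensors in one window,
# each paired with an assumed measurement noise variance.
window = [(22.0, 0.25), (22.6, 1.0), (21.8, 0.5)]

fused, fused_variance = kalman_fuse(window)
print(round(fused, 4), round(fused_variance, 4))
```

The sensor with the smallest noise variance receives the largest weight, and the fused variance is lower than that of any single sensor, which matches the accuracy benefit described for overlapping monitoring.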
In this embodiment, the execution module further includes:
The calculating unit is used for constructing seasonal characteristics of the building terminal to energy consumption based on preset virtual variables, and calculating ratio characteristics and difference characteristics among the seasonal characteristics, wherein the virtual variables specifically comprise spring, summer, autumn and winter, the ratio characteristics specifically comprise the ratio of energy consumption to temperature, and the difference characteristics specifically comprise the difference between current load and past load;
a judging unit configured to judge whether or not there is a missing value of the ratio feature and the difference feature;
And the execution unit is used for carrying out standardization processing on the ratio characteristic and the difference characteristic if not, ensuring that the ratio characteristic and the difference characteristic are always on the same scale, combining the ratio characteristic and the difference characteristic with other characteristics to form a composite characteristic, and capturing the energy consumption influence corresponding to seasonal change according to the composite characteristic.
In this embodiment, the system constructs seasonal features of the building terminal with respect to energy consumption based on preset virtual variables (specifically including spring, summer, autumn and winter), calculates the ratio features and difference features among the seasonal features (the ratio features are specifically the ratio of energy consumption to temperature, and the difference features are specifically the difference between the current load and the past load), and then judges whether the ratio features and the difference features have missing values in order to execute the corresponding steps. For example, when the system judges that the ratio features and the difference features have missing values, the system considers that the features may not accurately reflect the seasonal variation of energy consumption, which may affect subsequent analysis and prediction. The system records the specific positions (time points and features) of the missing values and calculates the missing value ratio. It also resamples the data and calculates representative data within a selected time period to ensure the timeliness and continuity of the data, and it marks missing values judged to be unreasonable as "missing" or "abnormal" so that they can be removed or handled separately in subsequent processing. For example, when the system judges that the ratio features and the difference features have no missing values, the system considers that the features are complete. The system standardizes the ratio features and the difference features so that they are always on the same scale, combines them with other features to form composite features, and captures the energy consumption influence corresponding to seasonal change according to the composite features. Standardizing the ratio features and the difference features ensures that all features are compared on the same scale and prevents features with a larger numerical range from dominating the others, which makes the model more balanced and improves prediction accuracy. For example, the ratio of energy consumption to temperature may be relatively small while the energy consumption difference may be large; after standardization both lie in the same range, which avoids imbalance among the features. By combining different ratio features and difference features, the composite features capture more complex energy consumption patterns. The composite features reveal deeper correlations between seasonal variation and energy consumption fluctuation, so that the model does not depend on a single feature but analyzes the driving factors of energy consumption change from multiple dimensions, which improves prediction accuracy. Combining the ratio features and the difference features accurately captures the trend of energy consumption in different seasons, such as load changes in winter and possible energy consumption drivers in spring and summer. By analyzing the composite features, the system can identify the specific energy consumption patterns of winter or summer and optimize load prediction accordingly. The composite features also give the prediction model richer information and make the prediction results more interpretable: decision makers can determine the specific influence of each season on energy consumption, better understand the sources of energy consumption fluctuation in actual operation, and take corresponding measures.
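The construction of seasonal virtual variables, ratio features, difference features and their standardization can be sketched as follows. The z-score is assumed as the standardization method, and the sample values, field names and function names are illustrative only. The ratio also assumes a non-zero temperature.

```python
from statistics import mean, pstdev

SEASONS = ("spring", "summer", "autumn", "winter")


def season_dummies(season):
    """Virtual (dummy) variables: 1 for the sample's season, 0 otherwise."""
    return {name: int(name == season) for name in SEASONS}


def zscore(values):
    """Standardize values to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


# Hypothetical samples: energy and past load in kWh, temperature in °C.
samples = [
    {"season": "winter", "energy": 420.0, "temperature": 5.0, "past_load": 400.0},
    {"season": "spring", "energy": 300.0, "temperature": 15.0, "past_load": 310.0},
    {"season": "summer", "energy": 380.0, "temperature": 30.0, "past_load": 350.0},
    {"season": "autumn", "energy": 310.0, "temperature": 18.0, "past_load": 320.0},
]

# Ratio feature: energy consumption / temperature (temperature must not be 0).
ratio = [s["energy"] / s["temperature"] for s in samples]
# Difference feature: current load minus past load.
difference = [s["energy"] - s["past_load"] for s in samples]

ratio_z, difference_z = zscore(ratio), zscore(difference)

# Composite feature row: four season dummies + standardized ratio + difference.
composite = [
    list(season_dummies(s["season"]).values()) + [r, d]
    for s, r, d in zip(samples, ratio_z, difference_z)
]
print(composite[0])
```

After standardization the ratio and difference columns share the same scale, so neither dominates the composite feature purely because of its numerical range.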
In this embodiment, the second execution module further includes:
The drawing unit is used for drawing, on a chart preset by the building terminal for the energy consumption, the X-axis as the energy consumption predicted value and the Y-axis as the energy consumption residual, and combining them to obtain a scatter plot;
the second judging unit is used for judging whether the residuals on the scatter plot are randomly distributed;
and the second execution unit is used for, if so, inputting the scatter plot to the baseline model, and acquiring fluctuation information of the baseline model on different data subsets through cross-validation of the baseline model.
In this embodiment, the system draws, on a chart preset by the building terminal for the energy consumption, the X-axis as the energy consumption predicted value and the Y-axis as the energy consumption residual, combines them to obtain a scatter plot, and then judges whether the residuals on the scatter plot are randomly distributed in order to execute the corresponding steps. For example, when the system determines that the residuals on the scatter plot are not randomly distributed, the system considers that the baseline model may have a systematic deviation and has not accurately captured the real relationship between the energy consumption and the related features. The system checks whether the features used by the model cover all key factors that may affect energy consumption, in particular whether weather, economic activity, the running state of building equipment and other external factors are fully reflected. It enhances the model's ability to recognize energy consumption patterns by introducing more virtual variables, adjusting how features are constructed or considering interaction features, and then regenerates the scatter plot and checks whether the residual distribution is random, to ensure that the model better reflects actual energy consumption fluctuations. For example, when the system determines that the residuals on the scatter plot are randomly distributed, the system considers that the baseline model accurately captures the real relationship between the energy consumption and the related features. The system inputs the scatter plot to the baseline model, cross-validates the baseline model and acquires the fluctuation information of the baseline model on different data subsets. Cross-validation on different data subsets better reflects the stability and consistency of the model, and comparing performance indices (for example, the mean square error and the coefficient of determination) across the subsets gives a comprehensive picture of the model's prediction capability. The hyperparameters of the model can be adjusted during cross-validation to find the optimal model configuration, which improves the prediction accuracy and robustness of the model. During cross-validation, the influence of each feature on the prediction result can also be analyzed, which helps to identify the features that remain important under different conditions and provides a basis for subsequent feature selection and model improvement.
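The residual scatter points and the cross-validation of the baseline model can be sketched as follows. A one-feature least squares line is assumed as the baseline, the count of residual sign changes is only a crude stand-in for a proper randomness test, and the data, fold count and function names are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def residual_points(xs, ys):
    """(predicted value, residual) pairs, sorted by predicted value."""
    slope, intercept = fit_line(xs, ys)
    predictions = [slope * x + intercept for x in xs]
    return sorted((p, y - p) for p, y in zip(predictions, ys))


def sign_changes(points):
    """Number of times consecutive residuals switch sign (rough randomness hint)."""
    signs = [residual > 0 for _, residual in points if residual != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def k_fold_mse(xs, ys, k):
    """Mean square error on each of k interleaved held-out subsets."""
    n = len(xs)
    scores = []
    for fold in range(k):
        held_out = list(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return scores


# Hypothetical feature (temperature, °C) and load (kWh).
temperature = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 32.0, 34.0]
load = [205.0, 218.0, 242.0, 259.0, 282.0, 300.0, 318.0, 341.0]

points = residual_points(temperature, load)
fold_mse = k_fold_mse(temperature, load, k=4)
fluctuation = max(fold_mse) - min(fold_mse)  # spread across data subsets
print(sign_changes(points), [round(m, 2) for m in fold_mse], round(fluctuation, 2))
```

A small `fluctuation` suggests the model behaves consistently across subsets, while very few sign changes in the sorted residuals would hint at a systematic pattern that the baseline does not capture.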
In this embodiment, the judging module further includes:
The detection unit is used for detecting network delay information when the energy consumption data are transmitted based on the network bandwidth preset by the building terminal;
a third judging unit, configured to judge whether the network delay information exceeds a preset delay period;
and the third execution unit is used for adding a time stamp label in the data acquisition link and the data display link of the energy consumption data if yes, and calculating the overall delay of the energy consumption data according to the time stamp label.
In this embodiment, the system detects the network delay information when the energy consumption data is transmitted, based on the network bandwidth preset by the building terminal, and then judges whether the network delay information exceeds a preset delay period in order to execute the corresponding steps. For example, when the system judges that the network delay information does not exceed the preset delay period, it considers that the network connection is normal and the data transmission is smooth, so the real-time energy consumption condition in the building can be reflected in time. The system keeps the current data transmission state so that energy consumption data continues to be uploaded to the building terminal, records the current delay information, and updates the system monitoring panel so that the data transmission state is monitored in real time. It also performs a data integrity check to confirm that no data was lost or damaged during transmission, and generates a real-time feedback report that shows managers the state of the current energy consumption data and the network transmission condition, so that all relevant personnel know the operating condition of the system. Conversely, when the system judges that the network delay information exceeds the preset delay period, it considers that the data transmission is abnormal and that the real-time energy consumption condition in the building cannot be reflected in time. The system adds timestamp labels in the data acquisition link and the data display link of the energy consumption data, and calculates the overall delay of the energy consumption data from the different timestamp labels. Through the timestamp labels, the system can accurately record the time of each link and analyze the delay of each link, so that the reason the delay increases under specific conditions can be understood, which facilitates subsequent optimization of the network configuration. When network delay occurs, the system can quickly locate the specific link in which the delay arises (such as data acquisition or data transmission), so that technicians can troubleshoot and repair in a targeted manner. The adaptive calculation of the overall delay also helps the system adjust its data processing strategy, for example by reducing the data transmission frequency or selecting a more efficient data compression mode when the network condition is poor.
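As an illustration only, the timestamp-label approach described above could be sketched as follows. The sketch assumes three labels per reading: one at acquisition, one at display, and an intermediate `t_sent` label that is not named in the embodiment and is added here so that the acquisition and transmission links can be separated. All names, and the policy of doubling the upload interval when the overall delay exceeds the preset delay period, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One energy reading with a timestamp label per link (epoch seconds)."""
    value_kwh: float
    t_acquired: float   # label added in the data acquisition link
    t_sent: float       # assumed intermediate label, added when upload starts
    t_displayed: float  # label added in the data display link


def stage_delays(r: Reading) -> dict:
    """Split the overall delay into per-link components."""
    return {
        "acquisition": r.t_sent - r.t_acquired,
        "transmission": r.t_displayed - r.t_sent,
        "overall": r.t_displayed - r.t_acquired,
    }


def slowest_stage(r: Reading) -> str:
    """Name the link that contributes the largest share of the delay."""
    delays = stage_delays(r)
    return max(("acquisition", "transmission"), key=delays.get)


def next_upload_interval(overall_delay: float,
                         preset_delay: float,
                         base_interval: float) -> float:
    """Hypothetical policy: halve the upload frequency when the overall
    delay exceeds the preset delay period, otherwise keep it unchanged."""
    if overall_delay > preset_delay:
        return base_interval * 2
    return base_interval
```

With a reading acquired at 100.0 s, sent at 100.5 s and displayed at 103.0 s, the sketch attributes most of the 3.0 s overall delay to the transmission link, which is the kind of localization the embodiment describes.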
In this embodiment, the second judging module further includes:
a generating unit, configured to calculate the proportion of feature missing values based on the pre-statistical features of the effective feature set, obtain the distribution of the feature missing values according to the proportion, and generate a corresponding missing value heat map;
a fourth judging unit, configured to judge whether the missing value heat map matches a preset random-missing pattern;
and a fourth execution unit, configured to, if not, identify the association information between the feature missing values and the pre-statistical features, and acquire the corresponding missing source according to the association information.
In this embodiment, the system calculates the proportion of feature missing values based on the pre-statistical features of the effective feature set, obtains the distribution of the feature missing values according to the proportions, and generates a corresponding missing value heat map. The system then judges whether the missing value heat map matches a preset random-missing pattern and executes the corresponding steps. For example, when the system determines that the heat map matches the preset random-missing pattern, it considers that the distribution of missing values is not affected by systematic factors: the missing values arise randomly rather than from a specific cause. The system records the heat map and its matching result and generates an analysis report so that the team understands the state of the missing data and its influence on the analysis. Because the missing values are random, the system can continue to evaluate and verify the stability and accuracy of the model without major adjustments to the data; it nevertheless continues to monitor missingness in subsequent data to ensure that data quality keeps meeting the expected standard. Conversely, when the system determines that the heat map does not match the preset random-missing pattern, it considers that the distribution of missing values is affected by systematic factors. The system identifies the association information between the feature missing values and the pre-statistical features and acquires the corresponding missing source according to that association information. By analyzing how the missing values are associated with other features, the system can identify potential causes of the missingness, such as sensor faults, data acquisition errors or external environmental factors, which helps solve the problem in a targeted manner. Identifying non-random missing values also helps a model designer understand the limitations of the data set and take them into account when constructing the model, enhancing the robustness of the model to missing data. In addition, identifying the sources of missing features provides building managers with important information, helping them consider the potential influence of missing data on energy consumption analysis and prediction when making decisions.
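The missing-value analysis described above can be sketched roughly as follows. This is a minimal illustration, not the method of the embodiment: it assumes each record is a dictionary in which a missing value is stored as `None`, it builds the 0/1 grid that a heat map would render, and it uses a plain Pearson correlation between a "value is missing" indicator and another feature as a stand-in for the association information. The function names are hypothetical, and a simple correlation is not a formal test of random missingness.

```python
import math


def missing_ratio(rows, feature):
    """Proportion of records in which `feature` is missing (None)."""
    return sum(1 for r in rows if r[feature] is None) / len(rows)


def missing_matrix(rows, features):
    """0/1 grid (records x features): the data behind a missing value heat map."""
    return [[1 if r[f] is None else 0 for f in features] for r in rows]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def missingness_association(rows, target, other):
    """Correlate 'target is missing' with the observed values of `other`.
    A value near 0 is compatible with random missingness; a large absolute
    value suggests the gaps depend on `other` (for example a sensor that
    drops out at high temperature)."""
    pairs = [(1.0 if r[target] is None else 0.0, r[other])
             for r in rows if r[other] is not None]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Illustrative records: load readings disappear at high temperature.
rows = [
    {"temp": 20, "load": 5.0}, {"temp": 22, "load": 5.5},
    {"temp": 24, "load": 6.0}, {"temp": 36, "load": None},
    {"temp": 38, "load": None}, {"temp": 40, "load": None},
]
```

In this made-up data set half of the load values are missing and the missingness is strongly associated with temperature, so the gaps would be treated as non-random and traced to a source rather than ignored.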
In this embodiment, the unified module further includes:
an identification unit, configured to detect the original units in the energy consumption data and identify the different units to be converted based on the original units;
a fifth judging unit, configured to judge whether the different units can be uniformly converted into the data unit type;
and a fifth execution unit, configured to, if not, mark the inconsistent items of the different units, perform unit conversion row by row on the corresponding data of the different units according to the inconsistent items, and compare the data distribution and statistical characteristics before and after conversion.
In this embodiment, the system detects the original units in the energy consumption data, identifies the different units to be converted based on each original unit, and then judges whether the different units can be uniformly converted into the data unit type in order to execute the corresponding steps. For example, when the system judges that the different units to be converted can be uniformly converted into the data unit type, it considers that the units have the same dimension and that conversion can be performed through a standard conversion formula or ratio. The system converts the data item by item according to the defined conversion rules, updates the data set with the converted data, and records both the original units and the converted units for tracing and auditing. Once the data has been unified into the standard unit type, further data analysis, modeling and visualization can be performed. Conversely, when the system judges that the units to be converted cannot be uniformly converted into the data unit type, it marks the inconsistent items of the different units and converts the corresponding data row by row according to those inconsistent items. Converting row by row avoids errors or deviations that might be introduced by global processing, especially when the data involve different dimensions, and ensures that each data item is handled according to its own unit rather than being mis-converted or discarded. By comparing the data distribution and statistical characteristics before and after conversion, the method ensures that meaningful information is retained and that a high level of data integrity is maintained in a complex data environment. Marking the inconsistent items of the different units ensures that they can be traced in subsequent analysis, prevents data in a wrong unit from being confused or misused, and allows problems to be located and corrected quickly when they occur.
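The row-by-row conversion and inconsistency marking described above might look like the following sketch. It assumes, purely for illustration, that energy readings are unified into kWh, that each record is a `(value, unit)` pair, and that a unit with no known conversion factor (here a gas volume in `m3`) is an inconsistent item to be marked rather than converted. The table `TO_KWH` and the function names are hypothetical.

```python
import statistics

# Assumed conversion factors into the target unit (kWh).
TO_KWH = {
    "kWh": 1.0,
    "Wh": 0.001,      # 1 Wh = 0.001 kWh
    "MJ": 1 / 3.6,    # 1 kWh = 3.6 MJ
}


def convert_rows(rows):
    """Convert each (value, unit) pair on its own.

    Returns the converted values and a list of marked inconsistent items
    as (row index, value, unit) so they stay traceable."""
    converted, flagged = [], []
    for index, (value, unit) in enumerate(rows):
        factor = TO_KWH.get(unit)
        if factor is None:
            flagged.append((index, value, unit))
        else:
            converted.append(value * factor)
    return converted, flagged


def summary(values):
    """Basic statistics used to compare data before and after conversion."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.pstdev(values),
    }


raw = [(1.0, "kWh"), (500.0, "Wh"), (3.6, "MJ"), (2.0, "m3")]
converted, flagged = convert_rows(raw)
before = summary([value for value, _ in raw])
after = summary(converted)
```

Comparing `before` and `after` makes the effect of the conversion visible: the raw mean is dominated by the 500 Wh reading, while the converted values sit on a common scale, and the `m3` row is kept in `flagged` instead of being silently dropped.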
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. An energy-saving optimization control method based on a load curve, characterized by comprising the following steps:
unifying the data format of energy consumption data into a data unit type preset by a building terminal, wherein the data unit type specifically includes temperature, power and timestamp;
judging whether the energy consumption data can be presented to the building terminal in real time;
if not, collecting corresponding multi-source data from different data sources through sensors preset by the building terminal, identifying the time series of the multi-source data, applying preset data fusion according to the time series to integrate the multi-source data into fused data, extracting corresponding feature parameters from the fused data, and constructing an effective feature set according to the feature parameters, wherein the time series specifically includes trend changes and periodic changes, and the feature parameters specifically include a weather impact index, an economic activity level and historical load data;
judging whether the effective feature set requires preset data cleaning;
if not, using a preset correlation coefficient to obtain the linear relationship between each feature in the effective feature set and the load variable, establishing a corresponding baseline model on the effective feature set, calculating the mean square error and the coefficient of determination of the baseline model, constructing a load prediction result of the building terminal through the baseline model, and comparing the load prediction result with the actual load value to generate a load line chart.

2. The energy-saving optimization control method based on a load curve according to claim 1, characterized in that, before the step of collecting corresponding multi-source data from different data sources through the sensors preset by the building terminal, the method further comprises:
dividing, through the building terminal, the collection areas covered by the sensors in the building based on the preset working range of the sensors;
judging whether a preset overlapping monitoring edge can be detected in the collection areas;
if so, constructing an overlapping time window according to the overlapping monitoring edge, identifying the measurement values collected by multiple sensors within the overlapping time window, generating the data features to be fused according to the collected measurement values, applying a preset Kalman filter to assign weights to the collected measurement values, and calculating the weighted average of the data features, wherein the data features specifically include temperature, humidity and energy consumption.

3. The energy-saving optimization control method based on a load curve according to claim 1, characterized in that the step of constructing an effective feature set according to the feature parameters further comprises:
constructing seasonal features of the energy consumption of the building terminal based on preset dummy variables, and calculating ratio features and difference features between the seasonal features, wherein the dummy variables specifically include spring, summer, autumn and winter, the ratio feature is specifically the ratio of energy consumption to temperature, and the difference feature is specifically the difference between the current load and the past load;
judging whether the ratio features and the difference features contain missing values;
if not, standardizing the ratio features and the difference features so that they are always on the same scale, combining the ratio features and the difference features with other features to form composite features, and capturing the energy consumption impact corresponding to seasonal changes according to the composite features.

4. The energy-saving optimization control method based on a load curve according to claim 1, characterized in that the step of establishing a corresponding baseline model on the effective feature set further comprises:
based on a chart preset by the building terminal for energy consumption, plotting the energy consumption prediction value on the X-axis of the chart and the energy consumption residual on the Y-axis of the chart, and combining them to obtain an energy consumption scatter plot;
judging whether the residuals on the energy consumption scatter plot are randomly distributed;
if so, inputting the energy consumption scatter plot into the baseline model, performing cross-validation through the baseline model, and obtaining fluctuation information of the baseline model on different data subsets.

5. The energy-saving optimization control method based on a load curve according to claim 1, characterized in that the step of judging whether the energy consumption data can be presented to the building terminal in real time further comprises:
detecting the network delay information when the energy consumption data is transmitted, based on the network bandwidth preset by the building terminal;
judging whether the network delay information exceeds a preset delay period;
if so, adding timestamp labels in the data acquisition link and the data display link of the energy consumption data, and calculating the overall delay of the energy consumption data according to the timestamp labels.

6. The energy-saving optimization control method based on a load curve according to claim 1, characterized in that the step of judging whether the effective feature set requires preset data cleaning further comprises:
calculating the proportion of feature missing values based on the pre-statistical features of the effective feature set, obtaining the distribution of the feature missing values according to the proportion, and generating a corresponding missing value heat map;
judging whether the missing value heat map matches a preset random-missing pattern;
if not, identifying the association information between the feature missing values and the pre-statistical features, and acquiring the corresponding missing source according to the association information.

7. The energy-saving optimization control method based on a load curve according to claim 1, characterized in that the step of unifying the data format of the energy consumption data into the data unit type preset by the building terminal further comprises:
detecting the original units in the energy consumption data, and identifying the different units to be converted based on the original units;
judging whether the different units can be uniformly converted into the data unit type;
if not, marking the inconsistent items of the different units, performing unit conversion row by row on the corresponding data of the different units according to the inconsistent items, and comparing the data distribution and statistical characteristics before and after conversion.

8. An energy-saving optimization control system based on a load curve, characterized by comprising:
a unifying module, configured to unify the data format of energy consumption data into a data unit type preset by a building terminal, wherein the data unit type specifically includes temperature, power and timestamp;
a judging module, configured to judge whether the energy consumption data can be presented to the building terminal in real time;
an execution module, configured to, if not, collect corresponding multi-source data from different data sources through sensors preset by the building terminal, identify the time series of the multi-source data, apply preset data fusion according to the time series to integrate the multi-source data into fused data, extract corresponding feature parameters from the fused data, and construct an effective feature set according to the feature parameters, wherein the time series specifically includes trend changes and periodic changes, and the feature parameters specifically include a weather impact index, an economic activity level and historical load data;
a second judging module, configured to judge whether the effective feature set requires preset data cleaning;
a second execution module, configured to, if not, use a preset correlation coefficient to obtain the linear relationship between each feature in the effective feature set and the load variable, establish a corresponding baseline model on the effective feature set, calculate the mean square error and the coefficient of determination of the baseline model, construct a load prediction result of the building terminal through the baseline model, and compare the load prediction result with the actual load value to generate a load line chart.

9. The energy-saving optimization control system based on a load curve according to claim 8, characterized by further comprising:
a dividing module, configured to divide, through the building terminal, the collection areas covered by the sensors in the building based on the preset working range of the sensors;
a third judging module, configured to judge whether a preset overlapping monitoring edge can be detected in the collection areas;
a third execution module, configured to, if so, construct an overlapping time window according to the overlapping monitoring edge, identify the measurement values collected by multiple sensors within the overlapping time window, generate the data features to be fused according to the collected measurement values, apply a preset Kalman filter to assign weights to the collected measurement values, and calculate the weighted average of the data features, wherein the data features specifically include temperature, humidity and energy consumption.

10. The energy-saving optimization control system based on a load curve according to claim 8, characterized in that the execution module further comprises:
a calculation unit, configured to construct seasonal features of the energy consumption of the building terminal based on preset dummy variables, and calculate ratio features and difference features between the seasonal features, wherein the dummy variables specifically include spring, summer, autumn and winter, the ratio feature is specifically the ratio of energy consumption to temperature, and the difference feature is specifically the difference between the current load and the past load;
a judging unit, configured to judge whether the ratio features and the difference features contain missing values;
an execution unit, configured to, if not, standardize the ratio features and the difference features so that they are always on the same scale, combine the ratio features and the difference features with other features to form composite features, and capture the energy consumption impact corresponding to seasonal changes according to the composite features.
CN202411576832.8A 2024-11-06 2024-11-06 Energy-saving optimization control method and system based on load curve Pending CN119090096A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411576832.8A CN119090096A (en) 2024-11-06 2024-11-06 Energy-saving optimization control method and system based on load curve

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411576832.8A CN119090096A (en) 2024-11-06 2024-11-06 Energy-saving optimization control method and system based on load curve

Publications (1)

Publication Number Publication Date
CN119090096A true CN119090096A (en) 2024-12-06

Family

ID=93665267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411576832.8A Pending CN119090096A (en) 2024-11-06 2024-11-06 Energy-saving optimization control method and system based on load curve

Country Status (1)

Country Link
CN (1) CN119090096A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120146635A (en) * 2025-05-16 2025-06-13 山东和光智慧能源科技有限公司 An intelligent management platform for energy consumption monitoring
CN120315353A (en) * 2025-05-21 2025-07-15 志峰(北京)环境科技集团有限公司 Distributed sewage treatment intelligent control system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109118A (en) * 2019-05-31 2019-08-09 东北林业大学 A kind of prediction technique of Forest Canopy biomass
CN117689118A (en) * 2024-02-01 2024-03-12 深圳柯赛标识智能科技有限公司 Intelligent identification energy-saving control management method, system and equipment
CN118868404A (en) * 2024-07-17 2024-10-29 深圳市博尔特科技发展有限公司 A smart electricity monitoring system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘召华: "物联网数据融合技术浅析", 《知识文库》, no. 11, 8 June 2016 (2016-06-08), pages 1 *

Similar Documents

Publication Publication Date Title
CN119090096A (en) Energy-saving optimization control method and system based on load curve
CN119834735B (en) A remote fault diagnosis method for photovoltaic power generation equipment
CN119541170B (en) Drought disaster prediction and early warning method and system for rural water supply engineering
CN118780642A (en) A method and system for predicting power of photovoltaic power generation
CN117996966B (en) Intelligent management method and system for power screen cabinet based on optimization algorithm
CN118671267B (en) Building operation carbon emission metering monitoring management method and system
CN117951192A (en) A method and system for managing carbon emission monitoring data based on data mining
CN116308958A (en) Carbon emission online detection and early warning system and method based on mobile terminal
CN118606680B (en) An intelligent monitoring and assessment method for environmental chemical pollutants
CN117808366A (en) Digital model system for black land protection utilization and safety productivity evaluation
CN120044302A (en) Intelligent electric energy meter capable of monitoring and reporting operation errors
CN118911977A (en) Water pump operation efficiency testing method, tester, equipment and medium
CN119128768A (en) An intelligent environmental monitoring method and system based on big data
CN119089249A (en) Online household change relationship identification method and system based on multi-source data fusion
CN120632323A (en) A method and system for detecting anomaly in power equipment data based on LSTM-COF
CN118669281A (en) Online state evaluation method, device and equipment of wind motor and storage medium
CN118780820B (en) Energy management method for green power tracing
CN119309613B (en) Intelligent detection data calibration system driven by enhanced algorithm
CN118760927B (en) Industrial aggregation composite pollutant identification system based on big data
CN120146395A (en) Auditing methods and systems for natural resources
CN110455370B (en) Flood-prevention drought-resisting remote measuring display system
CN118921386A (en) Intelligent gateway equipment is checked to carbon
CN119945877A (en) A data center energy efficiency monitoring method and system based on digital twin
CN119171625A (en) A microgrid edge control system
CN117220417A (en) Dynamic monitoring method and system for consumer-end electric load

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20241206