CN117170979B - Energy consumption data processing method, system, equipment and medium for large-scale equipment - Google Patents


Info

Publication number
CN117170979B
Authority
CN
China
Prior art keywords
energy consumption
consumption data
data
determining
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311396337.4A
Other languages
Chinese (zh)
Other versions
CN117170979A (en)
Inventor
李孔政
王晓明
陈晓丰
黄嘉荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Baxtrand Technology Co ltd
Original Assignee
Guangdong Baxtrand Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Baxtrand Technology Co ltd
Priority to CN202311396337.4A
Publication of CN117170979A
Application granted
Publication of CN117170979B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of data processing, and in particular to an energy consumption data processing method, system, equipment and medium for large-scale equipment. The method comprises the following steps: collecting energy consumption data of large-scale equipment using a stream processing technology; determining the energy consumption flow direction of each device in the large-scale equipment, and calculating the energy consumption cost corresponding to each device; establishing a reference model for each equipment type based on a Gaussian mixture model, and determining abnormal data in the energy consumption data according to the reference model; and carrying out optimization analysis on the energy consumption data of each device in real time based on a genetic algorithm to obtain an energy consumption optimization scheme for each device. By combining a stream processing technology, a Gaussian mixture model and a genetic algorithm, the invention realizes real-time acquisition, flow direction analysis, anomaly detection and optimization of energy consumption data.

Description

Energy consumption data processing method, system, equipment and medium for large-scale equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to an energy consumption data processing method, system, device, and medium for a large-scale device.
Background
As Internet of Things devices become widely deployed, statistical analysis of energy consumption is important for saving energy and optimizing resource utilization. Energy consumption statistics cover a wide range, including the collection and analysis of data on gas, electricity, water and other consumption.
Because energy consumption data is typically generated on a large scale and in real time, processing it requires efficient algorithms and sufficient computing power to ensure timely analysis and response. Current energy consumption analysis, however, provides only an overall picture of energy consumption, which limits the breadth and depth of the information available to users when making decisions. Meanwhile, the energy consumption data may contain abnormal data, but traditional abnormal data identification methods are generally based on fixed rules or thresholds, capture only obvious abnormal conditions, and cannot effectively handle complex multi-modal energy consumption data distributions.
Disclosure of Invention
The invention aims to provide an energy consumption data processing method, system, equipment and medium for large-scale equipment, which realize real-time acquisition, flow direction analysis, anomaly detection and optimization of energy consumption data by combining a stream processing technology, a Gaussian mixture model and a genetic algorithm, so as to solve at least one of the problems in the prior art.
The invention provides an energy consumption data processing method of large-scale equipment, which specifically comprises the following steps:
collecting energy consumption data of large-scale equipment according to a stream processing technology;
determining the energy consumption flow direction of each device in the large-scale devices, and calculating the energy consumption cost corresponding to each device;
establishing a reference model for each equipment type based on a Gaussian mixture model, and determining abnormal data in the energy consumption data according to the reference model;
and carrying out optimization analysis on the energy consumption data of each device based on the genetic algorithm in real time to obtain an energy consumption optimization scheme of each device.
Further, the collecting the energy consumption data of the large-scale equipment according to the stream processing technology specifically comprises:
setting up Apache Kafka, and configuring a Kafka producer for acquiring energy consumption data of large-scale equipment in real time;
configuring a Kafka topic for each equipment type, packaging the energy consumption data of each equipment type into a Kafka producer record by means of the Kafka producer, and sending the record to the corresponding Kafka topic;
and configuring a Spark Streaming stream processing engine, and acquiring and processing the energy consumption data of the Kafka topics in real time by means of the Spark Streaming stream processing engine.
Further, the establishing a reference model for each equipment type based on the gaussian mixture model, and determining abnormal data in the energy consumption data according to the reference model specifically includes:
determining an energy consumption dataset for each device type;
according to the energy consumption data set of each equipment type, comparing the AIC value and the BIC value of the Gaussian mixture model corresponding to each equipment type under different Gaussian component numbers to obtain a reference model corresponding to each equipment type;
and carrying out anomaly identification on the energy consumption data of each device according to the reference model, and determining the energy consumption anomaly data in the energy consumption data.
Further, the comparing the AIC value and the BIC value of the gaussian mixture model corresponding to each equipment type under different gaussian component numbers to obtain a reference model corresponding to each equipment type specifically includes:
determining initial parameters of a Gaussian mixture model corresponding to each equipment type, wherein the initial parameters comprise a mean value, a covariance matrix and a mixing coefficient;
calculating the responsivity of the energy consumption dataset on each Gaussian distribution based on a probability density function;
updating the mean, the covariance matrix and the mixing coefficients according to the responsivity of the energy consumption dataset on each Gaussian distribution;
repeating the calculation of the responsivity and the updating of the mean value, the covariance matrix and the mixing coefficient until the likelihood function value of the Gaussian mixture model is maximized;
and determining an optimal Gaussian mixture model corresponding to each equipment type by comparing the AIC value and the BIC value of the Gaussian mixture model under different Gaussian component quantities, and taking the optimal Gaussian mixture model as a reference model.
Further, the calculating the responsivity of the energy consumption dataset on each gaussian distribution based on the probability density function specifically includes:
calculating probability density of the energy consumption dataset on each Gaussian distribution according to the mean value, the covariance matrix and the mixing coefficient based on a probability density function;
normalizing the probability density of the energy consumption data set on each Gaussian distribution, and then summing to obtain the sum of the probability densities of the energy consumption data set under all Gaussian distributions;
based on the Bayesian theorem, calculating the probability that the energy consumption data set belongs to any Gaussian distribution according to the sum of probability densities of the energy consumption data set under all Gaussian distributions, and obtaining the responsivity of the energy consumption data set under each Gaussian distribution.
Further, the performing anomaly recognition on the energy consumption data of each device according to the reference model, and determining the energy consumption anomaly data in the energy consumption data specifically includes:
dividing an energy consumption data set into a normal data set and an abnormal data set, and setting an initial probability threshold according to the frequency between the normal data set and the abnormal data set;
classifying a new energy consumption data set according to the initial probability threshold based on a decision tree algorithm, iteratively adjusting the initial probability threshold by evaluating the false alarm rate and the missed alarm rate, and determining a respective optimized probability threshold for each Gaussian distribution;
calculating the responsivity of the energy consumption dataset under each Gaussian distribution according to the reference model;
and determining whether energy consumption abnormal data exist in the energy consumption data of each device by comparing the responsivity of the energy consumption data set under each Gaussian distribution with an optimization probability threshold corresponding to each Gaussian distribution.
Further, the carrying out optimization analysis on the energy consumption data of each device in real time based on the genetic algorithm to obtain an energy consumption optimization scheme of each device specifically comprises:
setting gene codes according to the energy consumption data and the equipment parameters of each equipment;
taking the reduction of the energy consumption of the equipment and the improvement of the energy utilization efficiency as optimization targets and taking the maximum energy consumption limit of each equipment as a constraint condition;
determining a difference between the optimization target value and the actual value for each device;
setting a fitness function according to the gene codes, the optimization targets, the constraint conditions and the difference between the optimization target value and the actual value of each device;
and determining a plurality of energy consumption operation modes of each device according to the genetic code based on a genetic algorithm, calculating the fitness value of each energy consumption operation mode of each device according to the fitness function, and determining the energy consumption operation mode with the highest fitness value as the optimal energy consumption operation mode.
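The genetic-algorithm steps above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the gene code is a list of power settings (kW) for four time slots, `PRICES`, `MAX_KW`, `TARGET_KWH` and `PENALTY` are hypothetical values, and the fitness function simply combines energy cost with the gap between the target and actual delivered energy. The maximum energy consumption limit acts as the constraint.

```python
import random

PRICES = [0.2, 0.5, 0.9, 0.4]   # hypothetical price per kWh in each time slot
MAX_KW = 5.0                    # hypothetical maximum consumption limit per slot
TARGET_KWH = 12.0               # hypothetical target value the device must deliver
PENALTY = 10.0                  # hypothetical weight on the target/actual gap


def fitness(genes):
    """Higher is better: low cost and a small gap between target and actual."""
    if any(g < 0 or g > MAX_KW for g in genes):
        return float("-inf")    # constraint violated
    cost = sum(g * p for g, p in zip(genes, PRICES))
    gap = abs(sum(genes) - TARGET_KWH)
    return -(cost + PENALTY * gap)


def evolve(rng, pop_size=40, generations=150, mutation_rate=0.3):
    """Evolve operating modes; returns the best mode and the best fitness per generation."""
    population = [[rng.uniform(0, MAX_KW) for _ in PRICES] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]        # selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(PRICES))       # single-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:          # mutation, clipped to the limit
                i = rng.randrange(len(child))
                child[i] = min(MAX_KW, max(0.0, child[i] + rng.gauss(0, 0.5)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness), history


best_mode, history = evolve(random.Random(0))
```

Because the better half of each generation is carried over unchanged, the best fitness value never decreases from one generation to the next; the operating mode with the highest fitness at the end corresponds to the optimal energy consumption operation mode described above.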
The invention also provides an energy consumption data processing system of the large-scale equipment, which specifically comprises:
the energy consumption data acquisition module is used for acquiring energy consumption data of the large-scale equipment according to a stream processing technology;
the energy consumption flow direction analysis module is used for determining the energy consumption flow direction of each device in the large-scale device and calculating the energy consumption cost corresponding to each device;
the abnormal data identification module is used for establishing a reference model for each equipment type based on the Gaussian mixture model, and determining abnormal data in the energy consumption data according to the reference model;
the energy consumption optimization analysis module is used for carrying out optimization analysis on the energy consumption data of each device in real time based on the genetic algorithm to obtain an energy consumption optimization scheme of each device.
The present invention also provides a computer device comprising a memory, a processor, and a computer program stored on the memory which, when executed by the processor, implements the energy consumption data processing method for large-scale equipment according to any of the above methods.
The invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of energy consumption data processing for a large scale device as described in any of the above methods.
Compared with the prior art, the invention has at least one of the following technical effects:
1. Through the stream processing technology, energy consumption data is acquired and processed in real time, so that energy consumption analysis is more timely and accurate, and the method is suitable for energy consumption monitoring of large-scale equipment.
2. Using the Gaussian mixture model for anomaly detection captures the multi-modal nature of the data distribution, which suits energy consumption data containing several different consumption modes, so that various energy consumption anomalies can be effectively identified and detection accuracy is improved.
3. Based on the genetic algorithm, the optimal energy consumption operation mode can be found according to the equipment parameters and the optimization targets specific to each device.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for processing energy consumption data of large-scale equipment according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a system for processing energy consumption data of large-scale equipment according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted as "when" or "once" or "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining" or "in response to determining" or "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Referring to fig. 1, an embodiment of the present invention provides a method for processing energy consumption data of a large-scale device, where the method specifically includes:
S101: Collect energy consumption data of the large-scale equipment using a stream processing technology.
In some embodiments, the collecting energy consumption data of the large-scale device according to the stream processing technology specifically includes:
setting up Apache Kafka, and configuring a Kafka producer for acquiring energy consumption data of large-scale equipment in real time;
configuring a Kafka topic for each equipment type, packaging the energy consumption data of each equipment type into a Kafka producer record by means of the Kafka producer, and sending the record to the corresponding Kafka topic;
and configuring a Spark Streaming stream processing engine, and acquiring and processing the energy consumption data of the Kafka topics in real time by means of the Spark Streaming stream processing engine.
In this embodiment, an Apache Kafka cluster is first built and configured. Basic Kafka settings such as ports and data storage paths are configured by editing the properties files in the Kafka directory, and a ZooKeeper server is used to manage the cluster and its metadata.
In the producer code, the relevant parameters of the Kafka producer, such as the Kafka server address and the topic name, are configured. The Kafka producer is connected to the energy consumption data sources of the large-scale equipment to acquire energy consumption data in real time, and the acquired data are packaged into Kafka producer records, each of which usually contains a key and a value. In the Kafka cluster, a dedicated Kafka topic is configured for each device type; for example, one topic each may be created for air conditioners, lighting, and electrical appliances.
Next, Spark Streaming is configured to consume the energy consumption data in the Kafka topics. It connects to the Kafka cluster, obtains the data stream in real time, and processes and analyzes it in real time. A Spark Streaming application is written to consume data from the Kafka topics and perform processing operations such as data cleansing, aggregation, and calculation.
In the Spark Streaming application, a Spark Streaming context is initialized and the application name, batch interval, and similar settings are configured. The Kafka direct stream API provided by Spark Streaming is used to connect to the Kafka cluster, and the Kafka server addresses and topic names are configured so that data can be consumed. The Spark Streaming code defines the logic for processing each batch of the real-time data stream, in which the received data are processed, converted and analyzed. Finally, the start() method is called to start the Spark Streaming context and begin receiving and processing the real-time data stream.
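The routing of readings to per-device-type topics and the per-batch processing described above can be sketched without a running broker. The following is a broker-free illustration only: the topic names, the reading fields (`device_id`, `device_type`, `kwh`) and the helper functions are hypothetical, and in a real deployment the `(topic, key, value)` triples would be handed to a Kafka producer client and the batch logic would run inside a Spark Streaming job.

```python
import json
from collections import defaultdict

# Hypothetical mapping from device type to Kafka topic name.
TOPIC_BY_DEVICE_TYPE = {
    "air_conditioner": "energy.air_conditioner",
    "lighting": "energy.lighting",
    "appliance": "energy.appliance",
}


def to_record(reading):
    """Wrap one reading as a (topic, key, value) triple, mirroring a producer record."""
    topic = TOPIC_BY_DEVICE_TYPE[reading["device_type"]]
    key = reading["device_id"].encode("utf-8")
    value = json.dumps(reading, sort_keys=True).encode("utf-8")
    return topic, key, value


def process_batch(records):
    """Stand-in for one micro-batch: aggregate the energy consumed per topic."""
    totals = defaultdict(float)
    for topic, _key, value in records:
        totals[topic] += json.loads(value)["kwh"]
    return dict(totals)


readings = [
    {"device_id": "ac-1", "device_type": "air_conditioner", "kwh": 1.5},
    {"device_id": "ac-2", "device_type": "air_conditioner", "kwh": 2.0},
    {"device_id": "lamp-1", "device_type": "lighting", "kwh": 0.25},
]
records = [to_record(r) for r in readings]
batch_totals = process_batch(records)
```

Keying each record by device identifier is one possible design choice; in Kafka, records with the same key go to the same partition, which keeps the readings of one device in order.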
S102: Determine the energy consumption flow direction of each device in the large-scale equipment, and calculate the energy consumption cost corresponding to each device.
In this embodiment, information such as a device identifier, a time stamp, and an energy consumption value of each data point of the energy consumption data is acquired, the energy consumption data is grouped for each time window, the total energy consumption of each device is calculated according to the device identifier, the energy consumption flow direction is identified according to the energy consumption change between the devices, and the energy transfer path is determined.
Once the energy consumption flow direction of each device has been determined, the energy consumption cost of each device is calculated from the energy types the device uses (electricity, water, gas, etc.), the operating time of the device, its energy consumption rate, and the price of each energy type.
Meanwhile, the energy consumption flow of each device and the corresponding energy consumption cost can be presented through a visual graph, and a comprehensive and detailed energy consumption analysis result is provided for a user, so that the energy consumption of the device can be managed and optimized better, the cost is reduced, and the energy utilization efficiency is improved.
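As a rough illustration of the windowed grouping and cost calculation in this step, the sketch below sums the readings of each device inside one time window and multiplies them by a unit price. The field names, the `UNIT_PRICE` table and the sample readings are hypothetical; each `value` is assumed to already be the amount consumed (consumption rate multiplied by operating time) in the unit that the price refers to.

```python
from collections import defaultdict

# Hypothetical unit prices per energy type.
UNIT_PRICE = {"electricity": 0.5, "water": 2.0, "gas": 3.0}


def total_energy_by_device(readings, window_start, window_end):
    """Sum consumption per (device, energy type) inside [window_start, window_end)."""
    totals = defaultdict(float)
    for r in readings:
        if window_start <= r["timestamp"] < window_end:
            totals[(r["device_id"], r["energy_type"])] += r["value"]
    return dict(totals)


def cost_by_device(totals):
    """Convert per-device totals into a cost using the unit price of each energy type."""
    costs = defaultdict(float)
    for (device_id, energy_type), amount in totals.items():
        costs[device_id] += amount * UNIT_PRICE[energy_type]
    return dict(costs)


readings = [
    {"device_id": "boiler-1", "timestamp": 0, "energy_type": "gas", "value": 2.0},
    {"device_id": "boiler-1", "timestamp": 30, "energy_type": "water", "value": 1.5},
    {"device_id": "pump-1", "timestamp": 45, "energy_type": "electricity", "value": 4.0},
    # This reading falls outside the 0-60 window and is ignored.
    {"device_id": "pump-1", "timestamp": 90, "energy_type": "electricity", "value": 4.0},
]
totals = total_energy_by_device(readings, 0, 60)
costs = cost_by_device(totals)
```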
S103: Establish a reference model for each equipment type based on the Gaussian mixture model, and determine abnormal data in the energy consumption data according to the reference model.
In some embodiments, the establishing a reference model for each equipment type based on the gaussian mixture model, and determining abnormal data in the energy consumption data according to the reference model specifically includes:
determining an energy consumption dataset for each device type;
according to the energy consumption data set of each equipment type, comparing the AIC value and the BIC value of the Gaussian mixture model corresponding to each equipment type under different Gaussian component numbers to obtain a reference model corresponding to each equipment type;
and carrying out anomaly identification on the energy consumption data of each device according to the reference model, and determining the energy consumption anomaly data in the energy consumption data.
In this embodiment, Gaussian mixture models with different numbers of components are constructed for each device type, for example by gradually increasing the number of components from 1 up to a chosen upper bound. A model is trained for each component number, and its parameters, namely the means, covariance matrices, and mixing coefficients, are estimated.
For each component number, the AIC and BIC values are calculated. These indices evaluate the goodness of fit of the model and guide the choice of the number of components. The smaller the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) values, the better the model balances fit against complexity. Specifically, $\mathrm{AIC} = 2K - 2\ln(L)$ and $\mathrm{BIC} = K\ln(N) - 2\ln(L)$, where K is the number of free model parameters (which grows with the component number), N is the number of samples, and L is the maximized likelihood of the model. The component number with the minimum AIC and BIC values may therefore be selected to construct the Gaussian mixture model used as the reference model.
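The selection rule can be sketched as follows. This is a sketch under stated assumptions: K is taken to be the number of free parameters of a full-covariance Gaussian mixture (the usual convention for AIC and BIC), and the log-likelihood values in `fitted` are hypothetical numbers standing in for the result of training one model per component number.

```python
import math


def aic(num_params, log_likelihood):
    """AIC = 2K - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood


def bic(num_params, num_samples, log_likelihood):
    """BIC = K ln(N) - 2 ln(L)."""
    return num_params * math.log(num_samples) - 2 * log_likelihood


def gmm_param_count(num_components, dim):
    """Free parameters of a full-covariance Gaussian mixture model."""
    means = num_components * dim
    covariances = num_components * dim * (dim + 1) // 2
    weights = num_components - 1          # mixing coefficients sum to 1
    return means + covariances + weights


def select_components(log_likelihoods, num_samples, dim):
    """log_likelihoods maps a component count to its maximized log-likelihood."""
    scores = {}
    for m, ll in log_likelihoods.items():
        k = gmm_param_count(m, dim)
        scores[m] = (aic(k, ll), bic(k, num_samples, ll))
    best_by_aic = min(scores, key=lambda m: scores[m][0])
    best_by_bic = min(scores, key=lambda m: scores[m][1])
    return best_by_aic, best_by_bic, scores


# Hypothetical maximized log-likelihoods for 1, 2 and 3 components.
fitted = {1: -520.0, 2: -450.0, 3: -448.0}
best_aic, best_bic, scores = select_components(fitted, num_samples=200, dim=1)
```

In this made-up example the third component raises the log-likelihood only slightly, so both criteria prefer two components. When AIC and BIC disagree, BIC penalizes extra parameters more heavily for larger sample sizes.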
In some embodiments, the comparing the AIC value and the BIC value of the gaussian mixture model corresponding to each equipment type under different gaussian component numbers to obtain a reference model corresponding to each equipment type specifically includes:
determining initial parameters of a Gaussian mixture model corresponding to each equipment type, wherein the initial parameters comprise a mean value, a covariance matrix and a mixing coefficient;
calculating the responsivity of the energy consumption dataset on each Gaussian distribution based on a probability density function;
updating the mean, the covariance matrix and the mixing coefficients according to the responsivity of the energy consumption dataset on each Gaussian distribution;
repeating the calculation of the responsivity and the updating of the mean value, the covariance matrix and the mixing coefficient until the likelihood function value of the Gaussian mixture model is maximized;
and determining an optimal Gaussian mixture model corresponding to each equipment type by comparing the AIC value and the BIC value of the Gaussian mixture model under different Gaussian component quantities, and taking the optimal Gaussian mixture model as a reference model.
In some embodiments, the calculating the responsivity of the energy consumption dataset on each gaussian distribution based on the probability density function specifically includes:
calculating probability density of the energy consumption dataset on each Gaussian distribution according to the mean value, the covariance matrix and the mixing coefficient based on a probability density function;
normalizing the probability density of the energy consumption data set on each Gaussian distribution, and then summing to obtain the sum of the probability densities of the energy consumption data set under all Gaussian distributions;
based on the Bayesian theorem, calculating the probability that the energy consumption data set belongs to any Gaussian distribution according to the sum of probability densities of the energy consumption data set under all Gaussian distributions, and obtaining the responsivity of the energy consumption data set under each Gaussian distribution.
In this embodiment, the mean, covariance matrix and mixing coefficient of each Gaussian distribution are first initialized. The responsivity of each sample with respect to each Gaussian distribution, i.e. the posterior probability (the hidden variable), is then calculated from the probability density function of the multivariate Gaussian distribution, which satisfies

$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$

where $\mathcal{N}(x \mid \mu, \Sigma)$ represents the probability density of data point $x$ under the Gaussian distribution, $x$ represents the data point, $\mu$ represents the mean of the Gaussian distribution, $\Sigma$ represents the covariance matrix of the Gaussian distribution, and $D$ represents the dimension of the energy consumption data. The factor $(2\pi)^{D/2}$ is a constant that ensures the probability density integrates to 1 (i.e. normalization); $|\Sigma|^{1/2}$ is the square root of the determinant of the covariance matrix, which is related to the degree of dispersion of the data and is also a normalization term; and $(x-\mu)^{T}\Sigma^{-1}(x-\mu)$ represents the deviation of the data point from the mean, where a smaller value means the data point is more likely to belong to the Gaussian distribution. $T$ denotes the matrix transpose, i.e. changing a row vector into a column vector or a column vector into a row vector.
Thus, the responsivity can be expressed as

$$\gamma_k(x) = \frac{\pi_k\,\mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x \mid \mu_j, \Sigma_j)}$$

where $\gamma_k(x)$ represents the responsivity of data point $x$ under the $k$-th Gaussian distribution, $\pi_k$ represents the weight of the $k$-th Gaussian distribution in the overall mixture, i.e. the mixing coefficient, $\mu_k$ represents the mean of the $k$-th Gaussian distribution, $\Sigma_k$ represents the covariance matrix of the $k$-th Gaussian distribution, $j$ indexes the Gaussian distributions, $K$ is the total number of Gaussian distributions over which the sum runs, and the denominator $\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x \mid \mu_j, \Sigma_j)$ represents the sum of the weighted probability densities of data point $x$ under all Gaussian distributions.
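The iteration described in this embodiment (compute the responsivities, then update the means, covariances and mixing coefficients, and repeat) can be sketched for the one-dimensional case, where the covariance matrix reduces to a variance. The data values, the initial parameters, the fixed number of 20 iterations and the small variance floor are assumptions chosen for illustration; a full implementation would handle multivariate data and stop once the likelihood no longer improves.

```python
import math


def gaussian_pdf(x, mean, var):
    """One-dimensional Gaussian probability density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Posterior probability that x belongs to each component (Bayes' theorem)."""
    weighted = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(weighted)
    return [p / total for p in weighted]


def em_step(data, weights, means, variances):
    """One expectation step followed by one maximization step."""
    resp = [responsibilities(x, weights, means, variances) for x in data]
    new_w, new_m, new_v = [], [], []
    for k in range(len(weights)):
        nk = sum(r[k] for r in resp)
        mean_k = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var_k = sum(r[k] * (x - mean_k) ** 2 for r, x in zip(resp, data)) / nk
        new_w.append(nk / len(data))
        new_m.append(mean_k)
        new_v.append(max(var_k, 1e-6))   # floor guards against a collapsed component
    return new_w, new_m, new_v


def log_likelihood(data, weights, means, variances):
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


# Hypothetical readings with two consumption modes (around 1 and around 5).
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]
weights, means, variances = [0.5, 0.5], [0.0, 6.0], [1.0, 1.0]
ll_before = log_likelihood(data, weights, means, variances)
for _ in range(20):
    weights, means, variances = em_step(data, weights, means, variances)
ll_after = log_likelihood(data, weights, means, variances)
```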
In some embodiments, the identifying the abnormality of the energy consumption data of each device according to the reference model, and determining the energy consumption abnormality data in the energy consumption data specifically includes:
dividing an energy consumption data set into a normal data set and an abnormal data set, and setting an initial probability threshold according to the frequency between the normal data set and the abnormal data set;
classifying a new energy consumption data set according to the initial probability threshold based on a decision tree algorithm, iteratively adjusting the initial probability threshold by evaluating the false alarm rate and the missed alarm rate, and determining a respective optimized probability threshold for each Gaussian distribution;
calculating the responsivity of the energy consumption dataset under each Gaussian distribution according to the reference model;
and determining whether energy consumption abnormal data exist in the energy consumption data of each device by comparing the responsivity of the energy consumption data set under each Gaussian distribution with an optimization probability threshold corresponding to each Gaussian distribution.
In this embodiment, the energy consumption data set is divided into a normal data set and an abnormal data set; a statistical method (e.g., the 3σ rule) may be used to identify abnormal data points, which then form the abnormal data set. An initial probability threshold is then set according to the relative frequencies of the normal and abnormal data sets; for example, a relatively low threshold is set according to the normal distribution characteristics so that abnormal data are captured.
A decision tree model is trained using the normal data set and the abnormal data set as labeled samples, and the trained model classifies a new energy consumption data set into two classes, normal and abnormal. The classification result is evaluated by calculating the false alarm rate (the proportion of normal data misclassified as abnormal) and the missed alarm rate (the proportion of abnormal data misclassified as normal), and the initial probability threshold is adjusted according to these two rates so that a balance is found between them.
The responsivity of the energy consumption data of each device under each Gaussian distribution is calculated and compared with the optimized probability threshold corresponding to that Gaussian distribution; if the responsivity under a given distribution exceeds the threshold, the corresponding data are marked as energy consumption abnormal data.
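The iterative threshold adjustment can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the decision tree classifier is omitted, each sample is assumed to already carry a score that is compared with the threshold (a score above the threshold flags an anomaly), and the threshold is moved by a fixed step as long as the sum of the false alarm rate and the missed alarm rate decreases. The function names, the step size, and the sample scores are hypothetical.

```python
def error_rates(scores, labels, threshold):
    """False alarm rate and missed alarm rate when score > threshold flags an anomaly.

    Assumes both the normal and the abnormal class contain at least one sample.
    """
    normal = [s for s, is_abnormal in zip(scores, labels) if not is_abnormal]
    abnormal = [s for s, is_abnormal in zip(scores, labels) if is_abnormal]
    false_alarm = sum(s > threshold for s in normal) / len(normal)
    missed_alarm = sum(s <= threshold for s in abnormal) / len(abnormal)
    return false_alarm, missed_alarm

def tune_threshold(scores, labels, initial, step=0.05, max_iter=100):
    """Move the threshold step by step while the summed error rate improves."""
    best = initial
    best_cost = sum(error_rates(scores, labels, best))
    for _ in range(max_iter):
        candidates = [best - step, best + step]
        costs = [sum(error_rates(scores, labels, c)) for c in candidates]
        if min(costs) >= best_cost:
            break  # neither neighbour is strictly better: stop adjusting
        best_cost = min(costs)
        best = candidates[costs.index(best_cost)]
    return best, best_cost

# Hypothetical labeled scores: five normal samples followed by three abnormal ones
scores = [0.05, 0.12, 0.17, 0.22, 0.27, 0.72, 0.81, 0.93]
labels = [False] * 5 + [True] * 3
best_threshold, best_cost = tune_threshold(scores, labels, initial=0.15)
```

Starting from a deliberately low initial threshold, the search raises the threshold until no normal sample is flagged while all abnormal samples are still caught. In practice the two error rates could be weighted differently, depending on whether false alarms or missed alarms are more costly.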
S104: Performing optimization analysis on the energy consumption data of each device in real time based on a genetic algorithm to obtain an energy consumption optimization scheme for each device.
In some embodiments, the optimizing analysis is performed on the energy consumption data of each device based on the genetic algorithm in real time to obtain an energy consumption optimizing scheme of each device, which specifically includes:
setting gene codes according to the energy consumption data and the equipment parameters of each equipment;
taking the reduction of the energy consumption of the equipment and the improvement of the energy utilization efficiency as optimization targets and taking the maximum energy consumption limit of each equipment as a constraint condition;
determining a difference between the optimization target value and the actual value for each device;
setting a fitness function according to the gene codes, the optimization targets, the constraint conditions and the difference between the optimization target value and the actual value of each device;
and determining a plurality of energy consumption operation modes of each device according to the gene codes based on a genetic algorithm, calculating the fitness value of each energy consumption operation mode of each device according to the fitness function, and determining the energy consumption operation mode with the highest fitness value as the optimal energy consumption operation mode.
In this embodiment, taking air conditioning equipment as an example, historical energy consumption data and parameter settings of the air conditioning equipment are collected, and gene codes are set for the energy consumption parameters of each device, for example temperature setting, operating mode and wind speed. The gene codes may use binary coding, with each bit representing the state of a parameter. Reducing equipment energy consumption and improving energy utilization efficiency are taken as the optimization targets, so that energy consumption is reduced without degrading the cooling effect, and the maximum energy consumption limit of each air conditioner is taken as the constraint condition. Based on historical data, the difference between the optimization target value (the energy consumption reduction target) and the actual value (the historical average energy consumption) of each air conditioner is calculated, and the fitness function is designed from this difference together with the optimization targets, the constraint conditions and the gene codes.
The genetic algorithm generates different combinations of gene codes, each representing a different energy consumption operation mode. The fitness value of each operation mode is calculated with the fitness function, and the mode with the highest fitness value is selected as the optimal energy consumption operation mode. In this way the optimal air conditioner operation mode can be found among the various parameter settings, reducing energy consumption and improving energy utilization efficiency.
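The air conditioner example can be sketched as a small genetic algorithm in Python. Everything numeric here is an assumption made for illustration: the 6-bit encoding (3 bits for the temperature setting, 2 bits for the wind speed, 1 bit for an economy mode), the toy energy model, the target and limit values, and the weights in the fitness function. The sketch uses tournament selection, single-point crossover, bit-flip mutation and elitism, which are common choices but are not specified in the text.

```python
import random

N_BITS = 6        # 3 bits setpoint, 2 bits fan level, 1 bit eco mode (hypothetical)
MAX_KW = 3.0      # constraint: maximum energy consumption limit of the unit
TARGET_KW = 2.0   # optimization target value (energy consumption reduction target)

def decode(bits):
    """Map a bit list to (setpoint in deg C, fan level 0-3, eco mode flag)."""
    setpoint = 20 + int("".join(map(str, bits[0:3])), 2)   # 20..27 deg C
    fan = int("".join(map(str, bits[3:5])), 2)             # 0..3
    return setpoint, fan, bits[5] == 1

def energy_kw(bits):
    """Toy energy model: lower setpoints and higher fan levels cost more power."""
    setpoint, fan, eco = decode(bits)
    load = 0.35 * (28 - setpoint) + 0.3 * fan + 0.8
    return load * (0.85 if eco else 1.0)

def fitness(bits):
    """Higher is better: low energy, small gap to target, comfort kept, limit respected."""
    setpoint, _, _ = decode(bits)
    kw = energy_kw(bits)
    gap = abs(kw - TARGET_KW)                  # difference between target and actual value
    discomfort = 0.2 * max(0, setpoint - 25)   # discourage a poor cooling effect
    penalty = 10.0 * max(0.0, kw - MAX_KW)     # constraint violation
    return -(kw + gap + discomfort + penalty)

def run_ga(pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Return the best bit string found and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]                  # elitism: keep the current best
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, N_BITS - 1)            # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

best_bits, best_history = run_ga()
```

Calling `decode(best_bits)` yields the operating mode represented by the fittest chromosome. Because the best individual is copied unchanged into each new generation, the best fitness value never decreases from one generation to the next.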
Referring to fig. 2, the embodiment of the present invention further provides an energy consumption data processing system 2 of a large-scale device, where the system 2 specifically includes:
the energy consumption data acquisition module 201 is used for acquiring energy consumption data of the large-scale equipment according to a stream processing technology;
an energy consumption flow direction analysis module 202, configured to determine an energy consumption flow direction of each device in the large-scale device, and calculate an energy consumption cost corresponding to each device;
an abnormal data identification module 203, configured to establish a reference model for each equipment type based on a gaussian mixture model, and determine abnormal data in the energy consumption data according to the reference model;
the energy consumption optimization analysis module 204 is used for performing optimization analysis on the energy consumption data of each device in real time based on a genetic algorithm to obtain an energy consumption optimization scheme of each device.
It can be understood that the content of the method embodiment shown in fig. 1 applies to this system embodiment; the system embodiment implements the same functions as the method embodiment shown in fig. 1 and achieves the same beneficial effects.
It should be noted that, because the content of information interaction and execution process between the above systems is based on the same concept as the method embodiment of the present invention, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the system is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Referring to fig. 3, an embodiment of the present invention further provides a computer device 3, comprising a memory 302, a processor 301, and a computer program 303 stored on the memory 302; the computer program 303, when executed on the processor 301, implements the energy consumption data processing method for large-scale equipment according to any one of the methods described above.
The computer device 3 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or the like. The computer device 3 may include, but is not limited to, a processor 301 and a memory 302. It will be appreciated by those skilled in the art that fig. 3 is merely an example of the computer device 3 and does not limit it; the computer device 3 may include more or fewer components than shown, combine certain components, or use different components, and may, for example, also include input-output devices, network access devices, etc.
The processor 301 may be a central processing unit (CPU), or another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor.
The memory 302 may in some embodiments be an internal storage unit of the computer device 3, such as a hard disk or a memory of the computer device 3. The memory 302 may in other embodiments also be an external storage device of the computer device 3, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like, which are provided on the computer device 3. Further, the memory 302 may also include both an internal storage unit and an external storage device of the computer device 3. The memory 302 is used to store an operating system, application programs, boot loader (BootLoader), data, and other programs, such as program code for the computer program. The memory 302 may also be used to temporarily store data that has been output or is to be output.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when being run by a processor, implements the method for processing energy consumption data of a large-scale device according to any one of the above methods.
In this embodiment, the integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include at least: any entity or device capable of carrying the computer program code to a photographing device/terminal apparatus, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer readable media may not include electrical carrier signals and telecommunications signals.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not detailed or recorded in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments disclosed in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical function division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Claims (8)

1. A method for processing energy consumption data of a large-scale device, the method comprising:
collecting energy consumption data of large-scale equipment according to a stream processing technology;
determining the energy consumption flow direction of each device in the large-scale devices, and calculating the energy consumption cost corresponding to each device;
establishing a reference model for each equipment type based on a Gaussian mixture model, and determining abnormal data in the energy consumption data according to the reference model;
wherein the establishing of a reference model for each equipment type based on a Gaussian mixture model, and the determining of abnormal data in the energy consumption data according to the reference model, specifically comprises:
determining an energy consumption dataset for each device type;
according to the energy consumption data set of each equipment type, comparing the AIC value and the BIC value of the Gaussian mixture model corresponding to each equipment type under different Gaussian component numbers to obtain a reference model corresponding to each equipment type;
performing anomaly identification on the energy consumption data of each device according to the reference model, and determining energy consumption anomaly data in the energy consumption data;
carrying out optimization analysis on the energy consumption data of each device in real time based on a genetic algorithm to obtain an energy consumption optimization scheme of each device;
wherein the performing of optimization analysis on the energy consumption data of each device in real time based on a genetic algorithm to obtain an energy consumption optimization scheme of each device specifically comprises:
setting gene codes according to the energy consumption data and the equipment parameters of each equipment;
taking the reduction of the energy consumption of the equipment and the improvement of the energy utilization efficiency as optimization targets and taking the maximum energy consumption limit of each equipment as a constraint condition;
determining a difference between the optimization target value and the actual value for each device;
setting a fitness function according to the gene codes, the optimization targets, the constraint conditions and the difference between the optimization target value and the actual value of each device;
and determining a plurality of energy consumption operation modes of each device according to the genetic code based on a genetic algorithm, calculating the fitness value of each energy consumption operation mode of each device according to the fitness function, and determining the energy consumption operation mode with the highest fitness value as the optimal energy consumption operation mode.
2. The method according to claim 1, wherein the collecting energy consumption data of the large-scale apparatus according to the stream processing technique comprises:
setting up Apache Kafka, and configuring a Kafka producer for acquiring energy consumption data of large-scale equipment in real time;
configuring a Kafka topic for each equipment type, encapsulating the energy consumption data of each equipment type into a Kafka producer record by the Kafka producer, and sending the record to the corresponding Kafka topic;
and configuring a Spark Streaming stream processing engine, and acquiring and processing the energy consumption data of the Kafka topics in real time by the Spark Streaming stream processing engine.
3. The method according to claim 1, wherein the comparing AIC values and BIC values of the gaussian mixture model corresponding to each device type under different gaussian component numbers, and obtaining the reference model corresponding to each device type specifically includes:
determining initial parameters of a Gaussian mixture model corresponding to each equipment type, wherein the initial parameters comprise a mean value, a covariance matrix and a mixing coefficient;
calculating the responsivity of the energy consumption dataset on each Gaussian distribution based on a probability density function;
updating the mean, the covariance matrix and the mixing coefficients according to the responsivity of the energy consumption dataset on each Gaussian distribution;
repeating the calculation of the responsivity and the updating of the mean value, the covariance matrix and the mixing coefficient until the likelihood function value of the Gaussian mixture model is maximized;
and determining an optimal Gaussian mixture model corresponding to each equipment type by comparing the AIC value and the BIC value of the Gaussian mixture model under different Gaussian component quantities, and taking the optimal Gaussian mixture model as a reference model.
4. A method according to claim 3, characterized in that said calculating the responsivity of said energy consumption dataset on each gaussian distribution based on a probability density function, in particular comprises:
calculating probability density of the energy consumption dataset on each Gaussian distribution according to the mean value, the covariance matrix and the mixing coefficient based on a probability density function;
normalizing the probability density of the energy consumption data set on each Gaussian distribution, and then summing to obtain the sum of the probability densities of the energy consumption data set under all Gaussian distributions;
based on the Bayesian theorem, calculating the probability that the energy consumption data set belongs to any Gaussian distribution according to the sum of probability densities of the energy consumption data set under all Gaussian distributions, and obtaining the responsivity of the energy consumption data set under each Gaussian distribution.
5. The method according to claim 4, wherein the anomaly identification is performed on the energy consumption data of each device according to the reference model, and the determining of the energy consumption anomaly data in the energy consumption data specifically includes:
dividing an energy consumption data set into a normal data set and an abnormal data set, and setting an initial probability threshold according to the relative frequencies of the normal data set and the abnormal data set;
classifying a new energy consumption data set according to the initial probability threshold based on a decision tree algorithm, iteratively adjusting the initial probability threshold by evaluating the false alarm rate and the missed alarm rate, and determining a respective optimized probability threshold for each Gaussian distribution;
calculating the responsivity of the energy consumption data set under each Gaussian distribution according to the reference model;
and determining whether energy consumption abnormal data exist in the energy consumption data of each device by comparing the responsivity of the energy consumption data set under each Gaussian distribution with the optimized probability threshold corresponding to each Gaussian distribution.
6. An energy consumption data processing system for large-scale equipment, the system specifically comprising:
the energy consumption data acquisition module is used for acquiring energy consumption data of the large-scale equipment according to a stream processing technology;
the energy consumption flow direction analysis module is used for determining the energy consumption flow direction of each device in the large-scale device and calculating the energy consumption cost corresponding to each device;
the abnormal data identification module is used for establishing a reference model for each equipment type based on the Gaussian mixture model, and determining abnormal data in the energy consumption data according to the reference model;
wherein the establishing of a reference model for each equipment type based on a Gaussian mixture model, and the determining of abnormal data in the energy consumption data according to the reference model, specifically comprises:
determining an energy consumption dataset for each device type;
according to the energy consumption data set of each equipment type, comparing the AIC value and the BIC value of the Gaussian mixture model corresponding to each equipment type under different Gaussian component numbers to obtain a reference model corresponding to each equipment type;
performing anomaly identification on the energy consumption data of each device according to the reference model, and determining energy consumption anomaly data in the energy consumption data;
the energy consumption optimization analysis module is used for performing optimization analysis on the energy consumption data of each device in real time based on a genetic algorithm to obtain an energy consumption optimization scheme of each device;
wherein the performing of optimization analysis on the energy consumption data of each device in real time based on a genetic algorithm to obtain an energy consumption optimization scheme of each device specifically comprises:
setting gene codes according to the energy consumption data and the equipment parameters of each equipment;
taking the reduction of the energy consumption of the equipment and the improvement of the energy utilization efficiency as optimization targets and taking the maximum energy consumption limit of each equipment as a constraint condition;
determining a difference between the optimization target value and the actual value for each device;
setting a fitness function according to the gene codes, the optimization targets, the constraint conditions and the difference between the optimization target value and the actual value of each device;
and determining a plurality of energy consumption operation modes of each device according to the genetic code based on a genetic algorithm, calculating the fitness value of each energy consumption operation mode of each device according to the fitness function, and determining the energy consumption operation mode with the highest fitness value as the optimal energy consumption operation mode.
7. A computer device, comprising: a memory, a processor, and a computer program stored on the memory, wherein the computer program, when executed on the processor, implements the energy consumption data processing method of a large-scale device according to any one of claims 1 to 5.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, implements the energy consumption data processing method of a large scale device according to any one of claims 1 to 5.
CN202311396337.4A 2023-10-26 2023-10-26 Energy consumption data processing method, system, equipment and medium for large-scale equipment Active CN117170979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311396337.4A CN117170979B (en) 2023-10-26 2023-10-26 Energy consumption data processing method, system, equipment and medium for large-scale equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311396337.4A CN117170979B (en) 2023-10-26 2023-10-26 Energy consumption data processing method, system, equipment and medium for large-scale equipment

Publications (2)

Publication Number Publication Date
CN117170979A CN117170979A (en) 2023-12-05
CN117170979B true CN117170979B (en) 2024-04-05

Family

ID=88935682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311396337.4A Active CN117170979B (en) 2023-10-26 2023-10-26 Energy consumption data processing method, system, equipment and medium for large-scale equipment

Country Status (1)

Country Link
CN (1) CN117170979B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117539726B (en) * 2024-01-09 2024-04-26 广东奥飞数据科技股份有限公司 Energy efficiency optimization method and system for green intelligent computing center

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109670225A (en) * 2018-12-10 2019-04-23 百度在线网络技术(北京)有限公司 Vehicle dimension template library generating method and device
CN110570012A (en) * 2019-08-05 2019-12-13 华中科技大学 Storm-based power plant production equipment fault early warning method and system
CN116341929A (en) * 2023-02-13 2023-06-27 武汉科技大学 Prediction method based on clustering and adaptive gradient lifting decision tree
CN116627241A (en) * 2023-05-19 2023-08-22 苏州浪潮智能科技有限公司 Method, system, equipment and storage medium for optimizing energy consumption of server
CN116756594A (en) * 2023-06-20 2023-09-15 中国电力科学研究院有限公司 Method, system, equipment and medium for detecting abnormal points of power grid data
CN116910144A (en) * 2023-01-03 2023-10-20 中国移动通信集团设计院有限公司 Computing power network resource center, computing power service system and data processing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于Spark Streaming的实时能耗分项计量系统;武志学;;计算机应用(04);第1-8页 *
基于改进聚类分析的电力数据智能分析与处理算法;张超等;电子设计工程;第31卷(第2期);第1-5页 *

Also Published As

Publication number Publication date
CN117170979A (en) 2023-12-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant