WO2021135177A1 - Method and device for constructing an energy data warehouse system - Google Patents

Method and device for constructing an energy data warehouse system

Info

Publication number
WO2021135177A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
energy
layer
theme
construct
Prior art date
Application number
PCT/CN2020/103657
Other languages
English (en)
French (fr)
Inventor
徐锡明
黄博淘
吴建波
Original Assignee
新奥数能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 新奥数能科技有限公司
Publication of WO2021135177A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G06F16/212 Schema design and management with details for data modelling support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Definitions

  • the invention belongs to the technical field of energy data processing, and in particular relates to a method and device for constructing an energy data warehouse system.
  • the data warehouse is a subject-oriented, integrated, relatively stable data collection that reflects historical changes and is used to support management decision-making.
  • the enterprise data warehouse architecture proposed by Bill Inmon, the father of data warehouses, and the dimensional data warehouse architecture proposed by Ralph Kimball are two mainstream data warehouse construction methods.
  • Teradata has its own FS-LDM (Teradata Financial Services Logical Data Model) model
  • IBM has its own BDWM (Banking Data Warehouse Model) model
  • Teradata has its own CLDM (Teradata Communications Logical Data Model) model, and IBM has its own TDWM (Telecom Data Warehouse Model) model.
  • the purpose of the embodiments of the present invention is to provide a method and device for constructing an energy data warehouse system to solve the technical problem that there is no data warehouse system for the energy industry in the prior art.
  • the first aspect of the embodiments of the present invention provides an energy data warehouse system construction method, including:
  • the second aspect of the embodiments of the present invention provides an energy data warehouse system construction device, including:
  • the operation data layer construction module is used to perform first data processing on the energy data table of the data source to obtain the detailed data table corresponding to the energy data to construct the operation data layer;
  • the basic data layer building module is used to perform second data processing on the detailed data table according to the type of energy equipment to obtain the basic data table to construct the basic data layer;
  • the general data layer building module is used to perform third data processing on the basic data table according to business analysis needs to obtain the data warehouse theme to build a general data layer;
  • the application data layer construction module is used to perform fourth data processing on the data warehouse theme according to the business unit, and obtain the data mart corresponding to the business unit to construct the application data layer.
  • the third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor. When the processor executes the computer program, the steps of the energy data warehouse system construction method described above are implemented.
  • a fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the energy data warehouse system construction method described above.
  • the beneficial effect of the energy data warehouse system construction method provided by the embodiment of the present invention is at least the following: the embodiment constructs an operation data layer, a basic data layer, a general data layer, and an application data layer, so that an energy data warehouse system for the energy industry can be constructed.
  • This effectively solves the problem that general data warehouses cannot be applied to the energy field, helps to form a unified and standard data system, speeds up the energy industry's data processing and data analysis for energy equipment, and makes it easier for data analysts and data scientists to perform real-time and effective analysis based on high-quality massive data.
  • FIG. 1 is a first schematic diagram of the implementation process of the construction method of an energy data warehouse system provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the implementation process of constructing an operation data layer in the method for constructing an energy data warehouse system provided by an embodiment of the present invention
  • FIG. 3 is a schematic diagram of the implementation process of constructing a basic data layer in the method for constructing an energy data warehouse system provided by an embodiment of the present invention
  • FIG. 4 is a schematic diagram of the implementation process of constructing a general data layer in the method for constructing an energy data warehouse system provided by an embodiment of the present invention
  • FIG. 5 is a schematic diagram of the implementation process of constructing an application data layer in the method for constructing an energy data warehouse system provided by an embodiment of the present invention
  • FIG. 6 is a second schematic diagram of the implementation process of the construction method of an energy data warehouse system provided by an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of an energy data warehouse system constructed by an energy data warehouse system construction method provided by an embodiment of the present invention.
  • FIG. 8 is an implementation flowchart of a workflow from raw data to basic data in the method for constructing an energy data warehouse system provided by an embodiment of the present invention
  • FIG. 9 is an implementation flow chart of a workflow for generating an energy consumption report based on basic data in a method for constructing an energy data warehouse system provided by an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of data collection in a method for constructing an energy data warehouse system provided by an embodiment of the present invention.
  • FIG. 11 is a first schematic diagram of an energy data warehouse system construction device provided by an embodiment of the present invention.
  • FIG. 12 is a second schematic diagram of an energy data warehouse system construction device provided by an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of a terminal device provided by an embodiment of the present invention.
  • Traditional data warehouse vendors have relatively mature data warehouse products.
  • The existing general data warehouse solutions are very difficult to implement when applied to energy data warehouse systems. A data warehouse without an industry model must have its data models and data processing logic designed from scratch for the energy industry and deployed on general tools, which is a huge workload and requires a very professional industry knowledge background; otherwise, the resulting data model can hardly meet the needs of energy data analysis. The data models currently available for traditional industries such as banking and telecommunications differ from that of the energy industry and thus cannot be applied to it.
  • This embodiment provides an energy data warehouse system construction method for the energy industry that takes the business characteristics of the energy industry into account. The energy data warehouse system so constructed is a relatively stable collection of energy data that is integrated from the business data of the energy industry, is oriented to energy analysis topics, and reflects historical changes.
  • It can be used for data analysis by energy companies to support their management decisions.
  • FIG. 1 shows an energy data warehouse system construction method provided by this embodiment, including:
  • Step S11 Perform first data processing on the energy data table of the data source to obtain a detailed data table corresponding to the energy data to construct an operation data layer.
  • the operation data layer serves as a buffer layer that incrementally stores the data newly generated or updated within each data collection interval.
  • it is necessary to perform data processing on the obtained energy data, and write the processed energy data correspondingly into the detailed data table, so as to realize the construction of the operation data layer.
  • the data source is the data source of the energy data warehouse system.
  • the energy data source of the energy data warehouse system mainly includes energy equipment operation data, energy system configuration data, business data, Internet data, and third-party data. Each type of data is collected in a different way to build the data source.
  • Energy equipment operating data is the main data source of the energy data warehouse system. A large amount of equipment operating data is collected through the Internet of Things and uploaded to the message bus. The data collection program of the energy data warehouse system consumes data from the message bus in real time and stores it in the original data layer. The energy equipment operating data received from the message bus can be data in the standard JSON format collected by the Internet of Things. The main information includes the equipment, measurement attributes, measurement time, and measurement values.
  • the configuration information about the energy system is the key information of the energy system data model and the main source of dimensional information in energy data analysis, such as the structure of the energy system, park information, system information, equipment attributes, and relationship information. This part of the data is collected into the data source of the energy data warehouse system by synchronizing from the configuration database.
  • Business data is data related to personnel, organization, and processes in the company's business development process, including employee information, department information, product information, purchasing information, sales information, project information, etc.
  • the business data is synchronized through the business database and collected into the data source of the energy data warehouse system.
  • Internet data: the process of energy data analysis requires some external data, such as weather data (temperature, humidity, wind direction, wind force, etc.) of the equipment operating environment and price data of different types of energy in different regions.
  • Internet data is collected into the data source of the data warehouse system through the Internet data crawling program.
  • Third-party data: in addition to the energy equipment operating data collected through the Internet of Things, a large number of third-party manufacturers have already collected and stored data in their own third-party systems.
  • the third-party data mainly includes equipment information data, equipment operation data, etc. This part of the data is collected into the data source of the energy data warehouse system through the third-party data interface service.
  • the energy data source of the energy data warehouse system may also include other types of data; it is not limited to the cases listed above.
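As an illustration of the equipment operating data described above, the following is a minimal sketch of parsing one JSON message taken from the message bus. The source only states that a message carries the equipment, the measurement attribute, the measurement time, and the measurement value; the field names (`device_id`, `attribute`, `ts`, `value`) and the ISO 8601 timestamp format are assumptions made for this example.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Measurement:
    device_id: str         # hypothetical field name for the equipment identifier
    attribute: str         # measurement attribute, e.g. "voltage"
    measured_at: datetime  # measurement time
    value: float           # measurement value


def parse_message(raw: str) -> Measurement:
    """Parse one JSON message from the bus into a Measurement record."""
    doc = json.loads(raw)
    return Measurement(
        device_id=str(doc["device_id"]),
        attribute=str(doc["attribute"]),
        measured_at=datetime.fromisoformat(doc["ts"]),  # assumes ISO 8601 text
        value=float(doc["value"]),
    )
```

A missing field raises `KeyError` and a non-numeric value raises `ValueError`, so a caller can route such messages to an abnormal-data log instead of the warehouse.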
  • the first data processing includes at least loading the energy data into the original data table, analyzing the energy data, filtering out abnormal data in the energy data, and partitioning the energy data according to the type of energy equipment and the acquisition time.
  • Step S11 may include the following steps:
  • Step S111 Load the energy data in the data source to the original data table. All collected energy data enters the message queue.
  • the message program obtains energy data from the message queue and stores it in the distributed file system.
  • the batch program periodically loads the energy data from the distributed file system to the original data table through an external table.
  • the structure of the original data table is consistent with the format of the energy data received by the distributed file system.
  • Because energy data may suffer from data interruptions, repeated uploads, and data abnormalities in the collection stage of the Internet of Things, the energy data can be processed accordingly when it is loaded from the original data table into the detailed data table.
  • Step S112 Analyze the original data table according to the processing time of the energy data, and determine the newly loaded data in the original data table.
  • the energy data is given a time stamp when it is loaded into the original data table, which facilitates subsequent data processing. For example, the processing time of the energy data determines which data is newly loaded and which is not. Only newly loaded data requires further processing; data that is not newly loaded has already been processed and written into the detailed data table in an earlier run.
  • Step S113 Determine whether the newly loaded data is abnormal data.
  • Step S114 If the newly loaded data is not abnormal data, perform format conversion on the newly loaded data to obtain intermediate data.
  • Step S115 If the newly loaded data is abnormal data, add the newly loaded data to the data abnormality log.
  • Step S116 Partition the intermediate data according to the type of energy equipment and the acquisition time, and write the intermediate data into the detailed data table to construct an operation data layer. Since the amount of energy data is usually relatively large, the detailed data table is designed around the characteristics of energy data: partitioning by the data acquisition time dimension and the device type dimension improves the efficiency of energy data storage and subsequent processing.
  • the structure of the detailed data table is based on time sequence. The structure mainly includes the following information: site information, device type, device identification (device ID), measurement attributes, measurement time, and measurement values.
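Steps S112 to S116 can be sketched as follows. This is an illustrative outline, not the patented implementation: the row layout (a `dict` with a `processed_at` load time stamp), the abnormality rule (a missing field or a non-numeric value), and the choice of the calendar day as the time partition are all assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Columns of the detailed data table as listed in the text (names are hypothetical).
REQUIRED = ("site", "device_type", "device_id", "attribute", "measured_at", "value")


def is_abnormal(row: dict) -> bool:
    """Assumed rule: abnormal if a required field is missing or the value is not numeric."""
    if any(row.get(k) is None for k in REQUIRED):
        return True
    try:
        float(row["value"])
    except (TypeError, ValueError):
        return True
    return False


def load_detail(raw_rows, last_run: datetime):
    """Return (partitions, anomaly_log) for rows loaded after last_run."""
    partitions = defaultdict(list)  # (device_type, acquisition date) -> rows
    anomaly_log = []
    for row in raw_rows:
        if row["processed_at"] <= last_run:   # S112: not newly loaded, skip
            continue
        if is_abnormal(row):                  # S113 and S115
            anomaly_log.append(row)
            continue
        rec = {k: row[k] for k in REQUIRED}   # S114: convert to the detail layout
        rec["value"] = float(rec["value"])
        key = (rec["device_type"], rec["measured_at"].date())  # S116: partition key
        partitions[key].append(rec)
    return dict(partitions), anomaly_log
```

The sketch expects `measured_at` to already be a `datetime`; repeated uploads of the same measurement are not deduplicated here.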
  • Step S12 Perform second data processing on the detailed data table according to the type of energy equipment to obtain a basic data table to construct a basic data layer. The energy data of similar energy equipment is consistent, whereas the data of different types of energy equipment differs considerably, and subsequent analysis of energy data is mainly carried out within one type of energy equipment. Therefore, in the design of the basic data layer, the basic data tables are mainly established per type of energy equipment.
  • In the second data processing, the main tasks are: classifying energy data according to the type of energy equipment, flattening the narrow table into a wide table, standardizing the data, and loading it into the basic data table of the corresponding energy equipment.
  • step S12 may include the following steps:
  • Step S121 Classify the data in the detailed data table according to the type of energy equipment to obtain data corresponding to each type of energy equipment.
  • Because the data of different types of energy equipment differs, the subsequent business analysis also differs, so energy data needs to be classified according to the type of energy equipment.
  • Step S122 Perform time alignment on the data corresponding to the energy equipment according to the minimum time granularity to obtain the first data. Since the energy data uploaded by the IoT may have a time difference, the measured values of the same energy device at the same time may be uploaded at different time points. Therefore, to facilitate subsequent analysis, these energy data must be time-aligned at the minimum time granularity (such as minutes): second-level differences in the time information are normalized to the same minute, and the values are put into the same row of the same time dimension to obtain the first data.
  • Step S123 Perform data flattening processing on the first data to obtain second data. By flattening the data corresponding to the type of energy equipment, all measurement information of the same time dimension is put into one line to facilitate subsequent indicator analysis and data comparison.
  • Step S124 Write the second data into the basic data table corresponding to each energy equipment type to construct a basic data layer.
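The time alignment of step S122 and the flattening of step S123 can be sketched together. This is a minimal illustration that assumes detail rows are dictionaries with hypothetical keys `device_id`, `attribute`, `measured_at`, and `value`, and that the minimum time granularity is one minute.

```python
from collections import defaultdict
from datetime import datetime


def align_and_flatten(detail_rows):
    """Truncate timestamps to the minute (S122) and pivot narrow rows into wide rows (S123)."""
    wide = defaultdict(dict)  # (device_id, minute) -> {attribute: value}
    for row in detail_rows:
        minute = row["measured_at"].replace(second=0, microsecond=0)
        # Assumption: if an attribute appears twice within one minute, the row seen last wins.
        wide[(row["device_id"], minute)][row["attribute"]] = row["value"]
    return [
        {"device_id": dev, "measured_at": minute, **attrs}
        for (dev, minute), attrs in sorted(wide.items())
    ]
```

Each output row holds every measurement attribute of one device for one minute, which is the wide layout that step S124 writes into the basic data table of the corresponding equipment type.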
  • Taking the transformer as an example, the table name is FDM-TRAN, and the structure definition and description of its basic data table are as shown in Table 1:
  • the description information, status information, and measurement information of the transformer are all concentrated in this basic data table. Whether the energy data is collected by the Internet of Things or accessed from a third-party interface, it is standardized to this defined model, and subsequent transformer-based analysis can extract data from it.
  • the basic data layer defines the basic data table structure for more than 100 types of commonly used energy equipment in the energy industry, which can support most of the data storage and analysis needs of the energy industry. Because the correlation between types of energy equipment is weak, it is very convenient to add new types of energy equipment.
  • the basic data layer also includes environmental data obtained through data crawling programs and business data synchronized from business systems. These data are also standardized before entering the basic data layer, providing a unified view for subsequent analysis.
  • Step S13 Perform third data processing on the basic data table according to business analysis needs to obtain the data warehouse theme to construct a general data layer.
  • the general data layer is business-oriented. According to the needs of business analysis, the data warehouse theme is designed from top to bottom.
  • the third data processing includes reading the data in the basic data table of each energy equipment type, aggregating the data, and so on. Please refer to Fig. 4, step S13 may include the following steps:
  • Step S131 Determine the theme of the data warehouse according to the needs of business analysis.
  • Step S132 According to the theme of the data warehouse, read the data in the basic data table corresponding to each energy equipment type.
  • Step S133 Aggregate data corresponding to each energy equipment type with reference to the dimensional data to obtain aggregated data.
  • Step S134 Write the aggregated data into the data warehouse theme to construct a general data layer.
  • the data warehouse topics to focus on include energy enterprise capacity analysis topics, energy consumption enterprise energy consumption analysis topics, enterprise energy efficiency analysis topics, equipment status operation trend analysis topics, equipment predictive maintenance analysis topics, and enterprise data access topics.
  • the business analysis here is carried out at different levels. Taking the enterprise energy consumption analysis theme as an example, multi-dimensional analysis is performed across different time dimensions (hour level, day level, month level, year level), different energy-use unit dimensions (department, production line, workshop, team), and different energy-using equipment dimensions (refrigeration, lighting, processing equipment), so the theme design must support multi-dimensional analysis.
  • the calculation of the general data layer is a process of gradual aggregation from basic data.
  • the low-dimensional data is calculated first, and the low-dimensional results are then aggregated into high-dimensional data. Take the calculation of electricity consumption and electricity cost as an example. First, the electricity consumption is calculated at the minute level. The minute-level consumption is combined with the electricity price strategy (the different prices of peak and valley periods) to calculate hour-level consumption and cost. The hour-level figures are then aggregated into day-level (or team-period) consumption and cost, which are further aggregated into monthly figures and finally into quarterly and annual figures. With aggregated data available in these different time dimensions, time-based analysis operations can be answered quickly from the precomputed results.
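The gradual aggregation described above can be sketched for the electricity example. The tariff below (a peak price from 08:00 to 21:59 and a valley price otherwise) is a hypothetical price strategy chosen only to make the example concrete; the source does not give actual periods or prices.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical peak and valley prices per kWh.
PEAK_PRICE, VALLEY_PRICE = 1.0, 0.5


def price_for(hour: int) -> float:
    """Return the assumed unit price for the given hour of day."""
    return PEAK_PRICE if 8 <= hour < 22 else VALLEY_PRICE


def rollup(minute_kwh):
    """minute_kwh: iterable of (timestamp, kWh). Returns (hourly, daily) dicts of (kWh, cost)."""
    hourly = defaultdict(lambda: [0.0, 0.0])
    for ts, kwh in minute_kwh:
        bucket = hourly[ts.replace(minute=0, second=0, microsecond=0)]
        bucket[0] += kwh
        bucket[1] += kwh * price_for(ts.hour)  # cost is priced at the minute level
    daily = defaultdict(lambda: [0.0, 0.0])
    for hour_ts, (kwh, cost) in hourly.items():  # days are built from hours, not from minutes
        bucket = daily[hour_ts.date()]
        bucket[0] += kwh
        bucket[1] += cost
    return ({k: tuple(v) for k, v in hourly.items()},
            {k: tuple(v) for k, v in daily.items()})
```

Monthly, quarterly, and annual figures would follow the same pattern, each level summing the results of the level below it rather than rescanning the minute-level data.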
  • Step S14 Perform fourth data processing on the data warehouse theme according to the business unit, and obtain the data mart corresponding to the business unit to construct an application data layer.
  • the application data layer is the data layer that provides external application access to the calculation results of the energy data warehouse system.
  • Applications include front-end products, report systems, algorithm platforms, operation analysis, etc. The energy data warehouse system is mainly computed and stored on the big data platform, where all energy data inside and outside the enterprise is integrated.
  • step S14 may include the following steps:
  • Step S141 Determine the data mart according to the business unit.
  • the types of data mart include the energy consumption data mart, energy supply data mart, operation data mart, artificial intelligence data mart, etc. Other types of data marts can of course also be included; there is no restriction here.
  • Step S142 According to the data mart, classify the data in the data warehouse theme.
  • Step S143 Write the classified data in the data warehouse theme into the corresponding data mart, and determine the access authority to build an application data layer. Some energy data must not be accessible to all applications or analysts, so the data required by different business units is separated through data marts: the data each business unit needs is placed in its corresponding data mart, and access rights are controlled in the data mart to ensure data security.
  • After step S143, the method may further include:
  • Step S144 Migrate the data mart to the report system to generate a report. That is, some businesses or reporting systems put data marts into their own databases through data migration to improve data access efficiency.
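Steps S141 to S143 can be sketched as routing theme rows into per-business-unit marts that check access on read. The class and function names, the mart names, and the `classify` callback are hypothetical; the source does not specify how classification or permission checks are implemented.

```python
class DataMart:
    """A named subset of warehouse-theme rows plus the business units allowed to read it."""

    def __init__(self, name, allowed_units):
        self.name = name
        self.allowed_units = frozenset(allowed_units)
        self.rows = []

    def read(self, unit):
        """Return a copy of the rows, or raise PermissionError for an unauthorized unit."""
        if unit not in self.allowed_units:
            raise PermissionError(f"{unit} may not read {self.name}")
        return list(self.rows)


def build_marts(theme_rows, marts, classify):
    """S142 and S143: write each theme row into the mart chosen by classify(row)."""
    by_name = {m.name: m for m in marts}
    for row in theme_rows:
        by_name[classify(row)].rows.append(row)
    return by_name
```

Keeping the permission check inside the mart means that an application or reporting system granted access to one mart cannot read rows that belong to another business unit.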
  • the method for constructing an energy data warehouse system provided in this embodiment further includes:
  • Step S15 Construct dimensional data according to the energy data analysis dimension.
  • Dimensional data can provide references for data processing (including data association, data aggregation, data summarization, etc.) in the general data layer, and can also provide dimensional information for data processing and data migration in the application data layer.
  • the dimensional data includes at least one of time dimension, geographic dimension, user dimension, campus dimension, system dimension, and device dimension. These dimensional data are the basis for data cleaning and data processing, as well as for subsequent multi-dimensional modeling and data analysis. Only when the data is cleaned and processed according to the standard can the consistency and accuracy of the subsequent analysis be guaranteed.
  • The time dimension provides dimension definitions at different levels: year (including the natural year and the company's custom fiscal year), quarter, month (including the natural month and the custom calculation and settlement month), day (including the natural day and the custom team period), hour, and minute. The geographic dimension provides definitions at different geographic levels, including country, region, province (municipality), city (district), district or county, and park. The user dimension provides definitions such as the user's industry, user level, and user category according to user characteristics. Each energy system belongs to a park, so the corresponding dimensions are established from the park and system information captured when the system is modeled, together with the energy type to which the system belongs. For the energy-producing and energy-using equipment of an energy system, standards are provided along dimensions such as equipment category, equipment type, and equipment manufacturer. Common equipment types include air compressors, refrigerators, air conditioners, gas steam boilers, gas hot water boilers, transformers, steam meters, electricity meters, energy meters, thermometers, pressure gauges, gas flow meters, liquid flow meters, differential pressure meters, etc.
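As an illustration of the time dimension, the sketch below derives dimension attributes for one calendar day. The April start of the fiscal year and the convention of labelling a fiscal year by the calendar year in which it starts are assumptions; the source only says that a company-defined fiscal year exists alongside the natural year.

```python
from datetime import date


def time_dimension(d: date, fiscal_start_month: int = 4) -> dict:
    """Derive time-dimension attributes for one calendar day."""
    # Assumption: a fiscal year is labelled by the calendar year in which it starts.
    fiscal_year = d.year if d.month >= fiscal_start_month else d.year - 1
    return {
        "date": d,
        "year": d.year,                      # natural year
        "fiscal_year": fiscal_year,          # company-defined year
        "quarter": (d.month - 1) // 3 + 1,   # natural quarter, 1 to 4
        "month": d.month,
        "day": d.day,
    }
```

Custom settlement months and team periods could be added in the same way, as extra attributes computed from company-specific calendars.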
  • the construction method of the energy data warehouse system provided in this embodiment further includes:
  • Step S16 Build a management tool, the management tool includes at least one of a metadata management tool, a workflow management tool, a data collection tool, a data processing tool, and a data migration tool.
  • Metadata management tools: the energy data warehouse system contains all kinds of intricate energy data. To give users a clear understanding of this data, metadata management tools are provided to answer three questions: what the data in each link is, where the data in each link comes from, and where the data in each link goes. For the first question, users can search and view the name, type, length, business meaning, etc. of various data through the tools, and can view the meaning of each code for coded data. The second question is the data traceability problem: all data in the energy data warehouse system has upstream data.
  • Workflow management tools: the energy data warehouse system contains a large number of data collection, processing, and migration tasks, and there are dependencies between these tasks. Workflow support is therefore required so that the energy data warehouse system can execute automatically, periodically, and in order.
  • One of the largest workflows in the energy data warehouse system is to start from incremental data collection to the file system, and perform the following tasks in sequence:
  • the workflow management tool in the energy data warehouse system provides task scheduling, workflow or task dependency management, topological relationship management, task execution strategy, execution result management, workflow and task re-execution support, etc.
  • the workflow system drives the entire energy data warehouse system to operate in an orderly manner.
  • the above steps (1) to (3) constitute a workflow from the original data to the basic data, as shown in Figure 8.
  • the workflow from raw data to basic data runs once an hour and is responsible for processing the data that arrived in the last hour into the basic data layer. With the help of workflow management tools, these tasks are organized together, and the scheduling period is set so that the workflow executes at the 5th minute of every hour (it can also be executed at other times; there is no restriction here).
  • The first step of the workflow starts multiple parallel tasks that load the integrated station data, energy domain data, gas exchange station data, photovoltaic station data, heating station data, etc. into the original table respectively. When all of these parallel tasks are complete, the second task starts.
  • The second task writes the newly loaded data into the detailed data table. Newly loaded data is identified by the processing time that is added when the data is loaded in the first step. While the data is written into the detailed table, its format is checked: abnormal data is written to the abnormal log, and only correct data is written into the detailed data table. At the same time, the data is partitioned by the type of energy equipment it belongs to, which facilitates subsequent processing by equipment type.
  • The third task processes the data by energy equipment type and loads it into the basic data table (i.e., the FDM table) of that type. There is one separate task per energy equipment type, and these tasks are executed in parallel. Each task handles two aspects of logic. The first is time alignment: because the data uploaded by the Internet of Things may have a time difference, the measurement values of the same device at the same time may be uploaded at different time points, so the data is aligned to the smallest time granularity (such as minutes), with second-level differences normalized to the same minute and the values put into the same row of the same time dimension. The second is flattening: all measurement information of the same time dimension is put into one row, which is convenient for subsequent indicator analysis and data comparison. After the basic data of all energy equipment types has been processed, the workflow ends.
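The dependency structure of the raw-data-to-basic-data workflow can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The task names are hypothetical, and the source does not name the scheduler it uses; the sketch only shows how the three stages depend on each other and which tasks may run in parallel.

```python
from graphlib import TopologicalSorter


def build_workflow(stations, device_types):
    """Return {task: set of prerequisite tasks} for the raw-to-basic workflow."""
    load_tasks = {f"load_raw:{s}" for s in stations}
    graph = {t: set() for t in load_tasks}         # stage 1: parallel loads, no prerequisites
    graph["write_detail"] = set(load_tasks)        # stage 2: waits for every parallel load
    for dt in device_types:
        graph[f"build_fdm:{dt}"] = {"write_detail"}  # stage 3: one task per device type
    return graph


def execution_waves(graph):
    """Group tasks into waves; all tasks in one wave are ready at once and may run in parallel."""
    ts = TopologicalSorter(graph)
    ts.prepare()
    waves = []
    while ts.is_active():
        ready = sorted(ts.get_ready())
        waves.append(ready)
        ts.done(*ready)
    return waves
```

For two stations and two device types this yields three waves: the parallel loads, the single detail-table task, and the parallel per-type basic data tasks.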
  • FIG. 9 shows the workflow that generates the enterprise energy use report from basic data.
  • Analysis based on basic data is carried out according to the analysis theme, such as transformer analysis, air compressor equipment analysis, gas boiler equipment analysis, enterprise energy structure analysis, capacity energy efficiency analysis, etc.
  • the various energy structures including electricity, heat, natural gas, steam, etc. used by the enterprise will be comprehensively analyzed, first based on the bottom numbers of various energy sources, that is, the meter of various energy measurements.
  • the workflow Based on the uploaded data, calculate the various energy consumption in different periods, and calculate the cost of different periods according to the pricing strategy (such as the peak and valley period electricity price of electric energy or the use step price, the step price of natural gas, etc.), and then calculate all costs according to the enterprise (Accounting unit) summarizes and generates aggregated data with the lowest time granularity (hour-level) (including the consumption and cost of each energy type), and then aggregates the aggregated data of the higher time granularity (day-level), and then summarizes higher Aggregate data of the first-level time granularity (month and grade), and then generate data marts for the companies that need reports, and finally export the data marts of each company to the reporting system for report generation. Marked by the task of exporting energy data to the reporting system, the workflow ends.
  • the pricing strategy such as the peak and valley period electricity price of electric energy or the use step price, the step price of natural gas, etc.
  • Data collection tools mainly provide convenience for various data sources to access the energy data warehouse system.
  • the main functions of data collection are shown in Figure 10 and provide various types of energy data access services.
  • IoT device data are collected directly from the message queue and pushed to the distributed file system of the data warehouse; business and configuration data are synchronized from the business systems at regular intervals; third-party data are pulled periodically from third-party systems through interface calls, which involve permission authentication and data retrieval.
  • After extraction, the third-party data are mapped and then pushed to the energy data warehouse system; Internet data are obtained by a crawler, parsed, and pushed to the energy data warehouse system.
  • Data collection tasks need to be executed periodically, and task scheduling is supported by the aforementioned workflow management tools.
  • Data processing tools: the energy data warehouse system contains a large number of data processing tasks. The flow, classification, cleaning, aggregation, and calculation of data between the layers are all individual processing tasks, so tool support is needed, including configuring processing logic through a user interface. During data processing, the business time and the processing time of the data must be distinguished.
  • Business time is the actual time at which the data occurred in the business process.
  • For example, a meter reading is the value at a specific moment.
  • The time corresponding to that value is the business time, whereas the processing time is the system time at which the processing program runs, which is generally later than the business time.
  • The variables ${biztime}, ${bizdate}, ${systime} and ${sysdate} can be provided to represent business time, business date, processing time and processing date respectively, and can be referenced during data processing.
  • Many built-in variables specific to energy systems are also provided, such as ${parkid} for the park the data belong to, ${systemid} for the system, and ${stationid} for the station.
  • These are also available as built-in variables for data processing.
  • Processing logic is rarely finished in one pass and usually goes through several rounds of modification, so data processing supports debugging. Each debug run checks the syntax and generates the task execution plan, and the debugging information is used to confirm that the task logic is correct. Data processing tasks need to run periodically, and their scheduling and dependencies are supported by the aforementioned workflow management tool.
  • Data migration tool: data migration between the energy data warehouse system and external systems needs tool support.
  • Data from the original systems can be migrated to the energy data warehouse system at regular intervals, and result data produced by the energy data warehouse system can be migrated to data application systems such as the reporting system.
  • The data migration tool supports migration at database level and table level. It sets the mapping between source and destination tables, the migration mode (incremental or full), and the overwrite strategy for target data. Combined with task scheduling, migration tasks run periodically or are triggered when upstream processing tasks complete; scheduling and dependencies are supported by the aforementioned workflow management tool.
  • The energy data warehouse system built by this construction method includes an operational data layer (ODS), a basic data layer (FDM), a general data layer (GDM), and an application data layer (ADM).
  • The energy data warehouse system also includes dimensional data (DIM), which provide a reference for data processing in the general data layer and dimensional information for data processing and migration in the application data layer.
  • the energy data warehouse system also includes management tools to facilitate the construction, operation and monitoring of data and tasks in the energy data warehouse system.
  • By constructing these four layers, an energy data warehouse system for the energy industry can be built, which addresses the problem that general-purpose data warehouses cannot be applied to the energy field. It helps to form a unified, standard data system, speeds up the processing and analysis of energy equipment data, and makes it easier for data analysts and data scientists to carry out real-time analysis on high-quality massive data.
  • This embodiment adopts the strengths of different construction methods at different layers, which helps to improve the performance of the energy data warehouse system.
  • The basic data layer is built with Bill Inmon's enterprise data warehouse architecture method, which gives the energy data warehouse system a solid foundation.
  • The general data layer and the application data layer use the dimensional data warehouse construction method proposed by Ralph Kimball, which supports the flexible and changeable analysis of the upper layers.
  • By constructing dimensional data, this embodiment provides standard support and a unified perspective for data processing and multi-dimensional analysis, so that data cleaning and processing follow a common standard, which ensures the consistency and accuracy of subsequent analysis.
  • This embodiment builds management tools, including metadata management, workflow management, data collection, data processing, and data migration tools, which support the configuration, operation, and monitoring of data and tasks in the energy data warehouse system and help keep the system running effectively.
  • This embodiment also provides an energy data warehouse system construction device, including an operational data layer construction module 21, a basic data layer construction module 22, a general data layer construction module 23, and an application data layer construction module 24.
  • The operational data layer construction module 21 performs first data processing on the energy data table of the data source to obtain the detail data table corresponding to the energy data, so as to construct the operational data layer.
  • The basic data layer construction module 22 performs second data processing on the detail data table according to energy equipment type to obtain basic data tables, so as to construct the basic data layer.
  • The general data layer construction module 23 performs third data processing on the basic data tables according to business analysis needs to obtain data warehouse themes, so as to construct the general data layer.
  • The application data layer construction module 24 performs fourth data processing on the data warehouse themes according to business unit to obtain the data mart corresponding to each business unit, so as to construct the application data layer.
  • The device further includes a dimensional data construction module 25, which constructs dimensional data according to the dimensions of energy data analysis; the dimensional data include at least one of a time dimension, a geographic dimension, a user dimension, a park dimension, a system dimension, and an equipment dimension.
  • The device also includes a management tool construction module 26, which builds management tools including at least one of a metadata management tool, a workflow management tool, a data collection tool, a data processing tool, and a data migration tool.
  • FIG. 13 is a schematic diagram of a terminal device provided by an embodiment of the present invention.
  • The terminal device 3 of this embodiment includes a processor 30, a memory 31, and a computer program 32 stored in the memory 31 and executable on the processor 30, for example an energy data warehouse system construction program.
  • When the processor 30 executes the computer program 32, the steps of the foregoing method embodiments are implemented, for example steps S11 to S16 shown in FIGS. 1 to 6.
  • Alternatively, when the processor 30 executes the computer program 32, the functions of the modules/units in the foregoing device embodiments are realized, for example the functions of modules 21 to 26 shown in FIGS. 11 to 12.
  • For example, the computer program 32 may be divided into one or more modules/units, which are stored in the memory 31 and executed by the processor 30 to carry out the present invention.
  • The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution process of the computer program 32 in the terminal device 3.
  • the terminal device 3 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the terminal device may include, but is not limited to, a processor 30 and a memory 31.
  • Those skilled in the art will understand that FIG. 13 is only an example of the terminal device 3 and does not limit it; the terminal device may include more or fewer components than shown, combine certain components, or use different components.
  • the terminal device 3 may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 30 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 31 may be an internal storage unit of the terminal device 3, such as a hard disk or a memory of the terminal device 3.
  • the memory 31 may also be an external storage device of the terminal device 3, such as a plug-in hard disk equipped on the terminal device 3, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 31 may also include both an internal storage unit of the terminal device 3 and an external storage device.
  • the memory 31 is used to store the computer program and other programs and data required by the terminal device 3.
  • the memory 31 can also be used to temporarily store data that has been output or will be output.
  • In the embodiments provided by the present invention, it should be understood that the disclosed device/terminal device and method may be implemented in other ways.
  • For example, the device/terminal device embodiments described above are only illustrative.
  • The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • On this understanding, all or part of the processes of the above method embodiments can also be completed by a computer program that instructs the relevant hardware.
  • The computer program can be stored in a computer-readable storage medium, and when it is executed by a processor, it implements the steps of the foregoing method embodiments.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, etc.
  • The content contained in the computer-readable medium can be appropriately added to or reduced according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

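A method and device for constructing an energy data warehouse system. The method includes: performing first data processing on the energy data of a data source to obtain the detail data table corresponding to the energy data, so as to construct an operational data layer; performing second data processing on the detail data table according to energy equipment type to obtain basic data tables, so as to construct a basic data layer; performing third data processing on the basic data tables according to business analysis needs to obtain data warehouse themes, so as to construct a general data layer; and performing fourth data processing on the data warehouse themes according to business unit to obtain the data mart corresponding to each business unit, so as to construct an application data layer. The method addresses the problem that general-purpose data warehouses cannot be applied to the energy field, helps to form a unified, standard data system, and speeds up the processing and analysis of energy equipment data in the energy industry.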

Description

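Method and device for constructing an energy data warehouse system
Technical Field
The present invention belongs to the technical field of energy data processing, and in particular relates to a method and device for constructing an energy data warehouse system.
Background
A data warehouse is a subject-oriented, integrated, relatively stable collection of data that reflects historical change and is used to support management decisions. The enterprise data warehouse architecture proposed by Bill Inmon, the father of data warehousing, and the dimensional data warehouse architecture proposed by Ralph Kimball are the two mainstream methods of building a data warehouse.
Traditional data warehouse vendors have relatively mature data warehouse products, as well as data models for certain industries. For example, in banking, Teradata has its FS-LDM (Teradata Financial Services Logical Data Model) and IBM has its BDWM (Banking Data Warehouse Model); in telecommunications, Teradata has its CLDM (Teradata Communications Logical Data Model) and IBM has its TDWM (Telecom Data Warehouse Model). However, these models mainly target traditional industries. There is no data model for the energy industry, so they cannot accommodate its characteristics. Moreover, these data models are built on traditional relational databases, so they cannot meet current requirements for real-time and near-real-time analysis of massive data, nor can they be adapted flexibly as analysis themes change.
Technical Problem
The purpose of the embodiments of the present invention is to provide a method and device for constructing an energy data warehouse system, so as to solve the technical problem that the prior art has no data warehouse system for the energy industry.
Technical Solution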
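A first aspect of the embodiments of the present invention provides a method for constructing an energy data warehouse system, including:
performing first data processing on the energy data table of a data source to obtain the detail data table corresponding to the energy data, so as to construct an operational data layer;
performing second data processing on the detail data table according to energy equipment type to obtain basic data tables, so as to construct a basic data layer;
performing third data processing on the basic data tables according to business analysis needs to obtain data warehouse themes, so as to construct a general data layer;
performing fourth data processing on the data warehouse themes according to business unit to obtain the data mart corresponding to each business unit, so as to construct an application data layer.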
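A second aspect of the embodiments of the present invention provides a device for constructing an energy data warehouse system, including:
an operational data layer construction module, configured to perform first data processing on the energy data table of a data source to obtain the detail data table corresponding to the energy data, so as to construct an operational data layer;
a basic data layer construction module, configured to perform second data processing on the detail data table according to energy equipment type to obtain basic data tables, so as to construct a basic data layer;
a general data layer construction module, configured to perform third data processing on the basic data tables according to business analysis needs to obtain data warehouse themes, so as to construct a general data layer;
an application data layer construction module, configured to perform fourth data processing on the data warehouse themes according to business unit to obtain the data mart corresponding to each business unit, so as to construct an application data layer.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method for constructing an energy data warehouse system when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above method for constructing an energy data warehouse system when executed by a processor.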
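Beneficial Effects
The beneficial effects of the method provided by the embodiments of the present invention include at least the following: by constructing an operational data layer, a basic data layer, a general data layer, and an application data layer, the embodiments can build an energy data warehouse system for the energy industry. This addresses the problem that general-purpose data warehouses cannot be applied to the energy field, helps to form a unified, standard data system, speeds up the processing and analysis of energy equipment data in the energy industry, and makes it easier for data analysts and data scientists to carry out real-time analysis on high-quality massive data.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.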
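FIG. 1 is a first schematic flowchart of the method for constructing an energy data warehouse system provided by an embodiment of the present invention;
FIG. 2 is a schematic flowchart of constructing the operational data layer in the method provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of constructing the basic data layer in the method provided by an embodiment of the present invention;
FIG. 4 is a schematic flowchart of constructing the general data layer in the method provided by an embodiment of the present invention;
FIG. 5 is a schematic flowchart of constructing the application data layer in the method provided by an embodiment of the present invention;
FIG. 6 is a second schematic flowchart of the method for constructing an energy data warehouse system provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of the energy data warehouse system built by the method provided by an embodiment of the present invention;
FIG. 8 is a flowchart of the workflow from raw data to basic data in the method provided by an embodiment of the present invention;
FIG. 9 is a flowchart of the workflow that generates energy-use reports from basic data in the method provided by an embodiment of the present invention;
FIG. 10 is a schematic diagram of data collection in the method provided by an embodiment of the present invention;
FIG. 11 is a first schematic diagram of the device for constructing an energy data warehouse system provided by an embodiment of the present invention;
FIG. 12 is a second schematic diagram of the device for constructing an energy data warehouse system provided by an embodiment of the present invention;
FIG. 13 is a schematic diagram of the terminal device provided by an embodiment of the present invention.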
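Embodiments of the Invention
In the following description, specific details such as particular system structures and techniques are set out for the purpose of illustration rather than limitation, so that the embodiments of the present invention can be thoroughly understood. However, it should be clear to those skilled in the art that the present invention can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
To illustrate the technical solutions of the present invention, specific embodiments are described below.
A data warehouse is a subject-oriented, integrated, relatively stable collection of data that reflects historical change and is used to support management decisions. Traditional data warehouse vendors have relatively mature products, but applying an existing general-purpose data warehouse solution to an energy data warehouse system is very difficult. With a data warehouse that carries no industry model, the data model and the data processing logic have to be designed from scratch and deployed on general-purpose tools. The workload is large and requires highly specialized industry knowledge; without it, the resulting data model can hardly meet the needs of energy data analysis. The existing data models for traditional industries such as banking and telecommunications differ from the energy industry and therefore cannot be applied to it.
This embodiment provides a method for constructing an energy data warehouse system for the energy industry that takes the business characteristics of the industry into account. The resulting system is a collection of energy data that is integrated from the business data of the energy industry, oriented to energy analysis themes, relatively stable, and reflective of historical change. It can be used for data analysis in energy enterprises and supports their management decisions.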
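FIG. 1 shows a method for constructing an energy data warehouse system provided by this embodiment, including:
Step S11: perform first data processing on the energy data table of the data source to obtain the detail data table corresponding to the energy data, so as to construct the operational data layer.
The operational data layer serves as a buffer layer that incrementally stores the data newly generated or updated between two data collection runs. To construct it, the acquired energy data are processed and the processed data are written into the detail data table.
The data source is where the data of the energy data warehouse system come from. The sources mainly include energy equipment operation data, energy system configuration data, business data, Internet data, and third-party data. Each type of data is collected in a different way to build the data source.
Energy equipment operation data: these are the main data source of the energy data warehouse system. Large volumes of equipment operation data are collected through the Internet of Things and uploaded to a message bus. The data collection program of the energy data warehouse system consumes the data from the message bus in real time and stores them in the raw data layer. The operation data received from the message bus may be standard JSON data collected by the Internet of Things, whose main fields include the information on which entity the device belongs to, the measurement attribute, the measurement time, and the measurement value.
Energy system configuration data: the configuration information of the energy system is the key information of the energy system data model and the main source of dimension information in energy data analysis, for example the structure of the energy system, park information, system information, equipment attributes, and relationship information. These data are collected into the data source of the energy data warehouse system by synchronization from the configuration database.
Business data: business data concern the people, organizations, and processes involved in running the company's business, including employee, department, product, procurement, sales, and project information. They are collected into the data source of the energy data warehouse system by synchronization from the business database.
Internet data: energy data analysis needs some external data, such as weather data for the equipment's operating environment (temperature, humidity, wind direction, wind force, etc.) and price data for different energy types in different regions. Internet data are collected into the data source of the data warehouse system by an Internet crawler.
Third-party data: besides the energy equipment operation data uploaded through the Internet of Things, data analysis also needs large amounts of data that third-party vendors have already collected and stored in their own systems. Third-party data mainly include equipment information and equipment operation data, and are collected into the data source of the energy data warehouse system through third-party data interface services.
In other embodiments, the energy data warehouse system may of course have other data sources; the sources are not limited to those listed above.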
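In this embodiment, the first data processing includes at least loading the energy data into the raw data table, parsing the energy data, filtering out abnormal data, and partitioning the energy data by energy equipment type and acquisition time. Referring to FIG. 2, step S11 may include the following steps:
Step S111: load the energy data from the data source into the raw data table. All collected energy data enter a message queue. A message program reads the energy data from the queue and stores them in a distributed file system, and a batch program periodically loads the energy data from the distributed file system into the raw data table by means of external tables. Optionally, the structure of the raw data table matches the format of the energy data received by the distributed file system.
Because the energy data may suffer from interruptions, repeated uploads, and anomalies during IoT collection, they can be processed accordingly while being loaded from the raw data table into the detail data table.
Step S112: parse the raw data table according to the processing time of the energy data and determine the newly loaded data in the raw data table. To simplify later processing and analysis, the energy data carry a timestamp when they are loaded into the raw data table. For example, the processing time of the energy data can be used to determine which data are newly loaded and which are not. Only newly loaded data need further processing; data that are not newly loaded have already been processed in an earlier run and written into the detail data table.
Step S113: determine whether the newly loaded data are abnormal.
If the newly loaded data are not abnormal:
Step S114: convert the format of the newly loaded data to obtain intermediate data.
If the newly loaded data are abnormal:
Step S115: add the newly loaded data to the data anomaly log.
To improve processing efficiency, abnormal data among the newly loaded data must be identified so that only normal data are selected and no effort is wasted on abnormal data. Normal data are converted into intermediate data in the same format as the detail data table.
Step S116: partition the intermediate data by energy equipment type and acquisition time, and write the intermediate data into the detail data table, so as to construct the operational data layer. Because the volume of energy data is usually large, the detail data table is designed around the characteristics of energy data and partitioned along the acquisition time dimension and the equipment type dimension, which makes storage and subsequent processing more efficient. The structure of the detail data table is time-series based and mainly includes station information, equipment type, equipment identifier (equipment ID), measurement attribute, measurement time, and measurement value.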
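Steps S112 to S116 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the field names (`process_time`, `station`, `device_type`, `device_id`, `attribute`, `measured_at`, `value`), the function name, and the anomaly rule (a missing field or a non-numeric value) are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime


def build_detail_partitions(raw_rows, last_run_time):
    """Split newly loaded raw rows into detail-table partitions and an anomaly log."""
    partitions = defaultdict(list)
    anomaly_log = []
    for row in raw_rows:
        # S112: a row counts as newly loaded if its processing time is later
        # than the previous run (assumes the loader stamped process_time).
        if row["process_time"] <= last_run_time:
            continue
        try:
            # S113/S114: hypothetical validity check plus format conversion.
            record = {
                "station": row["station"],
                "device_type": row["device_type"],
                "device_id": row["device_id"],
                "attribute": row["attribute"],
                "measured_at": datetime.fromisoformat(row["measured_at"]),
                "value": float(row["value"]),
            }
        except (KeyError, TypeError, ValueError):
            # S115: abnormal rows go to the anomaly log only.
            anomaly_log.append(row)
            continue
        # S116: partition by equipment type and acquisition date.
        key = (record["device_type"], record["measured_at"].date())
        partitions[key].append(record)
    return dict(partitions), anomaly_log
```

Keeping the anomaly log separate from the partitions mirrors the rule that only correct data reach the detail data table.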
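Step S12: perform second data processing on the detail data table according to energy equipment type to obtain basic data tables, so as to construct the basic data layer. Energy data are consistent across equipment of the same type but differ considerably between equipment types, and later analysis is mainly carried out within one equipment type. The basic data layer is therefore designed with one basic data table per energy equipment type.
In this embodiment, processing from the detail data table to the basic data tables mainly involves classifying the energy data by energy equipment type, flattening narrow tables into wide tables, standardizing the data, and loading them into the basic data table of the corresponding equipment type.
In this embodiment, the second data processing includes at least classifying the data in the detail data table, aligning the data in time, and flattening the data. Referring to FIG. 3, step S12 may include the following steps:
Step S121: classify the data in the detail data table by energy equipment type to obtain the data for each type of energy equipment. The business analysis to be carried out later differs by equipment type, so the energy data must be classified by equipment type.
Step S122: align the data of the energy equipment in time at the smallest time granularity to obtain first data. Energy data uploaded through the Internet of Things may carry time differences, and measurements taken by the same equipment at the same moment may be uploaded at different points in time. To facilitate later analysis, the data are aligned at the smallest time granularity (such as minute level): second-level differences in the time information are normalized to the same minute and the values are placed in the same row of the same time dimension, which gives the first data.
Step S123: flatten the first data to obtain second data. Flattening the data of an energy equipment type puts all measurements of the same time dimension into one row, which simplifies subsequent indicator analysis and data comparison.
Step S124: write the second data into the basic data table of each energy equipment type, so as to construct the basic data layer.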
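The time alignment of step S122 and the flattening of step S123 can be sketched as follows. This is only an illustration under assumptions: the record layout (`device_id`, `attribute`, `measured_at`, `value`) and the choice that a later value overwrites an earlier one for the same attribute within a minute are not specified by the source.

```python
from collections import defaultdict
from datetime import datetime


def align_and_flatten(records):
    """Turn narrow (one measurement per row) records into one wide row per device and minute."""
    wide = defaultdict(dict)
    for r in records:
        # S122: drop second-level differences so values share the same minute.
        minute = r["measured_at"].replace(second=0, microsecond=0)
        # S123: every attribute of the same device and minute lands in one row.
        # Assumption: a repeated attribute within the minute keeps the last value.
        wide[(r["device_id"], minute)][r["attribute"]] = r["value"]
    return [
        {"device_id": device_id, "minute": minute, **attrs}
        for (device_id, minute), attrs in sorted(wide.items())
    ]
```

Each returned dictionary corresponds to one row of a wide basic data table for a single equipment type.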
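In one embodiment, taking a transformer as an example, the table is named FDM-TRAN, and its basic data table structure is defined and described in Table 1 below:
Figure PCTCN2020103657-appb-000001
Figure PCTCN2020103657-appb-000002
Table 1
With the above structure, the descriptive information, status information, and measurement information of a transformer are all gathered in this one basic data table. Energy data collected through the Internet of Things and energy data received through third-party interfaces are both normalized to this schema, and all later transformer analysis can draw its data from here.
In this embodiment, the basic data layer defines basic data table structures for more than 100 types of commonly used energy equipment in the energy industry, which covers most of the industry's data storage and analysis needs. Because the equipment types are only weakly coupled, adding a new type is straightforward. Besides the many equipment information tables, the basic data layer also holds environmental data obtained by crawlers and business data synchronized from business systems. These data are likewise standardized before entering the basic data layer, which gives subsequent analysis a unified view.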
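Step S13: perform third data processing on the basic data tables according to business analysis needs to obtain data warehouse themes, so as to construct the general data layer. The general data layer is business-oriented, and its data warehouse themes are designed top-down from business analysis needs. In this embodiment, the third data processing includes reading the data in the basic data tables of the energy equipment types and aggregating the data. Referring to FIG. 4, step S13 may include the following steps:
Step S131: determine the data warehouse themes according to business analysis needs.
Step S132: according to the data warehouse themes, read the data in the basic data tables of the energy equipment types.
Step S133: aggregate the data of the energy equipment types with reference to the dimensional data to obtain aggregated data.
Step S134: write the aggregated data into the data warehouse themes, so as to construct the general data layer.
Specifically, in the energy field the data warehouse themes of main interest include the energy enterprise production analysis theme, the energy-using enterprise energy-use analysis theme, the enterprise energy-efficiency analysis theme, the equipment operating status trend analysis theme, the equipment predictive maintenance analysis theme, and the enterprise data access quality analysis theme. Business analysis takes place at different levels. Taking the enterprise energy-use analysis theme as an example, multi-dimensional analysis is carried out over different time dimensions (hour, day, month, year), different energy-use unit dimensions (department, production line, workshop, shift team), and different energy-use equipment dimensions (refrigeration, lighting, processing equipment), and the theme design must support this. Computation in the general data layer is a process of progressively aggregating the basic data: low-level data are calculated first, and the low-level results are then aggregated into higher-level data. Taking electricity consumption and electricity cost as an example, minute-level consumption is calculated first; it is combined with the tariff strategy (different prices for the sharp-peak, peak, flat, and valley periods) to calculate hour-level consumption and cost; hour-level values are aggregated into day-level (or shift-period-level) values, then into month-level values, and finally into quarterly and annual values. With aggregates available at these time dimensions, time-based analysis can respond quickly by using results that have already been calculated.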
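The progressive aggregation described above can be sketched in Python for the minute to hour to day case. The tariff in `price_for_hour` is a made-up two-band example, not a real price table, and the function and variable names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def price_for_hour(hour):
    """Hypothetical time-of-use tariff in currency units per kWh."""
    return 1.0 if 8 <= hour < 22 else 0.5


def rollup(minute_kwh):
    """Aggregate {minute timestamp: kWh} into hourly and daily [kWh, cost] totals."""
    hourly = defaultdict(lambda: [0.0, 0.0])
    for ts, kwh in minute_kwh.items():
        hour = ts.replace(minute=0, second=0, microsecond=0)
        hourly[hour][0] += kwh
        # Cost is priced at the lowest level, where the tariff period is known.
        hourly[hour][1] += kwh * price_for_hour(ts.hour)
    daily = defaultdict(lambda: [0.0, 0.0])
    for hour, (kwh, cost) in hourly.items():
        # The day level reuses the hourly results instead of re-reading minute data.
        daily[hour.date()][0] += kwh
        daily[hour.date()][1] += cost
    return dict(hourly), dict(daily)
```

Month, quarter, and year totals would follow the same pattern, each level summing the one below it.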
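Step S14: perform fourth data processing on the data warehouse themes according to business unit to obtain the data mart corresponding to each business unit, so as to construct the application data layer. The application data layer exposes the results calculated by the energy data warehouse system to applications, including front-end products, the reporting system, the algorithm platform, and operational analysis. The energy data warehouse system mainly computes and stores data on a big data platform, which brings together all energy data from inside and outside the enterprise.
In this embodiment, the fourth data processing includes classifying the data of the data warehouse themes and determining access permissions. Referring to FIG. 5, step S14 may include the following steps:
Step S141: determine the data marts according to business unit. The types of data mart include energy-use data marts, energy-supply data marts, operations data marts, and artificial intelligence data marts; other types may of course be included, and no limitation is intended here.
Step S142: classify the data in the data warehouse themes according to the data marts.
Step S143: write the classified data of the data warehouse themes into the corresponding data marts and determine access permissions, so as to construct the application data layer. Some energy data may not be accessed by every application or analyst. Data marts therefore separate the data needed by different business units, place each unit's data in its own mart, and control access permissions within the mart to keep the data secure.
Further, to improve the efficiency of data access, the following may be performed after step S143:
Step S144: migrate the data mart to the reporting system to generate reports. That is, some business or reporting systems copy the data mart into their own database through data migration to improve access efficiency.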
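Further, to provide standard support and a unified perspective for the processing and multi-dimensional analysis of energy data, the method provided by this embodiment also includes:
Step S15: construct dimensional data according to the dimensions of energy data analysis. Dimensional data provide a reference for data processing in the general data layer (including data association, aggregation, and summarization) and dimensional information for data processing and migration in the application data layer. The dimensional data include at least one of a time dimension, a geographic dimension, a user dimension, a park dimension, a system dimension, and an equipment dimension. They are the basis of data cleaning and processing as well as of later multi-dimensional modeling and analysis. Only when data are unified to a standard during cleaning and processing can the consistency and accuracy of later analysis be guaranteed.
According to the needs of energy data analysis, the time dimension defines levels such as year (calendar year and enterprise-defined fiscal year), quarter, month (calendar month and user-defined calculation and settlement month), day (calendar day and user-defined shift period), hour, and minute. The geographic dimension defines levels such as country, region, province (municipality), city (district), county, and park. The user dimension defines the user's industry, user level, and user category. Every energy system belongs to a park, so the park and system dimensions are built from the park and system information captured during system modeling, together with the energy type of the system. For the energy-producing and energy-using equipment of an energy system, standards are provided along dimensions such as equipment category, equipment subcategory, and equipment vendor. Common equipment types include air compressors, chillers, air conditioners, gas-fired steam boilers, gas-fired hot water boilers, transformers, steam meters, electricity meters, energy meters, thermometers, pressure gauges, gas flow meters, liquid flow meters, and differential pressure gauges.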
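Referring to FIG. 6, to facilitate the configuration, operation, and monitoring of data and tasks in the energy data warehouse system, the method provided by this embodiment further includes:
Step S16: build management tools, which include at least one of a metadata management tool, a workflow management tool, a data collection tool, a data processing tool, and a data migration tool.
Metadata management tool: the energy data warehouse system holds many kinds of intricately related energy data. To give users a clear picture of the data, a metadata management tool is provided that answers three questions: what the data at each stage are, where the data at each stage come from, and where the data at each stage go. For the first question, the tool can be used to search and view the name, type, length, and business meaning of each data item, and the meaning of each code for coded data. The second question concerns data lineage. Every data item in the energy data warehouse system has upstream data. When analyzing whether a data item is faulty, why a task failed, or why a calculated indicator is wrong, the upstream data must be traced node by node until the origin of the problem is found. The third question concerns impact analysis. When the calculation logic or processing method of any stage in the energy data warehouse system is to be changed, the impact on the existing system must be assessed by finding all downstream processing flows and data that depend on the node and evaluating whether the change is feasible. Taken together, the three questions form a global, intricate data network, which is the data map. The metadata management tool provides such a map of the whole data warehouse. Through it, every data point of the energy data warehouse system can be seen, the description of each point can be viewed, and lineage analysis and impact analysis can be started from any point.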
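Lineage tracing and impact analysis on a data map amount to walking a dependency graph in opposite directions. The sketch below assumes a small, invented lineage dictionary; the node names (`raw_table`, `ods_detail`, `fdm_tran`, and so on) are placeholders and do not come from the patent.

```python
from collections import deque

# Hypothetical lineage: each node maps to the nodes it is derived from.
UPSTREAM = {
    "ods_detail": ["raw_table"],
    "fdm_tran": ["ods_detail"],
    "gdm_energy_use": ["fdm_tran", "dim_time"],
    "adm_report_mart": ["gdm_energy_use"],
}


def _reach(start, edges):
    """Breadth-first walk returning every node reachable from start, nearest first."""
    seen, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


def trace_upstream(node):
    """Lineage: where does this node's data come from?"""
    return _reach(node, UPSTREAM)


def impact_downstream(node):
    """Impact analysis: which nodes depend on this one, directly or indirectly?"""
    downstream = {}
    for child, parents in UPSTREAM.items():
        for parent in parents:
            downstream.setdefault(parent, []).append(child)
    return _reach(node, downstream)
```

Storing only the upstream edges and deriving the downstream view on demand keeps the two analyses consistent with each other.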
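Workflow management tool: the energy data warehouse system contains a large number of data collection, processing, and migration tasks, and these tasks depend on one another. Workflow support is needed so that the system can run automatically, periodically, and in order. The largest workflow in the energy data warehouse system starts when data have been incrementally collected into the file system and then executes the following tasks in sequence:
(1) load the energy data into the raw data table of the energy data warehouse system;
(2) parse the raw data table to obtain the newly loaded data and write them into the detail data table;
(3) read the data from the detail data table, clean and convert them, and load them into the basic data tables by energy equipment type;
(4) run the calculation tasks of the data warehouse themes in the general data layer in parallel and write the results into the data warehouse themes;
(5) load the results of the data warehouse themes into the data marts according to business needs;
(6) migrate the data marts to the business databases.
The workflow management tool in the energy data warehouse system provides task scheduling, management of workflow and task dependencies, topology management, task execution strategies, execution result management, and support for re-running workflows and tasks. The workflow system drives the whole energy data warehouse system so that it runs in an orderly and normal way.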
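The dependency-driven ordering of tasks (1) to (6) can be sketched with the standard-library `graphlib` module (Python 3.9 or later). The task names and the two example theme tasks are hypothetical; a real workflow tool would also handle scheduling, retries, and result tracking, which are omitted here.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "load_raw": set(),
    "write_detail": {"load_raw"},
    "load_fdm": {"write_detail"},
    "theme_energy_use": {"load_fdm"},
    "theme_efficiency": {"load_fdm"},
    "load_marts": {"theme_energy_use", "theme_efficiency"},
    "migrate_marts": {"load_marts"},
}


def execution_batches(deps):
    """Group tasks into batches; tasks within one batch can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are cyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # marking tasks done releases their dependents
    return batches
```

With the graph above, the two theme tasks fall into the same batch, which matches the parallel execution of theme calculations in task (4).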
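Steps (1) to (3) above form a workflow from raw data to basic data, as shown in FIG. 8. This workflow runs once an hour and processes the data that arrived in the most recent hour into the basic data layer. The workflow management tool groups these tasks and schedules them to run at the fifth minute of every hour (other times are also possible, and no limitation is intended here). The first step of the workflow starts several parallel tasks that load integrated station data, energy-use domain data, ventilation station data, photovoltaic station data, heating station data, and so on into the raw tables. When all of these parallel tasks have finished, the second step starts and writes the newly loaded data into the detail data table. Newly loaded data are identified by processing time: a processing timestamp is added when the data are loaded in the first step, and that timestamp determines whether a record is newly loaded. While the data are written into the detail table, the data format is checked. Abnormal data are written to the anomaly log, and only correct data are written into the detail data table. At the same time, the data are partitioned by the energy equipment type they belong to, which facilitates later processing by equipment type. The third step processes the data by energy equipment type and loads them into the basic data table for that type (i.e. the FDM table). There is one separate task per energy equipment type, and these tasks run in parallel. Each task handles two pieces of logic. The first is time alignment: data uploaded through the Internet of Things may carry time differences, so measurements taken by the same device at the same moment may be uploaded at different points in time. To facilitate later analysis, the data are aligned to the smallest time granularity (such as minute level): second-level differences are normalized to the same minute and the values are placed in the same row of the same time dimension. The second is flattening: all measurements in the same time dimension are put into one row, which simplifies subsequent indicator analysis and data comparison. Once the basic data of all energy equipment types have been processed, the workflow ends.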
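Steps (4) to (6) above form several workflows oriented to analysis themes, all of which depend on the first workflow. FIG. 9 shows the workflow that processes basic data into an enterprise energy-use report. Many analyses can be built on the basic data, one per analysis theme, such as transformer analysis, air compressor analysis, gas boiler analysis, enterprise energy-use structure analysis, and energy-production efficiency analysis. In the enterprise energy-use report, the energy types used by the enterprise (including electricity, heat, natural gas, and steam) are analyzed together. Starting from the meter readings of each energy type, that is, the data uploaded by the meters that measure each energy type, the consumption of each energy type in each period is calculated. The cost of each period is then calculated according to the pricing strategy (such as time-of-use prices for the sharp-peak, peak, flat, and valley periods or tiered prices for electricity, and tiered prices for natural gas). All costs are summarized by enterprise (accounting unit) to generate aggregated data at the lowest time granularity (hour level), including the consumption and cost of each energy type. These are then aggregated to the next granularity (day level) and to higher granularities (month and year level). Data marts are generated for the enterprises that need reports, and each enterprise's data mart is finally exported to the reporting system for report generation. The task that exports energy data to the reporting system marks the end of the workflow.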
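Data collection tool: the data collection tool makes it easier to connect the various data sources to the energy data warehouse system. Its main functions are shown in FIG. 10, and it provides access services for all kinds of energy data. IoT device data are collected directly from the message queue and pushed to the distributed file system of the data warehouse. Business and configuration data are synchronized from the business systems at regular intervals. Third-party data are pulled periodically from third-party systems through interface calls, which involve permission authentication and data retrieval; after extraction the data are mapped and then pushed to the energy data warehouse system. Internet data are obtained by a crawler, parsed, and pushed to the energy data warehouse system. Data collection tasks need to run periodically, and their scheduling is supported by the aforementioned workflow management tool.
Data processing tool: the energy data warehouse system contains a large number of data processing tasks. The flow, classification, cleaning, aggregation, and calculation of data between the layers are all individual processing tasks, so tool support is needed, including configuring processing logic through a user interface. During data processing, the business time and the processing time of the data must be distinguished. Business time is the actual time at which the data occurred in the business process. For example, a meter reading is the value at a specific moment, and the time corresponding to that value is the business time. Processing time is the system time at which the processing program runs, which is generally later than the business time. The two times serve different processing needs: calculating the energy consumption of a given day relies on business time, whereas re-running a task that failed or had problems on a given day relies on processing time. In data processing, the variables ${biztime}, ${bizdate}, ${systime} and ${sysdate} can be provided to represent business time, business date, processing time and processing date respectively, and can be referenced during processing. Many built-in variables specific to energy systems are also provided, such as ${parkid} for the park the data belong to, ${systemid} for the system, and ${stationid} for the station; these are likewise available as built-in variables. When a data processing task is configured, the processing logic is rarely finished in one pass and usually goes through several rounds of modification, so data processing supports debugging. Each debug run checks the syntax and generates the task execution plan, and the debugging information is used to confirm that the task logic is correct. Data processing tasks need to run periodically, and their scheduling and dependencies are supported by the aforementioned workflow management tool.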
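The built-in variables use the `${name}` form, which happens to match the placeholder syntax of Python's standard `string.Template`. The sketch below shows one way such variables could be expanded in a task definition; the function name, the date formats, and the example query are assumptions, and the patent does not say how its tool performs the substitution.

```python
from datetime import datetime
from string import Template


def render_task(task_template, biz_time, sys_time, **context):
    """Expand ${biztime}, ${bizdate}, ${systime}, ${sysdate} and extra variables."""
    variables = {
        "biztime": biz_time.strftime("%Y-%m-%d %H:%M:%S"),
        "bizdate": biz_time.strftime("%Y-%m-%d"),
        "systime": sys_time.strftime("%Y-%m-%d %H:%M:%S"),
        "sysdate": sys_time.strftime("%Y-%m-%d"),
        **context,  # e.g. parkid, systemid, stationid
    }
    # substitute() raises KeyError for an undefined variable, which surfaces
    # configuration mistakes at debug time rather than at run time.
    return Template(task_template).substitute(variables)
```

For example, rendering `"... WHERE dt = '${bizdate}' AND park = '${parkid}'"` with a business time on 2020-07-01 and `parkid="P01"` fills in the date and park regardless of when the task actually runs, which is what allows a failed day to be re-run later against the same business date.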
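Data migration tool: data migration between the energy data warehouse system and external systems needs tool support. Data from the original systems can be migrated to the energy data warehouse system at regular intervals, and result data produced by the energy data warehouse system can be migrated to data application systems such as the reporting system. The data migration tool supports migration at database level and table level. It sets the mapping between source and destination tables, the migration mode (incremental or full), and the overwrite strategy for target data. Combined with task scheduling, migration tasks run periodically or are triggered when upstream processing tasks complete. Scheduling and dependencies of data migration tasks are supported by the aforementioned workflow management tool.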
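The settings listed for a migration task (source and target table, column mapping, incremental or full mode, overwrite strategy) can be pictured as a small configuration object. The sketch below is hypothetical throughout: the class, its fields, and the watermark-based selection of incremental rows are illustrative choices, not the tool's actual design.

```python
from dataclasses import dataclass


@dataclass
class MigrationTask:
    source_table: str
    target_table: str
    column_map: dict          # source column name -> target column name
    incremental: bool = True  # False means full migration
    overwrite: bool = False   # hypothetical flag: replace target rows instead of appending


def plan_rows(task, source_rows, watermark, time_key="process_time"):
    """Select the rows to migrate and rename their columns for the target table."""
    selected = [
        row for row in source_rows
        # Incremental mode keeps only rows newer than the last migrated watermark.
        if not task.incremental or row[time_key] > watermark
    ]
    return [
        {target: row[source] for source, target in task.column_map.items()}
        for row in selected
    ]
```

How the overwrite flag is applied depends on the target database and is left out of the sketch.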
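Referring to FIG. 7, the energy data warehouse system built by the method of this embodiment includes an operational data layer (ODS), a basic data layer (FDM), a general data layer (GDM), and an application data layer (ADM). The operational data layer collects data from the data sources, and after the data have been processed in turn by the operational, basic, general, and application data layers, they are available to external systems for data applications. The energy data warehouse system also includes dimensional data (DIM), which provide a reference for data processing in the general data layer and dimensional information for data processing and migration in the application data layer. The system further includes management tools that facilitate the configuration, operation, and monitoring of its data and tasks.
The beneficial effects of the method provided by this embodiment include at least the following:
(1) By constructing an operational data layer, a basic data layer, a general data layer, and an application data layer, this embodiment can build an energy data warehouse system for the energy industry. This addresses the problem that general-purpose data warehouses cannot be applied to the energy field, helps to form a unified, standard data system, speeds up the processing and analysis of energy equipment data in the energy industry, and makes it easier for data analysts and data scientists to carry out real-time analysis on high-quality massive data.
(2) This embodiment adopts the strengths of different construction methods at different layers, which helps to improve the performance of the energy data warehouse system. For example, the basic data layer is built with Bill Inmon's enterprise data warehouse architecture method, which gives the system a solid foundation, while the general data layer and the application data layer use the dimensional data warehouse construction method proposed by Ralph Kimball, which supports the flexible and changeable analysis of the upper layers.
(3) By constructing dimensional data, this embodiment provides standard support and a unified perspective for data processing and multi-dimensional analysis, so that data cleaning and processing follow a common standard, which ensures the consistency and accuracy of subsequent analysis.
(4) By building management tools, including metadata management, workflow management, data collection, data processing, and data migration tools, this embodiment supports the configuration, operation, and monitoring of data and tasks in the energy data warehouse system and helps keep the system running effectively.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order in which the processes are executed is determined by their functions and internal logic, and the numbering does not limit the implementation of the embodiments of the present invention in any way.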
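Referring to FIG. 11, this embodiment also provides a device for constructing an energy data warehouse system, including an operational data layer construction module 21, a basic data layer construction module 22, a general data layer construction module 23, and an application data layer construction module 24. The operational data layer construction module 21 performs first data processing on the energy data table of the data source to obtain the detail data table corresponding to the energy data, so as to construct the operational data layer. The basic data layer construction module 22 performs second data processing on the detail data table according to energy equipment type to obtain basic data tables, so as to construct the basic data layer. The general data layer construction module 23 performs third data processing on the basic data tables according to business analysis needs to obtain data warehouse themes, so as to construct the general data layer. The application data layer construction module 24 performs fourth data processing on the data warehouse themes according to business unit to obtain the data mart corresponding to each business unit, so as to construct the application data layer.
Referring to FIG. 12, the device further includes a dimensional data construction module 25, which constructs dimensional data according to the dimensions of energy data analysis. The dimensional data include at least one of a time dimension, a geographic dimension, a user dimension, a park dimension, a system dimension, and an equipment dimension.
Further, the device includes a management tool construction module 26, which builds management tools including at least one of a metadata management tool, a workflow management tool, a data collection tool, a data processing tool, and a data migration tool.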
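FIG. 13 is a schematic diagram of a terminal device provided by an embodiment of the present invention. As shown in FIG. 13, the terminal device 3 of this embodiment includes a processor 30, a memory 31, and a computer program 32 stored in the memory 31 and executable on the processor 30, for example an energy data warehouse system construction program. When the processor 30 executes the computer program 32, the steps of the foregoing method embodiments are implemented, for example steps S11 to S16 shown in FIGS. 1 to 6. Alternatively, when the processor 30 executes the computer program 32, the functions of the modules/units in the foregoing device embodiments are realized, for example the functions of modules 21 to 26 shown in FIGS. 11 to 12.
For example, the computer program 32 may be divided into one or more modules/units, which are stored in the memory 31 and executed by the processor 30 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution process of the computer program 32 in the terminal device 3.
The terminal device 3 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 30 and the memory 31. Those skilled in the art will understand that FIG. 13 is only an example of the terminal device 3 and does not limit it. The terminal device may include more or fewer components than shown, combine certain components, or use different components; for example, it may also include input and output devices, network access devices, and buses.
The processor 30 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 31 may be an internal storage unit of the terminal device 3, such as a hard disk or memory of the terminal device 3. The memory 31 may also be an external storage device of the terminal device 3, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the terminal device 3. Further, the memory 31 may include both an internal storage unit of the terminal device 3 and an external storage device. The memory 31 stores the computer program and the other programs and data required by the terminal device 3. The memory 31 can also be used to temporarily store data that have been output or are to be output.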
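Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is only an example. In practice, the functions can be assigned to different functional units and modules as required; that is, the internal structure of the device can be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit; an integrated unit may be implemented in hardware or as a software functional unit. The specific names of the functional units and modules serve only to distinguish them from one another and do not limit the scope of protection of this application. For the specific working process of the units and modules in the above system, refer to the corresponding process in the foregoing method embodiments, which is not repeated here. In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are only illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the above method embodiments can also be completed by a computer program that instructs the relevant hardware. The computer program can be stored in a computer-readable storage medium, and when it is executed by a processor, it implements the steps of the foregoing method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, etc. It should be noted that the content contained in the computer-readable medium can be appropriately added to or reduced according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and they shall all fall within the scope of protection of the present invention.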

Claims (15)

  1. 一种能源数据仓库系统构建方法,其特征在于,包括:
    对数据源的能源数据进行第一数据处理,获得所述能源数据对应的细节数据表,以构建操作数据层;
    根据能源设备类型对所述细节数据表进行第二数据处理,获取基础数据表,以构建基础数据层;
    根据业务分析需要对所述基础数据表进行第三数据处理,获取数据仓库主题,以构建通用数据层;
    根据业务单元对所述数据仓库主题进行第四数据处理,获取所述业务单元对应的数据集市,以构建应用数据层。
  2. 如权利要求1所述的能源数据仓库系统构建方法,其特征在于,所述数据源的能源数据至少包括能源设备运行数据、能源系统配置数据、业务开展过程中的业务数据、通过网络获取的互联网数据以及第三方系统提供的第三方数据中的一种。
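  3. The method for constructing an energy data warehouse system according to claim 1, characterized in that performing first data processing on the energy data table of the data source to obtain the detail data table corresponding to the energy data, so as to construct the operational data layer, comprises:
    loading the energy data from the data source into a raw data table;
    parsing the raw data table according to the processing time of the energy data to determine the newly loaded data in the raw data table;
    determining whether the newly loaded data is abnormal data;
    if the newly loaded data is not abnormal data, performing format conversion on the newly loaded data to obtain intermediate data;
    partitioning the intermediate data according to energy equipment type and acquisition time, and writing the intermediate data into the detail data table, so as to construct the operational data layer.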
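  4. The method for constructing an energy data warehouse system according to claim 3, characterized in that the detail data table includes at least one of: site information, equipment type, equipment identifier, measurement attribute, measurement time, and measurement value.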
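  5. The method for constructing an energy data warehouse system according to claim 1, characterized in that performing second data processing on the detail data table according to energy equipment type to obtain the base data table, so as to construct the base data layer, comprises:
    classifying the data in the detail data table according to energy equipment type to obtain the data corresponding to each type of energy equipment;
    aligning the data corresponding to the energy equipment in time according to the minimum time granularity to obtain first data;
    flattening the first data to obtain second data;
    writing the second data into the base data table corresponding to each energy equipment type, so as to construct the base data layer.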
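  6. The method for constructing an energy data warehouse system according to claim 1, characterized in that performing third data processing on the base data table according to business analysis needs to obtain the data warehouse theme, so as to construct the common data layer, comprises:
    determining the data warehouse theme according to business analysis needs;
    reading, according to the data warehouse theme, the data in the base data table corresponding to each energy equipment type;
    aggregating the data corresponding to each energy equipment type with reference to dimension data to obtain aggregated data;
    writing the aggregated data into the data warehouse theme, so as to construct the common data layer.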
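  7. The method for constructing an energy data warehouse system according to claim 6, characterized in that the data warehouse theme includes at least one of: an energy production analysis theme for energy enterprises, an energy use analysis theme for energy-consuming enterprises, an enterprise energy efficiency analysis theme, an equipment operating status trend analysis theme, an equipment predictive maintenance analysis theme, and an enterprise data access quality analysis theme.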
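  8. The method for constructing an energy data warehouse system according to claim 1, characterized in that performing fourth data processing on the data warehouse theme according to the business unit to obtain the data mart corresponding to the business unit, so as to construct the application data layer, comprises:
    determining the data mart according to the business unit;
    classifying the data in the data warehouse theme according to the data mart;
    writing the classified data from the data warehouse theme into the corresponding data mart and determining access rights, so as to construct the application data layer.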
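  9. The method for constructing an energy data warehouse system according to claim 8, characterized in that, after the step of writing the classified data from the data warehouse theme into the corresponding data mart and determining access rights, so as to construct the application data layer, the method further comprises:
    migrating the data mart to a reporting system to generate reports.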
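  10. The method for constructing an energy data warehouse system according to claim 1, characterized in that the method further comprises:
    constructing dimension data according to energy data analysis dimensions, the dimension data including at least one of: a time dimension, a geographic dimension, a user dimension, a park dimension, a system dimension, and an equipment dimension.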
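  11. The method for constructing an energy data warehouse system according to claim 1, characterized in that the method further comprises:
    constructing management tools, the management tools including at least one of: a metadata management tool, a workflow management tool, a data acquisition tool, a data processing tool, and a data migration tool.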
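  12. A device for constructing an energy data warehouse system, characterized by comprising:
    an operational data layer construction module, configured to perform first data processing on an energy data table of a data source to obtain a detail data table corresponding to the energy data, so as to construct an operational data layer;
    a base data layer construction module, configured to perform second data processing on the detail data table according to energy equipment type to obtain a base data table, so as to construct a base data layer;
    a common data layer construction module, configured to perform third data processing on the base data table according to business analysis needs to obtain a data warehouse theme, so as to construct a common data layer;
    an application data layer construction module, configured to perform fourth data processing on the data warehouse theme according to a business unit to obtain a data mart corresponding to the business unit, so as to construct an application data layer.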
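  13. The device for constructing an energy data warehouse system according to claim 12, characterized in that the device further comprises:
    a dimension data construction module, configured to construct dimension data according to energy data analysis dimensions, the dimension data including at least one of: a time dimension, a geographic dimension, a user dimension, a park dimension, a system dimension, and an equipment dimension;
    and/or a management tool construction module, configured to construct management tools, the management tools including at least one of: a metadata management tool, a workflow management tool, a data acquisition tool, a data processing tool, and a data migration tool.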
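  14. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method for constructing an energy data warehouse system according to claim 1.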
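  15. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method for constructing an energy data warehouse system according to claim 1.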
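
As a rough illustration of how the four layers of claim 1 could feed one another, with the sub-steps of claims 3, 5, 6, and 8, the following Python sketch may help. It is not the applicant's implementation. Every function name, the sample records, the field names, the rule that a non-numeric value counts as abnormal, the one-minute granularity, and the single `energy_use` theme are assumptions made for illustration. The incremental-load detection of claim 3 is omitted.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw readings; field names loosely follow the detail-table fields of claim 4.
RAW = [
    {"site": "S1", "equipment_type": "boiler", "equipment_id": "B1",
     "attr": "temp", "ts": "2020-07-23T10:00:20", "value": "80.5"},
    {"site": "S1", "equipment_type": "boiler", "equipment_id": "B1",
     "attr": "pressure", "ts": "2020-07-23T10:00:40", "value": "1.2"},
    {"site": "S1", "equipment_type": "boiler", "equipment_id": "B1",
     "attr": "temp", "ts": "2020-07-23T10:01:10", "value": "bad"},
    {"site": "S1", "equipment_type": "meter", "equipment_id": "M1",
     "attr": "kwh", "ts": "2020-07-23T10:00:05", "value": "3.0"},
]


def build_operational_layer(raw):
    """Claim 3 (simplified): skip abnormal rows, convert formats, partition."""
    detail = defaultdict(list)
    for row in raw:
        try:
            value = float(row["value"])
        except ValueError:
            continue  # assumed rule: a non-numeric value is abnormal data
        ts = datetime.fromisoformat(row["ts"])
        # Partition key: equipment type plus acquisition date.
        detail[(row["equipment_type"], ts.date())].append(
            {**row, "ts": ts, "value": value})
    return detail


def build_base_layer(detail):
    """Claim 5 (simplified): classify by type, align to the minute, flatten."""
    base = defaultdict(dict)
    for (equipment_type, _day), rows in detail.items():
        for r in rows:
            minute = r["ts"].replace(second=0, microsecond=0)
            # One wide row per equipment and minute; attributes become columns.
            wide = base[equipment_type].setdefault(
                (r["equipment_id"], minute),
                {"site": r["site"], "equipment_id": r["equipment_id"], "ts": minute})
            wide[r["attr"]] = r["value"]
    return {t: list(rows.values()) for t, rows in base.items()}


def build_common_layer(base, site_dimension):
    """Claim 6 (simplified): aggregate one theme against a dimension table."""
    totals = defaultdict(float)
    for row in base.get("meter", []):
        totals[site_dimension[row["site"]]] += row["kwh"]
    return {"energy_use": dict(totals)}


def build_application_layer(themes, marts):
    """Claim 8 (simplified): route themes to data marts and record access rights."""
    return {name: {"data": {t: themes[t] for t in cfg["themes"]},
                   "allowed_roles": cfg["roles"]}
            for name, cfg in marts.items()}


detail = build_operational_layer(RAW)
base = build_base_layer(detail)
themes = build_common_layer(base, {"S1": "Park A"})
marts = build_application_layer(
    themes, {"operations": {"themes": ["energy_use"], "roles": ["analyst"]}})
print(marts)
```

Each layer here consumes only the output of the layer before it, which mirrors the ordering of the four processing steps in claim 1; a real system would write to tables rather than in-memory dictionaries.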
PCT/CN2020/103657 2019-12-31 2020-07-23 Method and device for constructing an energy data warehouse system WO2021135177A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911401126.9 2019-12-31
CN201911401126.9A CN111104394A (zh) 2019-12-31 2019-12-31 Method and device for constructing an energy data warehouse system

Publications (1)

Publication Number Publication Date
WO2021135177A1 (zh) 2021-07-08

Family

ID=70425196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/103657 WO2021135177A1 (zh) 2019-12-31 2020-07-23 Method and device for constructing an energy data warehouse system

Country Status (2)

Country Link
CN (1) CN111104394A (zh)
WO (1) WO2021135177A1 (zh)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
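CN111104394A (zh) * 2019-12-31 2020-05-05 新奥数能科技有限公司 Method and device for constructing an energy data warehouse system
CN111159154A (zh) * 2019-12-31 2020-05-15 新奥数能科技有限公司 Energy data warehouse system
CN112035468A (zh) * 2020-08-24 2020-12-04 杭州览众数据科技有限公司 Multi-data-source ETL tool based on in-memory computing and web-based visual configuration
CN112084182A (zh) * 2020-09-10 2020-12-15 重庆富民银行股份有限公司 Data modeling method for data marts and data warehouses
CN112231410A (zh) * 2020-10-23 2021-01-15 中国平安人寿保险股份有限公司 Data processing method, device, equipment, and medium suitable for big data
CN112395298A (zh) * 2020-10-26 2021-02-23 国电南瑞科技股份有限公司 Data consistency management system based on the data layering concept
CN112307041A (zh) * 2020-10-29 2021-02-02 山东浪潮通软信息科技有限公司 Indicator dimension modeling method, device, and computer-readable medium
CN112434078A (zh) * 2020-11-20 2021-03-02 广州奇享科技有限公司 Boiler data processing method, device, equipment, and storage medium
CN112416918B (zh) * 2020-11-20 2024-04-26 移通科技(杭州)有限公司 Data governance system and working method thereof
US11797557B2 (en) 2020-12-03 2023-10-24 Boe Technology Group Co., Ltd. Data management platform, intelligent defect analysis system, intelligent defect analysis method, computer-program product, and method for defect analysis
CN112541841B (zh) * 2020-12-14 2023-12-26 新奥数能科技有限公司 Method, device, and terminal equipment for simulating past and future data
CN112818017A (zh) * 2021-01-22 2021-05-18 百果园技术(新加坡)有限公司 Event data processing method and device
CN113407649A (zh) * 2021-06-30 2021-09-17 北京百度网讯科技有限公司 Data warehouse modeling method, device, electronic equipment, and storage medium
CN114064993A (zh) * 2021-11-16 2022-02-18 瀚云科技有限公司 Energy data acquisition method, and energy balance diagram construction method and device
CN114490886A (zh) * 2021-12-29 2022-05-13 北京航天智造科技发展有限公司 Data-warehouse-based method for building a data lake for an industrial operating system
CN114741357A (zh) * 2022-06-13 2022-07-12 广东电网有限责任公司 Data migration method, system, equipment, and storage medium
CN115470304B (zh) * 2022-08-31 2023-08-25 北京九章云极科技有限公司 Feature causal warehouse management method and system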

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104991960A (zh) * 2015-07-22 2015-10-21 北京京东尚科信息技术有限公司 Method and device for constructing a data warehouse model
CN108280084A (zh) * 2017-01-06 2018-07-13 上海前隆信息科技有限公司 Data warehouse construction method, system, and server
CN111104394A (zh) * 2019-12-31 2020-05-05 新奥数能科技有限公司 Method and device for constructing an energy data warehouse system
CN111160865A (zh) * 2019-12-31 2020-05-15 新奥数能科技有限公司 Workflow management method and device
CN111159154A (zh) * 2019-12-31 2020-05-15 新奥数能科技有限公司 Energy data warehouse system

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
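KR101253335B1 (ko) * 2011-06-07 2013-04-10 백승호 Database construction method using a data warehouse and system thereof
CN106294521B (zh) * 2015-06-12 2019-09-06 交通银行股份有限公司 Data storage method and data warehouse system
CN104915456A (zh) * 2015-07-03 2015-09-16 宁夏隆基宁光仪表有限公司 Massive electricity consumption data mining method based on a data analysis system
CN105843880A (zh) * 2016-03-21 2016-08-10 中国矿业大学 Coal mine multidimensional data warehouse system based on multiple data marts
CN106339509A (zh) * 2016-10-26 2017-01-18 国网山东省电力公司临沂供电公司 Power grid operation data sharing system based on big data technology
CN106874353A (zh) * 2016-12-28 2017-06-20 合肥智畅信息科技有限公司 Operational business intelligence application system
CN109033113B (zh) * 2017-06-12 2021-07-30 北京京东尚科信息技术有限公司 Method and device for managing data warehouses and data marts
CN110019462B (zh) * 2017-11-14 2021-09-03 南方电网科学研究院有限责任公司 Electric power scientific research and production data analysis method, device, system, and storage medium
CN109669975B (zh) * 2018-11-09 2020-12-18 成都数之联科技有限公司 Industrial big data processing system and method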

Also Published As

Publication number Publication date
CN111104394A (zh) 2020-05-05

Similar Documents

Publication Publication Date Title
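WO2021135177A1 (zh) Method and device for constructing an energy data warehouse system
US20220358606A1 (en) Methods and systems for machine-learning for prediction of grid carbon emissions
WO2021135727A1 (zh) Energy data warehouse system
Wu et al. An integrated decision-making model for sustainable photovoltaic module supplier selection based on combined weight and cumulative prospect theory
CN107194621B (zh) Water supply pipe network management system and method
CN108197132B (zh) Graph-database-based method and device for constructing power asset profiles
CN111008197A (zh) Method for designing a data middle platform for an electric power marketing service system
CN111291076B (zh) Big-data-based abnormal water use monitoring and alarm system and construction method thereof
WO2018176863A1 (zh) Method and device for analyzing the economic benefit of distribution network reliability investment, and storage medium
CN109376924A (zh) Material demand forecasting method, device, equipment, and readable storage medium
CN111160865A (zh) Workflow management method and device
CN113111053A (zh) Big-data-based line loss diagnosis and electricity theft prevention system, method, and model
CN106327055A (zh) Electricity fee control method and system based on big data technology
CN113872813B (zh) Full life cycle management method and system for carrier communication equipment
CN114880405A (zh) Data-lake-based data processing method and system
CN110490761A (zh) Method for modeling a ledger data model of power grid distribution network equipment
CN114036206A (zh) Multi-energy-type energy information management system based on a time-series database
CN115330404A (zh) System and method for electric power marketing inspection
CN105205185A (zh) Method for data interaction and data modeling between a monitoring system and a management information system
RU105491U1 (ru) Automated system for access to information resources based on a universal classifier of budget data
CN113590607A (zh) Method and system for implementing electric power marketing reports based on report factors
CN117057835A (zh) Auxiliary analysis method and system for power grid engineering cost
CN111127186A (zh) Method for using a customer credit rating evaluation system based on big data technology
CN115905319B (zh) Method and system for automatically identifying abnormal electricity charges among massive numbers of users
CN116226293A (zh) Method and system for generating and managing electric power customer profiles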

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20910895

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20910895

Country of ref document: EP

Kind code of ref document: A1