CN110019228A - Multi-source data integration method and device based on fan data - Google Patents

Info

Publication number
CN110019228A
Authority
CN
China
Prior art keywords
data
blower
source
variable
unity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711418200.9A
Other languages
Chinese (zh)
Other versions
CN110019228B (en)
Inventor
徐斌
霍钧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Goldwind Science and Creation Windpower Equipment Co Ltd
Original Assignee
Beijing Goldwind Science and Creation Windpower Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Goldwind Science and Creation Windpower Equipment Co Ltd
Priority to CN201711418200.9A
Publication of CN110019228A
Application granted
Publication of CN110019228B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Structures Of Non-Positive Displacement Pumps (AREA)

Abstract

A multi-source data integration method and device based on fan data are provided. The method comprises the following steps: designing a table structure of a fan data fact table by analyzing fan service data and fan data of each data source; respectively extracting the fan data of each data source into a designed fan data fact table and converting data identification in the extracted fan data into uniform data identification; data cleaning is carried out on the converted data of each data source; and carrying out data fusion on the cleaned data of each data source to generate a fan data real-time table.

Description

Multi-source data integration method and device based on fan data
Technical field
The present invention relates to the technical field of wind power generation and, more particularly, to a multi-source data integration method and device based on fan data.
Background art
In recent years, with the rapid growth of the wind power industry, wind turbine fleets have become very large. As society develops and the market keeps expanding, a number of operation and maintenance (O&M) management systems have been established and run to standardize enterprise management practices and to meet customer requirements. Running several O&M management systems helps an enterprise adapt to different market changes. However, because these systems differ in construction period, developing department, equipment used, stage of technical development and capability level, their rich data resources are stored and managed in an extremely scattered way. This causes excessive data redundancy and data inconsistency, so that the data resources are difficult to query and access and management cannot obtain effective data to support decisions. A manager who wants information about the different departments under their charge often has to log in to many different systems, and the data cannot be compared and analyzed directly.
The cloud side of the O&M management systems is poorly integrated with client systems, interconnectivity is poor, information management is scattered, and there are large gaps in data completeness, accuracy and timeliness. As a result, many information islands have formed, and a shared, networked, highly available fan data system is lacking.
Digital transformation places high demands on the accuracy and validity of the data in each O&M management system. During unit operation, each O&M management system returns a large amount of unit configuration data, and each system's data source emphasizes different data, so the data as a whole is scattered. These large volumes of data therefore need to be integrated into complete, valid data and stored. Accordingly, a multi-source data integration method and device for fan data is needed.
Summary of the invention
To solve at least the above problems and/or disadvantages and to provide at least the advantages described below, the present invention provides a multi-source data integration method and device based on fan data.
One aspect of the present invention provides a multi-source data integration method based on fan data, the method comprising: designing a table structure of a fan data fact table by analyzing fan service data and the fan data of each data source; extracting the fan data of each data source into the designed fan data fact table and converting the data identifiers in the extracted fan data into unified data identifiers; performing data cleaning on the converted data of each data source; and performing data fusion on the cleaned data of each data source to generate a fan data real-time table.
Preferably, the step of designing the table structure of the fan data fact table may include: generating a unit basic information dimension table from the basic data of each data source using unified data variable names; generating a variable information dimension table by determining the variable information required by the service system; and generating a variable conversion relation table between the data sources by analyzing the variable information of each data source.
Preferably, the step of designing the table structure of the fan data fact table may include: designing the table structure of the fan data fact table according to the unit code and unit variable names of each data source, in combination with the data acquisition time and the data return time.
Preferably, the step of extracting data may include: extracting all fan data from each data source into a temporary table of a target database, and extracting the data relevant to the fan service system into the designed fan data fact table according to the variable names defined in the variable information dimension table.
Preferably, the step of converting data may include: converting, with reference to the generated variable conversion relation table, the unit codes and wind farm codes of each data source into unified unit codes and wind farm codes.
Preferably, the step of data cleaning may include: with reference to the generated variable information dimension table, and in view of the duplicate, incomplete and/or misplaced data that occurs during data transmission from each data source, performing a type check on fan data of different data types and filtering out fan data whose data type does not match.
Preferably, the step of data cleaning may further include: with reference to the variable information dimension table and the unit basic information dimension table, performing a validity check on the fan data that has passed the data type check against the check rules, naming conventions and data value range of each unit variable, and filtering out invalid data.
Preferably, the step of data cleaning may further include: with reference to the variable information dimension table and the unit basic information dimension table, determining whether associated configuration information in the fan data conflicts and, when it does, filtering out all data extracted in the current batch.
Preferably, the step of data cleaning may further include: with reference to the variable information dimension table, performing a conflict check on the associated variable values among the unit variables and, when associated variable values conflict, filtering out all data extracted in the current batch.
Preferably, the step of data fusion may include: sorting the cleaned fan data of each data source by unit code, variable name and data acquisition time; filtering out data with repeated variable values to generate a new fan data fact table; and sorting the new fan data fact table in descending order of data acquisition time to generate the fan data real-time table.
Preferably, the step of data fusion may further include: integrating and cleaning the fan data real-time tables of the data sources, and storing the integrated and cleaned fan data real-time table in a database.
Another aspect of the present invention provides a multi-source data integration device based on fan data, the device comprising: an analysis and design module configured to design a table structure of a fan data fact table by analyzing fan service data and the fan data of each data source; a data extraction module configured to extract the fan data of each data source into the designed fan data fact table; a data conversion module configured to convert the data identifiers in the extracted fan data into unified data identifiers; a data cleaning module configured to perform data cleaning on the converted data of each data source; and a data fusion module configured to perform data fusion on the cleaned data of each data source to generate a fan data real-time table.
Another aspect of the present invention provides a multi-source data integration device based on fan data, the device comprising: a memory configured to store instructions; and a processor configured to run the instructions stored in the memory to perform the following operations: designing a table structure of a fan data fact table by analyzing fan service data and the fan data of each data source; extracting the fan data of each data source into the designed fan data fact table and converting the data identifiers in the extracted fan data into unified data identifiers; performing data cleaning on the converted data of each data source; and performing data fusion on the cleaned data of each data source to generate a fan data real-time table.
Another aspect of the present invention provides a computer-readable storage medium comprising a computer program that can be run by a processor to execute the above multi-source data integration method based on fan data.
Another aspect of the present invention provides a computer comprising: a memory configured to store instructions; and a processor configured to run the instructions stored in the memory to execute the above multi-source data integration method based on fan data.
The multi-source data integration method and device based on fan data described above can effectively integrate the data resources of existing and future O&M management systems, form a unified, complete, accurate and highly available fan data system, and solve the problem of integrating large volumes of data from multiple data sources.
Brief description of the drawings
The above and other objects, features and advantages of the present invention will become clearer from the following detailed description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart showing a multi-source data integration method based on fan data according to an exemplary embodiment of the present invention;
Fig. 2 is a flowchart showing a data cleaning operation according to an exemplary embodiment of the present invention;
Fig. 3 is a flowchart showing a data type check according to an exemplary embodiment of the present invention;
Fig. 4 is a flowchart showing a data validity check according to an exemplary embodiment of the present invention;
Fig. 5 is a flowchart showing a service logic check on data according to an exemplary embodiment of the present invention;
Fig. 6 is a flowchart showing a variable value check on data according to an exemplary embodiment of the present invention;
Fig. 7 is a flowchart showing data fusion according to an exemplary embodiment of the present invention;
Fig. 8 is a block diagram showing a multi-source data integration device according to an exemplary embodiment of the present invention.
Detailed description of embodiments
Hereinafter, exemplary embodiments of the present invention are described in detail with reference to the accompanying drawings, in which like reference numerals denote like components throughout. It should be understood that the multi-source data integration method and device based on fan data according to exemplary embodiments of the present invention can be applied to integrating large volumes of data from multiple data sources of various wind turbines.
Fig. 1 is a flowchart showing a multi-source data integration method based on fan data according to an exemplary embodiment of the present invention.
As shown in Fig. 1, in step S110, the table structure of the fan data fact table is designed by analyzing fan service data and the fan data of each data source. Specifically, the fan data required from each O&M management system (i.e., each data source) is analyzed in light of the fan business to produce a fan data requirements report, and the table structure of the fan data fact table is designed for the required fan data on the basis of that report. When designing the table structure for each data source, the content of the data, its update frequency and its acquisition mode are first analyzed for each O&M management system in order to obtain information on the data each system requires, and the analysis results are applied in the subsequent data cleaning operation. The service data of the fan service system is then collated and analyzed, and the names of the data variables required by the service system are determined so as to keep the variable information consistent across product types.
Specifically, when designing the table structure of the fan data fact table, a unit basic information dimension table may be generated from the basic data of each data source using unified data variable names. The basic data may include data about the wind farm (wind farm code, wind farm name, region, province, etc.), data about the fan unit (unit code, unit capacity, unit major class, unit subclass, etc.) and protocol information (for example, protocol number and protocol type).
Because different O&M management systems may use different variable names and their data structures may not match, unified variable names and data structures must be used for the unit data of different product types so that the basic unit data of the various systems can be gathered into unified basic unit data management. For example, suppose the unit ID from a first O&M management system is 100001001 and the unit ID from a second O&M management system is GW150001; using the naming rule of the first system, the unit ID GW150001 in the second system can be changed to 100001001. The unit basic information dimension table makes it possible to know which unit the data in the fan data fact table comes from and the basic information of that unit. According to an embodiment of the present disclosure, the unit basic information dimension table may include, depending on the needs of the fan business, the wind farm code, unit code, unit name, unit capacity and product type, as shown in Table 1. This embodiment is merely exemplary, and the disclosure is not limited thereto.
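The unit ID unification in the example above can be sketched as a lookup keyed by source system and native ID. This is only an illustration: the source labels `oms_1` and `oms_2`, the mapping table and the function name are assumptions and do not appear in the patent.

```python
# Hypothetical mapping from (source system, native unit ID) to the unified unit ID.
# The two IDs are taken from the example in the text; the source labels are assumed.
UNIT_ID_MAP = {
    ("oms_1", "100001001"): "100001001",
    ("oms_2", "GW150001"): "100001001",
}


def unify_unit_id(source: str, native_id: str) -> str:
    """Return the unified unit ID; raise KeyError if the pair is not mapped."""
    return UNIT_ID_MAP[(source, native_id)]


unified = unify_unit_id("oms_2", "GW150001")
print(unified)
```

Raising on an unmapped ID, rather than passing it through, is a design choice of this sketch so that unknown units are noticed instead of silently entering the fact table.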
Table 1. Unit basic information dimension table
Wind farm code | Unit code | Unit name | Unit capacity | Product type
When designing the table structure of the fan data fact table, a variable information dimension table may also be generated by determining the variable information required by the service system. For example, suppose the fan service system requires 500 specified variables. When the basic data of each O&M management system is analyzed, the variables in each system that belong to those 500 specified variables are identified, and the variable information dimension table is formed from the data variable names required by the service system. The variable information dimension table makes it possible to know the meaning, name and so on of each variable in the fan data fact table. According to an embodiment of the present disclosure, the variable information dimension table may include the variable type, variable value range and variable name, as shown in Table 2.
Table 2. Variable information dimension table
Variable type | Data range | Variable name | Variable meaning
Because the same variable may have different variable identifiers in different O&M management systems, a variable conversion relation table needs to be generated when designing the table structure of the fan data fact table, based on business requirements and the analysis of each O&M management system. For example, different O&M management systems may use different versions of the central control system. Suppose the first O&M management system uses the second version of the central control system and the second O&M management system uses the third version; the variable names of the second-version system then need to be converted into the variable names of the third-version system to facilitate management.
The above way of designing the table structures, including the dimension tables, is merely exemplary, and the disclosure is not limited thereto. Designing the unit basic information dimension table, the variable information dimension table and so on can reduce data redundancy and association relationships.
According to an exemplary embodiment of the present application, the table structure of the fan data fact table of each data source may be designed with reference to the unit basic information dimension table and variable information dimension table described above, according to the unit code and unit variable names of each O&M management system, in combination with the data acquisition time and the data return time. For example, according to an embodiment of the present disclosure, the fan data fact table may be structured in the form of Table 3; however, the disclosure is not limited thereto.
Table 3. Fan data fact table structure
Wind farm code | Unit code | Variable name | Variable value | Data acquisition time | Data source | Return time
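As a minimal sketch, the columns of Table 3 could be declared as a relational table. The patent does not name a database engine; SQLite is used here only because it ships with Python, and the table name, column names and column types are assumptions.

```python
import sqlite3

# Hypothetical DDL mirroring the seven columns of Table 3.
FACT_TABLE_DDL = """
CREATE TABLE fan_data_fact (
    wind_farm_code   TEXT NOT NULL,
    unit_code        TEXT NOT NULL,
    variable_name    TEXT NOT NULL,
    variable_value   TEXT,
    acquisition_time TEXT,
    data_source      TEXT,
    return_time      TEXT
)
"""


def create_fact_table(conn: sqlite3.Connection) -> None:
    """Create the (assumed) fact table on the given connection."""
    conn.execute(FACT_TABLE_DDL)


conn = sqlite3.connect(":memory:")
create_fact_table(conn)
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(fan_data_fact)")]
print(columns)
```

Storing one variable per row (name plus value) rather than one column per variable keeps the table shape stable when the set of required variables changes, which matches the layout of Table 3.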
Designing the dimension tables and the fan data fact table as described above serves the purposes of data extraction, data cleaning and data integration, and results in less data redundancy when the data is stored. The series of operations that process fan data using the results of the analysis and design in step S110 is explained in detail below.
After the fan data of each data source and the data required by the fan service system have been analyzed and the tables designed, in step S120 the fan data of each data source is extracted into the designed fan data fact table and the data identifiers in the extracted fan data are converted into unified data identifiers. Specifically, a universal data interface is first used to extract the fan data from each O&M management system into a temporary table of the target database. In an embodiment of the present disclosure, a data interface that can adapt to multiple data types extracts all data from the historical data in each O&M management system. This first extraction yields a complete set of usable data. The data relevant to the service system is then extracted into the designed fan data fact table according to the variable names defined in the variable information dimension table.
In subsequent extractions, data is extracted as a daily increment according to the data acquisition time. For example, if the data acquisition time is 1 p.m. every day, then at 1 p.m. each day the previous day's fan data, i.e. the data added on the previous day, is extracted from each O&M management system. If the data source is a database, the data can be extracted according to the return time field in the fact table.
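The daily incremental extraction can be illustrated by selecting the rows whose return time falls on the day before the run date. The row layout, the field name `return_time` and the function name are assumptions made for this sketch.

```python
from datetime import date, datetime, timedelta


def previous_day_rows(rows, run_date):
    """Keep the rows whose return_time falls on the day before run_date."""
    start = datetime.combine(run_date - timedelta(days=1), datetime.min.time())
    end = start + timedelta(days=1)  # half-open window [start, end)
    return [r for r in rows if start <= r["return_time"] < end]


# Hypothetical rows already present in a source system.
rows = [
    {"unit_code": "101001001", "return_time": datetime(2017, 12, 24, 9, 30)},
    {"unit_code": "101001001", "return_time": datetime(2017, 12, 25, 8, 0)},
]
new_rows = previous_day_rows(rows, date(2017, 12, 25))
print(len(new_rows))
```

A half-open window avoids extracting a row twice when its return time sits exactly on midnight.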
After the data has been extracted into the fan data fact table, the unit codes and wind farm codes of each O&M management system are converted into unified unit codes and wind farm codes with reference to the variable conversion relation table. For example, the data extracted from the first O&M management system and converted is placed in the designed fan data fact table, as shown in Table 4. Table 4 is merely exemplary, and the disclosure is not limited thereto.
Table 4. Fan data fact table
Wind farm code | Unit code | Variable name | Variable value | Data acquisition time
101001 | 101001001 | Primary control program version number | 1500_FR_V170725 |
101001 | 101001001 | Fan type | 121/1500 |
101001 | 101001001 | Inverter type | 3 |
In step S130, data cleaning is performed on the converted data of each data source. Owing to the particular nature of wind farm networks and the national requirements on grid security, each O&M management system is prone to duplicate, incomplete and misplaced data during data transmission. The duplicate, incomplete and/or misplaced fan data that arises in transmission therefore has to be cleaned according to service logic. The data cleaning operation is described in detail below with reference to Fig. 2.
Fig. 2 is a flowchart showing a data cleaning operation according to an exemplary embodiment of the present invention.
As shown in Fig. 2, in step S211, with reference to the generated variable information dimension table, and in view of the duplicate, incomplete and/or misplaced data that occurs during data transmission from each data source, a type check is performed on fan data of different data types and fan data whose data type does not match is filtered out. Because errors can occur in fan data during storage and return, the data type can be used for a preliminary filtering of the fan data. Referring to Fig. 3, in step S310 the data types in the data extracted and converted in step S120 are first queried; in step S320 the data types are classified according to the type attribute of the unit variable; in step S330 a data type check is performed on fan data of the different data types; in step S340 the data that passes the data type check is merged and recorded; and in step S350 the data whose type does not match is filtered out, which improves data quality. For example, data types may be classified as character, numeric and so on. If the data type of the wind farm code in the variable information dimension table is character and the data type of an extracted wind farm code is numeric, that data is filtered out.
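The type check of steps S330 to S350 can be sketched as splitting records by whether each value has the type expected for its variable. The expected-type table, the record layout and the function name are assumptions; in the patent the expected types come from the variable information dimension table.

```python
# Hypothetical expected types, standing in for the variable information dimension table.
EXPECTED_TYPES = {"wind_farm_code": str, "inverter_type": int}


def type_check(records):
    """Split records into (matched, mismatched) by the expected type of each variable."""
    matched, mismatched = [], []
    for rec in records:
        expected = EXPECTED_TYPES.get(rec["variable_name"])
        ok = expected is not None and isinstance(rec["value"], expected)
        (matched if ok else mismatched).append(rec)
    return matched, mismatched


records = [
    {"variable_name": "wind_farm_code", "value": "101001"},  # character type: kept
    {"variable_name": "wind_farm_code", "value": 101001},    # numeric type: filtered
]
matched, mismatched = type_check(records)
print(len(matched), len(mismatched))
```

In this sketch a variable that is not listed in the expected-type table is treated as a mismatch; whether unknown variables should instead be kept is not specified in the text.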
In step S212, with reference to the variable information dimension table and the unit basic information dimension table, a validity check is performed on the fan data that has passed the data type check against the check rules, naming conventions and data value range of each unit variable. Referring to Fig. 4, in step S410 the data that has passed the data type check is classified by the variable names in the variable information dimension table, and in step S420 a validity check is performed according to each variable's check rules, naming conventions and numerical range. For example, for the main control software version number variable, the product type of the wind turbine corresponding to the version number variable is examined to determine whether the naming convention of the main control software version and the variable values that go with the main control software version number variable are correct. If a variable value does not match, a name does not conform to the convention, or data falls outside the variable's data range, the data is determined to be invalid and is filtered out in step S430.
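A validity check of this kind can be sketched with one rule per variable: a naming pattern for version strings and a numeric range for coded values. The rule table, the regular expression and the range 1 to 5 are illustrative assumptions, not rules stated in the patent.

```python
import re

# Hypothetical rules, standing in for the check rules held in the dimension tables.
RULES = {
    "main_control_version": {"pattern": re.compile(r"\d{4}_[A-Z]{2}_V\d+")},
    "inverter_type": {"min": 1, "max": 5},
}


def is_valid(name, value):
    """Return True if value satisfies the naming pattern or range rule for name."""
    rule = RULES.get(name)
    if rule is None:
        return False  # no rule known for this variable
    if "pattern" in rule:
        return isinstance(value, str) and rule["pattern"].fullmatch(value) is not None
    return isinstance(value, int) and rule["min"] <= value <= rule["max"]


print(is_valid("main_control_version", "1500_FR_V170725"))
print(is_valid("inverter_type", 9))
```

`fullmatch` is used so that a value with trailing or leading junk around an otherwise well-formed version string is rejected.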
In step S213, with reference to the variable information dimension table and the unit basic information dimension table, it is determined whether associated configuration information in the fan data conflicts; when it does, the fan data acquired in the same batch is filtered out. Here, configuration information refers to information in the unit basic information dimension table and the variable information dimension table from which it can be judged whether the data in the fan data fact table matches. For example, variables in the fan data fact table such as the central control software version number and the fan type can be judged against the unit type in the unit basic information dimension table.
Because the data of each service system has its own particularities, data may be converted incorrectly during return, leading to variable value conflicts. A service logic check therefore has to be performed on the fan data, i.e. it is verified whether interrelated configuration information conflicts. Specifically, referring to Fig. 5, in step S510 the data that has undergone the variable value check is compared with the variable information dimension table and the unit basic information dimension table; in step S520 the interrelated configuration information in the unit data for the different product types is found; fields are then selected in step S530; and in step S540 a conflict check is performed on the interrelated configuration information to determine whether associated configuration information in the fan data conflicts. When the configuration information conflicts, in step S550 the fan data of the current batch is filtered out, i.e. all data acquired for the current unit in that batch is filtered out. For example, when the primary control program version number is 1500_FR_V21070725 and the inverter type is 2, the two variables conflict, because when the primary control program version number is in the 1500_FR_V2107072 format the inverter type variable can only be one of 1, 3 and 5. A conflict is therefore determined, and all data acquired for the current unit in that batch is filtered out.
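The batch-level filtering in steps S540 and S550 can be sketched as follows: rows are grouped per unit and batch, and if any group contains conflicting configuration values, every row of that group is dropped. The row layout, the `batch` field, the variable names and the prefix test used to recognize the version format are assumptions of this sketch; only the allowed inverter types 1, 3 and 5 come from the example in the text.

```python
from collections import defaultdict

ALLOWED_INVERTER_TYPES = {1, 3, 5}  # taken from the example in the text


def has_conflict(variables):
    """Assumed rule: a 1500_FR_V... version allows only inverter types 1, 3 and 5."""
    version = variables.get("main_control_version", "")
    inverter = variables.get("inverter_type")
    return (
        version.startswith("1500_FR_V")
        and inverter is not None
        and inverter not in ALLOWED_INVERTER_TYPES
    )


def drop_conflicting_batches(rows):
    """Remove every row of any (unit, batch) group whose configuration conflicts."""
    batches = defaultdict(dict)
    for r in rows:
        batches[(r["unit_code"], r["batch"])][r["variable_name"]] = r["value"]
    bad = {key for key, variables in batches.items() if has_conflict(variables)}
    return [r for r in rows if (r["unit_code"], r["batch"]) not in bad]


rows = [
    {"unit_code": "101001001", "batch": 1, "variable_name": "main_control_version", "value": "1500_FR_V170725"},
    {"unit_code": "101001001", "batch": 1, "variable_name": "inverter_type", "value": 3},
    {"unit_code": "101001001", "batch": 2, "variable_name": "main_control_version", "value": "1500_FR_V170725"},
    {"unit_code": "101001001", "batch": 2, "variable_name": "inverter_type", "value": 2},
]
kept = drop_conflicting_batches(rows)
print(sorted({r["batch"] for r in kept}))
```

Dropping the whole batch rather than only the offending row follows the text: once two related values disagree, none of the values acquired together with them is trusted.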
As another example, when the master control version number is 1500_FR_V170725 and the fan type is 121/1500, it can be determined by reference to the data of the unit basic information dimension table that the unit type is a 1.5 MW air-cooled unit. If the unit code is 101001001 and the unit type is a 2.5 MW air-cooled unit, yet the returned data has a master control version number of 1500_FR_V170725 and a fan type of 121/1500, it is determined that the configuration information conflicts, and all data acquired for the current unit in that batch is filtered out. The above examples are merely exemplary, and the disclosure is not limited thereto.
In step S214, with reference to the variable information dimension table, a conflict check is performed on the associated variable values among the unit variables; when associated variable values conflict, all data extracted in the current batch is filtered out. Specifically, referring to Fig. 6, after the service logic check on the fan data, in step S610 the data is grouped by unit code with reference to the variable information dimension table so as to flatten it; in step S620 fields of the same type are selected for the interrelated variable values among the different variables of the same unit; and in step S630 a variable value conflict check is performed. If the variable values of different variables conflict, then in step S640 the data of the current batch, i.e. all data acquired for the current unit in that batch, is discarded. For example, units of the same product type have a series of parameter configuration variables that correspond to the unit: a 1.5 MW water-cooled unit corresponds to variable values such as an inverter type, primary control program version number and initialization file number of a particular type or format. This step performs a conflict check on these related variable values; if the variable values conflict among units of the same product type, all data acquired for the current unit in that batch is filtered out.
The order of steps S213 and S214 is not limited to the above embodiment: steps S213 and S214 may be performed in parallel, or step S214 may be performed before step S213.
Referring again to Fig. 1, in step S140 the cleaned fan data of each data source is sorted by unit code, variable name and data acquisition time, data with repeated variable values is filtered out to generate a new fan data fact table, and the newly generated fan data fact table is sorted in descending order of data acquisition time to generate the fan data real-time table.
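Step S140 can be sketched as a sort, a de-duplication pass and a second sort. One point is an interpretation: "repeated variable values" is read here as a record whose value equals the immediately preceding record for the same unit and variable. The row layout and field names are likewise assumptions.

```python
from itertools import groupby
from operator import itemgetter


def fuse(rows):
    """Sort, drop consecutive repeated values per (unit, variable), return newest first."""
    ordered = sorted(rows, key=itemgetter("unit_code", "variable_name", "acquired_at"))
    fact = []
    for _, group in groupby(ordered, key=itemgetter("unit_code", "variable_name")):
        previous = object()  # sentinel that compares unequal to every real value
        for row in group:
            if row["value"] != previous:
                fact.append(row)
            previous = row["value"]
    # Descending acquisition time gives the real-time view.
    return sorted(fact, key=itemgetter("acquired_at"), reverse=True)


rows = [
    {"unit_code": "101001001", "variable_name": "inverter_type", "value": 3, "acquired_at": "2017-12-24T13:00"},
    {"unit_code": "101001001", "variable_name": "inverter_type", "value": 3, "acquired_at": "2017-12-25T13:00"},
    {"unit_code": "101001001", "variable_name": "inverter_type", "value": 5, "acquired_at": "2017-12-26T13:00"},
]
real_time = fuse(rows)
print([r["value"] for r in real_time])
```

Timestamps are kept as ISO 8601 strings in this sketch because strings in that format sort in chronological order.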
Then, the fan data real-time tables of the data sources are integrated and cleansed, and the integrated and cleansed fan data real-time table is stored in a database. Referring to Fig. 7, in step S710 the fan data in each operation management system are sorted by unit code, variable name, and data acquisition time to generate a new fan data fact table, and the newly generated fan data fact table is sorted in reverse order of unit code, variable name, and data acquisition time to generate a fan data real-time table. The fan data real-time tables of the operation management systems are then merged: in step S720, duplicate records are filtered out during merging, and in step S730 data cleansing is performed again on the merged data. The data cleansing step in data fusion is similar to the data cleansing in step S130 and is not repeated here. Finally, in step S740, the integrated and cleansed fan data real-time table is stored in the database. This preserves the integrity and accuracy of the data to the greatest extent and maintains the traceability of the fan data, which facilitates problem tracking.
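The fusion steps can be sketched in Python as follows. This is an illustrative reading, not the patented implementation: the record fields are assumptions, "duplicate variable values" is interpreted here as a value that repeats the previous value of the same unit and variable, and the real-time table is ordered by acquisition time only, as in step S140.

```python
def build_fact_table(records):
    """Sort by unit code, variable name, acquisition time; drop repeated values."""
    ordered = sorted(
        records,
        key=lambda r: (r["unit_code"], r["variable"], r["acquired_at"]),
    )
    fact, last_value = [], {}
    for rec in ordered:
        key = (rec["unit_code"], rec["variable"])
        # Assumption: a record is a duplicate if its value equals the
        # previous value recorded for the same unit and variable.
        if key in last_value and last_value[key] == rec["value"]:
            continue
        last_value[key] = rec["value"]
        fact.append(rec)
    return fact

def build_realtime_table(fact):
    """Order the fact table by acquisition time, newest first."""
    return sorted(fact, key=lambda r: r["acquired_at"], reverse=True)

def merge_sources(tables):
    """Merge per-source real-time tables, dropping duplicate records (S720)."""
    seen, merged = set(), []
    for table in tables:
        for rec in table:
            key = (rec["unit_code"], rec["variable"],
                   rec["acquired_at"], rec["value"])
            if key not in seen:
                seen.add(key)
                merged.append(rec)
    return merged
```

Acquisition times are assumed to be ISO 8601 strings or `datetime` objects so that ordinary comparison gives chronological order.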
Fig. 8 is a block diagram showing a multi-source data integration device according to an exemplary embodiment of the present invention.
As shown in Fig. 8, the integration device 80 includes an analysis design module 801, a data extraction module 802, a data conversion module 803, a data cleansing module 804, and a data fusion module 805. The analysis design module 801 designs the table structure of the fan data fact table by analyzing the fan business data and the fan data of each data source. The data extraction module 802 extracts the fan data of each data source into the designed fan data fact table. The data conversion module 803 converts the data identifiers in the extracted fan data into unified data identifiers. The data cleansing module 804 performs data cleansing on the converted data of each data source. The data fusion module 805 fuses the cleansed data of the data sources to generate the fan data real-time table.
When designing the table structure of the fan data fact table, the analysis design module 801 generates a unit basic information dimension table for the basic data of each data source using unified data variable names, generates a variable information dimension table by determining the variable information required by the business system, and generates a variable conversion relation table between the data sources by analyzing the variable information of each data source. By analyzing the fan business data and the fan data of each data source, the analysis design module 801 designs the table structure of the fan data fact table according to the unit code and unit variable name of each data source, together with the data acquisition time and the data return time.
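One possible shape of these tables is sketched below using Python's built-in `sqlite3` module. Every table and column name is an assumption made for illustration; the patent names the tables and the fact table's key fields (unit code, variable name, data acquisition time, data return time) but does not specify a schema.

```python
import sqlite3

# All table and column names below are illustrative assumptions.
SCHEMA = """
CREATE TABLE unit_dim (
    unit_code TEXT PRIMARY KEY,      -- unified unit code
    farm_code TEXT NOT NULL,         -- unified wind farm code
    unit_name TEXT
);
CREATE TABLE variable_dim (
    variable_name TEXT PRIMARY KEY,  -- unified variable name
    data_type     TEXT NOT NULL,
    min_value     REAL,
    max_value     REAL
);
CREATE TABLE variable_mapping (
    source_name     TEXT NOT NULL,   -- data source the name comes from
    source_variable TEXT NOT NULL,
    variable_name   TEXT NOT NULL REFERENCES variable_dim(variable_name),
    PRIMARY KEY (source_name, source_variable)
);
CREATE TABLE fan_data_fact (
    unit_code     TEXT NOT NULL REFERENCES unit_dim(unit_code),
    variable_name TEXT NOT NULL REFERENCES variable_dim(variable_name),
    acquired_at   TEXT NOT NULL,     -- data acquisition time
    returned_at   TEXT,              -- data return time
    value         TEXT
);
"""

def create_schema(conn: sqlite3.Connection) -> None:
    """Create the three dimension/relation tables and the fact table."""
    conn.executescript(SCHEMA)
```

Calling `create_schema(sqlite3.connect(":memory:"))` builds the tables in a throwaway in-memory database, which is convenient for experimenting with the later steps.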
After the analysis design module 801 has analyzed and designed against the data of each data source and the fan business data, the data extraction module 802 first uses a universal data interface to extract all fan data from each operation management system into a temporary table of the target database, which completes the first data extraction. Then, the data relevant to the business system are extracted into the designed fan data fact table according to the variable names defined in the variable information dimension table. In subsequent data extractions, the data extraction module 802 extracts data as daily increments according to the data acquisition time.
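The extraction flow described above (full first load into a temporary table, selection by defined variable names, then daily increments) might look like the following Python sketch. The function names and record fields are hypothetical, and plain lists stand in for database tables.

```python
from datetime import date, datetime

def first_extraction(source_rows):
    """First run: copy every source row into a staging (temporary) table."""
    return [dict(row) for row in source_rows]

def load_fact_rows(staging, variable_dim_names):
    """Keep rows whose variable is defined in the variable information dimension table."""
    return [row for row in staging if row["variable"] in variable_dim_names]

def daily_increment(source_rows, day):
    """Later runs: take only the rows acquired on the given day."""
    return [dict(row) for row in source_rows
            if row["acquired_at"].date() == day]

# Usage with made-up sample rows.
rows = [
    {"unit_code": "U1", "variable": "wind_speed",
     "acquired_at": datetime(2017, 12, 24, 23, 59), "value": 7.1},
    {"unit_code": "U1", "variable": "debug_flag",
     "acquired_at": datetime(2017, 12, 25, 0, 5), "value": 1},
    {"unit_code": "U1", "variable": "wind_speed",
     "acquired_at": datetime(2017, 12, 25, 0, 10), "value": 7.4},
]
staging = first_extraction(rows)
fact_rows = load_fact_rows(staging, {"wind_speed"})
increment = load_fact_rows(daily_increment(rows, date(2017, 12, 25)), {"wind_speed"})
```

Filtering by acquisition date keeps each later run small, at the cost of assuming that source rows carry a reliable acquisition timestamp.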
After the data have been extracted into the fan data fact table, the data conversion module 803 converts the unit codes and wind farm codes of each operation management system into unified unit codes and wind farm codes according to the variable conversion relation table.
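As a minimal sketch of this conversion, the lookup below maps source-specific unit and wind farm codes to unified ones. The dictionaries stand in for the variable conversion relation table, and the source names and codes are invented for the example.

```python
def to_unified(record, source, unit_map, farm_map):
    """Return a copy of the record with unified unit and wind farm codes.

    unit_map and farm_map are hypothetical lookups keyed by
    (source name, source-specific code).
    """
    out = dict(record)
    out["unit_code"] = unit_map[(source, record["unit_code"])]
    out["farm_code"] = farm_map[(source, record["farm_code"])]
    return out

# Usage with made-up codes from two hypothetical operation management systems.
unit_map = {("scada", "T-001"): "U0001", ("erp", "0001"): "U0001"}
farm_map = {("scada", "F-A"): "WF01", ("erp", "A"): "WF01"}
a = to_unified({"unit_code": "T-001", "farm_code": "F-A", "value": 3}, "scada", unit_map, farm_map)
b = to_unified({"unit_code": "0001", "farm_code": "A", "value": 3}, "erp", unit_map, farm_map)
```

A missing mapping raises `KeyError` in this sketch, which surfaces unmapped codes immediately instead of passing source-specific identifiers downstream.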
After the data identifiers of the extracted data have been converted into unified data identifiers, the data cleansing module 804 performs data cleansing on the converted data. Specifically, to address the duplicate data, incomplete data, and/or misaligned data that arise in each data source during data transmission, the data cleansing module 804 first performs a type check on the fan data of different data types according to the generated variable information dimension table, and filters out fan data whose data type does not match. Then, according to the variable information dimension table and the unit basic information dimension table, the data cleansing module 804 checks the validity of the fan data that passed the data type check against the check rules, naming rules, and value ranges of each unit variable, and filters out invalid data. After the above cleansing, the data cleansing module 804 also determines, according to the variable information dimension table and the unit basic information dimension table, whether associated configuration information in the fan data conflicts; when the configuration information conflicts, all data of the current extraction batch are filtered out. The data cleansing module 804 further performs, according to the variable information dimension table, a conflict check on the associated variable values of the variables of each unit; when associated variable values conflict, all data of the current extraction batch are filtered out. The cleansing operations of the data cleansing module 804 are the same as those of step S130 and are not described in detail again here.
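The first two cleansing stages (type check, then validity check against a value range) can be sketched in Python as below. The contents of `VARIABLE_DIM`, including the variable names and limits, are made-up stand-ins for the variable information dimension table; check rules and naming rules are omitted for brevity.

```python
# Hypothetical excerpt of a variable information dimension table.
VARIABLE_DIM = {
    "wind_speed": {"type": float, "min": 0.0, "max": 60.0},
    "rotor_rpm": {"type": float, "min": 0.0, "max": 30.0},
}

def type_ok(rec, dim):
    """A record passes if its variable is known and its value has the declared type."""
    spec = dim.get(rec.get("variable"))
    return spec is not None and isinstance(rec.get("value"), spec["type"])

def range_ok(rec, dim):
    """A record passes if its value lies inside the declared range."""
    spec = dim[rec["variable"]]
    return spec["min"] <= rec["value"] <= spec["max"]

def clean(records, dim=VARIABLE_DIM):
    """Apply the type check first, then the validity check to the survivors."""
    typed = [rec for rec in records if type_ok(rec, dim)]
    return [rec for rec in typed if range_ok(rec, dim)]
```

Running the range check only on records that passed the type check avoids comparing, for example, a string against numeric limits.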
After the data of each data source have been cleansed, the data fusion module 805 sorts the cleansed fan data of each data source by unit code, variable name, and data acquisition time, filters out records with duplicate variable values to generate a new fan data fact table, and sorts the new fan data fact table in reverse order of data acquisition time to generate the fan data real-time table. Then, the data fusion module 805 integrates and cleanses the fan data real-time tables of the data sources and stores the integrated and cleansed fan data real-time table in the database. The data fusion here is the same as the operation of step S140 and is not described again.
A multi-source data integration device based on fan data according to the disclosed embodiments of the present invention may include: a memory configured to store instructions; and a processor configured to run the instructions stored in the memory to perform the following operations: designing the table structure of a fan data fact table by analyzing fan business data and the fan data of each data source; extracting the fan data of each data source into the designed fan data fact table and converting the data identifiers in the extracted fan data into unified data identifiers; performing data cleansing on the converted data of each data source; and fusing the cleansed data of the data sources to generate a fan data real-time table.
The multi-source data integration method based on fan data according to the disclosed embodiments of the present invention may be implemented as computer-readable code on a computer-readable recording medium, or may be transmitted through a transmission medium. A computer-readable recording medium is any data storage device that can store data that can thereafter be read by a computer system. Examples of computer-readable recording media include, but are not limited to, read-only memory (ROM), random access memory (RAM), compact disc (CD)-ROM, digital versatile disc (DVD), magnetic tape, floppy disk, and optical data storage devices. The transmission medium may include carrier waves transmitted through a network or various types of communication channels. The computer-readable recording medium may also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed fashion.
With the multi-source data integration method and device based on fan data described above, the data resources of existing and future data sources can be integrated effectively to form a unified, complete, accurate, and highly available fan data system. This avoids excessive data redundancy and data inconsistency, makes the data resources easy to query and access, and gives management effective data support for decision-making.
Although the present invention has been particularly shown and described with reference to its exemplary embodiments, those skilled in the art should understand that various changes in form and detail may be made without departing from the spirit and scope of the present invention as defined by the claims.

Claims (24)

1. A multi-source data integration method based on fan data, characterized in that the method comprises:
designing the table structure of a fan data fact table by analyzing fan business data and the fan data of each data source;
extracting the fan data of each data source into the designed fan data fact table, and converting the data identifiers in the extracted fan data into unified data identifiers;
performing data cleansing on the converted data of each data source;
fusing the cleansed data of the data sources to generate a fan data real-time table.
2. The multi-source data integration method according to claim 1, characterized in that the step of designing the table structure of the fan data fact table comprises:
generating a unit basic information dimension table for the basic data of each data source using unified data variable names;
generating a variable information dimension table by determining the variable information required by the business system;
generating a variable conversion relation table between the data sources by analyzing the variable information of each data source.
3. The multi-source data integration method according to claim 1, characterized in that the step of designing the table structure of the fan data fact table comprises: designing the table structure of the fan data fact table according to the unit code and unit variable name of each data source, together with the data acquisition time and the data return time.
4. The multi-source data integration method according to claim 2, characterized in that the step of extracting data comprises: extracting all fan data from each data source into a temporary table of a target database, and extracting the data relevant to the fan business system into the designed fan data fact table according to the variable names defined in the variable information dimension table.
5. The multi-source data integration method according to claim 2, characterized in that the step of data conversion comprises: converting, with reference to the generated variable conversion relation table, the unit codes and wind farm codes of each data source into unified unit codes and wind farm codes.
6. The multi-source data integration method according to claim 2, characterized in that the step of data cleansing comprises: with reference to the generated variable information dimension table, and to address the duplicate data, incomplete data, and/or misaligned data that arise in each data source during data transmission, performing a type check on the fan data of different data types, and filtering out fan data whose data type does not match.
7. The multi-source data integration method according to claim 6, characterized in that the step of data cleansing further comprises: with reference to the variable information dimension table and the unit basic information dimension table, checking the validity of the fan data that passed the data type check against the check rules, naming rules, and value ranges of each unit variable, and filtering out data determined to be invalid.
8. The multi-source data integration method according to claim 7, characterized in that the step of data cleansing further comprises: with reference to the variable information dimension table and the unit basic information dimension table, determining whether associated configuration information in the fan data conflicts, and, when the configuration information conflicts, filtering out all data of the current extraction batch.
9. The multi-source data integration method according to claim 7, characterized in that the step of data cleansing further comprises: with reference to the variable information dimension table, performing a conflict check on the associated variable values of the variables of each unit, and, when associated variable values conflict, filtering out all data of the current extraction batch.
10. The multi-source data integration method according to claim 1, characterized in that the step of data fusion comprises: sorting the cleansed fan data of each data source in the order of unit code, variable name, and data acquisition time, filtering out records with duplicate variable values to generate a new fan data fact table, and sorting the new fan data fact table in reverse order of data acquisition time to generate the fan data real-time table.
11. The multi-source data integration method according to claim 10, characterized in that the step of data fusion further comprises: integrating and cleansing the fan data real-time tables of the data sources, and storing the integrated and cleansed fan data real-time table in a database.
12. A multi-source data integration device based on fan data, the device comprising:
an analysis design module configured to design the table structure of a fan data fact table by analyzing fan business data and the fan data of each data source;
a data extraction module configured to extract the fan data of each data source into the designed fan data fact table;
a data conversion module configured to convert the data identifiers in the extracted fan data into unified data identifiers;
a data cleansing module configured to perform data cleansing on the converted data of each data source;
a data fusion module configured to fuse the cleansed data of the data sources to generate a fan data real-time table.
13. The multi-source data integration device according to claim 12, characterized in that the analysis design module is configured to:
generate a unit basic information dimension table for the fan data of each data source using unified data variable names;
generate a variable information dimension table by determining the variable information required by the business system;
generate a variable conversion relation table between the data sources by analyzing the variable information of each data source.
14. The multi-source data integration device according to claim 12, characterized in that the analysis design module is configured to design the table structure of the fan data fact table according to the unit code and unit variable name of each data source, together with the data acquisition time and the data return time.
15. The multi-source data integration device according to claim 13, characterized in that the data extraction module is configured to extract all fan data from each data source into a temporary table of a target database, and to extract the data relevant to the fan business system into the designed fan data fact table according to the variable names defined in the variable information dimension table.
16. The multi-source data integration device according to claim 13, characterized in that the data conversion module is configured to convert, with reference to the generated variable conversion relation table, the unit codes and wind farm codes of each data source into unified unit codes and wind farm codes.
17. The multi-source data integration device according to claim 13, characterized in that the data cleansing module is configured to, with reference to the generated variable information dimension table, and to address the duplicate data, incomplete data, and misaligned data that arise in each data source during data transmission, perform a type check on the fan data of different data types and filter out fan data whose data type does not match.
18. The multi-source data integration device according to claim 17, characterized in that the data cleansing module is further configured to, with reference to the variable information dimension table and the unit basic information dimension table, check the validity of the fan data that passed the data type check against the check rules, naming rules, and value ranges of each unit variable, and filter out data determined to be invalid.
19. The multi-source data integration device according to claim 18, characterized in that the data cleansing module is further configured to, with reference to the variable information dimension table and the unit basic information dimension table, determine whether associated configuration information in the fan data conflicts, and, when the configuration information conflicts, filter out all data of the current extraction batch.
20. The multi-source data integration device according to claim 18, characterized in that the data cleansing module is further configured to, with reference to the variable information dimension table, perform a conflict check on the associated variable values of the variables of each unit, and, when associated variable values conflict, filter out all data of the current extraction batch.
21. The multi-source data integration device according to claim 12, characterized in that the data fusion module is configured to sort the cleansed fan data of each data source in the order of unit code, variable name, and data acquisition time, filter out records with duplicate variable values to generate a new fan data fact table, and sort the new fan data fact table in reverse order of data acquisition time to generate the fan data real-time table.
22. The multi-source data integration device according to claim 21, characterized in that the data fusion module is further configured to integrate and cleanse the fan data real-time tables of the data sources, and to store the integrated and cleansed fan data real-time table in a database.
23. A computer-readable storage medium comprising a computer program, the computer program being executable by a processor to perform the method according to any one of claims 1 to 11.
24. A computer, comprising:
a memory configured to store instructions;
a processor configured to run the instructions stored in the memory to perform the method according to any one of claims 1 to 11.
CN201711418200.9A 2017-12-25 2017-12-25 Multi-source data integration method and device based on fan data Active CN110019228B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711418200.9A CN110019228B (en) 2017-12-25 2017-12-25 Multi-source data integration method and device based on fan data

Publications (2)

Publication Number Publication Date
CN110019228A true CN110019228A (en) 2019-07-16
CN110019228B CN110019228B (en) 2022-08-09

Family

ID=67186984

Country Status (1)

Country Link
CN (1) CN110019228B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101452450A (en) * 2007-11-30 2009-06-10 上海市电力公司 Multiple source data conversion service method and apparatus thereof
US8000911B2 (en) * 2008-05-06 2011-08-16 Schneider Electric USA, Inc. Automated hierarchical classification for utility systems with multiple sources
US20100169312A1 (en) * 2008-12-30 2010-07-01 Yield Software, Inc. Method and System for Negative Keyword Recommendations
CN104346377A (en) * 2013-07-31 2015-02-11 克拉玛依红有软件有限责任公司 Method for integrating and exchanging data on basis of unique identification
CN105637139A (en) * 2013-10-16 2016-06-01 萨罗尼科斯贸易与服务一人有限公司 Laundry washing machine with speech recognition and response capabilities and method for operating same
US20150170195A1 (en) * 2013-12-13 2015-06-18 Aaron Drew System and Method to Collect, Correlate and Display Customer Origination Data with Customer Revenue Data
CN103647669A (en) * 2013-12-16 2014-03-19 上海证券交易所 System and method for guaranteeing distributed data processing consistency
CN104200402A (en) * 2014-09-11 2014-12-10 国家电网公司 Publishing method and system of source data of multiple data sources in power grid
CN106610957A (en) * 2015-10-21 2017-05-03 星际空间(天津)科技发展有限公司 Multi-source data integration method based on geographic information
CN106383999A (en) * 2016-09-13 2017-02-08 北京协力筑成金融信息服务股份有限公司 Trend analysis method and device of multi-source time sequence data
CN107193858A (en) * 2017-03-28 2017-09-22 福州金瑞迪软件技术有限公司 Towards the intelligent Service application platform and method of multi-source heterogeneous data fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GRAHAM CORMODE 等: "Aggregate Query Answering on Possibilistic Data with Cardinality Constraints", 《DATA ENGINEERING》 *
王澂: "《经济与管理论文集》", 31 August 2011, 中国经济出版社 *
聂常红: "基于Struts2的数据输入处理的应用研究", 《信息技术与信息化》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112416904A (en) * 2020-11-24 2021-02-26 广东稳峰电力科技有限公司 Electric power data standardization processing method and device
CN114567626A (en) * 2022-01-24 2022-05-31 国电联合动力技术有限公司 Internet-based remote data transmission method and system for wind turbine generator
CN114567626B (en) * 2022-01-24 2024-04-02 国电联合动力技术有限公司 Internet-based remote transmission method and system for wind turbine generator data

Also Published As

Publication number Publication date
CN110019228B (en) 2022-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant