CN110019228A - Multi-source data integration method and device based on fan data - Google Patents
Multi-source data integration method and device based on fan data
- Publication number
- CN110019228A (application number CN201711418200.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- blower
- source
- variable
- unity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2264—Multidimensional index structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Structures Of Non-Positive Displacement Pumps (AREA)
Abstract
A multi-source data integration method and device based on fan data are provided. The method comprises the following steps: designing a table structure of a fan data fact table by analyzing fan service data and fan data of each data source; respectively extracting the fan data of each data source into a designed fan data fact table and converting data identification in the extracted fan data into uniform data identification; data cleaning is carried out on the converted data of each data source; and carrying out data fusion on the cleaned data of each data source to generate a fan data real-time table.
Description
Technical field
The present invention relates to the technical field of wind power generation and, more particularly, to a multi-source data integration method based on fan (wind turbine) data and a device thereof.
Background art
In recent years, with the rapid growth of the wind power industry, wind turbine fleets have reached a very large scale. As society develops and the market keeps expanding, multiple operation management systems have been established and run in order to standardize enterprise management practices and meet customer requirements. Running several operation management systems helps a group company adapt to different market changes. However, because these systems differ in construction period, development department, equipment used, stage of technical development and level of capability, the storage and management of their abundant data resources is extremely scattered. This causes excessive data redundancy and data inconsistency, makes the data resources difficult to query and access, and leaves management without effective data to support decisions. A manager who wants information about the different departments under his or her control often has to log in to many different systems, and the data cannot be compared directly.
The cloud side of the operation management systems is poorly integrated with the client systems, interconnectivity is poor, and information management is scattered. There are large gaps in data integrity, accuracy and timeliness, and many information islands have formed. A shared, networked fan data system with high availability is lacking.
Digital transformation places high demands on the accuracy and validity of the data in each operation management system. During unit operation, each operation management system returns a large amount of unit configuration data, and the data emphasis of each system's data source differs, so the data as a whole become scattered. These large volumes of data therefore need to be integrated into complete, valid data and stored. Accordingly, a multi-source data integration method for fan data, and a device thereof, are needed.
Summary of the invention
To solve at least the above problems and/or disadvantages and to provide at least the advantages described below, the present invention provides a multi-source data integration method and device based on fan data.
One aspect of the present invention provides a multi-source data integration method based on fan data, the method comprising: designing the table structure of a fan data fact table by analyzing fan business data and the fan data of each data source; extracting the fan data of each data source into the designed fan data fact table and converting the data identifiers in the extracted fan data into unified data identifiers; performing data cleaning on the converted data of each data source; and performing data fusion on the cleaned data of each data source to generate a fan data real-time table.
Preferably, the step of designing the table structure of the fan data fact table may include: generating a unit basic information dimension table from the basic data of each data source using unified data variable names; generating a variable information dimension table by determining the variable information required by the business system; and generating a variable conversion relation table between the data sources by analyzing the variable information of each data source.
Preferably, the step of designing the table structure of the fan data fact table may include: designing the table structure of the fan data fact table according to the unit code and unit variable name of each data source, in combination with the data acquisition time and the data return time.
Preferably, the step of extracting data may include: extracting all fan data from each data source into a temporary table of a target database, and extracting the data relevant to the fan business system into the designed fan data fact table according to the variable names defined in the variable information dimension table.
Preferably, the step of converting data may include: converting the unit codes and wind farm codes of each data source into unified unit codes and wind farm codes with reference to the generated variable conversion relation table.
Preferably, the step of data cleaning may include: with reference to the generated variable information dimension table, performing a type check on fan data of different data types to handle the duplicate, incomplete and/or misplaced data that arise in each data source during data transmission, and filtering out fan data whose data type does not match.
Preferably, the step of data cleaning may also include: with reference to the variable information dimension table and the unit basic information dimension table, performing a validity check on the fan data that passed the data type check against the check rule, naming convention and value range of each unit variable, and filtering out invalid data.
Preferably, the step of data cleaning may also include: with reference to the variable information dimension table and the unit basic information dimension table, determining whether associated configuration information in the fan data conflicts and, when the configuration information conflicts, filtering out all data from the current extraction.
Preferably, the step of data cleaning may also include: with reference to the variable information dimension table, performing a conflict check on associated variable values between unit variables and, when associated variable values conflict, filtering out all data from the current extraction.
Preferably, the step of data fusion may include: sorting the cleaned fan data of each data source by unit code, variable name and data acquisition time; filtering out data with duplicate variable values to generate a new fan data fact table; and sorting the new fan data fact table in descending order of data acquisition time to generate the fan data real-time table.
Preferably, the step of data fusion may also include: integrating and cleaning the fan data real-time tables of the data sources, and storing the integrated and cleaned fan data real-time table in a database.
Another aspect of the present invention provides a multi-source data integration device based on fan data, the device comprising: an analysis and design module configured to design the table structure of a fan data fact table by analyzing fan business data and the fan data of each data source; a data extraction module configured to extract the fan data of each data source into the designed fan data fact table; a data conversion module configured to convert the data identifiers in the extracted fan data into unified data identifiers; a data cleaning module configured to perform data cleaning on the converted data of each data source; and a data fusion module configured to perform data fusion on the cleaned data of each data source to generate a fan data real-time table.
Another aspect of the present invention provides a multi-source data integration device based on fan data, the device comprising: a memory configured to store instructions; and a processor configured to run the instructions stored in the memory to perform the following operations: designing the table structure of a fan data fact table by analyzing fan business data and the fan data of each data source; extracting the fan data of each data source into the designed fan data fact table and converting the data identifiers in the extracted fan data into unified data identifiers; performing data cleaning on the converted data of each data source; and performing data fusion on the cleaned data of each data source to generate a fan data real-time table.
Another aspect of the present invention provides a computer-readable storage medium including a computer program that can be run by a processor to execute the above multi-source data integration method based on fan data.
Another aspect of the present invention provides a computer comprising: a memory configured to store instructions; and a processor configured to run the instructions stored in the memory to execute the above multi-source data integration method based on fan data.
The multi-source data integration method based on fan data and the device thereof described above can effectively integrate the data resources of existing and future operation management systems, form a unified, complete and accurate fan data system with high availability, and solve the problem of integrating massive data from multiple data sources.
Brief description of the drawings
The above features and other objects, features and advantages of the present invention will become clearer from the following detailed description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart showing the multi-source data integration method based on fan data according to an exemplary embodiment of the present invention;
Fig. 2 is a flowchart showing the data cleaning operation according to an exemplary embodiment of the present invention;
Fig. 3 is a flowchart showing the data type check according to an exemplary embodiment of the present invention;
Fig. 4 is a flowchart showing the data validity check according to an exemplary embodiment of the present invention;
Fig. 5 is a flowchart showing the business logic check of the data according to an exemplary embodiment of the present invention;
Fig. 6 is a flowchart showing the variable value check of the data according to an exemplary embodiment of the present invention;
Fig. 7 is a flowchart showing the data fusion according to an exemplary embodiment of the present invention;
Fig. 8 is a block diagram showing the multi-source data integration device according to an exemplary embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings, in which identical reference numerals always denote identical components. It should be understood that the multi-source data integration method based on fan data and the device thereof according to exemplary embodiments of the present invention can be applied to integrating massive data from multiple data sources for various wind turbines.
Fig. 1 is a flowchart showing the multi-source data integration method based on fan data according to an exemplary embodiment of the present invention.
As shown in Fig. 1, in step S110, the table structure of the fan data fact table is designed by analyzing the fan business data and the fan data of each data source. Specifically, the fan data required from each operation management system (i.e., each data source) are analyzed in light of the fan business, a fan data requirements report is generated, and the table structure of the fan data fact table is designed for the required fan data on the basis of that report. In designing the table structure for each data source, the data of each operation management system are first analyzed, including their content, update frequency and acquisition mode, to obtain information on the data requirements of each operation management system; the results of this analysis are applied in the subsequent data cleaning operation. The business data of the fan business system are then sorted and analyzed to determine the names of the data variables that the business system requires, which ensures that the variable information is consistent across product types.
Specifically, when designing the table structure of the fan data fact table, a unit basic information dimension table can be generated from the basic data of each data source using unified data variable names. The basic data may include data about the wind farm (wind farm code, wind farm name, region, province and so on), data about the fan unit (unit code, unit capacity, unit major class, unit subclass and so on) and protocol information (such as protocol number and protocol type).
Because different operation management systems may use different variable names and their data structures may not match, the unit data of different product types need to use unified variable names and data structures to form unified management of unit basic data, which makes it easy to aggregate the unit basic data of all operation management systems. For example, if the unit ID from the first operation management system is 100001001 and the unit ID from the second operation management system is GW150001, the unit ID GW150001 in the second operation management system can be changed to 100001001 using the naming rule of the first operation management system. The unit basic information dimension table makes it possible to know which unit the data in the fan data fact table come from, together with the basic information of that unit. According to an embodiment of the present disclosure, the unit basic information dimension table may include, depending on the needs of the fan business, the wind farm code, unit code, unit name, unit capacity and product type, as shown in Table 1. The above embodiment is merely exemplary, and the disclosure is not limited thereto.
Table 1 Unit basic information dimension table
Wind farm code | Unit code | Unit name | Unit capacity | Product type | … |
When designing the table structure of the fan data fact table, a variable information dimension table can also be generated by determining the variable information required by the business system. For example, suppose the fan business system needs 500 specified variables. When the basic data of each operation management system are analyzed, it is determined which variables in each operation management system belong to the 500 specified variables required by the fan business system, and the variable information dimension table is formed from the data variable names required by the business system. The variable information dimension table makes it possible to know the meaning, name and other attributes of each variable in the fan data fact table. According to an embodiment of the present disclosure, the variable information dimension table may include the variable type, variable value range and variable name, as shown in Table 2.
Table 2 Variable information dimension table
Variable type | Data range | Variable name | Variable meaning | … |
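As a rough illustration of how a variable information dimension table such as Table 2 might be held in memory, the following sketch models each row as a small record and uses the table to decide which source variables the business system needs. The class name, field names, the two sample variables and the range `(1, 5)` are all assumptions made for this example and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VariableInfo:
    """One row of a hypothetical variable information dimension table."""
    variable_name: str
    variable_type: str                       # e.g. "string" or "numeric"
    data_range: Optional[Tuple[float, float]]  # inclusive bounds, if any
    meaning: str


# Sample rows; names and the numeric range are made up for illustration.
VARIABLE_DIM = {
    v.variable_name: v
    for v in [
        VariableInfo("inverter_type", "numeric", (1, 5), "type of inverter"),
        VariableInfo("main_control_version", "string", None,
                     "main control program version number"),
    ]
}


def required_by_business(source_variables):
    """Keep only the source variables that the dimension table defines."""
    return [name for name in source_variables if name in VARIABLE_DIM]
```

Here `required_by_business(["inverter_type", "oil_temperature"])` keeps only `inverter_type`, which mirrors the idea of selecting the specified variables out of everything a source system reports.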
Because the same variable may have different variable identifiers in different operation management systems, a variable conversion relation table needs to be generated, when designing the table structure of the fan data fact table, according to business requirements and the analysis and design for each operation management system. For example, different operation management systems may use different versions of the central control system. If the first operation management system uses the second version of the central control system and the second operation management system uses the third version, the variable names of the second-version system need to be converted to the variable naming of the third-version system to facilitate management.
The above design method for the table structure, including each dimension table, is merely exemplary, and the disclosure is not limited thereto. Designing the unit basic information dimension table, the variable information dimension table and so on can reduce data redundancy and association relationships.
According to an exemplary embodiment of the present application, with reference to the unit basic information dimension table and the variable information dimension table described above, the table structure of the fan data fact table of each data source can be designed according to the unit code and unit variable name of each operation management system, in combination with the data acquisition time and the data return time. For example, according to an embodiment of the present disclosure, the fan data fact table may be structured as shown in Table 3; however, the disclosure is not limited thereto.
Table 3 Fan data fact table structure
Wind farm code | Unit code | Variable name | Variable value | Data acquisition time | Data source | Return time |
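The structure in Table 3 could be expressed as a relational table. The sketch below creates such a table in an in-memory SQLite database; the table name, column names and column types are assumptions chosen for illustration, since the patent names only the fields.

```python
import sqlite3

# In-memory database, used only to illustrate the table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fan_data_fact (
        wind_farm_code   TEXT NOT NULL,
        unit_code        TEXT NOT NULL,
        variable_name    TEXT NOT NULL,
        variable_value   TEXT,
        acquisition_time TEXT NOT NULL,  -- when the value was acquired
        data_source      TEXT NOT NULL,  -- originating operation management system
        return_time      TEXT NOT NULL   -- when the value was returned
    )
    """
)

# Read the column names back to confirm the layout.
columns = [row[1] for row in conn.execute("PRAGMA table_info(fan_data_fact)")]
```

Storing one row per unit, variable and acquisition time keeps the fact table narrow, with descriptive attributes left to the dimension tables.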
Designing the above dimension tables and the fan data fact table serves the purposes of data extraction, data cleaning and data integration, so that less data redundancy is generated when the data are stored. The series of operations that process the fan data using the results of the analysis and design in step S110 are described in detail below.
After the analysis and design based on the fan data of each data source and the data required by the fan business system, in step S120 the fan data of each data source are extracted into the designed fan data fact table, and the data identifiers in the extracted fan data are converted into unified data identifiers. Specifically, a universal data interface is first used to extract the fan data from each operation management system into a temporary table of the target database. In an embodiment of the present disclosure, a data interface that can adapt to multiple data types extracts all data from the historical data in each operation management system; this is the initial data extraction and yields a complete set of available data. The data relevant to the business system are then extracted into the designed fan data fact table according to the variable names defined in the variable information dimension table.
In addition, in subsequent data extractions, data are extracted in daily increments according to the data acquisition time. For example, if 1 p.m. each day is the data acquisition time, the fan data of the previous day, i.e., the data added on the previous day, are extracted from each operation management system at 1 p.m. each day. If the data source is a database, data can be extracted according to the return time field in the fact table.
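The daily incremental extraction can be sketched as a filter on the return time field. The function names, the row layout and the choice of a half-open window covering the previous calendar day are assumptions for illustration; the patent only states that the previous day's new data are extracted.

```python
from datetime import date, datetime, time, timedelta


def previous_day_window(run_day):
    """Return the half-open window [start, end) covering the day before run_day."""
    end = datetime.combine(run_day, time.min)   # midnight at the start of run_day
    return end - timedelta(days=1), end


def extract_increment(rows, run_day):
    """Select rows whose return_time falls within the previous day."""
    start, end = previous_day_window(run_day)
    return [r for r in rows if start <= r["return_time"] < end]
```

A half-open window avoids extracting a row twice when its return time falls exactly on midnight.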
After the data have been extracted into the fan data fact table, the unit codes and wind farm codes of each operation management system are converted into unified unit codes and wind farm codes with reference to the variable conversion relation table. For example, the data extracted from the first operation management system and converted are placed in the designed fan data fact table, as shown in Table 4. Table 4 is merely exemplary, and the disclosure is not limited thereto.
Table 4 Fan data fact table
Wind farm code | Unit code | Variable name | Variable value | Data acquisition time | … |
101001 | 101001001 | Main control program version number | 1500_FR_V170725 | … | |
101001 | 101001001 | Fan type | 121/1500 | | |
101001 | 101001001 | Inverter type | 3 | | |
… |
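The identifier conversion step can be pictured as a lookup in a conversion relation table. The sketch below is a minimal version under stated assumptions: the relation is a dictionary keyed by source system, field name and source-specific code, and the source label `system_2` is made up; only the two unit IDs are taken from the example in the description.

```python
# Hypothetical conversion relation: (source, field, source code) -> unified code.
RELATION = {
    ("system_2", "unit_code", "GW150001"): "100001001",
}


def convert_identifiers(row, source, relation=RELATION):
    """Return a copy of row with unit and wind farm codes unified.

    Codes without an entry in the relation are left unchanged.
    """
    out = dict(row)
    for field in ("wind_farm_code", "unit_code"):
        out[field] = relation.get((source, field, row[field]), row[field])
    out["data_source"] = source
    return out
```

Returning a copy rather than editing the row in place keeps the temporary table contents available for auditing.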
In step S130, data cleaning is performed on the converted data of each data source. Owing to the particular nature of wind farm networks and national requirements for grid security, each operation management system is prone to duplicate, incomplete and misplaced data during data transmission. The duplicate, incomplete and/or misplaced data that arise when fan data are transmitted therefore need to be cleaned according to business logic. The data cleaning operation is described in detail below with reference to Fig. 2.
Fig. 2 is a flowchart showing the data cleaning operation according to an exemplary embodiment of the present invention.
As shown in Fig. 2, in step S211, with reference to the generated variable information dimension table, a type check is performed on fan data of different data types to handle the duplicate, incomplete and/or misplaced data that arise in each data source during data transmission, and fan data whose data type does not match are filtered out. Because errors may be introduced into the fan data during storage and return, the data type can be used for preliminary filtering. Referring to Fig. 3, in step S310 the data types in the data extracted and converted in step S120 are first queried; in step S320 the data types are classified according to the type attribute of the unit variables; in step S330 a data type check is performed on the fan data of the different data types; in step S340 the data that passed the data type check are merged and recorded; and in step S350 the data whose data type does not match are filtered out, which improves data quality. For example, data types may be classified as character type, numeric type and so on. If the data type of the wind farm code in the variable information dimension table is character type but the data type of an extracted wind farm code is numeric type, that data item is filtered out.
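The type check can be sketched as a comparison between each field's actual type and the type the dimension table expects. The expected-type mapping and the function name below are assumptions for illustration; the sample mirrors the wind farm code example, where a numeric value arrives for a field defined as character type.

```python
# Expected Python types per field; an assumption standing in for the
# type attribute held in the variable information dimension table.
FIELD_TYPES = {"wind_farm_code": str, "unit_code": str}


def type_check(rows, field_types=FIELD_TYPES):
    """Split rows into those whose fields match the expected types and the rest."""
    kept, dropped = [], []
    for row in rows:
        ok = all(isinstance(row.get(name), expected)
                 for name, expected in field_types.items())
        (kept if ok else dropped).append(row)
    return kept, dropped
```

Keeping the dropped rows in a separate list, instead of discarding them silently, makes it possible to report how much data the filter removed.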
In step S212, with reference to the variable information dimension table and the unit basic information dimension table, a validity check is performed on the fan data that passed the data type check against the check rule, naming convention and value range of each unit variable. Referring to Fig. 4, in step S410 the data that passed the data type check are classified according to the variable names in the variable information dimension table, and in step S420 a validity check is performed according to the check rule, naming convention and value range of each variable. For example, for the main control software version number variable, the product type of the wind turbine corresponding to the version number is examined to determine whether the naming convention of the main control software version and the variable value matched to the version number variable are correct. If a variable value does not match, a name does not conform to the convention, or a value falls outside the data range of the variable, the data are determined to be invalid and are filtered out in step S430.
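A validity check of this kind can be sketched with one rule per variable: a naming pattern, a value range, or both. Everything in the rule table below is an assumption: the variable names are made up, the regular expression is merely inferred from the shape of sample values such as 1500_FR_V170725, and the range 1 to 5 is a placeholder.

```python
import re

# Hypothetical rules standing in for the check rule, naming convention and
# value range recorded for each variable.
RULES = {
    "main_control_version": {"pattern": re.compile(r"\d+_[A-Z]+_V\d+")},
    "inverter_type": {"range": (1, 5)},
}


def is_valid(row, rules=RULES):
    """Return True if the row satisfies the rule for its variable name."""
    rule = rules.get(row["variable_name"])
    if rule is None:
        return False  # variable not defined in the rules: treat as invalid
    value = row["variable_value"]
    if "pattern" in rule and not rule["pattern"].fullmatch(str(value)):
        return False
    if "range" in rule:
        low, high = rule["range"]
        if not isinstance(value, (int, float)) or not low <= value <= high:
            return False
    return True
```

Rows for which `is_valid` returns `False` would be the ones filtered out in step S430.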
In step S213, with reference to the variable information dimension table and the unit basic information dimension table, it is determined whether associated configuration information in the fan data conflicts; when the configuration information conflicts, the fan data acquired in the same extraction are filtered out. Here, configuration information refers to information in the unit basic information dimension table and the variable information dimension table that can be used to judge whether the data in the fan data fact table match. For example, variables such as the central control software version number and the fan type in the fan data fact table can be judged according to the unit type in the unit basic information dimension table.
Because the data of each business system have their own particularities, data conversion may go wrong during data return and cause variable values to conflict. A business logic check therefore needs to be performed on the fan data, that is, verifying whether interrelated configuration information conflicts. Specifically, referring to Fig. 5, in step S510 the data that passed the variable value check are compared with the variable information dimension table and the unit basic information dimension table; in step S520 the interrelated configuration information in the unit data of different product types is identified; in step S530 the relevant fields are selected; and in step S540 a conflict check is performed on the interrelated configuration information to determine whether associated configuration information in the fan data conflicts. When it does, in step S550 the fan data of the current batch are filtered out, that is, all data acquired for the current unit in the same extraction are filtered out. For example, when the main control program version number is 1500_FR_V21070725 and the inverter type is 2, the two variables conflict, because when the main control program version number has the 1500_FR_V2107072 format the inverter type variable can only be one of 1, 3 and 5. A conflict is therefore determined, and all data acquired for the current unit in the same extraction are filtered out.
As another example, when the main control version number is 1500_FR_V170725 and the fan type is 121/1500, the data of the unit basic information dimension table indicate that the unit type is a 1.5 MW air-cooled unit. If, for the unit with code 101001001 whose unit type is a 2.5 MW air-cooled unit, the returned data give a main control version number of 1500_FR_V170725 and a fan type of 121/1500, the configuration information is determined to conflict, and all data acquired for the current unit in the same extraction are filtered out. The above examples are merely exemplary, and the disclosure is not limited thereto.
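The business logic check, including the rule that a conflict removes all of a unit's data from the same extraction, can be sketched as follows. The rule table encodes the version and inverter type example from the description; the variable names, the row layout and the function name are assumptions for illustration.

```python
from collections import defaultdict

# Allowed inverter types per main control version, from the example above.
ALLOWED_INVERTERS = {"1500_FR_V21070725": {1, 3, 5}}


def drop_conflicting_units(batch, allowed=ALLOWED_INVERTERS):
    """Remove every row of any unit whose configuration values conflict."""
    # Collect each unit's variables into one record.
    by_unit = defaultdict(dict)
    for row in batch:
        by_unit[row["unit_code"]][row["variable_name"]] = row["variable_value"]

    conflicting = set()
    for unit, values in by_unit.items():
        ok_types = allowed.get(values.get("main_control_version"))
        if (ok_types is not None and "inverter_type" in values
                and values["inverter_type"] not in ok_types):
            conflicting.add(unit)

    # A conflict discards all of that unit's rows from this extraction.
    return [row for row in batch if row["unit_code"] not in conflicting]
```

Dropping the whole batch for a unit, rather than only the conflicting rows, follows the description: once two related values disagree, none of that unit's values from the same extraction can be trusted.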
In step S214, referring to the variable information dimension table, conflict verification is performed on the associated variable values between the variables of each unit; when the associated variable values conflict, all data of the current extraction round is filtered out. Specifically, referring to Fig. 6, after business logic verification has been performed on the blower data, in step S610 the data is grouped by unit code with reference to the variable information dimension table, so that the data is flattened. In step S620, fields of the same type are selected for the interrelated variable values between different variables of the same unit, and in step S630 variable value conflict verification is performed. If the variable values of different variables conflict, then in step S640 the data of the current batch is discarded, that is, all data collected from the current unit in the same acquisition round. For example, units of the same product type have a corresponding series of parameter configuration variables: a 1.5MW water-cooled unit corresponds to variable values such as the inverter type, the main control program version number and the initialization file number, each of a specific type or format. This step performs conflict verification on these related variable values; if the variable values conflict between units of the same product type, all data collected from the current unit in the same acquisition round is filtered out.
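The grouping and variable value check of steps S610 to S640 might look like the sketch below. The function names and the `EXPECTED_BY_PRODUCT` table are assumptions for illustration, and the allowed values in that table are placeholders rather than values taken from the patent.

```python
from collections import defaultdict

# Hypothetical stand-in for the variable information dimension table:
# product type -> variable name -> allowed values (placeholder values only).
EXPECTED_BY_PRODUCT = {
    "1.5MW water-cooled": {"inverter_type": {1, 3, 5}},
}


def flatten_by_unit(rows):
    """Group long-format (unit_code, variable_name, value) rows into one dict per unit."""
    units = defaultdict(dict)
    for unit_code, variable_name, value in rows:
        units[unit_code][variable_name] = value
    return dict(units)


def conflicting_units(flat, product_of_unit):
    """Return the unit codes whose variable values fall outside the expected sets."""
    bad = set()
    for unit_code, variables in flat.items():
        expected = EXPECTED_BY_PRODUCT.get(product_of_unit.get(unit_code), {})
        for name, allowed in expected.items():
            if name in variables and variables[name] not in allowed:
                bad.add(unit_code)
    return bad


def drop_conflicting(rows, product_of_unit):
    """Discard every row of a unit that has at least one conflicting variable value."""
    bad = conflicting_units(flatten_by_unit(rows), product_of_unit)
    return [row for row in rows if row[0] not in bad]
```

Flattening first means each unit is checked once against all of its related variables, instead of comparing rows pairwise.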
In addition, the execution order of step S213 and step S214 is not limited to the above embodiment: step S213 and step S214 may be performed in parallel, or step S214 may be performed before step S213.
Referring again to Fig. 1, in step S140, the cleaned blower data of each data source is sorted by unit code, variable name and data acquisition time, and data with duplicate variable values is filtered out to generate a new blower data fact table. The newly generated blower data fact table is then sorted in reverse order of data acquisition time to generate a blower data real-time table.
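A minimal sketch of this fusion step follows. It assumes that "duplicate variable values" means consecutive readings of the same unit and variable whose value has not changed, which the text does not state explicitly; the `Row` type and function names are likewise illustrative.

```python
from typing import NamedTuple


class Row(NamedTuple):
    unit_code: str
    variable_name: str
    acquired_at: str  # ISO 8601 timestamp, which sorts correctly as a string
    value: str


def build_fact_table(rows):
    """Sort by (unit code, variable name, acquisition time) and drop unchanged values."""
    ordered = sorted(rows, key=lambda r: (r.unit_code, r.variable_name, r.acquired_at))
    fact = []
    for row in ordered:
        prev = fact[-1] if fact else None
        same_series = prev is not None and (
            (prev.unit_code, prev.variable_name) == (row.unit_code, row.variable_name)
        )
        if same_series and prev.value == row.value:
            continue  # assumed meaning of "duplicate": value unchanged since last kept row
        fact.append(row)
    return fact


def build_realtime_table(fact):
    """Order the fact table newest first by acquisition time."""
    return sorted(fact, key=lambda r: r.acquired_at, reverse=True)
```

Under this reading, the fact table keeps only the moments at which a variable changed, and the real-time table exposes the most recent changes first.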
Then, the blower data real-time tables of the data sources are integrated and cleaned, and the integrated and cleaned blower data real-time table is stored in a database. Referring to Fig. 7, in step S710, the blower data in each operation management system is sorted by unit code, variable name and data acquisition time to generate a new blower data fact table, and the newly generated blower data fact table is sorted in reverse order of unit code, variable name and data acquisition time to generate a blower data real-time table. The blower data real-time tables of the operation management systems are then merged: in step S720, duplicate records are filtered out during the merge, and in step S730 a data cleansing operation is performed again on the merged data. The data cleansing steps in data fusion are similar to those in step S130 and are not repeated here. Finally, in step S740, the integrated and cleaned blower data real-time table is stored in the database. This preserves the integrity and accuracy of the data to the greatest extent, maintains the traceability of the blower data, and makes problems easier to track.
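The merge of step S720 can be illustrated with the sketch below. It assumes each record is a hashable tuple `(unit_code, variable_name, acquired_at, value)` and that a duplicate record is one that is identical in all four fields; the function name is hypothetical.

```python
def merge_realtime_tables(tables):
    """Merge per-source real-time tables, keeping the first copy of each record."""
    seen = set()
    merged = []
    for table in tables:
        for record in table:
            if record in seen:
                continue  # identical record already contributed by another source
            seen.add(record)
            merged.append(record)
    # Reverse order of unit code, variable name and acquisition time, as in step S710.
    merged.sort(key=lambda r: (r[0], r[1], r[2]), reverse=True)
    return merged
```

The merged table would then go through the same cleansing checks again (step S730) before being stored.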
Fig. 8 is a block diagram showing a multi-source data integration device according to an exemplary embodiment of the present invention.
As shown in Fig. 8, the integration device 80 includes an analysis design module 801, a data extraction module 802, a data conversion module 803, a data cleansing module 804 and a data fusion module 805. The analysis design module 801 designs the table structure of the blower data fact table by analyzing the blower business data and the blower data of each data source. The data extraction module 802 extracts the blower data of each data source into the designed blower data fact table respectively. The data conversion module 803 converts the data identifiers in the extracted blower data into unified data identifiers. The data cleansing module 804 performs data cleansing on the converted data of each data source. The data fusion module 805 performs data fusion on the cleaned data of each data source to generate the blower data real-time table.
In designing the table structure of the blower data fact table, the analysis design module 801 generates a unit basic information dimension table from the basic data of each data source using unified data variable names, generates a variable information dimension table by determining the variable information required by the business system, and generates a variable conversion relation table between the data sources by analyzing the variable information of each data source. By analyzing the blower business data and the blower data of each data source, the analysis design module 801 designs the table structure of the blower data fact table according to the unit code and unit variable name of each data source, in combination with the data acquisition time and the data return time.
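One possible shape for such a fact table is sketched below using SQLite. The table name, column names and column types are assumptions; the text only states that the structure is built from the unit code, the unit variable name, the data acquisition time and the data return time, and the value column is added here because a fact row needs something to record.

```python
import sqlite3

# Hypothetical DDL for the blower data fact table.
FACT_TABLE_DDL = """
CREATE TABLE blower_data_fact (
    unit_code      TEXT NOT NULL,  -- unified unit code
    variable_name  TEXT NOT NULL,  -- unified unit variable name
    acquired_at    TEXT NOT NULL,  -- data acquisition time
    returned_at    TEXT NOT NULL,  -- data return time
    variable_value TEXT            -- assumed column holding the reading
)
"""


def create_fact_table(conn):
    """Create the fact table and return its column names in declaration order."""
    conn.execute(FACT_TABLE_DDL)
    return [row[1] for row in conn.execute("PRAGMA table_info(blower_data_fact)")]
```

Calling `create_fact_table(sqlite3.connect(":memory:"))` creates the table in a throwaway in-memory database, which is convenient for trying the layout out.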
After the analysis design module 801 has analyzed the data of each data source and the blower business data, the data extraction module 802 first uses a universal data interface to extract all blower data from each operation management system into a temporary table of the target database, which completes the initial data extraction. Then, using the variable names defined in the variable information dimension table, the data relevant to the business system is extracted into the designed blower data fact table. In subsequent data extraction, the data extraction module 802 extracts data in daily increments according to the data acquisition time.
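The two selection steps described here, picking business-relevant variables out of the temporary table and then extracting by daily increment, can be sketched as follows. Rows are modelled as plain dicts, and the function and key names are assumptions.

```python
from datetime import date, datetime, timedelta


def select_business_rows(temp_rows, defined_variable_names):
    """Keep only rows whose variable is defined in the variable information dimension table."""
    return [r for r in temp_rows if r["variable_name"] in defined_variable_names]


def daily_increment(rows, day):
    """Return the rows acquired on the given calendar day (half-open interval)."""
    start = datetime.combine(day, datetime.min.time())
    end = start + timedelta(days=1)
    return [r for r in rows if start <= r["acquired_at"] < end]
```

Using a half-open interval `[start, end)` means a reading stamped exactly at midnight belongs to one day only, so consecutive daily runs neither overlap nor leave gaps.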
After the data has been extracted into the blower data fact table, the data conversion module 803 converts the unit codes and wind farm codes of each operation management system into unified unit codes and wind farm codes according to the variable conversion relation table.
After the data identifiers of the extracted data have been converted into unified data identifiers, the data cleansing module 804 performs data cleansing on the converted data. Specifically, the data cleansing module 804 first refers to the generated variable information dimension table and, to handle the duplicate data, incomplete data and/or misplaced data that each data source produces during data transmission, performs type checking on the blower data of different data types and filters out blower data whose data type does not match. Then, according to the variable information dimension table and the unit basic information dimension table, the data cleansing module 804 performs a validity check on the blower data that has passed the data type check, based on the verification rules, naming conventions and value range of each unit variable, and filters out invalid data. After the above cleansing process, the data cleansing module 804 further determines, according to the variable information dimension table and the unit basic information dimension table, whether the associated configuration information in the blower data conflicts. When the configuration information conflicts, all data of the current extraction round is filtered out. The data cleansing module 804 also performs conflict verification on the associated variable values between the variables of each unit according to the variable information dimension table; when the associated variable values conflict, all data of the current extraction round is filtered out. The cleansing operation of the data cleansing module 804 is the same as the operation of step S130 and is not described in detail again here.
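The first two cleansing stages, the type check followed by the validity (value range) check, can be sketched as below. The `VARIABLE_INFO` excerpt, including its minimum and maximum values, is a made-up stand-in for the variable information dimension table, and the function names are assumptions.

```python
# Hypothetical excerpt of a variable information dimension table.
VARIABLE_INFO = {
    "inverter_type": {"type": int, "min": 1, "max": 5},
    "program_version": {"type": str},
}


def passes_type_check(name, value):
    """A value passes only if its variable is known and its type matches exactly."""
    info = VARIABLE_INFO.get(name)
    return info is not None and type(value) is info["type"]


def passes_range_check(name, value):
    """Check the value against the optional min/max bounds of its variable."""
    info = VARIABLE_INFO[name]
    low, high = info.get("min"), info.get("max")
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def clean(rows):
    """Apply the type check first, then the range check, to (name, value) rows."""
    typed = [row for row in rows if passes_type_check(*row)]
    return [row for row in typed if passes_range_check(*row)]
```

Running the type check first matters: the range comparison is only meaningful once the value is known to have the expected type.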
After the data of each data source has been cleaned, the data fusion module 805 sorts the cleaned blower data of each data source by unit code, variable name and data acquisition time, filters out data with duplicate variable values to generate a new blower data fact table, and sorts the new blower data fact table in reverse order of data acquisition time to generate a blower data real-time table. Then, the data fusion module 805 integrates and cleans the blower data real-time tables of the data sources and stores the integrated and cleaned blower data real-time table in the database. The data fusion here is the same as the operation of step S140 and is not described again.
A multi-source data integration device based on blower data according to an embodiment of the present disclosure may include: a memory configured to store instructions; and a processor configured to run the instructions stored in the memory to perform the following operations: designing the table structure of a blower data fact table by analyzing blower business data and the blower data of each data source; extracting the blower data of each data source into the designed blower data fact table respectively, and converting the data identifiers in the extracted blower data into unified data identifiers; performing data cleansing on the converted data of each data source; and performing data fusion on the cleaned data of each data source to generate a blower data real-time table.
A multi-source data integration method based on blower data according to an embodiment of the present disclosure may be implemented as computer-readable code on a computer-readable recording medium, or may be transmitted through a transmission medium. A computer-readable recording medium is any data storage device that can store data which can thereafter be read by a computer system. Examples of the computer-readable recording medium include, but are not limited to, read-only memory (ROM), random access memory (RAM), compact disc (CD)-ROM, digital versatile disc (DVD), magnetic tape, floppy disk and optical data storage devices. The transmission medium may include carrier waves transmitted over a network or various types of communication channels. The computer-readable recording medium may also be distributed over network-connected computer systems, so that the computer-readable code is stored and executed in a distributed fashion.
The multi-source data integration method and device based on blower data described above can effectively integrate the existing and future data resources of each data source into a unified, complete, accurate and highly available blower data system. This avoids excessive data redundancy and data inconsistency, makes the data resources easy to query and access, and gives management effective data support for decision-making.
Although the present invention has been particularly shown and described with reference to exemplary embodiments thereof, those skilled in the art should understand that various changes in form and detail may be made therein without departing from the spirit and scope of the present invention as defined by the claims.
Claims (24)
1. A multi-source data integration method based on blower data, characterized in that the method comprises:
designing a table structure of a blower data fact table by analyzing blower business data and the blower data of each data source;
extracting the blower data of each data source into the designed blower data fact table respectively, and converting the data identifiers in the extracted blower data into unified data identifiers;
performing data cleansing on the converted data of each data source;
performing data fusion on the cleaned data of each data source to generate a blower data real-time table.
2. The multi-source data integration method according to claim 1, characterized in that the step of designing the table structure of the blower data fact table comprises:
generating a unit basic information dimension table from the basic data of each data source using unified data variable names;
generating a variable information dimension table by determining the variable information required by the business system;
generating a variable conversion relation table between the data sources by analyzing the variable information of each data source.
3. The multi-source data integration method according to claim 1, characterized in that the step of designing the table structure of the blower data fact table comprises: designing the table structure of the blower data fact table according to the unit code and unit variable name of each data source, in combination with the data acquisition time and the data return time.
4. The multi-source data integration method according to claim 2, characterized in that the step of extracting data comprises: extracting all blower data from each data source into a temporary table of a target database, and extracting the data relevant to the blower business system into the designed blower data fact table according to the variable names defined in the variable information dimension table.
5. The multi-source data integration method according to claim 2, characterized in that the step of data conversion comprises: converting the unit codes and wind farm codes of each data source into unified unit codes and wind farm codes with reference to the generated variable conversion relation table.
6. The multi-source data integration method according to claim 2, characterized in that the step of data cleansing comprises: with reference to the generated variable information dimension table, and for the duplicate data, incomplete data and/or misplaced data that each data source produces during data transmission, performing type checking on the blower data of different data types and filtering out blower data whose data type does not match.
7. The multi-source data integration method according to claim 6, characterized in that the step of data cleansing further comprises: with reference to the variable information dimension table and the unit basic information dimension table, performing data validity verification on the blower data that has passed the data type check, based on the verification rules, naming conventions and value range of each unit variable, and filtering out data determined to be invalid.
8. The multi-source data integration method according to claim 7, characterized in that the step of data cleansing further comprises: with reference to the variable information dimension table and the unit basic information dimension table, determining whether the associated configuration information in the blower data conflicts, and, when the configuration information conflicts, filtering out all data of the current extraction round.
9. The multi-source data integration method according to claim 7, characterized in that the step of data cleansing further comprises: with reference to the variable information dimension table, performing conflict verification on the associated variable values between the variables of each unit, and, when the associated variable values conflict, filtering out all data of the current extraction round.
10. The multi-source data integration method according to claim 1, characterized in that the step of data fusion comprises: sorting the cleaned blower data of each data source in the order of unit code, variable name and data acquisition time, filtering out data with duplicate variable values to generate a new blower data fact table, and sorting the new blower data fact table in reverse order of data acquisition time to generate the blower data real-time table.
11. The multi-source data integration method according to claim 10, characterized in that the step of data fusion further comprises: integrating and cleaning the blower data real-time tables of the data sources, and storing the integrated and cleaned blower data real-time table in a database.
12. A multi-source data integration device based on blower data, the device comprising:
an analysis design module configured to design a table structure of a blower data fact table by analyzing blower business data and the blower data of each data source;
a data extraction module configured to extract the blower data of each data source into the designed blower data fact table respectively;
a data conversion module configured to convert the data identifiers in the extracted blower data into unified data identifiers;
a data cleansing module configured to perform data cleansing on the converted data of each data source;
a data fusion module configured to perform data fusion on the cleaned data of each data source to generate a blower data real-time table.
13. The multi-source data integration device according to claim 12, characterized in that the analysis design module is configured to:
generate a unit basic information dimension table from the blower data of each data source using unified data variable names;
generate a variable information dimension table by determining the variable information required by the business system;
generate a variable conversion relation table between the data sources by analyzing the variable information of each data source.
14. The multi-source data integration device according to claim 12, characterized in that the analysis design module is configured to design the table structure of the blower data fact table according to the unit code and unit variable name of each data source, in combination with the data acquisition time and the data return time.
15. The multi-source data integration device according to claim 13, characterized in that the data extraction module is configured to extract all blower data from each data source into a temporary table of a target database, and to extract the data relevant to the blower business system into the designed blower data fact table according to the variable names defined in the variable information dimension table.
16. The multi-source data integration device according to claim 13, characterized in that the data conversion module is configured to convert the unit codes and wind farm codes of each data source into unified unit codes and wind farm codes with reference to the generated variable conversion relation table.
17. The multi-source data integration device according to claim 13, characterized in that the data cleansing module is configured to, with reference to the generated variable information dimension table, and for the duplicate data, incomplete data and misplaced data that each data source produces during data transmission, perform type checking on the blower data of different data types and filter out blower data whose data type does not match.
18. The multi-source data integration device according to claim 17, characterized in that the data cleansing module is further configured to, with reference to the variable information dimension table and the unit basic information dimension table, perform data validity verification on the blower data that has passed the data type check, based on the verification rules, naming conventions and value range of each unit variable, and filter out data determined to be invalid.
19. The multi-source data integration device according to claim 18, characterized in that the data cleansing module is further configured to, with reference to the variable information dimension table and the unit basic information dimension table, determine whether the associated configuration information in the blower data conflicts, and, when the configuration information conflicts, filter out all data of the current extraction round.
20. The multi-source data integration device according to claim 18, characterized in that the data cleansing module is further configured to, with reference to the variable information dimension table, perform conflict verification on the associated variable values between the variables of each unit, and, when the associated variable values conflict, filter out all data of the current extraction round.
21. The multi-source data integration device according to claim 12, characterized in that the data fusion module is configured to sort the cleaned blower data of each data source in the order of unit code, variable name and data acquisition time, filter out data with duplicate variable values to generate a new blower data fact table, and sort the new blower data fact table in reverse order of data acquisition time to generate the blower data real-time table.
22. The multi-source data integration device according to claim 21, characterized in that the data fusion module is further configured to integrate and clean the blower data real-time tables of the data sources, and to store the integrated and cleaned blower data real-time table in a database.
23. A computer-readable storage medium comprising a computer program, the computer program being executable by a processor to perform the method according to any one of claims 1-11.
24. A computer, comprising:
a memory configured to store instructions;
a processor configured to run the instructions stored in the memory to perform the method according to any one of claims 1-11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711418200.9A CN110019228B (en) | 2017-12-25 | 2017-12-25 | Multi-source data integration method and device based on fan data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110019228A (en) | 2019-07-16 |
CN110019228B (en) | 2022-08-09 |
Family
ID=67186984
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711418200.9A Active CN110019228B (en) | 2017-12-25 | 2017-12-25 | Multi-source data integration method and device based on fan data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019228B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN110019228B (en) | 2022-08-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||