CN109213752A - A CIM-based data cleaning and conversion method - Google Patents
A CIM-based data cleaning and conversion method
- Publication number
- CN109213752A (application CN201810887270.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- cim
- operable
- node
- distributed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention provides a CIM-based data cleaning and conversion method, comprising: capturing the operating data of a power system; cleaning and converting the captured operating data to obtain data conforming to the CIM unified standard, and storing them in a distributed file system; and extracting data from the distributed file system to construct a CIM-based distributed data warehouse. With the support of an improved grid operating-data model and a distributed data platform, the method extracts, cleans and integrates source data, ensures data quality and reliability, and delivers standardized data output based on the data warehouse. It has broad applicability, supports cluster deployment and concurrency, and provides reliable support for automated integration and analysis of grid data.
Description
Technical field
The invention belongs to the field of power grid big data, and in particular relates to a CIM-based data cleaning and conversion method.
Background art
With the wide deployment of power transmission and transformation equipment, the volume of grid operating data is growing geometrically. Performing fast analysis of massive operating data, so as to detect and mine abnormal data, raises the challenge of processing grid big data effectively and analyzing it efficiently. Because the software and hardware systems and the resources of the provincial grid companies differ considerably, building a data analysis platform is difficult. Traditional grid operating-data platforms can no longer meet the enterprise's needs for optimized data storage and parallel processing. Traditional data storage structures are intuitive, but their major drawback is high data redundancy: operating information is stored repeatedly, which complicates join operations between different operating-data tables and lowers the query efficiency of the operating data.
Summary of the invention
The object of the invention is to process grid big data effectively, so that an enterprise can integrate and merge data from multiple network systems to achieve unified and efficient big-data analysis. By establishing a distributed data cleaning and conversion framework and an operable data area, conflicts between the data conversion process and data queries are avoided, and grid data mining - including association analysis and abnormal-data recognition based on an improved grid data model - is carried out on top of the constructed data warehouse.
To solve the above problems, the invention proposes a CIM-based data cleaning and conversion method, comprising:
capturing the operating data of a power system;
cleaning and converting the captured operating data to obtain data conforming to the CIM unified standard, and storing them in a distributed file system;
extracting data from the distributed file system and constructing a CIM-based distributed data warehouse.
Preferably, the operating data include equipment ledger information, operation and maintenance data, fault data, power-flow topology data and GIS device information.
Preferably, the model information of the power system and the metadata of the CIM-based distributed data warehouse are stored in MangoDB.
Preferably, the CIM-based distributed data warehouse, after decomposing a task via MapReduce, extracts data directly from the distributed file system for analysis, performs unified data management and data access, and implements model-data mapping and performance optimization.
Preferably, the model-data mapping includes mapping between the attributes of the power-system business model and the model data of heterogeneous underlying data sources.
Preferably, the cleaning and conversion comprise two stages: in the first stage, data are extracted from the data sources into an operable data buffer; in the second stage, data are extracted from the operable data buffer into the CIM-based data warehouse. (1) In the first stage, heterogeneous data sources are extracted into the operable data buffer, establishing in the buffer a copy of the power system's operating data with identical structure and identical content. (2) In the second stage, the data in the operable data buffer are statistically merged and summarized, and loaded into the CIM-based data warehouse in incremental-loading mode. The extraction is incremental; if the increment cannot be determined at extraction time, it is computed at load time, and a time tag is added when the data are loaded into the CIM-based data warehouse. In the extraction flow from the operable data buffer to the CIM-based data warehouse, data read from the buffer first undergo unified information encoding, and fact-table data and dimension-table data are then processed separately. For changes in fact-table data, different incremental-loading modes are chosen according to the change pattern: if the data change over time, timestamp-based increments are used; if the data change randomly, a full-table comparison is performed to compute the increment. For changes in dimension-table data, the latest CIM-based data overwrite the offline data.
Preferably, extracting data from the distributed file system for analysis further comprises a similarity analysis of the power system operating data.
Preferably, in the similarity analysis of the power system operating data, the relationship between different sequences is judged from the shape of the sequence curves, and temporal-feature correlation factors are selected as the samples for computing the relational degree. The calculation steps are as follows:
(1) let the current time sequence Y = {Y(m) | m = 1, 2, ..., p} be the reference sequence and the historical operating-data sequences X_i = {X_i(m) | m = 1, 2, ..., p}, i = 1, 2, ..., k, be the comparison sequences, where p is the number of sequence elements;
(2) calculate;
(3) calculate the relational coefficient ζ_i(m), where ζ_i(m) is the relational coefficient of Y(m) with respect to X_i(m), Δ_i(m) = |y(m) − x_i(m)|, and ρ is the resolution coefficient with value in (0, 1);
(4) calculate the relational degree.
Compared with the prior art, the invention has the following advantages: the proposed CIM-based data cleaning and conversion method, supported by an improved grid operating-data model and a distributed data platform, extracts, cleans and integrates source data, ensures data quality and reliability, delivers standardized data output based on the data warehouse, has broad applicability with support for cluster deployment and concurrency, and provides reliable support for automated integration and analysis of grid data.
Detailed description of the invention
Fig. 1 is a flow chart of the CIM-based data cleaning and conversion method according to an embodiment of the invention.
Specific embodiment
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings. The description presents, by way of example and not limitation, specific embodiments consistent with the principles of the invention, in sufficient detail to enable those skilled in the art to practice the invention; other embodiments may be used, and structural changes and/or element substitutions may be made, without departing from the scope and spirit of the invention. The following detailed description is therefore not to be taken in a limiting sense. The technical means, creative features, objects and effects of the invention are further described below in conjunction with the specific drawings.
One aspect of the invention provides a CIM-based data cleaning and conversion method. Fig. 1 is a flow chart of the CIM-based data cleaning and conversion method according to an embodiment of the invention.
The CIM-based power system operating-data monitoring platform of the invention comprises a data-gathering server, a data-processing and storage server, and a data-analysis server. The data-gathering server captures power system operating data through sensors and the like; the operating data include equipment ledger information, operation and maintenance data, fault data, power-flow topology data and GIS device information, as well as unstructured images and video. The massive heterogeneous data contain large amounts of complex, redundant and erroneous records, from which data conforming to a unified standard must be extracted within a short time. The data-processing and storage server integrates a distributed file system with a MangoDB database: monitoring data conforming to the unified standard are stored in the distributed file system, while the model information of the power system and the metadata of the CIM-based distributed data warehouse - including the tables and fields created for the warehouse - are stored in MangoDB. While data operations are executed, the MangoDB engine is started to verify whether the corresponding metadata exist. The data-analysis server performs distributed similarity analysis of the power system operating data. After the CIM-based distributed data warehouse decomposes a task via MapReduce, data are extracted directly from the distributed file system for analysis; data management and data access are unified, and model-data mapping and performance optimization are implemented at this layer. The model-data mapping maps each attribute of the power-system business model to the model data of heterogeneous underlying data sources, supports access to the CIM-based data warehouse, relational databases and non-relational databases, and provides unified query and update APIs based on the business model; the performance optimization provides two-level caching and asynchronous parallel data queries.
On the basis of the above distributed architecture, the data-processing and storage server is provided with a CIM-based data cleaning and conversion framework comprising a semantic-analysis module, a MangoDB rule base, a scheduling module and cleaning-conversion modules. The power system receives data cleaning and conversion tasks requested by users and interprets each task as a workflow graph with a DAG structure in a unified format; because the conversion semantics formulated by the power system are not logically optimal, the optimization work is handed over to the semantic-analysis module.
The semantic-analysis module analyzes and optimizes the cleaning-conversion workflow graph formatted by the power system: by traversing each node of the graph it determines the activity attributes in the workflow, transforms the workflow graph, and finally sends the optimized workflow graph to the coordination unit for execution. The detailed process is as follows:
1. Loop through each node of the workflow graph. For a node with in-degree 0 - i.e. a power-system data source - determine the data volume of the source and record the relevant CIM-based information in the MangoDB rule base; for a node with out-degree 0 - i.e. an operable data set - record the relevant metadata in the MangoDB rule base; for an activity node whose in-degree and out-degree are both greater than 0, determine its activity type, and for those used as the binary activity nodes that partition the workflow, record the activity's attributes and position (a sketch of this classification follows the list).
2. After traversing the nodes, apply optimizations such as exchanging conversion operations between the nodes of the workflow to reduce data exchange between activity nodes.
3. Split the optimized workflow into multiple sub-workflows with binary activities as boundaries, group the unary activities within each sub-workflow, and send the groups to the coordination unit for execution, marking each group to give the coordination unit a reference for dynamically optimizing the workflow.
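As an illustration only (not the patent's implementation; the node names and the treatment of any node with more than one input as a binary activity are assumptions), the node classification of step 1 can be sketched in Python as follows:

```python
from collections import defaultdict

def classify_nodes(edges):
    """edges: iterable of (src, dst) pairs describing the cleaning-conversion workflow DAG."""
    indeg, outdeg, nodes = defaultdict(int), defaultdict(int), set()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
        nodes.update((src, dst))
    sources, sinks, unary, binary = [], [], [], []
    for n in nodes:
        if indeg[n] == 0:        # in-degree 0: a power-system data source
            sources.append(n)    # its data volume / CIM info would go to the rule base
        elif outdeg[n] == 0:     # out-degree 0: an operable data set
            sinks.append(n)      # its metadata would go to the rule base
        elif indeg[n] >= 2:      # assumed: several inputs -> binary activity, sub-workflow boundary
            binary.append(n)
        else:                    # single input and output: unary activity, can be grouped
            unary.append(n)
    return sources, sinks, unary, binary

# Example DAG: two sources feed a join (binary activity) through one cleaning step (unary activity).
print(classify_nodes([("src1", "clean"), ("clean", "join"),
                      ("src2", "join"), ("join", "sink")]))
```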
The coordination unit further comprises a divide-and-conquer module and a scheduling cleaning-conversion module. The divide-and-conquer module partitions the data so that existing resources are fully used and the performance advantage of parallel computation is exploited, horizontally partitioning the data into multiple data streams according to partition rules. The scheduling cleaning-conversion module then packages the corresponding data-cleaning and conversion activities and distributes them to the distributed parallel cleaning-conversion modules for execution. The coordination unit receives the execution information from the cleaning-conversion modules and tracks the execution progress of the data-cleaning activity nodes in real time. The divide-and-conquer module and the scheduling cleaning-conversion module execute in coordination: the former performs statistical analysis of the execution information from the cleaning-conversion modules to obtain an optimized data-partition strategy, and the latter assigns tasks to the cleaning-conversion modules according to the real-time optimization results.
The cleaning-conversion modules execute, in a distributed computing environment, the computation packages distributed by the coordination unit and cache the intermediate conversion results locally; network bandwidth is used for data transmission only when the output data of multiple nodes need to be collected.
The semantic-logic-based optimization is completed in the semantic-analysis phase: after all activity nodes of the data cleaning-conversion workflow have been traversed, the semantic-analysis module reorders and exchanges activities and merges semantically duplicate activities according to the attributes of the different activity nodes, modifying the cleaning-conversion workflow without changing the execution result of the data workflow. The optimized workflow reduces the volume of data circulating between different nodes. For the CIM-based data cleaning and conversion tasks described here, the framework uses an optimization strategy based on relational databases.
The divide-and-conquer module partitions the data as follows. A data source T is horizontally partitioned into T_1 and T_2, so that T = T_1 ∪ T_2, and each activity Act_i (i ∈ [1, m]) of the CIM-based cleaning process is regarded as a function mapping applied to T. The module checks whether the cleaning-conversion workflow contains a serial sequence of activities with the following property: if Act_m(Act_{m-1}(...(Act_i(T)))) = {D_1, D_2, ..., D_m}, where D_1, D_2, ..., D_m are the results of applying the workflow subgraph's mapping to T, then Act_m(Act_{m-1}(...(Act_i(T_1)))) ∪ Act_m(Act_{m-1}(...(Act_i(T_2)))) = {D_1, D_2, ..., D_m}; similarly, a relational operation M on T satisfies M(T) = M(T_1) ∪ M(T_2). If such sequences exist in the workflow, they are merged into one group. By partitioning the data in this way, activities are assigned to different cleaning-conversion modules for asynchronous execution, forming a pipeline effect.
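Restated compactly in conventional notation (the same Act, T and M symbols as above; this adds no new condition):

$$
\mathrm{Act}_m\!\bigl(\cdots\mathrm{Act}_i(T_1\cup T_2)\bigr)
=\mathrm{Act}_m\!\bigl(\cdots\mathrm{Act}_i(T_1)\bigr)\cup\mathrm{Act}_m\!\bigl(\cdots\mathrm{Act}_i(T_2)\bigr)
=\{D_1,\ldots,D_m\},\qquad
M(T)=M(T_1)\cup M(T_2).
$$

When a serial segment of activities distributes over the horizontal partitions in this way, the segment can be executed independently on T_1 and T_2 and the results merged, which is what permits the pipeline described above.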
For cleaning-conversion workflows executed in parallel, if the data distribution does not match the resources of the MapReduce nodes after the system has started executing processing tasks, the system decides, based on the current execution progress, whether to perform a further data division. Data division takes place when a MapReduce node becomes idle or when a new unary activity task starts executing. The strategy proceeds as follows: first, the cleaning-conversion modules in the MapReduce nodes that are currently executing tasks are inspected, and the activity with the latest completion time is chosen as the object of data division; then the idle cleaning-conversion modules in the MapReduce nodes are inspected to check whether the execution condition is met - the idle-time window of a node must exceed the sum of the cross-machine transmission time of the divided data and the computation time, otherwise the division is pointless - and the eligible cleaning-conversion modules are recorded; finally, the amount of data to divide is computed such that the total time for transmitting these data to the idle cleaning-conversion modules and computing on them is minimized (a numeric sketch of this check follows).
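A minimal numeric sketch of the eligibility check and of the largest divisible data volume (the rate parameters, row counts and function names are assumptions, not values from the patent):

```python
def division_pays_off(idle_window_s, rows, transfer_rows_per_s, compute_rows_per_s):
    """The idle-time window must exceed cross-machine transfer time plus computation time."""
    cost_s = rows / transfer_rows_per_s + rows / compute_rows_per_s
    return idle_window_s > cost_s

def max_divisible_rows(idle_window_s, transfer_rows_per_s, compute_rows_per_s):
    """Largest row count whose transfer plus compute time still fits in the idle window."""
    per_row_s = 1.0 / transfer_rows_per_s + 1.0 / compute_rows_per_s
    return int(idle_window_s / per_row_s)

# Example: a 30 s idle window, 10,000 rows/s transfer rate, 50,000 rows/s compute rate.
print(division_pays_off(30, 300_000, 10_000, 50_000))   # False: the window is too short
print(max_divisible_rows(30, 10_000, 50_000))           # 250000 rows still fit
```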
When an idle cleaning-conversion module appears in a MapReduce node and there are activities to be scheduled, the system activates the data-division algorithm. Data processing in a cleaning-conversion module is divided into two stages: the first stage extracts data from the data sources into the operable data buffer, and the second stage extracts data from the operable data buffer into the CIM-based data warehouse:
(1) In the first stage, heterogeneous data sources are extracted into the operable data buffer, establishing in the buffer a copy of the power system's operating data with identical structure and identical content.
(2) In the second stage, the data in the operable data buffer are statistically merged and summarized, and loaded into the CIM-based data warehouse in incremental-loading mode. The extraction is incremental; if the increment cannot be determined at extraction time, it is computed at load time, and a time tag is added when the data are loaded into the CIM-based data warehouse. In the extraction flow from the operable data buffer to the CIM-based data warehouse, data read from the buffer first undergo unified information encoding, and fact-table data and dimension-table data are then processed separately. For changes in fact-table data, different incremental-loading modes are chosen according to the change pattern: if the data change over time, timestamp-based increments are used; if the data change randomly, a full-table comparison is performed to compute the increment. For changes in dimension-table data, the latest CIM-based data overwrite the offline data. A sketch of this incremental-loading choice follows.
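A minimal sketch of the second-stage incremental load, under assumptions about the row layout (an `updated_at` field for timestamp increments, an `id` key for the full-table comparison); this is an illustration, not the patent's code:

```python
from datetime import datetime, timezone

def fact_increment(source_rows, warehouse_rows, last_load_ts, time_ordered, key="id"):
    """Choose the incremental-loading mode for fact-table data."""
    if time_ordered:
        # data change over time: timestamp increment, keep only rows newer than the last load
        delta = [r for r in source_rows if r["updated_at"] > last_load_ts]
    else:
        # data change randomly: full-table comparison against the warehouse copy
        existing = {r[key]: r for r in warehouse_rows}
        delta = [r for r in source_rows if r != existing.get(r[key])]  # assumes identical business fields
    load_ts = datetime.now(timezone.utc).isoformat()
    for r in delta:
        r["load_ts"] = load_ts       # time tag added when loading into the CIM-based warehouse
    return delta

def dimension_load(latest_cim_rows, key="id"):
    # dimension-table change: the latest CIM-based data simply overwrite the offline data
    return {r[key]: r for r in latest_cim_rows}
```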
The operable data buffer serves as a backup of the power system's database management system: power system operating data such as production defects and network loads are backed up there, so that during the cleaning-conversion process the grid operating-data backups in the operable data area can be used as data sources, and after conversion and cleaning these data are loaded into the subject models of the CIM-based data warehouse. All power system operating data that need to enter the CIM-based data warehouse are first transmitted to the operable data buffer, then cleaned, converted and mapped from the operable data buffer into the target subject areas of the CIM-based data warehouse; after processing, the data in the operable data buffer are deleted.
The data staging area of the operable data buffer stores the raw data of the power system and the raw data transferred from the heterogeneous systems; grid operating data are stored by subject. The data in the staging area are cleaned and then stored, by subject and according to the data model, in the subject data mart. The data of the subject data mart enter the CIM-based data warehouse after one further conversion. The CIM-based data warehouse is organized into multiple subject models and dimension-table models.
As a further embodiment, the cleaning-conversion module of the invention performs the following steps during CIM-based data conversion:
(1) Determine the positions in the power-system data sources to be converted and cleaned; capture null fields, then either load them or substitute data with other meanings, and route the records to different target libraries according to the null fields.
(2) Extract data samples from the data sources, analyze whether the extracted data are consistent with the definitions, search for the format and structure of abnormal data, and define CIM business rules; standardize data formats by defining constraints on field formats, load the numeric, time and character values of the data sources in user-defined formats, and split fields according to CIM business requirements.
(3) Verify data correctness using lookup tables, then replace invalid and missing data, and prescribe in advance the handling strategy for lost data.
(4) Transform the data into a standard data model, standardizing data values and formats according to the definitions; while establishing the constraints, place non-conforming invalid data into the error data set by replacement or export, and guarantee the uniqueness of the data primary keys.
To minimize the impact of query conflicts, the invention further divides the CIM-based data cleaning and conversion flow into asynchronous conversion, used for offline data, and synchronous conversion, used for real-time grid operating data. Asynchronous conversion loads the offline operating data in the power-system data sources - data that have lost real-time relevance - into the data warehouse in batch mode at a predetermined period. Synchronous conversion actively captures operating data that change in real time in the power system and loads them into the operable data storage area; after query and analysis of the latest data in the operable data storage area are completed and certain system conditions are triggered, the data are imported in batch into the CIM-based data warehouse. The operable data storage area consists of multiple data copies and a copy index based on dual links; a copy is a data space with identical logical and physical structure, created dynamically in the operable data storage area.
When a copy is created, a corresponding waveform file is kept in the operable data storage area, and the real-time grid operating data are loaded into the copy in order. The copy index consists of two queues, one lateral and one longitudinal: the lateral queue is composed of replica nodes that share the same data-item ID but have different timestamps, and the longitudinal queue is composed of the head nodes of the copy queues of different data-item IDs.
A copy queue consists of a queue head node and queue nodes. The head node has two attributes: a data-item ID and a first address. The data-item ID identifies the source of the data; within one copy queue, all replica nodes hold data from the same source and therefore share the same data-item ID, and data with identical data-item IDs are called homologous data. The first address stores the address pointing to the first replica node of the queue.
A queue node has five attributes: replica-node size, replica-node data timestamp, operation label, data storage address, and the address of the next node in the queue. The node size identifies the space occupied by the data of the current replica node. Replica nodes are sorted by timestamp in descending order. The operation label marks which operation is being performed on the data of the current replica node: if the node is currently loading real-time grid operating data from the source into the operable data storage area, its operation label is set to 0; if the data pointed to by the node need to be batch-loaded from the operable data storage area into the CIM-based data warehouse, the label is set to 1. The data storage address points to the location where the replica node's data are stored.
All copies from the same data source form a copy queue, called a copy cluster; the first address of a copy cluster is the address of its queue head node. If data of n different data-item IDs are stored in the operable data storage area, there are n copy clusters. A queue structure is also used between copy clusters, and the copy-cluster queue has no header node. If no copy-cluster queue exists in the operable data storage area - i.e. no copy cluster exists at all - then no real-time grid operating data are currently stored in the operable data storage area.
The creation of a copy is the process of storing real-time grid operating data into the real-time storage region. Specifically: (1) When captured real-time grid operating data need to be loaded into the operable data storage area, the replica-management module allocates a block of space in the operable data storage area, stores the data in this space, and creates a copy pointing to it. (2) Since the copy-cluster structure is a queue, only sequential search can be used: each copy-queue node in the queue is traversed to check whether the copy-cluster queue contains a cluster node with the same data-item ID as the new data, i.e. whether the operable data storage area already holds data homologous to the newly arrived real-time grid operating data. If so, go to (3); if not, go to (9). (3) Using the cluster-copy first address of the copy-cluster node, locate the head node of the copy queue in the current copy cluster. (4) Initialize the new replica node and set its operation label to 0. (5) Insert the new replica node into the copy queue: compare its data timestamp starting from the first replica node of the queue until a replica node is found whose timestamp is greater than that of the new node while the timestamp of its next node is smaller, and insert the new replica node as the next node of that node. (6) If the real-time grid operating data pointed to by a replica node expire, or a system command is received requiring the batch data of the replica node to be imported from the operable data storage area into the CIM-based data warehouse, set the label of this replica node to 1 and load its batch data into the CIM-based data warehouse in order. (7) If a data-update request is received, allocate storage space in the operable data storage area, create a replica node and complete its initialization, then check whether the operable data storage area has a copy queue corresponding to the data-item ID of the new data: if so, go to (8); if not, go to (9). (8) Build a new copy queue for the new replica node: initialize the queue, assign the data-item ID of the new replica node to the data-item ID of the header node, and point the first address of the header node to the first address of the new replica node. (9) Insert the new replica node into the copy queue and update the copy-cluster queue: if the cluster queue has no cluster node corresponding to the data-item ID of the new copy queue, create and initialize a copy-cluster node, assign the data-item ID of the new copy queue to it, point its cluster pointer to the head node of the copy queue, and insert the new cluster node at the tail of the copy-cluster queue. (10) The dual-queue copy index completes the corresponding update. A sketch of these structures is given below.
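A simplified, in-memory sketch of the copy index (class and field names are assumptions; pointers to physical storage are reduced to plain attributes): per-data-item copy queues ordered by timestamp, linked into a cluster queue keyed by data-item ID.

```python
class ReplicaNode:
    def __init__(self, size, timestamp, address):
        self.size = size            # replica-node size
        self.timestamp = timestamp  # replica-node data timestamp
        self.op_label = 0           # 0: loading from source, 1: batch-load to the CIM warehouse
        self.address = address      # data storage address
        self.next = None            # next node in the copy queue

class CopyQueue:
    """Lateral queue: replica nodes with the same data-item ID, newest first."""
    def __init__(self, item_id):
        self.item_id = item_id      # data-item ID of the head node
        self.head = None            # first address

    def insert(self, node):
        if self.head is None or node.timestamp >= self.head.timestamp:
            node.next, self.head = self.head, node
            return
        cur = self.head
        while cur.next and cur.next.timestamp > node.timestamp:
            cur = cur.next
        node.next, cur.next = cur.next, node

class CopyIndex:
    """Longitudinal queue: one copy queue (copy cluster) per data-item ID, no header node."""
    def __init__(self):
        self.clusters = []

    def load(self, item_id, size, timestamp, address):
        queue = next((q for q in self.clusters if q.item_id == item_id), None)
        if queue is None:           # no homologous data yet: create a new cluster at the tail
            queue = CopyQueue(item_id)
            self.clusters.append(queue)
        queue.insert(ReplicaNode(size, timestamp, address))
        return queue
```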
Regarding the data model, the invention uses an improved power system operating-data model. Hierarchy members sharing the same parent are coded with uniform rules, and the hierarchical coding information of the operating-data dimension tables is compressed into the fact tables and stored in the distributed file system, for executing large-scale big-data analysis on the distributed MapReduce nodes during power system operation monitoring. The hierarchical coding uses sequential coding and spliced coding. Sequential coding encodes each attribute of a dimension with decimal numbers in a predefined order, but the correspondence between dimension attributes cannot be obtained from it directly. Spliced coding concatenates the codes, so that dimension traversal can be realized through shift operations on the code. The coding rules are as follows:
All detail data are categorized into a non-overlapping data structure. Let d denote any dimension in the dimension tables; it has the following characteristics:
1) each d contains one and only one subject;
2) d is a set of n levels, denoted l_1, l_2, ..., l_n, and any level l_i contains only unique dimension attributes and m_i values;
3) any dimension can be regarded as a tree structure composed of the values of its levels.
If l_i is any level of dimension d, the set of all its m_i values is the universe of level l_i; level l_{i-1} is then the parent-node level of level l_i, and the parent node of the highest level is defined as the subject to which the dimension belongs. The set of values of level l_i that share a common parent p is called a subset domain of level l_i, and sibling nodes are members belonging to the same class of nodes.
Each dimension can be regarded as a special single hierarchical tree, and the path to any node of this tree is traversed in pre-order. The universe-level code of a node is the code obtained by splicing the codes of the subset-domain levels of each node along the path. A sketch of such path-splicing coding follows.
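A minimal sketch of spliced hierarchical coding (the fixed code width per level and the example hierarchy are assumptions): each node's universe-level code concatenates the codes of the levels along its path, so shifting the code moves up the dimension hierarchy.

```python
WIDTH = 2  # assumed decimal digits reserved per level

def splice_code(path_codes):
    """path_codes: per-level sequential codes from the root (subject) down to the node."""
    return int("".join(f"{c:0{WIDTH}d}" for c in path_codes))

def parent_code(code):
    """Dimension traversal via a shift: dropping the last level yields the parent's code."""
    return code // 10**WIDTH

# Example: subject 1 -> region 3 -> substation 7 encodes as 10307; its parent code is 103.
node = splice_code([1, 3, 7])
assert parent_code(node) == splice_code([1, 3])
```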
The data-analysis server is also used to package grid operating data and their metadata into a unified format; it contains a metadata-packaging module and a shift-combination module. The metadata-packaging module packages the grid-information metadata, and the data are cleaned and checked through the metadata; the shift-combination module recombines the grid operating data and the metadata in a segment-encrypted manner, improving the security of data transmission and exchange and unifying data processing.
The grid-information metadata record the information generated by the data, the power-system information and the transmission information; under the rule constraints of the CIM data conversion, data that do not satisfy the rules cannot pass, and the data are thereby cleaned. The rule-based cleaning cleans the data by extracting the basic metadata values and the power system's additional security-level information.
After cleaning is completed, the basic metadata of the operating data, the power system's additional security-level information and the system operations are packaged into the final grid-information metadata, encapsulated in key-value form; the shift-combination module encapsulates the data and their metadata together into a transmission protocol in a segment-encrypted manner. The data with their metadata are packaged into unified-format data and then undergo CIM data conversion.
During CIM data conversion, the data encapsulated by the metadata-packaging module and the shift-combination module are interpreted to recover the grid data and their metadata, and the data are cleaned with rules according to the metadata, removing non-conforming data.
Cleaning the data with rules means cleaning according to the grid-information metadata provided by the power system as needed. The rules give a unified rule description, and data filtering is realized by processing the metadata information. A rule is designed as a user-defined mapping-rule expression composed of variable values and operators, the variable values being extracted from the grid-information metadata. During cleaning, the metadata are substituted for the variable values, the rule expression is evaluated, and the result is output. When the source-to-target mapping rules of the data are defined, the rules are recorded as mapping expressions. By parsing a mapping expression, the system resolves the positions of the one or more source fields that form the source of a target field, as well as complex condition rules and data-filtering conditions; the parsed conversion rules are stored in the rule base in a prescribed format and submitted to the corresponding conversion modules for processing. Once the mapping expressions are parsed, the user-defined, scattered conversion rules are integrated into the rule base. When data extraction is executed, the conversion rules are read from the rule base and the corresponding conversion components are called to complete the extraction. A sketch of such rule evaluation follows.
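A minimal sketch of evaluating a mapping-rule expression against record metadata (the triple-based rule format, field names and example values are assumptions): variables are metadata keys, operators are ordinary comparisons, and a failing rule means the record is filtered out by the cleaning step.

```python
import operator

OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       "<": operator.lt, ">=": operator.ge, "<=": operator.le}

def eval_rule(rule, metadata):
    """rule: list of (variable, op, constant) triples combined with AND semantics."""
    for var, op, const in rule:
        if not OPS[op](metadata.get(var), const):
            return False        # record does not satisfy the rule -> cleaned out
    return True

# Example: keep only records whose security level is at least 2 and whose source is known.
rule = [("security_level", ">=", 2), ("source", "!=", None)]
record_meta = {"security_level": 3, "source": "SCADA"}
print(eval_rule(rule, record_meta))   # True -> the record passes the cleaning rule
```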
When the data-analysis server performs the distributed similarity analysis of the power system operating data, it specifically carries out an association analysis between power-system abnormal behaviour and power transmission-and-transformation monitoring data. Before the association analysis, the transmission-and-transformation monitoring data and the abnormal-behaviour data are preprocessed separately. Preprocessing the abnormal-behaviour data involves two steps: 1) select the equipment abnormal-behaviour data of all units equipped with monitoring terminals in the power system operating data, and summarize the failure frequency of each kind of equipment at each monitoring terminal; 2) normalize the summarized data. Existing association analysis considers only the spatial characteristics of the monitoring data and ignores their temporal characteristics. The fault locations of the corresponding equipment are screened out and the monitoring data of their monitoring terminals are obtained; these monitoring data are preprocessed as follows: 1) over the whole monitoring period, compute the pass rate of each transmission-and-transformation index obtained at the monitoring terminal as the pass rate for that location; 2) average the monthly mean values of each transmission-and-transformation index at the monitoring terminal over the whole monitoring period to obtain the mean value of each index for that location; 3) normalize each index value computed above, mapping all data into [0, 1]. Through this preprocessing, both the abnormal-behaviour statistics and the transmission-and-transformation indices are mapped to values in the interval [0, 1].
The correlation coefficients between the variables are then computed to obtain an m × n association matrix A composed of the correlation coefficients, in which the row variables are the power-system abnormal-behaviour statistics, denoted x_i, i = 1, ..., m, the column variables are the transmission-and-transformation monitoring data, denoted y_j, j = 1, ..., n, and ρ_{x_i, y_j} is the correlation coefficient of x_i and y_j (a computation sketch follows).
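A minimal sketch of building the m × n association matrix A with Pearson correlation coefficients (the array layout and the random example data are assumptions): rows are abnormal-behaviour statistics x_i, columns are monitoring indices y_j.

```python
import numpy as np

def association_matrix(X, Y):
    """X: (m, T) abnormal-behaviour series; Y: (n, T) monitoring-index series."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    cov = Xc @ Yc.T                                   # (m, n) unnormalized covariances
    norm = np.outer(np.linalg.norm(Xc, axis=1), np.linalg.norm(Yc, axis=1))
    return cov / norm                                 # A[i, j] = rho(x_i, y_j)

A = association_matrix(np.random.rand(3, 12), np.random.rand(4, 12))
print(A.shape)   # (3, 4)
```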
For the CIM-based structured operating data of the grid, the invention expresses the extraction and conversion of grid structural data as a behaviour-model quadruple N = (P, W, O, M), where P denotes the data set of the grid-system data sources, W denotes the data set of the CIM-based data warehouse, O denotes a set of mutually independent extraction tasks, and M denotes the metadata set for modelling the CIM-based data warehouse. The extraction tasks are O = {O_1, O_2, O_3}: O_1 is the data-cleaning task, which extracts preprocessed data from the power system according to the metadata of the CIM-based data warehouse; O_2 is the data-loading task, which maps the data tables of the interface-file area to the data tables of the transition-file area of the CIM-based data warehouse and performs the related data conversion and loading; O_3 is the integration service, which, according to the CIM-based data-warehouse model, verifies and maps the data in the buffer and integrates the checked data into the CIM-based data warehouse.
Let T be a data-source table of the data-conversion process and T_i its data copy in the CIM-based data-warehouse buffer at time i, with T_i = {D, T}, where D denotes the timestamp. Let I be the data-change copy of T from time i to time i+1; then I = {L_sn, M, T_o, T_n}, where L_sn is the log number of the data change, M is the data-change operation, T_o is the data before the change or before deletion, and T_n is the changed or newly added data. Compared with obtaining T_{i+1} by a SELECT operation directly on table T, this approach has a smaller impact on the performance of the source database. In the CIM-based data-warehouse buffer, to map from T_{i+1} to a fact table S, the true data of T_{i+1} in the period [i, i+1] are obtained first, and the relevant aggregation is then performed according to the metadata definition of the CIM-based data warehouse.
The data-cleaning process further uses similarity analysis to determine the similar sample set - the operating data most strongly associated with the current time - then obtains typical feature sequences by hierarchical clustering, uses the feature sequences as references to identify faulty data in the sequences to be examined, and finally corrects the identified faulty data by migrating the normal data corresponding to the feature sequence into the faulty segment of the sequence under examination. Through clustering, different typical feature sequences are extracted, and with these feature sequences as references, sequences to be examined that may contain faulty data can be identified and corrected.
To retrieve data tuples faster, the invention builds an in-memory index on the relational data-tuple sets and places the most frequently accessed relational tuple sets in a cache, reducing I/O overhead. The frequently accessed relation set R is stored in the cache, and the relation set R together with the real-time copy data D stored in the operable data storage area serves as input. In each iteration, a partition P_i of the relation set R is taken as one probe input. A hash join is performed: all tuples in the cached relation-data area are traversed and looked up in the hash table simultaneously, and whenever a match succeeds, the matched data-stream tuple is output. After a cached relation-data area has been processed, the algorithm reads new tuples from the real-time grid operating-data source, loads them into the hash table, and inserts their identifiers into a queue. To select the next partition of R, the join attribute of the data tuple with the smallest timestamp in the queue is looked up first, and the partition of R containing that join attribute is loaded into the cached relation-data area using the index. In this way, each new partition can be matched against at least one data tuple.
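A much-simplified, single-threaded sketch of this cache-and-probe join (the tuple fields `key` and `ts` and the pre-built `partitions_by_key` index are assumptions, not the patent's structures):

```python
from collections import defaultdict, deque

def hash_join_stream(stream_tuples, partitions_by_key):
    """stream_tuples: dicts with 'key' and 'ts'; partitions_by_key: join key -> cached relation partition."""
    hash_table = defaultdict(list)
    pending = deque()
    results = []
    for t in stream_tuples:
        hash_table[t["key"]].append(t)                 # load the new stream tuple into the hash table
        pending.append(t)                              # and remember it in the queue
    while pending:
        oldest = min(pending, key=lambda x: x["ts"])   # smallest timestamp picks the next partition of R
        pending = deque(x for x in pending if x["key"] != oldest["key"])
        for rel_tuple in partitions_by_key.get(oldest["key"], []):   # cached relation partition
            for match in hash_table.get(rel_tuple["key"], []):        # probe the hash table
                results.append((rel_tuple, match))
    return results
```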
In the data-analysis server's similarity analysis of the grid data model, the relationship between different sequences is judged from the shape of the sequence curves, and temporal-feature correlation factors are selected as the samples for computing the relational degree. The calculation steps are as follows:
(1) let the current time sequence Y = {Y(m) | m = 1, 2, ..., p} be the reference sequence and the historical operating-data sequences X_i = {X_i(m) | m = 1, 2, ..., p}, i = 1, 2, ..., k, be the comparison sequences, where p is the number of sequence elements;
(2) calculate;
(3) calculate the relational coefficient ζ_i(m), where ζ_i(m) is the relational coefficient of Y(m) with respect to X_i(m), Δ_i(m) = |y(m) − x_i(m)|, and ρ is the resolution coefficient with value in (0, 1);
(4) calculate the relational degree.
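The formula images of steps (2)-(4) are not rendered in this text export. For reference only, a standard grey-relational form consistent with the definitions above (and not necessarily the patent's exact expressions) is:

$$
\zeta_i(m)=\frac{\displaystyle\min_i\min_m \Delta_i(m)+\rho\,\max_i\max_m \Delta_i(m)}{\Delta_i(m)+\rho\,\max_i\max_m \Delta_i(m)},
\qquad
r_i=\frac{1}{p}\sum_{m=1}^{p}\zeta_i(m),
$$

where $\Delta_i(m)=|y(m)-x_i(m)|$ and $\rho\in(0,1)$ is the resolution coefficient.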
In the above hierarchical clustering stage, let the data set be X = {x_1, x_2, ..., x_n}, where n is the number of elements in X and each element is a p-dimensional vector; X contains k classes, and the centre of the i-th class is v_i = {v_i1, v_i2, ..., v_ip}. The feature sequences are defined as the cluster centres. The degree of membership of the j-th element of X in the i-th class centre is u_ij; let U = {u_ij} and V = {v_ij}. The membership u_ij is computed from the weighting exponent m and the distance d_ij = ||x_j − v_i|| of the j-th element to the i-th class centre, and the cluster centres v_i are computed accordingly. The clustering iteration seeks the cluster centres and membership matrix that minimize the objective function J. Standard forms of these updates are given below for reference.
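The formula images are likewise not rendered here; the standard fuzzy C-means updates consistent with the symbols above (weighting exponent m, distance d_{ij} = ||x_j − v_i||) are, as an assumed reference form:

$$
u_{ij}=\left[\sum_{c=1}^{k}\left(\frac{d_{ij}}{d_{cj}}\right)^{\frac{2}{m-1}}\right]^{-1},
\qquad
v_i=\frac{\sum_{j=1}^{n}u_{ij}^{\,m}\,x_j}{\sum_{j=1}^{n}u_{ij}^{\,m}},
\qquad
J(U,V)=\sum_{i=1}^{k}\sum_{j=1}^{n}u_{ij}^{\,m}\,d_{ij}^{\,2}.
$$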
The clustering results are analyzed to determine the optimal partition. Suppose a data set containing n sequences is divided into k classes (C_1, C_2, ..., C_k). For the i-th sequence x(i) in C_a, compute a(i), the average distance from x(i) to the other sequences in its class; d(i, C_b) is the average distance from x(i) to all sequences of another class C_b, and b(i) = min{d(i, C_b)}, b = 1, 2, ..., k, a ≠ b. The singularity degree of each sequence i - combining the average distance within its class and the distance of the sample to the other classes - is computed, and the average Dissim value over all samples of the data set is used to evaluate the quality of the clustering; the maximum of this index corresponds to the optimal number of classes.
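The original formula image is not rendered; a silhouette-style definition consistent with a(i) and b(i) as given above would be (an assumption, not necessarily the patent's exact formula):

$$
\mathrm{Dissim}(i)=\frac{b(i)-a(i)}{\max\{a(i),\,b(i)\}},
\qquad
\overline{\mathrm{Dissim}}=\frac{1}{n}\sum_{i=1}^{n}\mathrm{Dissim}(i).
$$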
For faulty-data judgment, suppose the similar sample set obtained by the similarity analysis covers d days in total, with d_n days in the n-th class (n = 1, ..., k) so that the d_n sum to d. The maximum rate of change of the operating data at time t is denoted α_max(t, d_n):
α_max(t, d_n) = max{ [L(d−i, t) − L(d−i, t−1)] / L(d−i, t−1) }, i = 1, ..., d_n,
where L(d, t) is the operating datum at time t on day d.
Let the sequence to be examined be X_d = (x_d1, x_d2, ..., x_dm), where m is the number of samples per day, and let X_t be the feature sequence of maximum membership. At sampling time t, the rate of change of X_d relative to the feature sequence X_t is
δ_t = (x_dt − x_tt) / x_tt.
If δ_t > α_max(t, d_n), the datum is considered faulty. This reduces the workload and improves computation speed and model efficiency.
If points p through q of a sequence X_d are detected as faulty data, let the two feature sequences with the largest membership values be X_t1 and X_t2; the maximum-membership feature sequences are used for the actual correction. The correction formulas are:
X'_d(i) = X'_t1(i) · (u_t1,i / (u_t1,i + u_t2,i)) + X'_t2(i) · (u_t2,i / (u_t1,i + u_t2,i))
X'_t1(i) = X_t1(i) × [X_d(p−1)/X_t1(p−1) + X_d(q+1)/X_t1(q+1)]
X'_t2(i) = X_t2(i) × [X_d(p−1)/X_t2(p−1) + X_d(q+1)/X_t2(q+1)],
where i = p, p+1, ..., q.
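A minimal sketch (list-based sequences and in-place correction are assumptions) of the fault identification and correction described above: points whose change rate against the maximum-membership feature sequence exceeds α_max are flagged, then rewritten from the two closest feature sequences weighted by membership.

```python
def detect_faults(xd, xt, alpha_max):
    """xd: sequence under examination; xt: max-membership feature sequence; alpha_max[t]: per-sample threshold."""
    return [t for t in range(len(xd))
            if (xd[t] - xt[t]) / xt[t] > alpha_max[t]]

def correct_segment(xd, xt1, xt2, u1, u2, p, q):
    """Replace xd[p..q] using the two max-membership feature sequences, scaled to the segment boundaries."""
    s1 = xd[p - 1] / xt1[p - 1] + xd[q + 1] / xt1[q + 1]
    s2 = xd[p - 1] / xt2[p - 1] + xd[q + 1] / xt2[q + 1]
    w1, w2 = u1 / (u1 + u2), u2 / (u1 + u2)
    for i in range(p, q + 1):
        xd[i] = xt1[i] * s1 * w1 + xt2[i] * s2 * w2   # weighted blend of the scaled feature sequences
    return xd
```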
In conclusion the invention proposes a kind of data cleansing conversion method based on CIM, in improved operation of power networks number
Under support according to model and distributed data platform, source data is extracted, cleaned, is integrated, ensures the quality of data and reliable
Property, it realizes the unified standard data output based on database, there is the broad applicability for supporting clustered deploy(ment) and concurrent, it can
It is integrated for electric network data automation and analysis provides reliable support.
Obviously, those skilled in the art will appreciate that the modules and steps of the invention described above can be implemented with a general-purpose computing system: they may be concentrated in a single computing system or distributed over a network formed by multiple computing systems, and they may optionally be implemented as program code executable by a computing system, so that they can be stored in a storage system and executed by the computing system. The invention is therefore not limited to any specific combination of hardware and software.
It should be understood that the above specific embodiments of the invention are used only to exemplify or explain the principles of the invention and do not limit it. Any modification, equivalent replacement or improvement made without departing from the spirit and scope of the invention shall be included in its protection scope. Furthermore, the appended claims are intended to cover all variations and modifications falling within the scope and boundary of the appended claims, or the equivalents of such scope and boundary.
Claims (8)
1. A CIM-based data cleaning and conversion method, characterized by comprising:
capturing the operating data of a power system;
cleaning and converting the captured operating data to obtain data conforming to the CIM unified standard, and storing them in a distributed file system;
extracting data from the distributed file system and constructing a CIM-based distributed data warehouse.
2. The method according to claim 1, characterized in that the operating data include equipment ledger information, operation and maintenance data, fault data, power-flow topology data and GIS device information.
3. The method according to claim 1, characterized by further comprising: storing the model information of the power system and the metadata of the CIM-based distributed data warehouse in MangoDB.
4. The method according to claim 3, characterized in that the CIM-based distributed data warehouse, after decomposing a task via MapReduce, extracts data directly from the distributed file system for analysis, performs unified data management and data access, and implements model-data mapping and performance optimization.
5. The method according to claim 4, characterized in that the model-data mapping includes mapping between the attributes of the power-system business model and the model data of heterogeneous underlying data sources.
6. The method according to claim 1, characterized in that the cleaning and conversion comprise two stages: in the first stage, data are extracted from the data sources into an operable data buffer; in the second stage, data are extracted from the operable data buffer into the CIM-based data warehouse: (1) in the first stage, heterogeneous data sources are extracted into the operable data buffer, establishing in the buffer a copy of the power system's operating data with identical structure and identical content; (2) in the second stage, the data in the operable data buffer are statistically merged and summarized and loaded into the CIM-based data warehouse in incremental-loading mode; the extraction is incremental, and if the increment cannot be determined at extraction time it is computed at load time, a time tag being added when the data are loaded into the CIM-based data warehouse; in the extraction flow from the operable data buffer to the CIM-based data warehouse, data read from the buffer first undergo unified information encoding, and fact-table data and dimension-table data are then processed separately; for changes in fact-table data, different incremental-loading modes are chosen according to the change pattern, namely timestamp-based increments if the data change over time and a full-table comparison if the data change randomly; for changes in dimension-table data, the latest CIM-based data overwrite the offline data.
7. The method according to claim 4, characterized in that extracting data from the distributed file system for analysis further comprises a similarity analysis of the power system operating data.
8. The method according to claim 7, wherein in the similarity analysis of the power system operation data, the relation between different sequences is judged from the shape of the sequence curves, and temporal-characteristic correlation factors are selected as the samples for calculating the degree of association; the specific calculation steps are as follows:
(1) take the current time sequence Y = {Y(m) | m = 1, 2, …, p} as the reference sequence and the historical operation data sequences Xi = {Xi(m) | m = 1, 2, …, p}, i = 1, 2, …, k, as the comparison sequences, where p is the number of sequence elements;
(2) compute the absolute differences Δi(m) = |Y(m) − Xi(m)| together with their two-level minimum min_i min_m Δi(m) and maximum max_i max_m Δi(m);
(3) compute the correlation coefficient ζi(m), i.e. the incidence coefficient of Y(m) with respect to Xi(m):
ζi(m) = [min_i min_m Δi(m) + ρ · max_i max_m Δi(m)] / [Δi(m) + ρ · max_i max_m Δi(m)],
where Δi(m) = |Y(m) − Xi(m)| and ρ is the resolution coefficient, with value interval (0, 1);
(4) compute the degree of association ri = (1/p) · Σ_{m=1..p} ζi(m).
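Claim 3 only states that the power system model information and the warehouse metadata are kept in MongoDB. Below is a minimal sketch of what such a metadata registration could look like with pymongo; the database name, collection name, and document fields are illustrative assumptions and are not specified by the patent.

```python
# Minimal sketch (not from the patent): registering CIM model information and
# warehouse metadata in MongoDB. Database, collection and field names are assumptions.
from datetime import datetime, timezone

from pymongo import MongoClient


def register_cim_table(mongo_uri: str, table_name: str, cim_class: str,
                       dfs_path: str, columns: list) -> None:
    """Record which CIM class a warehouse table maps to and where it lives in the DFS."""
    client = MongoClient(mongo_uri)
    metadata_db = client["cim_metadata"]            # assumed database name
    metadata_db["warehouse_tables"].update_one(     # assumed collection name
        {"table_name": table_name},
        {"$set": {
            "cim_class": cim_class,                 # e.g. the CIM class "PowerTransformer"
            "dfs_path": dfs_path,                   # location of the table data in the distributed file system
            "columns": columns,
            "updated_at": datetime.now(timezone.utc),
        }},
        upsert=True,
    )


if __name__ == "__main__":
    # Assumes a MongoDB instance is reachable at the given URI.
    register_cim_table(
        "mongodb://localhost:27017",
        table_name="ods_transformer",
        cim_class="PowerTransformer",
        dfs_path="/warehouse/cim/power_transformer",
        columns=["mRID", "name", "ratedS", "ratedU"],
    )
```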
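Claim 6 selects an incremental-extraction mode per table: timestamp increments for fact data that changes over time, a full-table comparison for fact data that changes randomly, latest-data overwrite for dimension tables, and a load-time tag when the increment is only known at load. The sketch below illustrates that selection logic over plain Python dictionaries; the record layout and the field names (`id`, `updated_at`, `load_time`) are assumptions, not the patented implementation.

```python
# Hedged sketch of the incremental-extraction choices in claim 6.
# Records are plain dicts keyed by a primary key; field names are assumptions.
from datetime import datetime


def timestamp_increment(buffer_rows: list, last_load_time: datetime) -> list:
    """Fact data that changes over time: take rows newer than the last load."""
    return [r for r in buffer_rows if r["updated_at"] > last_load_time]


def full_table_increment(buffer_rows: list, warehouse_rows: list, key: str = "id") -> list:
    """Fact data that changes randomly: compare full tables to find new or changed rows."""
    existing = {r[key]: r for r in warehouse_rows}
    return [r for r in buffer_rows
            if r[key] not in existing or existing[r[key]] != r]


def overwrite_dimension(warehouse_dim: dict, buffer_dim_rows: list, key: str = "id") -> dict:
    """Dimension data: overwrite offline rows with the latest CIM-based rows."""
    warehouse_dim = dict(warehouse_dim)
    for row in buffer_dim_rows:
        warehouse_dim[row[key]] = row
    return warehouse_dim


def load_with_time_tag(rows: list, load_time: datetime) -> list:
    """When the increment is computed at load time, tag each row with the load time."""
    return [{**r, "load_time": load_time} for r in rows]
```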
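The steps of claim 8 follow the standard grey relational (grey correlation) degree calculation, which matches the Δi(m) and ρ definitions given in the claim. A worked sketch, assuming that standard formula and a small made-up data set, is shown below; it is an illustration, not the patented code.

```python
# Hedged sketch of the grey relational degree in claim 8: compare a reference
# sequence Y against comparison sequences X_i with resolution coefficient rho.
def grey_relational_degree(y: list, xs: list, rho: float = 0.5) -> list:
    p = len(y)
    # Step (2): absolute differences and their two-level minimum / maximum.
    deltas = [[abs(y[m] - x[m]) for m in range(p)] for x in xs]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)

    degrees = []
    for row in deltas:
        # Step (3): relational coefficient zeta_i(m) at every point m.
        zetas = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        # Step (4): degree of association r_i = mean of the coefficients.
        degrees.append(sum(zetas) / p)
    return degrees


if __name__ == "__main__":
    y = [1.0, 1.2, 1.1, 1.3]                      # reference sequence Y (made-up values)
    xs = [[1.0, 1.1, 1.0, 1.2], [2.0, 0.5, 3.0, 0.1]]
    print(grey_relational_degree(y, xs))          # the first sequence scores higher
```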
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810887270.7A CN109213752A (en) | 2018-08-06 | 2018-08-06 | A kind of data cleansing conversion method based on CIM |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109213752A true CN109213752A (en) | 2019-01-15 |
Family
ID=64987594
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810887270.7A Pending CN109213752A (en) | 2018-08-06 | 2018-08-06 | A kind of data cleansing conversion method based on CIM |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109213752A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101764835A (en) * | 2008-12-25 | 2010-06-30 | 华为技术有限公司 | Task allocation method and device based on MapReduce programming framework |
CN104636204A (en) * | 2014-12-04 | 2015-05-20 | 中国联合网络通信集团有限公司 | Task scheduling method and device |
US9906604B2 (en) * | 2015-03-09 | 2018-02-27 | Dell Products L.P. | System and method for dynamic discovery of web services for a management console |
CN105138405A (en) * | 2015-08-06 | 2015-12-09 | 湖南大学 | To-be-released resource list based MapReduce task speculation execution method and apparatus |
CN106528880A (en) * | 2016-12-14 | 2017-03-22 | 云南电网有限责任公司电力科学研究院 | Normalizing method and system for data structure format of multi-source power service data |
CN107451622A (en) * | 2017-08-18 | 2017-12-08 | 长安大学 | A kind of tunnel operation state division methods based on big data cluster analysis |
CN107766541A (en) * | 2017-10-30 | 2018-03-06 | 北京国电通网络技术有限公司 | With electricity consumption overall situation full dose data transfer and storage method, device, electronic equipment |
CN107798139A (en) * | 2017-11-23 | 2018-03-13 | 国网上海市电力公司 | A kind of master/slave data isomery method based on CIM/XML |
Non-Patent Citations (6)
Title |
---|
叶彬; 曾伟民; 肖治华; 郭创新; 朱乘治; 曹一家: "Application of Data Warehouses in Power Systems", Journal of Electric Power Systems and Automation (电力系统及其自动化学报) *
周秀文: "Research and Application of Grey Relational Degree", China Masters' Theses Full-text Database, Basic Sciences (中国优秀硕士论文全文数据库 基础科学辑) *
尚博祥; 王扬; 孙轶凡: "Application of the Common Information Model (CIM) in Smart Grid Informatization", CSEE 2012 Power Industry Informatization Annual Conference (中国电机工程学会2012电力行业信息化年会) *
赵林; 张令涛; 马仲佳: "Grid Model Management and Analysis Architecture on the Dispatching Side Based on Big Data Technology", Power System Technology (电网技术) *
钟庆; 陈伟坤; 许中: "Correlation Analysis of Equipment Fault Statistics and Power Quality Monitoring Data", Power Capacitor & Reactive Power Compensation (电力电容器与无功补偿) *
陈盛荣, 刘广钟: "Research on Optimization Strategies for ETL Systems in a Distributed Environment", Modern Computer (Professional Edition) (现代计算机(专业版)) *
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109918367A (en) * | 2019-03-19 | 2019-06-21 | 北京百度网讯科技有限公司 | A kind of cleaning method of structural data, device, electronic equipment and storage medium |
CN111177126A (en) * | 2019-08-01 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Information processing method, device and equipment |
CN111177126B (en) * | 2019-08-01 | 2024-05-14 | 腾讯科技(深圳)有限公司 | Information processing method, device and equipment |
CN110968627A (en) * | 2019-11-11 | 2020-04-07 | 南京峰凯云歌数据科技有限公司 | Big data analysis method and system |
CN111177128B (en) * | 2019-12-11 | 2023-10-27 | 国网天津市电力公司电力科学研究院 | Metering big data batch processing method and system based on improved outlier detection algorithm |
CN111177128A (en) * | 2019-12-11 | 2020-05-19 | 国网天津市电力公司电力科学研究院 | Batch processing method and system for big metering data based on improved outlier detection algorithm |
CN111506640A (en) * | 2020-04-21 | 2020-08-07 | 北京中电普华信息技术有限公司 | Mapping method and device |
CN112650744A (en) * | 2020-12-31 | 2021-04-13 | 广州晟能软件科技有限公司 | Data management method for preventing secondary pollution of data |
CN112650744B (en) * | 2020-12-31 | 2024-04-30 | 广州晟能软件科技有限公司 | Data treatment method for preventing secondary pollution of data |
CN112948203A (en) * | 2021-02-03 | 2021-06-11 | 刘靖宇 | Elevator intelligent inspection method based on big data |
CN112948203B (en) * | 2021-02-03 | 2023-04-07 | 刘靖宇 | Elevator intelligent inspection method based on big data |
CN113742086A (en) * | 2021-09-17 | 2021-12-03 | 中环曼普科技(南京)有限公司 | Distributed parallel analysis type data cluster management method and system |
CN116821223A (en) * | 2023-08-25 | 2023-09-29 | 云南三耳科技有限公司 | Industrial visual control platform and method based on digital twinning |
CN116821223B (en) * | 2023-08-25 | 2023-11-24 | 云南三耳科技有限公司 | Industrial visual control platform and method based on digital twinning |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109213752A (en) | A kind of data cleansing conversion method based on CIM | |
CN105069703B (en) | A kind of electrical network mass data management method | |
CN103488673B (en) | For performing the method for reconciliation process, controller and data-storage system | |
US10599684B2 (en) | Data relationships storage platform | |
CN104881424B (en) | A kind of acquisition of electric power big data, storage and analysis method based on regular expression | |
CN109120461B (en) | A kind of service feature end-to-end monitoring method, system and device | |
CN109308290A (en) | A kind of efficient data cleaning conversion method based on CIM | |
CN106815338A (en) | A kind of real-time storage of big data, treatment and inquiry system | |
CN107315776A (en) | A kind of data management system based on cloud computing | |
CN104809244B (en) | Data digging method and device under a kind of big data environment | |
CN112181960A (en) | Intelligent operation and maintenance framework system based on AIOps | |
CN113254630B (en) | Domain knowledge map recommendation method for global comprehensive observation results | |
CN106570145B (en) | Distributed database result caching method based on hierarchical mapping | |
US20190050435A1 (en) | Object data association index system and methods for the construction and applications thereof | |
CN107103064A (en) | Data statistical approach and device | |
CN112148578A (en) | IT fault defect prediction method based on machine learning | |
CN116049454A (en) | Intelligent searching method and system based on multi-source heterogeneous data | |
CN103995828B (en) | A kind of cloud storage daily record data analysis method | |
CN109460393B (en) | Big data-based visual system for pre-inspection and pre-repair | |
US11182386B2 (en) | Offloading statistics collection | |
CN114356712B (en) | Data processing method, apparatus, device, readable storage medium, and program product | |
Theeten et al. | Chive: Bandwidth optimized continuous querying in distributed clouds | |
CN103530369A (en) | De-weight method and system | |
CN116680090B (en) | Edge computing network management method and platform based on big data | |
CN108363761A (en) | Hadoop awr automatic loads analyze information bank, analysis method and storage medium |
Legal Events
Code | Title | Description
---|---|---
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication |

Application publication date: 20190115