CN105303456A - Method for processing monitoring data of electric power transmission equipment - Google Patents

Method for processing monitoring data of electric power transmission equipment

Info

Publication number
CN105303456A
Authority
CN
China
Prior art keywords
data
hash
time
backup
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510674398.1A
Other languages
Chinese (zh)
Inventor
耿利
许海霞
苗泽玮
赵娜
陈迪
刘泉
李晨
李振宇
胡青学
张子建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guo Wang Shandong Ningyang Power Supply Co
State Grid Corp of China SGCC
TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
Guo Wang Shandong Ningyang Power Supply Co
State Grid Corp of China SGCC
TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guo Wang Shandong Ningyang Power Supply Co, State Grid Corp of China SGCC, TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority to CN201510674398.1A
Publication of CN105303456A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for processing monitoring data of electric power transmission equipment, which comprises the steps of carrying out consistent hash storage of multiple backups according to association and time and space attributes of the monitoring data, and carrying out combined retrieval and parallel retrieval and feature analysis on multiple monitoring data sources by using a parallel computation framework. According to the method provided by the invention for processing the monitoring data of the electric power transmission equipment, efficient and reliable storage is carried out on the monitoring data based on cloud computing technologies, and rapid access and analysis are realized.

Description

Method for processing monitoring data of power transmission equipment
Technical field
The present invention relates to power grid data processing, and in particular to a method for processing monitoring data of power transmission equipment.
Background art
As power grids grow rapidly in scale and their structure becomes increasingly complex, electric power enterprises have stepped up the promotion and application of power transmission equipment monitoring, and the volume of data acquired and transmitted is growing geometrically. These data include not only the signals produced when equipment behaves abnormally and the status information of equipment in operation, but also a large amount of related data, such as geographic information, weather, on-site temperature and humidity, inspection video, images and related documents, which together make up the power transmission equipment monitoring data. A large number of monitoring nodes continuously send collected data to the data platform, forming a massive heterogeneous data stream. The data platform must not only store these data reliably but also process and analyze them in a timely manner. Although the prior art processes massive monitoring data on cloud computing platforms, power transmission equipment monitoring differs greatly from Internet cloud computing applications in data storage, communication and computation. How to store such data efficiently and reliably, and how to access and analyze them quickly, is a problem that urgently needs to be solved.
Summary of the invention
To solve the problems of the prior art described above, the present invention proposes a method for processing monitoring data of power transmission equipment, comprising:
Performing multi-backup consistent-hash storage according to the association and the time and space attributes of the monitoring data, and using a parallel computing framework to perform joint retrieval, parallel retrieval and feature analysis on multiple monitoring data sources.
Preferably, performing multi-backup consistent-hash storage according to the association and the time and space attributes of the monitoring data further comprises:
Obtaining the time and space characteristics of the data collected by each monitoring device, namely the acquisition time and acquisition location of the data, and using these together with a user-defined correlation coefficient as the keywords for data retrieval and analysis; storing the data in the cloud platform as 3 backups; using consistent hashing, hash-mapping the 1st backup of the data by monitoring device number, the 2nd backup by acquisition time, and the 3rd backup by the user-defined correlation coefficient, where the correlation coefficient is a specific attribute of the monitoring data whose value is assigned according to the upper-layer application; the consistent-hash storage further comprises the following steps:
1) predefining the correlation coefficient of the monitoring data and the number of redundant backups in a configuration file;
2) computing the hash value of each storage node in the cloud platform and placing it on a circular hash queue established in advance;
3) computing the hash values of the data from the time and space attributes and the correlation coefficient of the monitoring data: for the 1st of the multiple backups held in the cloud platform, computing a first hash value from the source of the data, i.e. the monitoring device number, and mapping it onto the circular hash queue; for the 2nd backup, computing a second hash value from the time attribute of the monitoring data, i.e. the acquisition time, and mapping it onto the circular hash queue; for the 3rd backup, computing a third hash value from the correlation coefficient of the data and mapping it onto the circular hash queue; if the cloud platform is configured with more than 3 backups, computing their hash values by cycling through the methods of the first to third backups and mapping them onto the circular hash queue in turn;
4) determining the storage location of the data from the data hash value and the storage node hash values, by mapping the data clockwise onto the nearest storage node;
5) if the target storage node has insufficient space, skipping that node and looking for the next storage node;
In addition, when data are read, the NameNode sorts the storage nodes by their distance from the client and returns the sorted list to the client, so that the data are read from the nearest node, where the distance between two nodes is defined as the number of nodes passed through in going from one node to the other.
Preferably, performing joint retrieval on multiple monitoring data sources further comprises:
Retrieving according to the following conditions: device attribute data, i.e. name, operating time, installation location and body parameters; monitoring data, i.e. conductor temperature, current-carrying capacity and tension; environmental data, i.e. ambient temperature, humidity and air pressure; and geographic information data, i.e. altitude, longitude and latitude; joining the different data sources, which come from multiple files; the monitoring device uniformly collects and uploads insulated terminal leakage current, conductor tension, conductor current, conductor temperature and microclimate data, and raises the relevant alarm when an insulated terminal is abnormal, a terminal overheats, or an imbalance occurs; when monitoring leakage current, 3 data files, namely the device attribute data file, the insulated terminal leakage current data file and the environmental data file, are used for retrieval to produce the monitoring data of a monitoring device over a predetermined time, and the 3 data files are joined to perform the joint retrieval;
After the power transmission equipment monitoring data have been stored, the data are retrieved with a parallel query method executed on the map side: the filtering and joining of the data are completed in the map stage so that the reduce stage is avoided. Retrieval comprises the following steps:
1) filtering the data according to the retrieval conditions given by the user, removing data that do not satisfy them;
2) setting the primary key according to the retrieval requirement; the primary key is the monitoring device number, the time data or the correlation coefficient;
3) marking every record of each data source, using the data file name as the tag;
4) partitioning records with the same attribute value into one group according to the primary key, and joining them;
In the map process of the joint retrieval, filtering, tag setting, grouping and sorting, and joining are performed on the local node, and the result of the joint retrieval is then output to the distributed file system;
Further, performing parallel retrieval and feature analysis on multiple monitoring data sources further comprises:
Based on the dynamic interrelationships of multichannel time series, performing integrated feature extraction on synchronously acquired multichannel signal data: the data are first uploaded to the distributed file system, which splits them into blocks and distributes the blocks randomly over multiple storage nodes; the dynamic interrelationships of the multichannel time series are computed in the reduce stage, and the results are output to the distributed file system for storage; using the temporal association of the data, the acquisition time is used as the key for computing the hash storage location; the feature extraction process further comprises:
1) filtering the data according to the time of the computation task, removing data that do not meet the time condition; 2) marking every record, using the time data as the primary key; 3) partitioning records with the same attribute value into one group according to the primary key, calling the multivariate sample entropy computation, and outputting the result to the distributed file system.
Compared with the prior art, the present invention has the following advantages:
The present invention proposes a method for processing monitoring data of power transmission equipment that stores the monitoring data efficiently and reliably based on cloud computing technology and enables fast access and analysis.
Embodiment
A detailed description of one or more embodiments of the present invention is provided below. The invention is described in connection with such embodiments, but is not limited to any embodiment. The scope of the invention is defined only by the claims, and the invention encompasses many alternatives, modifications and equivalents. Many specific details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of them.
The present invention studies the storage and parallel analysis of power transmission equipment monitoring data on a cloud platform. Taking into account the association of the data and their time and space attributes, it proposes an association-aware multi-backup consistent-hash storage method, and optimizes the data partitioning strategy and the network topology planning of the cloud platform. On this basis, parallel retrieval across data sources and parallel multichannel integrated feature extraction are implemented on a parallel framework.
From the perspective of the upper-layer applications of the power transmission equipment monitoring data platform, data distribution is mainly affected by the following factors: 1) data need to be distributed evenly across the nodes of the cloud platform to keep the load as balanced as possible; 2) node failure is treated as the norm in a cloud platform, so node failures must be considered when optimizing data distribution; 3) to ensure data reliability and retrieval efficiency, a multi-backup scheme is needed; 4) in a cloud platform environment, network transmission and disk I/O are the key factors affecting overall performance, so reducing the volume of data communicated effectively reduces data processing time. Take the association retrieval commonly used in monitoring systems as an example. When a parallel association retrieval is executed with the standard cloud platform data placement scheme, which ignores data association, the join has to be completed in the reduce stage. In the map stage, all data are grouped and sorted on multiple nodes, after which the nodes running reduce tasks download the data by remote access. In this process a large amount of data irrelevant to the final join may also be copied and transmitted over the network. If, at upload time, the data of the same device are stored on the same node as far as possible according to the device attribute of the data, the join can be completed in the map stage, which saves the data communication of the reduce stage and improves overall execution efficiency.
Based on the above analysis, the data layout of the cloud platform is optimized with the following storage method: related data are stored together, and during retrieval and analysis the main work is performed on the map side, which reduces the network communication load of the intermediate map-to-reduce step and thereby improves overall retrieval and analysis performance. Different types of monitoring data may have different data types and formats, but they all share time and space characteristics: every item collected by a monitoring device corresponds to a specific acquisition time and a specific acquisition location. These are the most frequently used keywords in data retrieval and analysis. Because the cloud platform stores data as 3 backups by default, the method considers association in 3 respects: monitoring device location, data acquisition time, and user-defined association. Using consistent hashing, the 1st backup of the data is hash-mapped by monitoring device number, the 2nd backup by acquisition time, and the 3rd backup by the user-defined correlation coefficient, to satisfy different retrieval and analysis requirements. The correlation coefficient can be treated as an attribute of the monitoring data and is assigned by the upper-layer application to implement user-defined association. The method requires a circular hash queue to be built. The procedure is as follows:
1) The correlation coefficient of the monitoring data and the number of redundant backups are predefined in a configuration file; the number of redundant backups is set to 3 here;
2) The hash value of each storage node in the cloud platform is computed and placed on the circular hash queue;
3) The hash values of the data are computed from the time and space attributes and the correlation coefficient of the monitoring data. Multiple backups of the data exist in the cloud platform. For the 1st backup, hash value 1 is computed from the source of the data, i.e. the monitoring device number, and mapped onto the circular hash queue. For the 2nd backup, hash value 2 is computed from the time attribute of the monitoring data, i.e. the acquisition time, and mapped onto the circular hash queue. For the 3rd backup, hash value 3 is computed from the correlation coefficient of the data and mapped onto the circular hash queue. If higher storage reliability is required and more than 3 backups are configured, hash value i of each additional backup is computed by cycling through the 3 methods above and mapped onto the circular hash queue in turn;
4) The storage location of the data is determined from the data hash value and the storage node hash values: the data are mapped clockwise onto the nearest storage node;
5) If the target storage node is in an abnormal state such as insufficient space, that node is skipped and the next storage node is sought.
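The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the ring size, the use of MD5, the class name `HashRing`, and the record field names `device_id`, `time` and `correlation` are all assumptions made for the example.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # assumed size of the circular hash space


def ring_hash(key: str) -> int:
    """Map a string onto the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE


class HashRing:
    def __init__(self, nodes):
        # Step 2: hash every storage node onto the circular queue.
        self._points = sorted((ring_hash(n), n) for n in nodes)
        self._full = set()  # nodes reported as out of space

    def mark_full(self, node):
        self._full.add(node)

    def locate(self, key: str) -> str:
        # Step 4: walk clockwise from the key's hash to the nearest node.
        hashes = [h for h, _ in self._points]
        start = bisect.bisect_left(hashes, ring_hash(key))
        for step in range(len(self._points)):
            _, node = self._points[(start + step) % len(self._points)]
            if node not in self._full:  # Step 5: skip nodes without space
                return node
        raise RuntimeError("no storage node has free space")


def place_backups(ring, record, replicas=3):
    # Step 3: backup i is keyed by device number, acquisition time or the
    # correlation coefficient, cycling through the three when replicas > 3.
    keys = [record["device_id"], record["time"], record["correlation"]]
    return [ring.locate(str(keys[i % 3])) for i in range(replicas)]
```

With this layout, the first backup of every record from the same device lands on the same node, which is what later allows joins by device number to run locally. The sketch does not prevent two backups of one record from mapping to the same node; the source text does not say how that case is handled.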
When data are read, the NameNode sorts the storage nodes by their distance from the client and returns the sorted list to the client, so that the data are read from the nearest node. In the cloud platform the network nodes form a tree, in which the root of each subtree is normally a switch connecting computers, and the distance between two nodes is defined as the number of nodes passed through in going from one node to the other.
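As a rough illustration of this ordering, the sketch below represents each node by a hypothetical path such as `/switch1/host3` and counts the hops needed to climb to the closest common ancestor and descend again. Counting hops this way is one reading of "number of nodes passed through"; the path format and function names are assumptions.

```python
def path_parts(location: str):
    """Split a location such as '/switch1/host3' into its components."""
    return [p for p in location.strip("/").split("/") if p]


def distance(a: str, b: str) -> int:
    """Hops from a up to the closest common ancestor and down to b."""
    pa, pb = path_parts(a), path_parts(b)
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def sort_replicas(client: str, replicas):
    """Order candidate storage nodes from nearest to farthest."""
    return sorted(replicas, key=lambda node: distance(client, node))
```

For example, `sort_replicas("/sw1/h1", ["/sw2/h9", "/sw1/h2", "/sw1/h1"])` puts the local copy first, then the copy under the same switch, then the copy under a different switch.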
By default the cloud platform assumes that all nodes are in a single rack. The network topology of the cloud platform nodes must therefore be passed to the platform according to the actual deployment, so that the cloud platform scheduler can choose appropriate storage nodes for reading and writing data. The network topology can be passed to the cloud platform in the form of a script.
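If the cloud platform is Hadoop (the storage layer described later is built on a Hadoop cluster), one way to supply the topology is a rack-awareness script named by the `net.topology.script.file.name` property: Hadoop invokes the script with one or more host names or IP addresses as arguments and reads one rack path per argument from standard output. The sketch below assumes that setup; the addresses and rack names are hypothetical.

```python
#!/usr/bin/env python3
import sys

# Hypothetical mapping from node address to rack path.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}
DEFAULT_RACK = "/default-rack"  # used for any address not listed above


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hosts arrive as command-line arguments; racks go to standard output.
    print(" ".join(resolve(sys.argv[1:])))
```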
Power transmission equipment monitoring requires joint retrieval over many kinds of online-monitored equipment and line parameters, by conditions such as monitoring device number and acquisition time. Joint retrieval involves data sources such as device attribute data (name, operating time, installation location, etc.), body parameters, monitoring data (conductor temperature, current-carrying capacity, tension, etc.), environmental data (ambient temperature and humidity, air pressure, etc.) and geographic information data (altitude, longitude and latitude, etc.), which requires joining different data sources. Multi-source data usually come from different files. The monitoring device uniformly collects and uploads data such as insulated terminal leakage current, conductor tension, conductor current, conductor temperature and microclimate. It can raise the relevant alarm when an insulated terminal is abnormal, a terminal overheats, or an imbalance occurs. Taking leakage current retrieval as an example, the retrieval involves 3 data files: the device attribute data file, the insulated terminal leakage current data file, and the environmental data file. The retrieval has to produce the monitoring data of a monitoring device over a period of time, i.e. a list of monitoring data with device information and environmental information, which requires joining the 3 data files to obtain a list that meets the requirement.
After the power transmission equipment monitoring data have been stored, they are retrieved with a parallel query method executed on the map side. The method completes the filtering and joining of the data in the map stage and avoids the reduce stage, thereby saving network transmission overhead. Its precondition is that the data have been distributed with the association-based multi-backup consistent-hash method described above, so that the data needed for a join are gathered on the same storage node. The retrieval procedure is as follows:
1) Filter the data according to the retrieval conditions given by the user, removing data that do not satisfy them;
2) Set the primary key according to the retrieval requirement; the primary key can be the monitoring device number, the time data or the correlation coefficient;
3) Mark every record of each data source; the data file name can be used as the tag;
4) Partition records with the same attribute value into one group according to the primary key, and join them.
After optimized distribution, the filtering, tag setting, grouping and sorting, and joining operations of the map process of the joint retrieval are performed on the local node, and the result of the joint retrieval is output to the distributed file system.
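The four retrieval steps amount to a join carried out entirely on data that is already local to one node. A minimal in-memory sketch follows, assuming each source file has been parsed into a list of dictionaries; the function name, the field names and the inner-join behaviour (a key must appear in every source) are illustrative choices, not taken from the patent.

```python
from collections import defaultdict
from itertools import product


def map_side_join(sources, key_field, keep):
    """Join local records from several files on a shared primary key.

    sources:   dict mapping file name -> list of record dicts
    key_field: name of the primary-key field (step 2)
    keep:      predicate (file_name, record) -> bool (step 1)
    """
    groups = defaultdict(lambda: defaultdict(list))
    for file_name, records in sources.items():
        for rec in records:
            if keep(file_name, rec):  # step 1: filter
                # step 3: tag by file name; step 4: group by primary key
                groups[rec[key_field]][file_name].append(rec)

    joined = []
    for key in sorted(groups):
        tagged = groups[key]
        if len(tagged) < len(sources):
            continue  # key missing from some source: nothing to join
        for combo in product(*(tagged[name] for name in sources)):
            row = {}
            for rec in combo:
                row.update(rec)
            joined.append(row)
    return joined
```

Called with a device attribute file, a leakage current file and an environment file keyed by device number, the function returns one merged row per leakage reading, carrying the device and environment fields alongside it.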
As multi-sensor measurement technology becomes widely used in the monitoring of electrical equipment, synchronously monitored multichannel data sequences are collected and stored. The dynamic interrelationships within and between these synchronous multichannel sequences contain rich feature information and reflect the operating state of power equipment more fully. Based on the dynamic interrelationships of multichannel time series, the present invention performs integrated feature extraction on the vibration signal data of vibration monitoring equipment acquired synchronously over 6 channels. On the cloud platform, a parallel feature extraction method is designed on top of the consistent-hash method to speed up feature extraction.
The 6 channels of synchronously acquired vibration monitoring signals are stored separately in 6 files; the signals are stored in segments, and each segment carries time data. To analyze the signals in parallel, the data are first uploaded to the distributed file system, which splits them into blocks and distributes the blocks randomly over multiple storage nodes. When data association is not taken into account, a parallel evaluation of the data relationships can only filter the data on the map side and send each signal segment over the network to the reduce side for computation. Each channel file is split into multiple segments stored across multiple storage nodes. The dynamic interrelationships of the multichannel time series are computed in the reduce stage, and the results are output to the distributed file system for storage. Applying the optimized data distribution method described above, the synchronously acquired multichannel data are redistributed: using the temporal association of the data, the acquisition time is used as the key for computing the hash storage location.
The optimized distribution gathers synchronous data together, so the computation can be completed within the map task.
The feature extraction procedure based on the consistent-hash algorithm is as follows:
1) Filter the data according to the time of the computation task, removing data that do not meet the time condition;
2) Mark every record, using the time data as the primary key;
3) Partition records with the same attribute value into one group according to the primary key and call the multivariate sample entropy computation. The result is output to the distributed file system.
The multivariate sample entropy is computed as follows:
1) Let the original p-dimensional (p-channel) time series be $\{x_{k,i}\}_{i=1}^{N}$, $k = 1, 2, \ldots, p$, where each dimension has N points. For a scale factor $\beta$ given in advance, build the multivariate time series $\{y_{k,j}^{\beta}\}$, namely $y_{k,j}^{\beta} = \frac{1}{\beta} \sum_{i=(j-1)\beta+1}^{j\beta} x_{k,i}$, $k = 1, 2, \ldots, p$, where $1 \le j \le N/\beta$.
2) Preset the p-dimensional embedding vector $M = [m_1, m_2, \ldots, m_p]$ and the p-dimensional time delay vector $T = [T_1, T_2, \ldots, T_p]$, and use the multivariate time series $\{y_{k,j}^{\beta}\}$ to build $(N-n)$ composite delay vectors $Y_m(i)$ (the defining expression is missing from the source text).
3) Define the distance between $Y_m(i)$ and $Y_m(j)$ as $d[Y_m(i), Y_m(j)] = \max_{l=1,\ldots,m} \{ |x(i+l-1) - x(j+l-1)| \}$.
4) For a given threshold r, count for each i the events $P_i$: $d[Y_m(i), Y_m(j)] < r$ ($j \ne i$), and compute their probability of occurrence $B_i^m(r) = P_i / (N-n-1)$, which expresses the degree of association between all $Y_m(j)$ and $Y_m(i)$.
5) Average $B_i^m(r)$ over all i, that is: $B^m(r) = \frac{1}{N-n} \sum_{i=1}^{N-n} B_i^m(r)$.
6) Extend m in step 2) to m+1 and repeat steps 3) to 5) to obtain $B^{m+1}(r)$.
7) Compute the multivariate sample entropy as $MSE(M, T, r, N) = -\ln\left( \frac{B^{m+1}(r)}{B^{m}(r)} \right)$.
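Steps 1) and 7) are simple enough to show directly. The sketch below covers only the coarse-graining of each channel by the scale factor and the final logarithm of the ratio of match rates; building the composite delay vectors and counting matches (steps 2 to 6) is left out. Function names are hypothetical, and dropping a trailing incomplete window is an assumption.

```python
import math


def coarse_grain(channels, beta):
    """Step 1: average each channel over non-overlapping windows of length beta.

    channels: list of p lists, each holding the N samples of one channel.
    A trailing window shorter than beta is dropped (an assumption).
    """
    if beta < 1:
        raise ValueError("beta must be a positive integer")
    grained = []
    for samples in channels:
        windows = len(samples) // beta
        grained.append(
            [sum(samples[j * beta:(j + 1) * beta]) / beta for j in range(windows)]
        )
    return grained


def entropy_from_match_rates(b_m, b_m_plus_1):
    """Step 7: MSE = -ln(B^{m+1}(r) / B^m(r))."""
    return -math.log(b_m_plus_1 / b_m)
```

For two channels `[1, 2, 3, 4, 5, 6]` and `[2, 2, 4, 4, 6, 6]` with a scale factor of 2, `coarse_grain` yields `[1.5, 3.5, 5.5]` and `[2.0, 4.0, 6.0]`.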
Preferably, the cloud platform of the present invention uses a merging strategy for small files: a number of small files are merged to produce a new storage file, and in general small files with the same attribute are merged. The index file is updated when the new storage file is written to the system. The index in the cloud platform has two levels: the master index is the resource collection to which a file belongs, such as its type, and the secondary index is the specific resource entry. When a file needs to be read, the master index and then the secondary index are queried in turn, which narrows the query range and ensures a fast read response. The core of the storage layer design of the cloud platform is: first merge small files into storage files, then build a two-level index over the merged files based on the storage characteristics of the database, and improve the response speed of file reads by index prefetching. The storage layer is described in detail below.
Files are divided into blocks, and the default block size is 64 MB. The namespace of the distributed file system is persisted in an image file and is loaded into memory by the NameNode at startup. A large number of small files can exhaust NameNode memory, and the resulting oversized image file reduces lookup efficiency when files are read. Each read or write of a file first queries the namespace to locate information such as the file's block addresses and size, and then looks the data up in the DataNode space. When the file being read is very small, most of the time in the read/write process is spent on lookup rather than on transferring file data, which lowers the processing efficiency of the server cluster.
The cloud platform merges small files to produce storage files. A filter first screens files by type and size, selecting document files on which full-text retrieval can be performed; the file size threshold is set to 10 MB here, and a file larger than 10 MB is treated as a large file that does not need merging. After filtering, the cloud platform merges the filtered small files into file blocks, using the resource collection to which each file entry belongs as the unit. A resource collection is a set of resource entries with some correlation, and a resource entry belongs to only one resource collection. Collections are usually divided by attribute range, time and so on, and files can be divided by attribute domain. Because the resource entries in a new file block are strongly related, later data processing can assign a file block to a single MapReduce task, which avoids wasting time on task assignment and switching when the computation per task is too small, and reduces data movement within the cluster.
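The filter-then-group step can be sketched as follows. The 10 MB threshold comes from the text; the set of searchable file types and the tuple layout of the input are assumptions made for the example.

```python
from collections import defaultdict

SMALL_FILE_LIMIT = 10 * 1024 * 1024      # 10 MB threshold from the text
SEARCHABLE_TYPES = {"txt", "doc", "pdf"}  # hypothetical full-text-searchable types


def plan_merges(files):
    """Group mergeable small files by the resource collection they belong to.

    files: iterable of (name, size_bytes, file_type, collection) tuples.
    Returns a dict mapping collection -> list of file names to merge.
    """
    plan = defaultdict(list)
    for name, size, file_type, collection in files:
        if size > SMALL_FILE_LIMIT:
            continue  # large file: stored on its own, no merge needed
        if file_type not in SEARCHABLE_TYPES:
            continue  # not a document type selected by the filter
        plan[collection].append(name)
    return dict(plan)
```

Each list in the returned plan corresponds to one merged file block, so all entries of a collection can later be handed to a single processing task.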
NameNode memory is the performance bottleneck of the whole file system, because all file metadata must be held in its memory. Merging small files reduces the number of files and saves a great deal of memory, but reading a file out of a merged file can be very inefficient. The preferred embodiment of the invention therefore builds the small-file metadata index as a hierarchical index, dividing a large index file into small index files according to reasonable rules. The resource collection serves as the master index, and the resource entries under each resource collection serve as the secondary index; a lookup first searches by the collection containing the resource entry and then searches the corresponding secondary index file. Although this adds a lookup in the master index, the number of resource collections cannot be very large, so that lookup takes very little time, and the divided secondary index files are much smaller than a global index file, so overall lookup efficiency improves. In addition, the secondary index files are not all loaded into memory; they are scheduled flexibly according to memory usage combined with a caching strategy, which addresses the memory shortage.
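A minimal in-memory sketch of the two-level lookup: the master index maps a resource collection to its merged file, and the secondary index maps each resource entry to an offset and length inside that file. Class and function names are hypothetical, and a real system would keep the secondary indexes on disk and cache them as described.

```python
class TwoLevelIndex:
    def __init__(self):
        self.master = {}     # collection -> path of the merged file
        self.secondary = {}  # collection -> {entry name: (offset, length)}

    def add(self, collection, merged_path, entry, offset, length):
        self.master.setdefault(collection, merged_path)
        self.secondary.setdefault(collection, {})[entry] = (offset, length)

    def lookup(self, collection, entry):
        """Search the master index first, then the collection's secondary index."""
        path = self.master.get(collection)
        if path is None:
            return None
        location = self.secondary[collection].get(entry)
        if location is None:
            return None
        return (path,) + location


def merge_small_files(index, collection, merged_path, small_files):
    """Concatenate small files into one blob, recording each offset and length."""
    blob = bytearray()
    for name, data in small_files.items():
        index.add(collection, merged_path, name, len(blob), len(data))
        blob.extend(data)
    return bytes(blob)


def read_entry(index, blobs, collection, entry):
    """Read one entry back out of its merged blob via the two-level index."""
    hit = index.lookup(collection, entry)
    if hit is None:
        return None
    path, offset, length = hit
    return blobs[path][offset:offset + length]
```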
Index prefetching, as proposed here, means predicting from the data the user is currently accessing which data the user will access next, and loading the index of those data into the cache. If the prediction is accurate, the data the user is about to access can be loaded into the cache in advance, and the system responds faster when the user accesses them.
Before downloading or browsing a resource entry, a user normally must first obtain an "intermediate result set" by search or directory lookup, and then select the needed resource entry from it for further access. There is an interval of several seconds between the user seeing the result set page and performing the download or browse. If the indexes of the resource entries in the intermediate result set are cached during this interval, the series of file metadata queries need not be executed again when the user clicks to download or browse, and the file can be transferred directly, which greatly improves the response to these file requests. This improvement does not require much memory.
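The prefetch idea can be illustrated with a small cache that is warmed with the index records of every entry in the intermediate result set. The least-recently-used eviction and all names here are assumptions; the text only says that a caching strategy is used.

```python
from collections import OrderedDict


class IndexCache:
    def __init__(self, load_index, capacity=1000):
        self._load = load_index      # callable: entry name -> index record
        self._cache = OrderedDict()  # kept in least-recently-used order
        self._capacity = capacity
        self.misses = 0

    def _put(self, entry, record):
        self._cache[entry] = record
        self._cache.move_to_end(entry)
        while len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used

    def prefetch(self, result_set):
        """Warm the cache while the user is still looking at the result page."""
        for entry in result_set:
            if entry not in self._cache:
                self._put(entry, self._load(entry))

    def get(self, entry):
        """Return the index record, loading it only on a cache miss."""
        if entry in self._cache:
            self._cache.move_to_end(entry)
            return self._cache[entry]
        self.misses += 1
        record = self._load(entry)
        self._put(entry, record)
        return record
```

When the user clicks an entry from the result page, `get` then returns its index record without another metadata query.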
Below describe the accumulation layer framework of cloud platform of the present invention in detail.Cloud platform is except utilizing above-mentioned strategy, and when realizing, its accumulation layer framework is the basis of system.Cloud platform accumulation layer is structured on the distributed memory system on Hadoop cluster, provides basic file preserve and read service.
The framework of cloud platform accumulation layer adopts three-decker design: user interface layer, Business Logic and accumulation layer, and in order to improve performance, adopts the mode be separated with server cluster by Web server.The user interface that namely user interface layer provides, user is sent request and receiving feedback information by the function that this layer provides.Business Logic is the function realization layer that small documents reads and writes, and comprises Piece file mergence, index construct and buffer memory structure etc.
Business Logic comprises the functional modules such as Piece file mergence, searching system, small documents index, buffer memory and distributed system client.Each module is implemented as follows:
(1) File merging: The file merging function has two stages: creating a SequenceFile object and merging the small files. A filter selects the files that meet the merge criteria. For such a file, the master index is first searched by the resource collection to which the resource entry belongs. Once the file path for the resource collection is found, a SequenceFile object is created, and the SequenceFile Writer object is obtained and configured in preparation for writing. While the file write is executing, a new thread is started to write the metadata of the resource entry, such as its offset within the file and its length, into the resource entry secondary index. If the resource entry is written successfully, the output stream is closed and a submission success is returned; otherwise a submission failure is returned.
(2) Retrieval: Provides the file retrieval function. The distributed file system relies on this module to optimize reads based on the "intermediate result set".
(3) Small-file index: Builds the small-file index, comprising the resource collection master index and the resource entry secondary index, and provides functions such as creating index files and adding and deleting records.
The master index data is stored in a relational database, accessed through the relational database access interface, and held in memory in a Java Map data structure. Because resource collections are already stored in the database, the index only requires one additional field whose value is generated by the system when a resource entry is added, so keeping it in the relational database does not affect processing efficiency. The master index data uses a Key/Value structure, and a Java Map can be used to improve query efficiency. In addition, to ensure retrieval efficiency, this Map object must be initialized from the database contents when the service starts and must remain resident thereafter. Because the master index contains few files, the Map object occupies very little memory, so the system overhead is limited. When a resource collection is added or deleted, the Map object must be updated.
The secondary index is created with the open-source project Lucene and supports small-file metadata retrieval. Lucene provides a complete solution for index construction, update, and search; when the index file is smaller than 1 GB, search efficiency is very high, and it can be used to build commercial search engines. The index that the cloud platform creates requires some special functions: updating the index file in real time whenever a user adds a resource entry; concurrency control of file writes when multiple users add resource entries under the same resource collection simultaneously; and compressing the index file to reduce memory usage.
(4) Prefetching: To further improve response speed, cache management is provided for the "intermediate result sets" the user is interested in, including cache space maintenance, cache updating, and maintenance of the update algorithm.
After a user sends a retrieval request, the Web service queries for the resource entry result set that matches the user's search conditions and returns it to the user, while creating an asynchronous thread to update the cache. The cache contents are thus updated in the interval between returning the result set and the user, having browsed it, deciding to click download or browse. When the cache module receives a request to update the cache contents, it calls the index module to perform the retrieval and loads the metadata of the entries in the current result set into the cache. When the user sends a download or browse request, the Web service calls the distributed system client, which looks up the metadata in the cache, begins reading the data, and transmits it to the client.
The system maintains a thread pool with a fixed number of threads. Each time a cache update request is received, a thread from the pool is called to process it; if the pool has no idle thread, the cache task waits. This keeps the system resources consumed by cache update tasks within a reasonable range so that overall system performance is not affected. The present invention uses the FIFO algorithm to implement the scheduling function of the cache module, evicting the oldest cache entries in the most efficient manner. The specific implementation is as follows: a cache pool is created and its size is configured; the default is 32 MB, which can hold the metadata of 200,000 files. The cache pool stores key/value pairs: the file name is the key, and the combination of the file's data node ID, start position, and length is the value. The cache pool provides two operations, put and get. The put operation places data into the cache pool: if the data already in the pool has reached the upper limit, the corresponding data is replaced according to the cache replacement algorithm; if space remains, the data is simply inserted. The get operation returns the value for a given key, or null if the key is not present.
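The cache pool described above can be sketched as follows. This is a minimal illustration in Python rather than the Java of the embodiment; the names `FifoMetadataCache` and `FileLocation` are hypothetical, the capacity is counted in entries rather than bytes, and the behaviour on re-inserting an existing key (overwrite without changing its queue position) is an assumption, since the text does not specify it.

```python
import threading
from collections import OrderedDict
from typing import NamedTuple, Optional


class FileLocation(NamedTuple):
    """Cached value: where a merged small file lives (hypothetical type)."""
    data_node_id: str
    start: int
    length: int


class FifoMetadataCache:
    """Bounded key/value cache that evicts the oldest inserted entry first."""

    def __init__(self, max_entries: int = 200_000) -> None:
        if max_entries <= 0:
            raise ValueError("max_entries must be positive")
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, FileLocation]" = OrderedDict()
        self._lock = threading.Lock()  # put/get may be called from pool threads

    def put(self, file_name: str, location: FileLocation) -> None:
        with self._lock:
            if file_name in self._entries:
                # Assumption: overwrite in place; the entry keeps its
                # original position in the FIFO order.
                self._entries[file_name] = location
                return
            if len(self._entries) >= self._max_entries:
                self._entries.popitem(last=False)  # drop the oldest entry
            self._entries[file_name] = location

    def get(self, file_name: str) -> Optional[FileLocation]:
        # Unlike LRU, a hit does not change the eviction order.
        with self._lock:
            return self._entries.get(file_name)
```

With `max_entries=2`, inserting a third file name evicts the first one inserted, regardless of how recently it was read.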
The distributed system client encapsulates the API through which the file system interacts with the outside world, including reading and writing files and querying file locations. When the file system receives a file read request, the file filter first determines whether the file is a merged file. If so, the file's metadata is looked up first in the cache; if it is not found there, it is looked up in the index file; and if it is still not found, the client communicates with the name node. Once the file metadata is found, a SequenceFile object is constructed, the SequenceFile Reader object is obtained, and a read request is sent to the data node. After the data has been transferred to the user, the input stream is closed and the call returns.
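The lookup order in the read path above (cache, then index file, then name node) can be sketched as a small function. The function name `locate_file`, the dictionary-based cache and index, and the callbacks `is_merged` and `ask_name_node` are hypothetical stand-ins; sending files that the filter rejects straight to the name node is an assumption, since the text only describes the merged-file case.

```python
from typing import Callable, Dict, Optional, Tuple

Location = Tuple[str, int, int]  # (data node ID, start position, length)


def locate_file(
    file_name: str,
    is_merged: Callable[[str], bool],
    cache: Dict[str, Location],
    small_file_index: Dict[str, Location],
    ask_name_node: Callable[[str], Optional[Location]],
) -> Tuple[Optional[Location], str]:
    """Return (location, source), where source names the tier that answered."""
    if is_merged(file_name):
        hit = cache.get(file_name)
        if hit is not None:
            return hit, "cache"
        hit = small_file_index.get(file_name)
        if hit is not None:
            return hit, "index"
    # Assumption: unmerged files, and merged files missed by both tiers,
    # fall through to the name node.
    return ask_name_node(file_name), "namenode"
```

Returning the tier name alongside the location makes it easy to measure how often the prefetched cache actually answers a read.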
Users can make two kinds of requests: write requests that submit files, and read requests that query, browse, or retrieve resources. When the Web server receives a user's request to submit a resource, it first determines whether small-file merging is needed. If so, the files are merged; if not, the file is written directly through the distributed file system write interface. After merging, the distributed file system client prepares to write the file to the distributed file system. While the client writes the file, it calls the small-file index update module to build and update the small-file index; because the Web server host is separate from the server cluster, the write and the index update can be performed simultaneously by different threads without affecting each other. When the distributed file system write succeeds, the Web service returns a submission success message to the client.
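The write path above can be illustrated with a simplified, in-memory sketch. It does not use SequenceFile or Lucene: `MergedContainer` is a hypothetical stand-in that appends small files to one byte stream and records each file's offset and length, which is the information the secondary index is described as holding. The 1 MiB threshold `SMALL_FILE_LIMIT` is an assumed value (the text gives none), and the write and index update run sequentially here rather than in separate threads.

```python
import io
from typing import Dict, Tuple

SMALL_FILE_LIMIT = 1024 * 1024  # assumed threshold; the source gives no value


class MergedContainer:
    """Appends small files to one stream and records where each one lives."""

    def __init__(self) -> None:
        self._stream = io.BytesIO()
        self.index: Dict[str, Tuple[int, int]] = {}  # name -> (offset, length)

    def append(self, name: str, payload: bytes) -> Tuple[int, int]:
        offset = self._stream.seek(0, io.SEEK_END)  # always write at the end
        self._stream.write(payload)
        self.index[name] = (offset, len(payload))
        return self.index[name]

    def read(self, name: str) -> bytes:
        offset, length = self.index[name]
        self._stream.seek(offset)
        return self._stream.read(length)


def submit(name: str, payload: bytes,
           container: MergedContainer, direct_store: Dict[str, bytes]) -> str:
    """Merge small files into the container; write large files directly."""
    if len(payload) < SMALL_FILE_LIMIT:
        container.append(name, payload)
        return "merged"
    direct_store[name] = payload  # stands in for the direct write interface
    return "direct"
```

Reading a merged file back needs only its index entry, which is why caching that entry ahead of time removes the metadata query from the download path.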
A file read request is sent when a user needs to browse a file's details or download a file. Such requests are frequent and consume the most system resources. When the Web server receives a user's read request, the retrieval system first searches according to the conditions the user submitted and returns the resulting resource entry result set to the user for browsing. At the same time, the set of entries shown on the first page of the result set in the user interface (20 by default) is sent to the cache module, and a separate thread is started to update the cache. When the user, having browsed the returned result set page, requests a download or a detailed view, the Web service calls the distributed file system client to prepare to read the file contents. The client first looks up the file location information in the cache; if it is not found, it searches the small-file index. Once the location information is found, the client reads the data directly from the data node and returns it to the user.
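The prefetch step in the read flow above can be sketched with a fixed-size thread pool. The functions `search` and `warm_cache` are hypothetical, substring matching stands in for the real retrieval system, and plain dictionaries stand in for the small-file index and the cache. Python's `ThreadPoolExecutor` queues submitted tasks when all workers are busy, which corresponds to the described behaviour of making a cache task wait when no thread is idle.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List, Tuple

Location = Tuple[str, int, int]  # (data node ID, start position, length)
PAGE_SIZE = 20  # first-page size given in the text


def warm_cache(names: List[str], index: Dict[str, Location],
               cache: Dict[str, Location]) -> int:
    """Copy the metadata of the given entries from the index into the cache."""
    loaded = 0
    for name in names:
        location = index.get(name)
        if location is not None:
            cache[name] = location
            loaded += 1
    return loaded


def search(query: str, index: Dict[str, Location], cache: Dict[str, Location],
           pool: ThreadPoolExecutor) -> Tuple[List[str], Future]:
    """Return the result set at once and warm the cache in the background."""
    # Substring matching is a placeholder for the real retrieval system.
    results = sorted(name for name in index if query in name)
    future = pool.submit(warm_cache, results[:PAGE_SIZE], index, cache)
    return results, future


if __name__ == "__main__":
    index = {f"file{i:02d}.dat": ("dn1", i * 10, 10) for i in range(30)}
    cache: Dict[str, Location] = {}
    with ThreadPoolExecutor(max_workers=2) as pool:
        results, pending = search("file", index, cache, pool)
        pending.result()  # in a real service the user's think time covers this
    print(len(results), len(cache))
```

Only the first page is prefetched, so the memory cost per search stays bounded by `PAGE_SIZE` entries.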
It should be understood that the above embodiments of the present invention are intended only to illustrate or explain the principles of the invention and do not limit it. Therefore, any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the present invention shall fall within its scope of protection. In addition, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or within equivalents of such scope and boundaries.

Claims (3)

1. A method for processing monitoring data of power transmission equipment, characterized in that:
Multi-replica consistent-hash storage is performed according to the correlation and the time and space attributes of the monitoring data, and a parallel computing framework is used to perform joint retrieval, parallel retrieval, and feature analysis on multiple monitoring data sources.
2. The method according to claim 1, characterized in that performing multi-replica consistent-hash storage according to the correlation and the time and space attributes of the monitoring data further comprises:
Obtaining the time and space characteristics of the data collected by each monitoring device, namely the acquisition time and acquisition location corresponding to the data, and using them together with a user-defined correlation coefficient as the keywords for data retrieval and analysis; storing the data in the cloud platform as 3 backup copies; using consistent hashing to hash-map the 1st backup of the data according to the monitoring device number; hash-mapping the 2nd backup of the data according to the acquisition time data; and hash-mapping the 3rd backup of the data according to the user-defined correlation coefficient, the correlation coefficient being a specific attribute of the monitoring data whose value is assigned according to the upper-level application; the consistent-hash storage further comprising the following process:
1) predefining, through a configuration file, the correlation coefficient of the monitoring data and the number of redundant backups;
2) calculating the hash value of each storage node in the cloud platform and mapping it into the interval of a circular hash queue established in advance;
3) calculating the hash values of the data according to the time and space attributes and the correlation coefficient of the monitoring data: for the 1st of the multiple backups of the data held on the cloud platform, calculating a first hash value according to the source of the data, namely the monitoring device number, and mapping it onto the circular hash queue; for the 2nd backup, calculating a second hash value according to the time attribute of the monitoring data, namely the acquisition time data, and mapping it onto the circular hash queue; for the 3rd backup, calculating a third hash value according to the correlation coefficient of the data and mapping it onto the circular hash queue; if the cloud platform is configured with more than 3 backups, calculating their hash values by cycling through the methods of the first to third backups in turn and mapping them onto the circular hash queue;
4) determining the storage location of the data from the data hash value and the storage node hash values, by mapping the data clockwise onto the storage node nearest to it;
5) if the node on which the data is to be stored has insufficient space, skipping the current node and finding the next storage node;
In addition, when data is read, the name node sorts the multiple storage nodes by the distance between each storage node and the client and returns them to the client, so that the data is read from the nearest node, wherein the distance between two nodes is defined as the number of nodes passed through in going from one node to the other.
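Steps 2) to 5) of claim 2 can be illustrated with a small consistent-hash ring. This is a sketch under stated assumptions, not the claimed implementation: the class `HashRing`, the function `place_backups`, the record field names `device_id`, `acquired_at`, and `correlation`, the use of MD5 as the hash function, and the 2^32 ring size are all hypothetical choices made for illustration.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def ring_hash(key: str) -> int:
    """Stable position on a 2**32 ring (MD5 is an assumed choice of hash)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    def __init__(self, free_space: Dict[str, int]) -> None:
        self.free_space = dict(free_space)  # node name -> remaining capacity
        # Step 2: place every storage node on the ring by its hash value.
        self._ring: List[Tuple[int, str]] = sorted(
            (ring_hash(node), node) for node in free_space
        )

    def place(self, key: str, size: int) -> str:
        """Steps 4 and 5: walk clockwise to the first node with enough room."""
        start = bisect.bisect_left(self._ring, (ring_hash(key), ""))
        for step in range(len(self._ring)):
            _, node = self._ring[(start + step) % len(self._ring)]
            if self.free_space[node] >= size:
                self.free_space[node] -= size
                return node
            # Not enough space on this node: skip it and try the next one.
        raise RuntimeError("no storage node has enough free space")


def place_backups(ring: HashRing, record: Dict[str, object],
                  size: int, copies: int = 3) -> List[str]:
    """Step 3: hash backup i by device number, time, or correlation in turn."""
    key_fields = ("device_id", "acquired_at", "correlation")  # assumed names
    return [
        ring.place(str(record[key_fields[i % len(key_fields)]]), size)
        for i in range(copies)
    ]
```

Because each backup is keyed by a different attribute, records that share a device, a time, or a correlation coefficient tend to land on the same node for one of their copies. Note that this sketch does not prevent two backups of one record from falling on the same node; the claim does not say how that case is handled.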
3. The method according to claim 2, characterized in that performing joint retrieval on multiple monitoring data sources further comprises:
Retrieving according to the following conditions: device attribute data, namely name, running time, installation location, and body parameters; monitoring data, including conductor temperature, current-carrying capacity, and tension; environmental data, including ambient temperature, humidity, and air pressure; and geographic information data, including altitude, longitude, and latitude; joining the different data sources, the different data sources coming from multiple files; the monitoring device uniformly collecting and uploading insulated terminal leakage current, conductor tension, conductor current, conductor temperature, and microclimate data, and raising the corresponding alarm when an insulated terminal is abnormal, a terminal is overheated, or an imbalance occurs; wherein, in the process of monitoring leakage current, three data files, namely the device attribute data file, the insulated terminal leakage current data file, and the environmental data file, are used for retrieval to generate the monitoring data of the monitoring device within a predetermined time, and the three data files are joined to perform the joint retrieval;
After the power transmission equipment monitoring data has been stored, the data is retrieved by a parallel query method executed at the map side, in which the filtering and joining of the data are completed in the map stage and the reduce stage is avoided, the retrieval comprising the following steps:
1) filtering the data according to the search conditions proposed by the user, removing data that does not satisfy the conditions;
2) setting the primary key according to the retrieval requirement, the primary key being the monitoring device number, the time data, or the correlation coefficient;
3) tagging every record of each data source with its data file name as a label;
4) partitioning records with the same attribute value into one group according to the primary key, and performing the data join;
The filtering, tagging, grouping and sorting, and join operations in the map process of the joint retrieval are carried out on the local node, and the result of the joint retrieval is then output to the distributed file system;
Further, performing parallel retrieval and feature analysis on multiple monitoring data sources further comprises:
Performing integrated feature extraction on the synchronously collected multichannel signal data based on the dynamic correlations among the multichannel time series: the data is first uploaded to the distributed file system, which splits the data into blocks and distributes them randomly over multiple storage nodes; the calculation of the dynamic correlations among the multichannel time series is completed in the reduce stage, and the calculation results are output to and saved in the distributed file system; using the temporal correlation of the data, the acquisition time data is used as the keyword for calculating the hash storage location; the feature extraction process further comprising:
1) filtering the data according to the time of the calculation task, removing data that does not meet the time condition; 2) using the time data as the primary key and tagging every record; 3) partitioning records with the same attribute value into one group according to the primary key, calling the multivariate sample entropy computation process, and outputting the calculation result to the distributed file system.
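The four joint-retrieval steps of claim 3 (filter, set the primary key, tag by file name, group and join) can be illustrated in a single process. The function `map_side_join` and the sample file and field names are hypothetical, and treating the join as an inner join (a key is kept only if every data source contributes a record) is an assumption. A real map-side join on a MapReduce framework additionally relies on the input files being partitioned so that matching keys are available on the same node, which this sketch does not model.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def map_side_join(
    sources: Dict[str, Iterable[Record]],
    keep: Callable[[Record], bool],
    primary_key: str,
) -> Dict[object, Dict[str, List[Record]]]:
    """Filter, key, tag, and group records from several files, then join."""
    groups: Dict[object, Dict[str, List[Record]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for file_name, records in sources.items():
        for record in records:
            if not keep(record):          # step 1: drop non-matching data
                continue
            key = record[primary_key]     # step 2: primary key of the record
            # Steps 3 and 4: the file name is the tag; records sharing a
            # key are collected into one group.
            groups[key][file_name].append(record)
    # Assumption: inner join, so keep only keys present in every source.
    return {
        key: dict(tagged)
        for key, tagged in groups.items()
        if len(tagged) == len(sources)
    }


if __name__ == "__main__":
    sources = {
        "device.csv": [{"device_id": 1, "name": "tower-1"}],
        "leakage.csv": [{"device_id": 1, "leakage_ua": 12.5}],
        "environment.csv": [{"device_id": 1, "temperature_c": 20.0}],
    }
    joined = map_side_join(sources, lambda r: True, "device_id")
    print(sorted(joined[1]))
```

Keeping the file-name tag on each group makes it possible to tell, after the join, which of the three data files a given field came from.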
CN201510674398.1A 2015-10-16 2015-10-16 Method for processing monitoring data of electric power transmission equipment Pending CN105303456A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510674398.1A CN105303456A (en) 2015-10-16 2015-10-16 Method for processing monitoring data of electric power transmission equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510674398.1A CN105303456A (en) 2015-10-16 2015-10-16 Method for processing monitoring data of electric power transmission equipment

Publications (1)

Publication Number Publication Date
CN105303456A true CN105303456A (en) 2016-02-03

Family

ID=55200686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510674398.1A Pending CN105303456A (en) 2015-10-16 2015-10-16 Method for processing monitoring data of electric power transmission equipment

Country Status (1)

Country Link
CN (1) CN105303456A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102033748A (en) * 2010-12-03 2011-04-27 中国科学院软件研究所 Method for generating data processing flow codes
CN104184812A (en) * 2014-08-20 2014-12-03 四川九成信息技术有限公司 Multi-point data transmission method based on private cloud

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
卞艺杰 等: "Hdspace分布式机构知识库系统的小文件存储", 《计算机系统应用》 *
宋亚奇 等: "云平台下输变电设备状态监测大数据存储优化与并行处理", 《中国电机工程学报》 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105912565A (en) * 2016-03-25 2016-08-31 北京用尚科技股份有限公司 Monitoring information synchronization method applied to power industry
CN106021360A (en) * 2016-05-10 2016-10-12 深圳前海信息技术有限公司 Method and device for autonomously learning and optimizing MapReduce processing data
CN106019048A (en) * 2016-05-18 2016-10-12 成都理工大学 Single-phase grounding fault transient line selection method for small-current grounding system
CN108074191A (en) * 2016-11-14 2018-05-25 平安科技(深圳)有限公司 The method and device of data processing
CN107483858A (en) * 2017-08-31 2017-12-15 益和电气集团股份有限公司 The distributed memory system and its distributed storage method of electricity consumption enterprise supervision video
CN107967301A (en) * 2017-11-07 2018-04-27 许继电气股份有限公司 A kind of storage, querying method and the device of power cable tunnel monitoring data
WO2019169619A1 (en) * 2018-03-09 2019-09-12 深圳大学 Method and apparatus for dividing randomly sampled data sub-blocks of big data
CN108573039A (en) * 2018-04-04 2018-09-25 烟台海颐软件股份有限公司 A kind of target identification method assembled based on multisource spatio-temporal data and system
CN113261039A (en) * 2019-01-23 2021-08-13 索尼集团公司 Information processing apparatus, information processing method, and information processing program
CN109857924A (en) * 2019-02-28 2019-06-07 重庆科技学院 A kind of big data analysis monitor information processing system and method
CN110941642A (en) * 2019-11-20 2020-03-31 贵州电网有限责任公司电力科学研究院 Power distribution network data processing method and device based on Lucene full-text retrieval
CN111475105A (en) * 2020-03-11 2020-07-31 平安科技(深圳)有限公司 Monitoring data storage method, device, server and storage medium
CN111475105B (en) * 2020-03-11 2024-05-03 平安科技(深圳)有限公司 Monitoring data storage method, monitoring data storage device, monitoring data server and storage medium
CN111352901A (en) * 2020-03-23 2020-06-30 郑州智利信信息技术有限公司 Monitoring data storage method based on computer
CN111521953A (en) * 2020-05-26 2020-08-11 广州市扬新技术研究有限责任公司 Rail transit contact net leakage current detecting system
CN111521953B (en) * 2020-05-26 2022-07-29 广州市扬新技术研究有限责任公司 Rail transit contact net leakage current detecting system
CN112131284A (en) * 2020-09-30 2020-12-25 国网智能科技股份有限公司 Transformer substation holographic data slicing method and system
CN112131284B (en) * 2020-09-30 2024-05-24 国网智能科技股份有限公司 Holographic data slicing method and system for transformer substation
CN112488456A (en) * 2020-11-12 2021-03-12 南方电网科学研究院有限责任公司 Digital data modeling method for power equipment
CN116937794A (en) * 2023-06-30 2023-10-24 华能青铜峡新能源发电有限公司 Electric power control detection system and method
CN116937794B (en) * 2023-06-30 2024-02-06 华能青铜峡新能源发电有限公司 Electric power control detection system and method

Similar Documents

Publication Publication Date Title
CN105303456A (en) Method for processing monitoring data of electric power transmission equipment
CN106484877B (en) A kind of document retrieval system based on HDFS
CN111324360B (en) Container mirror image construction method and system for edge calculation
US8560569B2 (en) Method and apparatus for performing bulk file system attribute retrieval
CN111177178B (en) Data processing method and related equipment
CN104090889B (en) Data processing method and system
CN103020204B (en) A kind of method and its system carrying out multi-dimensional interval query to distributed sequence list
US10013440B1 (en) Incremental out-of-place updates for index structures
CN102725755B (en) Method and system of file access
CN104239377A (en) Platform-crossing data retrieval method and device
CN104778270A (en) Storage method for multiple files
CN102169507A (en) Distributed real-time search engine
CN111258978B (en) Data storage method
CN105160039A (en) Query method based on big data
CN105117502A (en) Search method based on big data
CN110347651A (en) Method of data synchronization, device, equipment and storage medium based on cloud storage
CN104584524A (en) Aggregating data in a mediation system
CN103793493A (en) Method and system for processing car-mounted terminal mass data
US8880504B2 (en) Tag management device, system and recording medium
CN102281312B (en) Data loading method and system and data processing method and system
CN112084190A (en) Big data based acquired data real-time storage and management system and method
KR101666440B1 (en) Data processing method in In-memory Database System based on Circle-Queue
CN102104581A (en) Network karaoke on-demand system and method thereof
JP2023531751A (en) Vehicle data storage method and system
CN116089414B (en) Time sequence database writing performance optimization method and device based on mass data scene

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160203
