CN103580960A - Online pipe network anomaly detection system based on machine learning - Google Patents

Online pipe network anomaly detection system based on machine learning

Info

Publication number
CN103580960A
Authority
CN
China
Prior art keywords
data
virtual machine
pipe network
abnormity detecting
machine learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310581956.0A
Other languages
Chinese (zh)
Other versions
CN103580960B (en)
Inventor
陈尊裕
张得志
李丹
胡斯洋
龙圣
郑思明
吴珏其
周振邦
李维海
王红旗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan Science And Technology Co Ltd
Original Assignee
Foshan Luosixun Environmental Protection Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan Luosixun Environmental Protection Technology Co ltd
Priority to CN201310581956.0A
Publication of CN103580960A
Application granted
Publication of CN103580960B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses an online pipe network anomaly detection system based on machine learning. The system comprises a data collection unit, a data distribution unit, and a plurality of anomaly detection units. The data collection unit collects real-time data from the online pipe network, merges the data by location area, and groups them into different data packets. The data distribution unit receives the data packets, extracts data elements from them, formats the packets, and divides them into a plurality of data subsets. The anomaly detection units receive the data subsets in one-to-one correspondence and predict anomalies in the data subsets based on a semi-supervised machine learning framework. The anomaly detection units process data in parallel and exchange data with one another through MPI. The system meets the server availability requirements of machine-learning-based online anomaly detection units and avoids introducing extra standby hardware that sits idle.

Description

An online pipe network anomaly detection system based on machine learning
Technical field
The present invention relates to monitoring technology for utility pipe networks, and specifically to an online pipe network anomaly detection system based on machine learning.
Background
Advances in sensor technology allow sensors to measure environmental parameters with high spatial and temporal resolution. The time series collected by the sensors are continuously written to storage, forming a data stream. Taking waterworks operation as an example, the sensor data can include hydraulic parameters and water quality indices. These data can be used, among other things, to detect abnormal conditions: data anomalies are identified by comparison with historical patterns or model predictions. An abnormal condition may be a pipeline leak or a contamination incident. Pipelines cover a large geographic area, and water use patterns are highly complex because of weather, seasonal changes, holidays, and changes in community population structure, so manual methods are not up to the task. Machine learning based on historical data is therefore the only feasible approach to online anomaly detection. Machine learning techniques can be roughly divided into three classes: (a) purely data-driven; (b) rule-based; (c) physical-model-based. The classification depends on which parameters are used to track and predict current or future trends in the sensor data and the relationships between groups of data. First, the anomaly detection system sets a baseline from the historical data of a normally operating system or sensor system. After that, any activity that deviates from this baseline is regarded as abnormal.
In addition, because real abnormal data must be distinguished from non-abnormal data (false alarms), a computing system based on replication mechanisms and redundancy strategies is still needed to support continuous online data acquisition and execution of the data analysis program.
Summary of the invention
In view of the above deficiencies, the object of the invention is to provide an online pipe network anomaly detection system based on machine learning. Using multi-server-host hardware virtualization and a publish-subscribe data distribution strategy, it meets the server availability requirements of machine-learning-based online anomaly detection units while avoiding the introduction of extra standby hardware that sits idle.
To achieve the above object, the invention adopts the following technical scheme:
An online pipe network anomaly detection system based on machine learning, comprising:
A data acquisition unit, for collecting real-time data of the online pipe network and merging the real-time data by location area into different data packets;
A data distribution unit, for receiving the data packets, extracting data elements from them, formatting the packets, and dividing them into a plurality of data subsets;
A plurality of anomaly detection units, for receiving the data subsets in one-to-one correspondence and predicting anomalies in the data subsets based on a semi-supervised learning framework; the anomaly detection units process data in parallel and exchange data with one another through MPI.
Each anomaly detection unit is installed on a virtual machine, with one anomaly detection unit per virtual machine.
The system further comprises a plurality of server hosts interconnected in a fully connected topology within a local area network. Each server host is equipped with a multi-core processor, whose threads are partitioned into a plurality of virtual machines on the same server host: the first thread is designated virtual machine dom0, and the other threads are assigned to virtual machines domU. Virtual machine dom0 accesses the hardware of the server host and interacts with the domU virtual machines; each domU hosts an anomaly detection unit. Each domU on a running server host has a corresponding backup on the other running server hosts.
The anomaly detection unit comprises:
A prediction module, for building a prediction model from a multiple regression equation, so as to provide statistical prediction data for the expected variable states of the data subset under the assumption that no anomaly is present; the prediction module also exchanges statistical prediction data with the other anomaly detection units;
An analysis module, for receiving the statistical prediction data and, based on them, estimating the regression parameters for the next data subset obtained from the data distribution unit, in order to compute the predicted value of that next data subset; the next data subset has the same time step as the historical data and a consistent pipe network background;
A judgment module, for judging whether the next data subset is abnormal from its predicted value and actual value;
A decision module, for receiving the abnormality judgment made by the judgment module and updating the prediction model accordingly.
The prediction module builds the prediction model by the following steps:
Step 11: model how the parameters at a given point in the online pipe network vary over time:
$$X_i(t+1) = F_i\bigl(X(t), X(t-1), X(t-2), \ldots, X(t-n)\bigr) \tag{1}$$
where $F_i$ is the prediction model of the i-th anomaly detection unit, i is a positive integer no greater than the total number of anomaly detection units, $X_i$ is the input data of the i-th anomaly detection unit, $X(t), X(t-1), X(t-2), \ldots, X(t-n)$ are historical data, and $X_i(t+1)$ is the next data subset;
Step 12: build the prediction model from a multiple regression equation:
$$X_i(t+1) = A_{i0} X_i(t) + A_{i1} X_i(t-1) + \cdots + A_{in} X_i(t-n) + C_i \tag{2}$$
where $A_{i0}, A_{i1}, \ldots, A_{in}$ are the regression parameters of prediction model $F_i$ and $C_i$ is the random error parameter of the i-th anomaly detection unit; the prediction modules exchange statistical prediction data by passing $C_i$ through MPI;
Step 13: solve for the random error parameter $C_i$:
$$C_i = \sum_{j \ne i}^{n} A_{ij0} X_j(t) + \sum_{j \ne i}^{n} A_{ij1} X_j(t-1) + \cdots + \sum_{j \ne i}^{n} A_{ijn} X_j(t-n) \tag{3}$$
In equation (3), $A_{ij0}, A_{ij1}, \ldots, A_{ijn}$ are obtained by automatic fitting with a standard numerical package.
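As a rough illustration of equations (2) and (3), the sketch below predicts the next value for one unit from its own lagged values plus the cross term C_i built from the other units' lagged values. It is a minimal Python sketch under stated assumptions, not the patented implementation: each unit carries a single scalar series stored oldest to newest, and the names `cross_term`, `predict_next`, `own_coef`, and `cross_coef` are hypothetical.

```python
def cross_term(i, history, cross_coef):
    """Equation (3): C_i = sum over units j != i and lags k of A_ijk * X_j(t-k).

    history maps unit id -> list of values, oldest first (history[j][-1] is X_j(t)).
    cross_coef maps unit id j -> [A_ij0, A_ij1, ...] for the fixed unit i.
    """
    total = 0.0
    for j, series in history.items():
        if j == i:
            continue
        for k, a in enumerate(cross_coef[j]):
            total += a * series[-1 - k]
    return total


def predict_next(i, history, own_coef, cross_coef):
    """Equation (2): X_i(t+1) = sum_k A_ik * X_i(t-k) + C_i."""
    own = sum(a * history[i][-1 - k] for k, a in enumerate(own_coef))
    return own + cross_term(i, history, cross_coef)
```

In a deployment along the lines of the text, each unit would compute only its own terms locally and obtain the data needed for C_i from the other units over MPI; here everything sits in one dictionary for readability.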
The judgment module judges whether the next data subset is abnormal by the following steps:
Step 31: compute the difference between the predicted value and the measured value of $X_i(t+1)$;
Step 32: collect historical data to build the database $X_i(t), X_i(t-1), \ldots, X_i(t-P)$, where P, a positive integer, is an empirical parameter defining the time dependence of the i-th anomaly detection unit;
Step 33: form the historical sample $\{X_i(t), X_i(t-1), \ldots, X_i(t-P)\}$ and compute the standard deviation range of this sample;
Step 34: compare the difference with the standard deviation range:
If the difference is smaller than the standard deviation range, the judgment module returns a negative acknowledgement to the decision module. If the judgment modules of all anomaly detection units return negative acknowledgements, the decision module stores all $X_i(t+1)$ in the database and instructs the corresponding judgment modules to synchronize the latest sample data and regression parameters with the database, ready for predicting $X_i(t+2)$;
When predicting $X_i(t+2)$, if the difference is larger than the standard deviation range of the updated sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$, the corresponding judgment module returns a positive acknowledgement to the decision module, and the decision module marks $X_i(t+2)$ as an anomalous event in the database. The decision module then instructs this judgment module to update its regression parameters from the database, but to keep using the old sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$ to define the standard deviation for the anomaly judgment of $X_i(t+3)$.
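Steps 31 to 34 can be sketched as a small sliding-window test. The following Python is an illustration only: the text does not say how wide the "standard deviation range" is or which estimator is used, so the multiplier `k` and the choice of the population standard deviation (`statistics.pstdev`) are assumptions, as are the function names.

```python
from statistics import pstdev


def is_anomalous(predicted, measured, window, k=1.0):
    """True when |predicted - measured| exceeds k standard deviations of the window."""
    return abs(predicted - measured) > k * pstdev(window)


def judge_and_update(window, predicted, measured, k=1.0):
    """Judge one new value; return (flag, sample to use for the next judgment).

    Normal value: the sample slides forward to include the new measurement.
    Anomalous value: the old sample is kept, so the outlier does not inflate
    the standard deviation used for the next judgment.
    """
    if is_anomalous(predicted, measured, window, k):
        return True, list(window)
    return False, list(window[1:]) + [measured]
```

Keeping the old sample after an anomaly mirrors the rule stated for $X_i(t+3)$ above; updating the regression parameters themselves is left out of the sketch.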
The system further comprises a network-attached storage (NAS) unit, which stores the image copies of all virtual machines and the historical data of the online pipe network. Every anomaly detection unit can access the data in the NAS unit, and virtual machine dom0 handles communication between its domU virtual machines and the NAS unit.
The backup method is as follows: the checkpoint information of each domU on a running server host is distributed for backup according to the load of the other running server hosts, so as to achieve the best-balanced operation. After backup, a lookup table is generated automatically; it defines the migration node that holds the backup of each original domU, so that live migration can be carried out when a domU or a server host fails.
A backup manager is provided on each virtual machine dom0 to compile an inventory of the health status of the domU virtual machines belonging to that dom0. Through the backup manager, dom0 is configured to handle the backup of a failed virtual machine according to the lookup table.
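One way to picture the load-based backup placement and the resulting lookup table is the greedy sketch below. The text does not specify the balancing algorithm, so the "least-loaded other host" rule, the unit load increment per backup, and the names `build_lookup_table`, `vm_home`, and `host_load` are all assumptions made for illustration.

```python
def build_lookup_table(vm_home, host_load):
    """Map each domU to a backup host: the least-loaded host other than its own.

    vm_home:   VM name -> host it currently runs on
    host_load: host name -> current load (any comparable number)
    """
    load = dict(host_load)  # work on a copy so the caller's data is untouched
    table = {}
    for vm, home in sorted(vm_home.items()):
        candidates = [h for h in load if h != home]
        # Ties are broken by host name so the result is deterministic.
        target = min(candidates, key=lambda h: (load[h], h))
        table[vm] = target
        load[target] += 1  # count the backup as extra load on the target
    return table
```

On a failure, the backup manager would look up the failed VM in this table to find its migration node, for example `build_lookup_table(...)["vm1"]`.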
The data distribution unit distributes data subsets by the following steps:
The data distribution unit receives the data packets and filters out noise, and even erroneous information, introduced during measurement, transmission, or collection;
It extracts the data elements from each packet and converts the packet to a uniform format;
It divides the packet into the corresponding number of data subsets, keeping the data balanced across subsets;
It encrypts the data subsets and distributes them to the anomaly detection units through the publish-subscribe model.
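The distribution steps above can be sketched end to end with an in-process broker. This is a simplified Python illustration, not the system's actual protocol: the record layout, the round-robin split, and the `Broker` class are assumptions, the filter only drops missing or non-numeric values, and encryption is omitted.

```python
from collections import defaultdict


def clean(packet):
    """Drop readings whose value is missing or non-numeric (stand-in for noise filtering)."""
    return [r for r in packet if isinstance(r.get("value"), (int, float))]


def to_uniform(reading):
    """Convert a reading to one format: (sensor id, timestamp, float value)."""
    return (str(reading["sensor"]), int(reading["ts"]), float(reading["value"]))


def partition(records, m):
    """Split records round-robin into m subsets whose sizes differ by at most one."""
    subsets = [[] for _ in range(m)]
    for idx, rec in enumerate(records):
        subsets[idx % m].append(rec)
    return subsets


class Broker:
    """Tiny in-process publish-subscribe broker keyed by topic."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        for callback in self.subscribers[topic]:
            callback(payload)
```

A detection unit would subscribe under its own topic, and the distribution unit would publish subset i to the topic of unit i.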
Compared with the prior art, the invention has the following advantages:
1. The anomaly detection units apply machine learning to the online pipe network, providing a statistical distribution prediction of the expected variable states of the distribution network under the assumption that no anomaly is present. This improves the recognition of anomalies in the online pipe network and saves a great deal of manpower.
2. The anomaly detection units process data in parallel, which reduces CPU resource contention, meets the server availability requirements, and avoids introducing extra standby hardware that sits idle.
3. There is no need to rebuild the data-transfer application programming interface to adapt to the protocols for data transfer and communication control inside and between different server hosts. The computation delay between dom0 and domU is far smaller than the time scale on which sensor data in the water network change.
4. The server hosts do not need their own disks: virtual machine images are stored on the NAS, which any physical machine can access. Any virtual machine can therefore run on any physical machine without a further backup on local disk.
5. Virtual machine checkpoints are copied to another server host to enable live migration. If one or more data processing modules fail, each failed module is replaced by a copy activated by the multi-server-host virtualization platform.
6. During failover, a virtual machine migrated to a different server host resumes operation, without error, from its latest checkpoint. All operating system run-time state, including active TCP connections, is preserved. Running processes continue as usual, and all files, network state, and disks remain intact.
Brief description of the drawings
Fig. 1 shows the network architecture of the high-availability utility pipe network anomaly detection of the invention;
Fig. 2 shows the online anomaly detection framework of the parallel model;
Fig. 3 shows the envisaged multi-server-host virtualization framework;
Fig. 4 shows how the threads of a multi-core processor are divided among different virtual machines on the same server host;
Fig. 5 describes the method of managing high-availability servers for parallel execution of the online anomaly detection algorithm.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and a specific embodiment.
Embodiment
This embodiment takes anomaly detection in a water supply network as an example. The method is similar for other online pipe networks, such as electric power, telecommunications, networking, communications, heating, and gas, and is not repeated here.
Fig. 1 shows the network architecture of the high-availability utility pipe network anomaly detection. In each group the sensors may be hydraulic or water quality sensors, and the data acquisition unit groups the data of sensors at the nearest geographic locations together and sends them as a data packet. The data distribution unit receives the sensor measurements and converts them into a published format that meets the subscribers' post-processing requirements. The server hosts in the control center are interconnected in a fully connected topology within a local area network (LAN); a mesh topology is preferred for migrating virtual machines between server hosts. The NAS unit is connected to all physical server hosts through the local area network.
The sensors and instruments in the water supply network monitoring system collect data continuously. The data can include hydraulic data (such as flow velocity, flow rate, water pressure, and water level) and water quality data (including free chlorine, turbidity, pH, conductivity, oxidation-reduction potential, and total organic nitrogen). By analyzing these indicators, pipe leaks and contamination incidents in the public water network can be detected. The indicators in such a water distribution system vary widely because of routine operation of facilities such as tanks, pumps, and gates, seasonal changes in the water source and sealing water, and fluctuations in water demand. An anomaly detection system is therefore needed to distinguish routine variation in the sensor data from abnormal conditions.
The data acquisition unit comprises a SCADA (supervisory control and data acquisition) system and RTUs (remote terminal units). SCADA is a typical system for collecting real-time sensor data from a transport network. In the SCADA system of the invention (Fig. 1), local sensor data are merged and grouped by regional RTUs. The RTUs digitize the data and add time tags and the like according to sensor category and acquisition time. The digitized sensor data are then sent to the data collection server, which can be done over a closed industrial network such as Modbus, LonWorks, or BACnet.
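The regional grouping performed by the RTUs can be pictured as follows. This is a minimal Python sketch with a made-up record layout: the field names `region`, `sensor`, `kind`, `ts`, and `value` and the function `build_packets` are hypothetical, and real RTUs would do this in firmware over an industrial bus rather than in Python.

```python
from collections import defaultdict


def build_packets(readings):
    """Group time-tagged sensor readings into one packet per region, ordered by time."""
    packets = defaultdict(list)
    for r in sorted(readings, key=lambda r: r["ts"]):
        packets[r["region"]].append(
            {"sensor": r["sensor"], "kind": r["kind"], "ts": r["ts"], "value": r["value"]}
        )
    return dict(packets)
```

Each resulting packet may mix hydraulic and water quality readings, which matches the statement that a packet is defined by location rather than by sensor type.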
The data distribution unit is based on the publish-subscribe model. It extracts data elements from the transport network sensor data and converts them to a uniform format. Noise, and even errors, introduced during measurement, transmission, or collection are filtered out beforehand. The data distribution unit organizes and formats the received data for further processing. After encryption, the data are sent over open TCP/IP Ethernet to each terminal that receives water quality data in a different format; in the invention, the format-converted data are sent to the anomaly detection units in the operation center. The publish-subscribe transport protocol includes the decomposition rule for the data set $X = (X_1, X_2, \ldots, X_m)$: for example, $X_1$ is sent to virtual machine #1, $X_2$ to virtual machine #2, ..., and $X_m$ to virtual machine #m.
Specifically, all transport network sensor data are divided into different data packets by location area. The sensor data in each packet may be hydraulic or water quality data. The data distribution unit names each packet after the IP address of the anomaly detection unit. Fig. 2 shows the online anomaly detection framework of the parallel model. All anomaly detection units run on the multi-server-host virtual machine system, and the virtual machine monitor replicates a failed anomaly detection unit, so an anomaly detection unit can recover from the failure of a single virtual machine.
The anomaly detection units in the operation center process data in parallel. The system runs several anomaly detection algorithms at once; each algorithm processes one subset of the sensor data packet, and the subsets enter the operation center as independent packets. The anomaly detection units exchange data with one another through the Message Passing Interface (MPI). The anomaly detection program can be written in C or Fortran and runs on Linux. When the program is written in C, MPI is a set of C functions; when it is written in Fortran, MPI is a set of subroutines for exchanging data between processes.
The algorithm is described in detail below:
The data fed to the anomaly detection units cover the operating state of the whole pipe network. These data are obtained from sensor measurements, and the database is updated in real time with the latest state of the pipe network.
How the parameters at a given point in the pipe network vary over time is simulated by a data model, as follows:
$$X(t+1) = F\bigl(X(t), X(t-1), X(t-2), \ldots, X(t-n)\bigr),$$
where $X(t)$ denotes the parameters measured by the sensors. F is the prediction model: it reads the historical data $X(t), X(t-1), X(t-2), \ldots$ from the database and infers the value of X at the next time point t+1 from the historical observations. In general, a multiple regression equation
$$X(t+1) = A_0 X(t) + A_1 X(t-1) + \cdots + A_n X(t-n)$$
is sufficient to build the prediction model F and determine the mean value of $X(t+1)$, where $A_0$ to $A_n$ are coefficient matrices.
X is a very large vector: for a large network it contains hundreds of parameters. To simplify the computation, the evaluation of F can be split into several subprocesses using parallel programming in the MPI framework:
$$X = (X_1, X_2, \ldots, X_i, \ldots, X_m)$$
The length of each subvector is
[Equation image: Figure BDA0000417145560000071]
and
$$X_i(t+1) = F_i\bigl(X(t), X(t-1), X(t-2), \ldots, X(t-n)\bigr),$$
$$X_i(t+1) = A_{i0} X_i(t) + A_{i1} X_i(t-1) + \cdots + A_{in} X_i(t-n) + C_i, \qquad i = 1, \ldots, m,$$
$$C_i = \sum_{j \ne i}^{n} A_{ij0} X_j(t) + \sum_{j \ne i}^{n} A_{ij1} X_j(t-1) + \cdots + \sum_{j \ne i}^{n} A_{ijn} X_j(t-n)$$
Each $F_i$ can be evaluated independently on its own virtual machine. Under the publish-subscribe data distribution strategy, $X_i$ is the input data of module $F_i$, and the modules exchange data by passing the random error parameter $C_i$ through the message passing interface (MPI).
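The split into own terms and cross terms can be checked directly. Assume, as a reading of the text rather than something it states, that each coefficient matrix $A_k$ is partitioned into blocks $A_{ijk}$ conformally with $X = (X_1, \ldots, X_m)$, and that $A_{ik}$ in the per-unit equation denotes the diagonal block $A_{iik}$. The i-th block row of the full regression then gives:

```latex
\begin{aligned}
X_i(t+1) &= \sum_{k=0}^{n} \sum_{j=1}^{m} A_{ijk}\, X_j(t-k) \\
         &= \sum_{k=0}^{n} A_{iik}\, X_i(t-k) \;+\; \sum_{k=0}^{n} \sum_{j \ne i} A_{ijk}\, X_j(t-k) \\
         &= \sum_{k=0}^{n} A_{ik}\, X_i(t-k) \;+\; C_i .
\end{aligned}
```

Under this reading, $C_i$ collects exactly the contributions of the other units' data to unit i, which is consistent with it being the quantity exchanged between modules over MPI.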
The parameters of the regression equation can be obtained by automatic fitting with a standard numerical package (for example, the CRAN R statistical computing packages).
The prediction module collects historical data to build the database $X(t), X(t-1), \ldots, X(t-P)$, where P is an empirical parameter defining the time dependence of X.
The regression parameters of each module are estimated from the historical data and used to compute $F_i$. In the prediction model, the regression parameters give an estimate of the mean value of $X_i(t+1)$.
The judgment module then computes the difference between the predicted and measured values of $X_i(t+1)$.
If the difference is smaller than the standard deviation range of the sample $\{X_i(t), X_i(t-1), \ldots, X_i(t-P)\}$, prediction model $F_i$ returns a negative acknowledgement to the decision unit. If all modules return negative acknowledgements, the decision unit stores $X(t+1)$ in the database and instructs each prediction model to synchronize the latest sample data and regression parameters with the database, ready for predicting $X(t+2)$.
If, for that next prediction, the difference is larger than the standard deviation range of the sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$, prediction model $F_i$ returns a positive acknowledgement to the decision unit. The decision unit marks $X(t+2)$ as an anomalous event in the database. It then instructs prediction model $F_i$ to update its regression parameters from the parameter database, but to keep using the old sample values to define the standard deviation for the next anomaly judgment.
High availability in this patent is achieved through the parallel data processing model (parallel model). If one or more data processing modules fail, each failed module is replaced by a copy activated by the multi-server-host virtualization platform.
The matrices produced by the data processing modules for the above subsets are further standardized and sent to the main decision unit, which determines the event detection result.
Fig. 3 shows the envisaged multi-server-host virtualization framework. The Linux operating system is installed on each domU. Each domU hosts one anomaly detection unit, which communicates with the modules on other domUs through the Message Passing Interface (MPI). At the hardware level, communication within and between server hosts uses the TCP/IP protocol. The image copy of each virtual machine and the historical data of the utility pipe network are stored in the NAS unit. The NAS unit is a network storage technology: its data can be accessed by every anomaly detection module for data processing and, when a virtual machine or server host fails, by the virtual machine manager to migrate the affected virtual machine to a surviving server host.
Fig. 4 shows how the threads of a multi-core processor are divided among different virtual machines on the same server host. The first thread of the multi-core processor is designated dom0; it connects the virtual machines to the NAS unit and is responsible for creating and destroying virtual machines. The remaining computing resources are used for the virtual machines that run the anomaly detection units.
With reference to Figs. 3 and 4, the high-performance anomaly detection service architecture of the invention, based on multi-server-host virtualization, can be divided into three main parts.
[1] Virtualization of physical machines:
In this structure, a physical machine is simply a server host on which virtual machines are installed; it runs the parallel-model anomaly detection on the transport network measurement data. A hypervisor, or virtual machine manager, such as IBM z/VM, VMware ESX, or XenSource or Novell Xen, is installed on every physical machine. The hypervisor runs directly on the hardware without requiring a particular operating system, and several virtual machines can run on that hardware, as shown in Fig. 3.
The invention uses Xen's default CPU allocation policy, under which virtual machine dom0 is assigned the first thread of the multi-core processor on each server host (see Fig. 4). dom0 is the first virtual machine booted by Xen. It has certain privileges, such as direct access to hardware, access to all the I/O functions of the system, and the ability to interact with (create and manage) the other virtual machines, denoted domU.
The dom0 of each server host runs Heartbeat (a messaging system built on Xen that checks whether virtual machines are running properly). It performs intelligent fault detection on all domUs of the server host and exchanges information with the corresponding processes on the other server hosts. Because all server hosts are connected in a mesh network, the backup manager on each dom0 can compile an inventory of the health status of all virtual machines in the group. A lookup table defines on which migration node each original virtual machine is backed up, and every backup manager can access this table. Either the table is updated each time the distribution of virtual machines in the group changes after a backup, or the system administrator simply restores the hardware and virtual machine configuration to its initial state and leaves the lookup table unchanged.
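The health inventory and its use with the lookup table can be sketched as a simple timeout check. This Python is illustrative only and does not reproduce the Heartbeat software's interface: the timeout rule, the `"healthy"`/`"failed"` labels, and the functions `health_inventory` and `plan_migrations` are assumptions.

```python
def health_inventory(last_beat, now, timeout=3.0):
    """Mark a VM 'failed' when its last heartbeat is older than `timeout` seconds.

    last_beat maps VM name -> time of the most recent heartbeat (seconds).
    """
    return {
        vm: ("healthy" if now - seen <= timeout else "failed")
        for vm, seen in last_beat.items()
    }


def plan_migrations(inventory, lookup_table):
    """Return (vm, target host) pairs for every failed VM, read from the lookup table."""
    return [
        (vm, lookup_table[vm])
        for vm in sorted(inventory)
        if inventory[vm] == "failed"
    ]
```

Because every backup manager can read the same lookup table, any surviving dom0 could compute the same migration plan from its own copy of the inventory.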
A virtual machine is a virtualized environment; each virtual machine runs its own operating system and applications. In the invention, Linux is designated as the operating system of all virtual machines and physical machines. One anomaly detection unit is installed on each virtual machine as an instance of an MPI process.
A virtual network interface is assigned to each virtual machine. Each interface has its own MAC address and IP address.
The invention uses only TCP/IP as the interface for local data exchange within a physical machine (a physical server host with a certain number of virtual machines) and for inter-node communication. A guest virtual machine domU communicates directly with a virtual network driver, which serves the same function as an Ethernet card driver. Instead of translating instructions into hardware signals, however, this driver interacts with dom0 and communicates with the corresponding back-end interface in the driver domain. Every virtual machine therefore appears on the network as an individual server host with its own MAC address. Although TCP/IP is less well suited to transferring data between virtual machines on the same server host than the shared-memory transfer protocols offered by XenSocket and XWAY, the computation delay of the TCP/IP layer between dom0 and domU is far smaller than the time scale on which sensor data in the water network change. In addition, the use of the native Xen hypervisor and MPI code improves system stability.
It is a trend favourable that polycaryon processor starts generally to use.System of the present invention can be utilized this trend, and the dom U in same server main frame is moved on different threads from dom0, can allow thus them in different IPs, carry out.CPU separation realizes by body plan Xen hypervisor.Parallel the carrying out of MPI that it processes virtual i/o control protocol in dom0 and dom U, makes to reduce cpu resource competition.This can relax the delay issue that above-mentioned I/O concentrates MPI to process.On all virtual machines, all will move based on IP(Internet Protocol, Internet protocol) service, have following functions:
[a] is from the Data dissemination unit subscription data bag of IP coupling.
[b] is by transmission network sensing data subset input abnormity detecting unit, and it is a to NAS to make a copy for.
The prediction module of [c] abnormity detecting unit and other virtual machines from the Data dissemination unit subscription data bag of different IP exchange deal with data.Prediction module is to go statistically to analyze data based on semi-supervised learning framework, thereby the Distribution Statistics prediction without distributed network expecting varialbe state under abnormal conditions in hypothesis is provided.In system in the present invention, abnormity detecting unit is compiled on the virtual machine of MPI program in identical or different physical machine and moves.In the present invention, we use the acquiescence ICP/IP protocol on Xen to transmit data in different MPI programs.Therefore, we do not need data reconstruction to transmit the agreement that application programming interfaces are deacclimatized the inside and outside data transmission of different server main frame and communication control.By the computing relay of the tcp/ip layer between dom0 and dom U, be far smaller than the transformation period of sensing data in the network of rivers.In addition,, due to the use of local Xen hypervisor and MPI code, the stability of a system has improved.
[d] With reference to Fig. 2, the analysis module of the anomaly detection unit receives statistical prediction data from the prediction module. These data may include the distribution and variance of the possible range of values, as well as other statistical indicators. The residual at each time step must be classified either as consistent with the background water quality value or as an outlier. Based on the statistical prediction data, the analysis module estimates the regression parameters of the next data subset obtained from the data distribution unit, in order to compute the predicted value of that next data subset. The next data subset has the same time step as the historical data and a consistent background water quality value.
[e] With reference to Fig. 2, the judgment module of the anomaly detection unit evaluates the degree of deviation between the predicted value and the online sensor data. Although the absolute value of the threshold in original units may vary with the water quality index, the acceptable relative deviation of the predicted distributed network state is fixed at a specific number of standard deviations. A machine-learning-based anomaly event discrimination and classification module is then used as the decision tool. The judgment module can access the historical data database stored in the network-attached storage unit.
[f] With reference to Fig. 2, the results are passed to the master anomaly detection unit, which runs on a different virtual machine. The master anomaly detection unit analyses the results of all parallel anomaly detection processes and determines the class of the anomalous event and the location in the utility pipe network at which it occurred.
[2] Use of network-attached storage
No server host needs its own disk. The virtual machine images are stored on the NAS, which can be accessed by any physical machine. Consequently, any virtual machine can run on any physical machine without a further backup on a local disk.
[3] Monitoring and high-availability control
The REMUS package in the Xen framework provides high-availability protection for ordinary virtual machines running on the Xen hypervisor. In the system of the present invention, REMUS takes checkpoints of each virtual machine at high frequency (20 to 40 checkpoints per second) and replicates them to another server host, so that live migration can be completed when a physical machine or a particular virtual machine fails, whatever the cause (hardware or software error). On failover, the migrated virtual machine resumes on a different server host from the most recent checkpoint, as though no failure had occurred. All run-time state of the operating system, including active TCP connections, is preserved. Running processes continue as usual, and all files, network state and disks remain intact. At worst the TCP stack sees packet loss, and the lost packets are retransmitted.
Using REMUS protects the virtual machines against crash failures. This property benefits the parallel MPI computation used for anomaly detection, because all computing processes must remain synchronized.
Under REMUS, servers operate in pairs in active/standby mode: the active server sends checkpoint information to the standby server for backup in a timely manner, based on a heartbeat signal.
In the present invention, each server host operates in active mode and backup mode simultaneously. A lookup table is designed that distributes the virtual machine checkpoint information of a given server host to other server hosts acting in backup mode, so that live migration can be carried out when a failure occurs.
Because solid-state drives now offer high availability at an acceptable price, the server hosts that back up virtual machines can use solid-state drives as fast local storage for checkpoints, which enables very fast (sub-second) virtual machine restarts.
Fig. 5 illustrates the method of managing the high-availability servers that execute the online anomaly detection algorithm in parallel. The figure shows how a failed virtual machine or server host is replicated in a configuration of 4 server hosts and 16 virtual machines, with each server host running 4 virtual machines. The backup manager running on dom0 of each surviving server host is configured to handle the backup of failed virtual machines according to the lookup table. For example, server host #1, besides running virtual machines A, B, C and D, is also responsible for backing up the checkpoint information of E, I, M and F. If one of E, I, M and F (say E) fails, the backup manager on server host #1 activates the corresponding checkpoint backup, so that the failed virtual machine (E) resumes on server host #1 (at which point A, B, C, D and E are all in active mode on server host #1). After the automatic backup process has completed, the system administrator may, if optimal system performance is required, intervene to manage the workload of each server host and migrate virtual machines online to different physical server hosts. The system administrator then needs either to update the lookup table that controls the virtual machine backup process so that it reflects the virtual machines redistributed among the surviving server hosts, or simply to restore the hardware and virtual machine configuration to its initial state and leave the lookup table unchanged. As an example of the first option, suppose server host #1 crashes. The new lookup table will then be: server host #2 runs A, E, F, G, H and backs up B, I, N, O, P; server host #3 runs B, I, J, K, L and backs up A, C, D, E, M; server host #4 runs C, D, M, N, O, P and backs up I, J, K, M, N. The design principle of the lookup table is that, after any one server host crashes, all virtual machines that were running on it are distributed evenly among the remaining healthy server hosts and continue to run.
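The lookup-table failover described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented backup manager: the function names are hypothetical, and only the entries for server host #1 come from the text. The backup lists of hosts #2 to #4 are assumed values chosen so that a crash of host #1 reproduces the redistribution given in the example.

```python
def initial_table():
    """Return a fresh lookup table: host -> VMs it runs / VMs it backs up.

    Host 1 follows the example in the text; the backup lists of hosts 2-4
    are assumptions chosen to be consistent with that example.
    """
    return {
        1: {"run": ["A", "B", "C", "D"], "backup": ["E", "I", "M", "F"]},
        2: {"run": ["E", "F", "G", "H"], "backup": ["A", "J", "N", "K"]},
        3: {"run": ["I", "J", "K", "L"], "backup": ["B", "G", "O", "P"]},
        4: {"run": ["M", "N", "O", "P"], "backup": ["C", "D", "H", "L"]},
    }


def restart_from_backup(table, vm):
    """Resume a failed VM on the host that holds its checkpoint backup."""
    backup_host = next(
        (host for host, entry in table.items() if vm in entry["backup"]), None
    )
    if backup_host is None:
        raise LookupError(f"no surviving backup for VM {vm}")
    # Drop the VM from whichever surviving host was running it, if any.
    for entry in table.values():
        if vm in entry["run"]:
            entry["run"].remove(vm)
    # The backup copy becomes the running instance.
    table[backup_host]["backup"].remove(vm)
    table[backup_host]["run"].append(vm)
    return backup_host


def handle_host_crash(table, crashed_host):
    """Restart every VM of a crashed host on the hosts backing them up."""
    failed_vms = table.pop(crashed_host)["run"]
    return {vm: restart_from_backup(table, vm) for vm in failed_vms}


table = initial_table()
print(restart_from_backup(table, "E"))  # host on which E resumes
print(table[1]["run"])                  # VMs now active on host 1

table = initial_table()
print(handle_host_crash(table, 1))      # new host of each VM from host 1
```

After a host crash the checkpoints that the crashed host was holding are gone, so the sketch leaves some virtual machines without a backup. This corresponds to the step in the text where the administrator updates the lookup table or restores the initial configuration.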
Fig. 5 adopts, as an embodiment, a configuration of only 4 server hosts, each running 4 virtual machines. It should be noted that this merely illustrates one possible embodiment of the present invention and is not intended to limit the scope of its claims; the present invention includes, but is not limited to, the above detailed description and specific examples. The present invention covers all adjustments and modifications within the scope of its core content, and all equivalent implementations or changes made without departing from the present invention fall within the scope of the claims of this case.

Claims (10)

1. An online pipe network anomaly detection system based on machine learning, characterized in that it comprises:
a data acquisition unit for collecting real-time data of the online pipe network, merging the real-time data according to location area, and grouping them into different data packets;
a data distribution unit for receiving the data packets, extracting data elements from them, formatting the data packets and then dividing them into a plurality of data subsets;
a plurality of anomaly detection units for receiving the data subsets in one-to-one correspondence and predicting anomalies in the data subsets based on a semi-supervised learning framework, wherein the plurality of anomaly detection units process data in parallel and transfer data to one another via MPI.
2. The online pipe network anomaly detection system based on machine learning according to claim 1, characterized in that the anomaly detection units are installed on virtual machines, each virtual machine corresponding to one anomaly detection unit.
3. The online pipe network anomaly detection system based on machine learning according to claim 2, characterized in that the system further comprises a plurality of server hosts interconnected in a local area network in a fully connected topology; each server host is equipped with a multi-core processor, which is partitioned by thread into a plurality of virtual machines on the same server host, wherein the first thread is designated as virtual machine dom0 and the other threads are allocated to virtual machines domU; virtual machine dom0 is used to access the hardware of the server host and to interact with the virtual machines domU; the virtual machines domU are used to host the anomaly detection units; and each virtual machine domU of each running server host has a corresponding backup on the other running server hosts.
4. The online pipe network anomaly detection system based on machine learning according to claim 3, characterized in that the anomaly detection unit comprises:
a prediction module for building a prediction model from a multiple regression equation, so as to provide statistical prediction data of the expected variable state of the data subset under the assumption that no anomaly is present, the prediction module also exchanging statistical prediction data with the other anomaly detection units;
an analysis module for receiving the statistical prediction data and, based on them, estimating the regression parameters of the next data subset obtained from the data distribution unit, so as to compute the predicted value of the next data subset, the next data subset having the same time step as the historical data and a consistent pipe network background;
a judgment module for judging the anomaly status of the next data subset from its predicted value and its actual value;
a decision module for receiving the anomaly judgment result produced by the judgment module and updating the prediction model according to that result.
5. The online pipe network anomaly detection system based on machine learning according to claim 4, characterized in that the method by which the prediction module builds the prediction model comprises the following steps:
Step 11: model the data according to the variation over time of a parameter at a given point in the online pipe network:
$$X_i(t+1) = F_i\big(X(t), X(t-1), X(t-2), \ldots, X(t-n)\big) \tag{1}$$
where $F_i$ is the prediction model of the $i$-th anomaly detection unit, $i$ is a positive integer not greater than the total number of anomaly detection units, $X_i$ is the input data of the $i$-th anomaly detection unit, $X(t), X(t-1), X(t-2), \ldots, X(t-n)$ are historical data, and $X_i(t+1)$ is the next data subset;
Step 12: build the prediction model from a multiple regression equation:
$$X_i(t+1) = A_{i0} X_i(t) + A_{i1} X_i(t-1) + \cdots + A_{in} X_i(t-n) + C_i \tag{2}$$
where $A_{i0}, A_{i1}, \ldots, A_{in}$ are the regression parameters of prediction model $F_i$ and $C_i$ is the random error parameter of the $i$-th anomaly detection unit; the prediction module exchanges statistical prediction data by passing $C_i$ via MPI;
Step 13: solve for the random error parameter $C_i$:
$$C_i = \sum_{j \neq i}^{n} A_{ij0} X_j(t) + \sum_{j \neq i}^{n} A_{ij1} X_j(t-1) + \cdots + \sum_{j \neq i}^{n} A_{ijn} X_j(t-n) \tag{3}$$
In formula (3), $A_{ij0}, A_{ij1}, \ldots, A_{ijn}$ are obtained by automatic matching against normal data packets.
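As a reading aid, formulas (2) and (3) can be written out as a minimal Python sketch. The function names, the dictionary layout and all numeric values are assumptions made for illustration. The sketch also assumes that the sums in formula (3) run over every anomaly detection unit other than unit i.

```python
def cross_unit_term(i, histories, cross_coeffs):
    """C_i from formula (3): contributions of all other units j != i.

    histories[j]    = [X_j(t), X_j(t-1), ..., X_j(t-n)]
    cross_coeffs[j] = [A_ij0, A_ij1, ..., A_ijn] for the fixed unit i
    """
    return sum(
        a * x
        for j, history in histories.items()
        if j != i
        for a, x in zip(cross_coeffs[j], history)
    )


def predict_next(i, histories, own_coeffs, cross_coeffs):
    """X_i(t+1) from formula (2): own-history regression plus C_i."""
    own = sum(a * x for a, x in zip(own_coeffs, histories[i]))
    return own + cross_unit_term(i, histories, cross_coeffs)


# Two units with n = 1; every number below is made up for illustration.
histories = {1: [2.0, 1.0], 2: [4.0, 3.0]}
own_coeffs = [0.5, 0.25]          # A_10, A_11
cross_coeffs = {2: [0.1, 0.2]}    # A_120, A_121
print(predict_next(1, histories, own_coeffs, cross_coeffs))
```

In the claimed system the values of X_j held by other units would arrive through the MPI exchange. The sketch simply receives them as a dictionary.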
6. The online pipe network anomaly detection system based on machine learning according to claim 5, characterized in that the method by which the judgment module judges the anomaly status of the next data subset comprises the following steps:
Step 31: compute the difference between the predicted value and the measured value of $X_i(t+1)$;
Step 32: collect historical data $X_i(t), X_i(t-1), \ldots, X_i(t-P)$ and build a database from them, where $P$ is the empirical time-relationship parameter of the $i$-th anomaly detection unit and $P$ is a positive integer;
Step 33: construct the historical database sample $\{X_i(t), X_i(t-1), \ldots, X_i(t-P)\}$ and compute the standard deviation range of this sample;
Step 34: compare the difference with the standard deviation range:
if the difference is smaller than the standard deviation range, the judgment module returns a negative acknowledgement (NAK) to the decision module; if the judgment modules of all anomaly detection units return negative acknowledgements to the decision module, the decision module stores all $X_i(t+1)$ in the database and instructs the corresponding judgment modules to synchronize the latest sample data and regression parameters with the database, in preparation for predicting $X_i(t+2)$;
when $X_i(t+2)$ is predicted, if the difference is greater than the standard deviation range of the updated sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$, the corresponding judgment module returns a positive acknowledgement to the decision module, and the decision module marks $X_i(t+2)$ as an anomalous event in the database; the decision module instructs this judgment module to update the regression parameters from the database but to use the old sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$ to define the standard deviation for the anomaly judgment of $X_i(t+3)$.
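The judgment procedure of steps 31 to 34 can be illustrated with a short Python sketch. The class name and the multiplier `k` are assumptions, since the claim speaks only of a "standard deviation range". The sample standard deviation from Python's `statistics` module stands in for whatever estimator an actual implementation would use.

```python
import statistics
from collections import deque


class JudgmentModule:
    """Flags a value as anomalous when |measured - predicted| exceeds
    k standard deviations of the current historical sample."""

    def __init__(self, history, k=3.0):
        # history is ordered oldest first; its length fixes the sample size.
        self.sample = deque(history, maxlen=len(history))
        self.k = k  # assumed multiplier, not specified in the claim

    def judge(self, predicted, measured):
        limit = self.k * statistics.stdev(self.sample)
        anomalous = abs(measured - predicted) > limit
        if not anomalous:
            # Negative acknowledgement: accept the value and slide the sample.
            self.sample.append(measured)
        # Positive acknowledgement: the old sample is kept for the next step.
        return anomalous


judge = JudgmentModule([10.0, 10.2, 9.8, 10.1, 9.9])
print(judge.judge(predicted=10.0, measured=10.1))  # small residual
print(judge.judge(predicted=10.0, measured=12.0))  # large residual
print(list(judge.sample))                          # sample after both steps
```

Keeping the old sample after an anomaly, as the claim requires, prevents a single outlier from inflating the standard deviation used for the following judgment.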
7. The online pipe network anomaly detection system based on machine learning according to claim 6, characterized in that the system further comprises a network-attached storage unit for storing the image copies of all virtual machines and the historical data of the online pipe network; every anomaly detection unit can access the data in this network-attached storage unit, and virtual machine dom0 handles the communication between its corresponding virtual machines domU and the network-attached storage unit.
8. The online pipe network anomaly detection system based on machine learning according to claim 7, characterized in that the backup method is as follows: the checkpoint information of the virtual machines domU on each running server host is distributed for backup according to the load of the other running server hosts, so as to achieve the best balanced mode of operation; after the backup, a lookup table is generated automatically, which defines the migration node for the backup of each failed virtual machine domU, so that live migration can be carried out when a virtual machine domU or a server host fails.
9. The online pipe network anomaly detection system based on machine learning according to claim 8, characterized in that a backup manager is provided on each virtual machine dom0 for compiling an inventory of the health status of the virtual machines domU corresponding to that virtual machine dom0; through the backup manager, virtual machine dom0 is configured to handle the backup of failed virtual machines according to the lookup table.
10. The online pipe network anomaly detection system based on machine learning according to claim 1, characterized in that the method by which the data distribution unit distributes the data subsets comprises the following steps:
the data distribution unit receives the data packets and filters out interference, and even erroneous information, introduced into the data packets by measurement, transmission or collection;
the data elements in the data packets are extracted and the data packets are converted to a unified format;
the data packets are divided into the corresponding number of data subsets, with the data in the data subsets kept balanced;
the data subsets are encrypted and distributed to the anomaly detection units through a publish-subscribe model.
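The first three steps of the distribution method in claim 10 can be sketched in Python as follows. The record fields (`sensor`, `t`, `value`), the function name and the round-robin split are assumptions made for illustration. Encryption and the publish-subscribe transport are deliberately left out of the sketch.

```python
import math


def distribute(packets, n_units):
    """Filter, normalise and split sensor readings into balanced subsets."""
    # Step 1: drop readings with missing or non-finite values.
    clean = [
        p for p in packets
        if isinstance(p.get("value"), (int, float)) and math.isfinite(p["value"])
    ]
    # Step 2: convert every reading to one unified record format.
    unified = [
        {"sensor": str(p["sensor"]), "t": float(p["t"]), "value": float(p["value"])}
        for p in clean
    ]
    # Step 3: round-robin split so that subset sizes differ by at most one.
    return [unified[k::n_units] for k in range(n_units)]


packets = [
    {"sensor": "s1", "t": 0, "value": 7.1},
    {"sensor": "s2", "t": 0, "value": None},          # lost reading
    {"sensor": "s3", "t": 0, "value": float("nan")},  # corrupted reading
    {"sensor": "s4", "t": 0, "value": 6.9},
    {"sensor": "s5", "t": 0, "value": 7.0},
]
subsets = distribute(packets, 2)
print([len(s) for s in subsets])
```

Round-robin splitting is only one way of keeping the subsets balanced; the claim does not prescribe a particular partitioning rule.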
CN201310581956.0A 2013-11-19 2013-11-19 Online pipe network anomaly detection system based on machine learning Active CN103580960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310581956.0A CN103580960B (en) 2013-11-19 2013-11-19 Online pipe network anomaly detection system based on machine learning

Publications (2)

Publication Number Publication Date
CN103580960A true CN103580960A (en) 2014-02-12
CN103580960B CN103580960B (en) 2017-01-11

Family

ID=50051937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310581956.0A Active CN103580960B (en) 2013-11-19 2013-11-19 Online pipe network anomaly detection system based on machine learning

Country Status (1)

Country Link
CN (1) CN103580960B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719849A (en) * 2009-11-03 2010-06-02 清华大学 Pattern clustering-based parallel network flow characteristic detection method
CN101980480A (en) * 2010-11-04 2011-02-23 西安电子科技大学 Semi-supervised anomaly intrusion detection method
CN102045381A (en) * 2010-10-13 2011-05-04 北京博大水务有限公司 On-line monitoring system for regenerated water pipe network
CN102635787A (en) * 2012-04-16 2012-08-15 中山大学 Automatic detection device and detection method for water leakage of water pipeline
CN102970245A (en) * 2012-11-21 2013-03-13 北京奇虎科技有限公司 Data transmission method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
POSTLOCHE O. SILVA GIRAO P, DIAS PEREIRA J M.: ""Wireless water quality monitoring system based on field point technology and kohonen maps"", 《PROC OF IEEE SENSORS》 *
庄宪骥: ""基于S7-200自来水管网监控系统设计"", 《中北大学》 *

Also Published As

Publication number Publication date
CN103580960B (en) 2017-01-11

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: 528200 Guangdong Province Nanhai District of Foshan city streets Guicheng Shilong Road No. 1 joybon IFC 2 room 1707

Patentee after: Foshan science and Technology Co., Ltd.

Address before: 528200 Guangdong city of Foshan province sea road Han day Technology City Building No. 8 901-3

Patentee before: FOSHAN LUOSIXUN ENVIRONMENTAL PROTECTION TECHNOLOGY CO., LTD.