CN103580960B - Online pipe network anomaly detection system based on machine learning - Google Patents

Publication number
CN103580960B
Authority
CN
China
Prior art keywords
data
virtual machine
pipe network
abnormity detecting
abnormity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310581956.0A
Other languages
Chinese (zh)
Other versions
CN103580960A (en)
Inventor
陈尊裕
张得志
李丹
胡斯洋
龙圣
郑思明
吴珏其
周振邦
李维海
王红旗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan Science And Technology Co Ltd
Original Assignee
Foshan Luosixun Environmental Protection Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan Luosixun Environmental Protection Technology Co ltd
Priority to CN201310581956.0A
Publication of CN103580960A
Application granted
Publication of CN103580960B
Legal status: Active
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an online pipe network anomaly detection system based on machine learning. The system comprises a data collection unit, a data distribution unit and a plurality of anomaly detection units. The data collection unit collects real-time data from the online pipe network, merges the data by location area and groups it into different data packets. The data distribution unit receives the data packets, extracts the data elements from them, formats the packets and divides them into a plurality of data subsets. The anomaly detection units receive the data subsets in one-to-one correspondence and predict anomalies in the data subsets based on a semi-supervised machine learning framework. The anomaly detection units process data in parallel and transfer data to one another through MPI. The system meets the server availability requirements of machine-learning-based online anomaly detection units while avoiding the introduction of extra hardware that would sit idle on standby.

Description

An online pipe network anomaly detection system based on machine learning
Technical field
The present invention relates to utility pipe network monitoring technology, and in particular to an online pipe network anomaly detection system based on machine learning.
Background
Advances in sensor technology allow sensors to measure parameters across an environmental area with high spatial and temporal resolution. The time-series data collected by the sensors is continuously fed into storage, forming a data stream. Taking waterworks operation as an example, sensor data can include hydraulic parameters and water quality indices. These data can be used for tasks such as abnormal-condition detection, which identifies data anomalies from historical patterns or model predictions. An abnormal condition may be a pipeline leak or a contamination incident. Pipe networks cover a large geographic area, and water-use patterns are highly complex because of changes in weather, season, holidays and community demographics, so manual methods are not adequate for this work. Machine learning techniques based on historical data are therefore the only feasible approach to online anomaly detection. Machine learning techniques can be roughly divided into three classes: (a) purely data-driven analysis; (b) rule-based; (c) physical-model-based. The classification depends on which parameters are relied on to track and predict current and future trends in the sensor data and the associations between data groups. First, the anomaly detection system sets a baseline from the historical data of the normal system or sensing system. Thereafter, any activity that deviates from this baseline is considered an anomaly.
In addition, because real abnormal data must be distinguished from non-abnormal data (false alarms), a computing system based on replication mechanisms and redundancy strategies is still needed to support continuous online data acquisition and to run the data analysis algorithms.
Summary of the invention
In view of the above deficiencies, the object of the present invention is to provide an online pipe network anomaly detection system based on machine learning which, using hardware virtualization across multiple server hosts and a publish-subscribe data distribution strategy, meets the server availability requirements of machine-learning-based online anomaly detection units while avoiding the introduction of extra idle standby hardware.
To achieve the above object, the present invention adopts the following technical solution:
An online pipe network anomaly detection system based on machine learning, comprising:
a data acquisition unit, for collecting real-time data of the online pipe network, merging the real-time data by location area and grouping it into different data packets;
a data distribution unit, for receiving the data packets, extracting the data elements from them, formatting the packets and dividing them into a plurality of data subsets;
a plurality of anomaly detection units, for receiving the corresponding data subsets in one-to-one correspondence and predicting anomalies in the data subsets based on a semi-supervised learning framework; the anomaly detection units process data in parallel and transfer data to one another via MPI.
Each anomaly detection unit is installed on a virtual machine, with one anomaly detection unit per virtual machine.
The system further comprises a plurality of server hosts connected to one another in a LAN by a fully connected topology. Each server host is equipped with at least one multi-core processor whose threads are partitioned among multiple virtual machines on the same server host: the first thread is assigned to virtual machine dom0 and the other threads are assigned to virtual machines dom U. dom0 accesses the hardware of the server host and interacts with the dom U virtual machines; the dom U virtual machines host the anomaly detection units. Each dom U on a running server host has a corresponding backup on another running server host.
Each anomaly detection unit comprises:
a prediction module, for building a prediction model from a multiple regression equation so as to provide statistical prediction data on the expected variable state of the data subset under the assumption that no anomaly is present; the prediction module also exchanges statistical prediction data with the other anomaly detection units;
an analysis module, for receiving the statistical prediction data and using it to estimate the regression parameters for the next data subset obtained from the data distribution unit, so as to compute the predicted value of that next data subset; the next data subset has the same time step as the historical data and a consistent pipe network background;
a judgment module, for judging whether the next data subset is abnormal from its predicted value and its actual value;
a decision module, for receiving the anomaly judgment made by the judgment module and updating the prediction model accordingly.
The prediction module builds the prediction model by the following steps:
Step 11: model how a parameter at a given point in the online pipe network changes over time:
$$X_i(t+1) = F_i\big(X(t), X(t-1), X(t-2), \ldots, X(t-n)\big) \tag{1}$$
where $F_i$ is the prediction model of the $i$-th anomaly detection unit, $i$ is a positive integer not greater than the total number of anomaly detection units, $X_i$ is the input data of the $i$-th anomaly detection unit, $X(t), X(t-1), X(t-2), \ldots, X(t-n)$ are historical data, and $X_i(t+1)$ is the next data subset;
Step 12: build the prediction model from a multiple regression equation:
$$X_i(t+1) = A_{i0} X_i(t) + A_{i1} X_i(t-1) + \cdots + A_{in} X_i(t-n) + C_i \tag{2}$$
where $A_{i0}, A_{i1}, \ldots, A_{in}$ are the regression parameters of the prediction model $F_i$ and $C_i$ is the random error parameter of the $i$-th anomaly detection unit; the prediction module exchanges statistical prediction data via MPI by means of $C_i$;
Step 13: solve for the random error parameter $C_i$:
$$C_i = \sum_{j \neq i}^{n} A_{ij0} X_j(t) + \sum_{j \neq i}^{n} A_{ij1} X_j(t-1) + \cdots + \sum_{j \neq i}^{n} A_{ijn} X_j(t-n) \tag{3}$$
In equation (3), $A_{ij0}, A_{ij1}, \ldots, A_{ijn}$ are obtained by automatic matching from a standard numerical package.
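The three steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the source: each unit's data is reduced to a single scalar per time step, and the names `history`, `own`, `cross`, `coupling_term` and `predict_next` are hypothetical.

```python
# history[j][k] holds X_j(t-k): the value seen by unit j, k steps in the past.
# own[i][k] holds A_ik, the unit's own regression parameters (equation 2).
# cross[i][j][k] holds A_ijk (equation 3); entries with j == i are ignored.

def coupling_term(i, cross, history):
    """C_i from equation (3): weighted lagged values of all other units j != i."""
    total = 0.0
    for j, series in enumerate(history):
        if j == i:
            continue
        total += sum(a * x for a, x in zip(cross[i][j], series))
    return total


def predict_next(i, own, cross, history):
    """X_i(t+1) from equation (2): the unit's own lagged values plus C_i."""
    own_part = sum(a * x for a, x in zip(own[i], history[i]))
    return own_part + coupling_term(i, cross, history)
```

In a deployment along the lines described, each unit would evaluate only its own `predict_next` and obtain the other units' contributions over MPI rather than from a shared list.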
The judgment module judges whether the next data subset is abnormal by the following steps:
Step 31: compute the difference between the predicted value and the measured value of $X_i(t+1)$;
Step 32: collect historical data to build the database $X_i(t), X_i(t-1), \ldots, X_i(t-P)$, where $P$ is an empirical parameter of the time relationship of the $i$-th anomaly detection unit and is a positive integer;
Step 33: construct the historical sample $\{X_i(t), X_i(t-1), \ldots, X_i(t-P)\}$ and compute the standard deviation range of this sample;
Step 34: compare the difference with the standard deviation range:
If the difference is less than the standard deviation range, the judgment module returns a negative signal to the decision module. If the judgment modules of all anomaly detection units return negative signals, the decision module stores all $X_i(t+1)$ in the database and instructs each judgment module to synchronize the latest sample data and regression parameters with the database, ready for predicting $X_i(t+2)$;
When predicting $X_i(t+2)$, if the difference exceeds the standard deviation range of the updated sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$, the corresponding judgment module returns an affirmative signal to the decision module, and the decision module marks $X_i(t+2)$ as an anomalous event in the database. The decision module instructs this judgment module to update its regression parameters from the database, but to keep using the old sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$ to define the standard deviation for the anomaly judgment of $X_i(t+3)$.
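The judgment and sample-update logic of steps 31 to 34 might look like the sketch below for a single unit with scalar data. The source does not define the "standard deviation range" precisely; this sketch assumes one sample standard deviation, and the function names `is_anomalous` and `step` are hypothetical.

```python
import statistics


def is_anomalous(predicted, measured, sample):
    """Affirmative signal when the prediction error exceeds the sample's
    standard deviation (assumed here to be the 'standard deviation range')."""
    return abs(measured - predicted) > statistics.stdev(sample)


def step(sample, predicted, measured, anomalies, label):
    """Apply one judgment and return the sample to use for the next step.

    `sample` is ordered newest first. A normal value slides the window forward;
    an anomalous value is recorded and the old sample is kept unchanged.
    """
    if is_anomalous(predicted, measured, sample):
        anomalies.append(label)
        return sample
    return [measured] + sample[:-1]
```

The sketch leaves out the cross-unit condition that all judgment modules must report negative before the database is updated; a decision module would check that before calling `step` for each unit.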
The system further comprises a network attached storage (NAS) unit for storing mirror copies of the historical data of all virtual machines and of the online pipe network. Every anomaly detection unit can access the data in the NAS unit, and virtual machine dom0 handles the communication between its corresponding dom U virtual machines and the NAS unit.
The backup method is as follows: the checkpoint information of the dom U virtual machines on each running server host is distributed for backup according to the load of the other running server hosts, so as to achieve an optimally balanced mode of operation. After the backup, a lookup table is generated automatically. The table defines the node to which the backup of a dom U is migrated on an initial failure, so that live migration can be performed when a dom U or a server host fails.
A backup manager is provided on each dom0 to compile the health status of that dom0's dom U virtual machines into a list; through the backup manager, dom0 is configured to handle the backup of failed virtual machines according to the lookup table.
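As a rough illustration of how such a lookup table could be generated, the sketch below places each dom U's backup on the least-loaded other host. The greedy placement rule, the load measure (a simple count) and the names `build_lookup_table`, `vm_host` and `host_load` are assumptions for illustration; the source only states that backups are distributed according to host load.

```python
def build_lookup_table(vm_host, host_load):
    """Map each VM to the host that should receive its backup.

    vm_host:   dict of VM name -> host currently running it
    host_load: dict of host name -> current load (any comparable number)
    """
    load = dict(host_load)  # work on a copy so the caller's data is untouched
    table = {}
    for vm in sorted(vm_host):
        # A backup must live on a different host than the VM itself.
        candidates = [h for h in load if h != vm_host[vm]]
        # Pick the least-loaded candidate; break ties by name for determinism.
        target = min(candidates, key=lambda h: (load[h], h))
        table[vm] = target
        load[target] += 1  # account for the checkpoint just placed there
    return table
```

With three hosts and three dom U virtual machines, for example, no VM is ever assigned its own host as migration node, and backups spread toward the hosts with the lowest load.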
The data distribution unit distributes the data subsets by the following steps:
the data distribution unit receives the data packets and filters out the noise, and even erroneous information, introduced by measurement, transmission or collection;
it extracts the data elements from the data packets and converts the packets to a uniform format;
it divides the packets into the corresponding number of data subsets, keeping the data in the subsets balanced;
it encrypts the data subsets and distributes them to the anomaly detection units through the publish-subscribe architecture.
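The filtering, formatting and balancing steps can be sketched as follows. The record layout (dictionaries with `sensor`, `time` and `value` keys), the validity rule (drop missing or non-finite values) and the round-robin split are all assumptions made for illustration; encryption and the actual publish step are left out.

```python
import math


def clean(readings):
    """Drop readings whose value is missing or not a finite number."""
    return [r for r in readings
            if r["value"] is not None and math.isfinite(r["value"])]


def to_uniform(reading):
    """Convert one reading to a uniform (sensor, time, value) tuple."""
    return (reading["sensor"], float(reading["time"]), float(reading["value"]))


def split_balanced(records, m):
    """Deal records round-robin into m subsets; sizes differ by at most one."""
    return [records[k::m] for k in range(m)]


def prepare(readings, m):
    """Filter, format and split a packet into m data subsets."""
    return split_balanced([to_uniform(r) for r in clean(readings)], m)
```

A real splitter would more likely group by sensor or location so that each detection unit always sees the same sub-vector; round-robin is used here only because it makes the balance property easy to see.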
Compared with the prior art, the present invention has the following advantages:
1. Anomaly detection is performed on the online pipe network by machine learning, which provides a statistical-distribution prediction of the expected variable state of the distributed network under the no-anomaly assumption, improves the anomaly recognition rate of the online pipe network and saves a great deal of manpower.
2. The anomaly detection units process data in parallel, which reduces contention for CPU resources and meets the server availability requirement while avoiding the introduction of extra idle standby hardware.
3. There is no need to rebuild the data transfer application interface to adapt to the data transmission and communication control protocols inside and outside different server hosts. The computing delay between dom0 and dom U is far smaller than the time over which sensor data in the water network changes.
4. Each server host no longer needs its own disk: virtual machine images are stored on the NAS, where any physical machine can access them. Any virtual machine can therefore run on any physical machine without a further backup to local disk.
5. Virtual machine checkpoints are copied to another server host to enable live migration. If one or more data processing modules fail, each failed module is replaced by a copy activated by the multi-server-host virtualization platform.
6. During failover, a virtual machine that has been moved to a different server host resumes from its latest checkpoint. The full run-time state of the operating system, including active TCP connections, is preserved. Running processes continue as usual, and all files, network state and disks retain their integrity.
Brief description of the drawings
Fig. 1 shows the network architecture of the high-availability utility pipe network anomaly detection of the present invention;
Fig. 2 shows the parallel online anomaly detection architecture;
Fig. 3 shows the conceptual architecture of multi-server-host virtualization;
Fig. 4 shows the architecture in which multi-core processor threads are partitioned among different virtual machines on the same server host;
Fig. 5 describes the method of managing the high-availability servers that execute the online anomaly detection algorithms in parallel.
Detailed description
The invention is described in further detail below with reference to the drawings and a specific embodiment.
Embodiment
This embodiment takes anomaly detection in a water supply network as an example. The method is similar for other online pipe networks such as electric power, telecommunications, network, communication, heating and gas, and is not repeated here.
Fig. 1 shows the network architecture of the high-availability utility pipe network anomaly detection. In each group the sensors may be hydraulic or water quality sensors; the sensor data from geographically close locations is grouped into a data packet by the data acquisition unit and sent. The data distribution unit receives the sensor measurements, converts the data into the format required for subscribers' downstream processing, and publishes it. The server hosts at the operations center are connected to one another in a local area network (LAN) by a fully connected topology; a mesh topology is preferred for migrating virtual machines between server hosts. The NAS unit is connected to all physical server hosts via the LAN.
The sensors and instruments of the water supply network monitoring system collect data continuously. The data may comprise hydraulic data (such as flow velocity, flow rate, water pressure and water level) and water quality data (including free chlorine, turbidity, pH, conductivity, oxidation-reduction potential and total organic nitrogen). By analysing these indices, pipe leaks and contamination incidents in the public water network can be detected. Because of the routine operation of facilities such as tanks, pumps, gates and water sources, seasonal variation in the water supply, and fluctuations in water demand, these indices vary widely in a water distribution system. An incident detection system is therefore needed to distinguish routine variation in the sensor data from abnormal conditions.
The data acquisition unit comprises a SCADA (supervisory control and data acquisition) system and RTUs (remote terminal units). SCADA is a typical system for collecting real-time sensor data from a distribution network. In the SCADA system of the present invention (Fig. 1), regional RTUs merge and group the local sensor data. The RTUs digitize the data and attach time tags according to sensor type and acquisition time. The digitized sensor data is then sent to a data collection server; this can be done over a closed industrial network such as Modbus, LonWorks or BACnet.
The data distribution unit is based on a publish-subscribe architecture. It extracts data elements from the distribution network sensor data and converts them to a uniform format. Noise, and even errors, introduced by measurement, transmission or collection are filtered out beforehand. The data distribution unit organizes and formats the received data for further processing. After encryption, the data is transmitted over open TCP/IP Ethernet to each terminal that receives water quality data in its own format; in the present invention, the converted data is sent to the anomaly detection units at the operations center. The publish-subscribe transport protocol includes the decomposition rule for the data set $X = (X_1, X_2, \ldots, X_m)$: $X_1$ is sent to virtual machine #1, $X_2$ to virtual machine #2, ..., and $X_m$ to virtual machine #m.
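The decomposition rule in the paragraph above can be illustrated with a toy in-memory publish-subscribe broker. The `Broker` class, the `vm-k` topic names and the `distribute` helper are hypothetical and stand in for whatever messaging layer the system actually uses.

```python
class Broker:
    """Minimal in-memory publish-subscribe hub (illustration only)."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        """Register a callback to receive every payload published on topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        """Deliver payload to all subscribers; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


def distribute(broker, subsets):
    """Send subset X_k to the topic of virtual machine #k (1-based)."""
    return [broker.publish(f"vm-{k}", subset)
            for k, subset in enumerate(subsets, start=1)]
```

Each anomaly detection unit would subscribe once to its own topic at start-up, so the distribution side needs no knowledge of which physical host a unit currently runs on.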
Specifically, all distribution network sensor data is divided into different data packets by location area. The sensor data in each packet may be hydraulic or water quality data. The data distribution unit names each packet with the IP address of the anomaly detection unit. Fig. 2 shows the parallel online anomaly detection architecture. All anomaly detection units run in the multi-server-host virtual machine system, and the virtual machine monitor replicates a failed anomaly detection unit, so an anomaly detection unit can recover from the failure of a single virtual machine.
The anomaly detection units at the operations center work in parallel data processing mode. The system executes several anomaly detection algorithms simultaneously, each processing a subset of the sensor data packets; these subsets enter the operations center as independent packets. The anomaly detection units pass data to one another through the Message Passing Interface (MPI). The anomaly detection program of a unit may be written in C or Fortran and can run on Linux. When the program is written in C, MPI is a set of C functions. When it is written in Fortran, MPI is a set of subroutines (written in Fortran) for exchanging data between processes.
One algorithm is described in detail below:
The data input to the anomaly detection units covers the operating state of the whole pipe network. These data are obtained from sensor measurements. The database updates the latest state of the pipe network in real time.
A data model describes how a parameter at a given point in the pipe network changes over time, as follows:
$$X(t+1) = F\big(X(t), X(t-1), X(t-2), \ldots, X(t-n)\big),$$
where $X(t)$ is the set of parameters measured by the sensors. $F$ is the prediction model: it reads the historical data $X(t), X(t-1), X(t-2), \ldots$ from the database and infers the value of $X$ at the next time point $t+1$ from these past observations. Under normal conditions, the multiple regression equation
$$X(t+1) = A_0 X(t) + A_1 X(t-1) + \cdots + A_n X(t-n)$$
is sufficient to build the prediction model $F$ and determine the mean of $X(t+1)$, where $A_0$ to $A_n$ are coefficient matrices.
Because $X$ is a very large vector (a large network contains hundreds of parameters), the computation of $F$ can be split into several sub-processes using parallel programming under the MPI framework to simplify the calculation.
That is,
$$X = (X_1, X_2, \ldots, X_i, \ldots, X_m)$$
The length of each subvector is
And
$$X_i(t+1) = F_i\big(X(t), X(t-1), X(t-2), \ldots, X(t-n)\big),$$
$$X_i(t+1) = A_{i0} X_i(t) + A_{i1} X_i(t-1) + \cdots + A_{in} X_i(t-n) + C_i, \qquad i = 1, \ldots, m$$
$$C_i = \sum_{j \neq i}^{n} A_{ij0} X_j(t) + \sum_{j \neq i}^{n} A_{ij1} X_j(t-1) + \cdots + \sum_{j \neq i}^{n} A_{ijn} X_j(t-n)$$
Each $F_i$ can thus be computed independently on a virtual machine. Under the publish-subscribe data distribution strategy, $X_i$ is the input data of module $F_i$, and the modules exchange data through the Message Passing Interface (MPI) by means of the random error parameter $C_i$.
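One way to see where $C_i$ comes from is to partition the full regression by blocks. The derivation below is a sketch under two assumptions that the source does not state explicitly: each coefficient matrix of the full model is split into blocks $A_{ijk}$ conforming to the partition of $X$ into $m$ sub-vectors, and the unit's own parameters $A_{ik}$ are the diagonal blocks $A_{iik}$. The index $j$ is taken to run over the $m$ units.

```latex
\begin{aligned}
X_i(t+1) &= \sum_{k=0}^{n} \sum_{j=1}^{m} A_{ijk}\, X_j(t-k) \\
         &= \sum_{k=0}^{n} A_{iik}\, X_i(t-k)
            \;+\; \sum_{k=0}^{n} \sum_{j \neq i} A_{ijk}\, X_j(t-k) \\
         &= \sum_{k=0}^{n} A_{ik}\, X_i(t-k) \;+\; C_i .
\end{aligned}
```

Under this reading, $C_i$ collects exactly the terms that depend on other units' data, which is why it is the quantity exchanged between modules.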
The parameters of the regression equation can be obtained by automatic matching from a standard numerical package (such as the CRAN-R statistical computing package).
The prediction module collects historical data to build the database $X(t), X(t-1), \ldots, X(t-P)$, where $P$ is an empirical parameter defining the time relationship of $X$.
The regression parameters of each module are estimated from the historical data and used to compute $F_i$. The prediction model uses the regression parameters to estimate the mean of $X_i(t+1)$.
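To make the estimation step concrete, the sketch below fits the regression parameters of a single scalar series by ordinary least squares, solving the normal equations with Gaussian elimination. This is a stand-in chosen for illustration: the source obtains the parameters from a statistical package and does not specify the fitting method, and `fit_ar` with its arguments is a hypothetical name.

```python
def fit_ar(series, n_lags):
    """Least-squares fit of x(t+1) = a[0]*x(t) + ... + a[n_lags-1]*x(t-n_lags+1).

    `series` is in chronological order and must be long enough (and varied
    enough) for the normal equations to be non-singular.
    """
    rows, targets = [], []
    for t in range(n_lags - 1, len(series) - 1):
        rows.append([series[t - k] for k in range(n_lags)])
        targets.append(series[t + 1])

    p = n_lags
    # Normal equations G a = b, with G = R^T R and b = R^T y.
    G = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]

    # Forward elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(G[r][c]))
        G[c], G[piv] = G[piv], G[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = G[r][c] / G[c][c]
            for k in range(c, p):
                G[r][k] -= f * G[c][k]
            b[r] -= f * b[c]

    # Back substitution.
    a = [0.0] * p
    for r in reversed(range(p)):
        a[r] = (b[r] - sum(G[r][k] * a[k] for k in range(r + 1, p))) / G[r][r]
    return a
```

Note that `n_lags` here counts the lagged terms, so it corresponds to $n + 1$ in the notation of the regression equation.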
The judgment module then computes the difference between the predicted value and the measured value of $X_i(t+1)$.
If the difference is less than the standard deviation range of the sample $\{X_i(t), X_i(t-1), \ldots, X_i(t-P)\}$, the prediction model $F_i$ returns a negative signal to the decision unit. If all modules return negative signals to the decision unit, the decision unit stores $X(t+1)$ in the database and instructs each prediction model to synchronize the latest sample data and regression parameters with the database, ready for predicting $X(t+2)$.
If the difference is greater than the standard deviation range of the sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$, the prediction model $F_i$ returns an affirmative signal to the decision unit. The decision unit marks $X(t)$ as an anomalous event in the database. The decision unit instructs this prediction model $F_i$ to update its regression parameters from the parameter database, but to keep using the old sample values to define the standard deviation for judging anomalies.
In this patent, high availability is achieved through the parallel data processing model. If one or more data processing modules fail, each failed module is replaced by a copy activated by the multi-server-host virtualization platform.
The matrices produced by the data processing modules for the above subsets are further normalized and sent to the main decision unit, which determines the event detection result.
Fig. 3 shows the conceptual architecture of multi-server-host virtualization. The Linux operating system is installed on each dom U. Each dom U hosts one anomaly detection unit, which communicates with the modules on other dom U virtual machines via the Message Passing Interface (MPI). At the hardware level, TCP/IP is used for communication both within and between server hosts. Mirror copies of the historical data of each virtual machine and of the utility pipe network are stored in the NAS unit. The NAS unit is a network storage technology: its data can be accessed by every anomaly detection module for data processing, and can also be used by virtual machine management to migrate virtual machines to a surviving server host when a virtual machine or server host fails.
Fig. 4 shows multi-core processor threads partitioned among different virtual machines on the same server host. The first thread of the multi-core processor is assigned to dom0, which handles communication between the virtual machines and the NAS unit and is responsible for creating and destroying virtual machines. The remaining computing resources are used by the virtual machines that run the anomaly detection units.
With reference to Figs. 3 and 4, the architecture of the high-performance anomaly detection service system based on multi-server-host virtualization can be divided into three main parts.
[1] Physical machine virtualization:
In this architecture a physical machine is simply a server host on which virtual machines are installed; it performs the parallel anomaly detection on the distribution network monitoring data. A hypervisor (virtual machine manager), such as IBM z/VM, VMware ESX, or XenSource or Novell Xen, is installed to host all the virtual machines. The hypervisor runs directly on the hardware without a host operating system and can run multiple virtual machines on that hardware, as shown in Fig. 3.
The present invention uses Xen's default CPU allocation policy, under which virtual machine dom0 is assigned to the first thread of the multi-core processor of each server host (see Fig. 4). dom0 is the first virtual machine booted by Xen. It has certain privileges: it can access the hardware directly, has access to all of the system's I/O functions, and interacts with (creates and manages) the other virtual machines, denoted dom U.
dom0 on each server host runs Heartbeat (a messaging system built on Xen for checking whether virtual machines are running properly). It performs intelligent failure detection on all dom U virtual machines on the server host and exchanges information with the corresponding processes on other server hosts. Because all server hosts are connected in a mesh network, the backup manager on each dom0 can compile the health status of all virtual machines in the group into a list. A lookup table defines the node to which the backup of a virtual machine is migrated on an initial failure, and every backup manager can access this table. The table is updated each time the distribution of virtual machines in the group has changed and a backup has been performed; alternatively, the system administrator can simply restore the hardware and virtual machine configuration to the original state and leave the lookup table unchanged.
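The failure detection and lookup-table step can be sketched as follows. This does not use or imitate the API of the Heartbeat software; it only illustrates the idea with timestamps, and the names `failed_vms`, `plan_migrations`, `last_seen` and `timeout` are hypothetical.

```python
def failed_vms(last_seen, now, timeout):
    """Return the VMs whose most recent heartbeat is older than `timeout`."""
    return sorted(vm for vm, seen in last_seen.items() if now - seen > timeout)


def plan_migrations(last_seen, lookup_table, now, timeout):
    """Map each failed VM to the migration node defined in the lookup table."""
    return {vm: lookup_table[vm] for vm in failed_vms(last_seen, now, timeout)}
```

For example, with heartbeats last seen at times 100.0 and 91.0, a current time of 101.0 and a timeout of 5.0, only the second virtual machine is reported as failed and scheduled for migration to its table entry.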
A virtual machine is a virtualized environment; each virtual machine runs its own operating system and applications. In the present invention, Linux is designated as the operating system of all virtual machines and physical machines. An anomaly detection unit is installed on each virtual machine as one instance of an MPI process.
A virtual network interface is assigned to each virtual machine. Each interface has its own MAC address and IP address.
The present invention uses only the TCP/IP communication interface, both for local data exchange within a physical machine (a physical server host carrying a certain number of virtual machines) and for inter-node communication. The guest virtual machine dom U communicates directly with a virtual network driver, which functions like an Ethernet card driver. Instead of translating instructions into hardware signals, this driver interacts with dom0 so as to communicate with the corresponding back-end interface in the driver domain. As a result, every virtual machine appears on the network as an individual server host with its own MAC address. Although TCP/IP is less well suited to data transfer between virtual machines on the same server host than the shared-memory transfer protocols provided by XenSocket and XWay, the computing delay of the TCP/IP layer between dom0 and dom U is far smaller than the time over which sensor data in the water network changes. In addition, using the native Xen hypervisor and MPI code improves system stability.
The growing use of multi-core processors is a favourable trend. The system of the present invention can exploit it by running dom U and dom0 on different threads of the same server host, so that they execute on different cores. CPU separation is achieved by configuring the Xen hypervisor. It lets the virtual I/O control protocol in dom0 and the MPI processes in dom U run in parallel, which reduces contention for CPU resources. This mitigates the delay problem of the I/O-intensive MPI processes mentioned above. All virtual machines run IP (Internet Protocol) based services with the following functions:
The file distributing unit that [a] mates from IP subscribes to packet.
Transmission network sensing data subset is inputted abnormity detecting unit by [b], and it is a to NAS to make a copy for.
The prediction module of [c] abnormity detecting unit subscribes to packet from other from the file distributing unit of different IP Virtual machine exchange process data.Prediction module is to go statistically to analyze based on semi-supervised learning framework Data, thus the statistical distribution prediction of distributed network expecting varialbe state in the case of hypothesis is without exception is provided. In system in the present invention, abnormity detecting unit is compiled into MPI program in identical or different physical machine Run on virtual machine.In the present invention, we use the acquiescence ICP/IP protocol on Xen in different MPI journeys Sequence transmits data.Therefore, we need not rebuild data transfer application interface and deacclimatize different service Inside and outside device main frame, data transmit the agreement with communication control.By the tcp/ip layer between dom0 and dom U Computing relay is far smaller than the transformation period of sensing data in the network of rivers.Additionally, due to local Xen manages program With the use of MPI code, system stability improves.
[d] As shown in Fig. 2, the analysis module of the anomaly detection unit receives statistical prediction data from the prediction module, which may include the distribution of the range of possible values, the variance, and other statistical indicators. The residual at each time step must be classified either as consistent with the background water quality values or as an outlier. From the statistical prediction data, the analysis module estimates the regression parameters of the next data subset obtained from the data distribution unit, in order to calculate the predicted value of that next data subset. The next data subset and the historical data have the same time step and a consistent background water quality value.
[e] As shown in Fig. 2, the judgment module of the anomaly detection unit judges the degree of deviation between the predicted value and the online sensor data. Although the absolute value of the threshold in original units varies with the water quality index, the acceptable relative deviation from the predicted state of the distribution network is fixed at a specific standard deviation. A machine-learning-based module for discriminating and classifying anomalous events is then used as the decision tool. The judgment module can access the historical data stored in the database of the network attached storage unit.
[f] As shown in Fig. 2, the results are passed to a master anomaly detection unit running on a different virtual machine. The master anomaly detection unit analyses the results of all the preceding concurrent anomaly detection processes and determines the class of the anomalous event and its location in the facility pipe network.
[2] Use of network attached storage
Each server host no longer needs its own disk. Virtual machine images are stored on the NAS, which can be accessed by any physical machine. Any virtual machine can therefore run on any physical machine without a further backup on a local disk.
[3] Monitoring and control of high availability
The REMUS software package in the Xen framework provides high availability for ordinary virtual machines running on the Xen hypervisor. In the system of the present invention, to handle errors in a physical machine or in one specific virtual machine (for whatever reason, hardware or software), REMUS captures checkpoints of the virtual machine at high frequency (20-40 checkpoints per second) and copies them to another server host so that live migration can be completed. On failover, the virtual machine that has been moved to a different server host resumes operation from the latest checkpoint as if no failure had occurred. All runtime state of the operating system, including active TCP connections, is preserved. Running processes continue as usual, and all files, network state and disks keep their integrity. At worst the TCP stack experiences packet loss, and the lost packets are retransmitted.
Using REMUS prevents crash failures of virtual machines. This property benefits the MPI parallel computation used for anomaly detection, because all computing processes are kept synchronized.
Under REMUS, servers exist in pairs in active/standby modes. The server in active mode sends checkpoint information to the standby server for backup in a timely manner, based on the Heartbeat signal.
In the present invention, each server host is in both active and backup mode at the same time. A lookup table is designed by which the checkpoint information of the virtual machines in a particular server host is distributed to other server hosts acting in "backup" mode, so that live migration can be performed when a fault occurs.
Because solid-state drives offer high availability at an acceptable price, the server hosts that back up virtual machines can use solid-state drives to provide a local archive for checkpoints, which allows very fast (sub-second) restart of virtual machines.
Fig. 5 illustrates the method of managing high-availability servers for executing the online anomaly detection algorithm in parallel. The figure shows how a failed virtual machine or server host is replicated in a configuration of 4 server hosts and 16 virtual machines. Each server host runs 4 virtual machines. The backup manager running on dom 0 of each active server host is configured to handle the backup of failed virtual machines according to the lookup table. For example, server host #1, in addition to running virtual machines A, B, C and D, is also responsible for backing up the checkpoint information of E, I, M and F. When one of the virtual machines E, I, M, F (say E) fails, the backup manager in server host #1 activates the corresponding checkpoint backup so that the failed virtual machine (E) resumes operation in server host #1 (A, B, C, D and E are then all in active mode on server host #1).
After the automatic backup process completes, if the system needs further optimization, the system administrator can intervene through the manager to manage the operating load of each server host and to migrate virtual machines online to different physical server hosts. The system administrator then either redistributes the virtual machines among the active server hosts as needed, with the updated lookup table controlling the virtual machine backup process (assuming that server host #1 crashes, the new lookup table will be: server host #2 runs A, E, F, G, H and backs up B, I, N, O, P; server host #3 runs B, I, J, K, L and backs up A, C, D, E, M; server host #4 runs C, D, M, N, O, P and backs up I, J, K, M, N), or simply restores the hardware and virtual machine configuration to its original state and leaves the lookup table unchanged. The design principle of the lookup table is that, after any server host crashes, all the virtual machines that were running on it are divided evenly among the other normal server hosts and continue to run there.
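The design principle stated above (the virtual machines of a crashed host are divided evenly among the surviving hosts) can be sketched in Python as follows. This is only an illustration under assumptions: the function name `redistribute`, the host labels, the initial placement of virtual machines E to P and the round-robin order are all hypothetical, and the sketch does not reproduce the exact lookup table of Fig. 5 or the backup assignments.

```python
def redistribute(running, failed_host):
    """Return a new host -> VM list mapping after `failed_host` crashes.

    The VMs of the failed host are handed out round-robin to the
    surviving hosts, so each survivor receives an almost equal share.
    (Round-robin is an assumed policy, not taken from the patent.)
    """
    survivors = sorted(h for h in running if h != failed_host)
    if not survivors:
        raise ValueError("no surviving server host")
    new_running = {h: list(running[h]) for h in survivors}
    for k, vm in enumerate(running[failed_host]):
        new_running[survivors[k % len(survivors)]].append(vm)
    return new_running


# Hypothetical initial layout: 4 hosts, 4 virtual machines each.
running = {
    "host1": ["A", "B", "C", "D"],
    "host2": ["E", "F", "G", "H"],
    "host3": ["I", "J", "K", "L"],
    "host4": ["M", "N", "O", "P"],
}
after = redistribute(running, "host1")
```

With 4 virtual machines to place on 3 survivors, one host necessarily takes two of them, so the resulting loads differ by at most one virtual machine.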
Fig. 5 uses, as one embodiment, a configuration of only 4 server hosts each running 4 virtual machines. It should be noted that this merely illustrates a feasible embodiment of the present invention and is not intended to limit the scope of the patent. The present invention includes but is not limited to the above detailed description and specific examples. It covers all adjustments and modifications within the scope of its core content, and all equivalent implementations or changes that do not depart from the present invention are intended to fall within the scope of the claims of this case.

Claims (8)

1. An online pipe network anomaly detection system based on machine learning, characterized in that it comprises:
a data collection unit for collecting real-time data of the online pipe network, merging the real-time data according to location area and grouping them into different data packets;
a data distribution unit for receiving the data packets, extracting data elements from the data packets, and dividing the data packets into a plurality of data subsets after formatting them;
a plurality of anomaly detection units for receiving the corresponding data subsets in one-to-one correspondence and predicting anomalies in the data subsets based on a semi-supervised learning framework, the plurality of anomaly detection units processing data in parallel and transferring data among one another via MPI;
the anomaly detection units are installed on virtual machines, each virtual machine corresponding to one anomaly detection unit;
the online pipe network anomaly detection system based on machine learning further comprises a plurality of server hosts connected to one another in a fully connected topology within a local area network, each server host being equipped with a multi-core processor, the multi-core processor being divided by thread into a plurality of virtual machines on the same server host, wherein the first thread is designated as virtual machine dom 0 and the other threads are allocated to virtual machines dom U, the virtual machine dom 0 being used to access the hardware of the server host and to interact with the virtual machines dom U, the virtual machines dom U being used to host the anomaly detection units, and the virtual machines dom U of each running server host having corresponding backups on the other running server hosts.
2. The online pipe network anomaly detection system based on machine learning according to claim 1, characterized in that the anomaly detection unit comprises:
a prediction module for building a prediction model from a multiple regression equation, so as to provide statistical prediction data on the expected variable states of the data subset under the assumption that no anomaly exists, the prediction module also exchanging statistical prediction data with the other anomaly detection units;
an analysis module for receiving the statistical prediction data and estimating from them the regression parameters of the next data subset obtained from the data distribution unit, so as to calculate the predicted value of the next data subset, the next data subset and the historical data having the same time step and a consistent pipe network background;
a judgment module for judging whether the next data subset is anomalous according to its predicted value and its actual value;
a decision module for receiving the anomaly judgment result made by the judgment module and updating the prediction model according to that result.
3. The online pipe network anomaly detection system based on machine learning according to claim 2, characterized in that the method by which the prediction module builds the prediction model comprises the following steps:
Step 11: model the data according to how the parameters at a given point in the online pipe network vary over time:
$$X_i(t+1) = F_i\big(X(t), X(t-1), X(t-2), \ldots, X(t-n)\big) \tag{1}$$
where $F_i$ is the prediction model of the i-th anomaly detection unit, i is a positive integer not greater than the total number of anomaly detection units, and $X_i$ is the input data of the i-th anomaly detection unit; $X(t), X(t-1), X(t-2), \ldots, X(t-n)$ are the historical data and $X_i(t+1)$ is the next data subset;
Step 12: build the prediction model based on the multiple regression equation:
$$X_i(t+1) = A_{i0} * X_i(t) + A_{i1} * X_i(t-1) + \ldots + A_{in} * X_i(t-n) + C_i \tag{2}$$
where $A_{i0}, A_{i1}, \ldots, A_{in}$ are the regression parameters of the prediction model $F_i$ and $C_i$ is the random error parameter of the i-th anomaly detection unit; the prediction module exchanges $C_i$ as statistical prediction data via MPI;
Step 13: solve for the random error parameter $C_i$:
$$C_i = \sum_{j \neq i}^{n} A_{ij0} * X_j(t) + \sum_{j \neq i}^{n} A_{ij1} * X_j(t-1) + \ldots + \sum_{j \neq i}^{n} A_{ijn} * X_j(t-n) \tag{3}$$
In formula (3), $A_{ij0}, A_{ij1}, \ldots, A_{ijn}$ are obtained by automatic matching from normal data packets.
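Formulas (2) and (3) can be read as an autoregression on a unit's own history plus a coupling term built from the histories of the other units. The Python sketch below evaluates both formulas for given coefficients. It rests on several assumptions: the names `cross_term`, `predict_next`, `A_own` and `A_cross` are hypothetical, the coefficient values are made up, and all series are held in one process, whereas the claim has each unit exchange $C_i$ via MPI. Estimating the coefficients is not shown.

```python
def cross_term(i, history, A_cross):
    """Formula (3): C_i = sum over j != i and lags k of A_cross[i][j][k] * X_j(t-k).

    history[j] holds the series of unit j, oldest value first, so
    history[j][-1] is X_j(t) and history[j][-1 - k] is X_j(t-k).
    """
    total = 0.0
    for j, series in enumerate(history):
        if j == i:
            continue  # the sum in formula (3) skips the unit itself
        for k, coeff in enumerate(A_cross[i][j]):
            total += coeff * series[-1 - k]
    return total


def predict_next(i, history, A_own, A_cross):
    """Formula (2): X_i(t+1) = sum_k A_own[i][k] * X_i(t-k) + C_i."""
    own = sum(a * history[i][-1 - k] for k, a in enumerate(A_own[i]))
    return own + cross_term(i, history, A_cross)


# Two units, two lags (n = 1); all numbers are illustrative only.
history = [[1.0, 2.0], [3.0, 4.0]]
A_own = [[0.5, 0.25], [0.5, 0.25]]
A_cross = [
    [[0.0, 0.0], [0.125, 0.25]],
    [[0.125, 0.25], [0.0, 0.0]],
]
prediction = predict_next(0, history, A_own, A_cross)  # 1.25 (own) + 1.25 (C_0) = 2.5
```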
4. The online pipe network anomaly detection system based on machine learning according to claim 3, characterized in that the method by which the judgment module judges whether the next data subset is anomalous comprises the following steps:
Step 31: compute the difference between the predicted value and the measured value of $X_i(t+1)$;
Step 32: collect historical data to build a database $X_i(t), X_i(t-1), \ldots, X_i(t-P)$, where P is an empirical parameter of the time dependence of the i-th anomaly detection unit and P is a positive integer;
Step 33: construct the historical database sample $\{X_i(t), X_i(t-1), \ldots, X_i(t-P)\}$ and calculate the standard deviation range of this sample;
Step 34: compare the difference with the standard deviation range:
If the difference is less than the standard deviation range, the judgment module returns a negative signal to the decision module. If the judgment modules of all the anomaly detection units feed back negative signals to the decision module, the decision module stores all $X_i(t+1)$ in the database and instructs the corresponding judgment modules to synchronize the latest sample data and regression parameters with the database, in preparation for predicting $X_i(t+2)$;
When predicting $X_i(t+2)$, if the difference is greater than the standard deviation range of the updated sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$, the corresponding judgment module returns an affirmative signal to the decision module. The decision module marks $X_i(t+2)$ as an anomalous event in the database and instructs this judgment module to update the regression parameters according to the database, but to keep using the old sample $\{X_i(t+1), X_i(t), X_i(t-1), \ldots, X_i(t-P+1)\}$ to define the standard deviation for the anomaly judgment of $X_i(t+3)$.
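Steps 31 to 34 amount to a sliding-window test in which the window only advances on normal readings. The Python sketch below shows one such cycle for a single detection unit. It is a simplified illustration under assumptions: the names `judge` and `step`, the multiplier `k` and the use of the sample standard deviation are hypothetical choices, since the claim only speaks of a "standard deviation range". The condition that all units must report negative before the database is updated is left out.

```python
import statistics


def judge(predicted, measured, sample, k=1.0):
    """Return True (affirmative signal) when the prediction residual
    exceeds k standard deviations of the historical sample.

    k = 1.0 is an assumed default, not a value given in the patent.
    """
    return abs(measured - predicted) > k * statistics.stdev(sample)


def step(predicted, measured, sample):
    """One judgment/decision cycle for a single detection unit.

    Returns (is_anomaly, sample_for_next_step). A normal reading slides
    the window forward; an anomalous reading leaves the old sample in
    place, so the outlier does not inflate the next standard deviation.
    """
    if judge(predicted, measured, sample):
        return True, sample
    return False, sample[1:] + [measured]


# Illustrative readings only, oldest first.
sample = [10.0, 10.2, 9.8, 10.1, 9.9]
normal_flag, normal_sample = step(10.0, 10.05, sample)
anomaly_flag, anomaly_sample = step(10.0, 12.0, sample)
```

In this example the first reading is within the range and enters the window, while the second is flagged and the window stays unchanged, which mirrors the use of the old sample for the next judgment.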
5. The online pipe network anomaly detection system based on machine learning according to claim 4, characterized in that the system further comprises a network attached storage unit for storing the image copies of all virtual machines and the historical data of the online pipe network; every anomaly detection unit can access the data in the network attached storage unit, and virtual machine dom 0 relays the communication between its corresponding virtual machines dom U and the network attached storage unit.
6. The online pipe network anomaly detection system based on machine learning according to claim 5, characterized in that the backup method is: the checkpoint information of the virtual machines dom U on each running server host is distributed for backup according to the load of the other running server hosts, so as to achieve an optimally balanced mode of operation; a lookup table is generated automatically after the backup, the lookup table defining the migration node for the backup of a failed virtual machine dom U, so that live migration can be performed when a virtual machine dom U or a server host fails.
7. The online pipe network anomaly detection system based on machine learning according to claim 6, characterized in that a backup manager is arranged on each virtual machine dom 0 for compiling the health status of the virtual machines dom U corresponding to that virtual machine dom 0 into a list, and virtual machine dom 0 is configured, through the backup manager, to handle the backup of failed virtual machines according to the lookup table.
8. The online pipe network anomaly detection system based on machine learning according to claim 1, characterized in that the method by which the data distribution unit distributes the data subsets comprises the following steps:
the data distribution unit receives the data packets and filters out the noise, and even erroneous information, that arises in them from measurement, transmission or collection;
the data elements in the data packets are extracted and the data packets are converted to a unified format;
the data packets are divided into a corresponding number of data subsets, the data in the data subsets being kept balanced;
the data subsets are encrypted and distributed to the anomaly detection units through a publish-subscribe architecture.
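The first three steps of claim 8 (filtering, conversion to a unified format, balanced splitting) can be sketched in Python as below. Everything specific here is an assumption made for illustration: the function name `distribute`, the dictionary fields `sensor`, `t` and `value`, the rule that non-numeric or NaN readings count as erroneous, and the round-robin split. Encryption and the publish-subscribe delivery are deliberately left out.

```python
def distribute(packets, n_units):
    """Filter, normalise and split sensor packets into n_units balanced subsets."""
    clean = []
    for p in packets:
        value = p.get("value")
        # Assumed error rule: drop readings that are missing, non-numeric or NaN.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if value != value:  # NaN is the only float not equal to itself
            continue
        # Unified record format (field names are hypothetical).
        clean.append({"sensor": str(p["sensor"]), "t": int(p["t"]), "value": float(value)})
    # Round-robin assignment keeps subset sizes within one record of each other.
    subsets = [[] for _ in range(n_units)]
    for k, record in enumerate(clean):
        subsets[k % n_units].append(record)
    return subsets


packets = [
    {"sensor": "s1", "t": 0, "value": 7.1},
    {"sensor": "s2", "t": 0, "value": None},          # lost reading, filtered out
    {"sensor": "s3", "t": 0, "value": 6.9},
    {"sensor": "s4", "t": 0, "value": float("nan")},  # corrupted reading, filtered out
    {"sensor": "s5", "t": 0, "value": 7},
    {"sensor": "s6", "t": 0, "value": 7.3},
]
subsets = distribute(packets, 2)
```

A split by location area, as used by the data collection unit, would be an equally plausible way to form the subsets; round-robin is chosen here only because it makes the balance easy to see.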
CN201310581956.0A 2013-11-19 2013-11-19 Online pipe network anomaly detection system based on machine learning Active CN103580960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310581956.0A CN103580960B (en) 2013-11-19 2013-11-19 Online pipe network anomaly detection system based on machine learning

Publications (2)

Publication Number Publication Date
CN103580960A CN103580960A (en) 2014-02-12
CN103580960B true CN103580960B (en) 2017-01-11

Family

ID=50051937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310581956.0A Active CN103580960B (en) 2013-11-19 2013-11-19 Online pipe network anomaly detection system based on machine learning

Country Status (1)

Country Link
CN (1) CN103580960B (en)

Also Published As

Publication number Publication date
CN103580960A (en) 2014-02-12

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: 528200 Guangdong Province Nanhai District of Foshan city streets Guicheng Shilong Road No. 1 joybon IFC 2 room 1707

Patentee after: Foshan science and Technology Co., Ltd.

Address before: 528200 Guangdong city of Foshan province sea road Han day Technology City Building No. 8 901-3

Patentee before: FOSHAN LUOSIXUN ENVIRONMENTAL PROTECTION TECHNOLOGY CO., LTD.