CN114238854A

CN114238854A - Anomaly detection method in mining scene based on graph regular incremental non-negative matrix factorization

Info

Publication number: CN114238854A
Application number: CN202111509423.2A
Authority: CN
Inventors: 陈自刚; 肖琪; 陈龙; 张镇江; 潘鼎
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Chongqing University of Post and Telecommunications
Priority date: 2021-12-10
Filing date: 2021-12-10
Publication date: 2022-03-25
Anticipated expiration: 2041-12-10
Also published as: CN114238854B

Abstract

The invention provides a mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization, and belongs to the technical fields of anomaly detection and diagnosis and of intelligent safety. First, two sets of equipment collect mining environment information under normal conditions, and the data from each set are processed as follows: the data are preprocessed to obtain a training set X'; the optimal basis matrix W_new and coefficient matrix H_new are then obtained by graph-regularized incremental non-negative matrix factorization; the monitoring statistics N² and SPE are established from them, and the control limits for the training sets of the two sets of equipment are calculated. Data are then collected again as a test set X'' for detection, the statistics of the test set X'' are calculated, and these statistics are compared with the two sets of control limits to judge whether the mining scene is abnormal. When the scene is abnormal, the largest contribution values are uploaded to the control interface and displayed as the causes of the anomaly. The method addresses the untimely and inaccurate anomaly detection of traditional mining scenes and supports the digitalization of the mining industry.

Description

Mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization

Technical Field

The invention relates to the technical fields of anomaly detection and diagnosis and of intelligent safety, and in particular to a mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization.

Background

Mining safety has long been a concern. At present most mines, especially metal mines, are worked underground, and during underground construction toxic gas, mine collapse and similar hazards pose a serious threat to the safety of workers. Risk detection and early warning for the underground construction environment are therefore of great importance. Many anomaly detection methods already exist, but it remains challenging to ensure timeliness and effective use of network resources, to analyze multi-sensor (multi-factor) data comprehensively so as to ensure accuracy, and to determine the root cause of a scene anomaly.

A mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization is therefore proposed. Graph-regularized incremental non-negative matrix factorization overcomes the difficulty that traditional non-negative matrix factorization has in processing large data sets online: it can represent data content online while preserving the geometric structure of the data, and by incorporating incremental learning it reuses the factorization result of the previous step and avoids repeated calculation, thereby reducing computation time. It also reduces dimensionality markedly and gives better clustering accuracy. Because the updates are element-wise matrix operations, the factorization result tends to represent local features of the data. In addition, the non-negative basis vectors obtained have a degree of linear independence and sparsity, so they describe the large-scale, high-dimensional data generated in industrial processes well. Finally, the method makes no idealized assumption about the data distribution during factorization, so non-Gaussian data can be handled; it is only necessary to design suitable monitoring statistics in combination with a density estimation method.

Disclosure of Invention

In view of the above, the invention provides a mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization. It supports dynamic training on data and comprehensive analysis of multi-sensor data, saves time and storage cost, and can identify the cause of an anomaly when one occurs. The method comprises the following steps:

Step 1: collecting data. Two sets of equipment are used to sample the environment at a plurality of times during normal operation. The data obtained by the two sets of equipment are processed in the same way, so only the operations performed on the data of one set are described below and the description is not repeated for the other set.

Step 2: preprocessing data. Measured data in an industrial process do not necessarily satisfy the non-negativity condition; for example, readings of temperature or pressure sensors may be negative, in which case the unit can be changed so that the values become non-negative. The data collected in step 1 are then preprocessed by graying, vectorization, normalization and the like to obtain a training set X' and the initial training sample matrix

This matrix, which contains k samples, is composed of the data samples in the training set X' from a certain time period; the data samples at later times in the training set X' are used in turn as the next newly added sample.

Initial training matrix

where t₀ ≤ t_i ≤ … ≤ t_{i+k-1} ≤ t₁, i.e. the data are collected in the initial period t₀ to t₁;

represents the data acquired by the i-th device in the j-th sample, collected at time t_i; for example, i = 1 indicates that the device is a camera and i = 2 indicates that the device is a gas sensor,

represents the data of the gas sensor in the third sample, collected at time t_{i+2};

represents the first data sample, collected at time t_i;

represents the (k+1)-th data sample, collected at time t₂, which is also used as the next newly added sample; subsequent newly added samples follow by analogy.

Step 3: training stage. The initial training sample matrix obtained in step 2

and the newly added samples undergo graph-regularized incremental non-negative matrix factorization to obtain the optimal basis matrix W_new and coefficient matrix H_new. In this way the geometric structure information of the samples is preserved in the low-dimensional space, and by incorporating incremental learning the factorization result of the previous step is reused, which avoids repeated calculation and reduces computation time. The specific steps are as follows:

Step 3.1: first, SVD is performed on the initial training sample matrix

to obtain the singular value matrix Σ and the singular vector matrices U and V; for the matrix

the singular value matrix and singular vector matrices are then used to initialize the basis matrix and the coefficient matrix of its graph-regularized non-negative matrix factorization, and the initialized basis matrix and coefficient matrix are updated iteratively until the objective function stabilizes, giving W_k and H_k. This yields a better approximation of the globally optimal solution; moreover, the data structure of the input matrix does not need to be changed, so the structure of the original data is not destroyed, more detailed information is retained, and the factorization is improved.

SVD decomposition formula:

Initialization of W_k and H_k:

where |U| denotes the element-wise absolute value of the matrix U, and V^T denotes the transpose of the matrix V.

Objective function:

Iteration rule:

where R denotes the weight matrix, D is a diagonal matrix

L_k is the Laplacian matrix (L_k = D - R), and λ is the regularization parameter.
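For orientation, the objective function and multiplicative update rules of graph-regularized non-negative matrix factorization are commonly written in the form below. This is the standard formulation from the literature, offered as an assumption about the general shape of the formulas referred to above and not as the exact expressions of this method. Here X_k ≈ W_k H_k, the product ⊙ and the fractions are element-wise, and D_jj is assumed to be the j-th row sum of R.

```latex
F_k = \lVert X_k - W_k H_k \rVert_F^2
      + \lambda \operatorname{Tr}\!\left( H_k L_k H_k^{\mathsf T} \right),
\qquad L_k = D - R, \qquad D_{jj} = \sum_{l} R_{jl}

W_k \leftarrow W_k \odot \frac{X_k H_k^{\mathsf T}}{W_k H_k H_k^{\mathsf T}},
\qquad
H_k \leftarrow H_k \odot
  \frac{W_k^{\mathsf T} X_k + \lambda H_k R}{W_k^{\mathsf T} W_k H_k + \lambda H_k D}
```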

Step 3.2: when a new sample is added at the next time step, the optimal basis matrix W_new and coefficient matrix H_new are obtained by graph-regularized incremental non-negative matrix factorization.

For example, when

is added, the objective function is:

Iteration formula:

Step 3.3: the above operations are repeated for all newly added samples, and the optimal basis matrix W_new and coefficient matrix H_new are obtained once the updates are complete.

Step 4: calculating the control limits. The monitoring statistics N² and SPE are calculated from W_new and H_new obtained in step 3; N² monitors changes in the feature space and SPE monitors changes in the residual space. The probability density of the process data is then estimated by kernel density estimation (KDE) to extract the actual distribution of the data, from which the control limits of the statistics for the training samples of the two sets of equipment are determined:

SPE₁';

SPE₂'.

Step 5: testing stage. Data are collected again as a test set X'' for detection; the test set X'' undergoes the same processing as in steps 2 and 3 to obtain the corresponding statistics, which are compared with the two sets of control limits. Three cases can arise:

Case one: when both statistics are within both sets of control limits, the scene is normal and mining can proceed.

Case two: when either or both statistics are outside both sets of control limits, the scene is certainly abnormal; a first-level alarm is raised immediately and mining must not proceed. The contribution values are calculated and sorted, and the largest ones are uploaded to the control interface and displayed as the causes of the anomaly.

Case three: when either or both statistics are outside only one set of control limits, it is first confirmed whether the equipment has failed; if it has not, a second-level alarm is raised immediately, the contribution values are calculated and sorted, and the largest ones are uploaded to the control interface and displayed as the causes of the anomaly.
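The three cases can be sketched as a small decision function. This is a minimal illustration under stated assumptions: each set of equipment yields its own pair of statistics (N², SPE) and its own pair of control limits, and the names `Verdict`, `exceeds` and `classify` are hypothetical, not taken from the method.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "normal, mining can proceed"
    FIRST_LEVEL_ALARM = "abnormal, first-level alarm"
    CHECK_EQUIPMENT = "check equipment; second-level alarm if it has not failed"


def exceeds(stats, limits):
    """True if N2 or SPE (or both) lies above its control limit."""
    n2, spe = stats
    n2_limit, spe_limit = limits
    return n2 > n2_limit or spe > spe_limit


def classify(stats_1, limits_1, stats_2, limits_2):
    """Compare the statistics of both sets of equipment with their limits."""
    out_1 = exceeds(stats_1, limits_1)
    out_2 = exceeds(stats_2, limits_2)
    if out_1 and out_2:          # case two: outside both sets of limits
        return Verdict.FIRST_LEVEL_ALARM
    if out_1 or out_2:           # case three: outside only one set of limits
        return Verdict.CHECK_EQUIPMENT
    return Verdict.NORMAL        # case one: within both sets of limits
```

For instance, `classify((1.0, 0.2), (2.0, 0.5), (3.0, 0.2), (2.0, 0.5))` falls under case three, because only the second set of equipment exceeds its N² limit.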

The invention has the following beneficial effects: it avoids the shortcomings of traditional detection and supports dynamic training and real-time tracking and prediction, ensuring timeliness; factors such as toxic gas, water inrush and mine collapse are considered together, and the data of multiple sensors are analyzed comprehensively to ensure accuracy; if the scene is abnormal, the source of the anomaly can be identified; and network resources are used effectively, improving efficiency.

In addition, performing anomaly detection with graph-regularized incremental non-negative matrix factorization overcomes the difficulty that traditional non-negative matrix factorization has in processing large data sets online: it can represent data content online while preserving the geometric structure of the data, and at the same time it reduces dimensionality markedly, reduces computation time and gives better clustering accuracy. Because the updates are element-wise matrix operations, the factorization result tends to represent local features of the data. The non-negative basis vectors obtained have a degree of linear independence and sparsity, so they describe the large-scale, high-dimensional data generated in industrial processes well. Finally, the method makes no idealized assumption about the data distribution during factorization, so non-Gaussian data can be handled; it is only necessary to design suitable monitoring statistics in combination with a density estimation method.

Drawings

FIG. 1 is a general flowchart of the mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of mining environment data collection according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the data composition according to an embodiment of the present invention;

FIG. 4 is a detailed flowchart of the graph-regularized incremental non-negative matrix factorization according to an embodiment of the present invention.

Detailed Description

The invention will be further described with reference to the accompanying drawings and specific embodiments.

The invention aims to provide a mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization that tracks and predicts the safety of a mining scene in real time, reduces the safety risk to mining personnel, reports the cause of an anomaly when the environment is abnormal, uses network resources effectively and improves efficiency.

FIG. 1 shows the general flowchart of the mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization according to an embodiment of the present invention. The method comprises the following steps:

Step 1: collecting data. Two sets of equipment are used to sample the environment at a plurality of times during normal operation; each set comprises one camera and one each of several sensors (a gas sensor, a CO₂ sensor, etc.).

FIG. 2 is a schematic diagram of mining environment data acquisition. The data acquired by the two sets of equipment are transmitted to the nodes and then uploaded to the cloud server for processing. The equipment is not limited to the three types above and may be tailored to the particular mining environment.

The data obtained by the two sets of equipment are each processed in the following way. The operations are described below for the data collected by one set of equipment, with separate descriptions where necessary, and are not repeated for the other set.

Step 2: preprocessing data. The data acquired in step 1 are processed by graying, vectorization, normalization and the like to obtain a training set X' and the initial training sample matrix

FIG. 3 is a schematic diagram of the data composition. The acquired data are preprocessed: the image data acquired by the camera are converted to grayscale to obtain an image matrix V ∈ R^(a×b).

The image matrix V is then vectorized: as shown in FIG. 3, its columns are extracted one by one and stacked into a single column vector

Next, the data collected by the other sensors in the same set of equipment are appended in order to form a column vector, as shown in FIG. 3

Finally, the data collected by the set of equipment at time t_i are normalized to the range 0 to 1 and combined into a column vector

which denotes the first sample, acquired at time t_i, wherein

denotes the data of the second device, the gas sensor, in the first sample, likewise acquired at time t_i.

Data are collected continuously at a plurality of times and processed in the same way, finally giving the initial training sample matrix

The data collected at later times are used in turn as newly added samples.
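The preprocessing of one sample (graying, column-wise vectorization, appending the sensor readings, and normalization to the range 0 to 1) can be sketched in NumPy as follows. The grayscale weights, the 8-bit image assumption, the per-sensor (min, max) ranges used for scaling, and the function names are all illustrative assumptions; the method itself does not specify them.

```python
import numpy as np


def to_gray(rgb):
    """Luminance-weighted grayscale (BT.601 weights, an assumed choice)."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])


def build_sample(image_rgb, sensor_readings, sensor_ranges):
    """Return one non-negative sample vector for a single time step.

    image_rgb:       a x b x 3 array, assumed 8-bit (values 0..255)
    sensor_readings: one reading per non-camera sensor
    sensor_ranges:   assumed (min, max) pair per sensor, used for scaling
    """
    gray = to_gray(image_rgb.astype(float)) / 255.0    # a x b, in [0, 1]
    image_vec = gray.flatten(order="F")                 # stack the columns
    readings = np.asarray(sensor_readings, dtype=float)
    lo, hi = np.asarray(sensor_ranges, dtype=float).T
    sensor_vec = np.clip((readings - lo) / (hi - lo), 0.0, 1.0)
    return np.concatenate([image_vec, sensor_vec])


def build_matrix(samples):
    """Place k sample vectors side by side as the columns of a matrix."""
    return np.column_stack(samples)
```

Calling `build_sample` once per sampling time and passing the results to `build_matrix` gives a matrix with one column per sample, matching the layout described for the initial training sample matrix.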

In this embodiment, 200 groups of normal data samples are collected within 5 hours, at 40 groups per hour, to form the initial training sample matrix; after the first 5 hours, 10 groups are collected per hour as newly added samples, for a total of 100 newly added groups.

Step 3: training stage. Graph-regularized incremental non-negative matrix factorization is performed on the initial training sample matrix and the newly added samples to obtain the optimal basis matrix W_new and coefficient matrix H_new.

FIG. 4 shows the specific flow of the graph-regularized incremental non-negative matrix factorization according to an embodiment of the present invention. First, SVD is performed on the initial training sample matrix

to obtain the singular value matrix Σ and the singular vector matrices U and V.

SVD decomposition:

For the matrix

the singular value matrix and singular vector matrices are used respectively to initialize the basis matrix and the coefficient matrix of its graph-regularized non-negative matrix factorization.

Initialization of W_k and H_k:

where |U| denotes the element-wise absolute value of the matrix U, and V^T is the transpose of the matrix V.
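An SVD-based non-negative initialization can be sketched as below. The text only states that the singular value matrix and the absolute values of the singular vector matrices are used; the particular split chosen here, with the square root of each singular value applied to both factors, and the function name `svd_init` are assumptions made for illustration.

```python
import numpy as np


def svd_init(X, r):
    """Non-negative starting factors W0 (m x r) and H0 (r x n) from a truncated SVD.

    Assumed rule: W0 = |U_r| * sqrt(S_r), H0 = sqrt(S_r) * |V_r^T|.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    root = np.sqrt(s[:r])
    W0 = np.abs(U[:, :r]) * root            # scale each column of |U|
    H0 = root[:, None] * np.abs(Vt[:r])     # scale each row of |V^T|
    return W0, H0
```

Taking absolute values guarantees non-negative starting factors without altering the input matrix itself, which is consistent with the stated aim of leaving the structure of the original data untouched.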

The initialized basis matrix and coefficient matrix are updated iteratively until the objective function stabilizes, giving W_k and H_k.

Objective function:

Iteration rule:

where R denotes the weight matrix, D is a diagonal matrix

L_k is the Laplacian matrix (L_k = D - R), and λ is the regularization parameter.
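The iterative update can be sketched in NumPy as follows. The sketch uses the multiplicative update rules commonly given for graph-regularized NMF, which is an assumption about the iteration rule referred to above. The 0/1 nearest-neighbour weight matrix, the value λ = 0.1, the fixed iteration count and all function names are illustrative choices.

```python
import numpy as np


def knn_graph(X, p=3):
    """Symmetric 0/1 weight matrix R linking each sample (column of X)
    to its p nearest neighbours in Euclidean distance."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(d2, np.inf)             # a sample is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :p]
    R = np.zeros((n, n))
    R[np.repeat(np.arange(n), p), idx.ravel()] = 1.0
    return np.maximum(R, R.T)


def objective(X, W, H, R, lam):
    """||X - WH||_F^2 + lam * Tr(H L H^T) with L = D - R."""
    L = np.diag(R.sum(axis=1)) - R
    return np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(H @ L @ H.T)


def gnmf(X, W, H, R, lam=0.1, n_iter=200, eps=1e-10):
    """Multiplicative updates; W and H stay non-negative throughout."""
    D = np.diag(R.sum(axis=1))
    for _ in range(n_iter):
        W = W * (X @ H.T) / (W @ H @ H.T + eps)
        H = H * (W.T @ X + lam * H @ R) / (W.T @ W @ H + lam * H @ D + eps)
    return W, H
```

In practice the loop would stop once the change in `objective` between iterations falls below a tolerance; a fixed iteration count is used here only to keep the sketch short.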

Samples are then added one at a time

Each time a sample is added, graph-regularized incremental non-negative matrix factorization is performed once, with iterative updates until the objective function satisfies the convergence condition, giving the optimal basis matrix W_new and coefficient matrix H_new.

For example, when

is added, the objective function is:

W_{k+1} and H_{k+1} denote the basis matrix and coefficient matrix obtained by graph-regularized non-negative matrix factorization of the sample set X_{k+1}, and L_{k+1} is the Laplacian matrix of the sample set X_{k+1}. When the number of training samples is large enough, adding one new training sample has little influence on the basis matrix and the coefficient matrix. It is therefore assumed that, when a new sample is added, the first k column vectors of the coefficient matrix H_{k+1} are approximately equal to the column vectors of H_k, i.e. H_{k+1} = [H_k, h_{k+1}]. The objective function F_{k+1} can then be rewritten in the form:

Once the incremental expression of the objective function F_{k+1} has been obtained, the corresponding iterative update formulas can be derived by gradient descent:

The above operations are repeated for all newly added samples, and the optimal basis matrix W_new and coefficient matrix H_new are obtained once the updates are complete.
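One incremental step under the assumption H_{k+1} = [H_k, h_{k+1}] can be sketched as follows. The old coefficients H_k are held fixed, and only the new column h and the basis matrix W are updated with multiplicative rules derived from the local objective ||X_k - W H_k||² + ||x - W h||² + λ Σ_j r_j ||h - h_j||². The cached products A = H_k H_kᵀ and B = X_k H_kᵀ, the neighbour weights `r_new`, and the function name are illustrative assumptions; the exact update formulas of the method are not reproduced here.

```python
import numpy as np


def add_sample(W, A, B, H_k, x, r_new, lam=0.1, n_iter=50, eps=1e-10):
    """Fold one new sample x into an existing factorization.

    W:     m x r basis matrix from the previous step
    A, B:  cached H_k @ H_k.T (r x r) and X_k @ H_k.T (m x r)
    H_k:   r x k old coefficients, kept fixed
    r_new: length-k graph weights linking x to the old samples
    """
    d = r_new.sum()
    h = H_k.mean(axis=1) + eps               # non-negative starting point
    for _ in range(n_iter):
        # new coefficient column: data term plus pull towards graph neighbours
        h = h * (W.T @ x + lam * (H_k @ r_new)) / (W.T @ W @ h + lam * d * h + eps)
        # basis matrix: old data enter only through the cached A and B
        W = W * (B + np.outer(x, h)) / (W @ (A + np.outer(h, h)) + eps)
    # extend the caches so that the next sample can be added the same way
    return W, h, A + np.outer(h, h), B + np.outer(x, h)
```

Because the old samples appear only through A and B, each step costs the same regardless of how many samples have already been processed, which is the saving that incremental learning is meant to provide.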

Step 4: calculating the control limits. From W_new and H_new obtained for each of the two training sets, the corresponding monitoring statistics N² and SPE are calculated.

Statistic N² for monitoring changes in the feature space: N²(i) = X^T(i) W W^T X(i).

Statistic SPE, reflecting the degree of deviation of the data:

wherein

denotes the reconstructed value of the i-th sample vector, calculated as:

The control limits of the two statistics are calculated as follows: the probability density of each statistic is estimated by kernel density estimation, the actual distribution of the statistic is extracted, and, for a chosen significance level α, the control limits of the statistics for the training samples of the two sets of equipment are calculated respectively:

SPE₁';

SPE₂'.
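The statistics and their control limits can be sketched as follows. N² follows the formula given above. The reconstruction used for SPE (a least-squares projection onto the columns of W), the Gaussian kernel with a normal-reference bandwidth, and the grid-based quantile search are assumptions chosen for illustration, as are the function names.

```python
import numpy as np


def monitoring_stats(X, W):
    """Return N2 and SPE for every column (sample) of X."""
    P = W.T @ X
    n2 = np.sum(P ** 2, axis=0)                    # x^T W W^T x
    H_ls = np.linalg.lstsq(W, X, rcond=None)[0]    # assumed reconstruction step
    resid = X - W @ H_ls
    spe = np.sum(resid ** 2, axis=0)               # squared prediction error
    return n2, spe


def kde_limit(values, alpha=0.01, grid_size=2000):
    """Control limit: the (1 - alpha) quantile of a Gaussian KDE of `values`."""
    values = np.asarray(values, dtype=float)
    n = values.size
    h = 1.06 * values.std(ddof=1) * n ** (-0.2)    # normal-reference bandwidth
    grid = np.linspace(values.min() - 4 * h, values.max() + 4 * h, grid_size)
    dens = np.exp(-0.5 * ((grid[:, None] - values[None, :]) / h) ** 2).sum(axis=1)
    cdf = np.cumsum(dens)
    cdf /= cdf[-1]                                 # numerical CDF on the grid
    return grid[np.searchsorted(cdf, 1.0 - alpha)]
```

Applying `kde_limit` to the training values of N² and of SPE, separately for each set of equipment, gives the four control limits used in the testing stage. Because the limit is read from the estimated density, no Gaussian assumption on the statistics is needed.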

Step 5: testing stage. Data are collected again as a test set X'' for detection; the test set X'' undergoes the same processing as in steps 2 and 3 to obtain the corresponding statistics N² and SPE, which are compared with the two sets of control limits.

When

both statistics are within the control limits obtained from the training sets of both sets of equipment, the scene is normal and mining can proceed.

When

either or both statistics exceed the control limits obtained from the training sets of both sets of equipment, the scene is certainly abnormal; a first-level alarm is raised immediately and mining must not proceed. The contribution values are calculated and sorted, and the largest ones are uploaded to the control interface and displayed as the causes of the anomaly.

The contribution value is calculated as

where the subscript j is the index of the variable, abs denotes the absolute value, and δ_j is the j-th column of the n × n identity matrix. For example, if there are four devices, then for the 2nd device (variable), the gas sensor, δ₂ = [0 1 0 0]^T.
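Ranking the variables by contribution can be sketched as follows. Since the contribution formula itself is not reproduced here, the sketch assumes the common residual-based form c_j = abs(δ_jᵀ (x - x̂)), where x̂ is the reconstructed sample; the function names and the device names in the usage note are hypothetical.

```python
import numpy as np


def contributions(x, x_hat):
    """Assumed contribution of each variable: c_j = abs(delta_j^T (x - x_hat))."""
    n = x.size
    identity = np.eye(n)
    return np.array([abs(identity[:, j] @ (x - x_hat)) for j in range(n)])


def top_causes(x, x_hat, names, top=3):
    """Sort the contributions in descending order and keep the largest ones."""
    c = contributions(x, x_hat)
    order = np.argsort(c)[::-1][:top]
    return [(names[j], float(c[j])) for j in order]
```

With four variables named `["camera", "gas", "CO2", "temperature"]`, a sample `x = [0.2, 0.9, 0.1, 0.3]` and a reconstruction `x_hat = [0.2, 0.4, 0.1, 0.2]`, the gas sensor has the largest contribution and would be displayed first on the control interface.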

When either or both statistics exceed only the control limits obtained from the training set of one set of equipment, it is first confirmed whether the equipment has failed; if it has not, a second-level alarm is raised immediately, the contribution values are calculated and sorted, and the largest ones are uploaded to the control interface and displayed as the causes of the anomaly.

Claims

1. A mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization, characterized by comprising the following steps:

S1: data collection; two sets of equipment (each set comprising one camera and one each of several sensors, such as a gas sensor and a CO₂ sensor) are used to sample the environment at a plurality of times during normal operation; the data obtained by the two sets of equipment are processed in the same manner as follows; the operations performed on the data of one set of equipment are described below, described separately where necessary, and not repeated thereafter.

S2: data preprocessing; the data collected in S1 are preprocessed by graying, vectorization and normalization to obtain the training set X' and the initial training sample matrix

(containing k samples), which is composed of the data samples in the training set X' from a certain time period; the data samples in the training set X' at subsequent times are used in turn as the next newly added sample.

S3: training stage; the initial training sample matrix obtained in S2

and the newly added samples undergo graph-regularized incremental non-negative matrix factorization to obtain the optimal basis matrix W_new and coefficient matrix H_new; the SVD of

gives a singular value matrix and singular vector matrices, which are used respectively, for the matrix

to initialize the basis matrix and coefficient matrix of its graph-regularized non-negative matrix factorization; in this way the data structure of the original data is not destroyed, the geometric structure information of the samples is preserved in the low-dimensional space, and by incorporating incremental learning the factorization result of the previous step is fully reused, so that repeated calculation is avoided and computation time is reduced.

S4: calculating the control limits; the monitoring statistics N² and SPE are calculated from W_new and H_new obtained in S3, where N² monitors changes in the feature space and SPE monitors changes in the residual space; the probability density of the process data is then estimated by kernel density estimation (KDE), the actual distribution information of the data is extracted, and the control limits of the statistics corresponding to the training samples of each set of equipment are determined:

SPE₁';

SPE₂'.

S5: testing stage; data are collected again as the test set X'' for detection, the test set X'' undergoes the same processing as in S2 and S3 to obtain the corresponding statistics, and the monitoring statistics are compared with the two sets of control limits; if both statistics are within the control limits of the training sets of both sets of equipment, the state is normal; if either or both statistics are outside the control limits of the training sets of both sets of equipment, the state is abnormal and a first-level alarm is raised immediately; if either or both statistics are outside the control limits of the training set of only one set of equipment, it is further checked whether the equipment is abnormal, and if it is not, the environment is abnormal and a second-level alarm is raised; when the scene is abnormal, the contribution values are calculated and sorted, and the largest ones are uploaded to the control interface and displayed as the causes of the anomaly.

2. The mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization according to claim 1, characterized in that: in step S2, the specific parameters of the data samples are as follows:

Initial training sample matrix

where t₀ ≤ t_i ≤ ... ≤ t_{i+k-1} ≤ t₁, indicating that the data are collected in the initial period t₀ to t₁;

represents the data collected by the i-th device in the j-th sample, collected at time t_i; for example, i = 1 means the device is a camera and i = 2 means the device is a gas sensor,

represents the data of the gas sensor in the third sample, collected at time t_{i+2};

represents the first data sample, collected at time t_i;

represents the (k+1)-th data sample, collected at time t₂, which is also used as the next newly added sample, and so on for the subsequent newly added samples.

3. The mining scene anomaly detection method based on graph-regularized incremental non-negative matrix factorization according to claim 1, characterized in that: in step S3, the specific steps of the training stage are as follows:

S3.1: first, SVD is performed on the initial training sample matrix

to obtain the singular value matrix Σ and the singular vector matrices U and V; for the matrix

the singular value matrix and singular vector matrices are then used respectively to initialize the basis matrix and coefficient matrix of its graph-regularized non-negative matrix factorization, and the initialized basis matrix and coefficient matrix are updated iteratively until the objective function stabilizes, giving W_k and H_k; this yields a better approximation of the globally optimal solution, and since the data structure of the input matrix does not need to be changed, the structure of the original data is not destroyed and more detailed information is retained, thereby improving the factorization.

SVD decomposition formula:

Initialization of W_k and H_k:

Objective function:

Iteration rule:

Note: R denotes the weight matrix, D is a diagonal matrix

L_k is the Laplacian matrix (L_k = D - R), and λ is the regularization parameter.

S3.2: when a new sample is added at the next time step, the optimal basis matrix and coefficient matrix are obtained by graph-regularized incremental non-negative matrix factorization.

For example, when

is added, the objective function is:

Iteration rule:

S3.3: the above operation is repeated for all newly added samples, and the optimal basis matrix W_new and coefficient matrix H_new are obtained once the updates are complete.

4. A computer-readable storage medium, characterized in that: the computer-readable storage medium stores computer instructions, and the instructions cause a computer to execute the mining scene anomaly detection method according to any one of claims 1 to 3.