CN114118163B - Optical fiber intrusion detection method based on machine learning - Google Patents
Optical fiber intrusion detection method based on machine learning
- Publication number
- CN114118163B (application number CN202111460428.0A)
- Authority
- CN
- China
- Prior art keywords
- formula
- data
- optical fiber
- signal
- representing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B13/00—Burglar, theft or intruder alarms
- G08B13/02—Mechanical actuation
- G08B13/12—Mechanical actuation by the breaking or disturbance of stretched cords or wires
- G08B13/122—Mechanical actuation by the breaking or disturbance of stretched cords or wires for a perimeter fence
- G08B13/124—Mechanical actuation by the breaking or disturbance of stretched cords or wires for a perimeter fence with the breaking or disturbance being optically detected, e.g. optical fibers in the perimeter fence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/02—Preprocessing
- G06F2218/04—Denoising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Medical Informatics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an optical fiber intrusion detection method based on machine learning, comprising the following steps: 1) signal preprocessing; 2) signal feature extraction; 3) classification model. The method can rapidly distinguish normal data from intrusion data with high accuracy.
Description
Technical Field
The invention relates to the technical fields of perimeter security, neural networks and machine learning, in particular to an optical fiber intrusion detection method based on machine learning.
Background
With rapid economic development, the demand for security across industries keeps growing. Traditional security methods include manual inspection, video monitoring and electrified fences. Manual inspection requires a large workforce, which is costly, and the safety of inspection personnel is hard to guarantee in harsh natural environments. Video monitoring covers only a small area and requires security personnel to watch the monitoring screen continuously. Electrified fences must stay powered for long periods, have poor stability and are risky to use. The rapid development of modern optical communication has driven equally rapid progress in optical fiber sensing. Because an optical signal is easily affected during transmission by external pressure, movement, temperature and other factors, an intrusion event can be perceived through changes in the optical signal. Perimeter security systems based on optical fiber sensing have therefore become a focus of security research and are widely applied to infrastructure such as oil pipelines, water supply, electricity and communication.
An artificial neural network (ANN), or neural network for short, is a mathematical model that mimics animal neural networks to process information. It is composed of a large number of neurons and fits linear and nonlinear relationships in data through the connection patterns and weights among the neurons.
Machine learning is a method that trains a model on a dataset in order to perform classification or regression. It draws on computer science, statistics, probability theory and other disciplines, and is now widely used in fields such as medicine, education and the military.
Disclosure of Invention
The invention aims to provide an optical fiber intrusion detection method based on machine learning, aiming at the problems of large data volume, high data dimension and difficult classification of optical fiber sensors in perimeter security. The method can rapidly distinguish normal data and intrusion data, and has higher resolution accuracy.
The technical scheme for realizing the aim of the invention is as follows:
an optical fiber intrusion detection method based on machine learning comprises the following steps:
1) Signal preprocessing: Let the optical fiber detection signal be $X=[x_1,x_2,\dots,x_N]^{T}$, N×m data points in total, where $x_i=[x_i^{1},x_i^{2},\dots,x_i^{m}]$ denotes the i-th group of optical fiber detection signals, the detection duration is m time instants, and the label corresponding to the detection signals, $Y=[y_1,y_2,\dots,y_N]^{T}$, is a one-hot coding matrix. Because the collected signal $x_i$ unavoidably contains random noise such as system noise and environmental interference, $x_i$ can be expressed by formula (1):
$$x_i = s_i + \varepsilon \tag{1}$$
In formula (1), $s_i$ denotes the original signal and $\varepsilon$ denotes noise, so $x_i$ must be denoised to raise its signal-to-noise ratio. First, wavelet decomposition is applied to $x_i$ to obtain the wavelet coefficients; the decomposition is shown in formula (2):
$$\begin{cases} H_s(j+1,k) = H_s(j,k) * u(j,k) \\ W_s(j+1,k) = H_s(j,k) * v(j,k) \end{cases} \tag{2}$$
In formula (2), $*$ denotes convolution and $j=0,1,2,\dots,J$, where J denotes the optimal decomposition scale; $u(j,k)$ denotes the low-pass filter corresponding to the scaling function $\varphi(t)$; $v(j,k)$ denotes the high-pass filter corresponding to the wavelet function $\psi(t)$; $H_s(j,k)$ denotes the scale coefficients, with $H_s(0,k)=x_i$; $W_s(j,k)$ denotes the wavelet coefficients; and $\omega_i=[W_s(1,k),W_s(2,k),\dots,W_s(J+1,k)]$ denotes the wavelet coefficients obtained by the decomposition. A suitable threshold is then selected to correct the wavelet coefficients. Threshold functions fall mainly into hard and soft threshold functions; the soft threshold function is chosen here, as shown in formula (3):
$$\hat{W}_s(j,k) = \begin{cases} \operatorname{sgn}\big(W_s(j,k)\big)\,\big(|W_s(j,k)|-\lambda\big), & |W_s(j,k)| \ge \lambda \\ 0, & |W_s(j,k)| < \lambda \end{cases} \tag{3}$$
In formula (3), sgn(·) denotes the sign function, $\lambda$ denotes the estimated threshold, and $\hat{W}_s(j,k)$ denotes the corrected wavelet coefficients. Finally, the signal is reconstructed to obtain the denoised signal, as shown in formula (4):
$$H_s(j-1,k) = H_s(j,k) * \tilde{u}(j,k) + \hat{W}_s(j,k) * \tilde{v}(j,k) \tag{4}$$
In formula (4), $\tilde{u}(j,k)$ and $\tilde{v}(j,k)$ denote the conjugates of $u(j,k)$ and $v(j,k)$, respectively. The $H_s(0,k)$ finally obtained is the denoised signal $\hat{x}_i$.
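The decompose, threshold and reconstruct sequence of step 1) can be sketched in plain Python. This is only an illustration under stated assumptions: the patent does not name the wavelet, so the Haar filters are used here for brevity, the threshold `lam` is passed in rather than estimated, and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decompose(signal, levels):
    """Return (approximation, [detail_1, ..., detail_levels]) using Haar filters."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])  # high-pass branch
        approx = [(a + b) / SQRT2 for a, b in pairs]          # low-pass branch
    return approx, details

def soft_threshold(w, lam):
    """Soft threshold: sgn(w) * (|w| - lam) if |w| >= lam, otherwise 0."""
    if abs(w) < lam:
        return 0.0
    return math.copysign(abs(w) - lam, w)

def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest level."""
    approx = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        approx = rebuilt
    return approx

def denoise(signal, levels, lam):
    """Decompose, shrink the detail (wavelet) coefficients, then reconstruct."""
    approx, details = haar_decompose(signal, levels)
    shrunk = [[soft_threshold(w, lam) for w in level] for level in details]
    return haar_reconstruct(approx, shrunk)

if __name__ == "__main__":
    noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]
    print(denoise(noisy, levels=2, lam=0.3))
```

With `lam=0` the reconstruction returns the input unchanged, which is a quick way to check that the decomposition and reconstruction filters are matched before tuning the threshold.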
2) Signal feature extraction: Features are extracted from the denoised signal $\hat{x}_i$ obtained in step 1) so that the classification model can classify it. A discrete Fourier transform converts $\hat{x}_i$ from the time domain to the frequency domain, as shown in formula (5):
$$xf_i^{k} = \sum_{j=1}^{m} \hat{x}_i^{j}\, e^{-\sqrt{-1}\,2\pi jk/m}, \quad k=1,2,\dots,m \tag{5}$$
In formula (5), $\hat{x}_i^{j}$ denotes the data acquired by the i-th group of optical fiber detection signals at the j-th moment;
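The time-to-frequency conversion of step 2) can be sketched with a direct discrete Fourier transform. The patent does not say whether the complex coefficients or their magnitudes are passed to the classifier; taking magnitudes is an assumption made here, and `dft_magnitudes` is a hypothetical name. The code indexes samples from 0 rather than 1, which changes only the phase of each coefficient, not its magnitude.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Return |X[k]| for k = 0..m-1, where X is the DFT of `signal`."""
    m = len(signal)
    features = []
    for k in range(m):
        acc = sum(
            x * cmath.exp(-2j * cmath.pi * k * j / m)
            for j, x in enumerate(signal)
        )
        features.append(abs(acc))
    return features

if __name__ == "__main__":
    # One full cosine cycle over 8 samples: energy shows up at k = 1 and k = 7.
    tone = [math.cos(2 * math.pi * j / 8) for j in range(8)]
    print([round(v, 6) for v in dft_magnitudes(tone)])
```

The direct sum costs O(m²) per signal; for longer records an FFT routine would normally replace it without changing the resulting features.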
3) Classification model: Different classification models exploit data features differently, so fusing several models can effectively reduce the false alarm rate and improve the reliability of the overall prediction. The fusion model in this technical scheme combines a multilayer perceptron (MLP), a support vector machine (SVM) and LightGBM to process the optical fiber detection data, with the MLP as the main model and the SVM and LightGBM as auxiliary models. The MLP consists of an input layer, a hidden layer and an output layer, and the forward propagation between layers is shown in formula (6):
$$X_k = f\big(W_k X_{k-1} + b_k\big) \tag{6}$$
where $W_k \in \mathbb{R}^{O \times I}$ denotes the weight matrix of the k-th layer; I and O denote the input and output dimensions of the k-th layer; $b_k=[b_k^{1},b_k^{2},\dots,b_k^{O}]^{T}$ denotes the bias matrix of the k-th layer, with $b_k^{j}$ denoting a bias of the k-th layer; $X_k$ denotes the output of the k-th layer, with $X_0=[xf_1,xf_2,\dots,xf_n]^{T}$; and f(·) denotes the activation function, typically sigmoid or ReLU. The final output $P_{mlp}=[p_1,p_2,\dots,p_N]^{T}$ is the probability matrix produced by the MLP for the optical fiber detection data, where $p_i=[p_i^{0},p_i^{1}]$, $p_i^{0}$ and $p_i^{1}$ denote the probabilities that the i-th group of optical fiber detection data is classified as normal and as intrusion, respectively, and $p_i^{0}+p_i^{1}=1$. When $p_i^{1} > thr_1$, the i-th group is considered intrusion data; when $p_i^{1} < thr_2$, the i-th group is considered normal data, where $thr_1$ and $thr_2$ are thresholds and $thr_1 > thr_2$. When $thr_2 \le p_i^{1} \le thr_1$, a combined decision using the SVM and LightGBM is required.
The SVM is a classical binary classifier: it maps the features of the data to points in a high-dimensional space and separates the data into two classes with a hyperplane. The core idea of the SVM is to maximize the margin of the hyperplane. Suppose the hyperplane H separates the samples of different classes; it is expressed by formula (7):
$$\omega^{T}x + b = 0 \tag{7}$$
where $\omega=(\omega_1,\omega_2,\dots,\omega_d)$. Let $H_1$ and $H_2$ be the planes containing the sample points closest to the hyperplane H; the support vectors are those closest points. $H_1$ and $H_2$ can be written as $\omega^{T}x+b=1$ and $\omega^{T}x+b=-1$, and the distance from any sample point $x_i$ (i = 1, 2, …, N) to H can be expressed by formula (8):
$$d_i = \frac{|\omega^{T}x_i + b|}{\|\omega\|} \tag{8}$$
$$d_m = \min_{i} d_i = \frac{1}{\|\omega\|} \tag{9}$$
Formula (9) gives the minimum distance $d_m$ between the hyperplane and all sample points, so the maximum-margin hyperplane problem solved by the SVM can be written as the optimization problem of formula (10), in which $y_i \in \{-1,+1\}$ denotes the class of $x_i$:
$$\min_{\omega,b}\ \frac{1}{2}\|\omega\|^{2} \quad \text{s.t.}\quad y_i\big(\omega^{T}x_i+b\big) \ge 1,\ i=1,2,\dots,N \tag{10}$$
Solving this yields the hyperplane H, which separates the samples into two classes; a classification probability $p_{svm}$ is then computed in the same way as for the MLP. LightGBM is a model based on the gradient boosting decision tree (GBDT); it can process large amounts of data in a distributed manner and handles classification, regression and similar problems. For a decision tree model, the most important problem is finding the optimal split point of a feature. LightGBM searches for the optimal split point with a histogram algorithm, whose steps are as follows:
3-1) discretize the continuous feature values into k integers and build a histogram of width k; the sum of the gradients and the number of samples falling into each bin of the histogram are accumulated in the corresponding sub-bucket;
3-2) loop over all features and repeat step 3-1);
3-3) traverse all sub-buckets, take the current sub-bucket as the split point and compute its gain, as shown in formula (11):
$$Gain = \frac{S_L^{2}}{n_L} + \frac{S_R^{2}}{n_R} - \frac{S_P^{2}}{n_P} \tag{11}$$
where $S_L$, $S_R$ and $S_P$ denote the gradient sum on the left side of the current sub-bucket, the gradient sum on the right side of the current sub-bucket and the total gradient sum of the parent node, respectively, and $n_L$, $n_R$ and $n_P$ denote the number of samples on the left side of the current sub-bucket, the number of samples on the right side and the total number of samples;
3-4) select the maximum gain and take the feature and sub-bucket value with the maximum gain as the current split criterion. Once the classification result is obtained, the classification probability $p_L$ of the current sample is computed. When $thr_2 \le p_i^{1} \le thr_1$, the fusion model result is calculated by formula (12):
$$p = a\,p_i^{1} + b\,p_{svm} + c\,p_L \tag{12}$$
where a, b and c denote the weights of the MLP, the SVM and LightGBM, respectively, with a+b+c=1. When $p > thr_3$, the sample is considered intrusion data; otherwise it is considered normal data.
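The decision logic of step 3) reduces to a short cascade: trust the MLP when it is confident, and fall back to the weighted fusion of formula (12) otherwise. The sketch below assumes the three intrusion probabilities have already been computed by trained models; `fuse_decision` and its argument names are hypothetical, and the default thresholds and weights are the values listed in Table 1 of the simulation.

```python
def fuse_decision(p_mlp, p_svm, p_lgb,
                  thr1=0.75, thr2=0.2, thr3=0.55,
                  a=0.4, b=0.3, c=0.3):
    """Classify one sample as 'intrusion' or 'normal'.

    p_mlp, p_svm, p_lgb are the intrusion probabilities from the MLP,
    the SVM and LightGBM. The auxiliary models only matter when the MLP
    probability falls in the uncertain band [thr2, thr1].
    """
    if not thr1 > thr2:
        raise ValueError("thr1 must be greater than thr2")
    if abs(a + b + c - 1.0) > 1e-9:
        raise ValueError("weights a, b, c must sum to 1")

    if p_mlp > thr1:          # MLP is confident: intrusion
        return "intrusion"
    if p_mlp < thr2:          # MLP is confident: normal
        return "normal"

    # Uncertain band: weighted fusion of the three models.
    p = a * p_mlp + b * p_svm + c * p_lgb
    return "intrusion" if p > thr3 else "normal"

if __name__ == "__main__":
    print(fuse_decision(0.90, 0.10, 0.10))  # MLP alone decides
    print(fuse_decision(0.50, 0.70, 0.60))  # fused score 0.59 > 0.55
    print(fuse_decision(0.30, 0.50, 0.50))  # fused score 0.42 <= 0.55
```

Because the auxiliary probabilities are ignored outside the uncertain band, an implementation could evaluate the SVM and LightGBM lazily and skip them for most samples.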
The method can rapidly distinguish normal data and intrusion data, and has higher resolution accuracy.
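The layer-by-layer forward propagation of the MLP in step 3) can be illustrated as follows, assuming the usual form in which each layer computes f(W·X + b). The patent does not specify the output activation; a softmax is assumed here so that the normal and intrusion probabilities sum to 1. All names are hypothetical and the weights in the usage example are arbitrary, not trained values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(weights, bias, x, activation):
    """One layer: activation(W x + b), with W given as a list of rows."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]

def softmax(z):
    peak = max(z)                       # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def mlp_forward(layers, x):
    """layers is a list of (W, b); returns [p_normal, p_intrusion]."""
    for weights, bias in layers[:-1]:
        x = dense(weights, bias, x, sigmoid)   # hidden layers
    weights, bias = layers[-1]
    logits = dense(weights, bias, x, lambda z: z)
    return softmax(logits)

if __name__ == "__main__":
    layers = [
        ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # 3 inputs -> 2 hidden
        ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),            # 2 hidden -> 2 outputs
    ]
    print(mlp_forward(layers, [0.2, 0.4, 0.6]))
```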
The specific embodiment is as follows:
the following describes the invention in further detail with reference to examples, but is not intended to limit the invention.
Examples:
the optical fiber intrusion detection method based on machine learning is characterized by comprising the following steps:
1) Signal preprocessing: Let the optical fiber detection signal be $X=[x_1,x_2,\dots,x_N]^{T}$, N×m data points in total, where $x_i=[x_i^{1},x_i^{2},\dots,x_i^{m}]$ denotes the i-th group of optical fiber detection signals, the detection duration is m time instants, and the label corresponding to the detection signals, $Y=[y_1,y_2,\dots,y_N]^{T}$, is a one-hot coding matrix. Because the collected signal $x_i$ unavoidably contains random noise such as system noise and environmental interference, $x_i$ can be expressed by formula (1):
$$x_i = s_i + \varepsilon \tag{1}$$
In formula (1), $s_i$ denotes the original signal and $\varepsilon$ denotes noise, so $x_i$ must be denoised to raise its signal-to-noise ratio. First, wavelet decomposition is applied to $x_i$ to obtain the wavelet coefficients; the decomposition is shown in formula (2):
$$\begin{cases} H_s(j+1,k) = H_s(j,k) * u(j,k) \\ W_s(j+1,k) = H_s(j,k) * v(j,k) \end{cases} \tag{2}$$
In formula (2), $*$ denotes convolution and $j=0,1,2,\dots,J$, where J denotes the optimal decomposition scale; $u(j,k)$ denotes the low-pass filter corresponding to the scaling function $\varphi(t)$; $v(j,k)$ denotes the high-pass filter corresponding to the wavelet function $\psi(t)$; $H_s(j,k)$ denotes the scale coefficients, with $H_s(0,k)=x_i$; $W_s(j,k)$ denotes the wavelet coefficients; and $\omega_i=[W_s(1,k),W_s(2,k),\dots,W_s(J+1,k)]$ denotes the wavelet coefficients obtained by the decomposition. A suitable threshold is then selected to correct the wavelet coefficients. Threshold functions fall mainly into hard and soft threshold functions; the soft threshold function is chosen here, as shown in formula (3):
$$\hat{W}_s(j,k) = \begin{cases} \operatorname{sgn}\big(W_s(j,k)\big)\,\big(|W_s(j,k)|-\lambda\big), & |W_s(j,k)| \ge \lambda \\ 0, & |W_s(j,k)| < \lambda \end{cases} \tag{3}$$
In formula (3), sgn(·) denotes the sign function, $\lambda$ denotes the estimated threshold, and $\hat{W}_s(j,k)$ denotes the corrected wavelet coefficients. Finally, the signal is reconstructed to obtain the denoised signal, as shown in formula (4):
$$H_s(j-1,k) = H_s(j,k) * \tilde{u}(j,k) + \hat{W}_s(j,k) * \tilde{v}(j,k) \tag{4}$$
In formula (4), $\tilde{u}(j,k)$ and $\tilde{v}(j,k)$ denote the conjugates of $u(j,k)$ and $v(j,k)$, respectively. The $H_s(0,k)$ finally obtained is the denoised signal $\hat{x}_i$.
2) Signal feature extraction: Features are extracted from the denoised signal $\hat{x}_i$ obtained in step 1) so that the classification model can classify it. A discrete Fourier transform converts $\hat{x}_i$ from the time domain to the frequency domain, as shown in formula (5):
$$xf_i^{k} = \sum_{j=1}^{m} \hat{x}_i^{j}\, e^{-\sqrt{-1}\,2\pi jk/m}, \quad k=1,2,\dots,m \tag{5}$$
In formula (5), $\hat{x}_i^{j}$ denotes the data acquired by the i-th group of optical fiber detection signals at the j-th moment;
3) Classification model: Different classification models exploit data features differently, so fusing several models can effectively reduce the false alarm rate and improve the reliability of the overall prediction. The fusion model in this embodiment combines the multilayer perceptron MLP, the support vector machine SVM and LightGBM to process the optical fiber detection data, with the MLP as the main model and the SVM and LightGBM as auxiliary models. The MLP consists of an input layer, a hidden layer and an output layer, and the forward propagation between layers is shown in formula (6):
$$X_k = f\big(W_k X_{k-1} + b_k\big) \tag{6}$$
where $W_k \in \mathbb{R}^{O \times I}$ denotes the weight matrix of the k-th layer; I and O denote the input and output dimensions of the k-th layer; $b_k=[b_k^{1},b_k^{2},\dots,b_k^{O}]^{T}$ denotes the bias matrix of the k-th layer, with $b_k^{j}$ denoting a bias of the k-th layer; $X_k$ denotes the output of the k-th layer, with $X_0=[xf_1,xf_2,\dots,xf_n]^{T}$; and f(·) denotes the activation function, typically sigmoid or ReLU. The final output $P_{mlp}=[p_1,p_2,\dots,p_N]^{T}$ is the probability matrix produced by the MLP for the optical fiber detection data, where $p_i=[p_i^{0},p_i^{1}]$, $p_i^{0}$ and $p_i^{1}$ denote the probabilities that the i-th group of optical fiber detection data is classified as normal and as intrusion, respectively, and $p_i^{0}+p_i^{1}=1$. When $p_i^{1} > thr_1$, the i-th group is considered intrusion data; when $p_i^{1} < thr_2$, the i-th group is considered normal data, where $thr_1$ and $thr_2$ are thresholds and $thr_1 > thr_2$. When $thr_2 \le p_i^{1} \le thr_1$, a combined decision using the SVM and LightGBM is required.
The SVM is a classical binary classifier: it maps the features of the data to points in a high-dimensional space and separates the data into two classes with a hyperplane. The core idea of the SVM is to maximize the margin of the hyperplane. Suppose the hyperplane H separates the samples of different classes; it is expressed by formula (7):
$$\omega^{T}x + b = 0 \tag{7}$$
where $\omega=(\omega_1,\omega_2,\dots,\omega_d)$. Let $H_1$ and $H_2$ be the planes containing the sample points closest to the hyperplane H; the support vectors are those closest points. $H_1$ and $H_2$ can be written as $\omega^{T}x+b=1$ and $\omega^{T}x+b=-1$, and the distance from any sample point $x_i$ (i = 1, 2, …, N) to H can be expressed by formula (8):
$$d_i = \frac{|\omega^{T}x_i + b|}{\|\omega\|} \tag{8}$$
$$d_m = \min_{i} d_i = \frac{1}{\|\omega\|} \tag{9}$$
Formula (9) gives the minimum distance $d_m$ between the hyperplane and all sample points, so the maximum-margin hyperplane problem solved by the SVM can be written as the optimization problem of formula (10), in which $y_i \in \{-1,+1\}$ denotes the class of $x_i$:
$$\min_{\omega,b}\ \frac{1}{2}\|\omega\|^{2} \quad \text{s.t.}\quad y_i\big(\omega^{T}x_i+b\big) \ge 1,\ i=1,2,\dots,N \tag{10}$$
Solving this yields the hyperplane H, which separates the samples into two classes; a classification probability $p_{svm}$ is then computed in the same way as for the MLP. LightGBM is a model based on the gradient boosting decision tree GBDT; it can process large amounts of data in a distributed manner and handles classification, regression and similar problems. For a decision tree model, the most important problem is finding the optimal split point of a feature. LightGBM searches for the optimal split point with a histogram algorithm, whose steps are as follows:
3-1) discretize the continuous feature values into k integers and build a histogram of width k; the sum of the gradients and the number of samples falling into each bin of the histogram are accumulated in the corresponding sub-bucket;
3-2) loop over all features and repeat step 3-1);
3-3) traverse all sub-buckets, take the current sub-bucket as the split point and compute its gain, as shown in formula (11):
$$Gain = \frac{S_L^{2}}{n_L} + \frac{S_R^{2}}{n_R} - \frac{S_P^{2}}{n_P} \tag{11}$$
where $S_L$, $S_R$ and $S_P$ denote the gradient sum on the left side of the current sub-bucket, the gradient sum on the right side of the current sub-bucket and the total gradient sum of the parent node, respectively, and $n_L$, $n_R$ and $n_P$ denote the number of samples on the left side of the current sub-bucket, the number of samples on the right side and the total number of samples;
3-4) select the maximum gain and take the feature and sub-bucket value with the maximum gain as the current split criterion. Once the classification result is obtained, the classification probability $p_L$ of the current sample is computed. When $thr_2 \le p_i^{1} \le thr_1$, the fusion model result is calculated by formula (12):
$$p = a\,p_i^{1} + b\,p_{svm} + c\,p_L \tag{12}$$
where a, b and c denote the weights of the MLP, the SVM and LightGBM, respectively, with a+b+c=1. When $p > thr_3$, the sample is considered intrusion data; otherwise it is considered normal data.
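The histogram search for one feature (bucketing, accumulation and the gain scan) can be sketched in plain Python. Several points are assumptions rather than details from the patent: formula (11) is taken to have the variance-gain form S_L²/n_L + S_R²/n_R − S_P²/n_P, equal-width binning is used, and the left side is taken to include the current bucket. `best_split` is a hypothetical name, and the actual LightGBM library differs, for instance by also using second-order gradient statistics and regularization.

```python
def best_split(values, grads, k):
    """Return (bucket_index, gain) of the best split for a single feature.

    values: feature value per sample; grads: gradient per sample;
    k: number of histogram buckets. Samples in buckets 0..bucket_index go
    left, the rest go right. Returns (None, -inf) if no valid split exists.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0          # guard against a constant feature

    # Step 3-1: accumulate gradient sums and sample counts per bucket.
    grad_sum = [0.0] * k
    count = [0] * k
    for v, g in zip(values, grads):
        b = min(int((v - lo) / width), k - 1)
        grad_sum[b] += g
        count[b] += 1

    s_p, n_p = sum(grad_sum), sum(count)   # parent-node totals

    # Step 3-3: scan the buckets as candidate split points.
    best_bucket, best_gain = None, float("-inf")
    s_l, n_l = 0.0, 0
    for b in range(k - 1):
        s_l += grad_sum[b]
        n_l += count[b]
        s_r, n_r = s_p - s_l, n_p - n_l
        if n_l == 0 or n_r == 0:
            continue                        # one side would be empty
        gain = s_l ** 2 / n_l + s_r ** 2 / n_r - s_p ** 2 / n_p
        if gain > best_gain:                # keep the maximum gain
            best_bucket, best_gain = b, gain
    return best_bucket, best_gain

if __name__ == "__main__":
    values = [0.0, 1.0, 2.0, 7.0, 8.0, 9.0]
    grads = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
    print(best_split(values, grads, k=3))
```

Step 3-2) corresponds to calling this routine once per feature and keeping the feature whose best gain is largest. Scanning k buckets instead of every distinct feature value is what makes the histogram approach cheap on high-dimensional data such as the 500-feature records used in the simulation.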
Simulation: The simulation uses measured optical fiber intrusion data comprising 1243 groups in total, of which 343 are intrusion events and 900 are normal events. Each group has 500 features, so the dataset is a 1243×500 matrix. Before the simulation, the dataset is shuffled and divided in a 6:2:2 ratio into a training set, a validation set and a test set, which are used to train the model, tune the model parameters and test the model performance, respectively. After tuning on the validation set, the model parameters are as shown in Table 1:
TABLE 1
Parameter | thr_1 | thr_2 | thr_3 | a | b | c |
---|---|---|---|---|---|---|
Value | 0.75 | 0.2 | 0.55 | 0.4 | 0.3 | 0.3 |
Table 2 compares the performance of the method with that of the MLP, SVM and LightGBM used individually; the method achieves the highest accuracy, precision and F1 score and matches the best recall:
TABLE 2
Method | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
MLP | 96.79% | 94.87% | 94.87% | 94.87% |
SVM | 95.18% | 91.25% | 93.59% | 92.41% |
LightGBM | 95.18% | 93.42% | 91.03% | 92.41% |
The method of this example | 97.19% | 96.10% | 94.87% | 95.48% |
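The F1 column in Table 2 can be cross-checked from the precision and recall columns, since F1 is their harmonic mean. The short sketch below recomputes it for each row; the dictionary layout and names are illustrative only. Using the rounded percentages from the table, the MLP, SVM and fusion rows reproduce the listed F1 values, while the LightGBM row comes out at about 92.21% rather than the listed 92.41%.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %) taken from Table 2
table2 = {
    "MLP": (94.87, 94.87),
    "SVM": (91.25, 93.59),
    "LightGBM": (93.42, 91.03),
    "This method": (96.10, 94.87),
}

f1 = {name: round(f1_score(p, r), 2) for name, (p, r) in table2.items()}

if __name__ == "__main__":
    for name, value in f1.items():
        print(f"{name}: F1 = {value}%")
```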
Claims (1)
1. The optical fiber intrusion detection method based on machine learning is characterized by comprising the following steps:
1) Signal preprocessing: let the optical fiber detection signal be $X=[x_1,x_2,\dots,x_N]^{T}$, N×m data points in total, wherein $x_i=[x_i^{1},x_i^{2},\dots,x_i^{m}]$ represents the i-th group of optical fiber detection signals, the detection duration is m time instants, and the label corresponding to the detection signals, $Y=[y_1,y_2,\dots,y_N]^{T}$, is a one-hot coding matrix; $x_i$ is expressed by formula (1) as:
$$x_i = s_i + \varepsilon \tag{1}$$
in formula (1), $s_i$ represents the original signal and $\varepsilon$ represents noise; $x_i$ is denoised by performing wavelet decomposition on the signal $x_i$ to obtain wavelet coefficients, the decomposition being shown in formula (2):
$$\begin{cases} H_s(j+1,k) = H_s(j,k) * u(j,k) \\ W_s(j+1,k) = H_s(j,k) * v(j,k) \end{cases} \tag{2}$$
in formula (2), $*$ represents convolution and $j=0,1,2,\dots,J$, wherein J represents the optimal decomposition scale, $u(j,k)$ represents the low-pass filter corresponding to the scaling function $\varphi(t)$, $v(j,k)$ represents the high-pass filter corresponding to the wavelet function $\psi(t)$, $H_s(j,k)$ represents the scale coefficients, wherein $H_s(0,k)=x_i$, $W_s(j,k)$ represents the wavelet coefficients, and $\omega_i=[W_s(1,k),W_s(2,k),\dots,W_s(J+1,k)]$ represents the wavelet coefficients obtained by the decomposition; a threshold is then selected to correct the wavelet coefficients, the threshold functions being divided into a hard threshold function and a soft threshold function; the soft threshold function is selected, as shown in formula (3):
$$\hat{W}_s(j,k) = \begin{cases} \operatorname{sgn}\big(W_s(j,k)\big)\,\big(|W_s(j,k)|-\lambda\big), & |W_s(j,k)| \ge \lambda \\ 0, & |W_s(j,k)| < \lambda \end{cases} \tag{3}$$
in formula (3), sgn(·) represents the sign function, $\lambda$ represents the estimated threshold, and $\hat{W}_s(j,k)$ represents the corrected wavelet coefficients; finally, signal reconstruction is carried out to obtain the denoised signal, as shown in formula (4):
$$H_s(j-1,k) = H_s(j,k) * \tilde{u}(j,k) + \hat{W}_s(j,k) * \tilde{v}(j,k) \tag{4}$$
in formula (4), $\tilde{u}(j,k)$ and $\tilde{v}(j,k)$ respectively represent the conjugates of $u(j,k)$ and $v(j,k)$; the $H_s(0,k)$ finally obtained is the denoised signal $\hat{x}_i$;
2) Signal feature extraction: features are extracted from the denoised signal $\hat{x}_i$ obtained in step 1) by performing a discrete Fourier transform on $\hat{x}_i$, which converts the denoised signal $\hat{x}_i$ from the time domain to the frequency domain, as shown in formula (5):
$$xf_i^{k} = \sum_{j=1}^{m} \hat{x}_i^{j}\, e^{-\sqrt{-1}\,2\pi jk/m}, \quad k=1,2,\dots,m \tag{5}$$
in formula (5), $\hat{x}_i^{j}$ represents the data acquired by the i-th group of optical fiber detection signals at the j-th moment;
3) Classification model: a fusion model fuses the multilayer perceptron MLP, the support vector machine SVM and LightGBM to process the optical fiber detection data; the fusion model takes the MLP as the main model and the SVM and LightGBM as auxiliary models; the MLP consists of an input layer, a hidden layer and an output layer, and the forward propagation between layers is shown in formula (6):
$$X_k = f\big(W_k X_{k-1} + b_k\big) \tag{6}$$
wherein $W_k \in \mathbb{R}^{O \times I}$ represents the weight matrix of the k-th layer, I and O respectively represent the input dimension and the output dimension of the k-th layer, $b_k=[b_k^{1},b_k^{2},\dots,b_k^{O}]^{T}$ represents the bias matrix of the k-th layer, wherein $b_k^{j}$ represents a bias of the k-th layer, $X_k$ represents the output of the k-th layer, wherein $X_0=[xf_1,xf_2,\dots,xf_n]^{T}$, and f(·) represents the activation function; the final output $P_{mlp}=[p_1,p_2,\dots,p_N]^{T}$ represents the probability matrix of the MLP after classifying the optical fiber detection data, wherein $p_i=[p_i^{0},p_i^{1}]$, $p_i^{0}$ and $p_i^{1}$ respectively represent the probabilities of classifying the i-th group of optical fiber detection data as normal and as intrusion, and $p_i^{0}+p_i^{1}=1$; when $p_i^{1} > thr_1$, the i-th group of optical fiber detection data is considered intrusion data; when $p_i^{1} < thr_2$, the i-th group of optical fiber detection data is considered normal data, wherein $thr_1$ and $thr_2$ both represent thresholds and $thr_1 > thr_2$; when $thr_2 \le p_i^{1} \le thr_1$, the SVM and LightGBM are used for a combined decision; the SVM is a binary classifier that maps the features of the data to points in a high-dimensional space and then divides the data into two classes with a hyperplane, the SVM maximizing the margin of the hyperplane; assuming the hyperplane H is the hyperplane dividing the different samples, it is expressed by formula (7):
$$\omega^{T}x + b = 0 \tag{7}$$
wherein $\omega=(\omega_1,\omega_2,\dots,\omega_d)$; let $H_1$ and $H_2$ be the planes containing the sample points closest to the hyperplane H, the support vectors being those closest points, wherein $H_1$ and $H_2$ are expressed as $\omega^{T}x+b=1$ and $\omega^{T}x+b=-1$; the distance from any sample point $x_i$ (i = 1, 2, …, N) to H is expressed by formula (8):
$$d_i = \frac{|\omega^{T}x_i + b|}{\|\omega\|} \tag{8}$$
$$d_m = \min_{i} d_i = \frac{1}{\|\omega\|} \tag{9}$$
formula (9) represents the minimum distance $d_m$ between the hyperplane and all sample points, so the maximum-margin hyperplane problem solved by the SVM model is represented by the optimization problem of formula (10), in which $y_i \in \{-1,+1\}$ represents the class of $x_i$:
$$\min_{\omega,b}\ \frac{1}{2}\|\omega\|^{2} \quad \text{s.t.}\quad y_i\big(\omega^{T}x_i+b\big) \ge 1,\ i=1,2,\dots,N \tag{10}$$
solving yields the hyperplane H, by which the samples are classified into two classes, and a classification probability $p_{svm}$ is computed in the same way as for the MLP; LightGBM is a model based on the gradient boosting decision tree GBDT, and LightGBM searches for the optimal split point with a histogram algorithm, the histogram algorithm comprising the following steps:
3-1) discretizing the continuous feature values into k integers and building a histogram of width k, the sum of the gradients and the number of samples falling into each bin of the histogram being accumulated in the corresponding sub-bucket;
3-2) looping over all features and repeating step 3-1);
3-3) traversing all sub-buckets, taking the current sub-bucket as the split point and computing its gain, as shown in formula (11):
$$Gain = \frac{S_L^{2}}{n_L} + \frac{S_R^{2}}{n_R} - \frac{S_P^{2}}{n_P} \tag{11}$$
wherein $S_L$, $S_R$ and $S_P$ respectively represent the gradient sum on the left side of the current sub-bucket, the gradient sum on the right side of the current sub-bucket and the total gradient sum of the parent node, and $n_L$, $n_R$ and $n_P$ respectively represent the number of samples on the left side of the current sub-bucket, the number of samples on the right side of the current sub-bucket and the total number of samples;
3-4) selecting the maximum gain and taking the feature and sub-bucket value with the maximum gain as the current split criterion; after the classification result is obtained, the classification probability $p_L$ of the current sample is computed; when $thr_2 \le p_i^{1} \le thr_1$, the fusion model result is calculated by formula (12):
$$p = a\,p_i^{1} + b\,p_{svm} + c\,p_L \tag{12}$$
wherein a, b and c respectively represent the weights of the MLP, the SVM and LightGBM, and a+b+c=1; when $p > thr_3$, the sample is considered intrusion data, and otherwise the sample is considered normal data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111460428.0A CN114118163B (en) | 2021-12-01 | 2021-12-01 | Optical fiber intrusion detection method based on machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114118163A (en) | 2022-03-01 |
CN114118163B (en) | 2024-03-19 |
Family
ID=80366352
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111460428.0A Active CN114118163B (en) | 2021-12-01 | 2021-12-01 | Optical fiber intrusion detection method based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114118163B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106991435A (en) * | 2017-03-09 | 2017-07-28 | 南京邮电大学 | Intrusion detection method based on improved dictionary learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108932480B (en) * | 2018-06-08 | 2022-03-15 | 电子科技大学 | Distributed optical fiber sensing signal feature learning and classifying method based on 1D-CNN |
US20210313063A1 (en) * | 2020-04-07 | 2021-10-07 | Clover Health | Machine learning models for gaps in care and medication actions |
Non-Patent Citations (1)
Title |
---|
An intrusion detection model based on support vector machine; 许劲松, 覃俊; Computer Simulation (计算机仿真); 2005-05-30 (No. 05); full text * |
Also Published As
Publication number | Publication date |
---|---|
CN114118163A (en) | 2022-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||