US11436475B2 - Anomaly detection with spiking neural networks - Google Patents
- Publication number
- US11436475B2 (U.S. application Ser. No. 16/436,744)
- Authority
- US
- United States
- Prior art keywords
- input
- layer
- median
- value
- inputs
- Legal status: Active, expires (status is an assumption, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G06N3/0481—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Definitions
- the disclosure relates generally to neural network computing, and more specifically, to detection of anomalies in a dataset by using spiking neural networks.
- Anomaly detection is an important problem in various fields of complex systems research including image processing, data analysis, physical security (for reduction of nuisance alarms), and cybersecurity (intrusion detection). Detection of anomalous data requires a contextual framework as well as a metric for comparison. For example, in images, the x and y axes provide a spatial context, and pixel-to-pixel value differences are reasonable for comparison. For video (as in physical security) or streaming data (as in cybersecurity), the time dimension adds to the context, and other features as well as combinations of features are relevant for comparison.
- Neural networks are well suited to anomaly detection due to their ability to find interesting data points, objects, or events within large volumes of data.
- spiking neural networks are particularly well-suited because the range of pixel values is typically bounded and small, and spatial context is easily represented using two-dimensional networks of neurons, resulting in intrinsically parallel operation performance.
- An illustrative embodiment provides a computer-implemented method of anomaly detection.
- the method comprises receiving, by an input layer in a spiking neural network, a number of inputs, wherein each input is contained within a number of progressively larger neighborhoods of surrounding inputs.
- the input layer converts the inputs into phase-coded spikes, and from the phase-coded spikes a first median layer computes a median value of each input for each size neighborhood.
- An absolute difference layer computes an absolute difference of each input from its median value for each size neighborhood.
- a second median layer computes, from absolute differences, a median absolute difference (MAD) value of each input for each size neighborhood.
- an adaptive median filter (AMF) layer determines if a MAD value for any size neighborhood exceeds a respective threshold.
- If a MAD value for one or more neighborhoods exceeds its respective threshold, the AMF layer outputs the median value of the input for the smallest neighborhood. If none of the MAD values for the neighborhoods exceeds the threshold, the AMF layer outputs the original value of the input received by the input layer.
- Another illustrative embodiment provides a computer program product for anomaly detection. The computer program product comprises a non-volatile computer readable storage medium having program instructions embodied therewith, the program instructions executable by a number of processors to implement a spiking neural network to perform the steps of: receiving, by an input layer in the spiking neural network, a number of inputs, wherein each input is contained within a number of progressively larger neighborhoods of surrounding inputs; converting, by the input layer, the inputs into phase-coded spikes; computing from the phase-coded spikes, by a first median layer, a median value of each input for each size neighborhood; computing, by an absolute difference layer, an absolute difference of each input from its median value for each size neighborhood; computing from the absolute differences, by a second median layer, a median absolute difference (MAD) value of each input for each size neighborhood; and determining for each input, by an adaptive median filter (AMF) layer, if a MAD value for any size neighborhood exceeds a respective threshold, wherein: if a MAD value for one or more neighborhoods exceeds its threshold, the AMF layer outputs the median value of the input for the smallest size neighborhood; or if none of the MAD values for the neighborhoods exceeds the threshold, the AMF layer outputs the original value of the input received by the input layer.
- the spiking neural network comprises: an input layer configured to receive a number of inputs and convert the inputs into phase-coded spikes, wherein each input is contained within a number of progressively larger neighborhoods of surrounding inputs; a first median layer configured to compute, from the phase-coded spikes, a median value of each input for each size neighborhood; an absolute difference layer configured to compute an absolute difference of each input from its median value for each size neighborhood; a second median layer configured to compute, from absolute differences, a median absolute difference (MAD) value of each input for each size neighborhood; and an adaptive median filter (AMF) layer configured to determine for each input if a MAD value for any size neighborhood exceeds a respective threshold, wherein: if a MAD value of one or more neighborhoods exceeds its threshold, the AMF layer outputs the median value of the input for the smallest size neighborhood; or if none of the MAD values for the neighborhoods exceeds the threshold, the AMF layer outputs the original value of the input received by the input layer.
- FIG. 1 is an illustration of a block diagram of an information environment in accordance with an illustrative embodiment
- FIG. 2 is a diagram that illustrates a node in a neural network in which illustrative embodiments can be implemented
- FIG. 3 is a diagram illustrating a restricted Boltzmann machine in which illustrative embodiments can be implemented
- FIG. 4 is a schematic illustration of a spiking neural architecture, in accordance with an illustrative embodiment
- FIG. 5 depicts a multi-layer, spiking, adaptive median-filtering network in accordance with illustrative embodiments
- FIG. 6 depicts the application of anomaly detection and correction using a spiking neural network in accordance with illustrative embodiments
- FIG. 7 is a flowchart depicting a method of anomaly detection using a spiking neural network in accordance with illustrative embodiments.
- FIG. 8 is a diagram of a data processing system depicted in accordance with an illustrative embodiment.
- the illustrative embodiments recognize and take into account one or more different considerations. For example, the illustrative embodiments recognize and take into account that anomaly detection is an important problem requiring a contextual framework and a comparison metric. Anomaly detection has applications in image processing, physical security, and cybersecurity.
- the illustrative embodiments further recognize and take into account that neural networks are well suited to detecting anomalies within large volumes of data, so-called “Big Data.”
- the present disclosure provides a spiking neural network with phase-coded spiking neurons as basic computational elements.
- Contextual framework is provided by multiple-sized neighborhoods of surrounding data. Multiple iterations of spiking adaptive median filtering are handled by routing outputs back to the neural network.
- With reference to FIG. 1, an illustration of a diagram of a data processing environment is depicted in accordance with an illustrative embodiment. It should be appreciated that FIG. 1 is only provided as an illustration of one implementation and is not intended to imply any limitation with regard to the environments in which the different embodiments may be implemented. Many modifications to the depicted environments may be made.
- the computer-readable program instructions may also be loaded onto a computer, a programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, a programmable apparatus, or other device to produce a computer implemented process, such that the instructions which execute on the computer, the programmable apparatus, or the other device implement the functions and/or acts specified in the flowchart and/or block diagram block or blocks.
- FIG. 1 depicts a pictorial representation of a network of data processing systems in which illustrative embodiments may be implemented.
- Network data processing system 100 is a network of computers in which the illustrative embodiments may be implemented.
- Network data processing system 100 contains network 102 , which is a medium used to provide communications links between various devices and computers connected together within network data processing system 100 .
- Network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
- server computer 104 and server computer 106 connect to network 102 along with storage unit 108 .
- client computers include client computer 110 , client computer 112 , and client computer 114 .
- Client computer 110 , client computer 112 , and client computer 114 connect to network 102 . These connections can be wireless or wired connections depending on the implementation.
- Client computer 110 , client computer 112 , and client computer 114 may be, for example, personal computers or network computers.
- server computer 104 provides information, such as boot files, operating system images, and applications to client computer 110 , client computer 112 , and client computer 114 .
- Client computer 110 , client computer 112 , and client computer 114 are clients to server computer 104 in this example.
- Network data processing system 100 may include additional server computers, client computers, and other devices not shown.
- Program code located in network data processing system 100 may be stored on a computer-recordable storage medium and downloaded to a data processing system or other device for use.
- the program code may be stored on a computer-recordable storage medium on server computer 104 and downloaded to client computer 110 over network 102 for use on client computer 110 .
- network data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another.
- At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers consisting of thousands of commercial, governmental, educational, and other computer systems that route data and messages.
- network data processing system 100 also may be implemented as a number of different types of networks, such as, for example, an intranet, a local area network (LAN), or a wide area network (WAN).
- FIG. 1 is intended as an example, and not as an architectural limitation for the different illustrative embodiments.
- network data processing system 100 is not meant to limit the manner in which other illustrative embodiments can be implemented.
- client computers may be used in addition to or in place of client computer 110 , client computer 112 , and client computer 114 as depicted in FIG. 1 .
- client computer 110 , client computer 112 , and client computer 114 may include a tablet computer, a laptop computer, a bus with a vehicle computer, and other suitable types of clients.
- the hardware may take the form of a circuit system, an integrated circuit, an application-specific integrated circuit (ASIC), a programmable logic device, or some other suitable type of hardware configured to perform a number of operations.
- the device may be configured to perform the number of operations.
- the device may be reconfigured at a later time or may be permanently configured to perform the number of operations.
- Programmable logic devices include, for example, a programmable logic array, programmable array logic, a field programmable logic array, a field programmable gate array, and other suitable hardware devices.
- the processes may be implemented in organic components integrated with inorganic components and may be comprised entirely of organic components, excluding a human being. For example, the processes may be implemented as circuits in organic semiconductors.
- GPUs are particularly well suited to machine learning. Their specialized parallel processing architecture allows them to perform many more floating point operations per second than a CPU, on the order of 100× or more. GPUs can be clustered together to run neural networks comprising hundreds of millions of connection nodes.
- Supervised machine learning comprises providing the machine with training data and the correct output value of the data.
- the values for the output are provided along with the training data (labeled dataset) for the model building process.
- the algorithm through trial and error, deciphers the patterns that exist between the input training data and the known output values to create a model that can reproduce the same underlying rules with new data.
- Examples of supervised learning algorithms include, but are not limited to, regression analysis, decision trees, k-nearest neighbors, neural networks, and support vector machines.
- Unsupervised learning has the advantage of discovering patterns in the data with no need for labeled datasets.
- algorithms used in unsupervised machine learning include, but are not limited to, k-means clustering, association analysis, and descending clustering.
- supervised and unsupervised methods learn from a dataset
- reinforcement learning methods learn from interactions with an environment.
- Algorithms such as Q-learning are used to train the predictive model through interacting with the environment using measurable performance criteria.
- FIG. 2 is a diagram that illustrates a node in a neural network in which illustrative embodiments can be implemented.
- Node 200 combines multiple inputs 210 from other nodes. Each input 210 is multiplied by a respective weight 220 that either amplifies or dampens that input, thereby assigning significance to each input for the task the algorithm is trying to learn.
- the weighted inputs are collected by a net input function 230 and then passed through an activation function 240 to determine the output 250 .
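- The weighted-sum-then-activation flow described above can be sketched as follows. This is a minimal illustration, not an implementation from the patent; the sigmoid activation is one common choice of activation function.

```python
import math

def node_output(inputs, weights, bias=0.0):
    """Combine weighted inputs via a net input function (a weighted sum
    plus bias), then apply an activation function (here a sigmoid)
    to determine the node's output."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-net))
```

With all weights and the bias at zero, the net input is zero and the sigmoid output is exactly 0.5; a large positive weighted input drives the output toward 1.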
- the connections between nodes are called edges.
- the respective weights of nodes and edges might change as learning proceeds, increasing or decreasing the weight of the respective signals at an edge.
- a node might only send a signal if the aggregate input signal exceeds a predefined threshold. Pairing adjustable weights with input features is how significance is assigned to those features with regard to how the network classifies and clusters input data.
- Neural networks are often aggregated into layers, with different layers performing different kinds of transformations on their respective inputs.
- a node layer is a row of nodes that turn on or off as input is fed through the network. Signals travel from the first (input) layer to the last (output) layer, passing through any layers in between. Each layer's output acts as the next layer's input.
- Stochastic neural networks are a type of network that incorporate random variables, which makes them well suited for optimization problems. This is done by giving the nodes in the network stochastic (randomly determined) weights or transfer functions.
- a Boltzmann machine is a type of stochastic neural network in which each node is binary valued, and the chance of it firing depends on the other nodes in the network.
- Each node is a locus of computation that processes an input and begins by making stochastic decisions about whether to transmit that input or not.
- the weights (coefficients) that modify inputs are randomly initialized.
- Boltzmann machines optimize weights and quantities and are particularly well suited to represent and solve difficult combinatorial problems.
- a Boltzmann machine is shown a set of binary data vectors and must find weights on the connections so that the data vectors are good solutions to the optimization problem defined by those weights.
- FIG. 3 is a diagram illustrating a restricted Boltzmann machine in which illustrative embodiments can be implemented.
- the nodes in the Boltzmann machine 300 are divided into a layer of visible nodes 310 and a layer of hidden nodes 320 .
- a common problem with general Boltzmann machines is that they stop learning correctly when they are scaled up.
- Restricted Boltzmann machines overcome this problem by using an architecture that does not allow connections between nodes in the same layer. As can be seen in FIG. 3 , there is no intralayer communication between nodes.
- the visible nodes 310 are those that receive information from the environment (i.e. a set of external training data). Each visible node in layer 310 takes a low-level feature from an item in the dataset and passes it to the hidden nodes in the next layer 320 . When a node in the hidden layer 320 receives an input value x from a visible node in layer 310 it multiplies x by the weight assigned to that connection (edge) and adds it to a bias b. The result of these two operations is then fed into an activation function which produces the node's output.
- each node in one layer is connected to every node in the next layer. For example, when node 321 receives input from all of the visible nodes 311 - 313 each x value from the separate nodes is multiplied by its respective weight, and all of the products are summed. The summed products are then added to the hidden layer bias, and the result is passed through the activation function to produce output 331 . A similar process is repeated at hidden nodes 322 - 324 to produce respective outputs 332 - 334 . In the case of a deeper neural network (discussed below), the outputs 330 of hidden layer 320 serve as inputs to the next hidden layer.
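- The fully connected forward pass just described (multiply each visible value by its edge weight, sum, add the hidden-layer bias, apply the activation) can be sketched as follows. The function names and the sigmoid activation are illustrative choices, not details from the disclosure.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rbm_forward(visible, weights, hidden_bias):
    """Forward pass of a restricted Boltzmann machine: each hidden node j
    sums every visible value times the weight on its connecting edge,
    adds the hidden-layer bias, and applies the activation function."""
    outputs = []
    for j in range(len(hidden_bias)):
        net = sum(v * weights[i][j] for i, v in enumerate(visible))
        outputs.append(sigmoid(net + hidden_bias[j]))
    return outputs
```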
- the first phase is the “positive” phase in which the visible nodes' states are clamped to a particular binary state vector sampled from the training set (i.e. the network observes the training data).
- the second phase is the “negative” phase in which none of the nodes have their state determined by external data, and the network is allowed to run freely (i.e. the network tries to reconstruct the input).
- the negative reconstruction phase the activations of the hidden layer 320 act as the inputs in a backward pass to visible layer 310 . The activations are multiplied by the same weights that the visible layer inputs were on the forward pass.
- the output of those operations is a reconstruction r (i.e. an approximation of the original input x).
- the RBM uses inputs to make predictions about node activations (i.e. the probability of output given a weighted input x).
- the RBM is attempting to estimate the probability of inputs x given activations a, which are weighted with the same coefficients as those used on the forward pass.
- the bias of the hidden layer helps the RBM to produce activations on the forward pass. Biases impose a floor so that at least some nodes fire no matter how sparse the input data.
- the visible layer bias helps the RBM learn the reconstructions on the backward pass.
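- The backward (reconstruction) pass can be sketched in the same style: hidden activations travel back through the same weight matrix, indexed transposed, plus the visible-layer bias. This is an illustrative sketch, not the patent's implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rbm_reconstruct(hidden, weights, visible_bias):
    """Backward pass: hidden activations are multiplied by the SAME
    weights used on the forward pass (weights[i][j], read transposed)
    and added to the visible-layer bias, producing a reconstruction r,
    an approximation of the original input x."""
    return [
        sigmoid(sum(h * weights[i][j] for j, h in enumerate(hidden)) + visible_bias[i])
        for i in range(len(visible_bias))
    ]
```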
- a cost function estimates how the model is performing. It is a measure of how wrong the model is in terms of its ability to estimate the relationship between input x and output y. This is expressed as a difference or distance between the predicted value and the actual value.
- the cost function (i.e. loss or error) can be estimated by iteratively running the model to compare estimated predictions against known values of y during supervised learning. The objective of a machine learning model, therefore, is to find parameters, weights, or a structure that minimizes the cost function.
- Gradient descent is an optimization algorithm that attempts to find a local or global minima of a function, thereby enabling the model to learn the gradient or direction that the model should take in order to reduce errors. As the model iterates, it gradually converges towards a minimum where further tweaks to the parameters produce little or zero changes in the loss. At this point the model has optimized the weights such that they minimize the cost function.
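- The iterate-and-converge behavior of gradient descent can be sketched in a few lines. The learning rate and step count are illustrative values.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Repeatedly step opposite the gradient; as the iterates approach
    a minimum, further updates change the parameter (and the loss)
    very little."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x
```

For example, minimizing f(x) = (x - 3)^2, whose gradient is 2(x - 3), converges to the minimum at x = 3.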
- neural networks can be stacked to create deep networks. After training one network, the activities of its hidden nodes can be used as training data for a higher-level network. Such stacking makes it possible to efficiently train several layers of hidden nodes.
- One such type of stacked network that more closely simulates the functioning of biological systems is a Spiking Neural Network (SNN). SNNs incorporate the concept of time into their operating model. One of the most important differences between SNNs and other types of neural networks is the way information propagates between units/nodes.
- a synapse can be either excitatory (i.e. increases membrane potential) or inhibitory (i.e. decreases membrane potential).
- the strength of the synapses can be changed as a result of learning.
- SNNs allow learning (weight modification) that depends on the relative timing of spikes between pairs of directly connected nodes, a mechanism known as spike-timing-dependent plasticity (STDP).
- the weight connecting pre- and post-synaptic units is adjusted according to their relative spike times within a specified time interval. If a pre-synaptic unit fires before the post-synaptic unit within the specified time interval, the weight connecting them is increased (long-term potentiation (LTP)). If it fires after the post-synaptic unit within the time interval, the weight is decreased (long-term depression (LTD)).
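- The LTP/LTD update rule above can be sketched as a simple pair-based function. The window length and learning rate are illustrative values, not parameters from the disclosure.

```python
def stdp_update(w, t_pre, t_post, window=20.0, lr=0.05):
    """Pair-based STDP: if the pre-synaptic spike precedes the
    post-synaptic spike within the time window, increase the weight
    (long-term potentiation, LTP); if it follows within the window,
    decrease it (long-term depression, LTD); otherwise leave it alone."""
    dt = t_post - t_pre
    if 0 < dt <= window:
        return w + lr  # pre fired before post: LTP
    if -window <= dt < 0:
        return w - lr  # pre fired after post: LTD
    return w           # outside the window: no change
```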
- the leaky integrate-and-fire (LIF) neuron has been a primary area of interest for the development of an artificial neuron and is a modified version of the original integrate-and-fire circuit.
- the LIF neuron is based on the biological neuron, which exhibits the following functionalities:
- Firing: emission of an output spike when the accumulated signal reaches a certain level after a series of integration and leaking.
- An LIF neuron continually integrates the energy provided by inputs until a threshold is reached and the neuron fires, emitting a spike that provides input to other neurons via synapse connections. By emitting this spike, the neuron is returned to a low energy state and continues to integrate input current until its next firing. Throughout this process, the energy stored in the neuron continually leaks. If insufficient input is provided within a specified time frame, the neuron gradually reverts to a low energy state. This prevents the neuron from indefinitely retaining energy, which would not match the behavior of biological neurons.
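- A minimal discrete-time sketch of this integrate/leak/fire/reset cycle follows. The threshold and leak factor are illustrative choices, not values from the disclosure.

```python
class LIFNeuron:
    """Leaky integrate-and-fire neuron: each step, the membrane value
    leaks by a constant factor, then integrates the input current;
    crossing the threshold emits a spike and resets to a low state."""
    def __init__(self, threshold=1.0, leak=0.9):
        self.v = 0.0
        self.threshold = threshold
        self.leak = leak

    def step(self, current):
        self.v = self.v * self.leak + current  # leak, then integrate
        if self.v >= self.threshold:
            self.v = 0.0                       # reset to low-energy state
            return True                        # spike emitted
        return False
```

Feeding a constant sub-threshold current, the neuron accumulates over a few steps, spikes once, resets, and begins accumulating again.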
- Lateral inhibition is a process that allows an excited neuron to inhibit, or reduce, the activity of other nearby or connected neurons.
- One such neural computing system that seeks to take advantage of this is the winner-take-all system.
- artificial neurons contend for activation, meaning that only one neuron is chosen as the winner and allowed to fire, using lateral inhibition to suppress the output of all other neurons. After the winning neuron fires, the system is reset and the neurons once again compete for activation.
- a winner-take-all system is one of the many machine learning paradigms that take advantage of the lateral inhibition phenomenon, which is commonly used in recognition and modeling processes.
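- The lateral-inhibition selection in a winner-take-all system can be sketched as follows (a non-spiking illustration of the selection step only):

```python
def winner_take_all(activations):
    """Only the most strongly excited neuron is allowed to fire;
    lateral inhibition suppresses the output of all others to zero."""
    winner = max(range(len(activations)), key=lambda i: activations[i])
    return [a if i == winner else 0.0 for i, a in enumerate(activations)]
```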
- Spiking neural architecture 400 may comprise a plurality of neuron lanes.
- a neuron lane is a particular type of excitatory neural pathway.
- the number of neuron lanes might equal the number of input values from which an optimum value will be determined.
- when input values x 1 through x p are provided, spiking neural architecture 400 comprises p neuron lanes.
- Each neuron lane comprises a spiking neuron implementing an objective function and a blocking neuron.
- neuron lane 404 comprises spiking neuron 410 and blocking neuron 412 .
- Neuron lane 406 comprises spiking neuron 414 and blocking neuron 416 .
- Neuron lane 408 comprises spiking neuron 418 and blocking neuron 420 .
- Spiking neurons 410 , 414 , and 418 can be leaky integrate-and-fire neurons.
- Inputs to each spiking neuron might consist of any one or all of the external inputs x i modified by internal weights w ij and an additional bias signal with weight w i0 .
- spiking neurons 410 , 414 , and 418 in spiking neural architecture 400 might be initialized using the value of the un-normalized univariate signed rank function for each input value in relation to all other possible input values ⁇ x 1 , x 2 , . . . , x p ⁇ .
- the spiking neurons receive no further input.
- the first one of the spiking neurons to decay to zero defines the computed median value.
- the initial neuron values can be either positive or negative according to un-normalized univariate signed rank function, and they will each decay toward zero as needed to compute the median.
- the spiking neurons thus provide inhibitory signals such that the first one to decay completely will be the first to no longer inhibit the output by its corresponding blocking neuron of its originally associated input signal x i , corresponding to the sample median of the original array of input values.
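- The race-to-zero described above can be emulated without explicit spike dynamics: the lane initialized with the un-normalized signed rank closest to zero decays to zero first, and its input is the sample median. A minimal non-spiking sketch (assuming distinct, odd-count inputs):

```python
def signed_rank_median(values):
    """Initialize each lane with the un-normalized univariate signed rank
    of its input against all other inputs; the lane whose magnitude is
    nearest zero would decay first, so its input is the sample median."""
    def signed_rank(i):
        # Count of inputs below values[i] minus count above it.
        return sum((values[i] > v) - (values[i] < v) for v in values)
    winner = min(range(len(values)), key=lambda i: abs(signed_rank(i)))
    return values[winner]
```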
- the bias weights w i0 are set to 0 in this particular application.
- FIG. 5 depicts a multi-layer, spiking, adaptive median-filtering (AMF) network in accordance with illustrative embodiments.
- the example shown in FIG. 5 covers noise filtering in image processing.
- the method of the illustrative embodiments can be generalized to other types of datasets such as streaming data.
- Image processing and noise filtering are well suited to demonstrating anomaly detection because it is easy to visualize differences in performance.
- spiking AMF network 500 comprises five layers.
- Input layer 510 receives input value x ij 512 for pixel u ij 514 and converts it to phase-coded spike value ω ij 516 .
- Phase-coding allows the spiking algorithm to process inputs and internal computations in ascending order. Including a delay of d will delay the neuron firing by ⌊d/k⌋ phase windows plus d mod k phase steps. The delay is assigned to an input according to its value within a minimum/maximum range. At each layer, the smallest phase-code is processed first, then proceeding up to the largest phase-code. The phase-coding is reset for each subsequent layer of the network.
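- The value-to-delay assignment can be sketched as follows. The number of quantization levels and the phase-window size k are illustrative choices, not values from the disclosure.

```python
def phase_code(x, x_min, x_max, levels=16, k=8):
    """Assign a spike delay d according to the value's position within
    a minimum/maximum range, then split it into d // k phase windows
    plus d % k phase steps; smaller values fire (and are processed)
    earlier."""
    d = round((x - x_min) / (x_max - x_min) * (levels - 1))
    return d // k, d % k
```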
- the phase-coded spike ⁇ ij from the input layer 510 is fed into a median-value layer 520 .
- Median-value layer 520 computes a median value for each pixel within neighborhood 522 .
- a neighborhood comprises a number of input values surrounding a specific reference input value (in this case u ij ).
- neighborhood 522 comprises a 3×3 matrix of pixels with pixel u ij at the center and eight immediately adjacent pixels.
- the median value layer 520 calculates a 3×3 median value.
- median value m ij 524 is calculated for pixel u ij using all of the pixel values in neighborhood 522 .
- the median value m ij 524 is represented by phase-coded spike
- ω̂ ij = median { ω(u) : u ∈ Ω ij 1 } 526 .
- the median value layer 520 uses values for pixels that are in neighborhood 522 as well as pixels that are not included in neighborhood 522 .
- the median value layer will use the values of pixels u i−1,j−1 , u i−1,j , u i,j , u i+1,j , u i+1,j−1 , as well as unshown pixels u i,j−2 , u i+1,j−2 , and u i−1,j−2 .
- the median value layer 520 only has to wait for half of the neurons representing the pixels in neighborhood 522 to spike before forwarding the spikes to the next layer.
- FIG. 5 only depicts a 3×3 neighborhood 522 containing pixel u ij corresponding to ω̂ ij 1 and r ij 1 .
- median value layer 520 also calculates in parallel a median value for pixel u ij for a number of progressively larger neighborhoods containing u ij .
- These neighborhoods can include, e.g., matrices of size 5×5 (corresponding to ω̂ ij 2 and r ij 2 ), 7×7 (corresponding to ω̂ ij 3 and r ij 3 ), and 9×9 (corresponding to ω̂ ij 4 and r ij 4 ).
- a neighborhood can comprise a specified number of input values before and after a reference input or input values within a specified time frame before and after the reference input.
- an absolute difference layer 530 calculates the absolute difference between the median value m ij 534 and the original value for the pixel u ij 532 using a 2×2 comparison.
- neuron a ji 538 subtracts the value of the phase-coded spike ω ij for the original input u ij from the value of the phase-coded spike ω̂ ij for the median value m ij .
- Neuron a ij 536 subtracts ω̂ ij from ω ij . Only one of either a ij or a ji can have a positive input and therefore spike. The other neuron will have a negative input and therefore not spike. Alternatively, both signals can be zero if there is no difference between median value and original input value.
- the absolute value indicates the degree to which the original input value of pixel u ij differs from the other pixels in the neighborhood.
- An absolute difference is calculated in parallel for each pixel, for each specified size neighborhood. Therefore, for example, an absolute difference is calculated for pixel u ij using the 3×3 median value (shown in FIG. 5 ), 5×5 median value, 7×7 median value, and 9×9 median value.
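- The paired-neuron absolute difference can be sketched as two opposed one-sided subtractions; this is an illustrative rectified-difference model of the a ij / a ji pair, not the patent's circuit.

```python
def abs_difference_pair(omega, omega_hat):
    """Two opposed neurons each compute a one-sided difference; at most
    one receives a positive input and spikes (the other stays silent,
    or both are zero), and their combined output is |omega - omega_hat|."""
    a_ij = max(omega - omega_hat, 0.0)  # spikes only if omega > omega_hat
    a_ji = max(omega_hat - omega, 0.0)  # spikes only if omega_hat > omega
    return a_ij + a_ji
```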
- the positive value calculated by absolute difference layer 530 is propagated to the median absolute difference (MAD) layer 540 , which calculates a MAD value r ij 1 546 for pixel u ij .
- the MAD layer 540 can potentially receive signals from one of two sets of spiking neurons from absolute difference layer 530 , set 542 (centered around neuron a ij ) or set 544 (centered around neuron a ji ). However, as noted above regarding absolute difference, only one of the paired signals from a ij and a ji can be positive (and propagated forward to MAD layer 540 ) since the other is guaranteed to be negative, or both can be zero.
- MAD layer 540 operates in a similar manner to median layer 520 , but instead of calculating the median of the original pixel values, MAD layer 540 calculates the median of the absolute difference values calculated in layer 530 .
- the MAD value r ij 1 546 is represented by phase-coded spike σ̃ ij 1 = median over u in N ij 1 of |σ̂ u − σ u | 548, i.e., the median, taken over the smallest (3×3) neighborhood N ij 1 of pixel u ij , of the absolute-difference spikes calculated in layer 530.
- MAD layer 540 calculates a MAD value for the pixel for a number of different size neighborhoods (e.g., 3×3, 5×5, etc.), similar to median layer 520 .
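- In conventional, non-spiking arithmetic, the median and MAD layers together compute, for each neighborhood size, a local median and the median of absolute deviations from it. A minimal sketch under that reading (the function names, the toy image, and the clamped border handling are illustrative assumptions):

```python
import statistics

def neighborhood(image, i, j, radius):
    """Values in the clamped (2r+1) x (2r+1) square around pixel (i, j)."""
    h, w = len(image), len(image[0])
    return [image[r][c]
            for r in range(max(0, i - radius), min(h, i + radius + 1))
            for c in range(max(0, j - radius), min(w, j + radius + 1))]

def local_median(image, i, j, radius):
    return statistics.median(neighborhood(image, i, j, radius))

def local_mad(image, i, j, radius):
    """Median absolute difference: the median, over the neighborhood, of
    each value's absolute difference from the neighborhood median."""
    hood = neighborhood(image, i, j, radius)
    med = statistics.median(hood)
    return statistics.median(abs(v - med) for v in hood)

img = [[12, 11, 10, 11, 12],
       [11, 10, 10, 10, 11],
       [10, 10, 255, 10, 10],   # impulse at the center
       [11, 10, 10, 10, 11],
       [12, 11, 10, 11, 12]]

for radius in (1, 2):  # 3x3, then 5x5, as in the progressively larger hoods
    print(radius, local_median(img, 2, 2, radius), local_mad(img, 2, 2, radius))
# 1 10 0   (the lone impulse cannot move the 3x3 median or MAD)
# 2 11 1
```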
- the MAD value of the pixel for each size neighborhood is then propagated to an adaptive median-filter (AMF) layer 550 .
- the AMF layer 550 compares the MAD values of a pixel, represented by a number of spiking neurons r m 552, one for each size neighborhood (i.e., r ij 1 , r ij 2 , r ij 3 , r ij 4 ), to predefined threshold values τ m according to Equation 3 below:
- o ij = σ̂ ij 1 if σ̃ ij m > τ m for any neighborhood size m; o ij = x ij otherwise. Eq. 3
- the pixel is deemed to deviate too far from its neighbors, and an anomaly is detected, causing the representative neuron r m for that neighborhood to spike. If an anomaly is detected in any size neighborhood, the median value of the pixel for the smallest neighborhood, represented by phase-coded spike σ̂ ij 1 , is output by the AMF layer 550 as output o ij 554 instead of the pixel's original input value x ij .
- the median value for the smallest neighborhood (σ̂ ij 1 ) is still selected as the output value o ij 554 .
- the median value for the smallest neighborhood is deemed to be the most accurate representation of the immediate surroundings of the pixel in question.
- phase-coded but unprocessed pixel value u ij is delayed by 5 k phase-code windows to allow the appropriate time for anomaly detection in each of the neighborhoods.
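- The output selection of Equation 3 reduces to a few lines once a pixel's per-neighborhood medians and MAD values are in hand. A sketch with illustrative threshold values (the patent does not fix the τ m here):

```python
def amf_output(x, medians, mads, thresholds):
    """Output selection of the AMF layer (Eq. 3): if the pixel's MAD value
    for ANY neighborhood size exceeds that size's threshold, an anomaly is
    declared and the median for the smallest neighborhood replaces the
    original value; otherwise the original value passes through unchanged.

    medians[m], mads[m], thresholds[m] correspond to neighborhood size m
    (0 -> 3x3, 1 -> 5x5, ...).  The threshold values below are illustrative.
    """
    anomaly = any(r > tau for r, tau in zip(mads, thresholds))
    return medians[0] if anomaly else x

# Anomalous pixel: the 5x5 MAD exceeds its threshold, so the 3x3 median wins.
print(amf_output(255, medians=[10, 11], mads=[0, 6], thresholds=[4, 4]))  # 10
# Ordinary pixel: no MAD exceeds its threshold; the original value survives.
print(amf_output(12, medians=[10, 11], mads=[1, 2], thresholds=[4, 4]))   # 12
```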
- FIG. 6 depicts the application of anomaly detection and correction using a spiking neural network in accordance with illustrative embodiments.
- Image 602 is the original unprocessed image containing 10% “salt-and-pepper” noise.
- Image 604 is the corrected image after passing the original image 602 through the AMF spiking network. Spiking AMF results in a percentage of changed (corrected) pixels after three iterations that equals the percentage of noise in the original image 602 .
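- The salt-and-pepper experiment can be reproduced in miniature with a classical, non-spiking 3×3 median filter standing in for the spiking AMF. Everything below — the constant test image, the seed, the single pass — is an illustrative assumption; on a constant image, roughly the corrupted 10% of pixels change while clean pixels survive:

```python
import random
import statistics

def salt_and_pepper(image, fraction, rng):
    """Corrupt a copy of the image: 'fraction' of its pixels forced to 0 or 255."""
    h, w = len(image), len(image[0])
    noisy = [row[:] for row in image]
    coords = [(r, c) for r in range(h) for c in range(w)]
    for r, c in rng.sample(coords, int(fraction * h * w)):
        noisy[r][c] = rng.choice((0, 255))
    return noisy

def median_filter_3x3(image):
    """Plain 3x3 median filter (clamped at the borders)."""
    h, w = len(image), len(image[0])
    return [[statistics.median(image[rr][cc]
                               for rr in range(max(0, r - 1), min(h, r + 2))
                               for cc in range(max(0, c - 1), min(w, c + 2)))
             for c in range(w)] for r in range(h)]

rng = random.Random(0)
clean = [[128] * 32 for _ in range(32)]
noisy = salt_and_pepper(clean, 0.10, rng)
restored = median_filter_3x3(noisy)
changed = sum(noisy[r][c] != restored[r][c] for r in range(32) for c in range(32))
print(changed / (32 * 32))  # close to the 0.10 noise fraction
```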
- FIG. 7 is a flowchart depicting a method of anomaly detection using a spiking neural network in accordance with illustrative embodiments.
- Process 700 begins with the input layer of the spiking neural network receiving an input value (step 702 ). This input might be a pixel value or an input in a data stream. The input layer then converts the input to a phase-coded spike value (step 704 ).
- a median layer in the spiking network calculates a median value of the input for a number of progressively larger neighborhoods containing the input (step 706 ).
- the neighborhoods might comprise increasingly larger symmetric matrices of adjacent pixels surrounding the image pixel in question.
- the neighborhoods might comprise increasingly larger symmetrical numbers of inputs preceding and following the input in question.
- an absolute difference layer calculates the absolute difference between the median value of the input for each size neighborhood and the original input value received by the input layer (step 708 ).
- a median absolute difference (MAD) layer then calculates a MAD value for the input for each neighborhood (step 710 ).
- An adaptive median-filter (AMF) layer determines if the input's MAD value for each neighborhood exceeds a respective threshold (step 712 ). If none of the neighborhoods containing the input have a MAD value exceeding their respective thresholds, no anomaly is detected, and the original input value is output by the AMF layer (step 714 ).
- the AMF layer outputs the median value of the input for the smallest neighborhood (step 716 ).
- process 700 determines if a predetermined number of iterations have been performed on the input data (step 718 ). If the prescribed number of iterations has not yet been performed, the output value from the AMF layer is input back into the median value layer at step 706 , bypassing the input layer. If the prescribed number of iterations has been performed, process 700 ends.
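- Steps 706 through 718 can be sketched end to end in conventional arithmetic. Phase coding is omitted, the radii, thresholds, and iteration count are illustrative, and the anomaly test below uses the classical Hampel form — flag the pixel when its deviation from the local median exceeds τ times the local MAD — rather than the patent's direct MAD-to-threshold comparison, so the effect is visible on a toy image:

```python
import statistics

def hood(img, i, j, radius):
    h, w = len(img), len(img[0])
    return [img[r][c]
            for r in range(max(0, i - radius), min(h, i + radius + 1))
            for c in range(max(0, j - radius), min(w, j + radius + 1))]

def amf_pass(img, radii=(1, 2), taus=(3.0, 3.0)):
    """One pass of steps 706-716 for every pixel: median, absolute
    difference, MAD, then output selection (Hampel-style anomaly test)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(h):
        for j in range(w):
            medians, flags = [], []
            for radius, tau in zip(radii, taus):
                values = hood(img, i, j, radius)
                med = statistics.median(values)
                mad = statistics.median(abs(v - med) for v in values)
                medians.append(med)
                flags.append(abs(img[i][j] - med) > tau * mad)
            # step 716 if any neighborhood flags an anomaly, else step 714
            out[i][j] = medians[0] if any(flags) else img[i][j]
    return out

def run(img, iterations=3):
    """Step 718: feed the AMF output back in, bypassing the input layer."""
    for _ in range(iterations):
        img = amf_pass(img)
    return img

img = [[10] * 5 for _ in range(5)]
img[2][2] = 255                      # a single impulse
print(run(img)[2][2])  # 10: the impulse is replaced by the 3x3 median
```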
- Data processing system 800 is an example of a system in which computer-readable program code or program instructions implementing processes of illustrative embodiments may be run.
- data processing system 800 includes communications fabric 802 , which provides communications between processor unit 804 , memory 806 , persistent storage 808 , communications unit 810 , input/output unit 812 , and display 814 .
- Processor unit 804 serves to execute instructions for software applications and programs that may be loaded into memory 806 .
- Processor unit 804 may be a set of one or more hardware processor devices or may be a multi-processor core, depending on the particular implementation. Further, processor unit 804 may be implemented using one or more heterogeneous processor systems, in which a main processor is present with secondary processors on a single chip. As another illustrative example, processor unit 804 may be a symmetric multi-processor system containing multiple processors of the same type.
- a computer-readable storage device is any piece of hardware that is capable of storing information, such as, for example, without limitation, data, computer-readable program code in functional form, and/or other suitable information either on a transient basis and/or a persistent basis. Further, a computer-readable storage device excludes a propagation medium.
- Memory 806, in these examples, may be, for example, a random access memory, or any other suitable volatile or non-volatile storage device.
- Persistent storage 808 may take various forms, depending on the particular implementation.
- persistent storage 808 may contain one or more devices.
- persistent storage 808 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above.
- the media used by persistent storage 808 may be removable.
- a removable hard drive may be used for persistent storage 808 .
- Communications unit 810, in this example, provides for communication with other computers, data processing systems, and devices via a network. Communications unit 810 may provide communications using both physical and wireless communications links.
- the physical communications link may utilize, for example, a wire, cable, universal serial bus, or any other physical technology to establish a physical communications link for data processing system 800 .
- the wireless communications link may utilize, for example, shortwave, high frequency, ultra-high frequency, microwave, wireless fidelity (WiFi), Bluetooth technology, global system for mobile communications (GSM), code division multiple access (CDMA), second-generation (2G), third-generation (3G), fourth-generation (4G), 4G Long Term Evolution (LTE), LTE Advanced, or any other wireless communication technology or standard to establish a wireless communications link for data processing system 800 .
- Input/output unit 812 allows for the input and output of data with other devices that may be connected to data processing system 800 .
- input/output unit 812 may provide a connection for user input through a keypad, keyboard, and/or some other suitable input device.
- Display 814 provides a mechanism to display information to a user and may include touch screen capabilities to allow the user to make on-screen selections through user interfaces or input data, for example.
- Instructions for the operating system, applications, and/or programs may be located in storage devices 816 , which are in communication with processor unit 804 through communications fabric 802 .
- the instructions are in a functional form on persistent storage 808 .
- These instructions may be loaded into memory 806 for running by processor unit 804 .
- the processes of the different embodiments may be performed by processor unit 804 using computer-implemented program instructions, which may be located in a memory, such as memory 806 .
- These program instructions are referred to as program code, computer-usable program code, or computer-readable program code that may be read and run by a processor in processor unit 804 .
- the program code in the different embodiments, may be embodied on different physical computer-readable storage devices, such as memory 806 or persistent storage 808 .
- Program code 818 is located in a functional form on computer-readable media 820 that is selectively removable and may be loaded onto or transferred to data processing system 800 for running by processor unit 804 .
- Program code 818 and computer-readable media 820 form computer program product 822 .
- computer-readable media 820 may be computer-readable storage media 824 or computer-readable signal media 826 .
- Computer-readable storage media 824 may include, for example, an optical or magnetic disc that is inserted or placed into a drive or other device that is part of persistent storage 808 for transfer onto a storage device, such as a hard drive, that is part of persistent storage 808 .
- Computer-readable storage media 824 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory that is connected to data processing system 800 . In some instances, computer-readable storage media 824 may not be removable from data processing system 800 .
- program code 818 may be transferred to data processing system 800 using computer-readable signal media 826 .
- Computer-readable signal media 826 may be, for example, a propagated data signal containing program code 818 .
- Computer-readable signal media 826 may be an electro-magnetic signal, an optical signal, and/or any other suitable type of signal. These signals may be transmitted over communication links, such as wireless communication links, an optical fiber cable, a coaxial cable, a wire, and/or any other suitable type of communications link.
- the communications link and/or the connection may be physical or wireless in the illustrative examples.
- the computer-readable media also may take the form of non-tangible media, such as communication links or wireless transmissions containing the program code.
- program code 818 may be downloaded over a network to persistent storage 808 from another device or data processing system through computer-readable signal media 826 for use within data processing system 800 .
- program code stored in a computer-readable storage media in a data processing system may be downloaded over a network from the data processing system to data processing system 800 .
- the data processing system providing program code 818 may be a server computer, a client computer, or some other device capable of storing and transmitting program code 818 .
- data processing system 800 may include organic components integrated with inorganic components and/or may be comprised entirely of organic components excluding a human being.
- a storage device may be comprised of an organic semiconductor.
- a computer-readable storage device in data processing system 800 is any hardware apparatus that may store data.
- Memory 806 , persistent storage 808 , and computer-readable storage media 824 are examples of physical storage devices in a tangible form.
- a bus system may be used to implement communications fabric 802 and may be comprised of one or more buses, such as a system bus or an input/output bus.
- the bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system.
- a communications unit may include one or more devices used to transmit and receive data, such as a modem or a network adapter.
- a memory may be, for example, memory 806 or a cache such as found in an interface and memory controller hub that may be present in communications fabric 802 .
- the present invention may be a system, a method, and/or a computer program product.
- the computer program product may include a computer-readable storage medium or media having computer-readable program instructions thereon for causing a processor to carry out aspects of the present invention.
- the computer-readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
- the computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
- a non-exhaustive list of more specific examples of the computer-readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
- a computer-readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
- Computer-readable program instructions described herein can be downloaded to respective computing/processing devices from a computer-readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
- the network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
- a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
- Computer-readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
- the computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
- the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
- electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer-readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
- the phrase “a number” means one or more.
- the phrase “at least one of”, when used with a list of items, means different combinations of one or more of the listed items may be used, and only one of each item in the list may be needed. In other words, “at least one of” means any combination of items and number of items may be used from the list, but not all of the items in the list are required.
- the item may be a particular object, a thing, or a category.
- “at least one of item A, item B, or item C” may include item A, item A and item B, or item C. This example also may include item A, item B, and item C or item B and item C. Of course, any combinations of these items may be present. In some illustrative examples, “at least one of” may be, for example, without limitation, two of item A; one of item B; and ten of item C; four of item B and seven of item C; or other suitable combinations.
- each block in the flowcharts or block diagrams may represent at least one of a module, a segment, a function, or a portion of an operation or step.
- one or more of the blocks may be implemented as program code.
- the function or functions noted in the blocks may occur out of the order noted in the figures.
- two blocks shown in succession may be performed substantially concurrently, or the blocks may sometimes be performed in the reverse order, depending upon the functionality involved.
- other blocks may be added in addition to the illustrated blocks in a flowchart or block diagram.
Abstract
Description
- u j(t)=(u j(t−1)−λ j(u j(t−1)−u eq))(1−z j(t−1))+u eq z j(t−1)+Σ i=0 P w ij x i(t)+Σ q=1 P ρ q(z q(t−1)) Eq. 1
ρj(z j(t))=z j(t)(t mod k) Eq. 2
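- Reading Eq. 1 as a leaky integrate-and-fire update — decay toward the equilibrium potential u eq, a reset to u eq on the step after a spike, plus weighted feed-forward input and phase-coded recurrent spikes per Eq. 2 — one time step looks like the sketch below. The parenthesization of Eq. 1, the parameter values, and the function names are assumptions:

```python
def phase_code(z, t, k):
    """Eq. 2: a spike's value is weighted by its phase within a k-step window."""
    return z * (t % k)

def membrane_update(u_prev, z_prev, x, w, rho_in, lam=0.5, u_eq=0.0):
    """One step of Eq. 1 under the reading above: leak toward u_eq, reset to
    u_eq after a spike, then add feed-forward and recurrent drive."""
    leaked = u_prev - lam * (u_prev - u_eq)
    return (leaked * (1 - z_prev)                   # no spike last step: keep leaking
            + u_eq * z_prev                         # spike last step: reset to u_eq
            + sum(wi * xi for wi, xi in zip(w, x))  # weighted feed-forward inputs
            + sum(rho_in))                          # phase-coded recurrent spikes

# A neuron that spiked last step resets to u_eq before integrating new input;
# one that did not spike simply leaks and integrates.
print(membrane_update(u_prev=2.0, z_prev=1, x=[1.0], w=[0.25], rho_in=[]))  # 0.25
print(membrane_update(u_prev=2.0, z_prev=0, x=[1.0], w=[0.25], rho_in=[]))  # 1.25
print(phase_code(1, t=7, k=5))  # 2
```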
Claims (21)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/436,744 US11436475B2 (en) | 2019-06-10 | 2019-06-10 | Anomaly detection with spiking neural networks |
US17/890,843 US20220406408A1 (en) | 2019-06-10 | 2022-08-18 | Sequence-based anomaly detection with hierarchical spiking neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/436,744 US11436475B2 (en) | 2019-06-10 | 2019-06-10 | Anomaly detection with spiking neural networks |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/890,843 Continuation-In-Part US20220406408A1 (en) | 2019-06-10 | 2022-08-18 | Sequence-based anomaly detection with hierarchical spiking neural networks |
Publications (2)
Publication Number | Publication Date |
---|---|
US20200387773A1 US20200387773A1 (en) | 2020-12-10 |
US11436475B2 true US11436475B2 (en) | 2022-09-06 |
Family
ID=73650615
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/436,744 Active 2041-05-30 US11436475B2 (en) | 2019-06-10 | 2019-06-10 | Anomaly detection with spiking neural networks |
Country Status (1)
Country | Link |
---|---|
US (1) | US11436475B2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230359869A1 (en) * | 2021-01-25 | 2023-11-09 | Chengdu SynSense Technology Co., Ltd. | Equipment anomaly detection method, computer readable storage medium, chip, and device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105205114A (en) * | 2015-09-06 | 2015-12-30 | 重庆邮电大学 | Wi-Fi (wireless fidelity) positioning fingerprint database construction method based on image processing |
WO2019125419A1 (en) * | 2017-12-19 | 2019-06-27 | Intel Corporation | Device, system and method for varying a synaptic weight with a phase differential of a spiking neural network |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105205114A (en) * | 2015-09-06 | 2015-12-30 | 重庆邮电大学 | Wi-Fi (wireless fidelity) positioning fingerprint database construction method based on image processing |
WO2019125419A1 (en) * | 2017-12-19 | 2019-06-27 | Intel Corporation | Device, system and method for varying a synaptic weight with a phase differential of a spiking neural network |
Non-Patent Citations (3)
Title |
---|
Chen et al., "Adaptive Impulse Detection Using Center-Weighted Median Filters," IEEE Signal Processing Letters, 2001, 3 pages. *
Demertzis et al., "A Hybrid Network Anomaly and Intrusion Detection Approach Based on Evolving Spiking Neural Network Classification," International Conference on e-Democracy, 2013, 12 pages. *
Verzi, S.J., Vineyard, C.M., Aimone, J.B. (2018), "Neural-Inspired Anomaly Detection," in: Morales A., Gershenson C., Braha D., Minai A., Bar-Yam Y. (eds.), Unifying Themes in Complex Systems IX, ICCS 2018, Springer Proceedings in Complexity, Springer, Cham, pp. 203-209, DOI: 10.1007/978-3-319-96661-8_21.
Also Published As
Publication number | Publication date |
---|---|
US20200387773A1 (en) | 2020-12-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
AS | Assignment |
Owner name: U.S. DEPARTMENT OF ENERGY, DISTRICT OF COLUMBIA Free format text: CONFIRMATORY LICENSE;ASSIGNOR:NATIONAL TECHNOLOGY & ENGINEERING SOLUTIONS OF SANDIA, LLC;REEL/FRAME:049916/0134 Effective date: 20190724 |
|
AS | Assignment |
Owner name: NATIONAL TECHNOLOGY & ENGINEERING SOLUTIONS OF SANDIA, LLC, NEW MEXICO Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VERZI, STEPHEN JOSEPH;VINEYARD, CRAIG MICHAEL;AIMONE, JAMES BRADLEY;REEL/FRAME:049988/0316 Effective date: 20190805 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |