CN108881254B - Intrusion detection system based on neural network - Google Patents
Intrusion detection system based on neural network
- Publication number
- CN108881254B (application CN201810696883.2A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- matrix
- calculation
- data
- intrusion detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computer Hardware Design (AREA)
- Neurology (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses an intrusion detection system based on a neural network. It builds a service-oriented system architecture comprising a cache module and a neural network accelerator module: the cache module captures redundancy by exploiting temporal locality in hardware, which reduces the demand on storage resources, while the neural network accelerator module detects attack character strings and accelerates the string matching process. The whole system runs on a cloud computing platform. The redundancy table mechanism designed and implemented by the invention makes good use of temporal locality in hardware and greatly reduces the demand for on-chip storage resources. To further improve the speed and accuracy of the intrusion detection system, a neural network method is applied to accelerate string matching. Compared with a general-purpose processor, the intrusion detection system offers high performance and low power consumption, and can meet the speed and throughput requirements of the big data era.
Description
Technical Field
The invention relates to the field of computer hardware acceleration, in particular to an intrusion detection system and a design method thereof.
Background
In the big data era, Internet applications are growing explosively, and together with hacker attacks and the wide spread of network viruses this places higher demands on network security. To protect the network from attacks, or at least to reduce them, network administrators typically deploy firewalls on routers. However, a firewall can only capture a limited range of network attack behavior, so intrusion detection systems are increasingly used on routers for network security. Intrusion detection systems detect hacking attacks and network viruses by analyzing network packets. Typically, an intrusion detection system contains many hardware detectors that monitor network packets in real time and trigger an intrusion alarm as soon as an anomaly is detected.
Generally, the most important part of an intrusion detection system is the string matching algorithm. String matching is a computationally intensive problem that compares a given string with a reference string. The intrusion detection system inspects each incoming data packet and compares it with the reference strings; a match indicates that the packet is a potential security hazard, and the system handles the packet according to the severity of that hazard. However, as network traffic grows, the amount of data to be processed keeps increasing and current computers cannot meet the demands of the big data era; for example, a hacker can easily inject malicious packets into a system and break through a firewall with a flood of packets and a fast attack scheme. Ensuring the security of the network system therefore poses a significant challenge for improving the throughput and speed of network intrusion detection systems.
On the other hand, neural networks belong to the connectionist branch of artificial intelligence: they are mathematical models that process information with structures resembling the synaptic connections of the brain. In the 1950s the first-generation neural network, the perceptron, was born; it could perform linear classification, associative memory, and the like. In the 1980s the multi-layer perceptron and its training algorithm, back propagation (BP), were widely studied and applied because of their ability to solve linearly inseparable problems. However, limited hardware computing power and the tendency of the training algorithm to fall into local minima became bottlenecks restricting the development of neural computing, until the "multi-layer structure, layer-by-layer learning" deep learning method proposed by Professor Hinton in 2006 truly unleashed the computing power of neural networks, making them a rising star of big data analysis. This approach has achieved breakthrough success in speech recognition, image recognition, natural language processing, and other fields, repeatedly setting new records in these application areas at a remarkable pace.
Disclosure of Invention
In view of the above problems and recent technological advances, the present invention aims to apply neural network technology to an intrusion detection system and accelerate string matching, so as to meet the demands placed on such systems in the big data era.
The technical scheme of the invention is as follows:
An intrusion detection system based on a neural network comprises a cache module and a neural network accelerator module. The cache module captures redundancy by exploiting temporal locality in hardware, which reduces the demand on storage resources; the neural network accelerator module detects attack character strings and accelerates the string matching process. On this basis, the invention also provides a uniform programming interface so that users can conveniently call the service.
In the preferred scheme, the cache module uses a Bloom filter to support parallel queries. The Bloom filter is extended into a countable (counting) Bloom filter, which serves as the basic unit; several basic units connected in parallel form the minimum cache structure.
In the preferred scheme, the neural network accelerator module comprises three parts, namely a bus interconnection structure, a cache structure and a calculation engine.
The bus interconnection structure comprises a data bus and a control bus which are respectively used for transmitting data and commands.
The cache structure comprises an input cache, an output cache and a weight cache, and is respectively used for storing input data, output data and weight data in the calculation process of the neural network.
The calculation engine comprises multipliers and adders and performs the multiplication and addition operations in the neural network.
The convolutional layer and the fully-connected layer in a convolutional neural network have different characteristics: the convolutional layer is computation-intensive, while the fully-connected layer is memory-access-intensive. Different optimization methods are therefore used for the two layer types when the convolutional neural network is applied to accelerate string matching. In the preferred scheme, for convolutional layers we focus on parallel computation and apply a method that converts the convolution into a matrix multiplication; for the fully-connected layer we focus on reducing the required memory bandwidth and apply a batch method.
For the matrix multiplication, the preferred scheme adopts a sliced (tiled) design: each row of the input matrix is divided into slices of a fixed size, and each column of the weight matrix is divided in the same way. Each computation step performs the multiply-accumulate between one slice of the input matrix and one slice of the weight matrix to produce a temporary result, and after a whole row has been processed the temporary results are accumulated into the final result.
In the preferred scheme, the neural network accelerator module is provided with a plurality of computing units, each computing unit corresponds to one layer in the convolutional neural network model, and the computing units perform computing in a pipeline mode.
Because the computations of the convolutional layer and the fully-connected layer are both unified into matrix multiplication, two different matrix multiplication calculation modes are used on this basis. In the first mode, one block of partial result sums of the output matrix is computed from one input slice and then updated with the next input slice; in this mode the weight matrix window moves vertically. In the second mode, all partial result sums of the output matrix are computed from only one input slice and then all of them are updated with the next input slice; in this mode the weight matrix window moves horizontally. In a preferred embodiment, the two calculation modes are used alternately in the pipeline calculation.
In a preferred embodiment, the programming interface includes hardware platform initialization and data transmission.
Compared with the prior art, the invention has the advantages that:
the invention is simple and easy to use, and is transparent to users. The redundancy table mechanism designed and realized by the invention can well utilize the time locality of hardware and greatly reduce the requirements on-chip storage resources. Meanwhile, in order to improve the speed and the accuracy of the intrusion detection system, a neural network method is also applied to accelerate the process of character string matching. Compared with a general processor, the intrusion detection system has the characteristics of high performance and low power consumption, and can meet the requirements on speed and throughput in a big data era.
Drawings
The invention is further described with reference to the following figures and examples:
FIG. 1 is an architecture diagram of a parallel intrusion detection system of a data center based on FPGA according to the present embodiment;
FIG. 2 is a diagram of an intrusion detection system architecture according to the present embodiment;
FIG. 3 is a view showing the structure of a bloom filter of the present embodiment;
FIG. 4 is a diagram of an extended countable bloom filter structure of the present embodiment;
FIG. 5 is a detailed design diagram of the cache module of the present embodiment;
FIG. 6 is a diagram of the convolutional layer computed by the matrix multiplication method of the present embodiment;
FIG. 7 is a diagram of a batch processing method for calculating a fully connected layer according to the present embodiment;
FIG. 8 is a diagram of a pipeline calculation method of the present embodiment;
fig. 9 is a hardware configuration diagram of the neural network accelerator of the present embodiment.
Detailed Description
The above-described scheme is further illustrated below with reference to specific examples. It should be understood that these examples are for illustrative purposes and are not intended to limit the scope of the present invention. The conditions used in the examples may be further adjusted according to the conditions of the particular manufacturer, and the conditions not specified are generally the conditions in routine experiments.
The intrusion detection system in the embodiment of the invention comprises a cache module and a neural network accelerator module, wherein the cache module captures redundancy by utilizing time locality in hardware, so that the requirement on storage resources is reduced; and the neural network accelerator module is used for detecting the attack character strings and accelerating the character string matching process. The data path between the accelerators and the general purpose processor may employ the PCI-E bus protocol, the AXI bus protocol, etc. In the data path shown in the figure of the embodiment of the present invention, an AXI bus protocol is used as an example for description, but the present invention is not limited thereto.
Fig. 1 is an architecture diagram of the FPGA-based data center parallel intrusion detection system according to this embodiment, where an intrusion detection system server is mainly responsible for a pattern matching task. In the process of executing the task, part of the task is loaded on the FPGA accelerator for acceleration. The front-end intrusion detection system server is responsible for attack detection of the data center, and the rear-end server is responsible for database management. Regarding data processing of the intrusion detection system, the intrusion detection system server first analyzes the behavior of the application program, and the main method of the analysis is a neural network-based method. The assignment and interfacing of tasks runs on the software server of the intrusion detection system, while the neural network approach runs on the hardware accelerator of the FPGA.
Fig. 2 is an architecture diagram of the intrusion detection system according to the present embodiment, which comprises two parts: a buffer and a matching engine. The buffer stores intermediate results waiting to be processed by the matching engine. The stream represents a payload and is written into the buffer under a write control unit. In the matching engine, a status register stores the final output state of the redundant bytes.
At the beginning of each processing cycle, the combination of the temporary register and the status register is sent to the cache module and the neural network module; this combination serves as the index into the cache module, while the neural network module processes the input state and data stream. Initially, the select and NN_done signals are both initialized to 0. When a cache line hits in the cache, the output state read from the redundancy table is sent onto the X bus and the select signal is set to 1. Similarly, in the original module, the final output state after traversing the finite state machine is sent to the Y bus and the NN_done signal is set to 1. The select and NN_done signals are ORed under the en enable signal. When the enable signal en is true, the MUX unit selects the input on the X or Y bus according to the select signal and sends the result to the match logic unit. At the same time, the do_next signal is asserted, which ends the execution of the cache module and the neural network module and begins the next processing cycle. Likewise, the read_next signal is set, which causes the controller to read from the buffer the data that needs to be processed in the next cycle. When the match logic unit receives the state data, it determines whether the rule matches a known attack behavior in the finite state machine. When a match occurs, it sets the match signal and directly updates the status register. The match logic then proceeds to the next processing cycle.
In the most advanced prior work, the redundancy table is implemented in software as an ordinary hash table. When a hashed entry is accessed, the index is compared with the one stored in the table entry, so all indexes need to be kept in memory, and the required storage inevitably grows as the number of table entries increases. However, on-chip memory resources on FPGAs are limited and the latency of off-chip access is too high. To solve this problem, the invention designs a new redundancy table storage structure on the FPGA. Because the intrusion detection system tolerates a certain error rate, the invention uses a Bloom filter to rebuild the cache storage structure and extends the Bloom filter.
Bloom filters were proposed by Bloom in 1970. A Bloom filter is effectively a long binary vector together with a series of random mapping functions, and it can be used to test whether an element is in a set.
Fig. 3 is a structure diagram of the Bloom filter of the present embodiment, where n is the size of the set, k is the number of independent hash functions, and m is the number of bits of the bit vector v. f denotes the probability of a type I error (false positive), which can be reduced by choosing appropriate values of m and k. In the ideal case, the most suitable value of k is given by equation (1):

k = (m / n) · ln 2        (1)

With this choice of k, the probability of a type I error is given by equation (2):

f = (1 - e^(-kn/m))^k ≈ (1/2)^k        (2)
the bloom filter structure allows the following operations:
adding elements: it is added to k hash functions resulting in k positions and bit position 1 at these positions.
Querying an element: to test whether the element is in the set, it is fed to the k hash functions to obtain k positions; if any of the bits at these positions is 0, the element is definitely not in the set, and if all of them are 1, the element is probably in the set.
Deleting an element from a Bloom filter is not possible. Because type I errors cannot be avoided, an element maps to k positions, and although setting all k bits to 0 would delete the element, it could delete other elements that share those positions as well.
A Bloom filter uses a number of hash functions, which requires considerable hardware resources to implement. On the basis of the original Bloom filter, the invention extends it by replacing each bit of the original v array with a counter, yielding the countable Bloom filter.
FIG. 4 is a diagram of the extended countable Bloom filter structure of the present embodiment, with x and y as indices. An array C of size m is maintained, where each element Ci is the counter for the i-th position of the associated v array, and each element of the C array also stores an output state. When an element is inserted, the counter at each corresponding hash index is incremented. Assuming that every element of the string set S is an index of the redundancy table and that the output states of all indexes in the set are to be stored, each index in S is hashed and the corresponding counters are incremented; the intrusion detection system then reads the corresponding counters and writes the output state of the index to the position whose counter value in the array C is the minimum. The redundancy table structure is implemented with Block RAM (BRAM) on the FPGA.
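To make the mechanism concrete, a minimal counting-Bloom-filter sketch in C is given below. The table size, counter width, and hash functions are illustrative assumptions rather than the BRAM layout described here, and the filter is assumed to start zero-initialized (e.g. static storage).

```c
#include <stdint.h>

#define M 2048          /* number of counter/state slots (assumption)       */
#define K 2             /* number of hash functions, as chosen in the text  */

typedef struct {
    uint8_t  counter[M];   /* per-slot insertion counter                    */
    uint16_t state[M];     /* per-slot stored output state                  */
} counting_bloom_t;

/* Two simple, independent hash functions (illustrative only). */
static uint32_t hash1(uint32_t x) { return (x * 2654435761u) % M; }
static uint32_t hash2(uint32_t x) { return ((x ^ 0x9e3779b9u) * 40503u) % M; }

/* Insert an index: bump every hashed counter and keep the output state in
   the slot whose counter is smallest, mirroring the description above. */
static void cbf_insert(counting_bloom_t *f, uint32_t index, uint16_t out_state)
{
    uint32_t h[K] = { hash1(index), hash2(index) };
    uint32_t min_slot = h[0];
    for (int i = 0; i < K; i++) {
        f->counter[h[i]]++;
        if (f->counter[h[i]] < f->counter[min_slot])
            min_slot = h[i];
    }
    f->state[min_slot] = out_state;
}

/* Query: a hit requires every hashed counter to be non-zero; the returned
   state comes from the minimum-counter slot.  A hit may be a false positive. */
static int cbf_query(const counting_bloom_t *f, uint32_t index, uint16_t *out_state)
{
    uint32_t h[K] = { hash1(index), hash2(index) };
    uint32_t min_slot = h[0];
    for (int i = 0; i < K; i++) {
        if (f->counter[h[i]] == 0)
            return 0;                          /* definite miss             */
        if (f->counter[h[i]] < f->counter[min_slot])
            min_slot = h[i];
    }
    *out_state = f->state[min_slot];
    return 1;
}
```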
In the hardware implementation of the intrusion detection system the BRAMs are configured as dual-ported, so the optimal value of k is 2 in order to make better use of these BRAMs. In the present invention the size of each counter is 3 bits and 15 bits are used to store the output state, so each entry of the redundancy table is 18 bits. A block of BRAM holds 36K bits, so each BRAM can contain 36 × 1024 / 18 = 2048 entries; that is, for the Bloom filter m = 2048 and k = 2. According to equations (1) and (2), one block of BRAM can support n = (m / k) · ln 2 ≈ 709 entries, and the probability of a type I error is (1/2)^2 = 0.25. FIG. 5(a) shows the basic unit of the cache module. We define M to denote the pair of supported entry count and address range, for example M(709, [1..2048]). The input of the basic unit is the index value, and the outputs are the combination of the minimum counter and the stored state together with a cache hit or miss signal. Furthermore, the hash functions used in the hash address generator components differ from one another. Connecting 5 such basic units in parallel forms a minimum unit, the mini-EBF, as shown in FIG. 5(b), with a type I error probability of f = (1/2)^10. We define G to denote the triple of supported entry count, number of basic units, and address range, for example G(709, 5, [1..2048]). Since there are 20,000 entries in the set S and each mini-EBF can support 709 entries, a total of 20000 / 709 ≈ 29 mini-EBFs are required.
Furthermore, within a mini-EBF the range of each hash function is limited to its address range, and restricting the hash functions to a particular address space has a negligible effect on the final type I error rate. As shown in FIG. 5(c), the hit (miss) signal of the cache module is the logical OR of the individual mini-EBF signals, (a1 ∨ a2 ∨ ... ∨ a28 ∨ a29). For example, when the bit vector (a1, a2, ..., a29) equals (1, 0, ..., 0), the output state of the cache module is provided by mini-EBF G1 and the corresponding hit (miss) signal of the cache module is 1.
In order to improve the speed and accuracy of an intrusion detection system, in the invention, a neural network module is also applied in matching logic.
Convolutional neural networks contain many different kinds of layers, which can be divided into two parts: a feature extractor and a classifier. The feature extractor comprises several convolutional layers, together with downsampling layers and activation layers, and extracts the features of the input. Its output is connected to the input of the classifier, which comprises several fully connected layers and identifies which class the input belongs to.
The convolutional layer and the fully-connected layer in a convolutional neural network have different characteristics: the convolutional layer is computation-intensive, while the fully-connected layer is memory-access-intensive. Different optimization methods are therefore used for the two layer types when the convolutional neural network is applied to accelerate the matching logic. For convolutional layers we focus on computational parallelism and apply a method that converts the convolution into a matrix multiplication; for the fully-connected layer we focus on reducing the required memory bandwidth and apply a batch method. On this basis, the invention also applies a pipelined computation method to the whole network.
The pseudo code for the convolutional layer is sketched below. The layer receives N feature maps as input; a sliding window of size K × K convolves the input feature maps to generate one pixel of an output feature map. The stride of the sliding window is S, and the M output feature maps serve as the input of the next layer.
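The pseudo code referenced above is not reproduced in this text; the following C sketch shows the standard loop nest matching the description, where the output dimensions R and C and the flattened array layouts are illustrative assumptions.

```c
/* Convolutional layer as a plain loop nest over N input maps, M output maps,
   a K x K window and stride S.  R, C and the row-major layouts are assumed. */
void conv_layer(int N, int M, int K, int S, int R, int C,
                const float *in,   /* [N][(R-1)*S+K][(C-1)*S+K], flattened */
                const float *w,    /* [M][N][K][K], flattened              */
                float *out)        /* [M][R][C], flattened                 */
{
    int in_dim = (R - 1) * S + K;                 /* input map height/width */
    for (int m = 0; m < M; m++)                   /* output feature maps    */
        for (int r = 0; r < R; r++)               /* output rows            */
            for (int c = 0; c < C; c++) {         /* output columns         */
                float acc = 0.0f;
                for (int n = 0; n < N; n++)       /* input feature maps     */
                    for (int i = 0; i < K; i++)   /* window rows            */
                        for (int j = 0; j < K; j++)
                            acc += w[((m * N + n) * K + i) * K + j]
                                 * in[(n * in_dim + r * S + i) * in_dim + c * S + j];
                out[(m * R + r) * C + c] = acc;
            }
}
```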
In the present invention we convert the computation of the convolutional layer into a matrix multiplication through a 3-dimensional mapping. For example, FIG. 6 shows the convolutional layer computed by the matrix multiplication method: FIG. 6(a) is the conventional convolution computation and FIG. 6(b) is the convolution computed as a matrix multiplication. By comparison, the output results obtained by the two methods are identical. In FIG. 6(b), the 3 input feature maps of size 3 × 3 are rearranged into a matrix of size (2 × 2) × (3 × 2 × 2). The data covered by the first 2 × 2 convolution window in an input feature map is flattened and laid out horizontally in the input matrix, as shown in FIG. 6(b); applying the 2 × 2 convolution window to all 3 input features yields the entire input matrix. The 6 convolution kernels of size 2 × 2 are likewise rearranged into a matrix of size (3 × 2 × 2) × 2. The multiplication of these two matrices is equivalent to the convolution of the layer. It should be noted that the rearrangement of the input features is performed while the data is stored into the on-chip buffers of the FPGA, so the whole rearranged input matrix never needs to be held in external memory, which reduces the external memory requirement.
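A minimal sketch of this rearrangement (commonly called im2col) is given below; it assumes stride 1 and square input maps, and it only illustrates the mapping, not the on-chip implementation.

```c
/* im2col: rearrange N input maps (H x H, stride 1, K x K kernels) into a
   matrix with one row per output position and one column per kernel weight,
   so convolution reduces to an (O*O x N*K*K) by (N*K*K x M) multiplication. */
void im2col(int N, int H, int K,
            const float *in,   /* [N][H][H], flattened                    */
            float *col)        /* [(H-K+1)*(H-K+1)][N*K*K], flattened      */
{
    int O = H - K + 1;                            /* output height/width   */
    for (int r = 0; r < O; r++)
        for (int c = 0; c < O; c++)
            for (int n = 0; n < N; n++)
                for (int i = 0; i < K; i++)
                    for (int j = 0; j < K; j++)
                        col[(r * O + c) * N * K * K + (n * K + i) * K + j] =
                            in[(n * H + r + i) * H + c + j];
}
```

In the FIG. 6 example (N = 3, H = 3, K = 2), the resulting 4 × 12 matrix multiplied by the 12 × 2 reshaped weight matrix reproduces the convolution output.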
The computation of the fully-connected layer can be regarded as a matrix-vector multiplication, and in the invention this matrix multiplication is performed with a slicing method, as shown in FIG. 7(a). The slice size is m. First, the interval [x1, xm] of the input array is multiplied with an m × n block of the weight data to obtain the partial result sum [y1, yn]; then [xm+1, x2m] is taken as input and multiplied with the next m × n block of weights to update [y1, yn]. When all of the input data and the first column of weight blocks have been processed, the final [y1, yn] is obtained. The other results are obtained in the same way.
Since the fully-connected layer consumes a large amount of memory access bandwidth, in the present invention we use a batch method to optimize its memory accesses. As shown in FIG. 7(b), an input matrix is formed from N input arrays, where N can be regarded as the batch size. After batching, the amount of computation grows by a factor of N while the memory accesses do not increase, so the required memory access bandwidth is reduced. Since it takes N clock cycles to complete the N × m × n multiplication, during which the accelerator must get ready for the next round of computation, N should be no less than the number of cycles needed to read an m × n block of weight data.
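A C sketch of this tiled, batched fully-connected computation is given below; the row-major layouts, the tile size TM, and the function name are assumptions for illustration rather than the accelerator's actual datapath.

```c
/* Batched fully-connected layer computed tile by tile: each step multiplies
   a TM-wide slice of every input row by a TM x OUT block of the weight
   matrix and accumulates into the partial result sums, mirroring FIG. 7. */
void fc_batched_tiled(int BATCH, int IN, int OUT, int TM,
                      const float *x,  /* [BATCH][IN], flattened  */
                      const float *w,  /* [IN][OUT], flattened    */
                      float *y)        /* [BATCH][OUT], flattened */
{
    for (int i = 0; i < BATCH * OUT; i++)
        y[i] = 0.0f;

    for (int t = 0; t < IN; t += TM) {            /* one weight tile [t, t+TM) */
        int tm = (t + TM <= IN) ? TM : IN - t;    /* last tile may be short    */
        for (int b = 0; b < BATCH; b++)           /* reuse the same tile for   */
            for (int o = 0; o < OUT; o++) {       /* the whole batch           */
                float acc = 0.0f;
                for (int k = 0; k < tm; k++)
                    acc += x[b * IN + t + k] * w[(t + k) * OUT + o];
                y[b * OUT + o] += acc;            /* update partial result sum */
            }
    }
}
```

The outer loop over weight tiles means each m × n block of weights is fetched once and reused for all N rows of the batch, which is the source of the bandwidth saving described above.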
In the invention, the calculation of the convolution layer and the calculation of the full connection layer are converted into the calculation of matrix multiplication, and a pipeline calculation method is applied to improve the performance of the neural network module.
The pattern of the matrix multiplication is reorganized for the purpose of pipelined computation. In FIG. 8(a) the calculation pattern is the same as in FIG. 7, while in FIG. 8(b) it is different. In FIG. 8(b), all partial result sums of the output matrix use only [x1, xm] as input, and then all partial result sums of the output matrix use [xm+1, x2m] as input to update the partial result sums. In this mode the m × n window of the weight matrix is shifted horizontally, whereas in FIG. 7 the m × n window of the weight matrix is shifted vertically.
In the present invention these two matrix multiplication modes are used alternately. In the matrix multiplication of the first layer, the vertical-shift mode is used to obtain the partial result sum [y1, yn]; the matrix computation of the second layer can then start in the horizontal-shift mode, taking only [y1, yn] as input. Since the second layer starts computing before all of the first layer's computations are completed, only a buffer of size N × N is needed to store the intermediate results of the first layer. The third layer's matrix multiplication again uses the vertical-shift mode, the fourth layer the horizontal-shift mode, and so on. In this way the pipeline flows smoothly.
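The two modes differ only in the order of the tile loops, as the C sketch below illustrates for a single input row; the tile sizes TM and TN and the function names are illustrative assumptions, and y is assumed to be zero-initialized by the caller.

```c
/* Mode 1 (vertical weight-window shift): finish one block of outputs by
   walking down the input slices before moving to the next output block. */
void matvec_vertical(int IN, int OUT, int TM, int TN,
                     const float *x, const float *w /* [IN][OUT] */, float *y)
{
    for (int o = 0; o < OUT; o += TN)            /* output block            */
        for (int t = 0; t < IN; t += TM)         /* then input slices       */
            for (int oo = o; oo < o + TN && oo < OUT; oo++)
                for (int k = t; k < t + TM && k < IN; k++)
                    y[oo] += x[k] * w[k * OUT + oo];
}

/* Mode 2 (horizontal shift): give every output block a contribution from
   one input slice before advancing to the next slice. */
void matvec_horizontal(int IN, int OUT, int TM, int TN,
                       const float *x, const float *w, float *y)
{
    for (int t = 0; t < IN; t += TM)             /* input slice             */
        for (int o = 0; o < OUT; o += TN)        /* then every output block */
            for (int oo = o; oo < o + TN && oo < OUT; oo++)
                for (int k = t; k < t + TM && k < IN; k++)
                    y[oo] += x[k] * w[k * OUT + oo];
}
```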
Fig. 9 is a hardware configuration diagram of the neural network accelerator of the present embodiment, in which an AXI4-Lite bus is used for transmission of commands and an AXI4 bus is used for transmission of data. In fig. 9, there are multiple processing units, each corresponding to a layer in the convolutional neural network topology. To improve performance and throughput, all processing units operate in a pipelined manner. In our design, the partial results between layers are stored in on-chip buffers of the FPGA. In this way, data access can be significantly reduced and, more importantly, on-chip buffers also facilitate data reuse.
The computational structure of the matrix multiplication can also be seen in FIG. 9, which contains a data buffer and a calculation engine. In the present invention the calculation engine is composed of many multipliers and adders and is responsible for the multiplication and addition computations. To increase the computation speed, the parallel multiplications are followed by an adder tree that completes the accumulation. The input buffer, output buffer, and weight buffer constitute the data buffer. In our design, the input data and weight data are prefetched into the corresponding buffers by a data prefetching technique, and the partial results between layers are stored in the output buffer. Double buffering is used for the on-chip buffers so that data can be accessed in a ping-pong fashion, which allows the data transfer time and the computation time to overlap.
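A minimal host-side sketch of the ping-pong scheme is shown below; prefetch() and compute() are hypothetical stand-ins for the DMA transfer and the calculation engine, and TILE_SIZE is an assumed tile size. With an asynchronous DMA the prefetch call would return immediately, which is what produces the actual overlap.

```c
#include <stddef.h>

#define TILE_SIZE 1024                            /* assumed tile size      */

/* Hypothetical helpers: prefetch() models the DMA filling a buffer with
   tile t, compute() models the calculation engine consuming a buffer. */
extern void prefetch(float *buf, size_t tile_index);
extern void compute(const float *buf, size_t tile_index);

/* While the engine works on buf[cur], the next tile is loaded into
   buf[1 - cur], so data transfer and computation overlap. */
void process_tiles(size_t num_tiles)
{
    static float buf[2][TILE_SIZE];
    int cur = 0;

    if (num_tiles == 0)
        return;
    prefetch(buf[cur], 0);                        /* fill the first buffer  */
    for (size_t t = 0; t < num_tiles; t++) {
        if (t + 1 < num_tiles)
            prefetch(buf[1 - cur], t + 1);        /* load the next tile ... */
        compute(buf[cur], t);                     /* ... while computing    */
        cur = 1 - cur;                            /* swap ping and pong     */
    }
}
```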
To make it more convenient for users to use our intrusion detection system service, we define a programming interface to control the accelerators. The programming interface defined in the invention is general and can be applied to different application fields and different types of accelerators. The pseudo code of the programming model is sketched after step 2 below; it consists of the following two steps.
1. Hardware platform initialization: in the FPGA accelerator we designed, initialization includes initialization of the neural network accelerator and initialization of the DMA. To add more hardware modules, we can modify the initialization code based on the hardware specification. We initialize the DMA device using the AxiDma_CfgInitialize() API, with the relevant configuration parameters stored in the DmaDev structure, including the number of channels, data width, operating mode, and control signals. Similar to the initialization of the DMA device, the initialization configuration of the neural network accelerator includes a control signal, a device name, and a physical address.
2. Application loading and data transmission: after initialization is complete, we can start the DMA device and the accelerator by setting specific register values, and all of the information that directs the accelerator to complete the computation is contained in the InputData. Specifically, we use the AxiDma_Transfer() function to transfer data to the accelerator and receive data back from it. This function has 4 parameters: the first specifies the DMA device, the second the start address of the data transfer, the third the size of the transfer, and the fourth the direction of the transfer.
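The pseudo code referenced above is not reproduced in this text; the C sketch below reconstructs the two-step flow from the description. The structure types, device IDs, direction constants, and the accelerator-side helpers are assumptions; only the AxiDma_CfgInitialize() and AxiDma_Transfer() names and the four-parameter transfer form are taken from the text.

```c
#include <stdint.h>

/* Assumed prototypes reconstructed from the description; a real design would
   take these from the driver headers.  AxiDma_Transfer() follows the form
   described above: device, start address, size, direction. */
typedef struct DmaDev   DmaDev;
typedef struct AccelDev AccelDev;                            /* hypothetical */
extern int  AxiDma_CfgInitialize(DmaDev *dev, int device_id);
extern int  Accel_CfgInitialize(AccelDev *dev, int device_id); /* hypothetical */
extern void Accel_Start(AccelDev *dev);                        /* hypothetical */
extern int  AxiDma_Transfer(DmaDev *dev, uintptr_t addr, int size, int direction);

enum { TO_DEVICE = 0, FROM_DEVICE = 1 };                     /* assumed encoding */

int run_intrusion_detection(DmaDev *dma, AccelDev *accel,
                            uintptr_t input_data, int input_size,
                            uintptr_t output_data, int output_size)
{
    /* Step 1: hardware platform initialization. */
    if (AxiDma_CfgInitialize(dma, 0) != 0)        /* channels, width, mode   */
        return -1;
    if (Accel_CfgInitialize(accel, 0) != 0)       /* control, name, address  */
        return -1;

    /* Step 2: application loading and data transmission. */
    Accel_Start(accel);                           /* set the start register  */
    AxiDma_Transfer(dma, input_data, input_size, TO_DEVICE);
    AxiDma_Transfer(dma, output_data, output_size, FROM_DEVICE);
    return 0;
}
```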
The above examples are only for illustrating the technical idea and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the content of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.
Claims (3)
1. An intrusion detection system based on a neural network, comprising:
the cache module captures redundancy by utilizing time locality in hardware, and reduces the requirement on storage resources;
the neural network accelerator module is used for detecting the attack character string and accelerating the character string matching process;
a uniform programming interface for the user to call the intrusion detection system;
the neural network accelerator module comprises a bus interconnection structure, a cache structure and a calculation engine;
the bus interconnection structure comprises a data bus and a control bus, and is used for data transmission and command transmission respectively;
the buffer structure comprises an input buffer, an output buffer and a weight buffer, which are respectively used for storing input data, output data and weight data in the calculation process of the neural network;
the calculation engine comprises multipliers and adders and performs the multiplication and addition operations in the neural network; the neural network comprises a convolutional layer and a fully connected layer, and for the convolutional layer the convolution computation is converted into a matrix multiplication computation; for the fully connected layer, a batch processing method is applied;
the matrix multiplication adopts a sliced design: each row of the input matrix is divided into slices of a given size, and each column of the weight matrix is divided into slices of the same size; each computation performs the multiply-accumulate between one slice of data from the input matrix and one slice of data from the weight matrix to obtain a temporary computation result, and after one row of computation is finished the temporary results are accumulated to obtain the final result;
the neural network accelerator module is provided with a plurality of computing units, each computing unit corresponds to one layer of the convolutional neural network model, and the computing units perform computing in a pipeline mode;
the calculation of the convolutional layer and the calculation of the fully connected layer are unified into matrix multiplication, and on this basis two different matrix multiplication calculation modes are used: in the first calculation mode, one block of partial result sums of the output matrix is computed from one input slice and then updated with the input of the next slice, and in this mode the weight matrix window moves vertically; in the second calculation mode, all partial result sums of the output matrix are computed from only one input slice and then all of them are updated with the input of the next slice, and in this mode the weight matrix window moves horizontally; the two different matrix multiplication modes are used alternately in the pipeline calculation.
2. The intrusion detection system based on the neural network according to claim 1, wherein the cache module supports parallel query by using a bloom filter, expands the bloom filter, designs a countable bloom filter, and uses the countable bloom filter as a basic unit, and a plurality of the basic units form a minimum cache structure in a parallel connection manner, so that the method can greatly save storage resources on an FPGA (field programmable gate array) chip.
3. The neural network-based intrusion detection system of claim 1, wherein the programming interface performs the steps of hardware platform initialization and data transmission, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810696883.2A CN108881254B (en) | 2018-06-29 | 2018-06-29 | Intrusion detection system based on neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810696883.2A CN108881254B (en) | 2018-06-29 | 2018-06-29 | Intrusion detection system based on neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108881254A CN108881254A (en) | 2018-11-23 |
CN108881254B true CN108881254B (en) | 2021-08-06 |
Family
ID=64297233
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810696883.2A Active CN108881254B (en) | 2018-06-29 | 2018-06-29 | Intrusion detection system based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108881254B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12061971B2 (en) | 2019-08-12 | 2024-08-13 | Micron Technology, Inc. | Predictive maintenance of automotive engines |
US11748626B2 (en) * | 2019-08-12 | 2023-09-05 | Micron Technology, Inc. | Storage devices with neural network accelerators for automotive predictive maintenance |
CN110768946A (en) * | 2019-08-13 | 2020-02-07 | 中国电力科学研究院有限公司 | Industrial control network intrusion detection system and method based on bloom filter |
CN111741002B (en) * | 2020-06-23 | 2022-02-15 | 广东工业大学 | Method and device for training network intrusion detection model |
CN113447883A (en) * | 2021-06-25 | 2021-09-28 | 海宁奕斯伟集成电路设计有限公司 | Multi-station parallel test method and test system |
CN118264484B (en) * | 2024-05-29 | 2024-08-02 | 中国电子信息产业集团有限公司第六研究所 | Industrial network intrusion detection method, system, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770581A (en) * | 2010-01-08 | 2010-07-07 | 西安电子科技大学 | Semi-automatic detecting method for road centerline in high-resolution city remote sensing image |
CN103747060A (en) * | 2013-12-26 | 2014-04-23 | 惠州华阳通用电子有限公司 | Distributed monitor system and method based on streaming media service cluster |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9619406B2 (en) * | 2012-05-22 | 2017-04-11 | Xockets, Inc. | Offloading of computation for rack level servers and corresponding methods and systems |
IL239191A0 (en) * | 2015-06-03 | 2015-11-30 | Amir B Geva | Image classification system |
US10614354B2 (en) * | 2015-10-07 | 2020-04-07 | Altera Corporation | Method and apparatus for implementing layers on a convolutional neural network accelerator |
CN105891215B (en) * | 2016-03-31 | 2019-01-29 | 浙江工业大学 | Welding visible detection method and device based on convolutional neural networks |
- 2018-06-29 CN CN201810696883.2A patent/CN108881254B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770581A (en) * | 2010-01-08 | 2010-07-07 | 西安电子科技大学 | Semi-automatic detecting method for road centerline in high-resolution city remote sensing image |
CN103747060A (en) * | 2013-12-26 | 2014-04-23 | 惠州华阳通用电子有限公司 | Distributed monitor system and method based on streaming media service cluster |
Also Published As
Publication number | Publication date |
---|---|
CN108881254A (en) | 2018-11-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108881254B (en) | Intrusion detection system based on neural network | |
US11775430B1 (en) | Memory access for multiple circuit components | |
US10943167B1 (en) | Restructuring a multi-dimensional array | |
US11645529B2 (en) | Sparsifying neural network models | |
US9817678B2 (en) | Methods and systems for detection in a state machine | |
EP2215563B1 (en) | Method and apparatus for traversing a deterministic finite automata (dfa) graph compression | |
US8180803B2 (en) | Deterministic finite automata (DFA) graph compression | |
JP2015505399A (en) | Counter operation in a state machine grid | |
CN111797970B (en) | Method and device for training neural network | |
WO2023098544A1 (en) | Structured pruning method and apparatus based on local sparsity constraints | |
CN112667528A (en) | Data prefetching method and related equipment | |
CN111768458A (en) | Sparse image processing method based on convolutional neural network | |
CN111709022A (en) | Hybrid alarm association method based on AP clustering and causal relationship | |
CN114925320A (en) | Data processing method and related device | |
CN112200310B (en) | Intelligent processor, data processing method and storage medium | |
Hieu et al. | A memory efficient FPGA-based pattern matching engine for stateful NIDS | |
WO2023040740A1 (en) | Method for optimizing neural network model, and related device | |
Islam et al. | Representation learning in deep rl via discrete information bottleneck | |
US11263517B1 (en) | Flexible weight expansion | |
Ebrahim et al. | Fast approximation of the top‐k items in data streams using FPGAs | |
CN114386578A (en) | Convolution neural network method implemented on Haisi non-NPU hardware | |
EP3895024A1 (en) | Caching data in artificial neural network computations | |
CN117057403B (en) | Operation module, accelerator based on impulse neural network and method | |
WO2022057054A1 (en) | Convolution operation optimization method and system, terminal, and storage medium | |
Liu et al. | Deep hashing based on triplet labels and quantitative regularization term with exponential convergence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||