US20140101761A1 - Systems and methods for capturing, replaying, or analyzing time-series data - Google Patents
- Publication number
- US20140101761A1 (application Ser. No. 13/648,176)
- Authority
- US
- United States
- Prior art keywords
- buffer
- sub
- network data
- data
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0656—Data buffering arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1458—Denial of Service
Definitions
- the system memory stores instructions that when executed by the processors cause the processors to perform steps including buffering network data from the network interface in the system memory; retrieving the network data buffered in the system memory; applying each of a plurality of statistical or machine-learning intrusion-detection models to the retrieved network data; aggregating intrusion-likelihood scores from each of the intrusion-detection models into an aggregate score; and, upon the aggregate score exceeding a threshold, outputting an alert.
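The aggregation step above can be sketched as follows. The model functions, equal weighting, and the 0.8 threshold are illustrative assumptions, not details taken from the claims:

```python
def aggregate_scores(scores, weights=None):
    """Combine per-model intrusion-likelihood scores into one aggregate score."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def check_for_intrusion(network_data, models, threshold=0.8):
    """Apply each detection model to the data; alert when the aggregate exceeds the threshold."""
    scores = [model(network_data) for model in models]
    aggregate = aggregate_scores(scores)
    return ("ALERT" if aggregate > threshold else "OK"), aggregate

# Toy stand-ins for statistical or machine-learning models:
models = [lambda d: 0.9, lambda d: 0.7, lambda d: 0.95]
status, score = check_for_intrusion(b"...captured packets...", models)
# status == "ALERT", score == 0.85
```

Weighted aggregation would let a more trusted model dominate the ensemble; the claims only require that the scores be combined and compared against a threshold.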
- Some aspects include a tangible non-transitory machine-readable medium storing instructions that when executed by a data processing apparatus cause the data processing apparatus to perform operations, including writing network data from a network interface to a buffer in the system memory, wherein writing the network data from the network interface to the buffer in the system memory includes writing the network data to an active unlocked sub-buffer among the plurality of sub-buffers, locking the active sub-buffer, designating an unlocked sub-buffer as the active sub-buffer, and after ascertaining that the network data stored in the locked sub-buffer has been written to system storage, unlocking the locked sub-buffer; and concurrent with writing the network data to the buffer in the system memory, writing the network data from the buffer in the system memory to the system storage.
- Some aspects include a process, including writing network data from a network interface to a buffer in the system memory, wherein writing the network data from the network interface to the buffer in the system memory includes writing the network data to an active unlocked sub-buffer among the plurality of sub-buffers, locking the active sub-buffer, designating an unlocked sub-buffer as the active sub-buffer, and after ascertaining that the network data stored in the locked sub-buffer has been written to system storage, unlocking the locked sub-buffer; and concurrent with writing the network data to the buffer in the system memory, writing the network data from the buffer in the system memory to the system storage.
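The sub-buffer rotation in these aspects can be sketched in Python. The class and function names, the rotation policy, and the use of `threading.Lock` to mark a full sub-buffer are illustrative assumptions rather than details of the claims:

```python
import threading

class SubBuffer:
    """One fixed-size region of the in-memory buffer (names are illustrative)."""
    def __init__(self, ident, capacity):
        self.ident = ident            # unique identifier among the sub-buffers
        self.capacity = capacity
        self.storage = bytearray()    # sub-buffer storage holding captured data
        self.lock = threading.Lock()  # locked means full, awaiting save to storage

    def full(self):
        return len(self.storage) >= self.capacity

def write_packet(buffers, active, packet):
    """Write a packet into the active sub-buffer; on filling it, lock it and
    designate the next unlocked sub-buffer as active. Returns the active index."""
    buffers[active].storage.extend(packet)
    if buffers[active].full():
        buffers[active].lock.acquire()   # do not overwrite until saved
        n = len(buffers)
        for step in range(1, n + 1):
            candidate = (active + step) % n
            if not buffers[candidate].lock.locked():
                buffers[candidate].storage.clear()
                return candidate
        raise RuntimeError("all sub-buffers locked; storage cannot keep up")
    return active

def persist(buffers, index, storage_file):
    """Saving side: write a locked sub-buffer to storage, then unlock it."""
    storage_file.write(bytes(buffers[index].storage))
    buffers[index].lock.release()
```

The lock is what lets the buffering side and the saving side run concurrently without coordination beyond the per-sub-buffer state: a locked sub-buffer is full and must not be overwritten; an unlocked one has been drained and may be reused.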
- an intrusion detection system including a network interface; a plurality of processors communicatively coupled to the network interface; system memory communicatively coupled to the plurality of processors; system storage communicatively coupled to the plurality of processors.
- the system storage stores previously captured network data and instructions that when executed by the plurality of processors cause the intrusion detection system to perform steps including pre-processing the network data in the system storage by calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer; moving a batch of the network data from the system storage to a buffer in the system memory, wherein the batch is stored between a pair of the calculated offsets in the system storage prior to moving; concurrent with writing network data to the buffer, sending network data in the buffer to the network interface, wherein sending the network data includes reading time-stamps associated with packets in the network data, and sending each packet when the associated time stamp corresponds with a time stamp counter of at least one of the processors.
- Some aspects include a process, including pre-processing network data stored in system storage by calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer; moving a batch of the network data from the system storage to a buffer in the system memory, wherein the batch is stored between a pair of the calculated offsets in the system storage prior to moving; concurrent with writing network data to the buffer, sending network data in the buffer to a network interface, wherein sending the network data includes reading time-stamps associated with packets in the network data, and sending each packet when the associated time stamp corresponds with a time stamp counter of a processor.
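A minimal sketch of the timestamp-gated sending described above, assuming packets are stored as (timestamp, payload) pairs and that a monotonically increasing counter stands in for the processor's time stamp counter:

```python
import itertools

def replay_batch(batch, counter, send):
    """Send each (timestamp, payload) packet once the counter reaches its timestamp."""
    for stamp, payload in batch:
        # busy-wait until the counter catches up with the capture-time stamp
        tick = next(counter)
        while tick < stamp:
            tick = next(counter)
        send(payload)

sent = []
batch = [(3, b"pkt-a"), (5, b"pkt-b"), (9, b"pkt-c")]
replay_batch(batch, itertools.count(), sent.append)
# sent == [b"pkt-a", b"pkt-b", b"pkt-c"], spaced according to the recorded stamps
```

Gating each send on the recorded timestamp is what re-creates the inter-packet timing of the captured traffic, rather than merely its order.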
- FIGS. 2 and 3 illustrate processes for capturing network traffic in accordance with embodiments of the present techniques
- FIGS. 5 and 6 illustrate processes for replaying network traffic in accordance with embodiments of the present techniques
- the techniques described herein are broadly applicable. In some use cases, the techniques may be used to capture, replay, or analyze various types of data other than network traffic between computers, for example internally or externally originated application program interface (“API”) calls, such as system calls made by processes on a server (having one or more of the modules described below) to the operating system of the server, transactions on a database (such as a database having one or more of the modules described below), or network traffic sent or received by a host computing device having one or more of the modules described below.
- the techniques are also applicable to systems using non-commodity, customized computing hardware, as such components are also expected to benefit from use of the techniques described herein, and applications are not limited to network data, as other forms of data may be processed in accordance with some embodiments.
- FIG. 1 illustrates an example of a computing environment 10 having a network-traffic capture module 12 that, in some embodiments, is operative to capture network traffic (or other types of data) at a relatively high rate.
- the network-traffic capture module 12 uses multiple threads for buffering network data in memory (e.g., dynamic random access memory) and storing the buffered network data in system storage (e.g., a hard disk drive or solid-state drive providing persistent storage in the absence of power).
- the buffered network data is written to one of a plurality of sub-buffers in system memory and, concurrently, other sub-buffers in system memory are saved into pre-formed files in system storage.
- Embodiments of this buffering technique are expected to accommodate variations in the rate with which data is written to system storage, variations which could otherwise cause the loss of data when attempting to write data at rates nearing the specified maximum rates for the system storage components. Further, embodiments expedite the process of writing data to system storage by overwriting pre-formed files, again, as explained in greater detail below with reference to FIGS. 1 , 2 , and 3 .
- the computing environment 10 includes, in addition to the network-traffic capture module 12 , an administrator device 14 , a secured portion of a network 16 , and a network switch 18 coupled to the Internet 20 .
- the administrator device 14 includes an interface by which an administrator issues commands to control the network-traffic capture module 12 and by which an administrator views information output by the network-traffic capture module 12 , such as status information indicating the state of an ongoing process of capturing network traffic.
- the secured portion of the network 16 may include a plurality of computers 22 , such as desktop computers, laptop computers, tablet computers, smart phones, and the like, connected to a local area network 24 , such as an intranet, that is connected to the Internet 20 via the network switch 18 .
- the networks 24 and 20 may convey data encoded in packet-based protocols, such as Internet Protocol, TCP, UDP, FTP, and the like, in a sequence of packets through the network switch 18.
- the network 20 and data-capture module 12 may be co-located, e.g., in the same building, on the same intranet, to reduce latency relative to remote uses, or the data-capture module 12 may be remote.
- the secured portion of the network 16 may reside behind a firewall and an intrusion detection system, in some embodiments, such as the intrusion detection system described below with reference to FIGS. 7 and 8 .
- the Internet 20 is generally unsecured, and various attacks may pass through the Internet 20 along with legitimate traffic to or from the computing devices 22 in the secured portion of the network 16 .
- the secured portion may also generate outgoing traffic indicative of an intrusion.
- a relatively large amount of data per unit time is exchanged between the secured portion of the network 16 and the Internet 20 through the network switch 18 , such as approximately 1 Gb per second, 10 Gb per second, or 40 Gb per second, for example.
- a portion of the data passing through switch 18 is indicative of an attack, such as a distributed denial of service attack, brute-force attempts to test passwords, SQL injection, buffer overflow, or other forms of attack.
- the data can be verified to be intrusion free. Capturing this data (e.g., intrusion free data) over some period of time, such as a duration of greater than or approximately 1 minute, 10 minutes, one hour, one day, or one week, is often useful for providing a training set for an intrusion detection system, such as the intrusion detection systems described herein.
- Part of the indication of some attacks includes the timing of the constituent packets arriving at the switch 18 , and some attacks may be signaled by a very small portion of the network traffic.
- the network-traffic capture module 12 performs processes that allow the module 12 to capture data at higher rates than is possible using traditional techniques with a given set of computing hardware. It should be noted, however, that the techniques described herein address a number of problems that arise in the context of intrusion detection systems and not all embodiments relate to capturing network traffic.
- the techniques described herein are broadly applicable in fields other than intrusion detection systems, for example in data capture modules for high data rate sensors, capturing high-frequency trading data, or other systems in which data is captured at a high rate, for example within 20% of the specified maximum data rate of components in the network traffic capture module 12 .
- the network-traffic capture module 12 is a computing device, such as a rack-mounted computing device (e.g., consuming four units of rack space), such as a computing device having a single chassis (or more) and a single motherboard (or more). Or components may be distributed or replicated across multiple computing devices, e.g., with one device capturing odd-numbered packets and another device capturing even-numbered packets.
- the illustrated network-traffic capture module 12 includes a central processing unit (CPU) 26 , a network interface 28 , system memory 30 , and system storage 32 .
- the components 26 , 28 , 30 , and 32 may be physically coupled to one another in a rack-mounted computing device, such as an intrusion detection appliance for installation at a user's site, and the CPU 26 and system memory may be coupled directly to one another via a system board, e.g., via one or more memory channels on a motherboard.
- the network-traffic capture module 12 may also include other features that are not shown, such as a power supply and ports for interfacing with various input/output devices, such as a keyboard and monitor.
- the CPU 26 may be any of a variety of different types of CPUs, e.g., a multicore CPU capable of hyperthreading, such as a six core Xeon 5690 CPU, in which each core supports two hyper threads, available from Intel Corporation of Santa Clara, Calif.
- the CPU includes a memory controller operable to communicate with system memory 30 using various memory standards, such as DDR3.
- the illustrated CPU 26 also communicates with the network interface 28 and the system storage 32 to move data from the network interface 28 to system memory 30 and then to system storage 32 .
- the CPU 26 executes an operating system, such as Ubuntu Linux, along with a data-capture module 34 , which may be embodied by program code stored on a tangible, non-transitory, machine-readable medium, such as system memory 30 or system storage 32 , and which when executed by the CPU 26 , causes the CPU 26 to perform the operations described below with reference to FIGS. 2 and 3 .
- the data-capture module 34 of this embodiment further includes a saving thread 36 and a buffering thread 38 that operate concurrently to move data from the network interface 28 into system memory 30 and from system memory 30 into the system storage 32 , as described in greater detail below.
- threads 36 and 38 may be separate processes, rather than threads of a single process, and some embodiments may include multiple instances of each thread 36 or 38 .
- FIG. 1 illustrates a number of functional blocks as discrete components, but it should be noted that code or hardware by which these functional blocks are implemented may be conjoined, distributed, intermingled, or otherwise differently organized relative to FIG. 1 .
- the saving thread 36 and buffering thread 38 are illustrated as separate threads, embodiments are not limited to this arrangement.
- the saving thread and the buffering thread may be provided by multiple instances of a thread that changes from a saving mode to a buffering mode and back periodically, in some use cases.
- the network interface 28 is a network-interface card capable of sending or receiving data on a network, such as an Ethernet network, at approximately 1 Gb per second, 10 Gb per second, or 40 Gb per second, for instance.
- the network interface 28 may include additional buffers that are distinct from those described below with reference to the system memory 30 , or in some embodiments, buffers on the network interface 28 may provide the functionality described below with reference to the buffers in the system memory 30 .
- System memory 30 is random access memory, for example synchronous dynamic random access memory, such as a plurality of dual in-line memory modules of DDR3 SDRAM of an amount selected based on desired data capture rates and other factors.
- the system memory 30 stores various data structures operable to facilitate relatively rapid capture of data into system storage 32 .
- system memory 30 includes a buffer 40 having a plurality of sub-buffers 42 , 44 , and 46 , and read and write pointers 48 and 50 that each identify one (e.g., one and only one) of the sub-buffers 42 , 44 , or 46 .
- Each of the sub-buffers 42 , 44 , and 46 includes sub-buffer storage 52 where captured data is held, a lock 54 , an exhausted-state value 56 , and an identifier 58 (e.g., a unique identifier among the sub-buffers).
- Each sub-buffer storage 52 may be operable to store a discrete, predetermined amount of network traffic data from the network interface 28 before that network traffic data is written to system storage 32 .
- each sub-buffer storage 52 has a size of approximately 1.3 gigabytes for a 10 Gb data feed, for example, and some embodiments include approximately 10 sub-buffers. Other embodiments can have larger or smaller sub-buffers and more or fewer sub-buffers.
- Each sub-buffer 42 , 44 , and 46 may be associated (e.g., in a one-to-one correlation) with a lock 54 , having a state indicating whether the sub-buffer is either locked, meaning that the sub-buffer is not to receive additional network data, or unlocked, meaning that the sub-buffer is available to be overwritten.
- the lock constitutes a mutual exclusion (mutex) lock, spin lock, semaphore, or other variable configured to prevent two threads from accessing the sub-buffer at the same time.
- the exhausted-state value 56 has uses described below with reference to FIGS. 4-6 and, in some embodiments, may be omitted from the module 12 , which is not to suggest that other features may not also be omitted in some embodiments.
- Each sub-buffer 42 , 44 , and 46 may be further associated with an identifier 58 that, in some embodiments, uniquely identifies the sub-buffer among all of the other sub-buffers.
- the write pointer 50 may point to the sub-buffer to which data is currently being written from the network interface 28 by, for example, storing the identifier 58 of the corresponding sub-buffer.
- the identifiers are integer values, and the write pointer 50 is a value that is incremented or decremented when advancing from one sub-buffer to the next when a sub-buffer is deemed full.
- the read pointer 48 points to a sub-buffer from which data is currently being read and transferred to system storage 32 .
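With integer identifiers, a pointer can be a bare index advanced modulo the number of sub-buffers. The helper below, with illustrative names, shows a read pointer skipping ahead to the next locked (not-yet-saved) sub-buffer; representing lock states as a list of booleans is an assumption for the sketch:

```python
def advance(pointer, count):
    """Move a read or write pointer to the next sub-buffer identifier, wrapping."""
    return (pointer + 1) % count

def next_locked(read_pointer, locked_flags):
    """Advance the read pointer until it identifies a locked (unsaved) sub-buffer;
    return None if no sub-buffer currently awaits saving."""
    count = len(locked_flags)
    for _ in range(count):
        read_pointer = advance(read_pointer, count)
        if locked_flags[read_pointer]:
            return read_pointer
    return None
```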
- the pre-formed files are formed by writing dummy data, such as a string of zeros, in a predetermined amount of space in the system storage, such as approximately one pre-formed file (or more) that receives data from a plurality of sub-buffers, e.g., a one or two terabyte file, depending on the application.
- the pre-formed files 60 may be formed by creating the files before executing the data-capture module 34 , in some embodiments.
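Pre-forming a file might look like the following sketch; the path, file size, and chunked zero-fill are illustrative assumptions (a real deployment would preallocate terabyte-scale files on the capture volume before a capture run begins):

```python
import os
import tempfile

def preform_file(path, size, chunk=1 << 20):
    """Create a pre-formed file of `size` bytes filled with dummy zeros.

    Capture runs later overwrite this space in place rather than growing a
    new file, avoiding per-write allocation; `chunk` bounds memory use."""
    zeros = b"\0" * chunk
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n

path = os.path.join(tempfile.mkdtemp(), "capture-000.bin")
preform_file(path, 5 * 1024 * 1024)   # a small stand-in for a 1-2 TB file
```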
- the network-traffic capture module 12 may be operable to capture network traffic at a relatively high rate using relatively inexpensive computing hardware.
- FIGS. 2 and 3 illustrate examples of processes by which the network-traffic capture module 12 captures data.
- the term “process” used herein refers to a method, which may correspond to a “process,” as that term is used to describe computing tasks (described herein as “computing processes”), but the unmodified term “process” is not so limited, e.g., a single computing process may have multiple threads, each executing one of the processes of FIG. 2 or 3 .
- FIG. 2 illustrates an example of a process 62 performed by the buffering thread 38 of FIG. 1 .
- the process 62 begins with receiving network data from a network interface, as illustrated by block 64 .
- the network data may be received at a relatively high rate, such as those discussed above.
- the network data is received using direct memory access into a buffer of a network interface card or using some other interface provided by a network card driver.
- the data passing through the network interface 28 may be mirrored by the network switch 18 described above, such that a copy of all (or substantially all, for example) network traffic passing through the network switch 18 is received by the network interface 28 and the buffering thread 38 .
- Receiving the network data from a network interface includes associating (for example, in a one-to-one relationship) a timestamp with the received packet, indicating a time at which the packet was received.
- the timestamp is a timestamp counter value of the network interface 28 incremented according to a clock signal of the network interface 28 , thereby providing relatively fine-grained documentation of the time at which packets are received for re-creating the flow of network traffic in accordance with techniques described below with reference to FIGS. 4 through 6 .
- receiving network data may entail receiving one and only one packet of network data in the step designated by block 64 , with subsequent packets received in subsequent repetitions of step 64 , such that each packet is associated with a timestamp, e.g., a time stamp having a resolution of less than one microsecond, for instance one nanosecond, and such that each packet has a unique (or approximately unique, e.g., shared by less than 10 packets) timestamp relative to the other packets.
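The per-packet timestamp association might be sketched as below, with a deterministic counter standing in for the network interface's clock-driven timestamp counter; the counter values and helper names are assumptions:

```python
import itertools
import time

def make_timestamper(counter=None):
    """Return a function that pairs each received packet with a unique timestamp.

    A real implementation would read the NIC's timestamp counter; by default a
    nanosecond-resolution monotonic clock stands in for it here."""
    if counter is None:
        counter = iter(time.monotonic_ns, None)  # never reaches the sentinel
    def stamp(packet):
        return (next(counter), packet)
    return stamp

# Deterministic counter for illustration: ticks of a hypothetical NIC clock.
stamp = make_timestamper(itertools.count(start=1_000, step=50))
stamped = [stamp(p) for p in (b"a", b"b", b"c")]
# stamped == [(1000, b"a"), (1050, b"b"), (1100, b"c")]
```

Because the counter is strictly increasing, each packet receives a timestamp unique relative to the others, which is what the replay side later relies on to reproduce inter-packet timing.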
- the process 62 may further include determining whether all (or substantially all) received network data is written to a sub-buffer, as indicated by block 70 .
- packets are received and written one packet at a time, and the determination 70 is performed by determining that the one packet has been written to a sub-buffer. In some cases, the determination 70 may be performed by, for example, requesting additional data from the network interface 28 described above.
- the process returns to step 64 and additional network data is received.
- the process determines whether the active sub-buffer is full, as illustrated by block 72 .
- In response to determining that the active sub-buffer is not full, the process returns to step 68 and additional received network data is written to the active sub-buffer. Alternatively, in response to determining that the active sub-buffer is full, the process proceeds to lock the active sub-buffer, as indicated by step 74 . Locking the active sub-buffer may include changing the state of a lock associated with the active sub-buffer, thereby indicating to other threads that the active sub-buffer contains data that has not yet been written to system storage and should not be overwritten. In this branch, the process 62 further includes identifying an unlocked sub-buffer, as indicated by block 76 .
- FIG. 3 illustrates an embodiment of a process 80 performed by the above-mentioned saving thread 36 of FIG. 1 that moves data from the sub-buffers to system storage.
- the process 80 includes identifying a locked sub-buffer among a plurality of sub-buffers, as indicated by block 82 . Identifying a locked sub-buffer may include incrementing or decrementing the above-mentioned read pointer 48 and determining whether the read pointer 48 identifies a sub-buffer that is locked, meaning in this context that the sub-buffer contains data that has not yet been written to system storage.
- the process 80 of this embodiment further includes identifying a pre-formed file in system storage, as indicated by block 84 .
- the pre-formed file is overwritten with the network data from the identified sub-buffer, as indicated by block 86 .
- overwriting preformed files is expected to be faster than creating new files. Further, writing larger groups of data, such as an entire sub-buffer of data, is expected to be faster than writing data to system storage one packet at a time.
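The whole-sub-buffer write can be sketched as follows, with an in-memory `BytesIO` standing in for a pre-formed file on disk; the offset bookkeeping is an illustrative assumption:

```python
import io

def save_sub_buffer(storage, file_offset, sub_buffer_bytes):
    """Overwrite part of a pre-formed file with an entire sub-buffer in one write.

    Writing one large block per sub-buffer, rather than one packet at a time,
    reflects the expectation stated above. Returns the next write position."""
    storage.seek(file_offset)
    storage.write(sub_buffer_bytes)
    return file_offset + len(sub_buffer_bytes)

preformed = io.BytesIO(b"\0" * 32)       # stand-in for a pre-formed file
offset = save_sub_buffer(preformed, 0, b"x" * 8)
offset = save_sub_buffer(preformed, offset, b"y" * 8)
```

Because the file's extent already exists, each write lands in pre-allocated space and the file system performs no allocation on the hot path.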
- the processes 62 and 80 may be executed concurrently (e.g., substantially simultaneously), for instance by different threads, to capture network data at a higher rate than systems in which processes 62 and 80 are executed consecutively, though embodiments are not limited to those performing concurrent operations.
- one or more cores or hyperthreads execute process 62 concurrent with one or more other, different cores or hyperthreads executing process 80 .
- Both threads may access the same memory structure of buffer 40 described above. Communication between the threads may occur via the above-mentioned locks 54 by which the threads coordinate memory access. For example, full sub-buffers may be locked with a mutex, a spinlock, a semaphore, or other variable that prevents the buffering thread from overwriting the locked sub-buffer until the saving thread unlocks the sub-buffer.
- FIG. 4 illustrates an embodiment of a network-traffic replay module 90 that, in some implementations, replays network traffic captured using the system and processes of FIGS. 1 through 3 .
- some embodiments replay such traffic at a higher data rate and with greater fidelity than is possible using traditional techniques. That said, not all embodiments described herein use the network-traffic replay module 90 , and the module 90 is applicable in systems other than those described elsewhere in the present application.
- the physical components of the network-traffic replay module 90 may be the same as those of the network-traffic capture module 12 described above with reference to FIG. 1 , as indicated by the repetition of element numbers 26 , 28 , 30 , and 32 .
- the network traffic replay module 90 is formed by operating the same computing device that forms the network-traffic capture module 12 in a replay mode rather than in a capture mode, e.g., in response to commands from the administrator device 14 .
- system storage 32 includes stored network data 92 , which may be the captured data described above or may be data acquired through other techniques, for example simulated network data or data captured from other sources.
- system memory 30 includes the above-described buffer 40 in some embodiments.
- the CPU 26 may execute instructions that cause the CPU 26 to provide a batching pre-processor 94 and a replay module 96 having a buffering thread 98 and a sending thread 100 . Further, when operated in replay mode, the CPU 26 may form in system memory (or system storage) a storage offset list 102 .
- the batching pre-processor 94 calculates offsets that define groups of data in the stored network data for transfer, as a group (e.g., in response to a single read command), to one of the sub-buffers 42 , 44 , or 46 . These offsets may be calculated before replaying network data and may be stored in the storage offset list 102 . In some use cases, the batching pre-processor 94 is not instantiated at the same time as the replay module 96 .
- the batching pre-processor 94 may be executed before the replay module 96 is instantiated to prepare the storage offset list 102 in advance of a replay of network traffic, and in some cases, the offset list 102 may be read from storage 32 before or when instantiating the replay module 96 .
- the buffering thread 98 may later use these offsets to read the groups of data from the stored network data to one of the sub-buffers 42 , 44 , and 46 , and the sending thread 100 may send this data in accordance with timestamps associated with each packet in the data, thereby re-creating the flow of network traffic that was previously captured.
- the recreated flow of data may be directed to a recipient system 104 , which may analyze the network traffic, for example testing or training candidate intrusion detection models for accuracy.
- the operation of the network-traffic replay module may be controlled by the administrator device 14 , which may issue commands to the network-traffic replay module 90 that instantiate the batching pre-processor 94 and instantiate the replay module 96 .
- the components of the network-traffic replay module 90 are illustrated as discrete functional blocks, it should be understood that hardware and software by which the functionality is provided may be distributed, conjoined, intermingled, or otherwise differently organized from the manner in which the functional blocks are illustrated.
- the process 106 includes obtaining network data stored in system storage, as illustrated by block 108 .
- the network data may be obtained in the computing environment 10 of FIG. 1 by executing the processes of FIGS. 2 and 3 , or network data may be obtained with other techniques, for example network data may be simulated or obtained from some other system.
- the present techniques are not limited to network data and other types of data acquired over time may be replayed, for example relatively high bandwidth data from sensor arrays or market data generated by high-frequency trading activities.
- the process 106 includes calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer, as illustrated by block 110 .
- This step 110 may be performed by the above-described batching pre-processor 94 , for example, automatically upon having captured network data, or in response to a command from the administrator device 14 .
- the size of the sub-buffer storage 52 of the sub-buffers 42 , 44 , and 46 may be predefined and may be the same among each of the sub-buffers, in some embodiments.
- the size of the sub-buffers may be substantially larger than a packet of network data; for example, the packets may be generally smaller than 1,514 bytes (or larger, such as approximately 9,000 bytes for jumbo frames), while the sub-buffer storage may be approximately 1.3 Gb, for example, between 1.3 Mb and 1.3 Gb, depending on the time interval being buffered and the speed of the interface.
- the offsets are expressed as memory addresses in the system storage 32 , such that the offsets bookend, or bracket, a block of network data (e.g., in a range of sequential addresses in system storage) substantially filling a sub-buffer (e.g. filling the sub-buffer with a sequence of network data constituting an integer number of network packets and without a packet spanning between sub-buffers).
- the calculated offsets may be stored in system memory, for example in the storage offset list 102 or, in some use cases, the offsets are stored as a file in system storage 32 to be read into system memory 30 when replaying the stored network data 92 .
- Pre-calculating the offsets is expected to facilitate the transfer of data from system storage to sub-buffers with relatively few read commands, which tends to facilitate higher rates of data transfer from system storage than transfers in which sub-buffers are filled with data from a plurality of read commands.
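The offset pre-calculation of block 110 might be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes captured packets are stored back-to-back and that per-packet lengths are known from capture headers, and the function name is hypothetical. Each batch holds an integer number of packets and comes as close as possible to filling one sub-buffer without a packet spanning between sub-buffers, as described above.

```python
def calculate_offsets(packet_lengths, sub_buffer_size):
    """Return (start, end) byte offsets into the stored network data such
    that each batch holds a whole number of packets and substantially
    fills, but never overflows, one sub-buffer."""
    offsets = []
    start = 0
    end = 0
    for length in packet_lengths:
        if length > sub_buffer_size:
            raise ValueError("packet larger than a sub-buffer")
        if end - start + length > sub_buffer_size:
            offsets.append((start, end))  # close the current batch
            start = end
        end += length
    if end > start:
        offsets.append((start, end))  # final, possibly partial batch
    return offsets
```

Each resulting pair bookends a range of sequential addresses that the buffering thread can move with a single read command.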
- the process 106 further includes identifying an exhausted sub-buffer in system memory, as illustrated by block 114 .
- This step may be performed in a different order from the order illustrated by the flow chart.
- the sub-buffer may be identified before identifying the batch of data in step 112 .
- Identifying an exhausted sub-buffer may include incrementing or decrementing the write pointer 50 of the buffer 40 of FIG. 4 and determining whether a corresponding sub-buffer is exhausted, e.g., as indicated by the exhausted state value 56 .
- sub-buffers are designated as exhausted when the sub-buffer does not contain data yet to be sent by the sending thread 100 .
- the buffering thread 98 may determine that the sub-buffer identified by the write pointer 50 is not exhausted and wait until the state of that sub-buffer changes.
- the process 106 further includes moving (e.g., reading) the batch of data to the identified sub-buffer, as illustrated by block 116 .
- the batch of data between the identified offsets may be moved by issuing a single (e.g., one and only one) read command to the system storage 32 , instructing the system storage 32 to read to memory the network data between adjacent offsets, such as in a sequence of storage addresses (or locations within a file) beginning with the first offset and ending with the second offset in the pair of offsets.
- the process 106 further includes designating the identified sub-buffer to which the batch of data was moved as being unexhausted, as illustrated by block 118 .
- Designating the sub-buffer as unexhausted may include changing the state of the corresponding exhausted state value 56 of that sub-buffer 42 , 44 , or 46 , for example from a value of true to a value of false.
- unexhausted sub-buffers contain data yet to be sent via the network interface.
- System memory 30 and system storage 32 serve different roles in some embodiments due to tradeoffs between capacity, persistence, and speed.
- the speed with which data is written to, or read from, system memory 30 is substantially higher than the speed with which data is read from, or written to, system storage 32 , but the capacity available in system storage 32 for a given price is typically substantially higher.
- System storage 32 is generally used to store larger amounts of data persistently (e.g. when power to a computer system is removed), but due to lower data rates, the system storage 32 may act as a bottleneck when replaying (or capturing) data. Further, the rate with which system storage 32 returns data may fluctuate.
- Buffering the data in system memory 30 is expected to accommodate these fluctuations, drawing down the buffer when the data rate of system storage 32 temporarily drops, and filling the buffer 40 when the data rate of the system storage 32 rises or the rate at which replay traffic is sent drops, as described below. Further, transferring the network data to system memory in groups identified by the pre-calculated offsets is expected to allow the buffering thread 98 to transfer the data with fewer read commands being issued to the system storage 32 , which is expected to result in higher rates of data transfer than would otherwise be achieved. Not all embodiments, however, provide these benefits or use these techniques.
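The buffering side of process 106 might be sketched as follows — a simplified, single-step illustration of how a buffering thread could advance a write pointer through a ring of sub-buffers. The `SubBuffer` structure and function names are hypothetical stand-ins for elements 40 through 56, not the disclosed implementation.

```python
from dataclasses import dataclass

@dataclass
class SubBuffer:
    """Hypothetical stand-in for sub-buffers 42, 44, and 46: storage plus
    the exhausted-state value 56 and lock state 54 described above."""
    storage: bytes = b""
    exhausted: bool = True   # True: contents already sent; safe to overwrite
    locked: bool = False     # True: sending thread is reading this sub-buffer

def buffer_next_batch(sub_buffers, write_index, batch):
    """Advance the write pointer to the next sub-buffer and, if it is
    exhausted and unlocked, move one pre-sized batch into it and mark it
    unexhausted. Returns the new write index, or None if the buffering
    thread must wait for the sending thread to catch up."""
    index = (write_index + 1) % len(sub_buffers)
    target = sub_buffers[index]
    if not target.exhausted or target.locked:
        return None               # sub-buffer still holds unsent data; retry
    target.storage = batch
    target.exhausted = False      # signal the sending thread: data ready
    return index
```

In a full implementation this step would repeat in a loop, with each `batch` read from system storage between a pair of the pre-calculated offsets.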
- FIG. 6 illustrates an embodiment of a process 120 that sends data from a buffer.
- the process 120 is performed by the above-described sending thread 100 of FIG. 4 sending data from the buffer 40 through the network interface 28 to the recipient system 104 .
- the data may be sent in the order in which the data was received when the data was captured and in accordance with timestamps associated with each packet in the data, such that pauses between packets of network data are re-created when sending the network data.
- the process of FIG. 6 may be performed concurrent with the process of FIG. 5 , e.g., when (in response to) a threshold number of sub-buffers being filled by the process of FIG. 5 , when a threshold amount of time has passed since the start of the process of FIG. 5 , or in response to some other signal, such that buffer 40 contains data to accommodate fluctuations in the sending rate and the rate at which data is read from storage 32 .
- the process 120 includes identifying the active sub-buffer, as indicated by block 122 .
- identifying the active sub-buffer may include identifying the active sub-buffer with the value in the read pointer 48 corresponding to an identifier 58 of one of the sub-buffers 42 , 44 , or 46 .
- the process 120 further includes retrieving network data from the active sub-buffer, as indicated by block 124 .
- retrieving the network data from the active sub-buffer includes retrieving a next packet (e.g., one and only one packet, or multiple packets) in an ordered sequence of packets of network data, where the packets are sequenced in the order in which the packets were captured.
- the process 120 further includes determining whether the timestamp corresponds to a CPU timestamp counter (or other system clock), as indicated by block 128 .
- a starting time corresponding to when a first packet was received during data capture may be stored and associated with the stored network data.
- this starting time and timestamps associated with each packet of stored network data are equal to (or determined based on) a timestamp counter of the network interface, which may have a different period than the CPU timestamp counter, and typically a much longer period.
- a network interface timestamp counter operating at 1 GHz may increment each nanosecond, while the CPU timestamp counter operating at 3 GHz may increment approximately every 333 picoseconds.
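The clock-rate mismatch in this example can be bridged by simple rescaling. The sketch below assumes the example frequencies above (a 1 GHz network-interface counter and a 3 GHz CPU timestamp counter) as defaults; real counters would require measuring or calibrating the actual frequencies, and the function name is illustrative.

```python
def nic_ticks_to_cpu_ticks(nic_ticks,
                           nic_hz=1_000_000_000,    # 1 GHz NIC counter
                           cpu_hz=3_000_000_000):   # 3 GHz CPU counter
    """Rescale a network-interface timestamp into CPU timestamp-counter
    ticks, using the example frequencies from the text as defaults."""
    return nic_ticks * cpu_hz // nic_hz
```

With these example rates, one NIC tick (1 ns) spans three CPU ticks of approximately 333 picoseconds each.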
- the present process includes disabling certain modes of operation of the CPU that might interfere with the timing at which packets are sent. For instance, lower power modes of operation of the CPU may be disabled and throttling may be disabled to keep the CPU timestamp counter operating at a relatively constant frequency.
- the process 120 includes sending the packet to the network interface, as indicated by block 130 .
- Sending the packet may include sending the packet on a network, such as an Ethernet network to a recipient system 104 , which, for example, may be an intrusion detection system, such as the intrusion detection system described below with reference to FIGS. 7 and 8 undergoing testing to evaluate intrusion detection models.
- in some use cases, the data is sent to a loopback address, and the process of FIG. 8 is performed on the data stream to test detection models concurrent with replaying the data.
- Sending the packet at the time the CPU timestamp counter corresponds with the timestamp is expected to cause the packets to be sent from the system with an inter-packet timing that approximates with relatively high fidelity the inter-packet timing with which the packets were received during capture, e.g., with a timing resolution approximately equal to that of the CPU timestamp counter or the network interface timestamp counter.
- the process 120 further includes determining whether all data has been retrieved from the active sub-buffer, as indicated by block 132 . Determining whether all data has been retrieved may include determining whether a last packet in a sequence of packets in the active sub-buffer has been retrieved. In response to determining that all the data has not been retrieved, the process 120 may return to block 124 and continue to iterate through the sequence of packets in the active sub-buffer until all of the packets have been retrieved. Alternatively, in response to determining that all of the data has been retrieved from the active sub-buffer, the process 120 may proceed to the next step.
- process 120 includes designating the active sub-buffer as exhausted and locking the active sub-buffer, as indicated by block 134 .
- Designating the active sub-buffer as exhausted may include changing the exhausted-state value 56 of the active sub-buffer, and locking the active sub-buffer may include changing a lock state 54 of the active sub-buffer.
- sub-buffers are locked with a mutex, a spinlock, or other variable configured to facilitate coordination between threads.
- the process 120 further includes identifying an unexhausted sub-buffer, as illustrated by block 136 . Identifying an unexhausted sub-buffer may include incrementing or decrementing the read pointer 48 of FIG. 4 and reading the exhausted-state value 56 of the sub-buffer 42 , 44 , or 46 corresponding to the read pointer 48 to confirm that the sub-buffer is unexhausted. Some embodiments may wait until the state of the next sub-buffer is unexhausted if needed.
- the process 120 includes designating the identified sub-buffer as the active sub-buffer and unlocking this sub-buffer, as indicated by block 138 .
- the identified sub-buffer may be designated as active by storing in the read pointer 48 the value of the identifier 58 of the corresponding sub-buffer 42 , 44 , or 46 . Further, the sub-buffer may be unlocked by changing the state of a corresponding lock. After performing step 138 , the process 120 may return to step 124 and retrieve network data from this newly identified and unlocked active sub-buffer.
- the process 120 is expected to send data from the buffer 40 in the sequence in which the data was captured and with timing approximating, with relatively high fidelity, the timing with which the data was captured. Further, use of the exhausted-state values and locking of the active sub-buffers is expected to facilitate concurrent processing for higher rates of data transfer. Not all embodiments, however, provide these benefits or use these techniques.
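The sending loop of process 120 might be sketched as follows — a simplified, software-clock version in which packet timestamps are offsets in seconds from the start of replay and a busy-wait (standing in for the CPU timestamp-counter comparison of block 128) paces each send. The names and interfaces are illustrative assumptions, not the disclosed implementation.

```python
import time

def send_sub_buffer(packets, clock=time.monotonic, send=print):
    """Replay (timestamp, payload) pairs in capture order, spinning until
    each packet's send time arrives so that inter-packet gaps from the
    original capture are re-created."""
    start = clock()
    for timestamp, payload in packets:   # ordered as captured
        while clock() - start < timestamp:
            pass                         # busy-wait, as with a timestamp counter
        send(payload)
```

In the system described above, `send` would hand the packet to the network interface, and the comparison would be against the CPU timestamp counter rather than a software clock.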
- the process 106 of FIG. 5 and the process 120 of FIG. 6 are performed concurrently by different threads executed by, for example, different cores or different hyperthreads of the CPU 26 .
- multiple buffering threads 98 and multiple sending threads 100 may be executed, for example with odd-numbered packets or groups of data being transferred by one buffering thread and even-numbered packets or groups of data being transferred by another buffering thread, and odd-numbered packets being sent by one sending thread 100 and even-numbered packets being sent by a different sending thread.
- a collection of threads may perform both roles, alternating between a sending mode and a buffering mode.
- processes 106 and 120 may be performed by different computing processes.
- FIG. 7 illustrates an embodiment of an intrusion detection system 140 .
- the intrusion detection system 140 is instantiated by operating the above-described computing hardware in a different mode of operation.
- the system 140 may include the CPU 26 , the network interface 28 , and the system memory 30 described above, and the system 140 may communicate with the above-described administrator device 14 , network switch 18 , Internet 20 , and secured network 16 .
- the intrusion detection system 140 includes the above described capture module but not the replay module, or vice versa, or neither of these features, which is not to suggest that other features may not also be omitted in some embodiments.
- the intrusion detection system 140 may be operable to monitor network traffic between the secured portion of the network 16 and the Internet 20 , for example all network traffic, substantially all network traffic, or a subset of network traffic.
- the intrusion detection system 140 may further be operable to detect network traffic indicative of malicious activity, such as denial of service attacks, viruses, and the like.
- some embodiments may be configured to detect anomalies in network traffic indicative of a zero-day attack by statistically analyzing traffic to identify new anomalies (e.g., zero-day attacks), and some embodiments may analyze traffic in real-time or near real-time, for example within one millisecond, one to five seconds, one to five minutes, or one hour.
- some embodiments of the intrusion detection system 140 apply a plurality of different models to the network data and aggregate signals from each of those models into an aggregate signal indicative of whether certain network traffic is indicative of an attack.
- the combination of models is expected to yield fewer false positives and fewer false negatives than individual models applied using traditional techniques.
- these models are executed on the CPU 26 or the graphics processing unit 142 , depending on which computing architecture is well-suited to the computations associated with the model, thereby facilitating relatively fast processing of models on the graphics processing unit 142 that would otherwise be relatively slow on the CPU 26 .
- Certain models are amenable to being executed on the CPU 26 , such as models in which much of the processing occurs sequentially and is not amenable to being performed in parallel. Examples of such models include a packet-timestamp model (e.g., based on statistics of inter-packet timing) and a packet-size model (e.g., based on statistics of packet size).
- the intrusion detection system 140 of FIG. 7 performs a process 158 for detecting intrusions, illustrated by FIG. 8 .
- the process 158 includes receiving network data from a network interface, as illustrated by block 160 , and buffering the network data from the network interface in system memory, as indicated by block 162 . Receiving and buffering may be performed by the above-mentioned buffering thread 152 storing the received data in the buffer 40 in accordance with the techniques described above.
- the process 158 further includes retrieving the network data buffered in the system memory, as indicated by block 164 .
- Retrieving the network data may be performed by the above-mentioned loading thread 150 using the techniques described above with reference to FIG. 3 except that the data is, instead of being written to system storage, input to each of a plurality of statistical or machine-learning intrusion detection models, as indicated by block 166 .
- the loading thread 150 of FIG. 7 may write the retrieved data to the graphics memory 151 for access by the models of the graphics processing unit 142 .
- the data may also be made available to the CPU-executed models 148 , also of FIG. 7 .
- the state of each of the models (for those models in which state is maintained) may be updated to reflect information present in the retrieved network data, and intrusion-likelihood scores from each of these models may be updated and communicated to the model aggregator 146 .
- the process 158 aggregates scores from a plurality of different models, each model potentially having different sensitivity to different types of attacks, and as a result, is expected to exhibit fewer errors than conventional systems due to the combined effect of the models.
- the model aggregator 146 and models 154 , 156 , and 148 may be collectively referred to as an ensemble learning model.
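One simple way the model aggregator 146 could combine per-model intrusion-likelihood scores into an aggregate score is a weighted average compared against a threshold. The sketch below is an illustrative assumption — the disclosure does not fix a particular aggregation rule, and the weights and the 0.5 default threshold are hypothetical.

```python
def aggregate_scores(model_scores, weights=None, threshold=0.5):
    """Combine per-model intrusion-likelihood scores (each assumed to be
    in [0, 1]) into a weighted-average aggregate score, and report whether
    the aggregate exceeds the alert threshold."""
    if weights is None:
        weights = [1.0] * len(model_scores)   # equal weighting by default
    total = sum(w * s for w, s in zip(weights, model_scores))
    aggregate = total / sum(weights)
    return aggregate, aggregate > threshold
```

Because each model may be sensitive to different attack types, a score combined this way can trip the alert threshold even when no single model is individually confident.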
- the system 140 and process 158 process a subset of the models (or a subset of the computations of a given model) on a graphics processing unit, which is expected to facilitate relatively fast processing of models amenable to highly parallel processing, and the buffering techniques described above are expected to facilitate parallel processing of batches of data by the graphics processing unit 142 .
- Program code that, when executed by a data processing apparatus, causes the data processing apparatus to perform the operations described herein may be stored on a tangible program carrier.
- a tangible program carrier may include a non-transitory computer readable storage medium.
- a non-transitory computer readable storage medium may include a machine readable storage device, a machine readable storage substrate, a memory device, or any combination thereof.
- Non-transitory computer readable storage medium may include non-volatile memory (e.g., flash memory, ROM, PROM, EPROM, EEPROM memory), volatile memory (e.g., random access memory (RAM), static random access memory (SRAM), synchronous dynamic RAM (SDRAM)), bulk storage (e.g., CD-ROM and/or DVD-ROM, hard-drives), or the like.
- Such memory may include a single memory device and/or a plurality of memory devices (e.g., distributed memory devices).
- the program may be conveyed by a propagated signal, such as a carrier wave or digital signal conveying a stream of packets.
- the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must).
- the words “include”, “including”, and “includes” and the like mean including, but not limited to.
- the singular forms “a”, “an” and “the” include plural referents unless the content explicitly indicates otherwise.
- a special purpose computer or a similar special purpose electronic processing or computing device is capable of manipulating or transforming signals, for instance signals represented as physical electronic, optical, or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the special purpose computer or similar special purpose processing or computing device.
Abstract
Provided is an intrusion detection system configured to detect anomalies indicative of a zero-day attack by statistically analyzing substantially all traffic on a network in real-time. The intrusion detection system, in some aspects, includes a network interface; one or more processors communicatively coupled to the network interface; system memory communicatively coupled to the processors. The system memory, in some aspects, stores instructions that when executed by the processors cause the processors to perform steps including: buffering network data from the network interface in the system memory; retrieving the network data buffered in the system memory; applying each of a plurality of statistical or machine-learning intrusion-detection models to the retrieved network data; aggregating intrusion-likelihood scores from each of the intrusion-detection models in an aggregate score, and upon the aggregate score exceeding a threshold, outputting an alert.
Description
- 1. Field of the Invention
- The present disclosure relates generally to data processing and, more specifically, to capturing, replaying, and analyzing time-series data.
- 2. Description of the Related Art
- In recent years, it has become increasingly difficult to detect malicious activity carried on networks. The volume of traffic moving through a given node on modern networks is substantially larger than even in the recent past, making it more difficult to assess whether any particular portion of the data conveyed will cause harm. Further, the sophistication of attacks has increased substantially, as entities with greater resources, such as organized crime and state actors, have directed resources towards developing new modes of attack. Many existing intrusion detection systems fail to assess network traffic at the rates supported by modern networking equipment and at desired levels of accuracy and are, thus, vulnerable to being overwhelmed, for example, with a denial of service attack. Similar problems are present in other fields in which data is captured, replayed, or analyzed at relatively high rates.
- The following is a non-exhaustive listing of some aspects of the present techniques. These and other aspects are described in the following disclosure.
- Some aspects include an intrusion detection system configured to detect anomalies indicative of a zero-day attack by statistically analyzing substantially all traffic on a network in real-time. The intrusion detection system, in some aspects, includes a network interface; one or more processors communicatively coupled to the network interface; system memory communicatively coupled to the processors. The system memory, in some aspects, stores instructions that when executed by the processors cause the processors to perform steps including buffering network data from the network interface in the system memory; retrieving the network data buffered in the system memory; applying each of a plurality of statistical or machine-learning intrusion-detection models to the retrieved network data; aggregating intrusion-likelihood scores from each of the intrusion-detection models in an aggregate score, and upon the aggregate score exceeding a threshold, outputting an alert.
- Some aspects include a tangible non-transitory machine-readable medium storing instructions that when executed by a data processing apparatus cause the data processing apparatus to perform operations including buffering network data from a network interface in system memory; retrieving the network data buffered in the system memory; applying each of a plurality of statistical or machine-learning intrusion-detection models to the retrieved network data; aggregating intrusion-likelihood scores from each of the intrusion-detection models in an aggregate score, and upon the aggregate score exceeding a threshold, outputting an alert.
- Some aspects include a process, including buffering network data from a network interface in a system memory; retrieving the network data buffered in the system memory; applying each of a plurality of statistical or machine-learning intrusion-detection models to the retrieved network data; aggregating intrusion-likelihood scores from each of the intrusion-detection models in an aggregate score, and upon the aggregate score exceeding a threshold, outputting an alert.
- Some aspects include an intrusion detection system, including a network interface; one or more processors communicatively coupled to the network interface; system memory communicatively coupled to the processors; and system storage communicatively coupled to the processors. In some aspects, the system storage stores instructions that when executed by the processors cause the intrusion detection system to perform steps, including writing network data from the network interface to a buffer in the system memory; and concurrent with writing the network data to the buffer in the system memory, writing the network data from the buffer in the system memory to the system storage.
- Some aspects include a tangible non-transitory machine-readable medium storing instructions that when executed by a data processing apparatus cause the data processing apparatus to perform operations, including writing network data from a network interface to a buffer in the system memory, wherein writing the network data from the network interface to the buffer in the system memory includes writing the network data to an active unlocked sub-buffer among the plurality of sub-buffers, locking the active sub-buffer, designating an unlocked sub-buffer as the active sub-buffer, and after ascertaining that the network data stored in the locked sub-buffer has been written to system storage, unlocking the locked sub-buffer; and concurrent with writing the network data to the buffer in the system memory, writing the network data from the buffer in the system memory to the system storage.
- Some aspects include a process, including writing network data from a network interface to a buffer in the system memory, wherein writing the network data from the network interface to the buffer in the system memory, includes writing the network data to an active unlocked sub-buffer among the plurality of sub-buffers, locking the active sub-buffer, designating an unlocked sub-buffer as the active sub-buffer, and after ascertaining that the network data stored in the locked sub-buffer has been written to system storage, unlocking the locked sub-buffer; and concurrent with writing the network data to the buffer in the system memory, writing the network data from the buffer in the system memory to the system storage.
- Some aspects include an intrusion detection system, including a network interface; a plurality of processors communicatively coupled to the network interface; system memory communicatively coupled to the plurality of processors; system storage communicatively coupled to the plurality of processors. In some aspects, the system storage stores previously captured network data and instructions that when executed by the plurality of processors cause the intrusion detection system to perform steps including pre-processing the network data in the system storage by calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer; moving a batch of the network data from the system storage to a buffer in the system memory, wherein the batch is stored between a pair of the calculated offsets in the system storage prior to moving; concurrent with writing network data to the buffer, sending network data in the buffer to the network interface, wherein sending the network data includes reading time-stamps associated with packets in the network data, and sending each packet when the associated time stamp corresponds with a time stamp counter of at least one of the processors.
- Some aspects include a process, including pre-processing network data stored in system storage by calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer; moving a batch of the network data from the system storage to a buffer in the system memory, wherein the batch is stored between a pair of the calculated offsets in the system storage prior to moving; concurrent with writing network data to the buffer, sending network data in the buffer to a network interface, wherein sending the network data includes reading time-stamps associated with packets in the network data, and sending each packet when the associated time stamp corresponds with a time stamp counter of a processor.
- Some aspects include a tangible, machine-readable, non-transitory medium storing instructions that when executed by a data processing apparatus cause the data processing apparatus to perform operations including pre-processing network data stored in system storage by calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer; moving a batch of the network data from the system storage to a buffer in the system memory, wherein the batch is stored between a pair of the calculated offsets in the system storage prior to moving; concurrent with writing network data to the buffer, sending network data in the buffer to a network interface, wherein sending the network data includes reading time-stamps associated with packets in the network data, and sending each packet when the associated time stamp corresponds with a time stamp counter of a processor.
- The above-mentioned aspects and other aspects of the present techniques will be better understood when the present application is read in view of the following figures in which like numbers indicate similar or identical elements:
- FIG. 1 illustrates a network-traffic capture module in accordance with embodiments of the present techniques;
- FIGS. 2 and 3 illustrate processes for capturing network traffic in accordance with embodiments of the present techniques;
- FIG. 4 illustrates a network-traffic replay module in accordance with embodiments of the present techniques;
- FIGS. 5 and 6 illustrate processes for replaying network traffic in accordance with embodiments of the present techniques;
- FIG. 7 illustrates an intrusion detection system in accordance with embodiments of the present techniques; and
- FIG. 8 illustrates a process for detecting intrusions in accordance with embodiments of the present techniques.
- While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. The drawings may not be to scale. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but to the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
- FIGS. 1-8 describe systems and processes for capturing, replaying, or analyzing time-series data (e.g., network data passing through a network node over time) at a relatively high rate (for example, 10 gigabit (Gb) per second or faster), using relatively inexpensive, off-the-shelf commodity computing components. These techniques may be combined in a single system, e.g., an intrusion detection system, having different modes of operation for capture, replay, and analysis. But, it should be noted that these techniques may be used separately in different systems and applications, e.g., for data capture or replay in contexts other than detecting intrusions in network traffic.
- The techniques described herein are broadly applicable. In some use cases, the techniques may be used to capture, replay, or analyze various types of data other than network traffic between other computers, for example internally or externally originated application program interface (“API”) calls, such as system calls made by processes on a server (having one or more of the modules described below) to the operating system of the server, transactions on a database (such as a database having one or more of the modules described below), or network traffic sent or received by a host computing device having one or more of the modules described below. The techniques are also applicable to systems using non-commodity, customized computing hardware, as such components are also expected to benefit from use of the techniques described herein, and applications are not limited to network data, as other forms of data may be processed in accordance with some embodiments.
-
FIG. 1 illustrates an example of a computing environment 10 having a network-traffic capture module 12 that, in some embodiments, is operative to capture network traffic (or other types of data) at a relatively high rate. The network-traffic capture module 12, in some implementations, uses multiple threads for buffering network data in memory (e.g., dynamic random access memory) and storing the buffered network data in system storage (e.g., a hard disk drive or solid-state drive providing persistent storage in the absence of power). As explained in greater detail below, in some implementations, the buffered network data is written to one of a plurality of sub-buffers in system memory and, concurrently, other sub-buffers in system memory are saved into pre-formed files in system storage. Embodiments of this buffering technique are expected to accommodate variations in the rate with which data is written to system storage, variations which could otherwise cause the loss of data when attempting to write data at rates nearing the specified maximum rates for the system storage components. Further, embodiments expedite the process of writing data to system storage by overwriting pre-formed files, again, as explained in greater detail below with reference to FIGS. 1, 2, and 3. - In some embodiments, the
computing environment 10 includes, in addition to the network-traffic capture module 12, an administrator device 14, a secured portion of a network 16, and a network switch 18 coupled to the Internet 20. - The
administrator device 14 includes an interface by which an administrator issues commands to control the network-traffic capture module 12 and by which an administrator views information output by the network-traffic capture module 12, such as status information indicating the state of an ongoing process of capturing network traffic. - The secured portion of the
network 16 may include a plurality of computers 22, such as desktop computers, laptop computers, tablet computers, smart phones, and the like, connected to a local area network 24, such as an intranet, that is connected to the Internet 20 via the network switch 18. The networks 24 and 20 exchange data via the network switch 18. In some use cases, the network 20 and data-capture module 12 may be co-located, e.g., in the same building, on the same intranet, to reduce latency relative to remote uses, or the data-capture module 12 may be remote. The secured portion of the network 16 may reside behind a firewall and an intrusion detection system, in some embodiments, such as the intrusion detection system described below with reference to FIGS. 7 and 8. In contrast, the Internet 20 is generally unsecured, and various attacks may pass through the Internet 20 along with legitimate traffic to or from the computing devices 22 in the secured portion of the network 16. Further, the secured portion may also generate outgoing traffic indicative of an intrusion. In some use cases, a relatively large amount of data per unit time is exchanged between the secured portion of the network 16 and the Internet 20 through the network switch 18, such as approximately 1 Gb per second, 10 Gb per second, or 40 Gb per second, for example. - In some cases, a portion of the data passing through
switch 18 is indicative of an attack, such as a distributed denial of service attack, brute-force attempts to test passwords, SQL injection, buffer overflow, or other forms of attack. Or the data can be verified to be intrusion free. Capturing this data (e.g., intrusion-free data) over some period of time, such as a duration of greater than or approximately 1 minute, 10 minutes, one hour, one day, or one week, is often useful for providing a training set for an intrusion detection system, such as the intrusion detection systems described herein. Part of the indication of some attacks includes the timing of the constituent packets arriving at the switch 18, and some attacks may be signaled by a very small portion of the network traffic. - However, inexpensive commodity computing hardware, using conventional techniques, is often incapable of reliably capturing data above certain data rates, e.g., capturing 100% of the traffic, or approximately 100%, for instance greater than 80% of the traffic, or less depending on the application. In some embodiments, the network-
traffic capture module 12 performs processes that allow the module 12 to capture data at higher rates than is possible using traditional techniques with a given set of computing hardware. It should be noted, however, that the techniques described herein address a number of problems that arise in the context of intrusion detection systems, and not all embodiments relate to capturing network traffic. Further, the techniques described herein are broadly applicable in fields other than intrusion detection systems, for example in data capture modules for high-data-rate sensors, capturing high-frequency trading data, or other systems in which data is captured at a high rate, for example within 20% of the specified maximum data rate of components in the network-traffic capture module 12. - In this embodiment, the network-
traffic capture module 12 is a computing device, such as a rack-mounted computing device (e.g., consuming four units of rack space), such as a computing device having a single chassis (or more) and a single motherboard (or more). Or components may be distributed or replicated across multiple computing devices, e.g., with one device capturing odd-numbered packets and another device capturing even-numbered packets. The illustrated network-traffic capture module 12 includes a central processing unit (CPU) 26, a network interface 28, system memory 30, and system storage 32. The CPU 26 and system memory may be coupled directly to one another via a system board, e.g., via one or more memory channels on a motherboard. The network-traffic capture module 12 may also include other features that are not shown, such as a power supply and ports for interfacing with various input/output devices, such as a keyboard and monitor. - The
CPU 26 may be any of a variety of different types of CPUs, e.g., a multicore CPU capable of hyperthreading, such as a six-core Xeon 5690 CPU, in which each core supports two hyperthreads, available from Intel Corporation of Santa Clara, Calif. In some cases, the CPU includes a memory controller operable to communicate with system memory 30 using various memory standards, such as DDR3. The illustrated CPU 26 also communicates with the network interface 28 and the system storage 32 to move data from the network interface 28 to system memory 30 and then to system storage 32. The CPU 26 executes an operating system, such as Ubuntu Linux, along with a data-capture module 34, which may be embodied by program code stored on a tangible, non-transitory, machine-readable medium, such as system memory 30 or system storage 32, and which, when executed by the CPU 26, causes the CPU 26 to perform the operations described below with reference to FIGS. 2 and 3. - The data-
capture module 34 of this embodiment further includes a saving thread 36 and a buffering thread 38 that operate concurrently to move data from the network interface 28 into system memory 30 and from system memory 30 into the system storage 32, as described in greater detail below. In other embodiments, the threads 36 and 38 may be organized differently. -
FIG. 1 illustrates a number of functional blocks as discrete components, but it should be noted that code or hardware by which these functional blocks are implemented may be conjoined, distributed, intermingled, or otherwise differently organized relative to FIG. 1. Further, while the saving thread 36 and buffering thread 38 are illustrated as separate threads, embodiments are not limited to this arrangement. For example, the saving thread and the buffering thread may be provided by multiple instances of a thread that changes from a loading mode to a buffering mode and back periodically, in some use cases. - The
network interface 28, in some embodiments, is a network-interface card capable of sending or receiving data on a network, such as an Ethernet network, at approximately 1 Gb per second, 10 Gb per second, or 40 Gb per second, for instance. The network interface 28 may include additional buffers that are distinct from those described below with reference to the system memory 30, or in some embodiments, buffers on the network interface 28 may provide the functionality described below with reference to the buffers in the system memory 30. -
System memory 30 is random access memory, for example synchronous dynamic random access memory, such as a plurality of dual in-line memory modules of DDR3 SDRAM of an amount selected based on desired data capture rates and other factors. The system memory 30 stores various data structures operable to facilitate relatively rapid capture of data into system storage 32. In this embodiment, system memory 30 includes a buffer 40 having a plurality of sub-buffers 42, 44, and 46, a read pointer 48, and a write pointer 50. Each sub-buffer may include sub-buffer storage 52 where captured data is held, a lock 54, an exhausted-state value 56, and an identifier 58 (e.g., a unique identifier among the sub-buffers). Each sub-buffer storage 52 may be operable to store a discrete, predetermined amount of network traffic data from the network interface 28 before that network traffic data is written to system storage 32. In some embodiments, each sub-buffer storage 52 has a size of approximately 1.3 gigabytes for a 10 Gb data feed, for example, and some embodiments include approximately 10 sub-buffers. Other embodiments can have larger or smaller sub-buffers and more or fewer sub-buffers. - Each sub-buffer 42, 44, and 46 may be associated (e.g., in a one-to-one correlation) with a
lock 54, having a state indicating whether the sub-buffer is either locked, meaning that the sub-buffer is not to receive additional network data, or unlocked, meaning that the sub-buffer is available to be overwritten. In some cases, the lock constitutes a mutual exclusion (mutex), spin lock, semaphore, or other variable state configured to prevent two threads from accessing the sub-buffer at the same time. The exhausted-state value 56 has uses described below with reference to FIGS. 4-6 and, in some embodiments, may be omitted from the module 12, which is not to suggest that other features may not also be omitted in some embodiments. - Each sub-buffer 42, 44, and 46 may be further associated with an
identifier 58 that, in some embodiments, uniquely identifies the sub-buffer among all of the other sub-buffers. The write pointer 50 may point to the sub-buffer to which data is currently being written from the network interface 28 by, for example, storing the identifier 58 of the corresponding sub-buffer. In some cases, the identifiers are integer values, and the write pointer 50 is a value that is incremented or decremented when advancing from one sub-buffer to the next when a sub-buffer is deemed full. The read pointer 48 points to a sub-buffer from which data is currently being read and transferred to system storage 32. In some cases, the read pointer 48 is a variable containing the value of the identifier of the corresponding sub-buffer. The buffer 40, in some cases, may be characterized as a first-in, first-out buffer or as a circular buffer, though embodiments are not limited to buffers consistent with these terms. The buffer 40, in some cases, is provided by a portion of the system memory 30 allocated to the data-capture module 34 by an operating system in which the data-capture module 34 is executing. - In some embodiments, the
system storage 32 is a non-volatile form of memory, such as a solid-state drive or a hard disk drive, operable to store captured network traffic. In some embodiments, the system storage 32 is coupled to the CPU 26 via a SATA connection, a SAS connection, or a PCI Express connection. Examples of system storage consistent with the present techniques include the Cheetah-brand enterprise-grade hard disk drives from Seagate Technology, having principal offices in Cupertino, Calif. In some embodiments, the system storage may be organized into logical units that provide higher performance than individual drives would otherwise provide, such as in a RAID 0 or a RAID 10 array. In this embodiment, the system storage includes pre-formed files 60 that are overwritten with network traffic. Overwriting a pre-formed file is believed to be faster than creating a new file structure to receive network traffic in the system storage, thereby facilitating the capture of data at higher rates than would otherwise be achieved. In some cases, the pre-formed files are formed by writing dummy data, such as a string of zeros, in a predetermined amount of space in the system storage, such as approximately one pre-formed file (or more) that receives data from a plurality of sub-buffers, e.g., a one- or two-terabyte file, depending on the application. The pre-formed files 60 may be formed by creating the files before executing the data-capture module 34, in some embodiments. - The network-
traffic capture module 12 may be operable to capture network traffic at a relatively high rate using relatively inexpensive computing hardware. FIGS. 2 and 3 illustrate examples of processes by which the network-traffic capture module 12 captures data. The term “process” used herein refers to a method, which may correspond to a “process,” as that term is used to describe computing tasks (described herein as “computing processes”), but the unmodified term “process” is not so limited, e.g., a single computing process may have multiple threads, each executing one of the processes of FIG. 2 or 3. - Specifically,
FIG. 2 illustrates an example of a process 62 performed by the buffering thread 38 of FIG. 1. In this embodiment, the process 62 begins with receiving network data from a network interface, as illustrated by block 64. The network data may be received at a relatively high rate, such as those discussed above. In some cases, the network data is received using direct memory access into a buffer of a network interface card or using some other interface provided by a network card driver. The data passing through the network interface 28 may be mirrored by the network switch 18 described above, such that a copy of all (or substantially all, for example) network traffic passing through the network switch 18 is received by the network interface 28 and the buffering thread 38. Capturing all or substantially all network traffic is advantageous because some intrusion threats, such as a signal activating malware already installed, can be relatively small in size. The network data, in some cases, is data on an Ethernet network, encoded according to various protocols at various levels, including IP, TCP, UDP, FTP, HTTP, SPDY, and the like. The data may be encoded in packets having headers that identify an Internet Protocol address and port to which the packet is sent and from which the packet is received. - Receiving the network data from a network interface, in some embodiments, includes associating (for example, in a one-to-one relationship) a timestamp with the received packet, indicating a time at which the packet was received. In some cases, the timestamp is a timestamp counter value of the
network interface 28 incremented according to a clock signal of the network interface 28, thereby providing relatively fine-grained documentation of the time at which packets are received for re-creating the flow of network traffic in accordance with techniques described below with reference to FIGS. 4 through 6. In some embodiments, receiving network data may entail receiving one and only one packet of network data in the step designated by block 64, with subsequent packets received in subsequent repetitions of step 64, such that each packet is associated with a timestamp, e.g., a timestamp having a resolution of less than one microsecond, for instance one nanosecond, and such that each packet has a unique (or approximately unique, e.g., shared by fewer than 10 packets) timestamp relative to the other packets. - The
process 62, in some embodiments, further includes ascertaining which sub-buffer among a plurality of sub-buffers in system memory is active, as indicated by block 66. The sub-buffers may be those described above with reference to FIG. 1. Ascertaining which sub-buffer is active may include referencing a value stored in a write pointer that uniquely identifies the active sub-buffer among the plurality of sub-buffers 42, 44, and 46, e.g., a value equal to the identifier 58 of the active sub-buffer. - In some embodiments, the
process 62 further includes writing the network data from the network interface to the active sub-buffer, as indicated by block 68. Writing the network data from the network interface may include transmitting the network data to the above-mentioned random access memory forming system memory. In some embodiments, a driver of the network interface 28 may write the network data to a portion of system memory different from that of the buffer 40, and the buffering thread 38 may move (e.g., copy) the data from this portion of memory to one of the sub-buffers 42, 44, or 46. The network data may include the above-mentioned timestamp associated with each packet of network data and the respective packet size. - The
process 62 may further include determining whether all (or substantially all) received network data has been written to a sub-buffer, as indicated by block 70. In some embodiments, packets are received and written one packet at a time, and the determination 70 is performed by determining that the one packet has been written to a sub-buffer. In some cases, the determination 70 may be performed by, for example, requesting additional data from the network interface 28 described above. In response to determining that all received data has been written, the process returns to step 64 and additional network data is received. In response to determining that all received network data has not been written to the sub-buffer, the process determines whether the active sub-buffer is full, as illustrated by block 72. In some embodiments, the determination of block 72 is made in a different order from what is illustrated, which is not to suggest that other blocks may not also be reordered. For example, some embodiments determine whether the active sub-buffer is full before determination 70 or in response to an affirmative determination at step 70. As noted above, each sub-buffer storage 52 may have a pre-defined buffer size, and each may be deemed full when the captured network data fully occupies that size, e.g., a sub-buffer may be deemed full when the occupied space exceeds the buffer size less a maximum specified packet size, to prevent packets from spanning sub-buffers. In response to determining that the active sub-buffer is not full, the process returns to step 68 and additional received network data is written to the active sub-buffer. Alternatively, in response to determining that the active sub-buffer is full, the process proceeds to lock the active sub-buffer, as indicated by step 74.
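For illustration, the sub-buffer bookkeeping described above — the sub-buffer storage 52, lock 54, exhausted-state value 56, identifier 58, and the full test of block 72 — might be sketched as follows. This is a hypothetical Python sketch, not the disclosed implementation; the names and the small illustrative sizes are assumptions (the text suggests sub-buffers on the order of 1.3 GB for a 10 Gb feed).

```python
import threading
from dataclasses import dataclass, field

SUB_BUFFER_SIZE = 4096   # illustrative; ~1.3 GB in the embodiment described
MAX_PACKET_SIZE = 1514   # typical Ethernet frame, per the text

@dataclass
class SubBuffer:
    identifier: int                                               # item 58: unique id
    storage: bytearray = field(default_factory=bytearray)         # item 52: captured data
    lock: threading.Lock = field(default_factory=threading.Lock)  # item 54
    exhausted: bool = False                                       # item 56

    def is_full(self) -> bool:
        # Block 72: deemed full when the space remaining is less than the
        # maximum packet size, so no packet ever spans two sub-buffers.
        return len(self.storage) > SUB_BUFFER_SIZE - MAX_PACKET_SIZE
```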
Locking the active sub-buffer may include changing the state of a lock associated with the active sub-buffer, thereby indicating to other threads that the active sub-buffer contains data that has not yet been written to system storage and should not be overwritten. In this branch, the process 62 further includes identifying an unlocked sub-buffer, as indicated by block 76. Identifying an unlocked sub-buffer in block 76 may include incrementing or decrementing the above-mentioned write pointer 50 and determining whether the lock 54 corresponding to an identified sub-buffer indicates that that sub-buffer is unlocked. The process 62 may wait until the next sub-buffer is unlocked or, in other embodiments, iterate through each of the above-mentioned sub-buffers 42, 44, and 46 until an unlocked one is found. Identifying an unlocked sub-buffer may include uniquely identifying the sub-buffer among the plurality of sub-buffers. - The
process 62 in this embodiment further includes designating the identified sub-buffer as the active buffer, as indicated by block 78. Designating the sub-buffer as the active buffer may include changing the above-mentioned write pointer to a value equal to the identifier of the new active sub-buffer. Upon designating the new active buffer, the process 62 returns to step 68 and additional network data is written to the new active buffer. Thus, the process 62 transfers received network data into sub-buffers in system memory. Further, the data in system memory is organized in sub-buffers that can be written as a group of data, which is expected to be faster than writing smaller increments of data in the sub-buffers, as fewer write commands are issued. Buffering the data is further expected to accommodate variations in the speed with which data is written to system storage, as often occurs with the movement of mechanical parts in a hard disk drive and as a result of various processes executed by the operating system otherwise affecting the movement of data within the system. -
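The buffering loop of process 62 (blocks 68 through 78) might be sketched as follows. This is a hypothetical sketch under stated assumptions: the locks are modeled as booleans for clarity (a real implementation would use the mutexes or spinlocks described above), and the sizes are illustrative.

```python
SUB_BUFFER_SIZE = 100   # illustrative assumption
MAX_PACKET_SIZE = 20    # illustrative assumption

def buffer_packet(sub_buffers, locks, write_pointer, packet):
    """Store one packet, returning the (possibly advanced) write pointer."""
    n = len(sub_buffers)
    # Block 72: the active sub-buffer is deemed full when the space left is
    # less than the maximum packet size, so packets never span sub-buffers.
    while len(sub_buffers[write_pointer]) > SUB_BUFFER_SIZE - MAX_PACKET_SIZE:
        locks[write_pointer] = True                  # step 74: lock the full sub-buffer
        write_pointer = (write_pointer + 1) % n      # step 76: identify the next sub-buffer
        if locks[write_pointer]:
            raise RuntimeError("all sub-buffers locked; saving cannot keep up")
    sub_buffers[write_pointer].extend(packet)        # step 68: write to the active sub-buffer
    return write_pointer
```

Because a sub-buffer stops accepting data once less than a full maximum-size packet would fit, no packet is ever split across two sub-buffers, matching the rule described above.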
FIG. 3 illustrates an embodiment of a process 80 performed by the above-mentioned saving thread 36 of FIG. 1 that moves data from the sub-buffers to system storage. In this embodiment, the process 80 includes identifying a locked sub-buffer among a plurality of sub-buffers, as indicated by block 82. Identifying a locked sub-buffer may include incrementing or decrementing the above-mentioned read pointer 48 and determining whether the read pointer 48 identifies a sub-buffer that is locked, meaning in this context that the sub-buffer contains data that has not yet been written to system storage. - The
process 80 of this embodiment further includes identifying a pre-formed file in system storage, as indicated by block 84. Next, in this embodiment, the pre-formed file is overwritten with the network data from the identified sub-buffer, as indicated by block 86. As noted above, overwriting pre-formed files is expected to be faster than creating new files. Further, writing larger groups of data, such as an entire sub-buffer of data, is expected to be faster than writing data to system storage one packet at a time. - Finally, in this embodiment, the
process 80 includes unlocking the locked sub-buffer, as indicated by block 88. Unlocking the locked sub-buffer includes changing the state of the above-mentioned lock, indicating that the corresponding sub-buffer can be overwritten without losing data that has not yet been stored in system storage. - The
processes 62 and 80 may be executed concurrently, e.g., with one or more cores or hyperthreads executing process 62 concurrent with one or more other, different cores or hyperthreads executing process 80. Both threads may access the same memory structure of buffer 40 described above. Communication between the threads may occur via the above-mentioned locks 54 by which the threads coordinate memory access. For example, full sub-buffers may be locked with a mutex, a spinlock, a semaphore, or other variable that prevents the buffering thread from overwriting the locked sub-buffer until the saving thread unlocks the sub-buffer. - In other embodiments, a single, larger buffer provides some of the functionality of the plurality of
sub-buffers 42, 44, and 46. For example, network data may be written to a circular buffer at an address identified by the write pointer 50, and data may be written from the circular buffer to system storage from an address identified by the read pointer 48. Transfers to system storage may be initiated in response to a threshold condition obtaining, such as more than a threshold amount of data being written between the read and write pointer addresses, or in response to data collection for more than a threshold amount of time. - The system of
FIG. 1 and the processes of FIGS. 2 and 3, in some embodiments, are expected to facilitate the capture of network data at relatively high rates using relatively inexpensive commodity computing hardware, such as system boards, CPUs, RAM, and hard drives. In certain embodiments, the network data is buffered to accommodate variations in the rate at which the data is written to system storage, and the data is written to system storage in larger collections into pre-existing files because some forms of system storage are operable to write data at higher rates when the data is written in this fashion. It should be noted, however, that not all embodiments use these techniques. For example, at potentially higher cost, a larger amount of system storage may be provided in a RAID 0 array, such that data is written in parallel to a larger number of drives. -
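The pre-formed files 60 of FIG. 1 and the saving thread's process 80 of FIG. 3 might be sketched together as follows. This is a hypothetical Python sketch; the file name, sizes, and boolean locks are illustrative assumptions, and a real implementation would coordinate with the buffering thread via the locks 54 described above.

```python
import os

def preform_file(path: str, size_bytes: int) -> None:
    """Create a file of dummy zeros in advance, to be overwritten later (item 60)."""
    with open(path, "wb") as f:
        f.write(b"\x00" * size_bytes)

def save_sub_buffer(sub_buffers, locks, read_pointer, preformed_path):
    """Blocks 82-88: find a locked sub-buffer, overwrite the pre-formed file, unlock."""
    n = len(sub_buffers)
    for _ in range(n):
        if locks[read_pointer]:
            # "r+b" overwrites the existing file in place rather than
            # recreating it, which the text suggests is the faster path.
            with open(preformed_path, "r+b") as f:
                f.write(bytes(sub_buffers[read_pointer]))   # block 86
            sub_buffers[read_pointer].clear()
            locks[read_pointer] = False                     # block 88: unlock
            return read_pointer
        read_pointer = (read_pointer + 1) % n               # block 82: keep scanning
    return read_pointer                                     # no full sub-buffer yet
```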
FIG. 4 illustrates an embodiment of a network-traffic replay module 90 that, in some implementations, replays network traffic captured using the system and processes of FIGS. 1 through 3. As explained below, some embodiments replay such traffic at a higher data rate and with greater fidelity than is possible using traditional techniques. That said, not all embodiments described herein use the network-traffic replay module 90, and the module 90 is applicable in systems other than those described elsewhere in the present application. - The physical components of the network-
traffic replay module 90 may be the same as those of the network-traffic capture module 12 described above with reference to FIG. 1, as indicated by the repetition of element numbers. In some embodiments, the network-traffic replay module 90 is formed by operating the same computing device that forms the network-traffic capture module 12 in a replay mode rather than in a capture mode, e.g., in response to commands from the administrator device 14. - In this embodiment, the
system storage 32 includes stored network data 92, which may be the captured data described above or may be data acquired through other techniques, for example simulated network data or data captured from other sources. Similarly, the system memory 30 includes the above-described buffer 40 in some embodiments. - When operating in replay mode, the
CPU 26 may execute instructions that cause the CPU 26 to provide a batching pre-processor 94 and a replay module 96 having a buffering thread 98 and a sending thread 100. Further, when operated in replay mode, the CPU 26 may form in system memory (or system storage) a storage offset list 102. Each of these features is described in greater detail below with reference to FIGS. 5 and 6, which describe processes performed by these components. Generally, in some embodiments, the batching pre-processor 94 calculates offsets that define groups of data in the stored network data for transfer, as a group (e.g., in response to a single read command), to one of the sub-buffers 42, 44, or 46. These offsets may be calculated before replaying network data and may be stored in the storage offset list 102. In some use cases, the batching pre-processor 94 is not instantiated at the same time as the replay module 96. For example, the batching pre-processor 94 may be executed before the replay module 96 is instantiated to prepare the storage offset list 102 in advance of a replay of network traffic, and in some cases, the offset list 102 may be read from storage 32 before or when instantiating the replay module 96. And generally, in some embodiments, the buffering thread 98 may later use these offsets to read the groups of data from the stored network data to one of the sub-buffers 42, 44, and 46, and the sending thread 100 may send this data in accordance with timestamps associated with each packet in the data, thereby re-creating the flow of network traffic that was previously captured. The recreated flow of data may be directed to a recipient system 104, which may analyze the network traffic, for example testing or training candidate intrusion detection models for accuracy. The operation of the network-traffic replay module may be controlled by the administrator device 14, which may issue commands to the network-traffic replay module 90 that instantiate the batching pre-processor 94 and instantiate the replay module 96.
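The sending thread 100's timestamp-driven replay might be sketched as follows. This is a hypothetical sketch, not the disclosed implementation: `send` is a stand-in for the real transmit path, and each packet is simply delayed until its offset from the first capture timestamp has elapsed on the local clock.

```python
import time

def replay(packets, send, speed=1.0):
    """packets: iterable of (timestamp_seconds, payload) in capture order.

    Delays each packet until its offset from the first timestamp has elapsed,
    re-creating the relative timing of the captured traffic. `speed` scales
    the replay rate (e.g., 2.0 replays at twice the captured rate).
    """
    packets = list(packets)
    if not packets:
        return
    t0 = packets[0][0]
    start = time.monotonic()
    for ts, payload in packets:
        delay = (ts - t0) / speed - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)   # wait until this packet's scheduled send time
        send(payload)           # stand-in for the real transmit path
```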
Again, while the components of the network-traffic replay module 90 are illustrated as discrete functional blocks, it should be understood that hardware and software by which the functionality is provided may be distributed, conjoined, intermingled, or otherwise differently organized from the manner in which the functional blocks are illustrated. -
FIG. 5 illustrates an embodiment of a process 106 for pre-processing and buffering network data to be replayed. The process 106 may facilitate relatively accurate recreations of the flow of network traffic at rates near those supported by the computing hardware. Further, the process 106 may reduce deviations in the rate at which packets are sent relative to the desired rate of sending packets (e.g., the rates at which packets were captured), as the rate of data transfer from the stored network data 92 in system storage may occasionally deviate below the rate at which traffic is replayed. - In this embodiment, the
process 106 includes obtaining network data stored in system storage, as illustrated by block 108. As noted above, the network data may be obtained in the computing environment 10 of FIG. 1 by executing the processes of FIGS. 2 and 3, or the network data may be obtained with other techniques, for example the network data may be simulated or obtained from some other system. Further, it should be noted that the present techniques are not limited to network data, and other types of data acquired over time may be replayed, for example relatively high-bandwidth data from sensor arrays or market data generated by high-frequency trading activities. - The
process 106, in some embodiments, includes calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer, as illustrated by block 110. This step 110 may be performed by the above-described batching pre-processor 94, for example, automatically upon having captured network data, or in response to a command from the administrator device 14. The size of the sub-buffer storage 52 of the sub-buffers 42, 44, and 46 may be predefined and may be the same among each of the sub-buffers, in some embodiments. Further, the size of the sub-buffers may be substantially larger than a packet of network data, for example the packets may be generally smaller than 1,514 bytes (or larger, such as approximately 9,000 bytes for jumbo frames), while the sub-buffer storage may be approximately 1.3 Gb, for example between 1.3 Mb and 1.3 Gb, depending on the time interval being buffered and the speed of the interface. - Substantially filling a sub-buffer may include filling a sub-buffer with a sequence of network data (e.g., network data in the order in which it arrived, as indicated by timestamps) until the next packet in the sequence would overflow the sub-buffer. Thus, when pre-processing to calculate offsets, the packet that would otherwise overflow the sub-buffer may be designated with an offset marking the beginning of the network data for the next sub-buffer to be filled, and the end of the preceding packet may be designated with an offset marking the end of the network data to be read to the preceding sub-buffer. In some embodiments, the offsets are expressed as memory addresses in the
system storage 32, such that the offsets bookend, or bracket, a block of network data (e.g., in a range of sequential addresses in system storage) substantially filling a sub-buffer (e.g., filling the sub-buffer with a sequence of network data constituting an integer number of network packets, without a packet spanning between sub-buffers). - Offsets may be calculated by, for example, designating the address of the start of the first packet in the sequence of network data as an initial offset and, then, iterating through the sequence of network packets, summing the size of each consecutive network packet with a running total until the sum exceeds the size of a sub-buffer's storage, at which time the end of the preceding packet is designated as an ending offset, and the beginning of the packet after that packet is designated as the next beginning offset. In some cases, a single offset value (e.g., an address in system storage) may designate both the end of a preceding group and the beginning of the next group of data.
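The offset calculation just described might be sketched as follows, a hypothetical Python sketch in which the sub-buffer size and packet sizes are illustrative assumptions.

```python
SUB_BUFFER_SIZE = 100  # illustrative; the text suggests ~1.3 Gb in practice

def calculate_offsets(packet_sizes, sub_buffer_size=SUB_BUFFER_SIZE):
    """Return byte offsets bracketing batches that each fit one sub-buffer.

    Iterates over the packet sizes in capture order, accumulating a running
    total; when the next packet would exceed the sub-buffer size, the current
    position is recorded as an offset and a new batch begins, so no packet is
    ever split across two batches.
    """
    offsets = [0]          # initial offset: address of the first packet
    batch_bytes = 0
    position = 0
    for size in packet_sizes:
        if batch_bytes + size > sub_buffer_size:
            offsets.append(position)    # this packet starts the next batch
            batch_bytes = 0
        batch_bytes += size
        position += size
    offsets.append(position)            # end of the final batch
    return offsets
```

As described above, a single offset value serves as both the ending offset of one batch and the beginning offset of the next.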
- The calculated offsets may be stored in system memory, for example in the storage offset
list 102 or, in some use cases, the offsets are stored as a file in system storage 32 to be read into system memory 30 when replaying the stored network data 92. Pre-calculating the offsets is expected to facilitate the transfer of data from system storage to sub-buffers with relatively few read commands, which tends to facilitate higher rates of data transfer from system storage than transfers in which sub-buffers are filled with data from a plurality of read commands. That said, some embodiments may read data from system storage to a single sub-buffer with more than one read command by, for example, pre-calculating offsets for a relatively small number of groups of data collectively corresponding to a single sub-buffer, such as fewer than five or fewer than 10 groups of data. Further, in some embodiments, the offsets are calculated such that packets do not span sub-buffers, which facilitates appropriately timed replay of those packets, as the sending thread 100 need not access multiple sub-buffers to send a single packet. Or, in some cases, offsets may not be used. For instance, the data may be stored in a RAID 0 array sized such that time is available to read data without pre-calculated offsets. - The
process 106 further includes identifying a batch of data based on the offsets, as illustrated by block 112. Block 112 and the subsequently described steps of FIG. 5 may be performed by the above-described buffering thread 98, for example sometime after the preceding steps of FIG. 5 have been completed and in response to a replay command from the administrator device 14. Identifying a batch of data based on offsets may be performed by obtaining a starting offset and an ending offset of a batch in the system storage from the offset list 102, which may store an ordered list of offsets corresponding to, or expressed as, addresses in the system storage between which a sequence of network data is stored, where the sequence of network data forms a group of data that substantially fills one of the sub-buffers. Identifying the batch of data may include incrementing or decrementing a counter corresponding to a next entry in the list of storage offsets. - The
process 106, in some embodiments, further includes identifying an exhausted sub-buffer in system memory, as indicated by block 114. This step, like the other steps described with reference to the other processes herein, may be performed in a different order from the order illustrated in the flow charts. For example, the sub-buffer may be identified before identifying the batch of data in step 112. Identifying an exhausted sub-buffer may include incrementing or decrementing the write pointer 50 of the buffer 40 of FIG. 4 and determining whether the corresponding sub-buffer is exhausted, e.g., as indicated by the exhausted state value 56. As explained below, sub-buffers are designated as exhausted when the sub-buffer does not contain data yet to be sent by the sending thread 100. In some cases, the buffering thread may determine that the next sub-buffer indicated by the write pointer 50 is not exhausted and wait until the state of that sub-buffer changes. - The
process 106, in some embodiments, further includes moving (e.g., reading) the batch of data to the identified sub-buffer, as illustrated by block 116. The batch of data between the identified offsets, in some embodiments, may be moved by issuing a read command (e.g., one and only one read command, or more) to the system storage 32, instructing the system storage 32 to read to memory the network data between adjacent offsets, such as in a sequence of storage addresses (or locations within a file) beginning with the first offset and ending with the second offset in the pair of offsets. The transferred data may be transferred to an identified sub-buffer (e.g., one and only one sub-buffer, or more) in system memory 30, such as a sub-buffer identified by the write pointer 50. In other embodiments, multiple commands may be used to transfer the data from system storage to a single sub-buffer, or a single command may transfer data to multiple sub-buffers. - The
process 106, in some embodiments, further includes designating the identified sub-buffer to which the batch of data was moved as being unexhausted, as illustrated by block 118. Designating the sub-buffer as unexhausted may include changing the state of the corresponding exhausted state value 56 of that sub-buffer 42, 44, or 46, for example from a value of true to a value of false. In this example, unexhausted sub-buffers contain data yet to be sent via the network interface. -
System memory 30 and system storage 32 serve different roles in some embodiments due to tradeoffs between capacity, persistence, and speed. Generally, the speed with which data is written to, or read from, system memory 30 is substantially higher than the speed with which data is read from, or written to, system storage 32, but the capacity available in system storage 32 for a given price is typically substantially higher. System storage 32 is generally used to store larger amounts of data persistently (e.g., when power to a computer system is removed), but due to lower data rates, the system storage 32 may act as a bottleneck when replaying (or capturing) data. Further, the rate with which system storage 32 returns data may fluctuate. Buffering the data in system memory 30 is expected to accommodate these fluctuations, drawing down the buffer when the data rate of system storage 32 temporarily drops, and filling the buffer 40 when the data rate of the system storage 32 rises or the rate at which replay traffic is sent drops, as described below. Further, transferring the network data to system memory in groups identified by the pre-calculated offsets is expected to allow the buffering thread 98 to transfer the data with fewer read commands being issued to the system storage 32, which is expected to result in higher rates of data transfer than would otherwise be achieved. Not all embodiments, however, provide these benefits or use these techniques. -
FIG. 6 illustrates an embodiment of a process 120 that sends data from a buffer. In some embodiments, the process 120 is performed by the above-described sending thread 100 of FIG. 4 sending data from the buffer 40 through the network interface 28 to the recipient system 104. The data may be sent in the order in which the data was received when the data was captured and in accordance with timestamps associated with each packet in the data, such that pauses between packets of network data are re-created when sending the network data. The process of FIG. 6 may be performed concurrent with the process of FIG. 5, e.g., when (in response to) a threshold number of sub-buffers have been filled by the process of FIG. 5, when a threshold amount of time has passed since the start of the process of FIG. 5, or in response to some other signal, such that buffer 40 contains data to accommodate fluctuations in the sending rate and the rate at which data is read from storage 32. - The
process 120, in some embodiments, includes identifying the active sub-buffer, as indicated by block 122. In some implementations, one and only one sub-buffer is designated as active, though other embodiments may have multiple active sub-buffers. Identifying the active sub-buffer may include identifying the sub-buffer whose identifier 58 corresponds to the value in the read pointer 48, among the sub-buffers 42, 44, or 46. - The
process 120, in some embodiments, further includes retrieving network data from the active sub-buffer, as indicated by block 124. As noted above, embodiments are not limited to network data, and other types of data may be sent in accordance with the present techniques. In some embodiments, retrieving the network data from the active sub-buffer includes retrieving a next packet (e.g., one and only one packet, or multiple packets) in an ordered sequence of packets of network data, where the packets are sequenced in the order in which the packets were captured. - The
process 120 further includes reading a timestamp associated with a retrieved packet of the network data, as indicated by block 126. As noted above, the timestamps, in some embodiments, are (or correspond to, e.g., as an integer offset from) a value of a network interface timestamp counter or a CPU timestamp counter at the time the packet was received during capture. The timestamp may be associated with the packet (e.g., in a one-to-one relationship or approximately one-to-one relationship) in which each network packet has a distinct timestamp relative to the other network packets, though not all embodiments use this technique, and in some cases consecutive packets may have the same timestamp value. - The
process 120 further includes determining whether the timestamp corresponds to a CPU timestamp counter (or other system clock), as indicated by block 128. For example, a starting time corresponding to when a first packet was received during data capture may be stored and associated with the stored network data. In some cases, this starting time and the timestamps associated with each packet of stored network data are equal to (or determined based on) a timestamp counter of the network interface, which may have a different period than the CPU timestamp counter, and typically a much longer period. For instance, a network interface timestamp counter operating at 1 GHz may increment each nanosecond, while a CPU timestamp counter operating at 3 GHz may increment approximately every 333 picoseconds. To determine correspondence, the ratio of these timestamp counter frequencies (or periods) may be used to translate between CPU time and network interface time. Further, an offset time may be calculated by subtracting the starting time of the data capture from the current time, and then packets may be sent when (in response to) the translated times, less the offset, are equal, such that, for instance, a packet received 10.000003 seconds after capture starts is sent 10.000003 seconds after replay starts. In other embodiments, other clock signals may be used instead of, or in combination with, the CPU timestamp counter. If the timestamp does not correspond with the CPU timestamp counter, the process 120 continues to wait until the CPU timestamp counter increments to a value that does correspond. Alternatively, when the timestamp corresponds to the CPU timestamp counter, the process 120 may proceed to the next step. - In some embodiments, the present process includes disabling certain modes of operation of the CPU that might interfere with the timing at which packets are sent.
For instance, lower power modes of operation of the CPU may be disabled and throttling may be disabled to keep the CPU timestamp counter operating at a relatively constant frequency.
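The translation between network-interface time and CPU time described above can be illustrated with integer arithmetic. The function below is a hypothetical sketch; the counter frequencies (a 1 GHz network-interface counter and a 3 GHz CPU counter) match the example figures above but are illustrative, not prescribed values:

```python
def cpu_ticks_until_send(packet_nic_timestamp, capture_start_nic_timestamp,
                         replay_start_cpu_timestamp,
                         nic_hz=1_000_000_000, cpu_hz=3_000_000_000):
    """Translate a packet's network-interface timestamp into the CPU
    timestamp-counter value at which the packet should be sent, so that
    inter-packet timing during replay matches the timing at capture."""
    # elapsed network-interface ticks since capture started
    nic_ticks_since_start = packet_nic_timestamp - capture_start_nic_timestamp
    # the ratio of counter frequencies translates NIC time into CPU time
    cpu_ticks_since_start = nic_ticks_since_start * cpu_hz // nic_hz
    return replay_start_cpu_timestamp + cpu_ticks_since_start
```

Under these assumptions, a packet captured 10.000003 seconds after capture starts (10,000,003,000 network-interface ticks at 1 GHz) translates to 30,000,009,000 CPU ticks after replay starts; the sending thread would wait until the CPU timestamp counter reaches that value before sending.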
- In some embodiments, the
process 120 includes sending the packet to the network interface, as indicated by block 130. Sending the packet may include sending the packet on a network, such as an Ethernet network, to a recipient system 104, which, for example, may be an intrusion detection system, such as the intrusion detection system described below with reference to FIGS. 7 and 8 undergoing testing to evaluate intrusion detection models. In some embodiments, the sent data is sent to a loopback address, and the process of FIG. 8 is performed on the data stream to test detection models concurrent with replaying the data. Sending the packet at the time the CPU timestamp counter corresponds with the timestamp is expected to cause the packets to be sent from the system with an inter-packet timing that approximates with relatively high fidelity the inter-packet timing with which the packets were received during capture, e.g., with a timing resolution approximately equal to that of the CPU timestamp counter or the network interface timestamp counter. - The
process 120, in some embodiments, further includes determining whether all data has been retrieved from the active sub-buffer, as indicated by block 132. Determining whether all data has been retrieved may include determining whether a last packet in a sequence of packets in the active sub-buffer has been retrieved. In response to determining that all the data has not been retrieved, the process 120 may return to block 124 and continue to iterate through the sequence of packets in the active sub-buffer until all of the packets have been retrieved. Alternatively, in response to determining that all of the data has been retrieved from the active sub-buffer, the process 120 may proceed to the next step. - In some embodiments,
process 120 includes designating the active sub-buffer as exhausted and locking the active sub-buffer, as indicated by block 134. Designating the active sub-buffer as exhausted may include changing the exhausted-state value 56 of the active sub-buffer, and locking the active sub-buffer may include changing a lock state 54 of the active sub-buffer. In some embodiments, sub-buffers are locked with a mutex, a spinlock, or another variable configured to facilitate coordination between threads. - In some embodiments, the
process 120 further includes identifying an unexhausted sub-buffer, as illustrated by block 136. Identifying an unexhausted sub-buffer may include incrementing or decrementing the read pointer 48 of FIG. 4 and reading the exhausted-state value 56 of the sub-buffer 42, 44, or 46 corresponding to the read pointer 48 to confirm that the sub-buffer is unexhausted. Some embodiments may wait until the state of the next sub-buffer is unexhausted if needed. - In some embodiments, the
process 120 includes designating the identified sub-buffer as the active sub-buffer and unlocking this sub-buffer, as indicated by block 138. The identified sub-buffer may be designated as active by storing in the read pointer 48 the value of the identifier 58 of the corresponding sub-buffer. After step 138, the process 120 may return to step 124 and retrieve network data from this newly identified and unlocked active sub-buffer. - The
process 120 is expected to send data from the buffer 40 in the sequence in which the data was captured and with timing approximating, with relatively high fidelity, the timing with which the data was captured. Further, the exhausted-state values and the locking of active sub-buffers are expected to facilitate concurrent processing for higher rates of data transfer. Not all embodiments, however, provide these benefits or use these techniques. - In some embodiments, the
process 106 of FIG. 5 and the process 120 of FIG. 6 are performed concurrently by different threads executed by, for example, different cores or different hyperthreads of the CPU 26. Further, in some embodiments, multiple buffering threads 98 and multiple sending threads 100 may be executed, for example with odd-numbered packets or groups of data being transferred by one buffering thread and even-numbered packets or groups of data being transferred by another buffering thread, and odd-numbered packets being sent by the sending thread 100 and even-numbered packets being sent by a different sending thread. Or a collection of threads may perform both roles, alternating between a sending mode and a buffering mode. Further, in some embodiments, processes 106 and 120 may be performed by different computing processes. -
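The sending loop of FIG. 6 (blocks 122 through 138) can be sketched as follows. The structures and names are hypothetical, and the timestamp pacing of blocks 126 through 130 is omitted for brevity:

```python
from dataclasses import dataclass, field

@dataclass
class SubBuffer:
    # hypothetical stand-in for the sub-buffers of FIG. 4
    packets: list = field(default_factory=list)
    exhausted: bool = False
    locked: bool = False

def drain_and_rotate(sub_buffers, read_index, send):
    """One pass of the sending loop: send each packet from the active
    sub-buffer (blocks 124-132), mark it exhausted and lock it
    (block 134), then advance the read pointer to the next sub-buffer,
    which is unlocked and made active (blocks 136-138)."""
    active = sub_buffers[read_index]
    for packet in active.packets:    # timestamp pacing (block 128) omitted
        send(packet)
    active.exhausted = True          # block 134
    active.locked = True
    next_index = (read_index + 1) % len(sub_buffers)   # block 136
    # a full implementation would wait here while the next is exhausted
    sub_buffers[next_index].locked = False             # block 138
    return next_index
```

In a concurrent embodiment, the exhausted flags coordinate this loop with the buffering loop of FIG. 5, which refills sub-buffers only after they are marked exhausted.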
FIG. 7 illustrates an embodiment of an intrusion detection system 140. In some embodiments, the intrusion detection system 140 is instantiated by operating the above-described computing hardware in a different mode of operation. For example, the system 140 may include the CPU 26, the network interface 28, and the system memory 30 described above, and the system 140 may communicate with the above-described administrator device 14, network switch 18, Internet 20, and secured network 16. In some embodiments, the intrusion detection system 140 includes the above-described capture module but not the replay module, or vice versa, or neither of these features, which is not to suggest that other features may not also be omitted in some embodiments. - The intrusion detection system 140 may be operable to monitor network traffic between the secured portion of the
network 16 and the Internet 20, for example all network traffic, substantially all network traffic, or a subset of network traffic. The intrusion detection system 140 may further be operable to detect network traffic indicative of malicious activity, such as denial of service attacks, viruses, and the like. Further, some embodiments may be configured to detect anomalies in network traffic indicative of a zero-day attack by statistically analyzing traffic to identify new anomalies (e.g., zero-day attacks), and some embodiments may analyze traffic in real-time or near real-time, for example within one millisecond, one to five seconds, one to five minutes, or one hour. - To these ends, some embodiments of the intrusion detection system 140 apply a plurality of different models to the network data and aggregate signals from each of those models into an aggregate signal indicative of whether certain network traffic is indicative of an attack. The combination of models is expected to yield fewer false positives and fewer false negatives than individual models applied using traditional techniques. Further, in some embodiments, these models are executed on the
CPU 26 or the graphics processing unit 142, depending on which computing architecture is well-suited to the computations associated with the model, thereby facilitating relatively fast processing, on the graphics processing unit 142, of models that would otherwise be relatively slow on the CPU 26. Further, in some embodiments, the intrusion detection system 140 analyzes network data at relatively high rates using relatively inexpensive hardware by using the above-described buffering techniques to form batches of data to be analyzed in parallel (e.g., concurrently) by the graphics processing unit 142. To this end, the analysis of network traffic may be performed concurrent with buffering traffic. - In this embodiment, the
CPU 26 executes a data-inspection module 144, which may be one or more computing processes each executing one or more threads, each corresponding to one or more of a model aggregator 146, CPU-executed models 148, a saving thread 150, and a buffering thread 152. The saving thread 150 and buffering thread 152 may be similar or identical to the saving thread 36 and the buffering thread 38 described above with reference to FIG. 1 and may perform the processes of FIGS. 2 and 3, with the distinction that the saving thread 150 moves data into graphics memory 151 of the graphics processing unit 142, rather than to system storage 32, in some embodiments. To this end, system memory 30 may include the above-described buffer 40. Again, it should be noted that the presently described functional blocks should not be interpreted as limiting the present techniques to a particular organization of code or hardware by which their functionality is provided, as such code or hardware may be distributed, conjoined, intermingled, or otherwise differently arranged. - The
model aggregator 146 may receive signals, such as intrusion-likelihood scores, indicative of the degree to which each individual model indicates that recent network traffic represents an attack. For example, some models may be well-suited to detecting a particular type of attack, and those models may output a relatively high intrusion-likelihood score for that type of attack, while other models may output a value indicative of either no attack or no signal. The model aggregator 146 may combine the scores into an aggregate score, and if the aggregate score exceeds a threshold, the model aggregator 146 may, in some embodiments, indicate that an attack is occurring, a type of attack, and a confidence value indicative of the likelihood that the models have correctly identified an attack. This alarm data may be logged, and in response to an attack, certain network traffic may be blocked, such as certain IP addresses, certain packets, certain sessions, or certain files, and messages, such as emails or text messages, may be sent to an administrator, in some embodiments. Some embodiments classify network traffic on a packet-by-packet basis, and packets classified as anomalous are blocked. - Certain models are amenable to being executed on the
CPU 26, such as models in which much of the processing occurs sequentially and is not amenable to being performed in parallel. Examples of such models include a packet-timestamp model (e.g., based on statistics of inter-packet timing) and a packet-size model (e.g., based on statistics of packet size). - The
graphics processing unit 142 may execute various models in which processing is amenable to being performed in parallel, such as an n-gram model 154 and a self-organizing map model 156. The n-gram model 154 and self-organizing map 156 may operate on data retrieved from graphics memory 151 and may each output intrusion-likelihood scores to the model aggregator 146. Other types of models that may be executed on a graphics processing unit 142 include a spaced n-gram model, a frequency n-gram model, a weighted n-gram model, or other neural-network, machine-learning, or statistical models. In some embodiments, the models executed on the GPU perform deep packet inspection. In some cases, models involve computations on both the GPU and the CPU, e.g., in a pipeline with some stages using only the CPU and some stages using the GPU. - In some embodiments, the intrusion detection system 140 of
FIG. 7 performs a process 158 for detecting intrusions, illustrated by FIG. 8. - In some embodiments, the
process 158 includes receiving network data from a network interface, as illustrated by block 160, and buffering the network data from the network interface in system memory, as indicated by block 162. Receiving and buffering may be performed by the above-mentioned buffering thread 152 storing the received data in the buffer 40 in accordance with the techniques described above. - In some embodiments, the
process 158 further includes retrieving the network data buffered in the system memory, as indicated by block 164. Retrieving the network data may be performed by the above-mentioned saving thread 150 using the techniques described above with reference to FIG. 3, except that the data is, instead of being written to system storage, input to each of a plurality of statistical or machine-learning intrusion detection models, as indicated by block 166. To this end, the saving thread 150 (FIG. 7) may write the retrieved data to the graphics memory 151 for access by the models of the graphics processing unit 142. Further, the data may also be made available to the CPU-executed models 148, also of FIG. 7. The state of each of the models (for those models in which state is maintained) may be updated to reflect information present in the retrieved network data, and intrusion-likelihood scores from each of these models may be updated and communicated to the model aggregator 146. - In some embodiments, the
process 158 further includes aggregating intrusion-likelihood scores from each of the intrusion detection models into an aggregate score, as indicated by block 168. The scores may be aggregated by, for example, providing the scores as inputs to a neural network, a support vector machine, a Bayesian classifier, or another machine-learning algorithm, or by computing a weighted sum of the scores, a maximum of the scores, a median of the scores, or another formula configured to combine the scores into a determinative value indicating whether an attack is occurring. Parameters of such algorithms may be tuned based on capture and replay of network traffic or offline analysis of captured data in accordance with the above techniques. - In some embodiments, the
process 158 further includes determining whether the aggregate score is greater than a threshold score, as indicated by block 170. If the aggregate score exceeds the threshold, in response, the process 158 outputs an alert, as indicated by block 172, and returns to step 160 to continue monitoring network data. Outputting an alert may include logging a record of the alert in memory (or system storage) and transmitting a message, for example an email or SMS message, to an administrator, depending upon the severity of the alert. Alternatively, if the aggregate score does not exceed the threshold, in response, the process 158 returns to step 160 to continue monitoring network data. - The
process 158, in some embodiments, aggregates scores from a plurality of different models, each model potentially having different sensitivity to different types of attacks, and as a result, is expected to exhibit fewer errors than conventional systems due to the combined effect of the models. The process 158, in some embodiments, processes a subset of the models (or a subset of the computations of a given model) on a graphics processing unit, which is expected to facilitate relatively fast processing of models amenable to highly parallel processing, and the buffering techniques described above are expected to facilitate parallel processing of batches of data by the graphics processing unit 142. It should be noted, however, that various engineering trade-offs are envisioned, and not all of the benefits described herein are offered by all embodiments in accordance with the present techniques, as various trade-offs relating to cost, performance, and reliability may be made by system designers while still using the techniques described herein. - Program code that, when executed by a data processing apparatus, causes the data processing apparatus to perform the operations described herein may be stored on a tangible program carrier. A tangible program carrier may include a non-transitory computer readable storage medium. A non-transitory computer readable storage medium may include a machine readable storage device, a machine readable storage substrate, a memory device, or any combination thereof. Non-transitory computer readable storage media may include non-volatile memory (e.g., flash memory, ROM, PROM, EPROM, EEPROM memory), volatile memory (e.g., random access memory (RAM), static random access memory (SRAM), synchronous dynamic RAM (SDRAM)), bulk storage (e.g., CD-ROM and/or DVD-ROM, hard drives), or the like.
Such memory may include a single memory device and/or a plurality of memory devices (e.g., distributed memory devices). In some embodiments, the program may be conveyed by a propagated signal, such as a carrier wave or digital signal conveying a stream of packets.
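For illustration, the n-gram analysis and score aggregation described above with reference to FIGS. 7 and 8 (blocks 166 through 170) can be sketched as follows. This is a simplified, CPU-only toy under assumed details (a set of previously observed byte n-grams as the "model," equal weights, and a 0.5 threshold, all hypothetical), not the disclosed GPU implementation:

```python
def ngrams(data: bytes, n: int = 3):
    """Yield the overlapping byte n-grams of a packet."""
    return (data[i:i + n] for i in range(len(data) - n + 1))

def train_ngram_model(benign_packets, n=3):
    """Collect byte n-grams observed in presumed-benign traffic."""
    seen = set()
    for pkt in benign_packets:
        seen.update(ngrams(pkt, n))
    return seen

def ngram_score(packet, model, n=3):
    """Intrusion-likelihood score in [0, 1]: the fraction of the
    packet's n-grams never observed during training (block 166)."""
    grams = list(ngrams(packet, n))
    if not grams:
        return 0.0
    return sum(1 for g in grams if g not in model) / len(grams)

def aggregate(scores, weights=None, threshold=0.5):
    """Weighted-sum aggregation of per-model scores (block 168) and
    comparison against a threshold (block 170)."""
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)
    total = sum(w * s for w, s in zip(weights, scores))
    return total, total > threshold
```

A packet matching trained traffic scores 0, a packet of never-seen bytes scores 1, and the aggregator raises an alert only when the weighted combination of several such model outputs crosses the threshold.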
- It should be understood that the description and the drawings are not intended to limit the invention to the particular form disclosed, but to the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. Further modifications and alternative embodiments of various aspects of the invention will be apparent to those skilled in the art in view of this description. Accordingly, this description and the drawings are to be construed as illustrative only and are for the purpose of teaching those skilled in the art the general manner of carrying out the invention. It is to be understood that the forms of the invention shown and described herein are to be taken as examples of embodiments. Elements and materials may be substituted for those illustrated and described herein, parts and processes may be reversed or omitted, and certain features of the invention may be utilized independently, all as would be apparent to one skilled in the art after having the benefit of this description of the invention. Changes may be made in the elements described herein without departing from the spirit and scope of the invention as described in the following claims. Headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description.
- As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). The words “include”, “including”, and “includes” and the like mean including, but not limited to. As used throughout this application, the singular forms “a”, “an” and “the” include plural referents unless the content explicitly indicates otherwise. Thus, for example, reference to “an element” or “a element” includes a combination, of two or more elements, notwithstanding use of other terms and phrases for one or more elements, such as “one or more.” The term “or” is, unless indicated otherwise, non-exclusive, i.e., encompassing both “and” and “or.” Terms describing conditional relationships, e.g., “in response to X, Y,” “upon X, Y,”, “if X, Y,” “when X, Y,” and the like, encompass causal relationships in which the antecedent is a necessary causal condition, the antecedent is a sufficient causal condition, or the antecedent is a contributory causal condition of the consequent, e.g., “state X occurs upon condition Y obtaining” is generic to “X occurs solely upon Y” and “X occurs upon Y and Z.” Such conditional relationships are not limited to consequences that instantly follow the antecedent obtaining, as some consequences may be delayed, and in conditional statements, antecedents are connected to their consequents, e.g., the antecedent is relevant to the likelihood of the consequent occurring. Further, unless otherwise indicated, statements that one value or action is “based on” another condition or value encompass both instances in which the condition or value is the sole factor and instances in which the condition or value is one factor among a plurality of factors. 
Unless specifically stated otherwise, as apparent from the discussion, it is appreciated that throughout this specification discussions utilizing terms such as “processing”, “computing”, “calculating”, “determining” or the like refer to actions or processes of a specific apparatus, such as a special purpose computer or a similar special purpose electronic processing/computing device. In the context of this specification, a special purpose computer or a similar special purpose electronic processing or computing device is capable of manipulating or transforming signals, for instance signals represented as physical electronic, optical, or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the special purpose computer or similar special purpose processing or computing device.
Claims (42)
1. An intrusion detection system configured to detect anomalies indicative of a zero-day attack by statistically analyzing substantially all traffic on a network in real-time, the intrusion detection system comprising:
a network interface;
one or more processors communicatively coupled to the network interface;
system memory communicatively coupled to the processors and storing instructions that when executed by the processors cause the processors to perform steps comprising:
buffering network data from the network interface in the system memory;
retrieving the network data buffered in the system memory;
applying each of a plurality of statistical or machine-learning intrusion-detection models to the retrieved network data;
aggregating intrusion-likelihood scores from each of the intrusion-detection models in an aggregate score, and
upon the aggregate score exceeding a threshold, outputting an alert.
2. The intrusion detection system of claim 1 , wherein the plurality of statistical or machine-learning intrusion-detection models comprise distributed denial of service detection models and deep packet inspection models.
3. The intrusion detection system of claim 2 , wherein the distributed denial of service detection models comprise a packet-size distribution model, and a packet aggregate count distribution model.
4. The intrusion detection system of claim 2 , wherein the deep packet inspection models comprise an n-gram analysis model and a self-organizing map.
5. The intrusion detection system of claim 1 , wherein the network interface includes a one-gigabit per second Ethernet network interface.
6. The intrusion detection system of claim 1 , wherein the network interface includes a ten-gigabit per second Ethernet network interface.
7. The intrusion detection system of claim 1 , wherein buffering network data is performed concurrent with applying each of a plurality of statistical or machine-learning intrusion-detection models.
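Claims 1-7 describe a pipeline that applies several detection models to buffered traffic, aggregates their likelihood scores, and alerts when the aggregate crosses a threshold. A minimal sketch of that flow follows; the model logic, the summation-based aggregation, and the threshold value are hypothetical illustrations, not the claimed implementation:

```python
# Toy sketch of the claim-1 pipeline: several models score the same buffered
# network data, scores are aggregated, and an alert fires on a threshold.
# Model internals, the baseline of 512 bytes, and THRESHOLD are assumptions.

def packet_size_score(packets):
    """Stand-in for a packet-size distribution model (claim 3)."""
    sizes = [p["size"] for p in packets]
    mean = sum(sizes) / len(sizes)
    # Score rises as the mean size drifts from a hypothetical 512-byte baseline.
    return min(abs(mean - 512) / 512, 1.0)

def ngram_score(packets):
    """Stand-in for an n-gram payload-analysis model (claim 4), n = 2."""
    payload = b"".join(p["payload"] for p in packets)
    distinct = len({payload[i:i + 2] for i in range(len(payload) - 1)})
    # More distinct bigrams per byte suggests higher-entropy payloads.
    return distinct / max(len(payload) - 1, 1)

MODELS = [packet_size_score, ngram_score]
THRESHOLD = 0.9  # hypothetical

def analyze(packets):
    # Aggregate by summation; the claim does not prescribe the aggregation.
    aggregate = sum(model(packets) for model in MODELS)
    return "ALERT" if aggregate > THRESHOLD else "OK"
```

Unremarkable traffic scores low on both models, while small packets carrying high-entropy payloads push the sum past the threshold.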
8. (canceled)
9. (canceled)
10. (canceled)
11. (canceled)
12. (canceled)
13. (canceled)
14. An intrusion detection system comprising:
a network interface;
one or more processors communicatively coupled to the network interface;
system memory communicatively coupled to the processors; and
system storage communicatively coupled to the processors, wherein the system storage stores instructions that when executed by the processors cause the intrusion detection system to perform steps comprising:
writing network data from the network interface to a buffer in the system memory; and
concurrent with writing the network data to the buffer in the system memory, writing the network data from the buffer in the system memory to the system storage.
15. The intrusion detection system of claim 14 , comprising:
prior to writing the network data to the system storage, ascertaining that more than a threshold amount of the network data is stored in the buffer in the system memory and has not yet been written to the system storage.
16. The intrusion detection system of claim 14 , comprising:
prior to writing the network data to the system storage, ascertaining that more than a threshold duration of time has elapsed since the network data being written to the system storage was stored in the buffer in system memory.
17. The intrusion detection system of claim 14 , wherein the buffer in the system memory comprises a plurality of sub-buffers.
18. The intrusion detection system of claim 17 , wherein the plurality of sub-buffers are a circular sequence of buffers through which a write pointer cycles.
19. The intrusion detection system of claim 17 , wherein writing the network data from the network interface to the buffer in the system memory comprises:
writing the network data to an active unlocked sub-buffer among the plurality of sub-buffers,
locking the active sub-buffer,
designating an unlocked sub-buffer as the active sub-buffer, and
after ascertaining that the network data stored in the locked sub-buffer has been written to system storage, unlocking the locked sub-buffer.
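Claims 14-19 describe capture through a circular sequence of lockable sub-buffers: the writer fills the active sub-buffer, locks it when full, advances the write pointer to an unlocked sub-buffer, and a concurrent flusher unlocks a sub-buffer once its contents reach storage. A toy single-threaded sketch under assumed sizes and names (real systems would run `write` and `flush` concurrently and use much larger sub-buffers):

```python
# Sketch of the claim-17/18/19 capture path. NUM_SUB_BUFFERS, SUB_BUFFER_SIZE,
# and the list used as "system storage" are illustrative assumptions.

class SubBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = bytearray()
        self.locked = False

NUM_SUB_BUFFERS = 4
SUB_BUFFER_SIZE = 8  # bytes; real capture buffers would be megabytes

ring = [SubBuffer(SUB_BUFFER_SIZE) for _ in range(NUM_SUB_BUFFERS)]
active = 0     # write pointer cycling through the ring (claim 18)
storage = []   # stand-in for system storage

def write(payload: bytes):
    """Append captured data to the active sub-buffer, cycling when full."""
    global active
    buf = ring[active]
    if len(buf.data) + len(payload) > buf.capacity:
        buf.locked = True                       # lock the full sub-buffer
        active = (active + 1) % NUM_SUB_BUFFERS  # designate the next one active
        assert not ring[active].locked, "capture overran the flusher"
        buf = ring[active]
    buf.data += payload

def flush():
    """Write locked sub-buffers to storage, then unlock them (claim 19)."""
    for buf in ring:
        if buf.locked:
            storage.append(bytes(buf.data))  # copy to system storage
            buf.data.clear()
            buf.locked = False
```

The flush conditions of claims 15 and 16 (more than a threshold amount buffered, or more than a threshold time elapsed) would simply gate when `flush()` runs.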
20. (canceled)
21. (canceled)
22. (canceled)
23. (canceled)
24. (canceled)
25. (canceled)
26. (canceled)
27. (canceled)
28. (canceled)
29. An intrusion detection system, comprising:
a network interface;
a plurality of processors communicatively coupled to the network interface;
system memory communicatively coupled to the plurality of processors;
system storage communicatively coupled to the plurality of processors, wherein the system storage stores previously captured network data and instructions that when executed by the plurality of processors cause the intrusion detection system to perform steps comprising:
pre-processing the network data in the system storage by calculating offsets defining batches of the network data sized such that each batch substantially fills a buffer;
moving a batch of the network data from the system storage to a buffer in the system memory, wherein the batch is stored between a pair of the calculated offsets in the system storage prior to moving;
concurrent with writing network data to the buffer, sending network data in the buffer to the network interface, wherein sending the network data comprises:
reading time-stamps associated with packets in the network data, and
sending each packet when the associated time stamp corresponds with a time stamp counter of at least one of the processors.
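The timestamp-matched send step of claim 29 can be sketched as a pacing loop. Here `time.monotonic_ns()` stands in for the processor time stamp counter named in the claim, and the `(timestamp, payload)` packet layout is a hypothetical simplification:

```python
# Sketch of timestamp-paced replay (claim 29): each packet is released only
# when a monotonically increasing counter reaches the packet's stored
# relative timestamp. The busy-wait mirrors the precision-oriented design.
import time

def replay(packets, send):
    """packets: iterable of (relative_timestamp_ns, payload), in time order."""
    start = time.monotonic_ns()
    for ts_ns, payload in packets:
        # Spin until the counter corresponds with the packet's timestamp.
        while time.monotonic_ns() - start < ts_ns:
            pass
        send(payload)
```

A busy-wait rather than `sleep` is the usual choice when replay timing must track capture timing at sub-microsecond granularity.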
30. The intrusion detection system of claim 29 , wherein the offsets are calculated such that packets in the network data do not overflow the buffer.
31. The intrusion detection system of claim 29 , wherein the buffer in the system memory comprises a plurality of sub-buffers.
32. The intrusion detection system of claim 31 , wherein the plurality of sub-buffers are a circular sequence of buffers through which a write pointer cycles.
33. The intrusion detection system of claim 31 , wherein the offsets are calculated such that packets in the network data do not overflow a sub-buffer among the plurality of sub-buffers.
34. The intrusion detection system of claim 31 , wherein sending the network data comprises:
sending the network data from an active sub-buffer among the plurality of sub-buffers to the network interface,
designating the active sub-buffer as an exhausted sub-buffer,
designating another sub-buffer as the active sub-buffer, and
after ascertaining that new network data is stored in the exhausted sub-buffer, designating the exhausted sub-buffer as a replenished sub-buffer.
35. The intrusion detection system of claim 34 , wherein moving the batch of the network data from the system storage to the buffer in the system memory comprises moving the batch of the network data to the exhausted sub-buffer.
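Claims 29, 30, and 33 describe pre-computing batch offsets so that each batch nearly fills the replay buffer without splitting a packet across a batch (or sub-buffer) boundary. A sketch of that offset calculation; the function name and the packet-length-list input are assumptions about one possible record layout:

```python
# Sketch of the claim-29/30 pre-processing step: walk the stored capture's
# packet lengths and emit byte offsets delimiting batches of whole packets,
# each batch no larger than the replay buffer (or sub-buffer, claim 33).

def batch_offsets(packet_lengths, buffer_size):
    """Return byte offsets into the capture delimiting whole-packet batches."""
    offsets = [0]
    current = 0   # bytes accumulated in the current batch
    position = 0  # absolute byte offset into the capture
    for length in packet_lengths:
        if current + length > buffer_size and current > 0:
            offsets.append(position)  # close the batch before this packet
            current = 0
        current += length
        position += length
    offsets.append(position)  # final offset marks the end of the capture
    return offsets
```

Each consecutive pair of offsets then identifies one batch to move from system storage into a buffer, per the moving step of claim 29.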
36. (canceled)
37. (canceled)
38. (canceled)
39. (canceled)
40. (canceled)
41. (canceled)
42. (canceled)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/648,176 US20140101761A1 (en) | 2012-10-09 | 2012-10-09 | Systems and methods for capturing, replaying, or analyzing time-series data |
US13/663,257 US20140101762A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or analyzing time-series data |
US13/663,263 US20140101763A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or replaying time-series data |
PCT/US2013/063419 WO2014058727A1 (en) | 2012-10-09 | 2013-10-04 | Systems and methods for capturing, replaying, or analyzing time-series data |
US14/309,873 US9237164B2 (en) | 2012-10-09 | 2014-06-19 | Systems and methods for capturing, replaying, or analyzing time-series data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/648,176 US20140101761A1 (en) | 2012-10-09 | 2012-10-09 | Systems and methods for capturing, replaying, or analyzing time-series data |
Related Child Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/663,257 Continuation US20140101762A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or analyzing time-series data |
US13/663,263 Continuation US20140101763A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or replaying time-series data |
US14/309,873 Continuation US9237164B2 (en) | 2012-10-09 | 2014-06-19 | Systems and methods for capturing, replaying, or analyzing time-series data |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140101761A1 true US20140101761A1 (en) | 2014-04-10 |
Family
ID=50433852
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/648,176 Abandoned US20140101761A1 (en) | 2012-10-09 | 2012-10-09 | Systems and methods for capturing, replaying, or analyzing time-series data |
US13/663,257 Abandoned US20140101762A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or analyzing time-series data |
US13/663,263 Abandoned US20140101763A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or replaying time-series data |
US14/309,873 Active US9237164B2 (en) | 2012-10-09 | 2014-06-19 | Systems and methods for capturing, replaying, or analyzing time-series data |
Family Applications After (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/663,257 Abandoned US20140101762A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or analyzing time-series data |
US13/663,263 Abandoned US20140101763A1 (en) | 2012-10-09 | 2012-10-29 | Systems and methods for capturing or replaying time-series data |
US14/309,873 Active US9237164B2 (en) | 2012-10-09 | 2014-06-19 | Systems and methods for capturing, replaying, or analyzing time-series data |
Country Status (2)
Country | Link |
---|---|
US (4) | US20140101761A1 (en) |
WO (1) | WO2014058727A1 (en) |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9843488B2 (en) * | 2011-11-07 | 2017-12-12 | Netflow Logic Corporation | Method and system for confident anomaly detection in computer network traffic |
US9367687B1 (en) * | 2011-12-22 | 2016-06-14 | Emc Corporation | Method for malware detection using deep inspection and data discovery agents |
US9692771B2 (en) * | 2013-02-12 | 2017-06-27 | Symantec Corporation | System and method for estimating typicality of names and textual data |
US9286328B2 (en) * | 2013-07-19 | 2016-03-15 | International Business Machines Corporation | Producing an image copy of a database object based on information within database buffer pools |
US9288220B2 (en) | 2013-11-07 | 2016-03-15 | Cyberpoint International Llc | Methods and systems for malware detection |
WO2016032491A1 (en) * | 2014-08-28 | 2016-03-03 | Hewlett Packard Enterprise Development Lp | Distributed detection of malicious cloud actors |
US20160070759A1 (en) * | 2014-09-04 | 2016-03-10 | Palo Alto Research Center Incorporated | System And Method For Integrating Real-Time Query Engine And Database Platform |
US10373061B2 (en) * | 2014-12-10 | 2019-08-06 | Fair Isaac Corporation | Collaborative profile-based detection of behavioral anomalies and change-points |
CN104699462A (en) * | 2015-03-13 | 2015-06-10 | 哈尔滨工程大学 | GPU (ground power unit) parallel processing method for real-time detection of hyperspectral target |
US10380006B2 (en) * | 2015-06-05 | 2019-08-13 | International Business Machines Corporation | Application testing for security vulnerabilities |
US10154079B2 (en) * | 2015-08-11 | 2018-12-11 | Dell Products L.P. | Pre-boot file transfer system |
US10560483B2 (en) * | 2015-10-28 | 2020-02-11 | Qomplx, Inc. | Rating organization cybersecurity using active and passive external reconnaissance |
US9471778B1 (en) | 2015-11-30 | 2016-10-18 | International Business Machines Corporation | Automatic baselining of anomalous event activity in time series data |
WO2017131645A1 (en) | 2016-01-27 | 2017-08-03 | Aruba Networks, Inc. | Detecting malware on spdy connections |
US10673880B1 (en) * | 2016-09-26 | 2020-06-02 | Splunk Inc. | Anomaly detection to identify security threats |
CN107391041B (en) * | 2017-07-28 | 2020-03-31 | 郑州云海信息技术有限公司 | Data access method and device |
CN108234500A (en) * | 2018-01-08 | 2018-06-29 | 重庆邮电大学 | A kind of wireless sense network intrusion detection method based on deep learning |
US20190279100A1 (en) * | 2018-03-09 | 2019-09-12 | Lattice Semiconductor Corporation | Low latency interrupt alerts for artificial neural network systems and methods |
JP7042677B2 (en) * | 2018-04-04 | 2022-03-28 | 任天堂株式会社 | Information processing equipment, control method, information processing system, and control program |
JP7082282B2 (en) * | 2018-06-06 | 2022-06-08 | 富士通株式会社 | Packet analysis program, packet analysis method and packet analysis device |
US10678676B2 (en) * | 2018-08-08 | 2020-06-09 | Servicenow, Inc. | Playback of captured network transactions in a simulation environment |
US11714900B2 (en) * | 2019-09-13 | 2023-08-01 | Jpmorgan Chase Bank, N.A. | System and method for implementing re-run dropped detection tool |
CN116055191B (en) * | 2023-02-02 | 2023-09-29 | 成都卓讯智安科技有限公司 | Network intrusion detection method and device, electronic equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080229415A1 (en) * | 2005-07-01 | 2008-09-18 | Harsh Kapoor | Systems and methods for processing data flows |
US7505416B2 (en) * | 2003-03-31 | 2009-03-17 | Finisar Corporation | Network tap with integrated circuitry |
US7711006B2 (en) * | 2003-08-15 | 2010-05-04 | Napatech A/S | Data merge unit, a method of producing an interleaved data stream, a network analyser and a method of analysing a network |
US20100281539A1 (en) * | 2009-04-29 | 2010-11-04 | Juniper Networks, Inc. | Detecting malicious network software agents |
US20110023106A1 (en) * | 2004-03-12 | 2011-01-27 | Sca Technica, Inc. | Methods and systems for achieving high assurance computing using low assurance operating systems and processes |
US20110285729A1 (en) * | 2010-05-20 | 2011-11-24 | Munshi Aaftab A | Subbuffer objects |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5715447A (en) * | 1991-08-06 | 1998-02-03 | Fujitsu Limited | Method of and an apparatus for shortening a lock period of a shared buffer |
US5911142A (en) * | 1997-07-01 | 1999-06-08 | Millennium Dynamics, Inc. | System and method for bridging compliant and non-compliant files |
US7065657B1 (en) * | 1999-08-30 | 2006-06-20 | Symantec Corporation | Extensible intrusion detection system |
US8010469B2 (en) * | 2000-09-25 | 2011-08-30 | Crossbeam Systems, Inc. | Systems and methods for processing data flows |
ATE365345T1 (en) | 2001-02-06 | 2007-07-15 | Nortel Networks Sa | MULTIPLE RATE RING BUFFER AND CORRESPONDING OPERATING METHOD |
US20060111961A1 (en) | 2004-11-22 | 2006-05-25 | Mcquivey James | Passive consumer survey system and method |
US20060111962A1 (en) | 2004-11-22 | 2006-05-25 | Taylor Holsinger | Survey system and method |
US20070022479A1 (en) * | 2005-07-21 | 2007-01-25 | Somsubhra Sikdar | Network interface and firewall device |
JP4759574B2 (en) * | 2004-12-23 | 2011-08-31 | ソレラ ネットワークス インコーポレイテッド | Method and apparatus for network packet capture distributed storage system |
US7516313B2 (en) * | 2004-12-29 | 2009-04-07 | Intel Corporation | Predicting contention in a processor |
DE102005026256A1 (en) * | 2005-06-08 | 2006-12-14 | OCé PRINTING SYSTEMS GMBH | Method for carrying out the data transfer between program elements of a process, buffer object for carrying out the data transfer, and printing system |
US8266696B2 (en) * | 2005-11-14 | 2012-09-11 | Cisco Technology, Inc. | Techniques for network protection based on subscriber-aware application proxies |
US20080022401A1 (en) * | 2006-07-21 | 2008-01-24 | Sensory Networks Inc. | Apparatus and Method for Multicore Network Security Processing |
FI20096394A0 (en) * | 2009-12-23 | 2009-12-23 | Valtion Teknillinen | DETECTING DETECTION IN COMMUNICATIONS NETWORKS |
2012
- 2012-10-09 US US13/648,176 patent/US20140101761A1/en not_active Abandoned
- 2012-10-29 US US13/663,257 patent/US20140101762A1/en not_active Abandoned
- 2012-10-29 US US13/663,263 patent/US20140101763A1/en not_active Abandoned
2013
- 2013-10-04 WO PCT/US2013/063419 patent/WO2014058727A1/en active Application Filing
2014
- 2014-06-19 US US14/309,873 patent/US9237164B2/en active Active
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9501641B2 (en) * | 2007-07-26 | 2016-11-22 | Samsung Electronics Co., Ltd. | Method of intrusion detection in terminal device and intrusion detecting apparatus |
US20140189869A1 (en) * | 2007-07-26 | 2014-07-03 | Samsung Electronics Co., Ltd. | Method of intrusion detection in terminal device and intrusion detecting apparatus |
US20170126717A1 (en) * | 2014-09-18 | 2017-05-04 | Microsoft Technology Licensing, Llc | Lateral movement detection |
US9825978B2 (en) * | 2014-09-18 | 2017-11-21 | Microsoft Technology Licensing, Llc | Lateral movement detection |
US20160154960A1 (en) * | 2014-10-02 | 2016-06-02 | Massachusetts Institute Of Technology | Systems and methods for risk rating framework for mobile applications |
US10783254B2 (en) * | 2014-10-02 | 2020-09-22 | Massachusetts Institute Of Technology | Systems and methods for risk rating framework for mobile applications |
US20160241669A1 (en) * | 2015-02-16 | 2016-08-18 | Telefonaktiebolaget L M Ericsson (Publ) | Temporal caching for icn |
US9736263B2 (en) * | 2015-02-16 | 2017-08-15 | Telefonaktiebolaget L M Ericsson (Publ) | Temporal caching for ICN |
WO2016168476A1 (en) * | 2015-04-17 | 2016-10-20 | Symantec Corporation | A method to detect malicious behavior by computing the likelihood of data accesses |
US10599672B2 (en) * | 2015-11-24 | 2020-03-24 | Cisco Technology, Inc. | Cursor-based state-collapse scheme for shared databases |
US20170147669A1 (en) * | 2015-11-24 | 2017-05-25 | Cisco Technology, Inc. | Cursor-based state-collapse scheme for shared databases |
US20200314124A1 (en) * | 2015-12-11 | 2020-10-01 | Servicenow, Inc. | Computer network threat assessment |
US11539720B2 (en) * | 2015-12-11 | 2022-12-27 | Servicenow, Inc. | Computer network threat assessment |
EP3343421A1 (en) * | 2016-12-27 | 2018-07-04 | General Electric Company | System to detect machine-initiated events in time series data |
CN108243062A (en) * | 2016-12-27 | 2018-07-03 | 通用电气公司 | To detect the system of the event of machine startup in time series data |
CN106789352A (en) * | 2017-01-25 | 2017-05-31 | 北京兰云科技有限公司 | A kind of exception flow of network detection method and device |
US20200242265A1 (en) * | 2019-01-30 | 2020-07-30 | EMC IP Holding Company LLC | Detecting abnormal data access patterns |
US11283827B2 (en) * | 2019-02-28 | 2022-03-22 | Xm Cyber Ltd. | Lateral movement strategy during penetration testing of a networked system |
US11588835B2 (en) | 2021-05-18 | 2023-02-21 | Bank Of America Corporation | Dynamic network security monitoring system |
US11792213B2 (en) | 2021-05-18 | 2023-10-17 | Bank Of America Corporation | Temporal-based anomaly detection for network security |
US11799879B2 (en) | 2021-05-18 | 2023-10-24 | Bank Of America Corporation | Real-time anomaly detection for network security |
CN113938325A (en) * | 2021-12-16 | 2022-01-14 | 紫光恒越技术有限公司 | Method and device for processing aggressive traffic, electronic equipment and storage equipment |
Also Published As
Publication number | Publication date |
---|---|
US20140101762A1 (en) | 2014-04-10 |
US9237164B2 (en) | 2016-01-12 |
WO2014058727A1 (en) | 2014-04-17 |
US20150082433A1 (en) | 2015-03-19 |
US20140101763A1 (en) | 2014-04-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9237164B2 (en) | Systems and methods for capturing, replaying, or analyzing time-series data | |
US11323471B2 (en) | Advanced cybersecurity threat mitigation using cyberphysical graphs with state changes | |
US20220103431A1 (en) | Policy implementation and management | |
Kasick et al. | Black-Box Problem Diagnosis in Parallel File Systems. | |
US10565517B2 (en) | Horizontal decision tree learning from very high rate data streams with horizontal parallel conflict resolution | |
US20150339230A1 (en) | Managing out-of-order memory command execution from multiple queues while maintaining data coherency | |
US11677769B2 (en) | Counting SYN packets | |
EP3777067A1 (en) | Device and method for anomaly detection on an input stream of events | |
US10243971B2 (en) | System and method for retrospective network traffic analysis | |
CA2847788C (en) | Server system for providing current data and past data to clients | |
US9575822B2 (en) | Tracking a relative arrival order of events being stored in multiple queues using a counter using most significant bit values | |
JP2019517704A (en) | Traffic logging in computer networks | |
CN107277100B (en) | System and method for near real-time cloud infrastructure policy implementation and management | |
CN110910249B (en) | Data processing method and device, node equipment and storage medium | |
CN111988243A (en) | Coherent acquisition of shared buffer state | |
US9032536B2 (en) | System and method for incapacitating a hardware keylogger | |
CN107798009A (en) | Data aggregation method, apparatus and system | |
US20120246726A1 (en) | Determining heavy distinct hitters in a data stream | |
US8671267B2 (en) | Monitoring processing time in a shared pipeline | |
US10333813B1 (en) | Time-out tracking for high-throughput packet transmission | |
CN103746860A (en) | Network monitoring system and method thereof in virtual environment | |
EP3166027B1 (en) | Method and apparatus for determining hot page in database | |
CN111258845A (en) | Detection of event storms | |
US20240171498A1 (en) | Detecting in-transit inband telemetry packet drops | |
US20240176882A1 (en) | System and method for maching learing-based detection of ransomware attacks on a storage system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| AS | Assignment | Owner name: TRACEVECTOR, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HARLACHER, JAMES;ABENE, MARK;REEL/FRAME:029147/0301; Effective date: 20121010
| AS | Assignment | Owner name: VECTRA NETWORKS, INC., CALIFORNIA; Free format text: CHANGE OF NAME;ASSIGNOR:TRACEVECTOR, INC.;REEL/FRAME:033146/0393; Effective date: 20140206
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION