WO2014026220A1 - Analysis of time series data - Google Patents

Analysis of time series data

Info

Publication number
WO2014026220A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
time
computer
computer readable
user interface
Prior art date
Application number
PCT/AU2013/000883
Other languages
English (en)
Inventor
Michael Baker
Original Assignee
Mts Consulting Pty Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mts Consulting Pty Limited filed Critical Mts Consulting Pty Limited
Priority to US14/405,684 priority Critical patent/US9578046B2/en
Priority to AU2013302297A priority patent/AU2013302297B2/en
Publication of WO2014026220A1 publication Critical patent/WO2014026220A1/fr

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416Event detection, e.g. attack signature detection

Definitions

  • This disclosure concerns the analysis of time series data. For example, but not limited to, the analysis of data packets sent on a computer network in order to identify security risks. Aspects include a data model, methods, software, user interfaces and a computer system.

Background art
  • SIEM Security Information and Event Management
  • RDBMS Relational Database Management Systems
  • PCAPs network packet captures
  • tools point solutions
  • a computer readable medium having stored thereon a computer readable data model to store values representing time indexed data, wherein the model is indexed by multiple sets of keys wherein each key in a set includes the same set of criteria related to the time indexed data, and each key of the set represents a different time interval length.
  • the index is essentially repeated for different time intervals. This enables access to the time indexed data for display on a user interface in a manner that can adapt to changes in the resolution of the time indexed data needing to be displayed (i.e. zoomed in or out) without the need or with limited further processing of the stored data.
  • the time indexed data may represent data traffic on a computer network.
  • the time indexed data may be data packets sent on the network.
  • the stored time indexed data may be processed to identify security risks in the data traffic.
  • the criteria may relate to features in the time indexed data that separately or together with other features are indicators of security risks in the data traffic.
  • the computer readable data model may be further indexed by a second index comprised of time values represented in the time series data.
  • a first key represents a first time interval length and a second key in the same set as the first key represents a second time interval length, wherein the second time interval length is larger than the first time interval length and the values stored in the data model for the second key summarise the values stored in the data model for the first key.
  • the computer readable data model may be further indexed by a second index that represents a count or total.
  • a criteria of each key of a set may be a time value represented in the time series data.
  • the model may comprise one or more tables, each table indexed by multiple sets of keys.
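A minimal sketch of this multi-resolution key model, with hypothetical interval names and an in-memory dict standing in for the column store:

```python
# A hypothetical in-memory sketch of the model: the same criteria are
# repeated as one key per interval length, so a single event is written
# to every resolution at once.
INTERVALS = {"minutes": 60, "hours": 3600, "days": 86400}

def record(store, criteria, timestamp, value=1):
    """Add `value` under each key in the set, in the column for the
    interval boundary containing `timestamp`."""
    for name, width in INTERVALS.items():
        key = f"{criteria}:{name}"                # same criteria, different interval
        column = timestamp - (timestamp % width)  # align to the bin boundary
        row = store.setdefault(key, {})
        row[column] = row.get(column, 0) + value

store = {}
record(store, "1:1:all:attacks", 1312610405)
record(store, "1:1:all:attacks", 1312610465)  # next minute, same hour
```

Zooming the display out then only requires reading the coarser key; no re-aggregation of the minute-level data is needed.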
  • software comprising computer readable instructions stored on computer readable memory that allow the data model described above to be accessed and/or updated.
  • a computer system comprising: a datastore to store a computer readable data model that stores time indexed data, wherein the model is indexed by multiple sets of keys wherein each key in a set includes the same set of criteria related to the time indexed data, and each key of the set represents a different time interval length; and
  • a processor to access the data model to query and retrieve the stored data.
  • a user interface to display the stored values in the computer readable data model of the first aspect.
  • In a fourth aspect there is provided a method for causing a display on a user interface of time indexed data stored in a data model, wherein the model is indexed by multiple sets of keys wherein each key in a set includes the same set of criteria related to the time indexed data, and each key of the set represents a different time interval length, the method comprising:
  • Determining the time scale may comprise receiving from the user (directly or indirectly) a time scale from a set of candidate time scales displayed to the user for selection, such as displaying data over one day, month or year.
  • the time scale may be based on the start time and end time the user selects for display.
  • the time scale may be based on a user selected resolution of the graphical display, such as by zooming in or out of the display.
  • the most appropriate time interval may be the time interval that will result in the graphical display satisfying a set of display criteria, such as no more than a first predetermined number of bars in a graph in the graphical display, and not less than a second predetermined number.
  • Causing the display may be by transmitting the time indexed data associated with the relevant keys to the user interface.
  • software being computer readable instructions stored on a computer readable medium that when executed by a computer causes the computer to perform the method of the fourth aspect.
  • a computer system to cause a display on a user interface, the computer comprising a data store to store the time series data in the data model of the fourth aspect, and a processor to perform the method of the fourth aspect.
  • a user interface to display the accessed or received time indexed data associated with the relevant keys as determined according to the method of the fourth aspect.
  • In a seventh aspect there is provided a method of analysing time series data, the method comprising:
  • the method may further comprise causing a display on a user interface by:
  • the events may be security risk events and the time series data may be data traffic on a computer network.
  • software being computer readable instructions stored on a computer readable medium that when executed by a computer causes the computer to perform the method of the seventh aspect.
  • a computer system to analyse time series data comprising:
  • a data store to store the time series data, detected events and associated confidence levels, and one or more different algorithms to detect one or more events
  • a user interface to display an indication of the one or more detected events and associated confidence level as determined according to the method of the seventh aspect.
  • a method of analysing data packet traffic in a computer network comprising:
  • the correlation may be based on whether the event relates to the same conversation, that is having one or more of the same source address, source port, destination address and destination port.
  • the method may further comprise causing a display on a graphical display by:
  • One or more of the algorithms may be executed as IPS Engines.
  • software being computer readable instructions stored on a computer readable medium that when executed by a computer causes the computer to perform the method of the tenth aspect.
  • a computer system to analyse data packet traffic in a computer network comprising:
  • a data store to store the time series data, detected security risk events and indications of correlations between security risks, and one or more different algorithms to detect one or more security risk events
  • a processor to perform the method of the tenth aspect.
  • a user interface to display one or more similar security risk events as determined by the method of the tenth aspect as a single event.
  • a method of analysing historic data packet traffic in a computer network comprising:
  • the association between the one or more new detected events and the earliest data packet may be a common time associated with each.
  • a data store to store the time series data, indications of detected security risks, an indication of an earliest data packet and one or more different algorithms to detect one or more events.
  • a user interface to display the identified at least an earlier data packet according to the method of the thirteenth aspect.
  • one or more of the method aspects described above may be combined. The same applies to the computer system, user interface and software aspects.
  • Fig. 1 is an example simplified hardware computer system that can be used with the system of this example;
  • Fig. 2 shows flow charts of the methods of the system
  • Fig. 3 schematically shows the model of the graphs database
  • Fig. 4 schematically shows the relationships between stored network traffic, map/reduce jobs, IPS Engines and the user interface
  • Fig. 5 is a sample portion of a bins database
  • Fig. 6 is a schematic representation of components of an on-site version of the system
  • Fig. 7 is a sample portion of a groupings database
  • Fig. 8 is a further schematic representation of components of an on-site version of the system
  • Fig. 9 is a schematic representation of the system used to identify Zero Day attacks
  • Figs. 10 to 13 are a set of example user interfaces or parts of example user interfaces.
  • Fig. 22 shows how all map/reduce jobs link to the data model
  • Figs. 23, 25, 26 and 27 are a further set of example user interfaces or parts of example user interfaces.
  • Figs. 25 and 28 to 30 schematically show two queries to the databases.

Best modes of the invention
  • the time series data is data from a network that is analysed to identify potential security risks.
  • a network 100 is typically a privately owned computer network. It typically includes one or more servers 102 that support multiple users 104, each accessing the network using one or more devices.
  • the server 102 supports processing and communications between the users 104 and also between each user 104 and the server 102.
  • a person skilled in the art would readily understand that many forms of the network 100 exist, such as distributed networks, LANs and WANs, publicly owned, cloud orientated networks etc. All of these networks are included in the definition of a network as used in this specification.
  • the server 102 is in communication with the security analysis service stored in computer readable medium hosted by server 108.
  • the server 108 is simply one example of the hardware computing devices that could be used to support this system. Many other distributed network, cloud orientated networks could be used to support this service.
  • PCAPs Network packet captures
  • an administrator of the network 100 interacts with a web interface hosted by the server 108 to upload these PCAPs to the system and these are stored in a datastore 110, such as a database.
  • Alternative methods for uploading PCAPs include using a Secure File Interface, Secure File Transfer Protocol or copying from a customer's Amazon S3 bucket.
  • Network packet captures can also be streamed from a sensor located on the network 100 to server 108 to be stored by the server 108 on the database 110 for persistence.
  • This implementation is known as software as a service located in the cloud, as the processing of the PCAPs is batch processed by a remotely located server 108.
  • the server 108 includes software to support the security risk analysis methods described here, including software to host the user interface, to manage the databases described further below, the software of the IPS Engines and map reduce jobs etc.
  • the server 108 includes the necessary hardware components required to execute the methods described here, such as a processor, input/output port, RAM and ROM memory and hardware to support the datastore 110.
  • the server 108 could access the data when stored in an alternative way, such as distributed and remotely commonly referred to as on the cloud.
  • because packet captures are an exact copy in binary format, they can be used as a prima facie data source for Threats (Network Attacks), Sessions (IP, TCP, UDP, ICMP), Protocols (HTTP, SSH, IMAP) and Files (Word, Excel, PDF).
  • the system of this example iteratively analyses the PCAPs to look for anomalies.
  • the system can dissect the traffic and perform deep packet inspection in relation to protocols conveyed in the packet capture. All files that are conveyed within the packet capture can be extracted and analysed for anomalies, viruses or compared to a whitelist or blacklist. The analysis can be repeated based on new information to identify previously unidentified security risks.
  • the aim of the security risk analysis is to identify trends, issues, anomalies, misuse, rates of change and relationships between any node on the network 100 at any point of time. It does this by analysing and persisting large volumes of PCAPs.
  • the design of the system is schematically shown in Fig. 4.
  • the use of a parallel processing model allows the system to be readily scalable. Each process performed by the system in this example is executed using the Map/reduce model and the java based software system Hadoop. For example, feature extraction for Statistical analysis and Machine Learning can be performed.
  • anomalies and misuse can be identified using custom map/reduce jobs or by Snort and Sling map/reduce jobs to use sensors that produce alerts.
  • a person skilled in the art would also identify that other parallel processing architectures are suitable for use with the system.
  • the map/reduce jobs 500-514 break up and analyse the PCAPs, in this case in rounds, so as to produce results that are inserted into databases 390.
  • the map/reduce jobs 500-514 query the PCAPs stored on the datastore 110 and perform both real-time and batch inserts into (with or without querying) a distributed database, which is essentially a number of databases 390 stored typically at multiple distinct physical locations.
  • Real-time insertions provide sub-second information based on network traffic of the network.
  • Batch insertions are performed as the outcome of map/reduce jobs written to execute on a Hadoop Cluster. Batch insertions are generally used for jobs that are required to look at the entire dataset rather than a stream of network data.
  • Databases 400-412 are column orientated and database 414 is a relational database.
  • the word database should be understood here to mean one or more tables that may or may not separately be part of a database management system.
  • Example databases 390 shown in Fig. 4 are:
  • Bins database 400 having sparse time series data to record row keys over time
  • Connections database 402 being an intermediary table to track IP connections such as TCP connections and UDP/ICMP pseudo connections
  • Attacks database 404 having data related to security risks (attacks)
  • Groupings database 406 being an inverted index that correlates counts and values to row keys
  • Graphs database 408 being a network graph made up of Nodes, Properties and Edges that persist relationships between structured and unstructured data from Threats, Sessions, Protocols and Files
  • PLVDB Preprocess database 410 being a vulnerability database pre-processing
  • PLVDB Recommend database 412 having recommendations for attack related information and correlated signatures
  • Correlations table 414 being a SQL table that allows Signatures to be grouped in the user interface and then be processed by the PLVDB Recommend job 512 for insertion into PLVDB Recommend table 412

Bins table 400
  • a subset of an example bins database 400 is shown in Fig. 5.
  • the data model of the bins database 400 is comprised of billions of columns 550 at 60 second time value boundaries or time value intervals using unix timestamps (e.g. 1312610400 equates to Sat Aug 6 16:00:00 2011); the timestamps are UTC.
  • the bin time length intervals are:
  • time bins provide the user with the ability to zoom in and out in the analysis of the PCAPs in relation to time using the user interface 700. That is, the analysis can be seen at an overview level by displaying a visualisation of the 1 year bin and also zoomed all the way in to the actual minute that the data happened. The general method of operation is shown in Fig. 2(a). The use of multiple bin widths means that the PCAPs are not abstracted, nor is there any loss of information related to the data over time (limited only by the minimum resolution of the columns). At the same time, the model of the bins database also allows the amount of data sent to the user interface 700 for display to be managed by selecting the appropriate bin for the display, preventing excessive processing of data after extraction from the databases 390.
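The bin-selection step can be sketched as follows; the bar limit is an assumed display criterion, not a value from the specification:

```python
# Pick the narrowest bin width whose bar count for the requested window
# stays within the display criteria (the limit here is illustrative).
BIN_WIDTHS = [60, 3600, 86400, 604800, 2592000, 31536000]  # seconds
MAX_BARS = 120

def choose_bin(start, end):
    span = end - start
    for width in BIN_WIDTHS:           # widths are sorted narrow to wide
        if span // width <= MAX_BARS:
            return width
    return BIN_WIDTHS[-1]

# A one-day window would need 1440 minute bars, so hour bins are chosen.
day_bin = choose_bin(0, 86400)
```

Because the same data is already persisted at every bin width, changing zoom level is a matter of reading a different key set rather than re-processing stored data.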
  • the bins database 400 typically has an index comprised of billions and billions of rows 552 that each define a set of criteria.
  • Each row is typically a combination of criteria that the data within the PCAPs may meet.
  • the rows are termed row keys as they provide a 'key' to data that is stored in the column that is termed the 'value'.
  • By querying a row key for a particular column (time) we are able to return a value, in this case a count of how many times the PCAPs represented by that column meet the criteria of the relevant row key. So the way we persist information from the parallel map/reduce jobs is to create sets of row keys for the important combinations of criteria whose values we want to know, and then store each value in the column that represents the relevant time value range.
  • In the bins database 400, information about the data is concurrently stored multiple times depending on the bin width. For example, a specific event (i.e. when the data matches a key) is written to all the related multiple time bins for that key concurrently.
  • Row keys define the criteria that is applied to the time series data of the PCAPs.
  • This example bin database 400 row key comprises the following criteria separated by semicolons. 'user_id' being the User ID of a user to separate network 100 data from others in the database. In this way, all users of the security analysis service that belong to network 100 are combined with reference to their user ID. At the same time any analysis that is based on a global average would typically consider all user IDs captured
  • 'device_id' being a Device ID for where the packet capture was taken or received from
  • 'focus_level' being the number of distinct sensor systems such as IPS Engines that detected an event, such as 1, 2, 3, 4
  • 'src_addr' being the source of the related data, such as Country, City, ISP, IP
  • 'src_port' being the source port of the TCP or UDP communication
  • 'dst_addr' being the destination of the related data, such as Country, City, ISP, IP
  • 'dst_port' being the destination port of the TCP or UDP communication
  • 'type' being a way to aggregate data related to attacks, such as all attacks
  • 'attack_all', high attacks (attacks_1), medium attacks (attacks_2), low attacks
  • 'ts_amount' being an integer representing the multiple of the 'ts_interval', such as
  • the 'source' and 'destination' can relate to a Country, City, ISP or IP Address. This is represented in row keys as shown by the following examples:
  • the map/reduce jobs process the PCAPs, enrich it (add context) and then insert the values into columns corresponding to bin widths.
  • the row keys are constructed from over 10,000 different combinations of criteria found within a particular packet, or from correlating data within the packet with other information (e.g. location information).
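A hypothetical illustration of deriving several row keys from one decoded packet's fields plus enrichment; the field names and key layout here are illustrative, not the patent's exact format:

```python
# Derive multiple row keys from a single packet: direct fields plus
# correlated context (e.g. a geo lookup on the source address).
def row_keys(pkt):
    base = f"{pkt['user_id']}:{pkt['device_id']}"
    return [
        f"{base}:src_addr:{pkt['src_addr']}",
        f"{base}:src_addr:{pkt['src_country']}",   # enriched via location lookup
        f"{base}:dst_port:{pkt['dst_port']}",
        f"{base}:conversation:{pkt['src_addr']}:{pkt['src_port']}:"
        f"{pkt['dst_addr']}:{pkt['dst_port']}",
    ]

keys = row_keys({
    "user_id": 1, "device_id": 1,
    "src_addr": "1.2.3.4", "src_port": 40000, "src_country": "AU",
    "dst_addr": "5.6.7.8", "dst_port": 80,
})
```

Each generated key would then receive a count in the column for every bin width, as described for the bins database above.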
  • a person skilled in the art will appreciate that any kind of metadata can be extracted out of individual packets, multiple packets, and out of conversations, from which we can apply logic, statistical analysis or machine learning to this meta data or features for storage.
  • the groupings database 406 data model does not use time series information for columns 700 but rather integers to store the current count or total for a row key.
  • data in the groupings database can be obtained directly by querying the top columns for a row key 702 and then applying a limit to the set returned.
  • a row key 702 can comprise the following criteria separated by semicolons.
  • focus_level being how many sensors recorded an alert on the same conversation within the same time value period
  • each row key represents a unique combination of values of the criteria, so there are typically billions of rows 702.
  • row keys can be considered as sets of row keys having the same criteria of 'user_id', 'device_id', 'capture_id', 'focus_level', 'type', 'key' and 'ts_value' but having a different value for the time interval, that is the combination of 'ts_interval' and 'ts_amount'.
  • Types of values stored in the groupings table 406 are dependent on the row key.
  • the row key could seek to identify IP Addresses that match the criteria of the row key.
  • the values stored in that row will be IP Addresses in the appropriate column based on count. So if an IP Address satisfies the criteria for a row key a total of 4 times, it will be stored in the column representing the count 4.
  • a different IP Address that satisfies the criteria for the same row key a total of 100 times will be entered in the column representing the count 100.
  • Other data types stored in the groupings table 406 includes data representative of country, city, time, IPS signature ID, destination port and device, where all entries in a row are of the same data type. A person skilled in the art will appreciate that many other forms of data can be stored in the groupings table 406.
  • the groupings database does not store data within columns as plain text but rather packs the data into bytes as specified below.
  • Indications of different signatures from different IPS engines can be stored, so we need a way to track which IPS the particular signature is referring to.
  • the packed data length is 8 bytes long.
  • the first byte is the IPS.
  • the rest is specific to each IPS.
  • Snort is marked as type "1".
  • the signature format is "snort_sig_l_2_3", where the three numbers are split up and stored as a byte, short and a byte. This adds up to 5 bytes, and the 3 remaining bytes are ignored, e.g.: snort_sig_5_6_7 will be stored as [01 05 00 06 07 00 00 00]
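The worked example above can be reproduced with Python's struct module; a big-endian short is assumed, as implied by the [.. 00 06 ..] bytes:

```python
import struct

# Pack a Snort signature "snort_sig_a_b_c" as: type byte 1 (Snort), then
# the three numbers as byte, big-endian short and byte, padded to 8 bytes.
def pack_snort(sig):
    a, b, c = (int(n) for n in sig.split("_")[2:])
    return struct.pack(">BBHB3x", 1, a, b, c)  # 3x = three ignored pad bytes

packed = pack_snort("snort_sig_5_6_7")
# packed == b"\x01\x05\x00\x06\x07\x00\x00\x00", i.e. [01 05 00 06 07 00 00 00]
```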
  • Check Point is marked as type "2".
  • cp_abcl234 becomes [02 61 62 63 31 32 33 34]
  • Different signatures can be packed together, e.g. using the previous two examples, [01 05 00 06 07 00 00 00 02 61 62 63 31 32 33 34]
  • Source and destination ports range from 0 to 65535, so it makes sense to use a short to store the values, e.g. port 258 would be stored as [01 02]
  • IP addresses are stored with four 8-bit numbers, so it makes sense to store 4 bytes of data, e.g. 1.2.3.4 would be stored as [01 02 03 04]
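Both fixed-width encodings above follow directly from the Python standard library:

```python
import socket
import struct

def pack_port(port):
    """A port fits an unsigned 16-bit value: 258 -> [01 02]."""
    return struct.pack(">H", port)

def pack_ip(addr):
    """An IPv4 address is four bytes: "1.2.3.4" -> [01 02 03 04]."""
    return socket.inet_aton(addr)
```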
  • Regions are packed as 4 bytes, the first two bytes as the country code and the next two bytes as the localised region ID.
  • Region names longer than two bytes use a number lookup instead, e.g. "NSW" is 3 bytes, so the code "02" is used instead.
  • NY, USA and NSW, Australia would be stored as "USNYAU02"
  • City data has a variable length, since name lengths can't be predicted, and allocating a large amount of room for a fixed data width would be wasteful.
  • Each city name is prefixed with the country code and region code.
  • the names are separated by a null character, e.g. Sydney, NSW, Australia and New York, NY, USA packed together would be "AU02Sydney[NULL]USNYNew York"

AS Numbers
  • AS Numbers can go up to 4 bytes (RFC 4893), so we store this as 4 bytes, e.g.
  • the query seeks to know who is being attacked at each minute for 30 minutes duration and show the result in the user interface 700.
  • the following row keys would be queried: 1:1:all:1:sig_dst:snort_all:minutes:1:1249081500
  • 1249081500 is a unix timestamp that equates to Sat Aug 1 09:05:00 2009.
  • Connections table 402 is an intermediary database to track IP connections such as TCP connections and UDP/ICMP pseudo connections.
  • a row key can comprise the following criteria separated by semicolons:
  • the connections database 402 is populated with values and queried in a similar manner to the bins database 400.

Attacks table 404
  • the attacks table 404 keeps a record of every security risk event (attack) that has been triggered by an IPS sensor.
  • the row key is constructed using the following information
  • the column keys are unix timestamps at a minimum width of 1 minute - a similar column structure to that of the bins database 400.
  • the values stored in the database 404 represent the actual sensor event data encoded using JavaScript Object Notation (JSON) that includes a time stamp.
  • JSON JavaScript Object Notation
  • the value stored may represent one or more IPS Engines that detected the attack and an indication of the actual attack identified by each engine.
  • a network graph is a data model structured with Vertices (or Nodes) and Edges (relationships). Using a network graph the system is able to model structured and unstructured data. The network graph maintains a complete picture of all attack information for all customers.
  • map/reduce jobs 500-514 are used to:
  • map/reduce jobs 500-514 utilise a parallel processing model and in this example leverage off an open source parallel computing standard known as Hadoop.
  • a map/reduce job is run to break up the PCAPs into manageable sizes that can be processed in parallel on the Hadoop platform.
  • map/reduce jobs can be performed in parallel. To ensure all the data required to run a map/reduce 500 are available, the map/reduce jobs are grouped in rounds and each map reduce/job in a round is performed in parallel.
  • round one 520 is comprised of the following map/reduce jobs: PCAP map/reduce Job 500 extracts data from PCAPs and inserts into 'bins' 400 and 'connections' 402.
  • Snort map/reduce Job 502 extracts data from PCAPs and inserts into 'attacks' 404.
  • Sling map/reduce Job 504 receives data extracted from the PCAPs by sensors
  • the second round 522 is comprised of the following map/reduce jobs:
  • Back channel map/reduce Job 506 extracts data from Connections 402 and inserts into bins 400 and groupings 406.
  • Attacks map/reduce job 508 extracts data from Attacks 404 and Correlations 414 and inserts into bins 400, groupings 406 and plvdb_preprocess 410.
  • the third round 524 is comprised of the following map/reduce jobs:
  • Graph Job 510 extracts data from Attacks 404 and Correlations 414 and inserts into 'graphs' 408
  • PLVDB Recommend Job 512 extracts data from PLVDB Pre Process 410 and PLVDB
  • the fourth round 526 is comprised of the following map/reduce job:
  • Focus job 514 extracts from Correlations 414 and Attacks 404 and inserts into 'bins' 400 and 'groupings' 406.
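The round structure above can be sketched as follows (job names are shorthand for the numbered jobs): jobs within a round may run in parallel, while rounds run sequentially so that each job's inputs already exist.

```python
from concurrent.futures import ThreadPoolExecutor

ROUNDS = [
    ["pcap", "snort", "sling"],            # round one 520
    ["back_channel", "attacks"],           # round two 522
    ["graph", "plvdb_recommend"],          # round three 524
    ["focus"],                             # round four 526
]

def run_rounds(run_job, rounds=ROUNDS):
    """Run each round's jobs in parallel, but finish a round before
    starting the next, so downstream tables are fully populated."""
    results = []
    for jobs in rounds:
        with ThreadPoolExecutor() as pool:
            results.extend(pool.map(run_job, jobs))
    return results

order = run_rounds(lambda name: name)
```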
  • the system re-analyses PCAPs so that Intrusion Prevention Devices can inspect the traffic and produce indicators and warnings in relation to attacks.
  • the sling map/reduce job 504 uses multiple Open Source IPS engines (e.g. Snort and Suricata) and commercial IPS engines 750 (only four shown for simplicity).
  • the system is also not limited to IPS engines. A person skilled in the art will appreciate that any device such as a sensor that performs analysis on full network captures can be used to produce indicators and warnings of security risks instead of or in addition to IPS engines 750. Further, machine learning algorithms could readily be used.
  • all IPS Engines 750 are independent of each other i.e.
  • IPS Engines 750 also update their signature sets at any time. Once a signature is updated, the stored PCAPs 110 are re-analysed by the relevant IPS device 750 to see if any new alerts are triggered for traffic that has previously been analysed.
  • the Time, Source IP Address, Source Port, Destination IP Address, Destination Port and the exact details of the attack are stored in the Attacks database 404 by the sling map/reduce job 504.
  • the snort map/reduce job operates in the same way as the sling map/reduce job; however, it works directly on PCAPs 110 to identify attacks and then make a record in the attacks table 404.
  • IPS Events or snort events triggered by the same conversation, i.e. packets with the same source and destination information within a small time interval such as 60 seconds.
  • conversations that can be identified are saved to the Packetloop Vulnerability Database (PLVDB) where they can be merged into a single attack name for use within the user interface.
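An illustrative sketch of grouping events into conversations: alerts sharing the same source/destination 4-tuple within a 60-second bucket collapse into a single entry (the field names are assumptions, not the patent's schema):

```python
# Group IPS events by conversation: same 4-tuple within a 60-second
# window merges into one attack entry listing the engines that fired.
def conversation_key(event):
    return (event["src_addr"], event["src_port"],
            event["dst_addr"], event["dst_port"],
            event["time"] // 60)            # 60-second bucket

def merge_events(events):
    merged = {}
    for e in events:
        merged.setdefault(conversation_key(e), []).append(e["engine"])
    return merged

merged = merge_events([
    {"src_addr": "1.2.3.4", "src_port": 40000, "dst_addr": "5.6.7.8",
     "dst_port": 80, "time": 1249081505, "engine": "snort"},
    {"src_addr": "1.2.3.4", "src_port": 40000, "dst_addr": "5.6.7.8",
     "dst_port": 80, "time": 1249081530, "engine": "suricata"},
])
```

The number of distinct engines per merged conversation is what the focus level described below counts.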
  • PLVDB Packetloop Vulnerability Database
  • the correlation of attack signatures informs the Focus map/reduce job 514 of a match and updates the focus level in the database 390. If the signatures are not correlated, then the confidence level is less than 1.
  • Attack signatures can be manually or automatically tracked to determine that they trigger on the same conversations and they trigger uniformly each time the attack is found.
  • the PLVDB shows every conversation where IPS Engines or snort map/reduce jobs triggered on the same attack type on the same conversation in the same time interval. A percentage score is linked to the correlation between them.
  • IPS Engines 750 ship with a signature or 'protection' that should detect or block a specific attack. Sometimes these signatures don't actually detect/block the attack, and in some cases they may detect it sometimes but not every time. The same can be said for the accuracy of the snort map/reduce job. The system keeps a record of both false negatives and false positives.
  • the system presents a 'confidence level' within the user interface 700 based on the correlation percentage score to allow a user to filter out attacks based on the number of IPS Engines 750 that triggered the attack and the number of hits detected by the snort map/reduce job.
  • the confidence is a rating between 1 and 4, but a person skilled in the art would appreciate that any scale could be applied.
  • When set to 1, only a single IPS engine 750 (it doesn't matter which one) or the snort map/reduce job needs to register an attack for it to be shown in the user interface 700. In this example, at confidence level 4 all four IPS Engines 750 are required to alert for it to be displayed in the Web UI 700.
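The filter described above reduces to a comparison against the focus level, i.e. the number of distinct engines that alerted on a conversation (a minimal sketch with hypothetical record fields):

```python
# Hide any attack detected by fewer engines than the user's confidence
# setting (1-4); focus_level counts the distinct engines that alerted.
def visible(attacks, confidence):
    return [a for a in attacks if a["focus_level"] >= confidence]

attacks = [{"sig": "a", "focus_level": 1},
           {"sig": "b", "focus_level": 4}]
high_confidence = visible(attacks, 4)  # only the attack all engines saw
```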
  • the PLVDB database will be used to track the occurrence of the first three threat conditions.
  • a key problem the system solves is finding Zero Day (0 Day) attacks in past network traffic. This is accomplished by having the PCAPs re-analysed by the system.
  • FIG. 9 An example network 902 that supports this analysis is shown in Fig. 9.
  • most of this network 902 can be considered isolated from the network that supports the analysis described with reference to Fig. 4. That is, in this example the network 902 includes copies of the IPS Engines 750' as shown in Fig. 4 rather than using the same engines. This improves efficiency as the system can continue to look for Zero Day attacks in past data using the system of Fig. 9 while analysing new PCAPs in the network shown in Fig. 4.
  • the sling map/reduce job 504' may also be a copy of the sling map/reduce job 504 used in Fig. 4 or it may be the same map/reduce job. Not shown is that a snort map/reduce job 502 is also used.
  • the snort job is able to analyse the native PCAPs and does not need to fire the PCAPs across the IPS Engines 750'. For improved efficiency the snort and sling jobs are kept separate.
  • the attacks table 404 in network 902 is the same as the attacks table 404 in Fig. 4, which in turn provides input into the map/reduce jobs of round two 522 and round three 524.
  • none of the IPS Engines 750' alert a zero day attack because they do not have a signature for the attack.
  • after upgrading each IPS engine 750' with the latest signatures, a replay or firing machine 900 sends the PCAPs 110 to be re-analysed by the IPS Engines 750' to determine if the updated signatures alert on the attack. If the snort map/reduce job is updated, the snort map/reduce job can also be re-run on the PCAPs.
  • the IPS Engines 750' update their signatures at differing rates and each time an Engine 750' is updated the PCAPs 110 are replayed 900. The resulting alerts are compared with alerts created from the previous iteration of the analysis.
  • a Looped Attack is a Zero Day attack (it was unable to be detected) that was subsequently detected through re-analysing the past PCAPS by IPS Engines 750' with new attack signatures. Looped attacks often provide information of the exact method that is used to breach the network.
  • the system records a time range in which the attack was used, the system it was used against and the source of the attack. Through enrichment, location information and other information related to the incident are obtained. Knowing the time the attack was used, the information can be filtered from that time until now to determine the conversations that took place between the attacker and the victim, and potentially other related systems.
  • the IPS Engine triggers are correlated based on conversation, such as source IP address, the time, the attack used and the system that it was initiated on.
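A minimal sketch of correlating triggers by conversation, assuming hypothetical alert records with `src`, `dst`, `attack`, `time` and `engine` fields; the grouping key and the 60-second bucket are illustrative choices, not the patented scheme:

```python
from collections import defaultdict

def correlate(alerts, interval=60):
    """Group alerts from different engines that describe the same
    conversation: same source, destination, attack and time bucket."""
    groups = defaultdict(set)
    for a in alerts:
        key = (a["src"], a["dst"], a["attack"], a["time"] // interval)
        groups[key].add(a["engine"])  # record which engines triggered
    return groups

alerts = [
    {"engine": "ips1", "src": "10.0.0.5", "dst": "10.0.0.9",
     "attack": "cve-x", "time": 1010},
    {"engine": "ips2", "src": "10.0.0.5", "dst": "10.0.0.9",
     "attack": "cve-x", "time": 1015},
]
groups = correlate(alerts)
```

Two engines alerting on the same conversation collapse into one group, whose engine count can then feed the confidence score.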
  • the system records the event data 906 created from re- analysing the PCAPs.
  • when an event is found it is processed by the Sling Map Reduce job 504 for insertion into the Attacks database 404.
  • the sling map/reduce job 504 and attacks 404 are the same as those identified in Fig. 4.
  • Snort and Sling map/reduce jobs operate in a similar fashion; however, a main difference is the data that they operate on. Snort reads from a file and Sling reads from information that is saved after PCAPs have been sent across an isolated network.
  • Looped attacks are also tracked. For example, if all IPS Engines 750', in this example all four, do not detect an attack on their first analysis of PCAPs, and then over time their signatures are updated and the attack is detected on a subsequent analysis, a Looped attack can be deduced. However, this almost never happens uniformly. For example, initially all four IPS engines 750' failed to detect the attack. Then over time they are all upgraded and IPS Engine #1 picks it up as a looped attack but IPS Engines #2-4 still do not detect the attack.
  • the newly detected attack is considered a looped attack because IPS Engine #1 subsequently detected it.
  • the confidence level of this looped attack is low since the other IPS Engines did not pick it up.
  • the focus map/reduce job 514 extracts attacks from the attack database 404 and also correlations from the correlations database 414 and then inserts the confidence level as appropriate in the Bins database 400 and groupings database 406. Method of general operation is shown in Fig. 2(d).
  • the confidence level increases with each detect. In this case 50% for two IPS Engines, 75% for three IPS Engines and 100% for all four IPS Engines.
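The proportional scores above (50% for two IPS Engines, 75% for three and 100% for all four) amount to a simple ratio of detections to engines; a hedged sketch, with `confidence_percent` a hypothetical helper name:

```python
def confidence_percent(engines_detected, total_engines=4):
    """Confidence as the proportion of IPS engines that detected the attack."""
    return 100 * engines_detected // total_engines
```

As the bullet below notes, confidence can be calculated in many other ways, e.g. weighting particular engines or attack types rather than using a plain proportion.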
  • the Focus job once again reads all the information out of the Attacks database 404 and Correlations database 414 and updates the Bins and Groupings tables. In this example, the existing relevant row keys are duplicated and the focus level in the duplicated row key changes. For example, the value of confidence is increased by one or more.
  • the general method of operation is shown in Fig. 2(b).
  • confidence can be calculated in many ways, for example, it can be weighted according to the particular IPS Engines that have found a hit, weighted according to the type or form of the attack detected, it may not be proportional to the number of IPS Engines that identified the attack and may be based on the particular combinations of the IPS Engines that identified (or did not identify) the attack.
  • the existing row keys are maintained for the modified focus level.
  • the discussion directly above discusses detecting a zero day attack using the sling map/reduce job.
  • the snort map/reduce job can also be updated and re-run on the PCAPs to identify zero day attacks, including the analysis for updating the focus level and confidence level.
  • the sling map/reduce job may represent one or more IPS Engines (sensors).
  • a Vulnerability database (not shown) is used to determine whether the identified attack is a looped attack.
  • industry reference for the attack e.g. CVE ID
  • the disclosure date available from the Internet
  • the signature information from the IPS Engine can be correlated to determine if it is a looped attack. For example:
  • An attack is detected with a signature from a sensor. Using the PLVDB, it can be determined when the signature was produced and available for the sensor. This determines whether the newly identified attack is a looped attack based on the release date of the signature. A comparison is made of the date of the newly identified attack against the signature release date. If the signature was released after the time of the attack in the packet captures, the attack is looped. Using the release date of the attack signature, as opposed to replaying all traffic against all signatures past and present, is the more efficient approach. The system keeps a record of which IPS Engines are the fastest at supplying signatures to find looped attacks. This information can then be used in future by the system, for example to influence the confidence level ratings.
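The release-date comparison described above can be sketched as follows; `is_looped` is a hypothetical helper name, and the dates are placeholders:

```python
from datetime import date

def is_looped(attack_date, signature_release_date):
    """An attack observed in past captures before its detection signature
    existed was a zero day at the time, so re-analysis flags it as looped."""
    return signature_release_date > attack_date

# Attack observed in February; the signature only shipped in June.
looped = is_looped(date(2013, 2, 1), date(2013, 6, 1))
# Attack observed after the signature already existed: not looped.
not_looped = is_looped(date(2013, 7, 1), date(2013, 6, 1))
```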
  • Data in the database 390 is presented to the user interface (UI) 700 through a standard Model, View, Controller (MVC) architecture.
  • the View is made up of an HTML5 and Javascript web page that makes JSON calls to the Controller.
  • the Controller queries the database 390 (the Model) and returns the result to the View.
  • the data model of the database 390 is designed so as to minimise the amount of modification of the received data that has to be performed to populate the UI 700.
  • the system uses the bins 400, groupings 406 and graphs 408 databases to render most of the UI 700.
  • these databases work together to display time series data and counts of particular data types, and provide the ability to dynamically filter the content shown to the user at different confidence levels.
  • the bins 400 and groupings 406 tables use a column oriented data model, whereas the graphs 408 database uses a network graph to store relationship data.
  • the "Time Period" in the UI sets the range of time being displayed between the left and right of the main visualisation.
  • An example main visualisation is shown in Fig. 10.
  • this time period can be further refined by the time slider selection 1008.
  • the most appropriate time interval length is chosen for display, where the aim is to maximise the details displayed in a meaningful way: not so much detail that it cannot be readily understood and not so little detail that too much information is lost.
  • typically a display is suited to having a time scale broken into 20 to 50 parts. For example, selecting the "Time Period" 1010 and setting it to 6 hours makes a call to the bins table for 15 minute bins, that is, row keys representing a 15 minute time interval, being four 15 minute bins per hour for the six hours. A time value period of 1 year would see the 1 month bins queried and a 10 year time value period has the 1 year bins queried.
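The bin-width selection described above (aiming for roughly 20 to 50 parts) might be sketched as follows; the candidate interval list and the 50-bin cap are assumptions for illustration, chosen so the worked examples in the text (6 hours → 15 minute bins, 1 year → 1 month bins, 10 years → 1 year bins) fall out naturally:

```python
# Candidate bin widths in seconds: 1 min, 15 min, 1 h, 1 day, ~1 month, ~1 year.
INTERVALS = [60, 900, 3600, 86400, 30 * 86400, 365 * 86400]

def pick_interval(period_seconds, max_bins=50):
    """Pick the finest bin width that keeps the displayed bin count manageable."""
    for width in INTERVALS:
        if period_seconds / width <= max_bins:
            return width
    return INTERVALS[-1]  # fall back to the coarsest bin for huge ranges
```

The UI then queries the bins table for row keys at the chosen resolution.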
  • the Ul 700 can display information that relates to a specific device that the PCAP was produced from.
  • the UI 700 can also be filtered by one or more specific PCAP data captures.
  • Operand: e.g. sum the set for updating panels.
  • Focus Level 1012: how many sensors need to register for data to be counted.
  • a series of requests is initiated to return data to be plotted in the main visualisation, followed by a series of requests to populate the data panels.
  • the bins database returns the data that is used to render the main visualisation as shown in Fig. 10. Bins is also used to render data panels, such as the panel shown in Fig. 11, that return the count, sum and average of time series data.
  • the Groupings database is used to get a count of any combination of row keys. This is faster than trying to process complex combinations on the Web client.
  • the Groupings table is used to render data panels with ranked information (e.g. the Top 10). In the example shown in Fig. 12 the Top Attacks for a specific time range are returned from the Groupings database.
  • Fig. 23 shows a sample user interface. It comprises two main parts: the main visualisation 2030 and the data panel 2032. Both parts show a different way of representing the PCAPs from the network 100 over the same time period.
  • the time period is summarised at 2034 to describe a 3 month period representing 17 million data packets.
  • the visualisation 2030 is shown slightly enlarged in Fig. 24.
  • the user can select a time scale for the display such as by selecting a start time for the 3 month period by using the calendar function at 2038.
  • the user can manipulate arrows 2042 to slide around a list of dates captured in the PCAPs.
  • the start date is defined as the time aligned with the marker 2044.
  • the user can also filter the data being displayed by the confidence level, as described earlier, using the slide bar at 2040.
  • each visualisation is specific to the information being displayed.
  • the visualisation has an x-axis 2046 that represents time and a y-axis 2048 that represents frequency of attacks.
  • the request of Fig. 25 is made to the groupings database. It defines that the datatype is source, to identify all attacks. A start time stamp and an end time stamp are provided, with a time range. This request returns all Source IP addresses grouped by the number of attacks made. "Attacks" in this case can be of any severity as they are grouped by "attacks_all" rather than a severity such as "attacks_1".
  • Each source IP address identified as an attack is returned with an associated frequency count.
  • this returned data allows the main visualisation 2030 to render the two columns on the left 2060 and two columns on the right 2062 hand side of the visualisation based on the timestamps returned.
  • Each Source IP address in the graph is represented by a distinct colour that is also shown in the legend 2064.
  • the height of each bar portion is based on the frequency of attacks from that Source IP address. It can be seen that there were no hits between June 7 and June 20. The visualisation clearly shows, for example, that the highest number of hits occurred on June 2, but June 5 has a greater count of source IP addresses that were the source of identified attacks. In yet another example the colour could be applied to attack types or the sensor that detected the attacks rather than source IP address. This is summarised at 2035, which states that the system identified in these 17 million packets 21,925 attacks from 19 sources, and 1 of these is new.
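A sketch of the kind of grouped query described above, which ranks source IP addresses by attack count over a time range; the record layout, data and function name are assumptions for illustration, standing in for the actual groupings database query of Fig. 25:

```python
from collections import Counter

def top_sources(attack_rows, start_ts, end_ts, n=10):
    """Rank source IPs by number of attacks within [start_ts, end_ts)."""
    counts = Counter(src for ts, src in attack_rows if start_ts <= ts < end_ts)
    return counts.most_common(n)

# Hypothetical (timestamp, source IP) attack records.
rows = [(100, "1.2.3.4"), (110, "1.2.3.4"), (120, "5.6.7.8"), (999, "9.9.9.9")]
ranked = top_sources(rows, 0, 200)
```

Pre-computing such groupings server-side avoids shipping raw records to the Web client, as the text notes.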
  • the information presented is specifically related to the source.
  • a number of pre-set time value ranges are provided to the user via the "Time Period" select box 2036, 1010. This sets the time range that will be viewed and in turn the appropriate bin resolution that will be queried from the bins and groupings databases.
  • a selection of 15 minutes from the "time period" selector 2036 means that 1 minute time bins are used.
  • a selection of 6 hours means that time bins of 15 minutes are used.
  • by default, Fig. 24 displays the most recent data. The user can then use the time period drop down 2036 to compare time periods, such as comparing today to yesterday by selecting '1 Day'. For time scales longer than this, the 7 Days, 1 month or 1 year options can be used, or the calendar.
  • the number of bins and their widths is a decision made by the system based on performance and the number of bins that look good and are informative in the visualisation. So for example when 1 year or 3 years is viewed you might get 12 or 36 one-month bins.
  • a slightly enlarged data panel 2032 is shown in Fig. 27.
  • 2070 indicates the number of attacks stored in the bins database for the row keys having criteria "type" as high attacks (attacks_1)
  • 2072 indicates the number of attacks stored in the bins database for the row keys having criteria "type" as medium attacks (attacks_2).
  • 2074 indicates the number of attacks stored in the bins database for the row keys having criteria "type” as low_attacks (attacks_3).
  • the total for box 2070 is populated based on the query values shown in Fig. 28.
  • y0 above is generated by the Javascript as a way of storing values to be rendered in the time series. The values returned from the database via the controller are 'time' and 'value'.
  • the y and yO are produced by a Javascript framework used for graphics and visualisation.
  • the data includes three time stamps for each bin where there has been a high attack. Within each bin there is a value. The total of 184 as displayed at 2070 is determined by adding 18 + 1 + 165.
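The panel total above can be reproduced by summing the per-bin values; the timestamps below are placeholders, but the values 18, 1 and 165 and the total 184 come from the text:

```python
# Hypothetical bin objects as returned for the high-attack (attacks_1) query.
high_attack_bins = [
    {"time": "2013-06-02T00:00", "value": 18},
    {"time": "2013-06-02T00:15", "value": 1},
    {"time": "2013-06-05T00:00", "value": 165},
]

# The panel total at 2070 is the sum of the per-bin values: 18 + 1 + 165.
total = sum(b["value"] for b in high_attack_bins)
```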
  • the total for Medium attacks is produced by a query to the Bins database as shown in Fig. 29. It can be seen that it only differs from the query in Fig. 28 in that the datatype is for "attacks_2" and returns three objects as follows:
  • the data includes three time stamps for each bin where there has been a medium attack. Within each bin there is a value.
  • the total for low attacks is produced by a query to the Bins database as shown in Fig. 30. It can be seen that it only differs from the query in Fig. 28 in that the datatype is for "attacks_3" and returns four objects as follows:
  • the bin resolution for the scale selected by the user for display is determined by the system.
  • the user can amend the time scale of the x-axis 2046 simply by scrolling over the visualisation display 2030.
  • a person skilled in the art will appreciate that many different functionalities can be incorporated into the user interface to allow for this 'zooming' functionality.
  • the start and end times of the x-axis are analysed by the system to determine if the resolution of the display is appropriate for sufficient information display.
  • the system implements this by determining the time interval for the x-axis suitable for the level of zoom selected.
  • the system subsequently accesses processed information from the databases 390 and from the groupings table associated with the relevant keys. That is, for the set of criteria queried, each belongs to a different set of keys and represents the same selected time interval.
  • the query of Fig. 25 would be sent to the groupings table with the time interval (bin) in the query being 1 minute instead of 1 day.
  • Map/reduce jobs process packet captures based on specific logic that is used in the user interface. These relate to:
  • the new map/reduce job compares every bin at every bin width and compares it to all bins before it to see what objects are 'new' or have never been seen in the current bin. For example this enables displaying in the user interface not all the data but only the data that is new within the current time range. This allows the identification of outliers and abnormalities more readily.
  • the new map/reduce job works on all objects within the data model - IP, Country, City, ISP, Attacks, Protocols etc. Based on these new counts averages of new occurrences can be calculated.
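A sketch of the 'new' computation, assuming each bin is represented as a set of objects (IPs, countries, attacks and so on); the real job runs as map/reduce over the data model rather than in-memory Python:

```python
def new_per_bin(bins):
    """For each bin, return the objects never seen in any earlier bin."""
    seen, result = set(), []
    for objects in bins:
        fresh = set(objects) - seen  # objects appearing for the first time
        result.append(fresh)
        seen |= fresh
    return result

bins = [{"10.0.0.1", "10.0.0.2"}, {"10.0.0.2", "10.0.0.3"}, {"10.0.0.1"}]
fresh_per_bin = new_per_bin(bins)
```

Showing only the fresh objects per bin is what lets the UI surface outliers and abnormalities.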
  • the distinct map/reduce job allows the system to quickly report on what is distinct per bin in relation to any data object in the system. Distinct Sources or Destinations and distinct attacks are good examples. Instead of having to request the data and then determine what is distinct on the client side, this is pre-processed using the distinct map/reduce job and the distinct figures are inserted into the data model. This also allows the average of distinct values to be determined.
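A sketch of the distinct-per-bin pre-processing, again with an assumed per-bin list-of-objects representation; the figures would be inserted into the data model rather than returned to a client:

```python
def distinct_per_bin(bins):
    """Count distinct objects per bin and their average across bins."""
    counts = [len(set(objects)) for objects in bins]
    average = sum(counts) / len(counts)
    return counts, average

# e.g. distinct source IPs per bin
counts, average = distinct_per_bin([["a", "a", "b"], ["c"]])
```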
  • the Backchannels map/reduce job inspects every attack (between a Source and a Destination) and determines whether the Destination of the attack ever connects back to the Source. This backchannel job still uncovers attacks and covert communication that need to be further analysed.
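A sketch of the backchannel check, assuming attacks and flows are both represented as (source, destination) pairs; this is an illustrative reduction of the Backchannels map/reduce job:

```python
def find_backchannels(attacks, flows):
    """Return attacks whose victim later connects back to the attacker,
    i.e. a (dst, src) flow exists for an attack (src, dst)."""
    return {(src, dst) for (src, dst) in attacks if (dst, src) in flows}

attacks = {("1.1.1.1", "10.0.0.5"), ("2.2.2.2", "10.0.0.6")}
flows = {("10.0.0.5", "1.1.1.1")}  # victim 10.0.0.5 connects back
suspicious = find_backchannels(attacks, flows)
```

Such reverse connections can indicate covert channels that warrant further analysis.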
  • Bins and Groupings databases will now be described in more detail.
  • Bins database: for any set of row keys and a time slice the Bins database returns the key, the column key (timestamp) and a value.
  • the row key of the bins database reads as follows:
  • a row key need not have the field 'ts_amount', only 'ts_interval'; in which case, in a set of row keys the 'ts_amount' for each would be different.
  • the row key of the groupings database reads as follows:
  • the groupings database doesn't return a timestamp and value like the bins table, but rather an integer value for the column and the value that has been requested. In this case it is a comma-separated list of destination ports.
  • the data returned states that destination ports 36965,47618,41421,34001,51577,49165,49523,50259 were seen in one attack during the minute starting with the timestamp and so on.
  • the graphs database uses a network graph to build a node and relationship graph.
  • the network graph is traversed so that any object that is selected shows only the relationships related to that object.
  • Fig. 22 shows the link between all jobs and the data model.
  • the system can be provided as an on-premise version on the network 100. That is the system is delivered as a turnkey solution for the capture, processing and analysis.
  • the alternative is an onsite hardware appliance, where sensors are placed on the network 100 to capture traffic with very low latency, together with a server, such as a virtual server, that stores the PCAPs, processes them and makes the analysis available through a similar but locally hosted user interface.
  • this onsite solution operates in real-time mode rather than through batch processing that is typically performed with the service in the cloud model.
  • Map/Reduce jobs are distributed processing units that act on data as it passes through the topology of the real-time implementation. The data is inserted into the tables at the end of the Stream process rather than the Map/Reduce process.
  • Implementations may also include a combination of processing network data packets in real-time and batch processing. An advantage of an implementation that includes real-time processing is the reduction in latency.
  • the datastore 110 may be disk or memory.
  • the term memory here is also used to refer to disk.
  • this alternative implementation is shown in Fig. 6, where hardware sensors 600 are installed in the network 100 and given direct access to the network segments that are required to be monitored. All traffic (which can be at different speeds) is passed through the Field Programmable Gate Array (FPGA) 600 that operates software. The raw packet capture information is then passed to the Sensor software that extracts important features from the data in real-time. The sensor software then inserts the row keys and values into the distributed database 390. Because the FPGA operates at very low latency, getting data from the network and into the database 390 happens very quickly (less than 1 second) and provides the User Interface 700 with a smooth real-time capability. Note that unlike the software as a service solution described earlier, the PCAPs and Snort map/reduce job would insert directly into the column oriented databases.
  • FPGA Field Programmable Gate Array
  • map/reduce jobs that extract data and perform Sling, Focus etc. would then be run to back fill the information needed.
  • the main difference is that the aim of this solution is to have the sensors 600 enter as much information into the databases as possible. In turn, this reduces the number of map/reduce jobs needed.
  • Map/reduce jobs 500-514 still perform calculations that require the entire dataset (e.g. determining what data has never been seen before).
  • All traffic from the sensors is still stored 110 in PCAP format for Map/reduce jobs to operate on.
  • these PCAPs are used to iterate the IPS Engines 702 that cannot be installed on the sensor 600 itself and for periodic replay of all captured data. That is, the sensor 600 would process data in real-time straight from the network and insert keys into the databases. It would also store the data in PCAP format for the later map/reduce jobs to process and act on.
  • the Bricks 800 comprise direct attached disk, map/reduce software, a column oriented database and source code, map/reduce jobs and isolated networks iterating PCAPs through the system.
  • Bricks 800 increment based on the amount of retention required and the daily rate of PCAPs. The network 100 speed, the number of network segments to be captured and the length of time data is retained determine the number of bricks 800 that are deployed.
  • the Sensor 600 has a number of interface options 802 that can be selected depending on the speed and number of the network segments that need to be monitored.
  • the FPGA software extracts the TCP/IP packet information from the physical interface and passes this to the sensor software. The features extracted are then inserted into the database 390, redundantly spread across all Bricks 800.


Abstract

Disclosed are a data model, methods, software, user interfaces and a computer system that enable the analysis of time series data, and in particular, but not limited to, the analysis of the data packets sent on a computer network in order to identify security risks. The invention comprises a datastore (110) dedicated to network traffic, Map/Reduce jobs (500-514) for analysing the stored data packets and persisting the result in a distributed database (390), IPS engines/machine learning algorithms (750) for event detection, and a user interface (700) designed to display one or more detected events together with the associated confidence level.
PCT/AU2013/000883 2012-08-13 2013-08-09 Analysis of time series data WO2014026220A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US14/405,684 US9578046B2 (en) 2012-08-13 2013-08-09 Analysis of time series data
AU2013302297A AU2013302297B2 (en) 2012-08-13 2013-08-09 Analysis of time series data

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261682657P 2012-08-13 2012-08-13
US61/682,657 2012-08-13

Publications (1)

Publication Number Publication Date
WO2014026220A1 true WO2014026220A1 (fr) 2014-02-20

Family

ID=50101104

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2013/000883 WO2014026220A1 (fr) Analysis of time series data

Country Status (3)

Country Link
US (1) US9578046B2 (fr)
AU (1) AU2013302297B2 (fr)
WO (1) WO2014026220A1 (fr)


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040010751A1 (en) * 2002-07-10 2004-01-15 Michael Merkel Methods and computer systems for displaying time variant tabular data

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6381603B1 (en) * 1999-02-22 2002-04-30 Position Iq, Inc. System and method for accessing local information by using referencing position system
US6867785B2 (en) * 2001-07-02 2005-03-15 Kaon Interactive, Inc. Method and system for determining resolution and quality settings for a textured, three dimensional model
US6931400B1 (en) * 2001-08-21 2005-08-16 At&T Corp. Method and system for identifying representative trends using sketches
US7751325B2 (en) * 2003-08-14 2010-07-06 At&T Intellectual Property Ii, L.P. Method and apparatus for sketch-based detection of changes in network traffic
US20060047807A1 (en) * 2004-08-25 2006-03-02 Fujitsu Limited Method and system for detecting a network anomaly in a network
US7346471B2 (en) * 2005-09-02 2008-03-18 Microsoft Corporation Web data outlier detection and mitigation
US8001601B2 (en) * 2006-06-02 2011-08-16 At&T Intellectual Property Ii, L.P. Method and apparatus for large-scale automated distributed denial of service attack detection
US8136162B2 (en) * 2006-08-31 2012-03-13 Broadcom Corporation Intelligent network interface controller
US8963969B2 (en) * 2007-01-31 2015-02-24 Hewlett-Packard Development Company, L.P. Providing an automated visualization of a collection of data values divided into a number of bins depending upon a change feature of the data values
US7792779B2 (en) * 2007-06-05 2010-09-07 Intel Corporation Detection of epidemic outbreaks with Persistent Causal-Chain Dynamic Bayesian Networks
US7676458B2 (en) * 2007-08-28 2010-03-09 International Business Machines Corporation System and method for historical diagnosis of sensor networks
US8676964B2 (en) * 2008-07-31 2014-03-18 Riverbed Technology, Inc. Detecting outliers in network traffic time series
US8677480B2 (en) * 2008-09-03 2014-03-18 Cisco Technology, Inc. Anomaly information distribution with threshold
CN102449660B (zh) * 2009-04-01 2015-05-06 I-Cetana Pty Ltd System and method for data detection
US20100305806A1 (en) * 2009-06-02 2010-12-02 Chadwick Todd Hawley Portable Multi-Modal Emergency Situation Anomaly Detection and Response System
US8310922B2 (en) * 2010-04-15 2012-11-13 International Business Machines Corporation Summarizing internet traffic patterns
US8971196B2 (en) * 2011-03-08 2015-03-03 Riverbed Technology, Inc. Distributed network traffic data collection and storage
US10356106B2 (en) * 2011-07-26 2019-07-16 Palo Alto Networks (Israel Analytics) Ltd. Detecting anomaly action within a computer network
US9185093B2 (en) * 2012-10-16 2015-11-10 Mcafee, Inc. System and method for correlating network information with subscriber information in a mobile network environment
IN2013MU01779A (fr) * 2013-05-20 2015-05-29 Tata Consultancy Services Ltd
US20150088798A1 (en) * 2013-09-23 2015-03-26 Mastercard International Incorporated Detecting behavioral patterns and anomalies using metadata
US10469514B2 (en) * 2014-06-23 2019-11-05 Hewlett Packard Enterprise Development Lp Collaborative and adaptive threat intelligence for computer security
US20160191549A1 (en) * 2014-10-09 2016-06-30 Glimmerglass Networks, Inc. Rich metadata-based network security monitoring and analysis
EP3221794B1 (fr) * 2014-11-18 2020-04-01 Vectra AI, Inc. Method and system for detecting threats using metadata vectors
US10075455B2 (en) * 2014-12-26 2018-09-11 Fireeye, Inc. Zero-day rotating guest image profile

Cited By (74)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9843488B2 (en) 2011-11-07 2017-12-12 Netflow Logic Corporation Method and system for confident anomaly detection in computer network traffic
US11805143B2 (en) 2011-11-07 2023-10-31 Netflow Logic Corporation Method and system for confident anomaly detection in computer network traffic
US11089041B2 (en) 2011-11-07 2021-08-10 Netflow Logic Corporation Method and system for confident anomaly detection in computer network traffic
CN106663040A (zh) * 2014-05-01 2017-05-10 Netflow Logic Corporation Method and system for confident anomaly detection in computer network traffic
EP3138008A4 (fr) * 2014-05-01 2017-10-25 Netflow Logic Corporation Method and system for confident anomaly detection in computer network traffic
US10346394B2 (en) 2015-05-14 2019-07-09 Deephaven Data Labs Llc Importation, presentation, and persistent storage of data
US9679006B2 (en) 2015-05-14 2017-06-13 Walleye Software, LLC Dynamic join processing using real time merged notification listener
US9633060B2 (en) 2015-05-14 2017-04-25 Walleye Software, LLC Computer data distribution architecture with table data cache proxy
US9639570B2 (en) 2015-05-14 2017-05-02 Walleye Software, LLC Data store access permission system with interleaved application of deferred access control filters
US9672238B2 (en) 2015-05-14 2017-06-06 Walleye Software, LLC Dynamic filter processing
US10540351B2 (en) 2015-05-14 2020-01-21 Deephaven Data Labs Llc Query dispatch and execution architecture
US9690821B2 (en) 2015-05-14 2017-06-27 Walleye Software, LLC Computer data system position-index mapping
US9710511B2 (en) 2015-05-14 2017-07-18 Walleye Software, LLC Dynamic table index mapping
US9760591B2 (en) 2015-05-14 2017-09-12 Walleye Software, LLC Dynamic code loading
US10552412B2 (en) 2015-05-14 2020-02-04 Deephaven Data Labs Llc Query task processing based on memory allocation and performance criteria
US9836495B2 (en) 2015-05-14 2017-12-05 Illumon Llc Computer assisted completion of hyperlink command segments
US9836494B2 (en) 2015-05-14 2017-12-05 Illumon Llc Importation, presentation, and persistent storage of data
US9886469B2 (en) 2015-05-14 2018-02-06 Walleye Software, LLC System performance logging of complex remote query processor query operations
US9898496B2 (en) 2015-05-14 2018-02-20 Illumon Llc Dynamic code loading
US9934266B2 (en) 2015-05-14 2018-04-03 Walleye Software, LLC Memory-efficient computer system for dynamic updating of join processing
US10003673B2 (en) 2015-05-14 2018-06-19 Illumon Llc Computer data distribution architecture
US10002153B2 (en) 2015-05-14 2018-06-19 Illumon Llc Remote data object publishing/subscribing system having a multicast key-value protocol
US10002155B1 (en) 2015-05-14 2018-06-19 Illumon Llc Dynamic code loading
US10019138B2 (en) 2015-05-14 2018-07-10 Illumon Llc Applying a GUI display effect formula in a hidden column to a section of data
US10069943B2 (en) 2015-05-14 2018-09-04 Illumon Llc Query dispatch and execution architecture
US10176211B2 (en) 2015-05-14 2019-01-08 Deephaven Data Labs Llc Dynamic table index mapping
US10198465B2 (en) 2015-05-14 2019-02-05 Deephaven Data Labs Llc Computer data system current row position query language construct and array processing query language constructs
US10198466B2 (en) 2015-05-14 2019-02-05 Deephaven Data Labs Llc Data store access permission system with interleaved application of deferred access control filters
US9612959B2 (en) 2015-05-14 2017-04-04 Walleye Software, LLC Distributed and optimized garbage collection of remote and exported table handle links to update propagation graph nodes
US10496639B2 (en) 2015-05-14 2019-12-03 Deephaven Data Labs Llc Computer data distribution architecture
US11663208B2 (en) 2015-05-14 2023-05-30 Deephaven Data Labs Llc Computer data system current row position query language construct and array processing query language constructs
US10212257B2 (en) 2015-05-14 2019-02-19 Deephaven Data Labs Llc Persistent query dispatch and execution architecture
US11556528B2 (en) 2015-05-14 2023-01-17 Deephaven Data Labs Llc Dynamic updating of query result displays
US10242041B2 (en) 2015-05-14 2019-03-26 Deephaven Data Labs Llc Dynamic filter processing
US10241960B2 (en) 2015-05-14 2019-03-26 Deephaven Data Labs Llc Historical data replay utilizing a computer system
US10242040B2 (en) 2015-05-14 2019-03-26 Deephaven Data Labs Llc Parsing and compiling data system queries
US9613018B2 (en) 2015-05-14 2017-04-04 Walleye Software, LLC Applying a GUI display effect formula in a hidden column to a section of data
US10353893B2 (en) 2015-05-14 2019-07-16 Deephaven Data Labs Llc Data partitioning and ordering
US10452649B2 (en) 2015-05-14 2019-10-22 Deephaven Data Labs Llc Computer data distribution architecture
US11687529B2 (en) 2015-05-14 2023-06-27 Deephaven Data Labs Llc Single input graphical user interface control element and method
US9619210B2 (en) 2015-05-14 2017-04-11 Walleye Software, LLC Parsing and compiling data system queries
US9805084B2 (en) 2015-05-14 2017-10-31 Walleye Software, LLC Computer data system data source refreshing using an update propagation graph
US10565194B2 (en) 2015-05-14 2020-02-18 Deephaven Data Labs Llc Computer system for join processing
US10565206B2 (en) 2015-05-14 2020-02-18 Deephaven Data Labs Llc Query task processing based on memory allocation and performance criteria
US10572474B2 (en) 2015-05-14 2020-02-25 Deephaven Data Labs Llc Computer data system data source refreshing using an update propagation graph
US10621168B2 (en) 2015-05-14 2020-04-14 Deephaven Data Labs Llc Dynamic join processing using real time merged notification listener
US10642829B2 (en) 2015-05-14 2020-05-05 Deephaven Data Labs Llc Distributed and optimized garbage collection of exported data objects
US11514037B2 (en) 2015-05-14 2022-11-29 Deephaven Data Labs Llc Remote data object publishing/subscribing system having a multicast key-value protocol
US10678787B2 (en) 2015-05-14 2020-06-09 Deephaven Data Labs Llc Computer assisted completion of hyperlink command segments
US10691686B2 (en) 2015-05-14 2020-06-23 Deephaven Data Labs Llc Computer data system position-index mapping
US11263211B2 (en) 2015-05-14 2022-03-01 Deephaven Data Labs, LLC Data partitioning and ordering
US11249994B2 (en) 2015-05-14 2022-02-15 Deephaven Data Labs Llc Query task processing based on memory allocation and performance criteria
US11238036B2 (en) 2015-05-14 2022-02-01 Deephaven Data Labs, LLC System performance logging of complex remote query processor query operations
US10915526B2 (en) 2015-05-14 2021-02-09 Deephaven Data Labs Llc Historical data replay utilizing a computer system
US10922311B2 (en) 2015-05-14 2021-02-16 Deephaven Data Labs Llc Dynamic updating of query result displays
US10929394B2 (en) 2015-05-14 2021-02-23 Deephaven Data Labs Llc Persistent query dispatch and execution architecture
US11151133B2 (en) 2015-05-14 2021-10-19 Deephaven Data Labs, LLC Computer data distribution architecture
US9613109B2 (en) 2015-05-14 2017-04-04 Walleye Software, LLC Query task processing based on memory allocation and performance criteria
US11023462B2 (en) 2015-05-14 2021-06-01 Deephaven Data Labs, LLC Single input graphical user interface control element and method
CN108243062A (zh) * 2016-12-27 2018-07-03 General Electric Company System for detecting machine-initiated events in time series data
US10783191B1 (en) 2017-08-24 2020-09-22 Deephaven Data Labs Llc Computer data distribution architecture for efficient distribution and synchronization of plotting processing and data
US10002154B1 (en) 2017-08-24 2018-06-19 Illumon Llc Computer data system data source having an update propagation graph with feedback cyclicality
US10657184B2 (en) 2017-08-24 2020-05-19 Deephaven Data Labs Llc Computer data system data source having an update propagation graph with feedback cyclicality
US10866943B1 (en) 2017-08-24 2020-12-15 Deephaven Data Labs Llc Keyed row selection
US11941060B2 (en) 2017-08-24 2024-03-26 Deephaven Data Labs Llc Computer data distribution architecture for efficient distribution and synchronization of plotting processing and data
US11449557B2 (en) 2017-08-24 2022-09-20 Deephaven Data Labs Llc Computer data distribution architecture for efficient distribution and synchronization of plotting processing and data
US10909183B2 (en) 2017-08-24 2021-02-02 Deephaven Data Labs Llc Computer data system data source refreshing using an update propagation graph having a merged join listener
US10241965B1 (en) 2017-08-24 2019-03-26 Deephaven Data Labs Llc Computer data distribution architecture connecting an update propagation graph through multiple remote query processors
US11574018B2 (en) 2017-08-24 2023-02-07 Deephaven Data Labs Llc Computer data distribution architecture connecting an update propagation graph through multiple remote query processing
US10198469B1 (en) 2017-08-24 2019-02-05 Deephaven Data Labs Llc Computer data system data source refreshing using an update propagation graph having a merged join listener
US11860948B2 (en) 2017-08-24 2024-01-02 Deephaven Data Labs Llc Keyed row selection
US11126662B2 (en) 2017-08-24 2021-09-21 Deephaven Data Labs Llc Computer data distribution architecture connecting an update propagation graph through multiple remote query processors
CN112491917A (zh) * 2020-12-08 2021-03-12 物鼎安全科技(武汉)有限公司 Method and apparatus for identifying unknown vulnerabilities in Internet-of-Things devices
CN112491917B (zh) * 2020-12-08 2021-05-28 物鼎安全科技(武汉)有限公司 Method and apparatus for identifying unknown vulnerabilities in Internet-of-Things devices

Also Published As

Publication number Publication date
AU2013302297A1 (en) 2015-02-19
AU2013302297B2 (en) 2020-04-30
US20150156213A1 (en) 2015-06-04
US9578046B2 (en) 2017-02-21

Similar Documents

Publication Publication Date Title
AU2013302297B2 (en) Analysis of time series data
US11924251B2 (en) System and method for cybersecurity reconnaissance, analysis, and score generation using distributed systems
US11483332B2 (en) System and method for cybersecurity analysis and score generation for insurance purposes
US11637869B2 (en) System and method for self-adjusting cybersecurity analysis and score generation
US10984010B2 (en) Query summary generation using row-column data storage
US9003023B2 (en) Systems and methods for interactive analytics of internet traffic
US11516233B2 (en) Cyber defense system
US20120011590A1 (en) Systems, methods and devices for providing situational awareness, mitigation, risk analysis of assets, applications and infrastructure in the internet and cloud
CN111052704A (zh) Network analytics workflow acceleration
US12003544B2 (en) System and methods for automatically assessing and improving a cybersecurity risk score
US10623371B2 (en) Providing network behavior visibility based on events logged by network security devices
WO2021202833A1 (fr) System and method for self-adjusting cybersecurity analysis and score generation
WO2021243321A1 (fr) System and methods for cybersecurity scoring
US11968235B2 (en) System and method for cybersecurity analysis and protection using distributed systems
EP3346666B1 (fr) Système prédictif configuré pour modéliser le nombre attendu d'attaques sur un ordinateur ou un réseau de communication
Lee et al. Building a big data platform for large-scale security data analysis
Baker et al. Finding needles in haystacks (the size of countries)
Zheng et al. The research about data mining of network intrusion based on Apriori algorithm
Li et al. The design and implementation of network alarm and data fusion analysis systems based on cloud computing

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 13829966

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 14405684

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2013302297

Country of ref document: AU

Date of ref document: 20130809

Kind code of ref document: A

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 17/06/2015)

122 EP: PCT application non-entry in European phase

Ref document number: 13829966

Country of ref document: EP

Kind code of ref document: A1