WO2017205845A1 - System for automated capture and analysis of business information - Google Patents

System for automated capture and analysis of business information

Info

Publication number
WO2017205845A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
business
analysis
information
memory
Prior art date
Application number
PCT/US2017/034861
Other languages
French (fr)
Inventor
Jason Crabtree
Andrew Sellers
Original Assignee
Fractal Industries, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/166,158 (US20170124501A1)
Application filed by Fractal Industries, Inc.
Publication of WO2017205845A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling

Definitions

  • the present invention is in the field of use of computer systems in business information management, operations and predictive planning. Specifically, the
  • PALANTIR™ offers software to isolate patterns in large volumes of data
  • DATABRICKS™ offers custom analytics services
  • ANAPLAN™ offers financial impact calculation services, and there are other software sources that mitigate some aspect of business data relevancy identification, analysis of that data and business decision automation, but none of these solutions handles more than a single aspect of the whole task.
  • products like DATADOG™ and BMC INSIGHT™ allow businesses to monitor the function of their IT infrastructure and business software but lack the ability to perform complex operational queries across large sections of that infrastructure so as to forecast impending bottlenecks, deficiencies and even customer-facing failures.
  • the inventor has developed a distributed system for the fully integrated retrieval, and deep analysis of business operational information from a plurality of sources.
  • the system further uses results of business information analytics to optimize the making of business decisions and allow for alternate action pathways to be simulated using the latest data and machine mediated prediction algorithms.
  • portions of the system are applied to the areas of IT security and predictively enhancing the reliability of client-facing IT infrastructure.
  • a system for fully integrated collection of business impacting data, analysis of that data and generation of both analysis driven business decisions and analysis driven simulations of alternate candidate business decisions comprising: a business data retrieval engine stored in a memory of and operating on a processor of a computing device, a business data analysis engine stored in a memory of and operating on a processor of a computing device and a business decision and business action path simulation engine stored in a memory of and operating on a processor of one or more computing devices.
  • the business information retrieval engine retrieves a plurality of business related data from a plurality of sources, accepts a plurality of analysis parameters and control commands directly from human interface devices or from one or more command and control storage devices and stores accumulated retrieved information for processing by data analysis engine or
  • the business information analysis engine retrieves a plurality of data types from the business information retrieval engine, and performs a plurality of analytical functions and transformations on retrieved data based upon the specific goals and needs set forth in a current campaign by business process analysis authors.
  • the business decision and business action path simulation engine employs results of data analyses and transformations performed by the business information analysis engine, together with available supplemental data from a plurality of sources as well as any current campaign specific machine learning, commands and parameters from business process analysis authors to formulate current business operations and risk status reports and employs results of data analyses and transformations performed by the business information analysis engine, together with available supplemental data from a plurality of sources, any current campaign specific commands and parameters from business process analysis authors, as well as input gleaned from machine learned algorithms to deliver business action pathway simulations and business decision support to a first end user.
  • the system's business information retrieval engine, stored in the memory of and operating on a processor of a computing device, employs a portal for human interface device input, at least a portion of which is business related data and at least another portion of which is commands and parameters related to the conduct of a current business analysis campaign.
  • the business information retrieval engine employs a high volume deep web scraper stored in the memory of and operating on a processor of a computing device, which receives at least some scrape control and spider configuration parameters from the highly customizable cloud based interface, coordinates one or more world wide web searches (scrapes) using both general search control parameters and individual web search agent (spider) specific configuration data, receives scrape progress feedback information which may lead to issuance of further web search control parameters, controls and monitors the spiders on distributed scrape servers, receives the raw scrape campaign data from scrape servers, and aggregates at least portions of scrape campaign data from each web site or web page traversed as per the parameters of the scrape campaign.
  • the archetype spiders are provided by a program library and individual spiders are created using configuration files. Scrape campaign requests are persistently stored and can be reused or used as the basis for similar scrape campaigns.
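The relationship between library-provided archetypes and per-spider configuration files can be sketched as follows. This is a minimal illustration only: the archetype names, the settings keys, and the `SpiderConfig` and `build_spider` helpers are hypothetical and do not come from the patent or from any particular scraping framework.

```python
from dataclasses import dataclass, field

# Hypothetical archetype library: each archetype supplies default behaviour.
ARCHETYPES = {
    "news": {"follow_links": True, "max_depth": 2},
    "forum": {"follow_links": True, "max_depth": 5},
}


@dataclass
class SpiderConfig:
    """Stand-in for one spider's configuration file (assumed fields)."""
    name: str
    archetype: str
    start_urls: list
    overrides: dict = field(default_factory=dict)


def build_spider(config: SpiderConfig) -> dict:
    """Merge archetype defaults with the spider-specific configuration."""
    if config.archetype not in ARCHETYPES:
        raise ValueError(f"unknown archetype: {config.archetype}")
    settings = dict(ARCHETYPES[config.archetype])  # copy the defaults
    settings.update(config.overrides)              # spider-specific values win
    settings["name"] = config.name
    settings["start_urls"] = list(config.start_urls)
    return settings


# A persistently stored campaign request is just a list of configurations,
# so it can be reloaded and reused as the basis for a similar campaign.
campaign = [
    SpiderConfig("acme-news", "news", ["https://example.com/news"], {"max_depth": 3}),
]
spiders = [build_spider(c) for c in campaign]
```

Keeping the archetype defaults separate from the per-spider overrides is what makes a stored campaign reusable: a new campaign only needs to change the overrides.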
  • multidimensional time series data store stored in a memory of and operating on a processor of a computing device to receive a plurality of data from a plurality of sensors of heterogeneous types, some of which may have heterogeneous reporting and data payload transmission profiles, aggregates the sensor data over a predetermined amount of time, a predetermined quantity of data or a predetermined number of events, retrieves a specific quantity of aggregated sensor data per each access connection predetermined to allow reliable receipt and inclusion of the data, transparently retrieves quantities of aggregated sensor data too large to be reliably transferred by one access connection using a further plurality of access connections to allow capture of all aggregated sensor data under conditions of heavy sensor data influx and stores aggregated sensor data in a simple key-value pair with very little or no data transformation from how the aggregated sensor data is received.
  • the business data analysis engine employs a directed computational graph stored in the memory of and operating on a processor of a computing device which retrieves streams of input from one or more of a plurality of data sources, filters data to remove data records from the stream for a plurality of reasons drawn from, but not limited to, a set comprising absence of all information, damage to data in the record, and presence of incongruent information or missing information which invalidates the data record, splits the filtered data stream into two or more identical parts, formats data within one data stream based upon a set of predetermined parameters so as to prepare for meaningful storage in a data store, and sends the identical data stream for further analysis and either linear transformation or branching transformation using resources of the system.
  • a method for fully integrated capture and transformative analysis of business impactful information resulting in predictive decision making and simulation comprising the steps of: (a) retrieving business related data and analysis campaign command and control information using a business information retrieval engine stored in the memory of and operating on a processor of a computing device; (b) analyzing and transforming retrieved business related data using a business information analysis engine stored in the memory of and operating on a processor of a computing device in conjunction with previously designed analysis campaign command and control information; and (c) presenting business decision critical information as well as business action pathway simulation information using a business decision and business action path simulation engine based upon the results of analysis of previously retrieved business related data and previously entered analysis campaign command and control information.
  • Kerberos based security exploits using a system for fully integrated capture and analysis of business information, the method comprising the steps of: (a) retrieving ticket granting ticket request information, service session key request information, and user sign on attempt data from a Kerberos domain controller using a multidimensional time series database module stored in a memory of and operating on a processor of a computing device; (b) applying any pre-programmed multiple dimensional time series event-condition-action rules that are present and apply to Kerberos protocol events using the multidimensional time series database module; (c) performing conversion of data into graphs where objects are vertices and their relationships are edges between vertices using a graph stack service stored in a memory of and operating on a processor of a computing device; and (d) performing an analytical transformation using a directed computational graph module.
  • This technology stack may be applied without loss of generality to other problems, according to the invention.
  • a method to monitor the function of business critical IT infrastructure and business software performance using a system for fully integrated capture, and analysis of business information resulting in improved client-facing IT infrastructure reliability comprising the steps of: (a) Monitor IT equipment and application status statistics as well as failure messages using a multidimensional time series database module stored in a memory of and operating on a processor of a computing device, (b) Process the data retrieved from multidimensional time series database module using a graph stack service stored in a memory of and operating on a processor of a computing device with infrastructure items and software forming vertices of a relational graph and relationships between them forming edges of the graph, (c) Transform data acquired by the multidimensional time series database module using directed computational graph to formulate more complex diagnostic queries based upon the existing data using pre-programmed logic and machine learning and then process the results of those complex queries as predetermined by authors of the monitoring effort, (d) Present the results in format best suited to the downstream use of the processed data and wherein at least one set
  • FIG. 1 is a diagram of an exemplary architecture of a business operating system according to an embodiment of the invention.
  • Fig. 2 is a process flow diagram showing an exemplary set of steps used in the function of the very high bandwidth cloud interface.
  • FIG. 3 is a diagram of an exemplary architecture for a linear transformation pipeline system which introduces the concept of the transformation pipeline as a directed graph of transformation nodes and messages according to an embodiment of the invention.
  • FIG. 4 is a block diagram illustrating an exemplary hardware architecture of a computing device used in various embodiments of the invention.
  • FIG. 5 is a block diagram illustrating an exemplary logical architecture for a client device, according to various embodiments of the invention.
  • FIG. 6 is a block diagram illustrating an exemplary architectural arrangement of clients, servers, and external services, according to various embodiments of the invention.
  • FIG. 7 is another block diagram illustrating an exemplary hardware architecture of a computing device used in various embodiments of the invention.
  • the inventor has conceived, and reduced to practice, a system and method for fully integrated capture and analysis of business information resulting in predictive decision making and simulation.
  • Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise.
  • devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries, logical or physical.
  • a description of an embodiment with several components in communication with each other does not imply that all such components are required. To the contrary, a variety of optional components may be described to illustrate a wide variety of possible embodiments of one or more of the inventions and in order to more fully illustrate one or more aspects of the inventions.
  • although process steps, method steps, algorithms or the like may be described in a sequential order, such processes, methods and algorithms may generally be configured to work in alternate orders, unless specifically stated to the contrary.
  • any sequence or order of steps that may be described in this patent application does not, in and of itself, indicate a requirement that the steps be performed in that order.
  • the steps of described processes may be performed in any order practical. Further, some steps may be performed simultaneously despite being described or implied as occurring sequentially (e.g., because one step is described after the other step).
  • the illustration of a process by its depiction in a drawing does not imply that the illustrated process is exclusive of other variations and modifications thereto, does not imply that the illustrated process or any of its steps are necessary to one or more of the invention(s), and does not imply that the illustrated process is preferred.
  • steps are generally described once per embodiment, but this does not mean they must occur once, or that they may only occur once each time a process, method, or algorithm is carried out or executed. Some steps may be omitted in some embodiments or some occurrences, or some steps may be executed more than once in a given embodiment or occurrence.
  • a "swimlane" is a communication channel between a time series sensor data reception and apportioning device and a data store meant to hold the apportioned time series sensor data.
  • a swimlane is able to move a specific, finite amount of data between the two devices. For example, a single swimlane might reliably carry, and have incorporated into the data store, the data equivalent of 5 seconds' worth of data from 5 sensors in 5 seconds, this being its capacity. Attempts to place 5 seconds' worth of data received from 6 sensors using one swimlane would result in data loss.
  • a "metaswimlane" is an as-needed logical combination of transfer capacity of two or more real swimlanes that is transparent to the requesting process.
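The capacity arithmetic behind a metaswimlane can be sketched as below. The per-swimlane capacity of 4 sensors and the 10-sensor load are illustrative figures chosen for this example, and both function names are hypothetical; the sketch only shows how a load larger than one swimlane's capacity could be split across several real swimlanes while the caller sees one logical channel.

```python
import math


def swimlanes_needed(sensor_count: int, sensors_per_swimlane: int) -> int:
    """Number of real swimlanes required to carry the load without loss."""
    if sensors_per_swimlane <= 0:
        raise ValueError("capacity must be positive")
    return math.ceil(sensor_count / sensors_per_swimlane)


def allocate_metaswimlane(sensor_ids: list, sensors_per_swimlane: int) -> list:
    """Partition sensors across swimlanes; the result acts as one logical lane."""
    return [
        sensor_ids[i:i + sensors_per_swimlane]
        for i in range(0, len(sensor_ids), sensors_per_swimlane)
    ]


# Illustrative load: 10 sensors, each swimlane assumed to carry 4.
sensors = [f"sensor-{n}" for n in range(10)]
lanes = allocate_metaswimlane(sensors, sensors_per_swimlane=4)
```

With these assumed figures the load needs three swimlanes, holding four, four and two sensors respectively.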
  • Fig. 1 is a diagram of an exemplary architecture of a business operating system 100 according to an embodiment of the invention.
  • COUCHDB™, CASSANDRA™ or REDIS™ depending on the embodiment.
  • the directed computational graph module 155 retrieves one or more streams of data from a plurality of sources, which includes, but is in no way limited to, a plurality of physical sensors, web based questionnaires and surveys, monitoring of electronic infrastructure, crowd sourcing campaigns, and human input device information.
  • data may be split into two identical streams in a specialized pre-programmed data pipeline 155a, wherein one sub-stream may be sent for batch processing and storage while the other sub-stream may be reformatted for transformation pipeline analysis.
  • the data is then transferred to the general transformer service module 160 for linear data transformation as part of analysis or the decomposable transformer service module 150 for branching or iterative transformations that are part of analysis.
  • the directed computational graph module 155 represents all data as directed graphs where the transformations are nodes and the result messages between transformations edges of the graph.
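The filter, split and transform flow described for the directed computational graph module can be sketched with standard-library tools. The record schema (`sensor`, `value`) and the doubling step used as the "analysis" transformation are placeholders assumed for illustration, not details from the patent.

```python
from itertools import tee


def is_valid(record) -> bool:
    """Reject empty records and records missing a required field (assumed schema)."""
    return bool(record) and all(record.get(k) is not None for k in ("sensor", "value"))


def format_for_storage(record: dict) -> dict:
    """Normalise types so the record can be stored meaningfully."""
    return {"sensor": str(record["sensor"]), "value": float(record["value"])}


def run_pipeline(stream):
    """Filter the stream, split it into two identical sub-streams, process each."""
    filtered = (r for r in stream if is_valid(r))
    for_storage, for_analysis = tee(filtered, 2)  # two identical sub-streams
    # Consuming one branch fully makes tee buffer the records for the other,
    # which is acceptable for a small sketch but not for unbounded streams.
    stored = [format_for_storage(r) for r in for_storage]
    analysed = [float(r["value"]) * 2 for r in for_analysis]  # placeholder transformation
    return stored, analysed


raw = [
    {"sensor": "a", "value": "1.5"},
    {},                               # absence of all information
    {"sensor": "b", "value": None},   # missing information invalidates the record
    {"sensor": "c", "value": 2},
]
stored, analysed = run_pipeline(raw)
```

In graph terms, `is_valid`, `format_for_storage` and the placeholder transformation play the role of transformation nodes, and the records passed between them are the messages on the edges.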
  • the high volume web crawling module 115 uses multiple server hosted preprogrammed web spiders, which while autonomously configured are deployed within a web scraping framework 115a of which SCRAPYTM is an example, to identify and retrieve data of interest from web based sources that are not well tagged by conventional web crawling technology.
  • the multiple dimension time series database module 120 receives data from a large plurality of sensors that may be of several different types. The module is designed to accommodate irregular and high volume surges by dynamically allotting network bandwidth and server processing channels to process the incoming data.
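Aggregating sensor data over a predetermined number of events or amount of time, and then storing it as a key-value pair with little transformation, might look like the sketch below. The `SensorAggregator` class, its thresholds and the in-memory dictionary standing in for the data store are all assumptions made for illustration.

```python
class SensorAggregator:
    """Buffer readings and flush them when an event or time threshold is reached."""

    def __init__(self, max_events: int = 3, max_seconds: float = 5.0):
        self.max_events = max_events
        self.max_seconds = max_seconds
        self.buffer = []
        self.window_start = None
        self.store = {}  # stand-in for a key-value data store

    def receive(self, sensor_id: str, timestamp: float, payload) -> None:
        if self.window_start is None:
            self.window_start = timestamp
        self.buffer.append((sensor_id, timestamp, payload))
        window_full = len(self.buffer) >= self.max_events
        window_old = timestamp - self.window_start >= self.max_seconds
        if window_full or window_old:
            self.flush()

    def flush(self) -> None:
        """Store the buffered readings as received, keyed by the window bounds."""
        if not self.buffer:
            return
        key = (self.window_start, self.buffer[-1][1])
        self.store[key] = list(self.buffer)
        self.buffer.clear()
        self.window_start = None


agg = SensorAggregator(max_events=3, max_seconds=5.0)
agg.receive("temp-1", 0.0, 21.5)
agg.receive("pressure-1", 1.0, {"psi": 880})   # heterogeneous payloads are kept as-is
agg.receive("temp-1", 2.0, 21.7)               # third event triggers a flush
agg.receive("temp-1", 10.0, 21.9)
agg.receive("pressure-1", 16.0, {"psi": 879})  # six seconds elapsed triggers a flush
```

Here the first window closes on the event-count threshold and the second on the elapsed-time threshold; payloads of different shapes are stored untouched.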
  • decomposable transformer service 150 modules.
  • Alternately, data from the multidimensional time series database and high volume web crawling modules may be sent, often with scripted cuing information determining important vertices 145a, to the graph stack service module 145, which employs standardized protocols for converting streams of information into graph representations of that data, for example, open graph internet technology, although the invention is not reliant on any one standard.
  • the graph stack service module 145 represents data in graphical form influenced by any pre-determined scripted modifications 145a and stores it in a graph data store 145b such as GIRAPH™ or a key-value pair type data store such as REDIS™ or RIAK™, among others, all of which are suitable for storing graph represented information.
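One way graph-represented information could be held in a key-value pair type store is to write each vertex as a key whose value is its serialised list of outgoing edges. The sketch below uses a plain dictionary as a stand-in for the store; the key naming scheme, relation labels and helper names are assumptions for illustration rather than the layout of any product named above.

```python
import json


def store_graph(kv: dict, relations: list) -> None:
    """Write each vertex and its outgoing edges as one key-value pair."""
    adjacency = {}
    for src, relation, dst in relations:
        adjacency.setdefault(src, []).append({"rel": relation, "to": dst})
        adjacency.setdefault(dst, [])  # vertices with no outgoing edges still get a key
    for vertex, edges in adjacency.items():
        kv[f"vertex:{vertex}"] = json.dumps(edges)


def neighbours(kv: dict, vertex: str) -> list:
    """Read back the vertices directly reachable from the given vertex."""
    return [edge["to"] for edge in json.loads(kv.get(f"vertex:{vertex}", "[]"))]


kv_store = {}  # plain dict standing in for a key-value data store
store_graph(kv_store, [
    ("Data Center 1", "hosts", "System 1"),
    ("System 1", "reports", "CPU metrics"),
])
```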
  • Results of the transformative analysis process may then be combined with further client directives, additional business rules and practices relevant to the analysis and situational information external to the already available data in the automated planning service module 130, which also runs powerful information theory 130a based predictive statistics functions and machine learning algorithms to allow future trends and outcomes to be rapidly forecast based upon the current system derived results and choosing each of a plurality of possible business decisions.
  • the automated planning service module 130 may propose business decisions most likely to result in the most favorable business outcome with a usably high level of certainty.
  • the action outcome simulation module 125 with its discrete event simulator programming module 125a coupled with the end user facing observation and state estimation service 140 which is highly scriptable 140b as circumstances require and has a game engine 140a to more realistically stage possible outcomes of business decisions under consideration, allows business decision makers to investigate the probable outcomes of choosing one pending course of action over another based upon analysis of the current available data.
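Comparing the probable outcomes of alternative courses of action can be illustrated with a small Monte Carlo sketch. Note the substitution: simple random sampling stands in here for the discrete event simulator and game engine, and the outcome model (a mean payoff with Gaussian noise), the decision names and all numbers are invented for illustration.

```python
import random
import statistics


def simulate_outcome(decision: dict, rng: random.Random) -> float:
    """Hypothetical outcome model: expected payoff plus Gaussian noise."""
    return rng.gauss(decision["expected"], decision["spread"])


def rank_decisions(decisions: list, runs: int = 2000, seed: int = 0) -> dict:
    """Estimate the mean payoff and the probability of a loss for each decision."""
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    results = {}
    for decision in decisions:
        samples = [simulate_outcome(decision, rng) for _ in range(runs)]
        results[decision["name"]] = {
            "mean": statistics.mean(samples),
            "p_loss": sum(s < 0 for s in samples) / runs,
        }
    return results


candidates = [
    {"name": "inspect_now", "expected": 10.0, "spread": 2.0},
    {"name": "defer", "expected": -5.0, "spread": 2.0},
]
results = rank_decisions(candidates)
best = max(results, key=lambda name: results[name]["mean"])
```

Reporting a loss probability alongside the mean is one way to express the "usably high level of certainty" that a proposed decision should carry.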
  • the pipelines operations department has reported a very small reduction in crude oil pressure in a section of pipeline in a highly remote section of territory.
  • Fig. 2 is a process flow diagram showing an exemplary set of steps by an
  • the multidimensional time series database (MDTSDB) as depicted in Fig. 1, 120 may be programmed to retrieve all KERBEROS™ domain controller ticket requests, including ticket-granting-ticket requests, service-session ticket requests, and user sign on attempts from one or more of the business's domain controllers.
  • the MDTSDB may also retrieve information such as but not necessarily limited to: the userid attached to each request, the time of the request, the workstation from which the request was made, and for derivative credentials such as a service-ticket, the requesting ticket-granting-ticket credential.
  • the MDTSDB may transform the data in some way.
  • That data may then go to the graph stack service module depicted in 145, which may map the data into a relational graph where the objects, for example a user 212, 218, the KERBEROS™ ticket granting service (KTGS, 210) and requested services 214, 216, 220, 224, are represented as graph vertices and the information passage relationships between them 211, 213, 215, 217, 219, 221, 223 form the edges of the graph 203.
  • Such graph analysis may allow abnormal activity to be rapidly identified. Edge 213 and vertex 214 show an extremely simplified example of a service ticket exploit, where a user illegally accesses a system or network service for which she has no authority, and 221 shows a case where an intruder gains access to the system using an exploitively gained ticket-granting-ticket and then uses a system service.
  • Panel 225 shows a highly simplified example of time series authorization data that may be attached to a specific service and shows the granting ticket used to access the service, the service key that allows access, the user that requested the service and a timestamp for receipt of the information.
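The two kinds of abnormal activity described above can be sketched as checks over a small user-to-service graph. Everything concrete here is invented for illustration: the user names, ticket identifiers, the `issued_tgts` and `authorized` tables, and the two checks themselves, which are far simpler than real Kerberos validation.

```python
from collections import defaultdict

# Assumed reference data: ticket-granting tickets recorded as issued by the
# ticket granting service, and the services each user may legitimately use.
issued_tgts = {"tgt-1": "alice", "tgt-2": "bob"}
authorized = {"alice": {"mail", "files"}, "bob": {"mail"}}

service_requests = [
    {"user": "alice", "tgt": "tgt-1", "service": "mail", "time": 100},
    {"user": "bob", "tgt": "tgt-2", "service": "payroll", "time": 105},
    {"user": "mallory", "tgt": "tgt-9", "service": "files", "time": 110},
]


def build_graph(requests: list) -> dict:
    """Users and services are vertices; each request is an edge carrying its data."""
    graph = defaultdict(list)
    for request in requests:
        graph[request["user"]].append((request["service"], request))
    return graph


def find_anomalies(graph: dict) -> list:
    """Flag edges that use an unrecognised ticket or reach an unauthorized service."""
    anomalies = []
    for user, edges in graph.items():
        for service, request in edges:
            if issued_tgts.get(request["tgt"]) != user:
                anomalies.append((user, service, "unknown or mismatched TGT"))
            elif service not in authorized.get(user, set()):
                anomalies.append((user, service, "service not authorized"))
    return anomalies


anomalies = find_anomalies(build_graph(service_requests))
```

In this toy data the second request corresponds to the unauthorized service access case and the third to the case of a ticket-granting-ticket that was never legitimately issued.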
  • the directed computational graph module depicted in 155 with its multi-transformation capable data pipeline depicted in 155a, 150 (non-linear transformations), 160 (linear transformations) and machine learning abilities may be used to deeply analyze the data retrieved by the MDTSDB depicted in 120 in complex ways which may allow prediction of an impending security exploit.
  • mass sign on attempts from IP address ranges of an organization known to infiltrate KERBEROS™ domain controllers similar to that of the client business may occur during off hours every third day and this may be uncovered during directed computational graph 155 analysis.
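A periodic off-hours pattern of this kind could be surfaced with a check like the one below. The suspect network (a reserved documentation range), the business hours, the per-day threshold and the helper names are assumptions for illustration; the sketch is a hand-written rule, not the machine-learned analysis the module is described as performing.

```python
import ipaddress
from collections import Counter
from datetime import datetime

# Placeholder watch list using a reserved documentation address range.
SUSPECT_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_off_hours(ts: datetime, start_hour: int = 8, end_hour: int = 18) -> bool:
    """Treat anything outside the assumed business hours as off hours."""
    return not (start_hour <= ts.hour < end_hour)


def suspicious_days(attempts: list, threshold: int = 3) -> list:
    """Days with at least `threshold` off-hours attempts from suspect networks."""
    counts = Counter()
    for ts, ip in attempts:
        address = ipaddress.ip_address(ip)
        if is_off_hours(ts) and any(address in net for net in SUSPECT_NETWORKS):
            counts[ts.date()] += 1
    return sorted(day for day, n in counts.items() if n >= threshold)


def has_fixed_period(days: list, period: int = 3) -> bool:
    """True when consecutive suspicious days are exactly `period` days apart."""
    return len(days) >= 2 and all(
        (later - earlier).days == period for earlier, later in zip(days, days[1:])
    )


attempts = []
for day in (1, 4, 7):  # bursts every third day at 02:00
    attempts += [(datetime(2024, 1, day, 2, minute), "203.0.113.5") for minute in range(3)]
attempts.append((datetime(2024, 1, 2, 11, 0), "203.0.113.5"))   # business hours
attempts.append((datetime(2024, 1, 3, 3, 0), "198.51.100.1"))   # not on the watch list

days = suspicious_days(attempts)
periodic = has_fixed_period(days)
```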
  • Output (not depicted) would be formatted to best serve its pre-decided purpose.
  • Fig. 3 is a process flow diagram of a preferred method for use of the invention to monitor enterprise IT infrastructure both hardware and software, especially customer facing services supported by this infrastructure for slowdown, bottleneck or failure.
  • IT infrastructure including, but not limited to, workstations 301a, servers 301b, 301c and peripherals, which may include printers 301d, is monitored by the multidimensional time series database (MDTSDB) module depicted in 120, 301, possibly using a standard network messaging protocol such as SNMP, as an example, or a specifically programmed adaptor present in the MDTSDB for this purpose.
  • the MDTSDB may transform data as it is captured depending on the requirements of the task prior to the data passing to other modules of the embodiment such as the graph stack service module depicted in 145, 302 where the data from the MDTSDB is processed into an open graph IT ontology compliant relational graph representation as a prelude to further analysis.
  • the relational graph is created within the graph stack service by assigning the objects of the system under analysis, for example: data centers, servers, workstations, and peripherals, as vertices of the graph and the relationships between them as edges.
  • A generic example of such a relational graph is depicted in FIG. 3, 320, 320a through 320k. Looking at 320a through 320k, line 320a depicts the relationship between the "Data Center 1" vertex 320 and the generic "System 1" vertex 320b, thus 320a may be thought of as the "Data Center 1"-"System 1" relation. From "System 1" 320b, the graph progresses along a complexity gradient, first encountering metric group vertices 320c, 320g, 320i and then the individual metrics measured within those groups, 320d, 320e, 320k.
  • the vertices that occur as constituents of "Titan Server" 320l in this example are "CPU" 320m, "memory" 320z and "peripheral" 320u with the direct connection relationships denoted by the graph edges between them and "Titan Server" 320l. From the graph, it can be easily seen that "Titan Server" 320l has a CPU 320m with two cores, "core 1" 320n and "core 2" 320p, one occupied memory slot, "slot 2" 320q, and a directly connected printer "printer 47" 320s, "scanner 120" 320x and "RAID 3" 320v designated as peripherals "periph." 320u.
  • MDTSDB (see 120) captured data pertaining to "core 1" 320n, the memory in "slot 2" 320q of memory 320z, "printer 47" 320s, "scanner 120" 320x and "RAID 3" 320v are displayed in 320o, 320r, 320t, 320y and 320w respectively. Focusing on the component data display for "printer 47" 320t, one can determine that the current toner cartridge has approximately 35% toner remaining and the current fuser has printed 10,000 pages. The timestamp indicates when the data was collected. The data displays for the other components show comparably useful information.
  • the server specific graph shown, 320l through 320z, is extensively simplified in that only a very few of the possible component groups (CPU, memory, peripherals) to be monitored are depicted, a minimal number of underlying components 320n, 320p, 320q, 320s, 320x, 320v are present and data shown in the data displays 320o, 320r, 320t, 320y, 320w is minimal, incomplete and haphazardly chosen. Also, vertex data displays 320o, 320r, 320t, 320y, 320w occur only at the termini. All of these characteristics of the example graph are present solely for presentation clarity purposes and in no way should be interpreted as limiting the invention. The invention is able to monitor any reported component characteristic known to those knowledgeable in the field.
  • the graph stack service module (see 145) is able to map relationships of any foreseeable complexity, and while the example data displays 320o, 320r, 320t, 320y, 320w showed a few lines of data and all displays were at terminal vertices, the length or content of the data displayed is not limited by the invention and data displays can be associated with any vertex of the graph. So, for example, one could cause a possibly very lengthy data display showing all pre-determined applicable data to be shown for the "Titan Server" vertex 320l.
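The idea that a data display can be attached to any vertex, not only to terminal ones, can be sketched by collecting the captured data for a vertex together with everything beneath it. The vertex names and the printer figures follow the example in the text; the dictionary layout and the two helper functions are assumptions for illustration.

```python
# Adjacency list for the simplified server graph described in the text.
graph = {
    "Titan Server": ["CPU", "memory", "periph."],
    "CPU": ["core 1", "core 2"],
    "memory": ["slot 2"],
    "periph.": ["printer 47", "scanner 120", "RAID 3"],
}

# Captured data attached to vertices; only the printer values come from the text.
captured = {"printer 47": {"toner_pct": 35, "fuser_pages": 10000}}


def descendants(graph: dict, root: str) -> list:
    """All vertices reachable from root, found with an iterative traversal."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        for child in graph.get(node, []):
            if child not in found:
                found.append(child)
                stack.append(child)
    return found


def data_display(graph: dict, captured: dict, vertex: str) -> dict:
    """Collect the data attached to a vertex and to every vertex beneath it."""
    nodes = [vertex] + descendants(graph, vertex)
    return {node: captured[node] for node in nodes if node in captured}


server_display = data_display(graph, captured, "Titan Server")
```

Calling `data_display` on "Titan Server" gathers the printer data from further down the graph, which is the non-terminal display the paragraph describes; calling it on "CPU" returns nothing because no data was attached beneath that vertex in this toy example.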
  • the directed computational graph module depicted in 155 with its multi-transformation capable data pipeline depicted in 155a, 150 (non-linear transformations), 160 (linear transformations) and machine learning abilities may be used to deeply analyze the data retrieved by the MDTSDB depicted in 120 in complex ways which, when coupled with historic data that may span months or years, may allow prediction of an impending degradation or loss of a business's customer facing IT services, whether hardware or software is the root cause.
  • web site based requests for further information about a business's newer product lines may, on reaching a certain level, cause thrashing and bottlenecks in the database storing those documents. Up until this point the load has caused negligible loss of retrieval speed, but recent historical data shows that demand for the documents is building and the issue in the database is escalating at a disproportionate rate.
  • Output would be formatted to best serve its pre-decided purpose 304 and may involve use of the action outcome simulation module to create a simulation of future infrastructure events 125 and the game engine and scriptability of observation and state estimation service module 140 to present the results in an easily comprehended, dramatic and memorable way.
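An escalating trend of this kind can be extrapolated to estimate when a tolerable limit will be crossed. The sketch below fits an exponential curve to historical measurements by least squares on their logarithms; the choice of an exponential model, the sample latencies and the limit are assumptions for illustration, not the prediction method of the invention.

```python
import math


def fit_exponential(values: list) -> tuple:
    """Least-squares fit of log(value) = a + b * t for t = 0, 1, 2, ..."""
    n = len(values)
    times = list(range(n))
    logs = [math.log(v) for v in values]
    t_mean = sum(times) / n
    log_mean = sum(logs) / n
    slope = sum((t - t_mean) * (y - log_mean) for t, y in zip(times, logs)) / sum(
        (t - t_mean) ** 2 for t in times
    )
    intercept = log_mean - slope * t_mean
    return intercept, slope


def periods_until(values: list, limit: float):
    """Periods after the last sample until the fitted curve reaches the limit."""
    intercept, slope = fit_exponential(values)
    if slope <= 0:
        return None  # not growing, so no crossing is forecast
    crossing = (math.log(limit) - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))


# Hypothetical retrieval latencies (ms) doubling each period, with a 320 ms limit.
latency_history = [10, 20, 40, 80]
remaining = periods_until(latency_history, limit=320)
```

With these invented figures the fitted curve doubles each period, so the 320 ms limit is forecast two periods after the last sample.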
  • the techniques disclosed herein may be implemented on hardware or a combination of software and hardware. For example, they may be implemented in an operating system kernel, in a separate user process, in a library package bound into network applications, on a specially constructed machine, on an application-specific integrated circuit (ASIC), or on a network interface card.
  • Software/hardware hybrid implementations of at least some of the embodiments disclosed herein may be implemented on a programmable network-resident machine (which should be understood to include intermittently connected network-aware machines) selectively activated or reconfigured by a computer program stored in memory.
  • Such network devices may have multiple network interfaces that may be configured or designed to utilize different types of network communication protocols.
  • a general architecture for some of these machines may be described herein in order to illustrate one or more exemplary means by which a given unit of functionality may be implemented.
  • At least some of the features or functionalities of the various embodiments disclosed herein may be implemented on one or more general-purpose computers associated with one or more networks, such as for example an end-user computer system, a client computer, a network server or other server system, a mobile computing device (e.g., tablet computing device, mobile phone, smartphone, laptop, or other appropriate computing device), a consumer electronic device, a music player, or any other suitable electronic device, router, switch, or other suitable device, or any combination thereof.
  • at least some of the features or functionalities of the various embodiments disclosed herein may be implemented in one or more virtualized computing environments (e.g., network computing clouds, virtual machines hosted on one or more physical computing machines, or other appropriate virtual environments).
  • Referring to FIG. 4, there is shown a block diagram depicting an exemplary computing device 10 suitable for implementing at least a portion of the features or functionalities disclosed herein.
  • Computing device 10 may be, for example, any one of the computing machines listed in the previous paragraph, or indeed any other electronic device capable of executing software- or hardware-based instructions according to one or more programs stored in memory.
  • Computing device 10 may be configured to communicate with a plurality of other computing devices, such as clients or servers, over communications networks such as a wide area network, a metropolitan area network, a local area network, a wireless network, the Internet, or any other network, using known protocols for such communication, whether wireless or wired.
  • computing device 10 includes one or more central processing units (CPU) 12, one or more interfaces 15, and one or more busses 14 (such as a peripheral component interconnect (PCI) bus).
  • CPU 12 may be responsible for implementing specific functions associated with the functions of a specifically configured computing device or machine.
  • a computing device 10 may be configured or designed to function as a server system utilizing CPU 12, local memory 11 and/or remote memory 16, and interface(s) 15.
  • CPU 12 may be caused to perform one or more of the different types of functions and/or operations under the control of software modules or components, which for example, may include an operating system and any appropriate applications software, drivers, and the like.
  • CPU 12 may include one or more processors 13 such as, for example, a processor from one of the Intel, ARM, Qualcomm, and AMD families of microprocessors.
  • processors 13 may include specially designed hardware such as application-specific integrated circuits (ASICs), electrically erasable programmable read-only memories (EEPROMs), field-programmable gate arrays (FPGAs), and so forth, for controlling operations of computing device 10.
  • a local memory 11 such as non-volatile random access memory (RAM) and/or read-only memory (ROM), including for example one or more levels of cached memory
  • Memory 11 may be used for a variety of purposes such as, for example, caching and/or storing data, programming instructions, and the like. It should be further appreciated that CPU 12 may be one of a variety of system-on-a-chip (SOC) type hardware that may include additional hardware such as memory or graphics processing chips, such as a Qualcomm SNAPDRAGON™ or Samsung EXYNOS™ CPU as are becoming increasingly common in the art, such as for use in mobile devices or integrated devices.
  • interfaces 15 are provided as network interface cards (NICs). Generally, NICs control the sending and receiving of data packets over a computer network; other types of interfaces 15 may for example support other peripherals used with computing device 10.
  • the interfaces that may be provided are Ethernet interfaces, frame relay interfaces, cable interfaces, DSL interfaces, token ring interfaces, graphics interfaces, and the like.
  • interfaces may be provided such as, for example, universal serial bus (USB), Serial, Ethernet, FIREWIRE™, THUNDERBOLT™, PCI, parallel, radio frequency (RF), BLUETOOTH™, near-field communications (e.g., using near-field magnetics), 802.11 (WiFi), frame relay, TCP/IP, ISDN, fast Ethernet interfaces, Gigabit Ethernet interfaces, Serial ATA (SATA) or external SATA (ESATA) interfaces, high-definition multimedia interface (HDMI), digital visual interface (DVI), analog or digital audio interfaces, asynchronous transfer mode (ATM) interfaces, high-speed serial interface (HSSI) interfaces, Point of Sale (POS) interfaces, fiber data distributed interfaces (FDDIs), and the like.
  • processors 13 may be used, and such processors 13 may be present in a single device or distributed among any number of devices.
  • a single processor 13 handles communications as well as routing computations, while in other embodiments a separate dedicated communications processor may be provided.
  • different types of features or functionalities may be implemented in a system according to the invention that includes a client device (such as a tablet device or smartphone running client software) and server systems (such as a server system described in more detail below).
  • the system of the present invention may employ one or more memories or memory modules (such as, for example, remote memory block 16 and local memory 11) configured to store data, program instructions for the general-purpose network operations, or other information relating to the functionality of the embodiments described herein (or any combinations of the above).
  • Memory 16 or memories 11, 16 may also be configured to store data structures, configuration data, encryption data, historical system operations information, or any other specific or generic non-program information described herein.
  • At least some network device embodiments may include nontransitory machine-readable storage media, which, for example, may be configured or designed to store program instructions, state information, and the like for performing various operations described herein.
  • nontransitory machine-readable storage media include, but are not limited to, magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as optical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM), flash memory (as is common in mobile devices and integrated systems), solid state drives (SSD) and "hybrid SSD" storage drives that may combine physical components of solid state and hard disk drives in a single hardware device (as are becoming increasingly common in the art with regard to personal computers), memristor memory, random access memory (RAM), and the like.
  • such storage means may be integral and non-removable (such as RAM hardware modules that may be soldered onto a motherboard or otherwise integrated into an electronic device), or they may be removable such as swappable flash memory modules (such as "thumb drives" or other removable media designed for rapidly exchanging physical storage devices), "hot-swappable" hard disk drives or solid state drives, removable optical storage discs, or other such removable media, and that such integral and removable storage media may be utilized
  • Examples of program instructions include both object code, such as may be produced by a compiler, machine code, such as may be produced by an assembler or a linker, byte code, such as may be generated by for example a JAVA™ compiler and may be executed using a Java virtual machine or equivalent, or files containing higher level code that may be executed by the computer using an interpreter (for example, scripts written in Python, Perl, Ruby, Groovy, or any other scripting language).
  • systems according to the present invention may be implemented on a standalone computing system.
  • Referring to FIG. 5, there is shown a block diagram depicting a typical exemplary architecture of one or more embodiments or components thereof on a standalone computing system.
  • Computing device 20 includes processors 21 that may run software that carries out one or more functions or applications of embodiments of the invention, such as for example a client application 24.
  • Processors 21 may carry out computing instructions under control of an operating system 22 such as, for example, a version of Microsoft's WINDOWS™ operating system, Apple's Mac OS/X or iOS operating systems, some variety of the Linux operating system, Google's ANDROID™ operating system, or the like.
  • one or more shared services 23 may be operable in system 20, and may be useful for providing common services to client applications 24.
  • Services 23 may for example be WINDOWS™ services, user-space common services in a Linux environment, or any other type of common service architecture used with operating system 22.
  • Input devices 28 may be of any type suitable for receiving user input, including for example a keyboard, touchscreen, microphone (for example, for voice input), mouse, touchpad, trackball, or any combination thereof.
  • Output devices 27 may be of any type suitable for providing output to one or more users, whether remote or local to system 20, and may include for example one or more screens for visual output, speakers, printers, or any combination thereof.
  • Memory 25 may be random-access memory having any structure and architecture known in the art, for use by processors 21, for example to run software.
  • Storage devices 26 may be any magnetic, optical, mechanical, memristor, or electrical storage device for storage of data in digital form (such as those described above). Examples of storage devices 26 include flash memory, magnetic hard drive, CD-ROM, and/or the like.
  • systems of the present invention may be implemented on a distributed computing network, such as one having any number of clients and/or servers.
  • Referring to FIG. 6, there is shown a block diagram depicting an exemplary architecture 30 for implementing at least a portion of a system according to an embodiment of the invention on a distributed computing network.
  • any number of clients 33 may be provided.
  • Each client 33 may run software for implementing client-side portions of the present invention; clients may comprise a system 20 such as that illustrated above.
  • any number of servers 32 may be provided for handling requests received from one or more clients 33.
  • Clients 33 and servers 32 may communicate with one another via one or more electronic networks 31, which may be in various embodiments any of the Internet, a wide area network, a mobile telephony network (such as CDMA or GSM cellular networks), a wireless network (such as WiFi, Wimax, LTE, and so forth), or a local area network (or indeed any network topology known in the art; the invention does not prefer any one network topology over any other).
  • Networks 31 may be implemented using any known network protocols, including for example wired and/or wireless protocols.
  • servers 32 may call external services 37 when needed to obtain additional information, or to refer to additional data concerning a particular call. Communications with external services 37 may take place, for example, via one or more networks 31.
  • external services 37 may comprise web-enabled services or functionality related to or installed on the hardware device itself.
  • client applications 24 may obtain information stored in a server system 32 in the cloud or on an external service 37 deployed on one or more of a particular enterprise's or user's premises.
  • clients 33 or servers 32 may make use of one or more specialized services or appliances that may be deployed locally or remotely across one or more networks 31.
  • one or more databases 34 may be used or referred to by one or more embodiments of the invention. It should be understood by one having ordinary skill in the art that databases 34 may be arranged in a wide variety of architectures and using a wide variety of data access and manipulation means.
  • one or more databases 34 may comprise a relational database system using a structured query language (SQL), while others may comprise an alternative data storage technology such as those referred to in the art as "NoSQL" (for example, Apache Cassandra, Google BigTable, and so forth).
  • variant database architectures such as column-oriented databases, in-memory databases, clustered databases, distributed databases, or even flat file data repositories may be used according to the invention. It will be appreciated by one having ordinary skill in the art that any combination of known or future database technologies may be used as appropriate, unless a specific database technology or a specific arrangement of components is specified for a particular embodiment herein. Moreover, it should be appreciated that the term "database" as used herein may refer to a physical database machine, a cluster of machines acting as a single database system, or a logical database within an overall database management system.
  • embodiments of the invention may make use of one or more security systems 36 and configuration systems 35.
  • Security and configuration management are common information technology (IT) and web functions, and some amount of each are generally associated with any IT or web systems. It should be understood by one having ordinary skill in the art that any configuration or security subsystems known in the art now or in the future may be used in conjunction with embodiments of the invention without limitation, unless a specific security 36 or configuration system 35 or approach is specifically required by the description of any specific embodiment.
  • FIG. 7 shows an exemplary overview of a computer system 40 as may be used in any of the various locations throughout the system. It is exemplary of any computer that may execute code to process data. Various modifications and changes may be made to computer system 40 without departing from the broader scope of the system and method disclosed herein.
  • Central processor unit (CPU) 41 is connected to bus 42, to which bus is also connected memory 43, nonvolatile memory 44, display 47, input/output (I/O) unit 48, and network interface card (NIC) 53.
  • I/O unit 48 may, typically, be connected to keyboard 49, pointing device 50, hard disk 52, and real-time clock 51.
  • NIC 53 connects to network 54, which may be the Internet or a local network, which local network may or may not have connections to the Internet.
  • Power supply unit 45 is connected, in this example, to a main alternating current (AC) supply 46.
  • functionality for implementing systems or methods of the present invention may be distributed among any number of client and/or server components.
  • various software modules may be implemented for performing various functions in connection with the present invention, and such modules may be variously implemented to run on server and/or client.

Abstract

A system for fully integrated collection of business impacting data, analysis of that data and generation of both analysis driven business decisions and analysis driven simulations of alternate candidate business actions has been devised and reduced to practice. This business operating system may be used to monitor and predictively warn of events that impact the security of business infrastructure and may also be employed to monitor client-facing services supported by both software and hardware to alert in case of reduction or failure and also predict deficiency, service reduction or failure based on current event data.

Description

SYSTEM FOR AUTOMATED CAPTURE AND ANALYSIS OF BUSINESS INFORMATION
CROSS-REFERENCE TO RELATED APPLICATIONS
[001] This application is a PCT filing of, and claims priority to, United States patent application serial number 15/166,158, titled "SYSTEM FOR AUTOMATED CAPTURE AND ANALYSIS OF BUSINESS INFORMATION FOR SECURITY AND CLIENT-FACING INFRASTRUCTURE RELIABILITY" and filed on May 26, 2016, the entire specification of which is incorporated herein by reference in its entirety.
BACKGROUND OF THE INVENTION
Field of the Invention
[002] The present invention is in the field of use of computer systems in business information management, operations and predictive planning. Specifically, it concerns the development of a system that integrates the functions of business information and operating data, complex data analysis and use of that data, preprogrammed commands and parameters, and machine learning to create a business operating system capable of predictive decision making and action path outcome simulation, herein as applied to IT systems security and reliability of client-facing IT infrastructure.
Discussion of the State of the Art
[003] Over the past decade, the amount of financial, operational, infrastructure, risk management and philosophical information available to decision makers of a business from such sources as ubiquitous sensors found on a business's equipment or available from third party sources, detailed cause and effect data, and business process monitoring software has expanded to the point where the data has overwhelmed the abilities of virtually anyone to follow all of it, much less interpret and make meaningful use of that available data in a given business environment. In other words, the torrent of business related information now available to a decision maker or group of decision makers has far outgrown the ability of those in most need of its use to either fully follow it or reliably use it. Failure to recognize important trends or become aware of information in a timely fashion has led to highly visible, customer-facing outages at NETFLIX™, FACEBOOK™, and UPS™ over the past few years, just to list a few.
[004] There have been several developments in business software that have arisen with the purpose of streamlining or automating either business data analysis or the business decision process. PALANTIR™ offers software to isolate patterns in large volumes of data, DATABRICKS™ offers custom analytics services, ANAPLAN™ offers financial impact calculation services, and there are other software sources that mitigate some aspect of business data relevancy identification, analysis of that data and business decision automation, but none of these solutions handles more than a single aspect of the whole task. Similarly, products like DATADOG™ and BMC INSIGHT™ allow businesses to monitor the function of their IT infrastructure and business software but lack the ability to perform the complex operation queries of large sections of that infrastructure so as to forecast impending bottlenecks, deficiencies and even customer-facing failures.
[005] What is needed is a fully integrated system that retrieves business relevant information from many diverse sources, identifies and analyzes that high volume data, transforms it to a business useful format, and then uses that data to create intelligent predictive business decisions and business pathway simulations, forming a "business operating system."
SUMMARY OF THE INVENTION
[006] Accordingly, the inventor has developed a distributed system for the fully integrated retrieval and deep analysis of business operational information from a plurality of sources. The system further uses results of business information analytics to optimize the making of business decisions and allow for alternate action pathways to be simulated using the latest data and machine mediated prediction algorithms. Specifically, portions of the system are applied to the areas of IT security and predictively enhancing the reliability of client-facing IT infrastructure.
[007] According to a preferred embodiment of the invention, a system for fully integrated collection of business impacting data, analysis of that data and generation of both analysis driven business decisions and analysis driven simulations of alternate candidate business decisions comprises: a business data retrieval engine stored in a memory of and operating on a processor of a computing device, a business data analysis engine stored in a memory of and operating on a processor of a computing device, and a business decision and business action path simulation engine stored in a memory of and operating on a processor of one or more computing devices. The business information retrieval engine: retrieves a plurality of business related data from a plurality of sources, accepts a plurality of analysis parameters and control commands directly from human interface devices or from one or more command and control storage devices, and stores accumulated retrieved information for processing by the data analysis engine or until a predetermined data timeout. The business information analysis engine: retrieves a plurality of data types from the business information retrieval engine, and performs a plurality of analytical functions and transformations on retrieved data based upon the specific goals and needs set forth in a current campaign by business process analysis authors. The business decision and business action path simulation engine: employs results of data analyses and transformations performed by the business information analysis engine, together with available supplemental data from a plurality of sources as well as any current campaign specific machine learning, commands and parameters from business process analysis authors, to formulate current business operations and risk status reports; and employs results of data analyses and transformations performed by the business information analysis engine, together with available supplemental data from a plurality of sources, any current campaign specific commands and parameters from business process analysis authors, as well as input gleaned from machine learned algorithms, to deliver business action pathway simulations and business decision support to a first end user.
[008] According to another embodiment of the invention, the system's business information retrieval engine, stored in the memory of and operating on a processor of a computing device, employs a portal for human interface device input, at least a portion of which are business related data and at least another portion of which are commands and parameters related to the conduct of a current business analysis campaign. The business information retrieval engine employs a high volume deep web scraper stored in the memory of and operating on a processor of a computing device, which receives at least some scrape control and spider configuration parameters from the highly customizable cloud based interface, coordinates one or more world wide web searches (scrapes) using both general search control parameters and individual web search agent (spider) specific configuration data, receives scrape progress feedback information which may lead to issuance of further web search control parameters, controls and monitors the spiders on distributed scrape servers, receives the raw scrape campaign data from scrape servers, and aggregates at least portions of scrape campaign data from each web site or web page traversed as per the parameters of the scrape campaign. The archetype spiders are provided by a program library and individual spiders are created using configuration files. Scrape campaign requests are persistently stored and can be reused or used as the basis for similar scrape campaigns. The business information retrieval engine employs a multidimensional time series data store stored in a memory of and operating on a processor of a computing device to receive a plurality of data from a plurality of sensors of heterogeneous types, some of which may have heterogeneous reporting and data payload transmission profiles, aggregates the sensor data over a predetermined amount of time, a predetermined quantity of data or a predetermined number of events, retrieves a specific quantity of aggregated sensor data per each access connection predetermined to allow reliable receipt and inclusion of the data, transparently retrieves quantities of aggregated sensor data too large to be reliably transferred by one access connection using a further plurality of access connections to allow capture of all aggregated sensor data under conditions of heavy sensor data influx, and stores aggregated sensor data in a simple key-value pair with very little or no data transformation from how the aggregated sensor data is received. Last, the business data analysis engine employs a directed computational graph stored in the memory of and operating on a processor of a computing device which retrieves streams of input from one or more of a plurality of data sources, filters data to remove data records from the stream for a plurality of reasons drawn from, but not limited to, a set comprising absence of all information, damage to data in the record, and presence of incongruent information or missing information which invalidates the data record, splits the filtered data stream into two or more identical parts, formats data within one data stream based upon a set of predetermined parameters so as to prepare for meaningful storage in a data store, and sends the identical data stream for further analysis and either linear transformation or branching transformation using resources of the system.
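The filter, split, format and send sequence attributed above to the directed computational graph may be illustrated by the following minimal sketch. It is not the disclosed implementation: the record fields (`source`, `timestamp`, `value`), the validity rule, the key format and the doubling "transformation" are all hypothetical and chosen only to make the flow concrete.

```python
import itertools
from typing import Iterable, Iterator, Optional

# Hypothetical field names; the disclosure does not specify a record schema.
REQUIRED_FIELDS = ("source", "timestamp", "value")


def is_valid(record: Optional[dict]) -> bool:
    """Reject empty records and records with missing required information."""
    if not record:
        return False
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)


def filter_stream(records: Iterable[Optional[dict]]) -> Iterator[dict]:
    """Remove records that would invalidate downstream processing."""
    return (r for r in records if is_valid(r))


def format_for_storage(records: Iterable[dict]) -> Iterator[dict]:
    """Shape one copy of the stream as key-value pairs (assumed key format)."""
    for r in records:
        yield {"key": f'{r["source"]}:{r["timestamp"]}', "value": r["value"]}


def run_pipeline(records: Iterable[Optional[dict]]):
    # Split the filtered stream into two identical parts.
    to_store, to_analyze = itertools.tee(filter_stream(records), 2)
    stored = list(format_for_storage(to_store))
    # Placeholder for the "further analysis" branch: a trivial linear transformation.
    analyzed = [r["value"] * 2 for r in to_analyze]
    return stored, analyzed


if __name__ == "__main__":
    sample = [
        {"source": "s1", "timestamp": 1, "value": 3},
        {},                                               # absence of all information
        {"source": "s2", "timestamp": 2, "value": None},  # missing information
    ]
    print(run_pipeline(sample))
```

In this sketch each stage is a plain generator, so stages could be reordered or branched without changing the others; a production system would presumably distribute the stages rather than run them in one process.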
[009] According to another embodiment of the invention, a method for fully integrated capture and transformative analysis of business impactful information resulting in predictive decision making and simulation comprises the steps of: (a) retrieving business related data and analysis campaign command and control information using a business information retrieval engine stored in the memory of and operating on a processor of a computing device; (b) analyzing and transforming retrieved business related data using a business information analysis engine stored in the memory of and operating on a processor of a computing device in conjunction with previously designed analysis campaign command and control information; and (c) presenting business decision critical information as well as business action pathway simulation information using a business decision and business action path simulation engine based upon the results of analysis of previously retrieved business related data and previously entered analysis campaign command and control information.
[010] According to another embodiment of the invention, a method for the detection of Kerberos based security exploits using a system for fully integrated capture and analysis of business information comprises the steps of: (a) retrieving ticket granting ticket request information, service session key request information, and user sign on attempt data from a Kerberos domain controller using a multidimensional time series database module stored in a memory of and operating on a processor of a computing device; (b) applying any pre-programmed multiple dimensional time series event-condition-action rules that are present and apply to Kerberos protocol events using the multidimensional time series database module; (c) performing conversion of data into graphs where objects are vertices and their relationships edges between vertices using a graph stack service stored in a memory of and operating on a processor of a computing device; and (d) performing an analytical transformation using a directed computational graph module. This technology stack may be applied without loss of generality to other problems, according to the invention.
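Step (c) above, converting event data into a graph whose vertices are objects and whose edges are their relationships, may be sketched as follows. This is an illustrative assumption rather than the disclosed graph stack service: the event fields (`principal`, `service`), the fan-out rule and its threshold are hypothetical and do not represent an actual Kerberos exploit detector.

```python
from collections import defaultdict


def build_graph(events):
    """Build a weighted edge map: (principal, service) -> number of requests.

    Principals and services are the vertices; each request is a relationship.
    """
    edges = defaultdict(int)
    for event in events:
        edges[(event["principal"], event["service"])] += 1
    return dict(edges)


def principals_over_fanout(edges, max_services):
    """Return principals connected to more than max_services distinct services.

    A stand-in for an analytical transformation over the graph; the rule and
    threshold are assumptions made for illustration only.
    """
    services = defaultdict(set)
    for principal, service in edges:
        services[principal].add(service)
    return sorted(p for p, s in services.items() if len(s) > max_services)


if __name__ == "__main__":
    sample_events = [
        {"principal": "alice", "service": "http/web"},
        {"principal": "alice", "service": "http/web"},
        {"principal": "mallory", "service": "cifs/fs1"},
        {"principal": "mallory", "service": "mssql/db1"},
        {"principal": "mallory", "service": "http/web"},
    ]
    graph = build_graph(sample_events)
    print(principals_over_fanout(graph, max_services=2))
```

Keeping the graph as an edge map keyed by vertex pairs makes relationship counts directly queryable; a dedicated graph library or store could be substituted without changing the idea.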
[011] According to yet another embodiment of the invention, a method to monitor the function of business critical IT infrastructure and business software performance using a system for fully integrated capture and analysis of business information, resulting in improved client-facing IT infrastructure reliability, comprises the steps of: (a) monitoring IT equipment and application status statistics as well as failure messages using a multidimensional time series database module stored in a memory of and operating on a processor of a computing device; (b) processing the data retrieved from the multidimensional time series database module using a graph stack service stored in a memory of and operating on a processor of a computing device, with infrastructure items and software forming vertices of a relational graph and relationships between them forming edges of the graph; (c) transforming data acquired by the multidimensional time series database module using a directed computational graph to formulate more complex diagnostic queries based upon the existing data using pre-programmed logic and machine learning, and then processing the results of those complex queries as predetermined by authors of the monitoring effort; and (d) presenting the results in a format best suited to the downstream use of the processed data, wherein at least one set of results is displayed using an observation and state estimation service stored in a memory of and operating on a processor of a computing device.
BRIEF DESCRIPTION OF THE DRAWING FIGURES
[012] The accompanying drawings illustrate several embodiments of the invention and, together with the description, serve to explain the principles of the invention according to the embodiments. One skilled in the art will recognize that the particular embodiments illustrated in the drawings are merely exemplary, and are not intended to limit the scope of the present invention.
[013] Fig. 1 is a diagram of an exemplary architecture of a business operating system according to an embodiment of the invention.
[014] Fig. 2 is a process flow diagram showing an exemplary set of steps used in the function of the very high bandwidth cloud interface.
[015] Fig. 3 is a diagram of an exemplary architecture for a linear transformation pipeline system which introduces the concept of the transformation pipeline as a directed graph of transformation nodes and messages according to an embodiment of the invention.
[016] Fig. 4 is a block diagram illustrating an exemplary hardware architecture of a computing device used in various embodiments of the invention.
[017] Fig. 5 is a block diagram illustrating an exemplary logical architecture for a client device, according to various embodiments of the invention.
[018] Fig. 6 is a block diagram illustrating an exemplary architectural arrangement of clients, servers, and external services, according to various embodiments of the invention.
[019] Fig. 7 is another block diagram illustrating an exemplary hardware architecture of a computing device used in various embodiments of the invention.
DETAILED DESCRIPTION
[020] The inventor has conceived, and reduced to practice, a system and method for fully integrated capture and analysis of business information resulting in predictive decision making and simulation.
[021] One or more different inventions may be described in the present application. Further, for one or more of the inventions described herein, numerous alternative embodiments may be described; it should be understood that these are presented for illustrative purposes only. The described embodiments are not intended to be limiting in any sense. One or more of the inventions may be widely applicable to numerous embodiments, as is readily apparent from the disclosure. In general, embodiments are described in sufficient detail to enable those skilled in the art to practice one or more of the inventions, and it is to be understood that other embodiments may be utilized and that structural, logical, software, electrical and other changes may be made without departing from the scope of the particular inventions. Accordingly, those skilled in the art will recognize that one or more of the inventions may be practiced with various modifications and alterations. Particular features of one or more of the inventions may be described with reference to one or more particular embodiments or figures that form a part of the present disclosure, and in which are shown, by way of illustration, specific embodiments of one or more of the inventions. It should be understood, however, that such features are not limited to usage in the one or more particular embodiments or figures with reference to which they are described. The present disclosure is neither a literal description of all embodiments of one or more of the inventions nor a listing of features of one or more of the inventions that must be present in all embodiments.
[022] Headings of sections provided in this patent application and the title of this patent application are for convenience only, and are not to be taken as limiting the disclosure in any way.
[023] Devices that are in communication with each other need not be in continuous communication with each other, unless expressly specified otherwise. In addition, devices that are in communication with each other may communicate directly or indirectly through one or more intermediaries, logical or physical.
[024] A description of an embodiment with several components in communication with each other does not imply that all such components are required. To the contrary, a variety of optional components may be described to illustrate a wide variety of possible embodiments of one or more of the inventions and in order to more fully illustrate one or more aspects of the inventions. Similarly, although process steps, method steps, algorithms or the like may be described in a sequential order, such processes, methods and algorithms may generally be configured to work in alternate orders, unless specifically stated to the contrary. In other words, any sequence or order of steps that may be described in this patent application does not, in and of itself, indicate a requirement that the steps be performed in that order. The steps of described processes may be performed in any order practical. Further, some steps may be performed simultaneously despite being described or implied as occurring sequentially (e.g., because one step is described after the other step). Moreover, the illustration of a process by its depiction in a drawing does not imply that the illustrated process is exclusive of other variations and modifications thereto, does not imply that the illustrated process or any of its steps are necessary to one or more of the invention(s), and does not imply that the illustrated process is preferred. Also, steps are generally described once per embodiment, but this does not mean they must occur once, or that they may only occur once each time a process, method, or algorithm is carried out or executed. Some steps may be omitted in some embodiments or some occurrences, or some steps may be executed more than once in a given embodiment or occurrence.
[025] When a single device or article is described, it will be readily apparent that more than one device or article may be used in place of a single device or article. Similarly, where more than one device or article is described, it will be readily apparent that a single device or article may be used in place of the more than one device or article.
[026] The functionality or the features of a device may be alternatively embodied by one or more other devices that are not explicitly described as having such functionality or features. Thus, other embodiments of one or more of the inventions need not include the device itself.
[027] Techniques and mechanisms described or referenced herein will sometimes be described in singular form for clarity. However, it should be noted that particular embodiments include multiple iterations of a technique or multiple manifestations of a mechanism unless noted otherwise. Process descriptions or blocks in figures should be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of embodiments of the present invention in which, for example, functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those having ordinary skill in the art.
Definitions
[028] As used herein, a "swimlane" is a communication channel between a time series sensor data reception and apportioning device and a data store meant to hold the apportioned time series sensor data. A swimlane is able to move a specific, finite amount of data between the two devices. For example, a single swimlane might reliably carry, and have incorporated into the data store, the data equivalent of 5 seconds worth of data from 10 sensors in 5 seconds, this being its capacity. Attempts to place 5 seconds worth of data received from more than 10 sensors using one swimlane would result in data loss.
[029] As used herein, a "metaswimlane" is an as-needed logical combination of transfer capacity of two or more real swimlanes that is transparent to the requesting process. Sensor studies where the amount of data received per unit time is expected to be highly heterogeneous over time may be initiated to use metaswimlanes. Using the example above, in which a single real swimlane can transfer and incorporate the 5 seconds worth of data of 10 sensors without data loss, the sudden receipt of incoming sensor data from 13 sensors during a 5 second interval would cause the system to create a two-swimlane metaswimlane to accommodate the standard 10 sensors of data in one real swimlane and the 3 sensor data overage in the second, transparently added real swimlane. However, no changes to the data receipt logic would be needed, as the data reception and apportionment device would add the additional real swimlane transparently.
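The 13-sensor example above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the class name `MetaSwimlane`, its `write` method, and the in-memory lists standing in for real swimlanes are all hypothetical, and the capacity of 10 sensors per 5 second interval is taken from the example in the text.

```python
import math
from dataclasses import dataclass, field

# Capacity from the example in the text: one real swimlane carries
# 5 seconds worth of data from 10 sensors per 5 second interval.
SWIMLANE_CAPACITY = 10


@dataclass
class MetaSwimlane:
    """Hypothetical logical combination of real swimlanes.

    The caller only ever uses `write`; real swimlanes (modelled here as
    plain lists) are added behind the scenes when an interval overflows.
    """
    capacity_per_lane: int = SWIMLANE_CAPACITY
    lanes: list = field(default_factory=lambda: [[]])

    def write(self, readings):
        """Apportion one interval of readings (one per sensor) across lanes.

        Returns the number of readings placed in each real swimlane used.
        """
        needed = max(1, math.ceil(len(readings) / self.capacity_per_lane))
        # Transparently add real swimlanes when the interval exceeds capacity.
        while len(self.lanes) < needed:
            self.lanes.append([])
        allocation = []
        for i in range(needed):
            start = i * self.capacity_per_lane
            chunk = readings[start:start + self.capacity_per_lane]
            self.lanes[i].extend(chunk)
            allocation.append(len(chunk))
        return allocation


meta = MetaSwimlane()
normal = meta.write(list(range(10)))   # fits in one real swimlane
surge = meta.write(list(range(13)))    # overage spills into a second lane
```

In this sketch the requesting code calls `write` in exactly the same way for 10 or 13 sensors, which is the transparency property the definition describes.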
Conceptual Architecture
[030] Fig. 1 is a diagram of an exemplary architecture of a business operating system 100 according to an embodiment of the invention. Client access to the system 105 for specific data entry, system control and for interaction with system output such as automated predictive decision making and planning and alternate pathway simulations, occurs through the system's distributed, extensible high bandwidth cloud interface 110 which uses a versatile, robust web application driven interface for both input and display of client-facing information and a data store 112 such as, but not limited to, MONGODB™, COUCHDB™, CASSANDRA™ or REDIS™ depending on the embodiment. Much of the business data analyzed by the system, both from sources within the confines of the client business and from cloud based sources, also enters the system through the cloud interface 110, data being passed to the analysis and transformation components of the system: the directed computational graph module 155, high volume web crawler module 115, multidimensional time series database 120 and the graph stack service module 145. The directed computational graph module 155 retrieves one or more streams of data from a plurality of sources, which includes, but is in no way limited to, a plurality of physical sensors, web based questionnaires and surveys, monitoring of electronic infrastructure, crowd sourcing campaigns, and human input device information. Within the directed computational graph module 155, data may be split into two identical streams in a specialized pre-programmed data pipeline 155a, wherein one sub-stream may be sent for batch processing and storage while the other sub-stream may be reformatted for transformation pipeline analysis. The data is then transferred to the general transformer service module 160 for linear data transformation as part of analysis or the decomposable transformer service module 150 for branching or iterative transformations that are part of analysis. The directed computational graph module 155 represents all data as directed graphs where the transformations are nodes and the result messages between transformations are the edges of the graph. The high volume web crawling module 115 uses multiple server hosted preprogrammed web spiders which, while autonomously configured, are deployed within a web scraping framework 115a, of which SCRAPY™ is an example, to identify and retrieve data of interest from web based sources that are not well tagged by conventional web crawling technology. 
The multidimensional time series database module 120 receives data from a large plurality of sensors that may be of several different types. The module is designed to accommodate irregular and high volume surges by dynamically allotting network bandwidth and server processing channels to process the incoming data. Inclusion of programming wrappers for languages, examples of which include, but are not limited to, C++, PERL, PYTHON, and ERLANG™, allows sophisticated programming logic to be added to the default function of the multidimensional time series database 120 without intimate knowledge of the core programming, greatly extending breadth of function. Data retrieved by the multidimensional time series database 120 and the high volume web crawling module 115 may be further analyzed and transformed into task optimized results by the directed computational graph 155 and associated general transformer service 160 and decomposable transformer service 150 modules. Alternately, data from the multidimensional time series database and high volume web crawling modules may be sent, often with scripted cuing information determining important vertexes 145a, to the graph stack service module 145, which employs standardized protocols for converting streams of information into graph representations of that data, for example, open graph internet technology, although the invention is not reliant on any one standard. Through the steps, the graph stack service module 145 represents data in graphical form influenced by any pre-determined scripted modifications 145a and stores it in a graph data store 145b such as GIRAPH™ or a key value pair type data store such as REDIS™ or RIAK™, among others, all of which are suitable for storing graph represented information.
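As an illustration of the directed graph representation described for the directed computational graph module 155, where transformations are nodes and the result messages passed between them are edges, the following Python sketch may be helpful. It is a minimal sketch only: the class `TransformationGraph`, its methods, and the node names are hypothetical and are not taken from the disclosure, and the graph is assumed to be acyclic.

```python
from collections import defaultdict, deque


class TransformationGraph:
    """Hypothetical directed graph: nodes are transformations, edges carry results."""

    def __init__(self):
        self.transforms = {}             # node name -> callable transformation
        self.edges = defaultdict(list)   # node name -> downstream node names

    def add_node(self, name, fn):
        self.transforms[name] = fn

    def add_edge(self, src, dst):
        self.edges[src].append(dst)

    def run(self, source, message):
        """Push one message in at `source`; return the results at sink nodes.

        Assumes an acyclic graph. A node with several outgoing edges sends
        an identical copy of its result along each edge.
        """
        results = {}
        queue = deque([(source, message)])
        while queue:
            node, msg = queue.popleft()
            out = self.transforms[node](msg)
            downstream = self.edges.get(node, [])
            if not downstream:
                results[node] = out      # sink node: keep the final result
            for nxt in downstream:
                queue.append((nxt, out))
        return results


graph = TransformationGraph()
graph.add_node("ingest", lambda rows: [r for r in rows if r is not None])
graph.add_node("batch_store", lambda rows: list(rows))      # one sub-stream
graph.add_node("scale", lambda rows: [r * 2 for r in rows])  # other sub-stream
graph.add_node("total", lambda rows: sum(rows))
graph.add_edge("ingest", "batch_store")
graph.add_edge("ingest", "scale")
graph.add_edge("scale", "total")

results = graph.run("ingest", [1, None, 2, 3])
```

Here the two edges leaving `ingest` stand in for the two identical sub-streams, one kept for batch storage and one sent through a short linear chain of transformations.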
[031] Results of the transformative analysis process may then be combined with further client directives, additional business rules and practices relevant to the analysis and situational information external to the already available data in the automated planning service module 130, which also runs powerful information theory 130a based predictive statistics functions and machine learning algorithms to allow future trends and outcomes to be rapidly forecast based upon the current system derived results and choosing each of a plurality of possible business decisions. Using all available data, the automated planning service module 130 may propose business decisions most likely to result in the most favorable business outcome with a usably high level of certainty. Closely related to the automated planning service module in the use of system derived results in conjunction with possible externally supplied additional information in the assistance of end user business decision making, the action outcome simulation module 125 with its discrete event simulator programming module 125a, coupled with the end user facing observation and state estimation service 140, which is highly scriptable 140b as circumstances require and has a game engine 140a to more realistically stage possible outcomes of business decisions under consideration, allows business decision makers to investigate the probable outcomes of choosing one pending course of action over another based upon analysis of the current available data. For example, the pipelines operations department has reported a very small reduction in crude oil pressure in a section of pipeline in a highly remote section of territory. Many believe the issue is entirely due to a fouled, possibly failing flow sensor; others believe that it is a proximal upstream pump that may have foreign material stuck in it. 
The correction for both of these possibilities is to increase the output of the affected pump, hopefully cleaning out the pump or the fouled sensor. A failing sensor will have to be replaced at the next maintenance cycle. A few, however, feel that the pressure drop is due to a break in the pipeline, probably small at this point, but even so, crude oil is leaking and the remedy for the fouled sensor or pump option could make the leak much worse and waste much time afterwards. The company does have a contractor about 8 hours away, or could rent satellite time to look, but both of those are expensive for a probable sensor issue, though significantly less expensive than cleaning up an oil spill with its significant negative public exposure. These sensor issues have happened before and the business operating system 100 has data from them, which no one really studied due to the great volume of columnar figures, so the alternative courses of action 125, 140 are run. The system, based on all available data, predicts that the fouled sensor or pump is unlikely to be the root cause this time due to other available data, and the contractor is dispatched. She finds a small breach in the pipeline. There will be a small cleanup and the pipeline needs to be shut down for repair, but multiple tens of millions of dollars have been saved. This is just one example of the great many possible uses of the business operating system; those knowledgeable in the art will easily formulate more.
[032] Fig. 2 is a process flow diagram showing an exemplary set of steps by an embodiment of the invention to monitor security exploits against a business's KERBEROS™ based domain controller 200. These exploits occur when an intruder is able to obtain a functional user level authentication credential without passing challenge (ticket-granting-ticket exploit), is able to gain access to system or network services for which they are not authorized (service-ticket exploit), or is able to steal user sign on credentials (principal-key exploit) and masquerade as that user. To monitor for these and possible other security related activities, the multidimensional time series database (MDTSDB) as depicted in Fig. 1, 120 may be programmed to retrieve all KERBEROS™ domain controller ticket requests, including ticket-granting-ticket requests, service-session ticket requests, and user sign on attempts from one or more of the business's domain controllers. With that basic information, the MDTSDB may also retrieve information such as, but not necessarily limited to: the userid attached to each request, the time of the request, the workstation from which the request was made, and, for derivative credentials such as a service-ticket, the requesting ticket-granting-ticket credential. Using pre-programmed logic specific to the circumstances, the MDTSDB may transform the data in some way. That data may then go to the graph stack service module depicted in 145, which may map the data into a relational graph where the objects, for example a user 212, 218, the KERBEROS™ ticket granting service (KTGS, 210) and requested services 214, 216, 220, 224, are represented as graph vertices and the information passage relationships between them 211, 213, 215, 217, 219, 221, 223 form the edges of the graph 203. Such graph analysis may allow abnormal activity to be rapidly identified: edge 213 and vertex 214 show an extremely simplified example of a service-ticket exploit, where a user illegally accesses a system or network service for which she has no authority, and 221 shows an example where an intruder gains access to the system using an exploitively gained ticket-granting-ticket and then uses a system service. Panel 225 shows a highly simplified example of time series authorization data that may be attached to a specific service and shows the granting ticket used to access the service, the service key that allows access, the user that requested the service and a timestamp for receipt of the information.
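The mapping of ticket request records into a relational graph can be sketched as follows. This is a deliberately simplified illustration and not the method of the disclosure: the record fields (`type`, `user`, `tgt`, `service`, `time`), the function names, and the single rule used to flag a request are all assumptions made for the example, and no real KERBEROS™ API is used.

```python
from collections import defaultdict

KTGS = "KTGS"  # vertex standing in for the ticket granting service


def build_graph(events):
    """Return an adjacency map {vertex: set of vertices it requested from}."""
    graph = defaultdict(set)
    for e in events:
        if e["type"] == "tgt_request":
            graph[e["user"]].add(KTGS)           # edge: user -> KTGS
        elif e["type"] == "service_request":
            graph[e["user"]].add(e["service"])   # edge: user -> service
    return graph


def suspicious_service_requests(events):
    """Flag service requests whose granting ticket was never seen being
    issued to the requesting user (a hypothetical, simplified rule)."""
    issued = set()
    flagged = []
    for e in sorted(events, key=lambda ev: ev["time"]):
        if e["type"] == "tgt_request":
            issued.add((e["user"], e["tgt"]))
        elif e["type"] == "service_request":
            if (e["user"], e["tgt"]) not in issued:
                flagged.append(e)
    return flagged


events = [
    {"type": "tgt_request", "user": "alice", "tgt": "T1", "time": 1},
    {"type": "service_request", "user": "alice", "tgt": "T1",
     "service": "files", "time": 2},
    {"type": "service_request", "user": "mallory", "tgt": "T9",
     "service": "payroll", "time": 3},
]

graph = build_graph(events)
flagged = suspicious_service_requests(events)
```

In this toy data set the request by the hypothetical user `mallory` is flagged because no matching ticket-granting-ticket issuance was recorded, loosely mirroring the abnormal edge discussed above.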
[033] Events including security breaches are often preceded by smaller occurrences that either go totally unnoticed or are not recognized as significant to the future calamity. Under this embodiment the directed computational graph module depicted in 155, with its multi-transformation capable data pipeline depicted in 155a, 150 (non-linear transformations), 160 (linear transformations) and machine learning abilities may be used to deeply analyze the data retrieved by the MDTSDB depicted in 120 in complex ways which may allow prediction of an impending security exploit. As an extremely simple example, mass sign on attempts from IP address ranges of an organization known to infiltrate KERBEROS™ domain controllers similar to that of the client business may occur during off hours every third day and this may be uncovered during directed computational graph 155 analysis. Output (not depicted) would be formatted to best serve its pre-decided purpose.
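The "every third day" example can be illustrated with a short Python sketch that counts off-hours sign on attempts from a watched address range and checks whether the busy days are evenly spaced. Everything here is an assumption made for illustration: the function names, the business hours of 08:00 to 18:00, the threshold, and the sample addresses (drawn from the 203.0.113.0/24 documentation range) are hypothetical and do not come from the disclosure.

```python
import ipaddress
from collections import Counter
from datetime import datetime


def off_hours_burst_days(attempts, watched_network, threshold,
                         start_hour=8, end_hour=18):
    """Return sorted dates with at least `threshold` off-hours sign on
    attempts from `watched_network`. `attempts` holds (datetime, ip) pairs."""
    net = ipaddress.ip_network(watched_network)
    per_day = Counter()
    for ts, ip in attempts:
        in_hours = start_hour <= ts.hour < end_hour
        if ipaddress.ip_address(ip) in net and not in_hours:
            per_day[ts.date()] += 1
    return sorted(day for day, n in per_day.items() if n >= threshold)


def regular_interval(days):
    """Return the gap in days if the dates are evenly spaced, else None."""
    gaps = {(b - a).days for a, b in zip(days, days[1:])}
    return gaps.pop() if len(gaps) == 1 else None


# Hypothetical sample: three attempts at 02:00 on every third day.
attempts = [
    (datetime(2017, 1, day, 2, minute), "203.0.113.7")
    for day in (1, 4, 7)
    for minute in (0, 1, 2)
]
# One daytime attempt from an unrelated address, which should be ignored.
attempts.append((datetime(2017, 1, 2, 10, 0), "198.51.100.5"))

burst_days = off_hours_burst_days(attempts, "203.0.113.0/24", threshold=3)
interval = regular_interval(burst_days)
```

A production analysis would of course draw on far more signals than this; the sketch only shows how a regular pattern of small events might be surfaced from time series data.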
[034] Fig. 3 is a process flow diagram of a preferred method for use of the invention to monitor enterprise IT infrastructure, both hardware and software, especially customer facing services supported by this infrastructure, for slowdown, bottleneck or failure. In one embodiment, IT infrastructure including, but not limited to, workstations 301a, servers 301b, 301c and peripherals which may include printers 301d are monitored by the multidimensional time series database module (MDTSDB), depicted in 120, 301, possibly using a standard network messaging protocol such as SNMP, as an example, or a specifically programmed adaptor present in the MDTSDB for this purpose. The MDTSDB may transform data as it is captured depending on the requirements of the task prior to the data passing to other modules of the embodiment such as the graph stack service module depicted in 145, 302 where the data from the MDTSDB is processed into an open graph IT ontology compliant relational graph representation as a prelude to further analysis. The relational graph is created within the graph stack service by assigning the objects of the system under analysis, for example: data centers, servers, workstations, and peripherals, as vertices of the graph and the relationships between them as edges. A generic example of such a relational graph is depicted in 320, 320a through 320k. Looking at 320a through 320k, line 320a depicts the relationship between the "Data Center 1" vertex 320 and the generic "System 1" vertex 320b, thus 320a may be thought of as the "Data Center 1"-"System 1" relation. From "System 1" 320b, the graph progresses along a complexity gradient, first encountering metric group vertices 320c, 320g, 320i and then the individual metrics measured within those groups, 320d, 320e, 320k. Another relationship, between "System 1" 320b and "metric group n" 320i, is depicted at 320h; this is the "System 1"-"metric group n" relation. Finally, a generic example of specific MDTSDB captured data stored in the vertices is shown in 320f and is expanded from "metric 2" 320e.
[035] Now familiar with a general relational graph example, an application-specific example of graph creation from MDTSDB (see 120) captured IT infrastructure monitoring data is depicted in 320, 320l through 320z. The starting point is "Titan Server" 320l, which forms a graph vertex and which, consulting the graph, is shown located in "Data Center 1" 320, which is another, more complex, graph vertex. A representative, though certainly not exhaustive, sample of the Titan Server's 320l constituent hardware parts is depicted in 320m, 320z, 320u. Relationships are denoted by the lines, or edges, between the vertices. The vertices that occur as constituents of "Titan Server" 320l in this example are "CPU" 320m, "memory" 320z and "peripheral" 320u, with the direct connection relationships denoted by the graph edges between them and "Titan Server" 320l. From the graph, it can be easily seen that "Titan Server" 320l has a CPU 320m with two cores, "core 1" 320n and "core 2" 320p, one occupied memory slot, "slot 2" 320q, and a directly connected printer "printer 47" 320s, "scanner 120" 320x and "RAID 3" 320v, designated as peripherals "periph." 320u. MDTSDB (see 120) captured data pertaining to "core 1" 320n, the memory in "slot 2" 320q of memory 320z, "printer 47" 320s, "scanner 120" 320x and "RAID 3" 320v are displayed in 320o, 320r, 320t, 320y and 320w respectively. Focusing on the component data display for "printer 47" 320t, one can determine that the current toner cartridge has approximately 35% toner remaining and the current fuser has printed 10,000 pages. The timestamp indicates when the data was collected. The data displays for the other components show comparably useful information. 
It must be noted that the server specific graph shown in 320l through 320z is extensively simplified in that only a very few of the possible component groups (CPU, memory, peripherals) to be monitored are depicted, a minimal number of underlying components 320n, 320p, 320q, 320s, 320x, 320v are present and data shown in the data displays 320o, 320r, 320t, 320y, 320w is minimal, incomplete and haphazardly chosen. Also, vertex data displays 320o, 320r, 320t, 320y, 320w occur only at the termini. All of these characteristics of the example graph are present solely for presentation clarity purposes and in no way should be interpreted as limiting the invention. The invention is able to monitor any reported component characteristic known to those knowledgeable in the field. The graph stack service module (see 145) is able to map relationships of any foreseeable complexity, and while the example data displays 320o, 320r, 320t, 320y, 320w showed a few lines of data and all displays were at terminal vertices, the length or content of the data displayed is not limited by the invention and data displays can be associated with any vertex of the graph, so, for example, one could cause a possibly very lengthy data display showing all pre-determined applicable data to be shown for "Titan Server" vertex 320l.
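The "Titan Server" example can be sketched as a small tree of vertices, each able to carry its own data display. This is an illustrative sketch only: the `Vertex` class, the `collect` helper and the data field names are hypothetical, while the vertex names and the printer figures (35% toner, 10,000 fuser pages) follow the example in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """Hypothetical graph vertex with an optional data display."""
    name: str
    data: dict = field(default_factory=dict)       # captured data, if any
    children: list = field(default_factory=list)   # edges to constituents

    def add(self, child):
        self.children.append(child)
        return child


def collect(vertex, path=()):
    """Yield (path, data) for every vertex at or below `vertex` with data."""
    here = path + (vertex.name,)
    if vertex.data:
        yield here, vertex.data
    for child in vertex.children:
        yield from collect(child, here)


data_center = Vertex("Data Center 1")
titan = data_center.add(Vertex("Titan Server"))

cpu = titan.add(Vertex("CPU"))
cpu.add(Vertex("core 1"))
cpu.add(Vertex("core 2"))

memory = titan.add(Vertex("memory"))
memory.add(Vertex("slot 2"))

periph = titan.add(Vertex("periph."))
periph.add(Vertex("printer 47",
                  {"toner_remaining_pct": 35, "fuser_pages_printed": 10000}))
periph.add(Vertex("scanner 120"))
periph.add(Vertex("RAID 3"))

displays = list(collect(data_center))
```

Because `data` is a field of every vertex rather than only of leaves, a display could equally be attached to "Titan Server" itself, consistent with the statement that data displays can be associated with any vertex of the graph.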
[036] While displaying the current operating data of a business's infrastructure is a very powerful tool and can disclose certain issues, the ability to predict likely future significant slowdowns, deficiencies and outages through the intelligent interpretation of small variances in current data and event chain progressions so as to prevent noticeable degradation of service is an extremely powerful tool offered by the invention. Under this embodiment the directed computational graph module depicted in 155, with its multi-transformation capable data pipeline depicted in 155a, 150 (non-linear transformations), 160 (linear transformations) and machine learning abilities may be used to deeply analyze the data retrieved by the MDTSDB depicted in 120 in complex ways which, when coupled with historic data that may span months or years, may allow prediction of an impending degradation or loss of a business's customer facing IT services, whether hardware or software is the root cause. As an extremely simple example, web site based requests for further information about a business's newer product lines may, when they reach a certain level, cause thrashing and bottlenecks in the database storing those documents, which up until this point has caused negligible loss of retrieval speed, but recent historical data shows demand for the documents is building and the issue in the database is escalating at a disproportionate rate. 
Within the embodiment, data from the MDTSDB is retrieved by the directed computational graph module 303, which then performs more complex analyses on the data and determines that the issue lies in that the customer document mailing software in use has a long revision history and is writing customer contact information to several tables in the database, some no longer used and some better done asynchronously. The routines within the software for mailing materials are also outdated and inefficient, again taxing the database but also not using all of the currently available printer queues, delaying print job confirmation and again indirectly slowing the database as it records those confirmations. Last, the database manager is no longer optimally tuned for current business realities. These example emerging issues would have eventually become noticeable and serious. It should be remembered that, as a single example, the above should in no way be regarded as defining or constraining the capabilities of the invention.
[037] Output would be formatted to best serve its pre-decided purpose 304 and may involve use of the action outcome simulation module to create a simulation of future infrastructure events 125 and the game engine and scriptability of observation and state estimation service module 140 to present the results in an easily comprehended, dramatic and memorable way.
Hardware Architecture
[038] Generally, the techniques disclosed herein may be implemented on hardware or a combination of software and hardware. For example, they may be implemented in an operating system kernel, in a separate user process, in a library package bound into network applications, on a specially constructed machine, on an application-specific integrated circuit (ASIC), or on a network interface card.
[039] Software/hardware hybrid implementations of at least some of the embodiments disclosed herein may be implemented on a programmable network-resident machine (which should be understood to include intermittently connected network-aware machines) selectively activated or reconfigured by a computer program stored in memory. Such network devices may have multiple network interfaces that may be configured or designed to utilize different types of network communication protocols. A general architecture for some of these machines may be described herein in order to illustrate one or more exemplary means by which a given unit of functionality may be implemented. According to specific embodiments, at least some of the features or functionalities of the various embodiments disclosed herein may be implemented on one or more general-purpose computers associated with one or more networks, such as for example an end-user computer system, a client computer, a network server or other server system, a mobile computing device (e.g., tablet computing device, mobile phone, smartphone, laptop, or other appropriate computing device), a consumer electronic device, a music player, or any other suitable electronic device, router, switch, or other suitable device, or any combination thereof. In at least some embodiments, at least some of the features or functionalities of the various embodiments disclosed herein may be implemented in one or more virtualized computing environments (e.g., network computing clouds, virtual machines hosted on one or more physical computing machines, or other appropriate virtual environments).
[040] Referring now to Fig. 4, there is shown a block diagram depicting an exemplary computing device 10 suitable for implementing at least a portion of the features or functionalities disclosed herein. Computing device 10 may be, for example, any one of the computing machines listed in the previous paragraph, or indeed any other electronic device capable of executing software- or hardware-based instructions according to one or more programs stored in memory. Computing device 10 may be configured to communicate with a plurality of other computing devices, such as clients or servers, over communications networks such as a wide area network, a metropolitan area network, a local area network, a wireless network, the Internet, or any other network, using known protocols for such communication, whether wireless or wired.
[041] In one embodiment, computing device 10 includes one or more central processing units (CPU) 12, one or more interfaces 15, and one or more busses 14 (such as a peripheral component interconnect (PCI) bus). When acting under the control of appropriate software or firmware, CPU 12 may be responsible for implementing specific functions associated with the functions of a specifically configured computing device or machine. For example, in at least one embodiment, a computing device 10 may be configured or designed to function as a server system utilizing CPU 12, local memory 11 and/or remote memory 16, and interface(s) 15. In at least one embodiment, CPU 12 may be caused to perform one or more of the different types of functions and/or operations under the control of software modules or components, which for example, may include an operating system and any appropriate applications software, drivers, and the like.
[042] CPU 12 may include one or more processors 13 such as, for example, a processor from one of the Intel, ARM, Qualcomm, and AMD families of microprocessors. In some embodiments, processors 13 may include specially designed hardware such as application-specific integrated circuits (ASICs), electrically erasable programmable read-only memories (EEPROMs), field-programmable gate arrays (FPGAs), and so forth, for controlling operations of computing device 10. In a specific embodiment, a local memory 11 (such as non-volatile random access memory (RAM) and/or read-only memory (ROM), including for example one or more levels of cached memory) may also form part of CPU 12. However, there are many different ways in which memory may be coupled to system 10. Memory 11 may be used for a variety of purposes such as, for example, caching and/or storing data, programming instructions, and the like. It should be further appreciated that CPU 12 may be one of a variety of system-on-a-chip (SOC) type hardware that may include additional hardware such as memory or graphics processing chips, such as a Qualcomm SNAPDRAGON™ or Samsung EXYNOS™ CPU as are becoming increasingly common in the art, such as for use in mobile devices or integrated devices.
[043] As used herein, the term "processor" is not limited merely to those integrated circuits referred to in the art as a processor, a mobile processor, or a microprocessor, but broadly refers to a microcontroller, a microcomputer, a programmable logic controller, an application-specific integrated circuit, and any other programmable circuit.
[044] In one embodiment, interfaces 15 are provided as network interface cards (NICs). Generally, NICs control the sending and receiving of data packets over a computer network; other types of interfaces 15 may for example support other peripherals used with computing device 10. Among the interfaces that may be provided are Ethernet interfaces, frame relay interfaces, cable interfaces, DSL interfaces, token ring interfaces, graphics interfaces, and the like. In addition, various types of interfaces may be provided such as, for example, universal serial bus (USB), Serial, Ethernet, FIREWIRE™, THUNDERBOLT™, PCI, parallel, radio frequency (RF), BLUETOOTH™, near-field communications (e.g., using near-field magnetics), 802.11 (WiFi), frame relay, TCP/IP, ISDN, fast Ethernet interfaces, Gigabit Ethernet interfaces, Serial ATA (SATA) or external SATA (ESATA) interfaces, high-definition multimedia interface (HDMI), digital visual interface (DVI), analog or digital audio interfaces, asynchronous transfer mode (ATM) interfaces, high-speed serial interface (HSSI) interfaces, Point of Sale (POS) interfaces, fiber distributed data interfaces (FDDIs), and the like. Generally, such interfaces 15 may include physical ports appropriate for communication with appropriate media. In some cases, they may also include an independent processor (such as a dedicated audio or video processor, as is common in the art for high-fidelity A/V hardware interfaces) and, in some instances, volatile and/or non-volatile memory (e.g., RAM).
[045] Although the system shown and described above illustrates one specific architecture for a computing device 10 for implementing one or more of the inventions described herein, it is by no means the only device architecture on which at least a portion of the features and techniques described herein may be implemented. For example, architectures having one or any number of processors 13 may be used, and such processors 13 may be present in a single device or distributed among any number of devices. In one embodiment, a single processor 13 handles communications as well as routing computations, while in other embodiments a separate dedicated communications processor may be provided. In various embodiments, different types of features or functionalities may be implemented in a system according to the invention that includes a client device (such as a tablet device or smartphone running client software) and server systems (such as a server system described in more detail below).
[046] Regardless of network device configuration, the system of the present invention may employ one or more memories or memory modules (such as, for example, remote memory block 16 and local memory 11) configured to store data, program instructions for the general-purpose network operations, or other information relating to the functionality of the embodiments described herein (or any combinations of the above). Program instructions may control execution of or comprise an operating system and/or one or more applications, for example. Memory 16 or memories 11, 16 may also be configured to store data structures, configuration data, encryption data, historical system operations information, or any other specific or generic non-program information described herein.
[047] Because such information and program instructions may be employed to implement one or more systems or methods described herein, at least some network device embodiments may include nontransitory machine-readable storage media, which, for example, may be configured or designed to store program instructions, state information, and the like for performing various operations described herein. Examples of such nontransitory machine-readable storage media include, but are not limited to, magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as optical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM), flash memory (as is common in mobile devices and integrated systems), solid state drives (SSD) and "hybrid SSD" storage drives that may combine physical components of solid state and hard disk drives in a single hardware device (as are becoming increasingly common in the art with regard to personal computers), memristor memory, random access memory (RAM), and the like. It should be appreciated that such storage means may be integral and non-removable (such as RAM hardware modules that may be soldered onto a motherboard or otherwise integrated into an electronic device), or they may be removable such as swappable flash memory modules (such as "thumb drives" or other removable media designed for rapidly exchanging physical storage devices), "hot-swappable" hard disk drives or solid state drives, removable optical storage discs, or other such removable media, and that such integral and removable storage media may be utilized interchangeably. Examples of program instructions include object code, such as may be produced by a compiler; machine code, such as may be produced by an assembler or a linker; byte code, such as may be generated by, for example, a JAVA™ compiler and may be executed using a Java virtual machine or equivalent; and files containing higher level code that may be executed by the computer using an interpreter (for example, scripts written in Python, Perl, Ruby, Groovy, or any other scripting language).
[048] In some embodiments, systems according to the present invention may be implemented on a standalone computing system. Referring now to Fig. 5, there is shown a block diagram depicting a typical exemplary architecture of one or more embodiments or components thereof on a standalone computing system. Computing device 20 includes processors 21 that may run software that carries out one or more functions or applications of embodiments of the invention, such as for example a client application 24. Processors 21 may carry out computing instructions under control of an operating system 22 such as, for example, a version of Microsoft's WINDOWS™ operating system, Apple's Mac OS/X or iOS operating systems, some variety of the Linux operating system, Google's ANDROID™ operating system, or the like. In many cases, one or more shared services 23 may be operable in system 20, and may be useful for providing common services to client applications 24. Services 23 may for example be WINDOWS™ services, user-space common services in a Linux environment, or any other type of common service architecture used with operating system 22. Input devices 28 may be of any type suitable for receiving user input, including for example a keyboard, touchscreen, microphone (for example, for voice input), mouse, touchpad, trackball, or any combination thereof. Output devices 27 may be of any type suitable for providing output to one or more users, whether remote or local to system 20, and may include for example one or more screens for visual output, speakers, printers, or any combination thereof. Memory 25 may be random-access memory having any structure and architecture known in the art, for use by processors 21, for example to run software. Storage devices 26 may be any magnetic, optical, mechanical, memristor, or electrical storage device for storage of data in digital form (such as those described above). Examples of storage devices 26 include flash memory, magnetic hard drive, CD-ROM, and/or the like.
[049] In some embodiments, systems of the present invention may be implemented on a distributed computing network, such as one having any number of clients and/or servers. Referring now to Fig. 6, there is shown a block diagram depicting an exemplary architecture 30 for implementing at least a portion of a system according to an embodiment of the invention on a distributed computing network. According to the embodiment, any number of clients 33 may be provided. Each client 33 may run software for implementing client-side portions of the present invention; clients may comprise a system 20 such as that illustrated above. In addition, any number of servers 32 may be provided for handling requests received from one or more clients 33. Clients 33 and servers 32 may communicate with one another via one or more electronic networks 31, which may be in various embodiments any of the Internet, a wide area network, a mobile telephony network (such as CDMA or GSM cellular networks), a wireless network (such as WiFi, WiMAX, LTE, and so forth), or a local area network (or indeed any network topology known in the art; the invention does not prefer any one network topology over any other). Networks 31 may be implemented using any known network protocols, including for example wired and/or wireless protocols.
[050] In addition, in some embodiments, servers 32 may call external services 37 when needed to obtain additional information, or to refer to additional data concerning a particular call. Communications with external services 37 may take place, for example, via one or more networks 31. In various embodiments, external services 37 may comprise web-enabled services or functionality related to or installed on the hardware device itself. For example, in an embodiment where client applications 24 are implemented on a smartphone or other electronic device, client applications 24 may obtain information stored in a server system 32 in the cloud or on an external service 37 deployed on one or more of a particular enterprise's or user's premises.
[051] In some embodiments of the invention, clients 33 or servers 32 (or both) may make use of one or more specialized services or appliances that may be deployed locally or remotely across one or more networks 31. For example, one or more databases 34 may be used or referred to by one or more embodiments of the invention. It should be understood by one having ordinary skill in the art that databases 34 may be arranged in a wide variety of architectures and using a wide variety of data access and manipulation means. For example, in various embodiments one or more databases 34 may comprise a relational database system using a structured query language (SQL), while others may comprise an alternative data storage technology such as those referred to in the art as "NoSQL" (for example, Apache Cassandra, Google BigTable, and so forth). In some embodiments, variant database architectures such as column-oriented databases, in-memory databases, clustered databases, distributed databases, or even flat file data repositories may be used according to the invention. It will be appreciated by one having ordinary skill in the art that any combination of known or future database technologies may be used as appropriate, unless a specific database technology or a specific arrangement of components is specified for a particular embodiment herein. Moreover, it should be appreciated that the term "database" as used herein may refer to a physical database machine, a cluster of machines acting as a single database system, or a logical database within an overall database management system. Unless a specific meaning is specified for a given use of the term "database", it should be construed to mean any of these senses of the word, all of which are understood as a plain meaning of the term "database" by those having ordinary skill in the art.
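By way of a non-limiting illustration of the breadth of the term "database" discussed above, the following Python sketch places a relational store (using the standard-library sqlite3 module) and a simple in-memory key-value store behind one common interface. The names RecordStore, SqlStore, KeyValueStore, and remember_status are hypothetical and do not appear elsewhere in this description; the sketch assumes only that calling code needs to store and look up a value by key, and it is not intended to represent any particular embodiment of databases 34.

```python
import sqlite3
from typing import Dict, Optional, Protocol


class RecordStore(Protocol):
    """Hypothetical storage interface, assumed for illustration only."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class SqlStore:
    """Relational backing store queried with SQL (an in-memory SQLite database)."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE records (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def put(self, key: str, value: str) -> None:
        # INSERT OR REPLACE overwrites any existing row with the same primary key.
        self._conn.execute(
            "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
            (key, value),
        )

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM records WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


class KeyValueStore:
    """Non-relational backing store: a plain in-memory key-value mapping."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def remember_status(store: RecordStore, host: str, status: str) -> Optional[str]:
    """Write then read back a value without knowing which store type is in use."""
    store.put(host, status)
    return store.get(host)
```

In this sketch, remember_status behaves the same whether it is handed a SqlStore or a KeyValueStore, which mirrors the statement above that any combination of database technologies may be used unless a specific one is required.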
[052] Similarly, most embodiments of the invention may make use of one or more security systems 36 and configuration systems 35. Security and configuration management are common information technology (IT) and web functions, and some amount of each is generally associated with any IT or web system. It should be understood by one having ordinary skill in the art that any configuration or security subsystems known in the art now or in the future may be used in conjunction with embodiments of the invention without limitation, unless a specific security 36 or configuration system 35 or approach is specifically required by the description of any specific embodiment.
[053] Fig. 7 shows an exemplary overview of a computer system 40 as may be used in any of the various locations throughout the system. It is exemplary of any computer that may execute code to process data. Various modifications and changes may be made to computer system 40 without departing from the broader scope of the system and method disclosed herein. Central processor unit (CPU) 41 is connected to bus 42, to which bus is also connected memory 43, nonvolatile memory 44, display 47, input/output (I/O) unit 48, and network interface card (NIC) 53. I/O unit 48 may, typically, be connected to keyboard 49, pointing device 50, hard disk 52, and real-time clock 51. NIC 53 connects to network 54, which may be the Internet or a local network, which local network may or may not have connections to the Internet. Also shown as part of system 40 is power supply unit 45 connected, in this example, to a main alternating current (AC) supply 46. Not shown are batteries that could be present, and many other devices and modifications that are well known but are not applicable to the specific novel functions of the current system and method disclosed herein. It should be appreciated that some or all components illustrated may be combined, such as in various integrated applications, for example Qualcomm or Samsung system-on-a-chip (SOC) devices, or whenever it may be appropriate to combine multiple capabilities or functions into a single hardware device (for instance, in mobile devices such as smartphones, video game consoles, in-vehicle computer systems such as navigation or multimedia systems in automobiles, or other integrated hardware devices).
[054] In various embodiments, functionality for implementing systems or methods of the present invention may be distributed among any number of client and/or server components. For example, various software modules may be implemented for performing various functions in connection with the present invention, and such modules may be variously implemented to run on server and/or client.
[055] The skilled person will be aware of a range of possible modifications of the various embodiments described above. Accordingly, the present invention is defined by the claims and their equivalents.

Claims

What is claimed is:
1. A system for fully integrated collection of business impacting data, analysis of that data and generation of both analysis-driven business decisions and analysis-driven simulations of alternate candidate business decisions, comprising:
a business data retrieval engine stored in a memory of and operating on a processor of a computing device;
a business data analysis engine stored in a memory of and operating on a processor of a computing device; and
a business decision and business action path simulation engine stored in a memory of and operating on a processor of one or more computing devices;
wherein the business information retrieval engine:
(a) retrieves a plurality of business related data from a plurality of sources;
(b) accepts a plurality of analysis parameters and control commands directly from human interface devices or from one or more command and control storage devices;
(c) stores accumulated retrieved information for processing by the data analysis engine or predetermined data timeout;
wherein the business information analysis engine:
(d) retrieves a plurality of data types from the business information retrieval engine;
(e) performs a plurality of analytical functions and transformations on retrieved data based upon the specific goals and needs set forth in a current campaign by business process analysis authors;
wherein the business decision and business action path simulation engine:
(f) employs results of data analyses and transformations performed by the business information analysis engine, together with available supplemental data from a plurality of sources as well as any current campaign specific machine learning, commands and parameters from business process analysis authors to formulate current business operations and risk status reports; and
(g) employs results of data analyses and transformations performed by the business information analysis engine, together with available supplemental data from a plurality of sources, any current campaign specific commands and parameters from business process analysis authors, as well as input gleaned from machine learned algorithms to deliver business action pathway simulations and business decision support to a first end user.
2. The system of claim 1, wherein the business information retrieval engine, stored in the memory of and operating on a processor of a computing device, employs a portal for human interface device input, at least a portion of which are business related data and at least another portion of which are commands and parameters related to the conduct of a current business analysis campaign.
3. The system of claim 2, wherein the business information retrieval engine employs a high volume deep web scraper stored in the memory of and operating on a processor of a computing device, which receives at least some scrape control and spider configuration parameters from the highly customizable cloud based interface, coordinates one or more world wide web searches (scrapes) using both general search control parameters and individual web search agent (spider) specific configuration data, receives scrape progress feedback information which may lead to issuance of further web search control parameters, controls and monitors the spiders on distributed scrape servers, receives the raw scrape campaign data from scrape servers, and aggregates at least portions of scrape campaign data from each web site or web page traversed as per the parameters of the scrape campaign.
4. The system of claim 3, wherein the archetype spiders are provided by a program library and individual spiders are created using configuration files.
5. The system of claim 3, wherein scrape campaign requests are persistently stored and can be reused or used as the basis for similar scrape campaigns.
6. The system of claim 2, wherein the business information retrieval engine employs a multidimensional time series data store stored in a memory of and operating on a processor of a computing device to receive a plurality of data from a plurality of sensors of heterogeneous types, some of which may have heterogeneous reporting and data payload transmission profiles, aggregates the sensor data over a predetermined amount of time, a predetermined quantity of data or a predetermined number of events, retrieves a specific quantity of aggregated sensor data per each access connection predetermined to allow reliable receipt and inclusion of the data, transparently retrieves quantities of aggregated sensor data too large to be reliably transferred by one access connection using a further plurality of access connections to allow capture of all aggregated sensor data under conditions of heavy sensor data influx, and stores aggregated sensor data in a simple key-value pair with very little or no data transformation from how the aggregated sensor data is received.
7. The system of claim 1, wherein the business data analysis engine employs a directed computational graph stored in the memory of and operating on a processor of a computing device which retrieves streams of input from one or more of a plurality of data sources, filters data to remove data records from the stream for a plurality of reasons drawn from, but not limited to, a set comprising absence of all information, damage to data in the record, and presence of incongruent information or missing information which invalidates the data record, splits the filtered data stream into two or more identical parts, formats data within one data stream based upon a set of predetermined parameters so as to prepare for meaningful storage in a data store, and sends the identical data stream for further analysis and either linear transformation or branching transformation using resources of the system.
8. The system of claim 1, wherein the business data analysis engine employs a graph stack service module stored in a memory of and operating on a processor of one or more computing devices which organizes data retrieved from the multidimensional time series database module into graph formats where the objects are represented as vertices and the relationships between them as edges of the graph.
9. A method for fully integrated collection of business impacting data, analysis of that data and generation of both analysis driven business decisions and analysis driven business decision simulations, the method comprising the steps of:
(a) retrieving business related data and analysis campaign command and control information using a business information retrieval engine stored in the memory of and operating on a processor of a computing device;
(b) analyzing and transforming retrieved business related data using a business information analysis engine stored in the memory of and operating on a processor of a computing device in conjunction with previously designed analysis campaign command and control information; and
(c) presenting business decision critical information as well as business pathway simulation information using a business decision and business path simulation engine based upon the results of analysis of previously retrieved business related data and previously entered analysis campaign command and control information.
10. The method of claim 9, wherein the business information retrieval engine employs a portal for human interface device input, at least a portion of which are business related data and at least another portion of which are commands and parameters related to the conduct of a current business analysis campaign.
11. The method of claim 9, wherein the business information retrieval engine employs a high volume deep web scraper stored in the memory of and operating on a processor of a computing device, which receives at least some scrape control and spider configuration parameters from the highly customizable cloud based interface, coordinates one or more world wide web searches (scrapes) using both general search control parameters and individual web search agent (spider) specific configuration data, receives scrape progress feedback information which may lead to issuance of further web search control parameters, controls and monitors the spiders on distributed scrape servers, receives the raw scrape campaign data from scrape servers, and aggregates at least portions of scrape campaign data from each web site or web page traversed as per the parameters of the scrape campaign.
12. The method of claim 10, wherein the archetype spiders are provided by a program library and individual spiders are created using configuration files.
13. The method of claim 10, wherein scrape campaign requests are persistently stored and can be reused or used as the basis for similar scrape campaigns.
14. The method of claim 9, wherein the business information retrieval engine employs a multidimensional time series data store stored in a memory of and operating on a processor of a computing device to receive a plurality of data from a plurality of sensors of heterogeneous types, some of which may have heterogeneous reporting and data payload transmission profiles, aggregates the sensor data over a predetermined amount of time, a predetermined quantity of data or a predetermined number of events, retrieves a specific quantity of aggregated sensor data per each access connection predetermined to allow reliable receipt and inclusion of the data, transparently retrieves quantities of aggregated sensor data too large to be reliably transferred by one access connection using a further plurality of access connections to allow capture of all aggregated sensor data under conditions of heavy sensor data influx, and stores aggregated sensor data in a simple key-value pair with very little or no data transformation from how the aggregated sensor data is received.
15. The system of claim 8, wherein the business data analysis engine employs a directed computational graph, stored in the memory of and operating on a processor of a computing device, which retrieves streams of input from one or more of a plurality of data sources, filters data to remove data records from the stream for a plurality of reasons drawn from, but not limited to, a set comprising absence of all information, damage to data in the record, and presence of incongruent information or missing information which invalidates the data record, splits the filtered data stream into two or more identical parts, formats data within one data stream based upon a set of predetermined parameters so as to prepare for meaningful storage in a data store, and sends the identical data stream for further analysis and either linear transformation or branching transformation using resources of the system.
16. The method of claim 9, wherein the business data analysis engine employs a graph stack service module stored in a memory of and operating on a processor of one or more computing devices which organizes data retrieved from the multidimensional time series database module into graph formats where the objects are represented as vertices and the relationships between them as edges of the graph.
17. A method for the detection of Kerberos based security exploits using a system for fully integrated capture, and analysis of business information, the method comprising the steps of:
(a) retrieving ticket granting ticket request information, service session key request information, and user sign on attempt data from a Kerberos domain controller using a multidimensional time series database module stored in a memory of and operating on a processor of a computing device;
(b) applying any pre-programmed multiple dimensional time series event-condition-action rules that are present and apply to Kerberos protocol events using the multidimensional time series database module;
(c) performing conversion of data into graphs where objects are vertices and their relationships are edges between vertices using a graph stack service stored in a memory of and operating on a processor of a computing device; and
(d) performing any analytical transformations using a directed computational graph module.
18. A method to monitor the function of business critical IT infrastructure and business software performance using a system for fully integrated capture, and analysis of business information resulting in improved client-facing IT infrastructure reliability, the method comprising the steps of:
(a) monitoring IT equipment and application status statistics as well as failure messages using a multidimensional time series database module stored in a memory of and operating on a processor of a computing device;
(b) processing the data retrieved from the multidimensional time series database module using a graph stack service stored in a memory of and operating on a processor of a computing device with infrastructure items and software forming vertices of a relational graph and relationships between them forming edges of the graph;
(c) transforming data acquired by the multidimensional time series database module using a directed computational graph to formulate more complex diagnostic queries based upon the existing data using pre-programmed logic and machine learning and then process the results of those complex queries as predetermined by authors of the monitoring effort; and
(d) presenting the results in a format best suited to the downstream use of the processed data.
19. The method of claim 18, wherein at least one set of results is displayed as a graphical simulation using an observation and state estimation service module stored in a memory of and operating on a processor of a computing device.
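
For illustration only, the filter, split, and format operations recited for the directed computational graph (claims 7 and 15) and the vertex and edge organization recited for the graph stack service (claims 8 and 16) may be sketched in Python as follows. The record shape (source, target, and value fields), the validity test, the "source->target" key format, and all function names are assumptions introduced for this sketch; the claims do not specify them, and the sketch does not limit or define any claimed element.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical record shape, assumed for illustration only:
# {"source": str, "target": str, "value": number}
REQUIRED_FIELDS = ("source", "target", "value")


def is_valid(record: Optional[dict]) -> bool:
    """Filter step: reject records that are empty, incomplete, or damaged."""
    if not record:
        return False  # absence of all information
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return False  # missing information that invalidates the record
    # A non-numeric value is treated here as damage to data in the record.
    return isinstance(record["value"], (int, float))


def split(records: Iterable[dict], parts: int = 2) -> List[List[dict]]:
    """Split step: return `parts` identical copies of the filtered stream."""
    materialized = list(records)
    return [list(materialized) for _ in range(parts)]


def format_for_storage(records: Iterable[dict]) -> List[Tuple[str, float]]:
    """Format step: shape each record as a (key, value) pair for a data store."""
    return [(f"{r['source']}->{r['target']}", float(r["value"])) for r in records]


def to_graph(records: Iterable[dict]) -> Dict[str, Set[str]]:
    """Graph step: objects become vertices, relationships become directed edges."""
    graph: Dict[str, Set[str]] = {}
    for r in records:
        graph.setdefault(r["source"], set()).add(r["target"])
        # A target is a vertex even if it has no outgoing edges.
        graph.setdefault(r["target"], set())
    return graph


def run_pipeline(
    raw: Iterable[Optional[dict]],
) -> Tuple[List[Tuple[str, float]], Dict[str, Set[str]]]:
    """Filter the stream, split it in two, then format one copy and graph the other."""
    filtered = [r for r in raw if is_valid(r)]
    storage_stream, analysis_stream = split(filtered)
    return format_for_storage(storage_stream), to_graph(analysis_stream)
```

In this sketch, one copy of the filtered stream is formatted for storage while the other is passed on for further analysis, here represented by conversion to an adjacency mapping. A real directed computational graph would presumably allow arbitrary transformation nodes in place of the single to_graph step.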
PCT/US2017/034861 2016-05-26 2017-05-26 System for automated capture and analysis of business information WO2017205845A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15/166,158 2016-05-26
US15/166,158 US20170124501A1 (en) 2015-10-28 2016-05-26 System for automated capture and analysis of business information for security and client-facing infrastructure reliability

Publications (1)

Publication Number Publication Date
WO2017205845A1 true WO2017205845A1 (en) 2017-11-30

Family

ID=60411551

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/034861 WO2017205845A1 (en) 2016-05-26 2017-05-26 System for automated capture and analysis of business information

Country Status (1)

Country Link
WO (1) WO2017205845A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6341290B1 (en) * 1999-05-28 2002-01-22 Electronic Data Systems Corporation Method and system for automating the communication of business information
US6817008B2 (en) * 2002-02-22 2004-11-09 Total System Services, Inc. System and method for enterprise-wide business process management
US6892192B1 (en) * 2000-06-22 2005-05-10 Applied Systems Intelligence, Inc. Method and system for dynamic business process management using a partial order planner
US7349907B2 (en) * 1998-10-01 2008-03-25 Onepin, Inc. Method and apparatus for storing and retrieving business contact information in a computer system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111758094A (en) * 2018-02-23 2020-10-09 克姆普勒克斯股份有限公司 System and method for dynamic geospatial referenced cyber-physical infrastructure inventory
CN114641740A (en) * 2019-11-05 2022-06-17 Abb瑞士股份有限公司 Method and device for monitoring an electric drive in an industrial system
CN112948659A (en) * 2021-03-09 2021-06-11 深圳九星互动科技有限公司 Webpage data acquisition method, device, system and medium
CN112948659B (en) * 2021-03-09 2023-05-16 深圳九星互动科技有限公司 Webpage data acquisition method, device, system and medium

Similar Documents

Publication Publication Date Title
US11295262B2 (en) System for fully integrated predictive decision-making and simulation
US20170124501A1 (en) System for automated capture and analysis of business information for security and client-facing infrastructure reliability
US10320827B2 (en) Automated cyber physical threat campaign analysis and attribution
US20170124497A1 (en) System for automated capture and analysis of business information for reliable business venture outcome prediction
US11516097B2 (en) Highly scalable distributed connection interface for data capture from multiple network service sources
US11588793B2 (en) System and methods for dynamic geospatially-referenced cyber-physical infrastructure inventory and asset management
US11831682B2 (en) Highly scalable distributed connection interface for data capture from multiple network service and cloud-based sources
US20180247321A1 (en) Platform for management of marketing campaigns across multiple distribution mediums
US11074652B2 (en) System and method for model-based prediction using a distributed computational graph workflow
US11636549B2 (en) Cybersecurity profile generated using a simulation engine
WO2018027226A1 (en) Detection mitigation and remediation of cyberattacks employing an advanced cyber-decision platform
US20230362145A1 (en) System and method for ongoing trigger-based scanning of cyber-physical assets
WO2017205845A1 (en) System for automated capture and analysis of business information
WO2017176944A1 (en) System for fully integrated capture, and analysis of business information resulting in predictive decision making and simulation
US20220058745A1 (en) System and method for crowdsensing-based insurance premiums
WO2021055964A1 (en) System and method for crowd-sourced refinement of natural phenomenon for risk management and contract validation
EP3707634A1 (en) Cybersecurity profile generated using a simulation engine
US20230208820A1 (en) System and methods for predictive cyber-physical resource management
US11755957B2 (en) Multitemporal data analysis
US20230388277A1 (en) System and methods for predictive cyber-physical resource management
WO2019165384A1 (en) A system and methods for dynamic geospatially-referenced cyber-physical infrastructure inventory

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17803733

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17803733

Country of ref document: EP

Kind code of ref document: A1