WO2017053434A1

WO2017053434A1 - Design-time, metadata-based representation of real-time applications and message schemas

Info

Publication number: WO2017053434A1
Application number: PCT/US2016/052893
Authority: WO
Inventors: Ranga Ram POTHULA; Venkata JANAPAREDDY; David William FREUND; Ramesh NAGELLA; Matthew S. BRETZ; Serhiy BLAZHIYEVESKYY; Harshada Ram POTHULA; Chandana BHARGAVA; Christopher WEAVER
Original assignee: Dragonfly Data Factory Llc
Priority date: 2015-09-21
Filing date: 2016-09-21
Publication date: 2017-03-30

Abstract

This disclosure describes techniques for providing an application, such as a real-time data analytics application, to be generated and executed. The system described herein decouples at least some application-specific details from application-component code, enabling component reuse and simplifying application composition. The system may present abstractions at design time, generating and compiling efficient code "just in time" prior to deployment of the application. The system may provide a graphical user interface (GUI) for composing real-time applications from modular components, and a framework for developers to enable component configuration via the GUI. The system may also enable developers to reuse previously developed applications, and portions thereof, to reapply desired design patterns within multiple applications.

Description

DESIGN-TIME, METADATA-BASED REPRESENTATION OF REAL-TIME APPLICATIONS AND MESSAGE SCHEMAS

CROSS-REFERENCE TO RELATED APPLICATION

[0001] The present application is related to and claims priority to U.S. Provisional Patent Application Serial No. 62/221,498, titled "Design-Time, Metadata-Based Representation of Real-Time Applications and Message Schemas," which was filed on September 21, 2015, the entirety of which is hereby incorporated by reference into the present application.

BACKGROUND

[0002] To create an application that processes data from multiple data sources, a developer may be required to manually code a large number of interfaces or other software modules. Such a coding task may be time consuming, and may increase the amount of time and expenditure required to bring a software application from development to deployment. Moreover, if the data sources change after the initial version of the application is built, the application may need to be substantially rewritten. Such rewrites may also consume a large amount of time and expenditure.

SUMMARY

[0003] Implementations of the present disclosure include computer-implemented methods for providing an application to be executed in an execution environment. In some implementations, the methods may perform actions including one or more of the following: presenting a user interface (UI); receiving, through the UI, an indication of components to include in the application; receiving, through the UI, an indication of one or more connections between the components; generating application metadata for the application, the application metadata describing the components and the one or more connections; validating the application based on the application metadata, including compiling source code for the application to generate executable code; deploying the executable code to an execution environment; or executing the executable code in the execution environment. In some implementations, the components may include one or more of a data source; a parser; or a processing component such as a filter, an analytics component, or an output component. In some implementations, the components may include one or more dynamic data sources (e.g., data streams) to be processed in real time by the application.

[0004] Implementations of the present disclosure provide one or more of the following advantages. The system described herein decouples at least some application-specific details from application-component code, enabling better component reuse and simplifying application composition. To provide optimal application performance, the system presents most of its abstractions at design time, generating and compiling code that is as efficient as possible "just in time" prior to deployment. It also provides a graphical user interface (GUI) for composing real-time applications from modular components, and a framework for developers to enable component configuration via the GUI. Developers may also reuse previously developed applications, and portions thereof, to reapply desired design patterns within multiple applications. "Just-in-time" compilation may save computer storage space that would otherwise be occupied by compiled object code, as well as eliminate the need to track whether such object code needs to be recompiled due to application or component configuration changes, or updates to code-generation modules.

[0005] The present disclosure also provides a computer-readable storage medium coupled to one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations in accordance with implementations of the methods provided herein. A computer-readable storage medium may include a non-transitory or tangible storage medium such as a storage device, or a storage or memory component of a computing device.

[0006] The present disclosure further provides a system for implementing the methods provided herein. The system includes one or more processors, and a computer-readable storage medium coupled to the one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations in accordance with implementations of the methods provided herein.

[0007] It is appreciated that methods in accordance with the present disclosure can include any combination of the aspects and features described herein. That is, methods in accordance with the present disclosure are not limited to the combinations of aspects and features specifically described herein, but also include any combination of the aspects and features provided.

[0008] The details of one or more implementations of the present disclosure are set forth in the accompanying drawings and the description below. Other features and advantages of the present disclosure will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF DRAWINGS

[0009] FIG. 1 depicts an example schematic of an application according to implementations of the present disclosure.

[0010] FIGS. 2A and 2B depict an example user interface that may be employed to create an application according to implementations of the present disclosure.

[0011] FIG. 3 depicts an architecture of an example system according to implementations of the present disclosure.

[0012] FIG. 4 depicts a flow diagram of an example process according to implementations of the present disclosure.

DETAILED DESCRIPTION

[0013] The rising flood of smart devices, sensors, and the Internet of Things (IoT) has forced organizations to explore options for processing device-generated event messages as those events occur - in other words, in real time. However, the current state of the art for development, deployment, and management of real-time processing and analytics applications is nowhere near that of batch-oriented processing, and is inadequate for real-time processing and analytics applications. Technologies such as Apache Storm and Apache Spark Streaming enable decomposition of such applications into small, modular components, but the reuse of those components is severely limited because application-specific details such as message contents, format, and routing are written into each component's source code. This causes considerable time and effort to be expended each time an application is created or modified.

[0014] The system described herein decouples such application-specific details from application-component code, enabling better component reuse, and simplifying application composition. To provide optimal application performance, the system presents most of its abstractions at design time, generating and compiling code that is as efficient as possible "just in time" prior to deployment. It also provides a graphical user interface (GUI) for composing real-time applications from modular components, and a framework for developers to enable component configuration via the GUI. The system also provides an API for at least some, or all, of the functions available via the GUI, enabling programmatic access to system functions. Moreover, the system enables the data intake and parsing to be abstracted away from the processing of the data (e.g., application of business logic).

[0015] Implementations of the present disclosure are directed to a design platform that enables a developer to rapidly develop an application at least partly using prebuilt, configurable software components. In some implementations, the design platform enables the design, deployment, and execution of an application that is a real-time analytics application configured to receive and process multiple streams of data in real time as the data is received. The design platform may provide a layer of abstraction between the format of the messages received by the application from multiple data sources, such as streaming data sources, and the processing of the data. In some cases, the data may be encapsulated in a Java bean (e.g., JavaBean™) or other data structure that provides a consistent interface to the data that may be received as different data types, different schemas, in different data formats, or via different communication protocols from different data sources. This abstraction may be provided as a design-time feature that may be employed by a developer using the design platform.

[0016] Using the design platform, a developer, designer, or other user may specify one or more data sources from which the application may receive data. Data sources may be streaming data sources that are updated in real time. Data sources may also include static data sources, such as databases, file systems, document repositories, and so forth. The user may specify parser(s) to process the data received from one or more data sources. A parser may be configured to handle data in the particular format provided by a data source, and provide output data to other components of the application. In some cases, the parser provides output data in a common format, such that data received from multiple data sources in different formats may be handled by downstream components in a consistent manner. In some cases, a parser outputs data as a set of key-value pairs, where the key is a data type. In some implementations, a parser may incorporate the incoming data into a Java bean or other data structure that provides a consistent interface to the data regardless of the format or type of data stored in the data structure. In some implementations, the data structure (e.g., the structure of the Java bean) enables developers to name fields included in the data structure.
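By way of non-limiting illustration, the following Python sketch shows how two parsers might emit the same common record format for messages that arrive in different formats. The field names (`id`, `temp`, `device_id`, `temperature`) and the function names are hypothetical and are not taken from the disclosure.

```python
import csv
import io
import json


def parse_json(message: str) -> dict:
    """Parse a JSON-formatted message into the common record format."""
    raw = json.loads(message)
    # The incoming keys "id" and "temp" are assumed for this example only.
    return {"device_id": str(raw["id"]), "temperature": float(raw["temp"])}


def parse_csv(message: str) -> dict:
    """Parse a comma-separated message into the same common record format."""
    row = next(csv.reader(io.StringIO(message)))
    # Column order (identifier, temperature) is assumed for this example only.
    return {"device_id": row[0], "temperature": float(row[1])}


json_record = parse_json('{"id": "a1", "temp": 21.5}')
csv_record = parse_csv("a1,21.5")
```

Because both parsers return records with the same named fields, a downstream component in this sketch may handle either record without knowing the original message format.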

[0017] In some implementations, a Java bean structure or other data structure may be used internally by one or more component(s) to process the data, and the data itself may be communicated between components without the surrounding data structure. For example, a first component may receive data and encapsulate (or otherwise incorporate) the data into a data structure. The first component may perform various operations on the data while it is in the data structure. Following the operations, the first component may extract the data from the data structure and communicate the output data to one or more downstream components for further processing. Transporting the data outside the data structure may enable implementations to take advantage of platform (e.g., Apache Storm) features that provide for optimized data communication between components. In some implementations, the data may be communicated between components with the data encapsulated in a Java bean structure or other data structure.

[0018] In some implementations, the data structure (e.g., Java bean structure) may be employed such that the data structure includes data but does not include methods or other program logic.
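As one possible illustration of the encapsulate, operate, and extract pattern described above, the following Python sketch uses a data-only `dataclass` in place of a Java bean. The `Reading` structure, its field names, and the temperature conversion are assumptions made for the example and are not part of the disclosure.

```python
from dataclasses import astuple, dataclass


@dataclass
class Reading:
    """Data-only structure with named fields and no program logic of its own."""
    device_id: str
    temperature: float


def to_fahrenheit_component(values: tuple) -> tuple:
    """Hypothetical component: wrap the incoming values, operate, then unwrap."""
    reading = Reading(*values)  # encapsulate the received data
    reading.temperature = reading.temperature * 9 / 5 + 32  # operate by field name
    return astuple(reading)  # extract plain values for downstream components


converted = to_fahrenheit_component(("a1", 100.0))
```

In this sketch only plain tuples cross component boundaries, while the named fields are available inside the component.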

[0019] Using the design platform, the user may specify one or more analytics components to process the data output from the parser(s). The analytics component(s) may specify business rules, business logic, application logic, or any other type of logic or code to process the data. In some implementations, the design platform provides a set of analytics components, or other types of processing components, that are pre-built to implement (e.g., common) logic for handling the data. The design platform may also enable a user to write code to specify logic for handling the data.

[0020] In some implementations, the design platform provides a design user interface (UI), a GUI that a user may employ to design an application. For example, the GUI may include a design pane that is a design view, showing a current schematic of the design of the application. The GUI may also include a library pane listing components that may be added to the application. The components may include data sources, parsers, filters, analytics components, output components, or other types of components. The user may select components to add to the application by dragging the component(s) from the library pane and dropping them in the design pane. The user may also specify relationships between the components by drawing directional lines (e.g., pipes) between the components. For example, the user may drag a data source into the design pane, drag a parser into the design pane, and draw a line or other connection indicating that the parser is to receive and process data from the data source. The user may drag an analytics component into the design pane, and draw a line from the parser to the analytics component to indicate that the analytics component is to perform one or more operations on the data that is output from the parser. The user may also specify one or more output components that specify how the output from the analytics component(s) is to be handled. For example, the output component(s) may include components that send output data in communications (e.g., email, text messages, push messages, etc.), present output data in a UI, store output data in a database or other data storage, generate reports, and so forth. In this manner, a user may build an application from the library of components.

[0021] Once the user is satisfied with the components in the application, the configuration of the components, and the connections between the components, the user may save the application design by selecting a control in the GUI. To save the application, the design platform may generate application metadata that describes the application. The application metadata (also referred to herein as the metadata) may capture all the information specified by the user at design time. The application metadata may describe the various components in the application, the connections (e.g., pipes) between the components, and any code written by the user. To deploy the application, the design platform may compile, link, or otherwise build the components according to the metadata, and output executable code. The executable code may be deployed to one or more computing devices where the application executes. At deployment time, the design platform may build and connect the user-specified components according to the metadata, generate executable code to route data from component to component according to the specified pipes, and process incoming data according to the formats specified by the user via the data source components included in the design. Accordingly, the design platform may abstract away the handling of different types of data from different data sources, and enable the developer or other user to focus on the business logic (e.g., the logic that processes the received data to produce the desired output of the application). In some implementations, an application and the configurations of its connected components are described using metadata formatted according to a version of JavaScript Object Notation (JSON).
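For illustration only, the following Python sketch shows one way application metadata describing components and connections might be assembled and serialized as JSON. The disclosure does not specify a metadata schema; the keys (`application`, `components`, `connections`, `from`, `to`) and the component identifiers are hypothetical.

```python
import json


def build_metadata(name: str, components: list, connections: list) -> dict:
    """Assemble a metadata document from components and directed connections."""
    known_ids = {component["id"] for component in components}
    for source_id, target_id in connections:
        # Reject a connection (pipe) that refers to a component not in the design.
        if source_id not in known_ids or target_id not in known_ids:
            raise ValueError(f"unknown component in connection {source_id}->{target_id}")
    return {
        "application": name,
        "components": components,
        "connections": [{"from": s, "to": t} for s, t in connections],
    }


metadata = build_metadata(
    "example-app",
    [
        {"id": "source1", "type": "data_source", "config": {"format": "json"}},
        {"id": "parser1", "type": "parser", "config": {}},
        {"id": "analytics1", "type": "analytics", "config": {}},
    ],
    [("source1", "parser1"), ("parser1", "analytics1")],
)
saved = json.dumps(metadata, indent=2)  # text that could be persisted on save
```

A document of this kind could later be reloaded with `json.loads` to reopen or deploy the design.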

[0022] The design platform provides ease of configurability of an application, provides for a library of software components that are reusable across multiple applications, and provides a convenient mechanism to handle different types of data streams without extensive manual coding. For at least these reasons, the design platform accelerates the development process for creating an application, reduces the cost of building an application, enables faster time to deployment or to market for software, and reduces the programming burdens on developers. Because data source abstraction occurs at design time, and because the executable code is built at deployment time for a particular runtime environment, the resulting application code may perform as efficiently as if a developer had written the code manually to target the particular execution environment where the application runs (e.g., a particular computing device or operating system). Moreover, because the various data sources, parsers, analytics components, or other components in the library may be optimized and debugged for fast, reliable performance, implementations may reduce the latency of processing multiple streams of incoming data by the application in real time.

[0023] FIG. 1 depicts an example schematic of an application 100 according to implementations of the present disclosure. The application 100 may include one or more data sources 102 that specify dynamic data source(s) such as streams of data to be analyzed in real time by the application 100 during its execution. The application 100 may also include any number of static data sources 102 such as databases, file systems, document repositories, and so forth. The application may include one or more parsers 104. Each parser 104 may access data received from one or more data sources 102. The parsers 104 may transform or otherwise process the data, and output the data in a form that is consistent regardless of the particular incoming data formats or data types received from the data source(s) 102. In some cases, an application 100 may include one or more filters 106 that filter the data that is output from parser(s) 104 prior to sending the data to be processed by the analytics component(s) 108. The analytics component(s) 108 may apply rules, business logic, or perform any other type of operations with respect to the data received from the parser(s) 104 and/or the filter(s) 106. In some cases, as in the example of FIG. 1, an application 100 may include multiple analytics components 108 that process the data serially or in parallel. For example, an analytics component 108A may receive data from a parser 104, perform operations on the data, and output data to another analytics component 108B which performs further operations on the data, and so forth. One or more of the analytics component(s) 108 may provide data to output component(s) 110 that output data resulting from the operations of the various analytics component(s) 108. Implementations support the use of output component(s) 110 to output data using various methods including, but not limited to, communicating the output data (e.g., via email, text message, push message, etc.), storing the output data in data store(s) such as database(s), writing the output data to file(s), and so forth. The application 100 may be any type of application that processes data, in any manner, from one or more data sources 102.

[0024] Although the filter(s) 106, analytics component(s) 108, and the output component(s) 110 may be described herein as performing distinct operations relative to one another, implementations are not limited to these distinctions. The various filter(s) 106, analytics component(s) 108, and output component(s) 110 in an application 100 may be included in a set of processing component(s) 112. Each processing component 112 may perform operations for one or more of the following: filtering, sampling, or otherwise determining a subset of the data output from the parser(s) 104; transforming or otherwise modifying the data; applying logic, rules, or other (e.g., business) operations to the data; or outputting data via one or more output channels.

[0025] Although FIG. 1 depicts an example application 100 that includes a particular set of components arranged according to a particular topology (e.g., with particular connections between the components), implementations are not limited to this example. Implementations support the creation, deployment, and execution of an application 100 that includes any number and any type of components arranged according to any topology.
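As a minimal sketch of one linear topology of the kind described for FIG. 1, the following Python example passes each message through a parser and then through an ordered series of processing components. The function names and the convention that a filter returns `None` to drop a record are assumptions made for this example.

```python
def run_pipeline(messages, parser, processing_components):
    """Route each message through the parser and then each processing component."""
    outputs = []
    for message in messages:
        record = parser(message)
        for component in processing_components:
            record = component(record)
            if record is None:  # a filter dropped the record; stop processing it
                break
        else:
            outputs.append(record)  # the record passed every component
    return outputs


def keep_large(value):
    """Hypothetical filter: pass values greater than 5, drop the rest."""
    return value if value > 5 else None


def double(value):
    """Hypothetical analytics component: apply simple logic to the value."""
    return value * 2


results = run_pipeline(["3", "12", "7"], int, [keep_large, double])
```

Treating filters and analytics components uniformly as callables mirrors the grouping of these components into a common set of processing components; branching or parallel topologies would require a more general routing scheme than this sketch provides.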

[0026] FIGS. 2A and 2B depict an example design platform UI 200 that may be employed to create an application according to implementations of the present disclosure. During the design phase of the application 100, a developer or other user may access the design platform UI 200 to create a new application 100, or modify a previously created application 100. In some implementations, the design platform UI 200 includes a design pane 202 and a library pane 204. The library pane 204 may provide a list of various components that the user may select to include in the application 100. To add a component to the application 100, the user may drag and drop the component from the library pane 204 into the design pane 202. Components may be added to the application 100 in any order. The design pane 202 may also enable a user to remove component(s) from the application 100. In the example of FIG. 2A, the user has dragged and dropped a data source 102A and a parser 104B into the design pane 202 to add these components to the application 100. In the example of FIG. 2B, the user has added two additional processing components 112 - an analytics component 108B and an output component 110A - and specified a connection 206 between the parser 104B and the analytics component 108B. In some implementations, the design pane 202 enables the user to connect various components by drawing a line between the components. The connection 206 may be directional, to indicate that data output from one component is to be received and processed by another component.

[0027] For each component, design platform UI 200 may also enable the user to specify attributes or other details regarding the component. In some implementations, the details may be specified via a dialog 208 that pops up, or is otherwise presented, when the user selects a particular component to modify. For example, the user may select a data source 102 to specify the format, type, size, or other aspects of the data received from the data source 102. In some cases, the dialog 208 may provide a list of supported types of data sources 102 that the user may select from, such as a JavaScript Object Notation (JSON) stream, other types of Extensible Markup Language (XML) formatted data, streaming video in any format, streaming audio in any format, data from relational or non-relational datastores, data from particular social networks, and so forth. The user may specify the format or other details for a data source 102, such as a network address, port number, credentials (e.g., login and password), or other parameters. For some types of data sources 102, the user may indicate a particular subset of particular characteristics of the data to be received. For example, the user may specify one or more Twitter hashtags or publishers to receive a particular set of data from a Twitter feed data source 102.

[0028] The dialog 208 may also allow the user to edit or otherwise configure other types of components. For example, the user may edit a parser 104 to specify the data type or format to be parsed, or names and data types of parsed fields being output. The user may edit a filter 106 to specify filter parameters. The user may edit an output component 110 to specify the particular details for data output, such as location(s) (e.g., email addresses, file locations, databases) where the output data is to be sent, how frequently the data is to be output, whether the output data is to be cached prior to sending, and so forth. In some implementations, the dialog 208 enables a user to write custom code for a component such as an analytics component 108, in cases where the design platform does not provide a preconfigured analytics component 108 that is suitable for the user's needs.

[0029] The data sources 102 may include dynamic data sources (e.g., streaming data) or static data sources. In some implementations, one data source 102 may be used to tag, filter, or otherwise process another data source 102. For example, a static data source 102 may be used to tag a dynamic data source 102. In a particular example, a static data source 102 may provide the location, user information, device specifications, or other information regarding a particular device identifier (ID). For dynamic data received from the device, the dynamic data including the device ID, the information from the static data source 102 may be employed to tag or otherwise supplement the dynamic data with the additional information regarding the particular device.
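The tagging of dynamic data with information from a static data source may be sketched as follows. This Python example is illustrative only; the in-memory dictionary stands in for a static data source, and the field names and device identifiers are hypothetical.

```python
# Stand-in for a static data source keyed by device identifier (hypothetical data).
DEVICE_INFO = {
    "dev-1": {"location": "Building A", "model": "T100"},
}


def tag_event(event: dict, static_source: dict) -> dict:
    """Supplement a dynamic event with static information for its device ID."""
    extra = static_source.get(event["device_id"], {})
    # Return a new record so that the incoming event is left unmodified.
    return {**event, **extra}


tagged = tag_event({"device_id": "dev-1", "value": 7}, DEVICE_INFO)
untagged = tag_event({"device_id": "dev-9", "value": 3}, DEVICE_INFO)
```

In this sketch an event from an unknown device passes through without additional fields; an implementation could instead drop or flag such events.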

[0030] In some cases, the design platform UI 200 may enable the user to drag and drop a parser 104 for each type or format of data source 102 included in the application 100. There may be one parser 104 for multiple sources, such as a single parser 104 specified to handle data received from multiple JSON data sources 102. Parser(s) 104 may be preconfigured to handle different types of data. In some cases, the user may specify the particular format, data type, format version, protocol, protocol version, device type, device version, firmware version, and so forth for the parser 104 by adding information into the dialog 208 presented to edit the parser 104. In some cases, such as for a JSON data source 102, the dialog 208 may enable the user to upload a particular schema or arrangement of the data to be received from the data source 102. In some implementations, the design platform may be configured to dynamically discover the data type or data format provided by a data source 102, based on information published by the data source 102 (e.g., in header information).
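As one hedged illustration of discovering a data format from information published by a data source, the following Python sketch selects a parser based on a `Content-Type` header. The use of that particular header, the parser registry, and the function names are assumptions for the example; the disclosure does not prescribe a discovery mechanism.

```python
import json
import xml.etree.ElementTree as ET


def parse_xml(body: str) -> dict:
    """Flatten the child elements of the root into name/text pairs."""
    return {child.tag: child.text for child in ET.fromstring(body)}


# Hypothetical registry mapping a published media type to a parser.
PARSERS = {
    "application/json": json.loads,
    "application/xml": parse_xml,
}


def discover_and_parse(headers: dict, body: str) -> dict:
    """Choose a parser from the header information, then parse the body."""
    media_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if media_type not in PARSERS:
        raise ValueError(f"no parser registered for {media_type!r}")
    return PARSERS[media_type](body)


from_json = discover_and_parse(
    {"Content-Type": "application/json; charset=utf-8"}, '{"id": "a1"}'
)
from_xml = discover_and_parse(
    {"Content-Type": "application/xml"}, "<event><id>a1</id></event>"
)
```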

[0031] The design platform may provide any number of preconfigured analytics modules 108 or other types of processing module 112 to perform various operations with respect to the data. The design platform may also enable a user to manually write code to specify operations to be performed on the data. The filter(s) 106, or other processing module(s) 112, may filter the data output by the parser(s) 104 based on any number or any type of conditions. Filtering may be based on particular data values in the data itself (e.g., filter to select data for which X > N). Filtering may also be based on conditions on external variables (e.g., do not process data from a data source 102 between the hours of midnight and 8:00 am).
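The two kinds of filter conditions described above, one on data values and one on an external variable such as the time of day, may be sketched in Python as follows. The condition signature, the field name `x`, and the helper names are hypothetical.

```python
from datetime import datetime, time


def value_greater_than(threshold):
    """Condition on the data itself: keep records for which x > threshold."""
    return lambda record, now: record["x"] > threshold


def outside_hours(start=time(0, 0), end=time(8, 0)):
    """Condition on an external variable: reject records between start and end."""
    return lambda record, now: not (start <= now.time() < end)


def passes(record, now, conditions):
    """A record passes the filter only if every condition holds."""
    return all(condition(record, now) for condition in conditions)


conditions = [value_greater_than(5), outside_hours()]
noon = datetime(2016, 9, 21, 12, 0)
early = datetime(2016, 9, 21, 3, 0)
```

Passing the current time as an argument, rather than reading the system clock inside the condition, keeps the conditions in this sketch deterministic and easy to test.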

[0032] The design platform UI 200 may include various other controls to enable the user to design the application 100. In some implementations, the design platform UI 200 includes a save button or other control (not shown) to cause the user-provided application information to be saved. The application 100 may be saved as metadata that describes the various components, component details, connections 206 between components, user-written code, or other aspects.

[0033] After the application metadata is saved, the application 100 may be deployed to one or more computing devices. The deployment environment may be a production environment or a test environment. As part of deployment, the design platform may validate the application 100 to ensure that it operates in a suitable manner. Validation may include generating source code based on the prebuilt components included in the application 100, and compiling source code for the application 100 to generate executable code (e.g., instructions that are executable by processor(s) in the deployment environment). The design platform may generate source code for the various components and source code for the connections 206 between components. The source code may then be compiled, linked, or otherwise built to generate executable code. In some implementations, executable code may be deployed to an Apache Storm cluster, or other type of application server(s), web server(s), or other computing environments.
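For illustration, the following Python sketch generates source code from metadata and compiles it as a validation step, using Python's built-in `compile` and `exec` functions as a stand-in for whatever compiler a given execution environment requires. The metadata layout (a list of components, each with an `expression` string) is hypothetical.

```python
def generate_source(metadata: dict) -> str:
    """Generate source for a function that applies each component in order."""
    lines = ["def run(value):"]
    for component in metadata["components"]:
        # Each hypothetical component contributes one expression over `value`.
        lines.append(f"    value = {component['expression']}")
    lines.append("    return value")
    return "\n".join(lines)


def validate_and_build(metadata: dict):
    """Compile the generated source; a SyntaxError indicates failed validation."""
    source = generate_source(metadata)
    code = compile(source, "<generated>", "exec")
    namespace = {}
    exec(code, namespace)  # define the generated function
    return namespace["run"]


run = validate_and_build(
    {"components": [{"expression": "value * 2"}, {"expression": "value + 1"}]}
)
result = run(5)
```

Executing generated code in this way is suitable only for trusted input; the sketch is intended to show the generate-then-compile sequence, not a hardened build process.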

[0034] The application 100 may be executed in the deployment environment. In some implementations, one or more instances of each component within the application 100 are executed, and the user may be able to specify through the design platform how many instances of each component are to be executed. The number of execution instances may be tuned for optimal performance. In some cases, such tuning may include automatic tuning or optimization that is based at least partly on machine learning. The user may also be able to specify other runtime parameters for the application 100. In some implementations, the application 100 may be deployed to and execute on a computing environment that is hosted, operated, and maintained by the organization that provided the design platform. In some implementations, the executable code (e.g., binaries) for the application 100 may be provided for deployment in other environments.

[0035] The metadata may be stored and accessed during subsequent user sessions with the design platform UI 200, to modify the application 100. In some implementations, the metadata may be used to clone an application 100 or otherwise compose application(s) 100 based on other application(s) 100. In such cases, the application metadata from a first application 100 may be accessed, and in some cases modified, to create a second application 100 based at least partly on the first application 100. The component(s) of an application 100 may be reusable across multiple other applications 100.
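Cloning an application from its metadata may be sketched as a deep copy followed by selective modification. In the following Python example the metadata keys and the `overrides` parameter are hypothetical and serve only to illustrate the idea.

```python
import copy


def clone_application(metadata: dict, new_name: str, overrides=None) -> dict:
    """Create a second application's metadata based on a first application's."""
    clone = copy.deepcopy(metadata)  # the first application's metadata is untouched
    clone["application"] = new_name
    for component in clone["components"]:
        # Apply per-component configuration changes, if any were supplied.
        component["config"].update((overrides or {}).get(component["id"], {}))
    return clone


first = {
    "application": "first-app",
    "components": [{"id": "source1", "config": {"port": 9000}}],
}
second = clone_application(first, "second-app", {"source1": {"port": 9001}})
```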

[0036] The design platform may enable a user to write code for a component via the dialog 208 or other tools. In some cases, the user may be enabled to write source code in Java or another programming language. The user may also be able to write Apache Storm code, SQL statements, or other types of code. In some cases, the dialog 208 may enable the user to specify rules or other types of declarative statements instead of writing source code, and the design platform may generate source code based on the rules or declarative statements. User-written components may be saved in a library that is accessible to other users, and used by the other users in subsequently created applications 100. In some implementations, a software development kit (SDK) may be provided to enable a user to access the design platform UI within an integrated development environment (IDE) where the SDK is installed.

[0037] FIG. 3 depicts an example architecture of the design platform 300 according to implementations of the present disclosure. The Graphical User Interface (GUI), i.e., the design platform UI 200, is the connection between a human user and the Application Programming Interface (API). The GUI may include any number of drag and drop actions, click selections, data entry actions, and menus through which a user communicates with system services via the API. All these actions in combination may be used to graphically design, create, deploy, manage, and monitor a real-time processing application or other type of application 100. The GUI may also provide a customizable interface to add and manage application components, and to manage one or more real-time processing engine clusters.

[0038] The Services Layer may include the API of exposed functions (e.g., services) available, as well as the underlying implementations that operate to accomplish those functions. Services may be exposed by a RESTful service over HTTP. The Services Layer may be resource-based, with resources manipulated through their representations, may provide self-describing messages, and may be stateless. Services may include any number of the software modules working in concert to accomplish the designing, creating, deploying, managing, and monitoring of real-time data processing applications, and for managing and monitoring Engines executing those applications. The Services Layer may also include a real-time application component framework that provides an abstraction layer to simplify the development of real-time application components and enable support of multiple programming/execution environments.

[0039] A Real-Time Processing Engine, also referred to as an Engine, may include a programming environment such as Apache Storm where real-time applications are deployed and executed. The Engine may also include other types of programming environments. Valid real-time applications, also called topologies, may be in the form of a Directed Acyclic Graph (DAG). The input (e.g., data source) components, parsers, processing components, data-sink (e.g., output) components, or other components included in an application may be referred to as components.

[0040] As described above, the design platform 300 may include one or more (e.g., online) application libraries or component libraries. One or more structured, web-based libraries can be accessed via the system GUI or a separate web browser to search for and (via the GUI) seamlessly install application components and/or templates into the system for composing into real-time applications. In addition to a global library, users may be able to designate an additional online library for this purpose. For example, an organization may wish to share application templates and components among developers within certain groups, or across an entire organization.

[0041] In some implementations, the Metadata Store is an internal service comprising a single, consistent interface to a virtualized persistent store where all metadata is kept for the system. It may be provided to all modules in the system for each specific module's use. Though it is presented to the rest of the system as a single service it may be backed by multiple diverse storage technologies and/or devices. Application-specific parameters, message formats and component relationships may be decoupled from code contained within each component using metadata to represent those parameters, formats and relationships.

[0042] At the highest level of granularity, the Metadata Store maintains metadata describing: Applications, Components, Infrastructure, and other system internals. "Applications" may include metadata representations of real-time application topologies. "Components" may include the metadata necessary to design, validate and instantiate a particular component to be used in composing an Application. "Infrastructure" metadata may be related to host, cluster, deployment, and performance. System internals metadata includes (but is not limited to) audit, user, error, and other data necessary to operate and maintain the system. The Metadata Store may not store any data produced by an Application. In some cases, the Metadata Store may be used by the system and not for storing user-produced data.

[0043] The functions provided by the GUI may include application design, as well as both functional and performance management of application components, applications, and processing engines. The GUI may also make use of the Component Framework to present configurable aspects of those Components being used to compose an Application.

[0044] The Application Design portion of the GUI, also referred to as the Designer, enables a user to compose a real-time application from installed components, validate that application, and deploy it, in some cases all without writing any code. The method by which a user designs an application may include a series of drag, drop and other GUI interactions, and possibly other standard data entry, to set component parameters and/or otherwise control what is to be done by the application. The application may be validated, created, and deployed to the user-designated Engine. The Designer may have a toolbox available where the user may select components. These may include components with prebuilt logic to accomplish their task. Depending upon its design, a component may employ user-entered configuration data, such as information entered through the dialog 208 described above.

[0045] In some implementations, Component Management portions of the GUI are used to install, remove, and configure reusable application components. A user can create their own component(s) using the Component Framework for a specific function, and install it via the GUI into the system for use in real-time applications.

[0046] Implementations may support various types of components, including: (1) Stream Source (e.g., data source 102), an adapter that connects an external source of event-stream data to the application and emits that source's raw event data messages; (2) Stream Parser (e.g., parser 104), that parses raw event data from a Stream Source and defines a design-time schema for the data using Java bean encapsulation; and (3) Stream Processor (e.g., filter 106, analytics component 108, output component 110, or other processing component(s) 112), that performs any other kind of operation on incoming event data.

[0047] The majority of components used in typical applications may be Stream Processors. They perform functions including but not limited to filtering and transformation, applying business logic and predictive analysis, writing data into static repositories, sending messages to human beings or external systems/devices, etc.

[0048] Application Management functions include, but are not limited to: listing, creating, deleting, modifying, deploying, undeploying, activating, deactivating, and rebalancing real-time applications. Creation and modification of applications may open the appropriate application design screen(s).

[0049] Performance Monitoring capabilities may be provided to enable a user to view the current health of an application and the cluster on which it is running. It may present details of components and their throughput, as well as potential suggestions for improvement. In an application's performance summary view, the GUI may present a snapshot of overall cluster health as well as links to other applications running on the same cluster.

[0050] The RESTful API may present the GUI and the command line interface user with a self-defining interface for all resources. POSTs and PUTs to the interface may use JSON format, and resources represented by the API may return JSON. If a user navigates to the root of the application, all resources may be presented with links to the listing of those resources. In some cases, the resources themselves may be self-defined on how they may be modified. The details of each of the resources are explained for each resource. In general, and unless otherwise specified, all resources may have the following HTTP methods available on them: GET, POST, PUT and DELETE. GET can provide many functions but in at least some cases it may obtain a list of resources or a particular instance of a resource. The POST method may create a new resource and the PUT method may update a resource with a new value. The DELETE method may delete a resource.
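By way of a non-limiting illustration, the following Java sketch builds one request per HTTP method described above, using the standard `java.net.http` client available in Java 11 or later. The base URL, the resource path layout, and the class name `ApplicationRequests` are assumptions made for the example; the disclosure does not define them. The requests are only constructed, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds, but does not send, one request per HTTP method described above.
class ApplicationRequests {
    // Hypothetical base URL; the disclosure does not define the URL layout.
    static final String BASE = "http://localhost:8080/api/applications";

    // GET on the collection obtains a list of resources.
    static HttpRequest list() {
        return HttpRequest.newBuilder(URI.create(BASE)).GET().build();
    }

    // POST creates a new resource from a JSON body.
    static HttpRequest create(String json) {
        return HttpRequest.newBuilder(URI.create(BASE))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // PUT updates a named resource with a new JSON value.
    static HttpRequest update(String name, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + name))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // DELETE removes a named resource.
    static HttpRequest delete(String name) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + name)).DELETE().build();
    }
}
```

A caller could pass any of these requests to `HttpClient.send` with a string body handler and parse the JSON returned by the service.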

[0051] In some implementations, API functions may not map perfectly to GUI functions. For example, the GUI can enable a user to "delete" a running application by making a series of API calls on the user's behalf to deactivate, undeploy and then delete the application. Resources accessed via the API may maintain fairly simple state models and appropriately enforce state-transition rules.
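As a minimal sketch of such a state model, the following Java class enforces transition rules for an application resource and composes the GUI-level "delete" from the deactivate, undeploy, and delete calls. The state names, the class name `AppLifecycle`, and the choice to return an undeployed application to a validated state are assumptions made for illustration; the disclosure names the actions but not the set of states.

```java
import java.util.ArrayList;
import java.util.List;

class AppLifecycle {
    // Hypothetical states; the disclosure does not enumerate them.
    enum State { DEFINED, VALIDATED, ACTIVE, INACTIVE, DELETED }

    private State state = State.DEFINED;
    private final List<String> log = new ArrayList<>(); // API calls that were accepted, in order

    State state() { return state; }
    List<String> log() { return log; }

    void validate()   { move("validate", State.DEFINED, State.VALIDATED); }
    void deploy()     { move("deploy", State.VALIDATED, State.ACTIVE); }
    void deactivate() { move("deactivate", State.ACTIVE, State.INACTIVE); }
    void activate()   { move("activate", State.INACTIVE, State.ACTIVE); }

    // Undeploy is accepted for any deployed application, running or not.
    void undeploy() {
        if (state != State.ACTIVE && state != State.INACTIVE) {
            throw new IllegalStateException("undeploy not allowed in state " + state);
        }
        log.add("undeploy");
        state = State.VALIDATED; // assumption: the application stays validated
    }

    // The API-level delete is rejected while the application is deployed.
    void delete() {
        if (state == State.ACTIVE || state == State.INACTIVE || state == State.DELETED) {
            throw new IllegalStateException("delete not allowed in state " + state);
        }
        log.add("delete");
        state = State.DELETED;
    }

    // The GUI-level "delete" issues the series of API calls on the user's behalf.
    void deleteFromGui() {
        if (state == State.ACTIVE) deactivate();
        if (state == State.INACTIVE) undeploy();
        delete();
    }

    private void move(String action, State from, State to) {
        if (state != from) {
            throw new IllegalStateException(action + " not allowed in state " + state);
        }
        log.add(action);
        state = to;
    }
}
```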

[0052] Resources exposed via the RESTful API (and functions available on each) may include but are not limited to Applications, Components, and various system-logged events. Functions allowed on any particular resource may be dependent on that resource's current state. Valid functions (and state-related restrictions) for each resource are described in their associated service-component description. For example, functions that can be performed on Application resources are described in the Application Manager paragraph(s) below.

[0053] The Resource Controller may be the internal service responsible for handling, dispatching, and providing responses to all requests made by external entities (such as the GUI) via the RESTful API. The Resource Controller may be the primary entity that creates, modifies, and deletes resources referred to by other internal services and external API calls. It may also provide resource lists and resource metadata, and may coordinate actions by other internal services, in response to RESTful API requests.

[0054] Resource-management functions not handled directly by the Resource Controller may be dispatched to other internal services for handling. Resource-management functions provided by the Resource Controller may include but are not limited to the following:

[0055] For APPLICATIONS:

[0056] LIST: Return a list of applications defined in the system. Parameters such as filters may be present in the request.

[0057] CREATE: Creates a new, empty application with a specified name that may be unique.

[0058] DELETE: Deletes named application from the metadata store. Deletion can be virtual, retaining metadata for audit purposes, but removing the resource from normal operation service interfaces.

[0059] READ: Fetch the current JSON metadata content for the named Application.

[0060] MODIFY: Replace the JSON metadata content for the named Application. If a new name is present, replace the Application's name but output an error if the name is not unique. Modification can be virtual, retaining the replaced metadata for audit purposes, but only exposing the latest copy to normal operation service interfaces.

[0061] For COMPONENTS:

[0062] RETIRE: A Component state may be set to "retired," to mark a component as no longer available for use in new Applications. This enables a component's continued use in existing Applications, but provides a way to force users to select a different or preferred component in future applications.

[0063] LIST: Returns a list of Components installed in the system. Parameters such as filters may be present in the request.

[0064] CREATE: Installs a new component into the system, the component having a unique name. Parameters include the location of the component package (tarball). The Resource Controller calls the Component Ingestor with the tarball location to process the component package, and return either success or an error status.

[0065] DELETE: Deletes the named component from the metadata store. Before a component can be deleted, the Resource Controller first verifies that there are no Applications referencing it. If there are, an error is returned. Deletion can be virtual, retaining metadata for audit purposes, but removing the resource from normal operation service interfaces. At least some implementations provide a "force delete" parameter to unconditionally remove or disable the component, and disable all applications that reference it. However, the service may still return an error if any such application is deployed (whether or not it's active).

[0066] READ: Fetch the current JSON metadata content for the named component.

[0067] MODIFY: Replace the JSON metadata content for the named component. In some cases, only those aspects allowed to be modified by the user may be changed. If a new name is present, replace the component's name but provide an error if the name is not unique. Modification can be virtual, retaining the replaced metadata for audit purposes, but only exposing the latest copy to normal operation service interfaces. In some implementations, RETIRE functionality is included in MODIFY.
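The reference checks described for the component DELETE function may be sketched as follows. This is an illustrative in-memory model only: the class `ComponentRegistry`, its method names, and the use of an `Optional` error message are assumptions, and the virtual (audit-retaining) form of deletion is not modelled.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class ComponentRegistry {
    // Component name -> names of the applications that reference it.
    private final Map<String, Set<String>> references = new HashMap<>();
    private final Set<String> deployedApps = new HashSet<>();
    private final Set<String> disabledApps = new HashSet<>();

    void install(String component) { references.putIfAbsent(component, new HashSet<>()); }
    void reference(String app, String component) {
        references.computeIfAbsent(component, k -> new HashSet<>()).add(app);
    }
    void markDeployed(String app) { deployedApps.add(app); }
    boolean isInstalled(String component) { return references.containsKey(component); }
    boolean isDisabled(String app) { return disabledApps.contains(app); }

    // Returns an empty Optional on success, or an error message.
    Optional<String> delete(String component, boolean force) {
        Set<String> apps = references.get(component);
        if (apps == null) return Optional.of("unknown component: " + component);
        // A deployed referencing application blocks deletion even when forced.
        for (String app : apps) {
            if (deployedApps.contains(app)) {
                return Optional.of("component is used by deployed application: " + app);
            }
        }
        // Without force, any referencing application blocks deletion.
        if (!apps.isEmpty() && !force) {
            return Optional.of("component is referenced by " + apps.size() + " application(s)");
        }
        disabledApps.addAll(apps); // forced: disable every application that referenced it
        references.remove(component);
        return Optional.empty();
    }
}
```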

[0068] For ENGINES:

[0069] LIST: Returns a list of Real-Time Processing Engines currently defined in the system. Parameters such as filters may be present in the request.

[0070] ADD: Calls the Engine Manager to install a new Engine into the system.

[0071] REMOVE: Removes the named Engine cluster from the metadata store. Before an Engine can be deleted, the Resource Controller first verifies that there are no Applications referencing the Engine, and calls the Engine Manager to verify that no applications are running on it. If either of those checks fails, an error is returned. Removal can be virtual, retaining metadata for audit purposes, but removing the resource from normal operation service interfaces.

[0072] READ: Fetch the current JSON metadata content for the named Engine.

[0073] MODIFY: Replace the JSON metadata content for the named Engine. In some cases, only those aspects allowed to be modified by the user may be changed. If a new name is present, replace the engine's name but output an error if the name is not unique. Modification can be virtual, retaining the replaced metadata for audit purposes, but only exposing the latest copy to normal operation service interfaces.

[0074] In some implementations, the Compiler is an internal service responsible for the transformation of a specified Application's metadata into a cohesive compiled (e.g., jar) file that, after deployment onto a real-time processing Engine, may perform the actions described in the metadata. Additional functions provided by the Compiler may include validating Application metadata (ensuring the application may be a directed acyclic graph), generating source code, and compiling that code into a standalone artifact. Internally the Compiler may be dependent on the Component Framework for the generation of code for components as well as stitching the components together (e.g., message-routing). The compiler may be used in a "just in time" manner. In some cases, compiled artifacts are used for immediate deployment onto an Engine, such as when called by the Application Manager as part of processing an application "Deploy" request. When deployment is completed, the Application Manager may delete the artifacts. If the compiler is called as part of a validation process (e.g., no compiled artifact location is provided by the caller), any created artifacts may be (e.g., immediately) deleted by the Compiler.

[0075] In some implementations, the Component Framework is a set of internal services that provides extensibility of components (Stream Sources, Parsers and Processors) that are available to the user for composing into Applications. The Component Framework may provide an API that exchanges metadata and user entered data for GUI fragments, performs component parameter validation, dynamically generates component code from metadata, and provides a mechanism for adding new components to the system. Implementations may include connection points that provide access to these functions, and connection points may include the Displayer, Validator, Generator, and Ingestor.

[0076] The Displayer may provide the toolbox view of a component to be displayed on the GUI. It may also provide the HTML fragments necessary to display the popup that appears when the user clicks on a component in the application workspace canvas. In some cases, it provides a mechanism for integrating user-provided configuration data with component metadata that allows updates to previously entered data.

[0077] To display Component icons in a graphical "toolbox," the GUI may make a call to the RESTful API asking for all the components to display in the toolbox. The Resource Controller may retrieve metadata for some, or all, of the components registered in the system from the Metadata Store. The metadata may be provided to the Displayer, which returns the HTML fragments necessary to display the component icon in the toolbox and the JavaScript code to load the correct popup page. These responses may be held in a local cache until a change for this component is detected by the Ingestor, which clears the cache of any components affected by updates.

[0078] When the user clicks on a component within an application-design workspace, the GUI may make a RESTful API call, coordinated by the Resource Controller, asking for that component's popup display. The Resource Controller may retrieve the component's metadata from the Metadata Store and give the metadata to the Displayer, which returns the HTML fragment. The HTML may include fields to collect appropriate data from the user for this component. If the component already has configuration data associated with it (e.g., previously entered data), the Displayer may integrate the configuration data with the HTML fragment and return an HTML fragment with the user-entered data filled in on the popup.

[0079] The connection points may also include a Validator. During application design, when a user clicks the "save" button in a component-configuration window, the GUI may make a call to the RESTful API. The Resource Controller may obtain the component's metadata and the user-entered data from the Metadata Store, and provide them to the Validator. The Validator may return to the Resource Controller either a success status, indicating the parameters have no errors and are ready to be saved, or a list of errors to be resolved. The Resource Controller may then either return the errors to the RESTful API caller (the GUI) or save the user-entered data to the Metadata Store and return a success status to the API caller. If the save is successful, then the GUI may close the configuration window. If there are errors, the user may either correct them or close the configuration window without saving.
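The validate-then-save exchange described above may be sketched in Java as follows. The parameter metadata (`ParamSpec`, with "required" and "integer" constraints), the class name `ComponentConfigService`, and the in-memory map standing in for the Metadata Store are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ComponentConfigService {
    // Hypothetical parameter metadata: a name and two simple constraints.
    static final class ParamSpec {
        final String name;
        final boolean required;
        final boolean integer;
        ParamSpec(String name, boolean required, boolean integer) {
            this.name = name;
            this.required = required;
            this.integer = integer;
        }
    }

    // Stands in for the Metadata Store: component id -> saved user-entered data.
    final Map<String, Map<String, String>> store = new HashMap<>();

    // Validator role: an empty list means success, otherwise one message per error.
    static List<String> validate(List<ParamSpec> specs, Map<String, String> userData) {
        List<String> errors = new ArrayList<>();
        for (ParamSpec spec : specs) {
            String value = userData.get(spec.name);
            if (value == null || value.trim().isEmpty()) {
                if (spec.required) errors.add(spec.name + " is required");
                continue;
            }
            if (spec.integer) {
                try {
                    Integer.parseInt(value.trim());
                } catch (NumberFormatException e) {
                    errors.add(spec.name + " must be an integer");
                }
            }
        }
        return errors;
    }

    // Resource Controller role: save only when validation reports no errors.
    List<String> save(String componentId, List<ParamSpec> specs, Map<String, String> userData) {
        List<String> errors = validate(specs, userData);
        if (errors.isEmpty()) store.put(componentId, new HashMap<>(userData));
        return errors;
    }
}
```

In this sketch the caller (the GUI in the description above) would close the configuration window when the returned list is empty and display the messages otherwise.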

[0080] The connection points may include a Generator. When the Compiler is asked to compile an Application (as part of a Validate or Deploy request, for example), it may retrieve metadata associated with the Application and its components (including parameters) from the Metadata Store. The Compiler may package the metadata into a single Transfer Object, and pass it to the Generator. The Generator may decompose the package, and build each component based on its metadata, user entered parameters, and other artifacts provided by the component. The Generator may also build message-routing code for each component based on the Application's metadata describing inter-component stream connections. In some implementations, the Generator generates all the source code for each component and places the source code in a cumulative source folder. After all components have been generated, the Application itself may be generated using its metadata.

[0081] The Component Ingestor provides the means to add new components to the system, and make them available to all internal services and the external API (and thus also in an associated toolbox within the GUI). The Ingestor may validate that all artifacts and metadata are supplied, place them in locations within the Metadata Store, and make them available to the rest of the system.

[0082] Each component includes any combination of metadata, source files, and display files. The metadata may provide information such as name of the component, the icon to display in the GUI toolbox, files used to build the popup, and all the files used to generate the code. The files used to generate the GUI popup and used to generate the source code may be templates. In some cases, the Component Framework dynamically merges user-entered configuration data and Application metadata with the component's template files to produce GUI displays and message-routing code.
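As one possible illustration of merging configuration data with a template file, the following Java sketch substitutes named placeholders in a template string. The `${name}` placeholder syntax and the class name `TemplateMerger` are assumptions; the disclosure does not specify a template format.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateMerger {
    // Hypothetical placeholder syntax: ${name}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([A-Za-z0-9_]+)\\}");

    // Replaces every placeholder with its value; fails if a value is missing.
    static String merge(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            if (!values.containsKey(key)) {
                throw new IllegalArgumentException("no value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(values.get(key)));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

The same routine could serve both a display template (an HTML fragment with previously entered values filled in) and a source-code template. A production implementation would also escape values for the target format, which this sketch omits.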

[0083] Each component may be packaged for distribution, and ingestion into the system, using a tar.gz formatted file with an internal structure comprising a folder which contains one or more metadata files, a display-object folder and a source-file folder. Implementations also support the use of other file formats for distribution and ingestion of component(s) into the system.

[0084] In some implementations, the Application Manager is an internal service acting as the programmatic control center for the real-time processing Engine. It may provide the mechanism for deploying and controlling real-time Applications on connected Engines. Functions provided by the Application Manager include (but are not limited to): Validate, Deploy, Undeploy, Activate, Deactivate, Simple Rebalance, and Complex Rebalance, described further below.

[0085] Applications may be compiled from metadata prior to deploying onto an execution Engine. This "just in time compilation" scheme may eliminate object-tracking complexity and reduce the system's storage space requirements.

[0086] VALIDATE: The validate action may ensure that the application described in its associated JSON is complete and defines a DAG. To verify the Application constitutes an acyclic graph, this function may traverse all nodes of the topology as described in the JSON metadata representing the Application. If any node is encountered more than once during the traversal, then a loop has been detected and traversal of that topology branch may stop. When traversal of all branches has completed, if any loops were detected this function may return an error to the caller along with a list of nodes that had been visited more than once. Otherwise, this function may call the Generator and the Compiler may attempt to generate and compile code for the application. If the generation and compilation steps complete successfully, then this function may mark the Application as "validated," making it eligible for deployment and execution. Otherwise, the error(s) encountered may be returned to the caller.
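The acyclicity check in the VALIDATE action may be sketched as a depth-first traversal. The sketch below assumes that "encountered more than once" refers to a node seen twice along the same branch, since a node reached by two different branches (a diamond-shaped topology) does not form a loop. The class name `TopologyValidator` and the adjacency-map representation of the topology are likewise assumptions; the disclosure stores the topology as JSON metadata.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TopologyValidator {
    // Returns the nodes at which a loop was detected; an empty set means the graph is acyclic.
    static Set<String> findLoopNodes(Map<String, List<String>> edges) {
        Set<String> loopNodes = new TreeSet<>();
        Set<String> finished = new HashSet<>();
        for (String node : edges.keySet()) {
            visit(node, edges, new HashSet<>(), finished, loopNodes);
        }
        return loopNodes;
    }

    private static void visit(String node, Map<String, List<String>> edges,
                              Set<String> currentBranch, Set<String> finished,
                              Set<String> loopNodes) {
        if (currentBranch.contains(node)) {
            loopNodes.add(node); // seen twice on this branch: stop traversing it
            return;
        }
        if (finished.contains(node)) {
            return; // already fully explored via another branch
        }
        currentBranch.add(node);
        for (String next : edges.getOrDefault(node, Collections.emptyList())) {
            visit(next, edges, currentBranch, finished, loopNodes);
        }
        currentBranch.remove(node);
        finished.add(node);
    }
}
```

Which node of a given loop is reported depends on where the traversal starts, so a caller should treat the result as "at least one node per detected loop" rather than a complete list of loop members.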

[0087] DEPLOY: This action may deploy an Application onto a designated Engine, request and allocate Engine resources, and start the Application. When asked to deploy an Application, the Application Manager may request the Compiler to build that Application from its metadata into an appropriate executable form, and place it in a specified location. Once the Compiler has completed its task it may notify the Application Manager, which deploys the compiled artifact onto the Engine. Deployment may be allowed on a validated application, but not for an unvalidated application.

[0088] UNDEPLOY: This action may stop an Application if it is running, remove it from the Engine, and release related Engine resources. The action may be allowed on a deployed application.

[0089] DEACTIVATE: This action may stop a running Application, but leave it deployed on the Engine. In the case of Apache Storm, this may be done by stopping the Application's input "Spouts."

[0090] ACTIVATE: This action may start a deactivated Application on the Engine. In the case of Apache Storm, this may be done by starting the Application's input "Spouts."

[0091] REBALANCE: This action may ask the Engine to redistribute all components of a currently running Application across available resources in the Engine cluster. If any components are listed within passed parameters, the action may change the parallelism of those components within the running Application as indicated in the parameters, and ask the Engine to redistribute all components across available resources in the Engine cluster. If specified (via parameter), changes to component parallelism may also be persisted in the application's metadata.
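The parameter handling of the REBALANCE action may be illustrated with the following Java sketch, which computes the parallelism in effect after a request and optionally persists the changes. The class name `Rebalancer`, the map-based representation of parallelism settings, and the `persist` flag are assumptions; the redistribution itself is performed by the Engine and is not modelled.

```java
import java.util.HashMap;
import java.util.Map;

class Rebalancer {
    // Returns the per-component parallelism in effect after the rebalance request.
    static Map<String, Integer> rebalance(Map<String, Integer> running,
                                          Map<String, Integer> requested,
                                          Map<String, Integer> storedMetadata,
                                          boolean persist) {
        Map<String, Integer> effective = new HashMap<>(running);
        for (Map.Entry<String, Integer> change : requested.entrySet()) {
            if (!running.containsKey(change.getKey())) {
                throw new IllegalArgumentException("unknown component: " + change.getKey());
            }
            if (change.getValue() < 1) {
                throw new IllegalArgumentException("parallelism must be at least 1");
            }
            effective.put(change.getKey(), change.getValue());
        }
        // All changes were accepted; persist them only if the caller asked for it.
        if (persist) {
            storedMetadata.putAll(requested);
        }
        // Redistribution across the cluster is the Engine's job and is not modelled here.
        return effective;
    }
}
```

An empty `requested` map corresponds to the simple case in which the Engine is only asked to redistribute components without changing parallelism.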

[0092] In some implementations, the Performance Monitor is an internal service that monitors metrics related to a particular computer system, execution Engine cluster, or running application. This enables other services to, for example, track trends of several metrics over time, pinpoint issues within an Engine cluster, and make recommendations on changes that would enhance the performance of the overall system. Where appropriate, the Performance Monitor may abstract certain metrics to be independent of the underlying Engine. The Performance Monitor may use the Engine Manager to obtain performance data for the designated Engine.

[0093] In some implementations, the Engine Manager is an internal service that provides Engine cluster management to provision, manage, and scale clusters on public and private clouds. Through the RESTful API and GUI, this service enables a user to choose a cluster template and provision hosts with the appropriate software necessary to begin running an Engine cluster. It may also provide the Performance Monitor with metrics on the health of the designated cluster and each of its nodes.

[0094] In some implementations, a Real-Time Processing Engine, also referred to as simply an "Engine," is a programming environment where real-time applications are deployed and executed. An example is Apache Storm, a distributed computation framework specifically designed for event stream processing. Although examples herein may describe a system that uses Apache Storm, the metadata-based representation of real-time event processing applications enables other implementations that make use of other Engines singly or together within a single system.

[0095] A user may create and edit an Application using the GUI to visually drag and drop components onto a virtual workspace canvas, and connect them together into a real-time application. Each Application may be a directed acyclic graph (DAG), meaning data may flow from one or more root nodes to one or more leaf nodes without forming a loop anywhere. Each node in the graph may represent an amount of work being performed on the data. In some cases, the GUI may prompt the user for any configuration data as needed by each component (or graph node).

[0096] In some implementations, to create a new Application, the user begins by clicking or selecting a Create New Application function in the GUI. The GUI may present a Designer page (e.g., design platform UI 200) where the user names the Application, selects Stream Source(s) (e.g., data source(s) 102) from the set of components available in the toolbox, and drags it onto the canvas (e.g., design pane 202). The user can then click the image now on the canvas, causing the GUI to present that particular Stream Source's configuration popup GUI element (e.g., dialog 208) for the user to fill in details to allow the Stream Source to connect and stream data into the Application.

[0097] The user may then repeat the drag and drop action to select a Parser for each Stream Source, the action to drag the Parser from the toolbox and place it on the canvas. A Parser is a type of component that parses the data coming from the Stream Source, such as a comma separated value data type, and enables the user to define the schema of that data.

[0098] In some implementations, event messages coming into the system do not have schema associated with them. For efficient real-time processing, traditional Engines may not provide or enforce schema within their programming and execution environments, thus shifting the burden onto the component developer to cast fields in each event message to the appropriate type within each component the event message passes through. This aspect may severely restrict real-time component reusability in traditional systems.

[0099] To increase component reusability, and preserve execution-time message efficiency and transparency, at least some implementations provide event-message schema at application design time. For each component, a user can define one or more schemas for streams emitted by that component within a specific application using the GUI elements exposed during component configuration. The schema may be represented using a Java bean to encapsulate and serialize aspects such as field name, data type, value, and so forth.

[00100] A schema may be associated with each stream emitted by a component, and may be available to the component(s) consuming that stream. A consuming-component configuration UI can use that schema, for example, to select fields by name for processing.

[00101] In some implementations, the Stream Source may be the only component type that is allowed to not provide schema for emitted stream(s), and the Parser may be the only component type that is allowed to consume streams with no schema.

[00102] Within a downstream component, the system may use the upstream bean definition to generate code that creates the correct Java bean object associated with the incoming stream before passing it to the remaining developer-provided component code for processing.
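As a non-limiting illustration of a bean-based schema, the following Java sketch defines a bean for a stream with two fields and a small binder standing in for the code the system might generate in a downstream component. The schema (`SensorReading` with `sensorId` and `temperature`), the binder class, and the positional list used to represent an incoming message are all hypothetical.

```java
import java.io.Serializable;
import java.util.List;

// Hypothetical design-time schema for a parsed stream with two fields.
class SensorReading implements Serializable {
    private static final long serialVersionUID = 1L;

    private String sensorId;
    private double temperature;

    public SensorReading() { }

    public String getSensorId() { return sensorId; }
    public void setSensorId(String sensorId) { this.sensorId = sensorId; }

    public double getTemperature() { return temperature; }
    public void setTemperature(double temperature) { this.temperature = temperature; }
}

// Stands in for generated code that builds the bean before developer code runs.
class SensorReadingBinder {
    // Casts each untyped field once, so component code can work with typed accessors.
    static SensorReading bind(List<Object> rawFields) {
        SensorReading bean = new SensorReading();
        bean.setSensorId((String) rawFields.get(0));
        bean.setTemperature(((Number) rawFields.get(1)).doubleValue());
        return bean;
    }
}
```

With the casts confined to the binder, the developer-provided component code can read `getTemperature()` without knowing the position or raw type of the field in the incoming message.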

[00103] The user may add one or more Stream Processors (e.g., filter 106, analytics component 108, or output component 110) that consume data emitted by the Parser. For each downstream Processor (e.g., processing component 112), the user may have the ability to select schema defined by the component whose stream the Processor is consuming. Each Processor can emit one or more streams, each with a different schema. Many different Processors may be added to the Application to perform diverse functions such as data filtering or transformation, applying business logic or predictive analytics, sending messages and alerts, etc.

[00104] The user may configure each component within the Application as prompted by its associated GUI element(s), providing information to fulfill the function(s) provided. For example, a particular Stream Source may use configuration data such as message-broker network name and/or address, and the topics to which the Stream Source is subscribing. This configuration data may be validated by the component via the Component Framework. If valid, the user data is written to the Metadata Store. Each component may also have operational and performance configuration metadata associated with it, such as how many instances of this component to run in parallel, in addition to parameters designated by the programmer that created the component.

[00105] Components may be interconnected graphically in the GUI, which is represented in the Application's metadata and translated into dynamically-generated message routing code for component(s) within the Application.
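The translation of connection metadata into message-routing code may be sketched as a small code generator. The emitted statements follow the shape of Apache Storm's `TopologyBuilder` API (`setBolt` and `shuffleGrouping`); the generator class, the `Connection` type, the bolt class names, and the choice of shuffle grouping for every connection are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RoutingCodeGenerator {
    // Hypothetical metadata record for one connection drawn in the GUI.
    static final class Connection {
        final String from;
        final String to;
        Connection(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    // Emits one Storm-style wiring statement per bolt, subscribing it to each upstream component.
    static List<String> generate(Map<String, String> boltClasses,
                                 Map<String, Integer> parallelism,
                                 List<Connection> connections) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> bolt : boltClasses.entrySet()) {
            String id = bolt.getKey();
            StringBuilder line = new StringBuilder();
            line.append("builder.setBolt(\"").append(id).append("\", new ")
                .append(bolt.getValue()).append("(), ")
                .append(parallelism.getOrDefault(id, 1)).append(")");
            for (Connection c : connections) {
                if (c.to.equals(id)) {
                    line.append(".shuffleGrouping(\"").append(c.from).append("\")");
                }
            }
            lines.add(line.append(";").toString());
        }
        return lines;
    }
}
```

For a parser with parallelism 2 fed by a component named `source`, this sketch emits `builder.setBolt("parser", new CsvParserBolt(), 2).shuffleGrouping("source");`, which the Compiler could place in the generated application source alongside the component code.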

[00106] When the user has completed adding all needed Stream Sources, Parsers and Processors, and connected them into an appropriate Application topology, the user may save the Application. The Application may have settings related to how it may run (e.g., on Storm) such as, but not limited to, how many Java Virtual Machines (JVMs) to assign for this application and which Engine cluster to deploy to. All configuration data associated with the Application and its Components may be saved to the Metadata Store.

[00107] A user may "validate" an Application before it is considered deployable by the system to an available Engine. The validation function can be available from multiple locations within the GUI navigation. In some cases, validation is performed by the Application Manager, which may also use the Generator to generate code as prescribed by the Application metadata. The Application Manager may also use the Compiler to compile the generated code along with code provided by the component developer. The Application Manager may return a set of errors to be corrected or a status indicating that the Application is "validated," which is then communicated to the user via the GUI. In some implementations, the system may prompt the user to correct any errors in the Application before the application is successfully validated and before deployment is enabled for that application.

[00108] If the validation is successful, then the user may request deployment of the Application. Invoking the Deploy function in the GUI may result in a RESTful API request. The API Controller may dispatch the request to the Application Manager, which calls the Generator and Compiler to build the Application, designating a temporary target location for the Compiler to place the compiled artifacts. Upon a successful return from the Compiler, the Application Manager may deploy the Application to the designated Engine cluster, returning appropriate success or error status. In some implementations, the Application Manager deletes the compiled artifacts from the temporary location after a successful deployment. Once the Application is deployed, the user can navigate to the Performance Management area of the GUI to monitor Application performance. The user can also use the GUI to deactivate, activate, rebalance, or undeploy the application.

[00109] FIG. 4 depicts a flow diagram of an example process according to implementations of the present disclosure. Operations of the example process of FIG. 4 may be performed by one or more of the components of the design platform 300 described with reference to FIG. 3.

[00110] The design platform may present (402) a UI such as the design platform UI 200. The UI may be a GUI, such as that described with reference to FIG. 3.

[00111] The design platform may receive (404) an indication, from the UI, of the user's specification of one or more components to be included in an application 100. Such components may include one or more of a data source 102, a parser 104, a filter 106, an analytics component 108, an output component 110, or other processing component(s) 112.

[00112] The design platform may receive (406) an indication, from the UI, of a user's specification of connection(s) 206 (e.g., pipes) to route data between pair(s) of the components.

[00113] Responsive to a save request entered through the UI, the design platform may generate and store (408) application metadata describing the component(s), connection(s), or user-provided code for the application 100, as well as any other information specified for the application 100.
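One possible shape for such application metadata is sketched below in Python. The class names, field names, and the use of JSON as the stored form are assumptions made for illustration; the disclosure does not prescribe a particular serialization.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Component:
    id: str
    kind: str          # e.g. "data_source", "parser", "filter", "analytics", "output"


@dataclass
class Connection:
    source: str        # id of the upstream component
    target: str        # id of the downstream component


@dataclass
class ApplicationMetadata:
    name: str
    components: List[Component] = field(default_factory=list)
    connections: List[Connection] = field(default_factory=list)
    user_code: Dict[str, str] = field(default_factory=dict)   # component id -> source


def save_application(app: ApplicationMetadata) -> str:
    """Serialize the metadata on a save request (hypothetical storage format)."""
    return json.dumps(asdict(app), indent=2, sort_keys=True)


if __name__ == "__main__":
    app = ApplicationMetadata(
        name="demo",
        components=[Component("src", "data_source"),
                    Component("parse", "parser"),
                    Component("out", "output")],
        connections=[Connection("src", "parse"), Connection("parse", "out")],
        user_code={"parse": "return record.split(',')"},
    )
    print(save_application(app))
```

Because the stored form describes only components, connections, and user-provided code, the same metadata could later be read back for validation, code generation, or reuse in another application.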

[00114] The design platform may validate (410) the application 100 as described above. Validation may include compilation or otherwise building executable code for the application, based on the various components, connections, or user-provided source code for the application 100. In some cases, the application is validated by checking component connection metadata, then generating and compiling source code for the application to generate executable code. The executable code may then be deployed (412) and executed (414) as described above.
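The check of component connection metadata that may precede code generation can be illustrated with the following Python sketch. The dictionary layout and the two rules applied (every connection endpoint must name a known component, and a component may not be connected to itself) are assumptions chosen for the example; an actual implementation may apply different or additional rules.

```python
from typing import List


def check_connections(metadata: dict) -> List[str]:
    """Return a list of problems found in the connection metadata (empty if none)."""
    known_ids = {c["id"] for c in metadata.get("components", [])}
    errors: List[str] = []
    for conn in metadata.get("connections", []):
        label = f"{conn['source']}->{conn['target']}"
        # Assumed rule 1: both ends of a pipe must refer to declared components.
        for end in ("source", "target"):
            if conn[end] not in known_ids:
                errors.append(f"connection {label}: unknown component '{conn[end]}'")
        # Assumed rule 2: a component may not be piped into itself.
        if conn["source"] == conn["target"]:
            errors.append(f"connection {label}: component connected to itself")
    return errors


if __name__ == "__main__":
    metadata = {
        "components": [{"id": "src"}, {"id": "filter"}, {"id": "out"}],
        "connections": [{"source": "src", "target": "filter"},
                        {"source": "filter", "target": "sink"}],
    }
    for problem in check_connections(metadata):
        print(problem)
```

Only when such a check reports no problems would the sketch proceed to generating and compiling source code, mirroring the order of operations described above.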

[00115] Implementations and all of the functional operations described in this specification may be realized in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations may be realized as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "computing system" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus may include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus.

[00116] A computer program (also known as a program, software, software application, script, or code) may be written in any appropriate form of programming language, including compiled or interpreted languages, and it may be deployed in any appropriate form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program may be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program may be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

[00117] The processes and logic flows described in this specification may be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows may also be performed by, and apparatus may also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

[00118] Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any appropriate kind of digital computer. Generally, a processor may receive instructions and data from a read only memory or a random access memory or both. Elements of a computer can include a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer may also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio player, or a Global Positioning System (GPS) receiver, to name just a few. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in, special purpose logic circuitry.

[00119] To provide for interaction with a user, implementations may be realized on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user may provide input to the computer. Other kinds of devices may be used to provide for interaction with a user as well; for example, feedback provided to the user may be any appropriate form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user may be received in any appropriate form, including acoustic, speech, or tactile input.

[00120] Implementations may be realized in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a web browser through which a user may interact with an implementation, or any appropriate combination of one or more such back end, middleware, or front end components. The components of the system may be interconnected by any appropriate form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), e.g., the Internet.

[0001] The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

[0002] While this specification contains many specifics, these should not be construed as limitations on the scope of the disclosure or of what may be claimed, but rather as descriptions of features specific to particular implementations. Certain features that are described in this specification in the context of separate implementations may also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation may also be implemented in multiple implementations separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination may in some cases be excised from the combination, and the claimed combination may be directed to a sub-combination or variation of a sub-combination.

[0003] Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the implementations described above should not be understood as requiring such separation in all implementations, and it should be understood that the described program components and systems may generally be integrated together in a single software product or packaged into multiple software products.

[0004] A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. For example, various forms of the flows shown above may be used, with steps re-ordered, added, or removed. Accordingly, other implementations are within the scope of the following claims.

Claims

What is claimed is:

1. A computer-implemented method for providing an application, the method comprising:

presenting a user interface (UI);

receiving, through the UI, an indication of components to include in the application;

receiving, through the UI, an indication of one or more connections between the components;

generating application metadata for the application, the application metadata describing the components and the one or more connections;

validating the application based on the application metadata, including compiling source code for the application to generate executable code;

deploying the executable code to an execution environment; and

executing the executable code in the execution environment.

2. The method of claim 1, wherein the components include one or more of:

a data source;

a parser;

a filter;

an analytics component; or

an output component.

3. The method of claim 1, wherein the components include at least one dynamic data source providing a dynamic data stream to be analyzed in real time by the application.