US20210255886A1 - Distributed model execution - Google Patents

Distributed model execution

Info

Publication number
US20210255886A1
Authority
US
United States
Prior art keywords
model
node
models
nodes
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/176,906
Inventor
Eugene Von Niederhausern
Sreenivasa Gorti
Kevin W. Divincenzo
Sridhar Sudarsan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SparkCognition Inc
Original Assignee
SparkCognition Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SparkCognition Inc
Priority to US17/176,906
Assigned to SparkCognition, Inc. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DIVINCENZO, KEVIN W., VON NIEDERHAUSERN, EUGENE, GORTI, SREENIVASA, SUDARSAN, SRIDHAR
Publication of US20210255886A1
Assigned to ORIX GROWTH CAPITAL, LLC SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: SparkCognition, Inc.

Classifications

    • G06F 9/45558: Hypervisor-specific management and integration aspects
    • G06F 11/3688: Test management for test execution, e.g. scheduling of test suites
    • G06F 11/3447: Performance evaluation by modeling
    • G06F 16/211: Schema design and management
    • G06F 16/90335: Query processing
    • G06F 8/35: Creation or generation of source code model driven
    • G06F 8/433: Dependency analysis; Data or control flow analysis
    • G06F 8/52: Binary to binary (transformation of program code)
    • G06F 8/60: Software deployment
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/5077: Logical partitioning of resources; Management or configuration of virtualized resources
    • G06F 2009/4557: Distribution of virtual machine instances; Migration and load balancing
    • G06N 20/00: Machine learning

Definitions

  • Machine learning models may be used to perform various data analysis applications.
  • a client or consumer may not have the hardware or software resources available on-premises to perform computationally intensive predictions or handle large amounts of data.
  • FIG. 1 is a block diagram of an example system for distributed model execution according to some embodiments.
  • FIG. 2 is a diagram of model dependencies for distributed model execution according to some embodiments.
  • FIG. 3 is a block diagram of an example execution environment for distributed model execution according to some embodiments.
  • FIG. 4 is a flowchart of an example method for distributed model execution according to some embodiments.
  • FIG. 5 is a flowchart of another example method for distributed model execution according to some embodiments.
  • FIG. 6 is a flowchart of another example method for distributed model execution according to some embodiments.
  • FIG. 7 is a flowchart of another example method for distributed model execution according to some embodiments.
  • Machine learning models may be used to perform various data analysis applications. For example, one or more machine learning models may be used to generate predictions or other analysis based on input data. Machine learning models may be logically integrated such that the output of some models are provided as input to other models, ultimately resulting in a model providing an output as the prediction.
  • a client or consumer may not have the hardware or software resources available on-premises to perform computationally intensive predictions or handle large amounts of data.
  • a client may provide the machine learning models used to generate a prediction to off-site or remote resources, such as remote data centers, cloud computing environments, and the like.
  • These remote resources may have access to hardware such as Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), or other devices that the models may leverage to accelerate their performance.
  • the resulting prediction or output may then be provided back to a client.
  • the execution of a given model may be performed by a node.
  • a node may include a computing device, a virtual machine, or other device as can be appreciated.
  • the models may be deployed for execution to a given node based on various criteria, including the hardware and software resources available to a node, the type of data or calculations used by the model, authorization requirements, model dependencies, and the like. Once deployed, the models may be used for distributed processing of data in order to generate a prediction for a client.
  • FIG. 1 is a block diagram of a non-limiting example system for distributed model execution.
  • the example system includes a model execution environment 106 .
  • the model execution environment 106 includes a plurality of nodes 108 a - n.
  • Each node 108 a - n is an allocation of hardware and software resources, including storage resources (e.g., storage devices, memory, and the like), processing resources (e.g., processors, hardware accelerators such as GPUs, FPGAs, and the like), software resources (e.g., operating systems, software applications, and the like), and other resources as can be appreciated to facilitate distributed model execution.
  • Each node 108 a - n may include one or more computing devices, one or more virtual machines, or other allocations of resources as can be appreciated.
  • Each node 108 a - n may be communicatively coupled to another node 108 a - n using various communications resources, including buses, wired or wireless networks, and the like.
  • the system of FIG. 1 also includes a management node 102 .
  • the management node 102 is similar to the nodes 108 a - n in that the management node 102 may include a computing device, virtual machine, and the like.
  • the management node 102 is communicatively coupled to the model execution environment 106 .
  • Although the management node 102 is shown as separate from the model execution environment 106 , it is understood that the management node 102 may be located remote from or proximate to the model execution environment 106 .
  • the management node 102 and the model execution environment 106 may be implemented in the same or separate data centers, cloud computing environments, and the like.
  • the client device 112 provides, to the model execution environment 106 , a plurality of models 110 a - n for execution in the plurality of nodes 108 a - n.
  • Although FIG. 1 shows each model 110 a - n allocated to and executed in a respective node 108 a - n, it is understood that other configurations and allocations of nodes 108 a - n are possible.
  • a node 108 a - n may be allocated execution of multiple models 110 a - n.
  • multiple nodes 108 a - n may operate in parallel to facilitate the execution of a single model 110 a - n.
  • executing a model at multiple nodes includes assigning different portions of an input data set to the different nodes, where each node executes an entirety of model operations with respect to their respective assigned input data.
  • executing a model at multiple nodes includes executing a first portion of a model (e.g., operations corresponding to one or more first neural network layers) at a first node and second portion of the model (e.g., operations corresponding to one or more second neural network layers) at a second node, where “intermediate” output from the first node is provided as input to the second node.
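  • As a purely illustrative, non-limiting sketch (not code from the patent; the function names are hypothetical and the arithmetic stands in for real model operations), the two multi-node execution styles described above can be expressed as splitting the input data across nodes versus splitting the model's layers across nodes:

      # Hypothetical sketch: two ways to execute one model across multiple nodes.

      def run_full_model(batch):
          # Stand-in for executing the entire model on one node's assigned portion of the input.
          return [x * 2 + 1 for x in batch]

      def data_parallel(inputs, num_nodes):
          # Style 1: each node executes the entirety of the model operations
          # with respect to its respective assigned portion of the input data.
          shards = [inputs[i::num_nodes] for i in range(num_nodes)]
          return [run_full_model(shard) for shard in shards]   # one result list per node

      def first_layers(batch):
          # Style 2, first node: operations corresponding to the first neural network layers.
          return [x * 2 for x in batch]                        # "intermediate" output

      def second_layers(intermediate):
          # Style 2, second node: operations corresponding to the remaining layers.
          return [x + 1 for x in intermediate]

      def pipeline_split(inputs):
          # The intermediate output of the first node is provided as input to the second node.
          return second_layers(first_layers(inputs))

      print(data_parallel([1, 2, 3, 4], num_nodes=2))   # [[3, 7], [5, 9]]
      print(pipeline_split([1, 2, 3, 4]))               # [3, 5, 7, 9]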
  • the plurality of models 110 a - n may include machine learning models (e.g., trained machine learning models such as neural networks), algorithmic models, and the like each configured to provide some output based on some input data.
  • the plurality of models 110 a - n are configured to generate a prediction based on input to one or more of the models 110 a - n. Such predictions may include, for example, classifications for a classification problem, a numerical value for a regression problem, and the like.
  • the plurality of models 110 a - n may also output one or more confidence values associated with the prediction. Accordingly, each model 110 a - n is configured to receive, as input, data output by another model 110 a - n, provide output as input data to another model 110 a - n, or both.
  • FIG. 2 shows an exemplary arrangement of models and their respective dependencies.
  • a model 204 a receives, as input data, input 202 a.
  • Input 202 a may include stored data, data from a data stream, or data from another data source as can be appreciated.
  • Model 204 a provides output to models 204 b and 204 c.
  • Model 204 c receives input from models 204 a and 204 b.
  • Model 204 d receives input from model 204 c and input 202 b.
  • Model 204 d provides, as output data, output 206 .
  • inputs 202 a, b are provided from one or more data sources to models 204 a and 204 d, respectively.
  • Data processing is performed through the various model dependencies in order to ultimately generate output 206 .
  • the output 206 may include a prediction based on the inputs 202 a, b.
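  • Purely as an illustrative sketch (not part of the patent disclosure; the names MODEL_INPUTS, MODEL_FUNCS, and run_pipeline are hypothetical, and the lambdas stand in for real model inference), the FIG. 2 arrangement can be represented as a dependency graph that is evaluated once each model's inputs are available:

      # Hypothetical dependencies of FIG. 2: 202a -> 204a -> {204b, 204c}; 204b -> 204c;
      # {204c, 202b} -> 204d -> output 206.
      MODEL_INPUTS = {
          "204a": ["202a"],
          "204b": ["204a"],
          "204c": ["204a", "204b"],
          "204d": ["204c", "202b"],
      }

      MODEL_FUNCS = {
          "204a": lambda xs: sum(xs),          # stand-ins for real model inference
          "204b": lambda xs: xs[0] * 2,
          "204c": lambda xs: xs[0] + xs[1],
          "204d": lambda xs: max(xs),
      }

      def run_pipeline(external_inputs):
          # Evaluate each model once all of its inputs (data sources or other models) are ready.
          values = dict(external_inputs)       # e.g., {"202a": 3.0, "202b": 7.0}
          pending = set(MODEL_INPUTS)
          while pending:
              ready = [m for m in pending if all(dep in values for dep in MODEL_INPUTS[m])]
              if not ready:
                  raise ValueError("cyclic or unsatisfiable model dependencies")
              for m in ready:
                  values[m] = MODEL_FUNCS[m]([values[d] for d in MODEL_INPUTS[m]])
                  pending.remove(m)
          return values["204d"]                # corresponds to output 206

      print(run_pipeline({"202a": 3.0, "202b": 7.0}))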
  • the management node 102 executes a management module 104 for distributed model execution.
  • the management module 104 identifies, for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n. For example, assume that the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a - n to the model execution environment 106 . The request may include the plurality of models 110 a - n.
  • the request may also include identifiers, network addresses, or other data facilitating access to the models 110 a - n. For example, after uploading the plurality of models 110 a - n to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a - n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a - n to which a model 110 a - n will be deployed for execution.
  • the management module 104 identifies the nodes 108 a - n for each model 110 a - n based on one or more execution constraints.
  • the one or more execution constraints for a given model 110 a - n are requirements to be satisfied by a given node 108 a - n in order to execute the given model 110 a - n.
  • the one or more execution constraints may include required constraints, where a node 108 a - n must satisfy a particular constraint for a given model 110 a - n to be deployed there.
  • the one or more execution constraints may also include preferential constraints, where a node 108 a - n is more preferentially selected for deployment of a given model 110 a - n if the constraint is satisfied.
  • the one or more execution constraints may include one or more model dependencies.
  • the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input.
  • the model 204 c is dependent on models 204 a and 204 b.
  • a node 108 a - n selected for deploying the model 204 b must have a communications pathway to (or be the same node as) nodes 108 a - n to which the models 204 a and 204 c are deployed.
  • the nodes 108 a - n are selected to reduce or minimize latency between nodes 108 a - n having interdependent models 110 a - n.
  • the one or more execution constraints may also include one or more encryption constraints.
  • the one or more encryption constraints may indicate data input to or received from a given model 110 a - n must be encrypted if transferred over a network.
  • the one or more encryption constraints may also indicate that data input to or received from a given model 110 a - n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a - n are executed in a same node 108 a - n, or executed within different virtual machine nodes 108 a - n implemented in a same hardware environment).
  • the one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like).
  • a node 108 a - n may be selected based on an encryption constraint by selecting a node 108 a - n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints.
  • a model 110 a - n whose output must be encrypted may be preferentially deployed to a node 108 a - n having greater hardware or processing resources, while a model 110 a - n that needs to neither encrypt output nor decrypt input may be preferentially deployed to a node 108 a - n having lesser hardware or processing resources.
  • a model 110 a - n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a - n having even greater hardware or processing resources.
  • the one or more execution constraints may also include one or more authorization constraints.
  • An authorization constraint is a restriction on which entities have access to data input to a model 110 a - n, output by a model 110 a - n, generated by the model 110 a - n (e.g., intermediary data or calculations), and the like.
  • an authorization constraint may indicate that a model 110 a - n should be executed on a private node 108 a - n (e.g., a node 108 a - n not shared by or accessible to another tenant or client of the model execution environment 106 ).
  • an authorization constraint may define access privileges for those users or other entities that may access the node 108 a - n executing a given model 110 a - n.
  • an authorization constraint may indicate that the input to or output from a given model 110 a - n should be transferred only over a private network. Accordingly, the node 108 a - n for the given model 110 a - n should be selected as having access to a private network connection to nodes 108 a - n executing its dependent models 110 a - n.
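  • The following is a hypothetical, non-limiting sketch (not a data structure defined by the patent; every field name is illustrative) of how the execution constraints discussed above, including model dependencies, encryption constraints, and authorization constraints, might be declared for a single model, with some constraints marked as required and the rest treated as preferential:

      from dataclasses import dataclass, field
      from typing import List, Optional

      @dataclass
      class ExecutionConstraints:
          # Model dependencies: models whose output this model consumes, and models that consume its output.
          depends_on: List[str] = field(default_factory=list)
          feeds_into: List[str] = field(default_factory=list)
          # Encryption constraints.
          encrypt_output: bool = False
          decrypt_input: bool = False
          encryption_algorithm: Optional[str] = None    # e.g., a symmetric or asymmetric algorithm
          # Authorization constraints.
          private_node_required: bool = False           # node must not be shared with other tenants
          private_network_required: bool = False        # inputs/outputs transferred only over a private network
          # Names of the fields above that are hard requirements; the rest are preferential.
          required_fields: List[str] = field(default_factory=list)

      # Example: a model that consumes the output of "model_204a", must encrypt its own output,
      # and must run on a private node.
      constraints = ExecutionConstraints(
          depends_on=["model_204a"],
          encrypt_output=True,
          encryption_algorithm="AES-256-GCM",
          private_node_required=True,
          required_fields=["depends_on", "private_node_required"],
      )
      print(constraints)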
  • the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more node characteristics.
  • Node characteristics for a given node 108 a - n may include hardware resources for the node 108 a - n.
  • Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like.
  • Software resources may include particular operating systems, software libraries, applications, and the like. For example, a model 110 a - n processing highly-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a - n having more RAM than another node 108 a - n.
  • a model 110 a - n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a - n having the requisite libraries for performing the algorithm installed.
  • the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more model characteristics.
  • the model characteristics for a given model 110 a - n describe the data acted upon and the calculations performed by the model 110 a - n.
  • model characteristics may include a data type for data input to the model 110 a - n.
  • a data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like).
  • a data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, labeled or unlabeled data, time series data, and the like).
  • Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like).
  • models 110 a - n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a - n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations.
  • a neural network model may be deployed to different node(s) based on architectural parameters, such as whether the neural network is a feed-forward network or a recurrent network, whether the neural network exhibits “memory” (e.g., via long short-term memory (LSTM) architecture), etc.
  • Identifying, for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n may include calculating, for each model 110 a - n of the plurality of models 110 a - n, a plurality of fitness scores for each of the plurality of nodes 108 a - n.
  • a given model 110 a - n has a fitness score calculated for each of the nodes 108 a - n indicating a fitness of that node 108 a - n for the given model 110 a - n.
  • Each fitness score for a given model may be calculated based on a degree to which the node 108 a - n satisfies the execution constraints for the model 110 a - n.
  • a node 108 a - n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a - n.
  • For example, assume that a first model 110 a - n is dependent on a second model 110 a - n (e.g., for input by virtue of receiving input from the second model 110 a - n, or output by virtue of providing output to the second model 110 a - n ), and that the first model 110 a - n is selected for deployment to a first node 108 a - n.
  • a second node 108 a - n and a third node 108 a - n are both communicatively coupled to the first node 108 a - n, with the second node 108 a - n having a lower latency connection to the first node 108 a - n compared to a connection from the third node 108 a - n to the first node 108 a - n.
  • the second model 110 a - n would have a higher fitness score for the second node 108 a - n than the third node 108 a - n by virtue of the lower latency connection to the first node 108 a - n to which the dependent first model 110 a - n is to be deployed.
  • a node 108 a - n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a - n must be executed on a private node 108 a - n. Any nodes 108 a - n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
  • the fitness score may also be calculated based on the node characteristics of each node 108 a - n or the model characteristics of the model 110 a - n. For example, a model 110 a - n performing calculations on highly-dimensional data may assign a higher fitness score to nodes 108 a - n with greater RAM. As another example, nodes 108 a - n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a - n acting on low-dimensional data or performing more simple arithmetic calculations as such hardware resources would not provide a meaningful benefit when compared to other models 110 a - n.
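  • A minimal, illustrative sketch of such a fitness score follows (the weights, attribute names, and scoring rules are invented for illustration; the patent does not specify a particular formula). A node failing a required constraint scores zero, while node characteristics such as RAM, hardware accelerators, and low latency to already-placed dependent models raise the score:

      def fitness_score(model, node, placements):
          # Score how fit `node` is to execute `model`; 0.0 disqualifies the node.
          # Required constraint: a private node, if the model demands one.
          if model.get("private_node_required") and node.get("shared_with_other_tenants"):
              return 0.0
          score = 1.0
          # Preferential: more RAM helps models that process high-dimensional or bulky data.
          if model.get("high_dimensional_data"):
              score += 0.001 * node.get("ram_gb", 0)
          # Preferential: hardware accelerators help models performing complex matrix operations.
          if model.get("matrix_heavy") and node.get("has_gpu"):
              score += 2.0
          # Preferential: low-latency connections to nodes hosting models this model depends on.
          for dep in model.get("depends_on", []):
              dep_node = placements.get(dep)
              if dep_node is not None:
                  latency_ms = node.get("latency_ms", {}).get(dep_node, 100.0)
                  score += 1.0 / (1.0 + latency_ms)
          return score

      node = {"ram_gb": 128, "has_gpu": True, "latency_ms": {"node-1": 2.0}}
      model = {"matrix_heavy": True, "depends_on": ["model_a"]}
      print(fitness_score(model, node, placements={"model_a": "node-1"}))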
  • the management module 104 may then select, for each model 110 a - n, based on the plurality of fitness scores, the corresponding node 108 a - n (e.g., the node 108 a - n to which a given model 110 a - n will be deployed).
  • selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes selecting, for each model 110 a - n a highest scoring node 108 a - n.
  • a node 108 a - n may be selected for each model 110 a - n by traversing a listing or ordering of models 110 a - n and selecting a node 108 a - n for a currently selected model 110 a - n.
  • the fitness scores for the given node 108 a - n may be recalculated for each model 110 a - n not having an assigned node 108 a - n.
  • a node 108 a - n already having an assigned model 110 a - n may still be an optimal selection for deploying another model 110 a - n.
  • selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes generating multiple combinations or permutations of assigning models 110 a - n for deployment to nodes 108 a - n and calculating a best fit assignment for all of the plurality of models 110 a - n (e.g., an assignment with a highest total fitness score across all models 110 a - n ).
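  • Continuing the sketch (again hypothetical, not the patent's algorithm), the two selection strategies described above, picking the highest-scoring node per model versus searching assignments for the highest total fitness, might look as follows, with a precomputed score table standing in for the fitness scores:

      from itertools import product

      # scores[model][node]: fitness of each node for each model (illustrative values).
      scores = {
          "model_a": {"node_1": 3.0, "node_2": 1.0},
          "model_b": {"node_1": 2.5, "node_2": 2.0},
      }

      def greedy_assignment(scores):
          # Traverse the models in order and select the highest-scoring node for each.
          return {model: max(nodes, key=nodes.get) for model, nodes in scores.items()}

      def best_fit_assignment(scores):
          # Generate combinations of model-to-node assignments and keep the assignment
          # with the highest total fitness score across all models.
          models = list(scores)
          node_choices = [list(scores[m]) for m in models]
          best, best_total = None, float("-inf")
          for combo in product(*node_choices):
              total = sum(scores[m][n] for m, n in zip(models, combo))
              if total > best_total:
                  best, best_total = dict(zip(models, combo)), total
          return best

      print(greedy_assignment(scores))
      print(best_fit_assignment(scores))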
  • the management module 104 then deploys each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
  • Deploying each model 110 a - n may include sending one or more of the models 110 a - n to their respective assigned node 108 a - n.
  • Deploying each model 110 a - n may also include causing a node 108 a - n to acquire or load its assigned model 110 a - n.
  • the management module 104 may issue a command for a given node 108 a - n to load its assigned model 110 a - n from a local or remote storage location.
  • Deploying each model 110 a - n may also include configuring one or more models 110 a - n to receive input from one or more data sources (e.g., data sources other than another model 110 a - n ).
  • the management module 104 may provide a node 108 a - n of a given model 110 a - n network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a - n.
  • the management module 104 may provide a node 108 a - n a URL or IP address for a data stream of data to be provided as input to the given model 110 a - n.
  • the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a - n.
  • the management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
  • Deploying each model 110 a - n may also include configuring one or more models 110 a - n to provide, as output, a prediction generated by the plurality of models 110 a - n.
  • the management module 104 may indicate a storage location or file path for output data.
  • the management module 104 may further provide an indication of the storage location of the output data to the client device 112 .
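  • As a hedged illustration only (the patent does not prescribe a deployment API, and every endpoint, path, and field name below is made up), deploying a model to its selected node could amount to sending the node a descriptor identifying the model, its input data sources and access credentials, and the location for its output:

      import json

      def deploy_model(node_address, descriptor):
          # Stand-in for whatever transport the management node actually uses
          # (e.g., an HTTP request or a message queue); here the descriptor is just printed.
          print(f"would send to {node_address}:")
          print(json.dumps(descriptor, indent=2))

      descriptor = {
          "model_location": "s3://example-bucket/models/model_b.bin",     # where the node loads the model from
          "input_sources": [
              {"type": "stream",
               "url": "https://data.example.com/streams/sensor-1",
               "credentials_ref": "vault://example/sensor-1"},            # credentials provided by reference
          ],
          "output_location": "s3://example-bucket/predictions/model_b/",  # where prediction output is written
      }

      deploy_model("10.0.0.12:8080", descriptor)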
  • deploying each model 110 a - n includes configuring each node 108 a - n to communicate with at least one other node 108 a - n of the plurality of nodes 108 a - n.
  • the interdependent models 110 a - n may communicate with each other via the configured nodes 108 a - n.
  • the management module 104 may facilitate the exchange of encryption keys between nodes 108 a - n executing dependent models 110 a - n requiring encryption.
  • the management module 104 may provide, to nodes 108 a - n executing models 110 a - n having dependent models 110 a - n, the URLs, IP addresses, or other identifiers of the nodes 108 a - n executing their respective dependent models 110 a - n.
  • the management module 104 may allocate or generate communications pathways between nodes 108 a - n via network communications fabrics of the model execution environment 106 .
  • the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a - n.
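  • The sketch below (hypothetical; the patent does not define this data structure) shows one way the management module could tell each node the endpoint of the node running each dependent model and supply a shared key for links whose traffic must be encrypted:

      import secrets

      # Model dependency edges (producer -> consumer) and where each model was deployed.
      edges = [("model_a", "model_b"), ("model_b", "model_c")]
      placement = {"model_a": "10.0.0.11:9000", "model_b": "10.0.0.12:9000", "model_c": "10.0.0.13:9000"}
      encrypted_links = {("model_a", "model_b")}    # links whose traffic must be encrypted

      def link_configs(edges, placement, encrypted_links):
          configs = {}
          for producer, consumer in edges:
              cfg = {
                  "send_output_to": placement[consumer],   # endpoint of the node running the consumer model
                  "encrypt": (producer, consumer) in encrypted_links,
              }
              if cfg["encrypt"]:
                  # Illustrative shared key to be distributed to both ends of the link.
                  cfg["shared_key"] = secrets.token_hex(16)
              configs.setdefault(placement[producer], []).append(cfg)
          return configs

      for node, cfgs in link_configs(edges, placement, encrypted_links).items():
          print(node, cfgs)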
  • a prediction may then be generated by the deployed models 110 a - n. For example, input data may be provided to one or more of the models 110 a - n. A prediction may then be generated as an output of a model 110 a - n by virtue of the distributed and interdependent execution of the models 110 a - n in the nodes 108 a - n. Data indicating the prediction may then be provided or made accessible to the client device 112 .
  • By deploying the models 110 a - n to the nodes 108 a - n of the model execution environment 106 as described above, the management module 104 ensures a usable configuration of models 110 a - n as deployed to nodes 108 a - n. Moreover, the management module 104 ensures that the model 110 a - n deployment preserves the hierarchy of dependencies of models 110 a - n, as well as the encryption and authorization requirements of the models 110 a - n.
  • FIG. 3 sets forth a diagram of an execution environment 300 in accordance with some embodiments of the present disclosure.
  • the execution environment 300 depicted in FIG. 3 may be embodied in a variety of different ways.
  • the execution environment 300 may be provided, for example, by one or more cloud computing providers such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and others, including combinations thereof.
  • the execution environment 300 may be embodied as a collection of devices (e.g., servers, storage devices, networking devices) and software resources that are included in a private data center.
  • the execution environment 300 may be embodied as a combination of cloud resources and private resources that collectively form a hybrid cloud computing environment.
  • the execution environment 300 depicted in FIG. 3 may include storage resources 302 , which may be embodied in many forms.
  • the storage resources 302 may include flash memory, hard disk drives, nano-RAM, non-volatile memory (NVM), 3D crosspoint non-volatile memory, magnetic random access memory (MRAM), non-volatile phase-change memory (PCM), storage class memory (SCM), or many others, including combinations of the storage technologies described above.
  • the storage resources 302 may also include dynamic random access memory (DRAM), static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), universal memory, and many others.
  • the storage resources 302 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud storage resources such as Amazon Elastic Block Storage (EBS) block storage, Amazon S3 object storage, Amazon Elastic File System (EFS) file storage, Azure Blob Storage, and many others.
  • the example execution environment 300 depicted in FIG. 3 may implement a variety of storage architectures, such as block storage where data is stored in blocks, and each block essentially acts as an individual hard drive, object storage where data is managed as objects, or file storage in which data is stored in a hierarchical structure. Such data may be saved in files and folders, and presented to both the system storing it and the system retrieving it in the same format.
  • the execution environment 300 depicted in FIG. 3 also includes communications resources 304 that may be useful in facilitating data communications between components within the execution environment 300 , as well as data communications between the execution environment 300 and computing devices that are outside of the execution environment 300 .
  • Such communications resources may be embodied, for example, as one or more routers, network switches, communications adapters, and many others, including combinations of such devices.
  • the communications resources 304 may be configured to utilize a variety of different protocols and data communication fabrics to facilitate data communications.
  • the communications resources 304 may utilize Internet Protocol (IP) based technologies, fibre channel (FC) technologies, FC over ethernet (FCoE) technologies, InfiniBand (IB) technologies, NVM Express (NVMe) technologies and NVMe over fabrics (NVMeoF) technologies, and many others.
  • the communications resources 304 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as networking tools and resources that enable secure connections to the cloud as well as tools and resources (e.g., network interfaces, routing tables, gateways) to configure networking resources in a virtual private cloud.
  • the execution environment 300 depicted in FIG. 3 also includes processing resources 306 that may be useful in executing computer program instructions and performing other computational tasks within the execution environment 300 .
  • the processing resources 306 may include one or more application-specific integrated circuits (ASICs) that are customized for some particular purpose, one or more central processing units (CPUs), one or more digital signal processors (DSPs), one or more field-programmable gate arrays (FPGAs), one or more systems on a chip (SoCs), or other form of processing resources 306 .
  • the processing resources 306 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud computing resources such as one or more Amazon Elastic Compute Cloud (EC2) instances, event-driven compute resources such as AWS Lambdas, Azure Virtual Machines, or many others.
  • the execution environment 300 depicted in FIG. 3 also includes software resources 308 that, when executed by processing resources 306 within the execution environment 300 , may perform various tasks.
  • the software resources 308 may include, for example, one or more modules of computer program instructions that when executed by processing resources 306 within the execution environment 300 are useful for distributed model execution.
  • the software resources may include one or more models 310 (e.g., models 110 a - n as executed in nodes 108 a - n of FIG. 1 ).
  • the software resources may also include a management module 312 (e.g., a management module 104 as described in FIG. 1 ). Accordingly, the execution environment 300 may include one or more of a management node 102 or a management execution environment 106 as described in FIG. 1 .
  • FIG. 4 sets forth a flow chart illustrating an example method for distributed model execution that includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n.
  • the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a - n to the model execution environment 106 .
  • the request may include the plurality of models 110 a - n.
  • the request may also include identifiers, network addresses, or other data facilitating access to the models 110 a - n. For example, after uploading the plurality of models 110 a - n to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a - n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a - n to which a model 110 a - n will be deployed for execution.
  • the nodes 108 a - n for each model 110 a - n are identified based on one or more execution constraints.
  • the one or more execution constraints for a given model 110 a - n are requirements to be satisfied by a given node 108 a - n in order to execute the given model 110 a - n.
  • the one or more execution constraints may include required constraints, where a node 108 a - n must satisfy a particular constraint for a given model 110 a - n to be deployed there.
  • the one or more execution constraints may also include preferential constraints, where a node 108 a - n is more preferentially selected for deployment of a given model 110 a - n if the constraint is satisfied.
  • the one or more execution constraints may include one or more model dependencies.
  • the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input.
  • the model 204 c is dependent on models 204 a and 204 b.
  • a node 108 a - n selected for deploying the model 204 b must have a communications pathway to nodes 108 a - n to which the models 204 a and 204 c are deployed.
  • the nodes 108 a - n are selected to reduce or minimize latency between nodes 108 a - n having interdependent models 110 a - n.
  • the one or more execution constraints may also include one or more encryption constraints.
  • the one or more encryption constraints may indicate data input to or received from a given model 110 a - n must be encrypted if transferred over a network.
  • the one or more encryption constraints may also indicate that data input to or received from a given model 110 a - n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a - n are executed in a same node 108 a - n, or executed within different virtual machine nodes 108 a - n implemented in a same hardware environment).
  • the one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like).
  • Accordingly, a node 108 a - n may be selected based on an encryption constraint by selecting a node 108 a - n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints. For example, a model 110 a - n whose output must be encrypted may be preferentially deployed to a node 108 a - n having greater hardware or processing resources, while a model 110 a - n that needs to neither encrypt output nor decrypt input may be preferentially deployed to a node 108 a - n having lesser hardware or processing resources. As a further example, a model 110 a - n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a - n having even greater hardware or processing resources.
  • the one or more execution constraints may also include one or more authorization constraints.
  • An authorization constraint is a restriction on which entities have access to data input to a model 110 a - n, output by a model 110 a - n, generated by the model 110 a - n (e.g., intermediary data or calculations), and the like.
  • an authorization constraint may indicate that a model 110 a - n should be executed on a private node 108 a - n (e.g., a node 108 a - n not shared by or accessible to another tenant or client of the model execution environment 106 ).
  • an authorization constraint may define access privileges for those users or other entities that may access the node 108 a - n executing a given model 110 a - n.
  • an authorization constraint may indicate that the input to or output from a given model 110 a - n should be transferred only over a private network. Accordingly, the node 108 a - n for the given model 110 a - n should be selected as having access to a private network connection to nodes 108 a - n executing its dependent models 110 a - n.
  • the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more node characteristics.
  • Node characteristics for a given node 108 a - n may include hardware resources for the node 108 a - n.
  • Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like.
  • Software resources may include particular operating systems, software libraries, applications, and the like. For example, a model 110 a - n processing highly-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a - n having more RAM than another node 108 a - n.
  • a model 110 a - n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a - n having the requisite libraries for performing the algorithm installed.
  • the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more model characteristics.
  • the model characteristics for a given model 110 a - n describe the data acted upon and the calculations performed by the model 110 a - n.
  • model characteristics may include a data type for data input to the model 110 a - n.
  • a data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like).
  • a data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, and the like).
  • Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like).
  • models 110 a - n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a - n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations.
  • the method of FIG. 4 also includes deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
  • Deploying 404 each model 110 a - n may include sending one or more of the models 110 a - n to their respective assigned node 108 a - n.
  • Deploying 404 each model 110 a - n may also include causing a node 108 a - n to acquire or load its assigned model 110 a - n.
  • the management module 104 may issue a command for a given node 108 a - n to load its assigned model 110 a - n from a local or remote storage location.
  • Deploying 404 each model 110 a - n may also include configuring one or more models 110 a - n to receive input from one or more data sources (e.g., data sources other than another model 110 a - n ).
  • the management module 104 may provide a node 108 a - n of a given model 110 a - n network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a - n.
  • the management module 104 may provide a node 108 a - n a URL or IP address for a data stream of data to be provided as input to the given model 110 a - n.
  • the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a - n.
  • the management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
  • Deploying 404 each model 110 a - n may also include configuring one or more models 110 a - n to provide, as output, a prediction generated by the plurality of models 110 a - n.
  • the management module 104 may indicate a storage location or file path for output data.
  • the management module 104 may further provide an indication of the storage location of the output data to the client device 112 .
  • models 110 a - n may be redistributed or redeployed according to various circumstances.
  • Such circumstances may include, for example, a user request, a predefined interval passing, an addition or removal of a node 108 a - n or model 110 a - n, a change in available computational resources in nodes 108 a - n, and the like.
  • FIG. 5 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure.
  • the method of FIG. 5 is similar to that of FIG. 4 in that the method of FIG. 5 also includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n; and deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
  • the method of FIG. 5 differs from the method of FIG. 4 in that identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n includes calculating 502 , for each model 110 a - n of the plurality of models 110 a - n, a plurality of fitness scores for each of the plurality of nodes 108 a - n.
  • a given model 110 a - n has a fitness score calculated for each of the nodes 108 a - n indicating a fitness of that node 108 a - n for the given model 110 a - n.
  • Each fitness score for a given model may be calculated based on a degree to which the node 108 a - n satisfies the execution constraints for the model 110 a - n.
  • a node 108 a - n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a - n. For example, assume that a first model 110 a - n is dependent on a second model 110 a - n (e.g., for input or output), and that the first model 110 a - n is selected for deployment to a first node 108 a - n.
  • a second node 108 a - n and a third node 108 a - n are both communicatively coupled to the first node 108 a - n, with the second node 108 a - n having a lower latency connection to the first node 108 a - n compared to a connection from the third node 108 a - n to the first node 108 a - n.
  • the second model 110 a - n would have a higher fitness score for the second node 108 a - n than the third node 108 a - n by virtue of the lower latency connection to the first node 108 a - n to which the dependent first model 110 a - n is to be deployed.
  • a node 108 a - n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a - n must be executed on a private node 108 a - n. Any nodes 108 a - n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
  • the fitness score may also be calculated based on the node characteristics of each node 108 a - n or the model characteristics of the model 110 a - n. For example, a model 110 a - n performing calculations on highly-dimensional data may assign a higher fitness score to nodes 108 a - n with greater RAM. As another example, nodes 108 a - n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a - n acting on low-dimensional data or performing more simple arithmetic calculations as such hardware resources would not provide a meaningful benefit when compared to other models 110 a - n.
  • Identifying 402 for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n also includes selecting 504 , for each model 110 a - n, based on the plurality of fitness scores, the corresponding node 108 a - n (e.g., the node 108 a - n to which a given model 110 a - n will be deployed).
  • selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes selecting, for each model 110 a - n a highest scoring node 108 a - n.
  • a node 108 a - n may be selected for each model 110 a - n by traversing a listing or ordering of models 110 a - n and selecting a node 108 a - n for a currently selected model 110 a - n.
  • the fitness scores for the given node 108 a - n may be recalculated for each model 110 a - n not having an assigned node 108 a - n. Accordingly, in some embodiments, a node 108 a - n already having an assigned model 110 a - n may still be an optimal selection for deploying another model 110 a - n.
  • selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes generating multiple combinations or permutations of assigning models 110 a - n for deployment to nodes 108 a - n and calculating a best fit assignment for all of the plurality of models 110 a - n (e.g., an assignment with a highest total fitness score across all models 110 a - n ).
  • FIG. 6 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure.
  • the method of FIG. 6 is similar to that of FIG. 4 in that the method of FIG. 6 also includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n; and deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
  • the method of FIG. 6 differs from the method of FIG. 4 in that deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n includes configuring 602 each node 108 a - n to communicate with at least one other node 108 a - n of the plurality of nodes 108 a - n.
  • the interdependent models 110 a - n may communicate with each other via the configured nodes 108 a - n.
  • the management module 104 may facilitate the exchange of encryption keys between nodes 108 a - n executing dependent models 110 a - n requiring encryption.
  • the management module 104 may provide, to nodes 108 a - n executing models 110 a - n having dependent models 110 a - n, the URLs, IP addresses, or other identifiers of the nodes 108 a - n executing their respective dependent models 110 a - n.
  • the management module 104 may allocate or generate communications pathways between nodes 108 a - n via network communications fabrics of the model execution environment 106 .
  • the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a - n.
  • FIG. 7 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure.
  • the method of FIG. 7 is similar to that of FIG. 4 in that the method of FIG. 7 also includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n; and deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
  • the method of FIG. 7 differs from the method of FIG. 4 in that the method of FIG. 7 includes generating 702 a prediction based on a distributed execution of the plurality of models 110 a - n.
  • the execution of the plurality of models 110 a - n is considered a distributed execution in that the models 110 a - n are executed across a plurality of distributed nodes 108 a - n.
  • the plurality of models 110 a - n are executed interdependently in that each node 108 a - n provides output to or receives input from at least one other node 108 a - n.
  • the prediction may be generated based on input data provided to one or more of the plurality of models 110 a - n.
  • the prediction may be indicated in, encoded in, or embodied as output from one or more of the plurality of models 110 a - n.
  • Exemplary embodiments of the present disclosure are described largely in the context of a fully functional computer system for distributed model execution. Readers of skill in the art will recognize, however, that the present disclosure also can be embodied in a computer program product disposed upon computer readable storage media for use with any suitable data processing system.
  • Such computer readable storage media can be any storage medium for machine-readable information, including magnetic media, optical media, or other suitable media. Examples of such media include magnetic disks in hard drives or diskettes, compact disks for optical drives, magnetic tape, and others as will occur to those of skill in the art.
  • Persons skilled in the art will immediately recognize that any computer system having suitable programming means will be capable of executing the steps of the method of the disclosure as embodied in a computer program product. Persons skilled in the art will recognize also that, although some of the exemplary embodiments described in this specification are oriented to software installed and executing on computer hardware, nevertheless, alternative embodiments implemented as firmware or as hardware are well within the scope of the present disclosure.
  • the present disclosure can be a system, a method, and/or a computer program product.
  • the computer program product can include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
  • the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or Flash memory, a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Stored Programmes (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Distributed model execution, including: identifying, for each model of a plurality of models, based on one or more execution constraints for the plurality of models, a corresponding node of a plurality of nodes, wherein the plurality of nodes each comprise one or more computing devices or one or more virtual machines; deploying each model of the plurality of models to the identified corresponding node of the plurality of nodes; and wherein the plurality of models are configured to generate, based on data input to at least one model of the plurality of models, a prediction associated with the data.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a non-provisional application for patent entitled to a filing date and claiming the benefit of earlier-filed U.S. Provisional Patent Application Ser. No. 62/976,965, filed Feb. 14, 2020.
  • This application is related to co-pending U.S. patent application docket Ser. No. SC0010US01, filed Feb. 16, 2021, and co-pending U.S. patent application docket Ser. No. SC0011US01, filed Feb. 16, 2021, each of which is incorporated by reference in its entirety.
  • BACKGROUND
  • Machine learning models may be used to perform various data analysis applications. A client or consumer may not have the hardware or software resources available on-premises to perform computationally intensive predictions or handle large amounts of data.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram of an example system for distributed model execution according to some embodiments.
  • FIG. 2 is a diagram of model dependencies for distributed model execution according to some embodiments.
  • FIG. 3 is a block diagram of an example execution environment for distributed model execution according to some embodiments.
  • FIG. 4 is a flowchart of another example method for distributed model execution according to some embodiments.
  • FIG. 5 is a flowchart of another example method for distributed model execution according to some embodiments.
  • FIG. 6 is a flowchart of another example method for distributed model execution according to some embodiments.
  • FIG. 7 is a flowchart of another example method for distributed model execution according to some embodiments.
  • DETAILED DESCRIPTION
  • Machine learning models may be used to perform various data analysis applications. For example, one or more machine learning models may be used to generate predictions or other analysis based on input data. Machine learning models may be logically integrated such that the output of some models are provided as input to other models, ultimately resulting in a model providing an output as the prediction.
  • A client or consumer may not have the hardware or software resources available on-premises to perform computationally intensive predictions or handle large amounts of data. To address these shortcomings, a client may provide the machine learning models used to generate a prediction to off-site or remote resources, such as remote data centers, cloud computing environments, and the like. These remote resources may have access to hardware such as Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), or other devices that the models may leverage to accelerate their performance. The resulting prediction or output may then be provided back to a client.
  • As will be described in more detail below, the execution of a given model may be performed by a node. Such a node may include a computing device, a virtual machine, or other device as can be appreciated. The models may be deployed for execution to a given node based on various criteria, including the hardware and software resources available to a node, the type of data or calculations used by the model, authorization requirements, model dependencies, and the like. Once deployed, the models may be used for distributed processing of data in order to generate a prediction for a client.
  • FIG. 1 is a block diagram of a non-limiting example system for distributed model execution. The example system includes a model execution environment 106. The model execution environment 106 includes a plurality of nodes 108 a-n. Each node 108 a-n is an allocation of hardware and software resources, including storage resources (e.g., storage devices, memory, and the like), processing resources (e.g., processors, hardware accelerators such as GPUs, FPGAs, and the like), software resources (e.g., operating systems, software applications, and the like), and other resources as can be appreciated to facilitate distributed model execution. Each node 108 a-n may include one or more computing devices, one or more virtual machines, or other allocations of resources as can be appreciated. Each node 108 a-n may be communicatively coupled to another node 108 a-n using various communications resources, including buses, wired or wireless networks, and the like.
  • The system of FIG. 1 also includes a management node 102. The management node 102 is similar to the nodes 108 a-n in that the management node 102 may include a computing device, virtual machine, and the like. The management node 102 is communicatively coupled to the model execution environment 106. Although the management node 102 is shown as separate from the model execution environment 106, it is understood that the management node 102 may be located remote from or proximate to the model execution environment 106. For example, the management node 102 and the model execution environment 106 may be implemented in the same or separate data centers, cloud computing environments, and the like.
  • Also included in the system of FIG. 1 is a client device 112. The client device 112 provides, to the model execution environment 106, a plurality of models 110 a-n for execution in the plurality of nodes 108 a-n. Although FIG. 1 shows each model 110 a-n allocated to and executed in a respective node 108 a-n, it is understood that other configurations and allocations of nodes 108 a-n are possible. For example, a node 108 a-n may be allocated execution of multiple models 110 a-n. As another example, multiple nodes 108 a-n may operate in parallel to facilitate the execution of a single model 110 a-n. In some examples, executing a model at multiple nodes includes assigning different portions of an input data set to the different nodes, where each node executes an entirety of model operations with respect to its respective assigned input data. As yet another example, executing a model at multiple nodes includes executing a first portion of a model (e.g., operations corresponding to one or more first neural network layers) at a first node and a second portion of the model (e.g., operations corresponding to one or more second neural network layers) at a second node, where “intermediate” output from the first node is provided as input to the second node.
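  • As a purely illustrative sketch of the data-parallel case described above (the helper names such as run_model_on_node and the node identifiers are assumptions for this example, not part of the disclosure), the input data set may be split into chunks and each chunk dispatched to a node that runs the full model:

        def split_evenly(rows, n_parts):
            """Divide input rows into n_parts roughly equal chunks."""
            size = max(1, (len(rows) + n_parts - 1) // n_parts)
            return [rows[i:i + size] for i in range(0, len(rows), size)]

        def run_model_on_node(node_url, model_id, rows):
            """Placeholder for dispatching a chunk of input to one node (e.g., an HTTP call)."""
            # In a real deployment this would send `rows` to the node and return its outputs.
            return [f"output-for-{r}" for r in rows]

        def data_parallel_execute(node_urls, model_id, rows):
            """Run the same model on every node, each over its own share of the input."""
            chunks = split_evenly(rows, len(node_urls))
            outputs = []
            for node_url, chunk in zip(node_urls, chunks):
                outputs.extend(run_model_on_node(node_url, model_id, chunk))
            return outputs

        print(data_parallel_execute(["node-108a", "node-108b"], "model-110a", list(range(10))))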
  • The plurality of models 110 a-n may include machine learning models (e.g., trained machine learning models such as neural networks), algorithmic models, and the like each configured to provide some output based on some input data. In aggregate, the plurality of models 110 a-n are configured to generate a prediction based on input to one or more of the models 110 a-n. Such predictions may include, for example, classifications for a classification problem, a numerical value for a regression problem, and the like. The plurality of models 110 a-n may also output one or more confidence values associated with the prediction. Accordingly, each model 110 a-n is configured to receive, as input, data output by another model 110 a-n, provide output as input data to another model 110 a-n, or both.
  • Consider the example graph representations of model dependencies shown in FIG. 2. FIG. 2 shows an exemplary arrangement of models and their respective dependencies. One skilled in the art will appreciate that other arrangements or configurations of model dependencies are possible, and that FIG. 2 merely serves as an illustrative example. As shown in FIG. 2, a model 204 a receives, as input data, input 202 a. Input 202 a may include stored data, data from a data stream, or data from another data source as can be appreciated. Model 204 a provides output to models 204 b and 204 c. Model 204 c receives input from models 204 a and 204 b. Model 204 d receives input from model 204 c and input 202 b. Model 204 d provides, as output data, output 206. In the example of FIG. 2, inputs 202 a, b are provided from one or more data sources to models 204 a and 204 d, respectively. Data processing is performed through the various model dependencies in order to ultimately generate output 206. The output 206 may include a prediction based on the inputs 202 a, b.
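  • For illustration only, the dependencies of FIG. 2 could be encoded as an adjacency map and evaluated in topological order; the placeholder computation below merely stands in for the actual models:

        from graphlib import TopologicalSorter

        # Each model maps to the set of models (or external inputs) it depends on.
        dependencies = {
            "model_204a": {"input_202a"},
            "model_204b": {"model_204a"},
            "model_204c": {"model_204a", "model_204b"},
            "model_204d": {"model_204c", "input_202b"},
        }

        def evaluate(values):
            """Evaluate models in dependency order; `values` holds input_202a and input_202b."""
            for name in TopologicalSorter(dependencies).static_order():
                if name.startswith("input_"):
                    continue  # external inputs are already present in `values`
                upstream = [values[dep] for dep in sorted(dependencies[name])]
                values[name] = sum(upstream)  # placeholder computation standing in for the model
            return values["model_204d"]      # corresponds to output 206

        print(evaluate({"input_202a": 1.0, "input_202b": 2.0}))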
  • Turning back to FIG. 1, the management node 102 executes a management module 104 for distributed model execution. The management module 104 identifies, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n. For example, assume that the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a-n to the model execution environment 106. The request may include the plurality of models 110 a-n. The request may also include identifiers, network addresses, or other data facilitating access to the models 110 a-n. For example, after uploading the plurality of models 110 a-n to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a-n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a-n to which a model 110 a-n will be deployed for execution.
  • The management module 104 identifies the nodes 108 a-n for each model 110 a-n based on one or more execution constraints. The one or more execution constraints for a given model 110 a-n are requirements to be satisfied by a given node 108 a-n in order to execute the given model 110 a-n. The one or more execution constraints may include required constraints, where a node 108 a-n must satisfy a particular constraint for a given model 110 a-n to be deployed there. The one or more execution constraints may also include preferential constraints, where a node 108 a-n is more preferentially selected for deployment of a given model 110 a-n if the constraint is satisfied.
  • The one or more execution constraints may include one or more model dependencies. For example, turning back to the example of FIG. 2, the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input. Similarly, the model 204 c is dependent on models 204 a and 204 b. Accordingly, a node 108 a-n selected for deploying the model 204 b must have a communications pathway to (or be the same node as) nodes 108 a-n to which the models 204 a and 204 c are deployed. Moreover, in some embodiments, the nodes 108 a-n are selected to reduce or minimize latency between nodes 108 a-n having interdependent models 110 a-n.
  • The one or more execution constraints may also include one or more encryption constraints. The one or more encryption constraints may indicate that data input to or received from a given model 110 a-n must be encrypted if transferred over a network. The one or more encryption constraints may also indicate that data input to or received from a given model 110 a-n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a-n are executed in a same node 108 a-n, or executed within different virtual machine nodes 108 a-n implemented in a same hardware environment). The one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like). Accordingly, a node 108 a-n may be selected based on an encryption constraint by selecting a node 108 a-n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints. For example, a model 110 a-n whose output must be encrypted may be preferentially deployed to a node 108 a-n having greater hardware or processing resources, while a model 110 a-n that needs to neither encrypt output nor decrypt input may be preferentially deployed to a node 108 a-n having lesser hardware or processing resources. As a further example, a model 110 a-n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a-n having even greater hardware or processing resources.
  • The one or more execution constraints may also include one or more authorization constraints. An authorization constraint is a restriction on which entities have access to data input to a model 110 a-n, output by a model 110 a-n, generated by the model 110 a-n (e.g., intermediary data or calculations), and the like. For example, an authorization constraint may indicate that a model 110 a-n should be executed on a private node 108 a-n (e.g., a node 108 a-n not shared by or accessible to another tenant or client of the model execution environment 106). As a further example, an authorization constraint may define access privileges for those users or other entities that may access the node 108 a-n executing a given model 110 a-n. As another example, an authorization constraint may indicate that the input to or output from a given model 110 a-n should be transferred only over a private network. Accordingly, the node 108 a-n for the given model 110 a-n should be selected as having access to a private network connection to nodes 108 a-n executing its dependent models 110 a-n.
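  • As a non-limiting sketch, the constraint types discussed above (dependencies, encryption, and authorization) might be captured as plain data attached to each model; the field names here are assumptions chosen for the example rather than a defined schema:

        from dataclasses import dataclass, field
        from typing import List, Optional

        @dataclass
        class ExecutionConstraints:
            depends_on: List[str] = field(default_factory=list)  # models whose output this model consumes
            encrypt_output: bool = False                 # encrypt data sent to downstream models
            encryption_algorithm: Optional[str] = None   # e.g., "AES-256" when a specific scheme is required
            private_node_required: bool = False          # authorization: no shared tenancy with other clients
            private_network_only: bool = False           # authorization: transfer data only over a private network

        constraints = {
            "model_204b": ExecutionConstraints(depends_on=["model_204a"], encrypt_output=True,
                                               encryption_algorithm="AES-256"),
            "model_204c": ExecutionConstraints(depends_on=["model_204a", "model_204b"],
                                               private_node_required=True),
        }
        print(constraints["model_204c"])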
  • The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more node characteristics. Node characteristics for a given node 108 a-n may include hardware resources for the node 108 a-n. Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like. Software resources may include particular operating systems, software libraries, applications, and the like. For example, a model 110 a-n processing highly-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a-n having more RAM than another node 108 a-n. As another example, a model 110 a-n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a-n having the requisite libraries for performing the algorithm installed.
  • The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more model characteristics. The model characteristics for a given model 110 a-n describe the data acted upon and the calculations performed by the model 110 a-n. For example, model characteristics may include a data type for data input to the model 110 a-n. A data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like). A data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, labeled or unlabeled data, time series data, and the like). Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like). For example, models 110 a-n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a-n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations. As another example, a neural network model may be deployed to different node(s) based on architectural parameters, such as whether the neural network is a feed-forward network or a recurrent network, whether the neural network exhibits “memory” (e.g., via long short-term memory (LSTM) architecture), etc.
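  • By way of example only, a simple rule might award accelerator-equipped nodes a bonus for models whose characteristics indicate heavy matrix or floating point work; the characteristic and hardware field names are assumptions made for this sketch:

        def prefers_accelerator(model_characteristics):
            """Return True when the model's workload would benefit from a GPU or FPGA."""
            heavy_ops = {"matrix", "convolution", "floating_point"}
            return bool(heavy_ops & set(model_characteristics.get("operations", [])))

        def hardware_bonus(model_characteristics, node_characteristics):
            """Extra score for accelerator nodes, but only when the model can use them."""
            if prefers_accelerator(model_characteristics) and node_characteristics.get("has_gpu"):
                return 10
            return 0

        model = {"operations": ["matrix", "floating_point"], "input_type": "float32"}
        gpu_node = {"has_gpu": True, "ram_gb": 64}
        cpu_node = {"has_gpu": False, "ram_gb": 16}
        print(hardware_bonus(model, gpu_node), hardware_bonus(model, cpu_node))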
  • Identifying, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n may include calculating, for each model 110 a-n of the plurality of models 110 a-n, a plurality of fitness scores for each of the plurality of nodes 108 a-n. In other words, a given model 110 a-n has a fitness score calculated for each of the nodes 108 a-n indicating a fitness of that node 108 a-n for the given model 110 a-n. Each fitness score for a given model may be calculated based on a degree to which the node 108 a-n satisfies the execution constraints for the model 110 a-n.
  • A node 108 a-n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a-n. For example, assume that a first model 110 a-n is dependent on a second model 110 a-n (e.g., for input by virtue of receiving input from the second model 110 a-n, or output by virtue of providing output to the second model 110 a-n), and that the first model 110 a-n is selected for deployment to a first node 108 a-n. Further assume that a second node 108 a-n and a third node 108 a-n are both communicatively coupled to the first node 108 a-n, with the second node 108 a-n having a lower latency connection to the first node 108 a-n compared to a connection from the third node 108 a-n to the first node 108 a-n. Accordingly, the second model 110 a-n would have a higher fitness score for the second node 108 a-n than the third node 108 a-n by virtue of the lower latency connection to the first node 108 a-n to which the dependent first model 110 a-n is to be deployed.
  • A node 108 a-n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a-n must be executed on a private node 108 a-n. Any nodes 108 a-n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
  • The fitness score may also be calculated based on the node characteristics of each node 108 a-n or the model characteristics of the model 110 a-n. For example, a model 110 a-n performing calculations on highly-dimensional data may assign a higher fitness score to nodes 108 a-n with greater RAM. As another example, nodes 108 a-n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a-n acting on low-dimensional data or performing more simple arithmetic calculations as such hardware resources would not provide a meaningful benefit when compared to other models 110 a-n.
  • The management module 104 may then select, for each model 110 a-n, based on the plurality of fitness scores, the corresponding node 108 a-n (e.g., the node 108 a-n to which a given model 110 a-n will be deployed). In some embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes selecting, for each model 110 a-n a highest scoring node 108 a-n. For example, a node 108 a-n may be selected for each model 110 a-n by traversing a listing or ordering of models 110 a-n and selecting a node 108 a-n for a currently selected model 110 a-n. In some embodiments, after a model 110 a-n is assigned to a given node 108 a-n, the fitness scores for the given node 108 a-n may be recalculated for each model 110 a-n not having an assigned node 108 a-n. Accordingly, in some embodiments, a node 108 a-n already having an assigned model 110 a-n may still be an optimal selection for deploying another model 110 a-n. In other embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes generating multiple combinations or permutations of assigning models 110 a-n for deployment to nodes 108 a-n and calculating a best fit assignment for all of the plurality of models 110 a-n (e.g., an assignment with a highest total fitness score across all models 110 a-n).
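  • The following is a minimal sketch of fitness scoring and greedy selection under assumed, arbitrary weights; the disclosure does not prescribe a particular scoring formula, and the field names are illustrative only:

        def fitness(model, node, assignment):
            """Score one (model, node) pair given the models already assigned."""
            # Required constraint: a zero score disqualifies the node outright.
            if model.get("private_node_required") and not node.get("private"):
                return 0
            score = 1
            # Preferential constraint: more RAM helps models with large or high-dimensional inputs.
            if model.get("large_inputs"):
                score += node.get("ram_gb", 0) / 16
            # Preferential constraint: low latency to nodes hosting this model's dependencies.
            for dep in model.get("depends_on", []):
                dep_node = assignment.get(dep)
                if dep_node is not None:
                    score += 5 / (1 + node.get("latency_ms", {}).get(dep_node, 100))
            return score

        def assign(models, nodes):
            """Greedily assign each model to its highest-scoring node."""
            assignment = {}
            for name, model in models.items():
                scores = {n: fitness(model, spec, assignment) for n, spec in nodes.items()}
                best = max(scores, key=scores.get)
                if scores[best] == 0:
                    raise ValueError(f"no node satisfies the required constraints of {name}")
                assignment[name] = best
            return assignment

        nodes = {
            "node_108a": {"private": True, "ram_gb": 64, "latency_ms": {"node_108b": 2}},
            "node_108b": {"private": False, "ram_gb": 16, "latency_ms": {"node_108a": 2}},
        }
        models = {
            "model_110a": {"private_node_required": True, "large_inputs": True},
            "model_110b": {"depends_on": ["model_110a"]},
        }
        print(assign(models, nodes))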
  • The management module 104 then deploys each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n. Deploying each model 110 a-n may include sending one or more of the models 110 a-n to their respective assigned node 108 a-n. Deploying each model 110 a-n may also include causing a node 108 a-n to acquire or load its assigned model 110 a-n. For example, the management module 104 may issue a command for a given node 108 a-n to load its assigned model 110 a-n from a local or remote storage location.
  • Deploying each model 110 a-n may also include configuring one or more models 110 a-n to receive input from one or more data sources (e.g., data sources other than another model 110 a-n). For example, the management module 104 may provide a node 108 a-n of a given model 110 a-n network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a-n. For example, the management module 104 may provide a node 108 a-n a URL or IP address for a data stream of data to be provided as input to the given model 110 a-n. As another example, the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a-n. The management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
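  • For illustration, the data-source configuration handed to a node might resemble the following; the layout and field names are assumptions, and credentials would typically be referenced indirectly (e.g., via a secret store) rather than embedded:

        import json

        source_config = {
            "model": "model_110a",
            "inputs": [
                {"kind": "stream", "url": "https://streams.example.invalid/sensor-feed"},
                {"kind": "file", "path": "/mnt/shared/input/batch.parquet"},
            ],
            # Credentials would normally come from a secret store rather than appearing inline.
            "credentials": {"token_env_var": "INPUT_FEED_TOKEN"},
        }
        print(json.dumps(source_config, indent=2))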
  • Deploying each model 110 a-n may also include configuring one or more models 110 a-n to provide, as output, a prediction generated by the plurality of models 110 a-n. For example, the management module 104 may indicate a storage location or file path for output data. The management module 104 may further provide an indication of the storage location of the output data to the client device 112.
  • In some embodiments, deploying each model 110 a-n includes configuring each node 108 a-n to communicate with at least one other node 108 a-n of the plurality of nodes 108 a-n. Thus, interdependent models 110 a-n may communicate with each other via the configured nodes 108 a-n. For example, the management module 104 may facilitate the exchange of encryption keys between nodes 108 a-n executing dependent models 110 a-n requiring encryption. As another example, the management module 104 may provide, to nodes 108 a-n executing models 110 a-n having dependent models 110 a-n, the URLs, IP addresses, or other identifiers of the nodes 108 a-n executing their respective dependent models 110 a-n. In some embodiments, the management module 104 may allocate or generate communications pathways between nodes 108 a-n via network communications fabrics of the model execution environment 106. In some embodiments, the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a-n.
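  • As an illustrative sketch, the management module's peer configuration could simply tell each node which addresses host the models it depends on; peer discovery, key exchange, and message formats are assumptions outside the scope of this example:

        def peer_config(assignment, dependencies, node_addresses):
            """Build, per model, the addresses of the nodes hosting its dependencies."""
            config = {}
            for model, deps in dependencies.items():
                config[model] = {
                    "runs_on": node_addresses[assignment[model]],
                    "pull_inputs_from": {dep: node_addresses[assignment[dep]] for dep in deps},
                }
            return config

        assignment = {"model_110a": "node_108a", "model_110b": "node_108b"}
        dependencies = {"model_110a": [], "model_110b": ["model_110a"]}
        node_addresses = {"node_108a": "10.0.0.11:8500", "node_108b": "10.0.0.12:8500"}
        print(peer_config(assignment, dependencies, node_addresses))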
  • A prediction may then be generated by the deployed models 110 a-n. For example, input data may be provided to one or more of the models 110 a-n. A prediction may then be generated as an output of a model 110 a-n by virtue of the distributed and interdependent execution of the models 110 a-n in the nodes 108 a-n. Data indicating the prediction may then be provided or made accessible to the client device 112.
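  • From the client's perspective, requesting a prediction might look like the following sketch; the endpoint paths, job identifiers, and helper functions are hypothetical placeholders rather than an API defined by this disclosure:

        def submit_input(entry_node_url, payload):
            """Placeholder for posting input data to the node hosting the entry model."""
            # A real client would issue an HTTP request to `entry_node_url` with `payload`.
            return {"job_id": "job-42"}

        def fetch_prediction(output_location, job_id):
            """Placeholder for reading the prediction written by the final model."""
            return {"job_id": job_id, "prediction": 0.87, "confidence": 0.93}

        job = submit_input("https://node-108a.example.invalid/infer", {"rows": [[1.0, 2.0, 3.0]]})
        print(fetch_prediction("s3://results-bucket/predictions/", job["job_id"]))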
  • By deploying the models 110 a-n to the nodes 108 a-n of the model execution environment 106 as described above, the management module 104 ensures a useable configuration of models 110 a-n as deployed to nodes 108 a-n. Moreover, the management module 104 ensures that the model 110 a-n deployment preserves the hierarchy of dependencies of models 110 a-n, as well as the encryption and authorization requirements of the models 110 a-n.
  • For further explanation, FIG. 3 sets forth a diagram of an execution environment 300 in accordance with some embodiments of the present disclosure. The execution environment 300 depicted in FIG. 3 may be embodied in a variety of different ways. The execution environment 300 may be provided, for example, by one or more cloud computing providers such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and others, including combinations thereof. Alternatively, the execution environment 300 may be embodied as a collection of devices (e.g., servers, storage devices, networking devices) and software resources that are included in a private data center. In fact, the execution environment 300 may be embodied as a combination of cloud resources and private resources that collectively form a hybrid cloud computing environment.
  • The execution environment 300 depicted in FIG. 3 may include storage resources 302, which may be embodied in many forms. For example, the storage resources 302 may include flash memory, hard disk drives, nano-RAM, non-volatile memory (NVM), 3D crosspoint non-volatile memory, magnetic random access memory (MRAM), non-volatile phase-change memory (PCM), storage class memory (SCM), or many others, including combinations of the storage technologies described above. Readers will appreciate that other forms of computer memories and storage devices may be utilized as part of the execution environment 300, including DRAM, static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), universal memory, and many others. The storage resources 302 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud storage resources such as Amazon Elastic Block Storage (EBS) block storage, Amazon S3 object storage, Amazon Elastic File System (EFS) file storage, Azure Blob Storage, and many others. The example execution environment 300 depicted in FIG. 3 may implement a variety of storage architectures, such as block storage where data is stored in blocks, and each block essentially acts as an individual hard drive, object storage where data is managed as objects, or file storage in which data is stored in a hierarchical structure. Such data may be saved in files and folders, and presented to both the system storing it and the system retrieving it in the same format.
  • The execution environment 300 depicted in FIG. 3 also includes communications resources 304 that may be useful in facilitating data communications between components within the execution environment 300, as well as data communications between the execution environment 300 and computing devices that are outside of the execution environment 300. Such communications resources may be embodied, for example, as one or more routers, network switches, communications adapters, and many others, including combinations of such devices. The communications resources 304 may be configured to utilize a variety of different protocols and data communication fabrics to facilitate data communications. For example, the communications resources 304 may utilize Internet Protocol (IP) based technologies, fibre channel (FC) technologies, FC over ethernet (FCoE) technologies, InfiniBand (IB) technologies, NVM Express (NVMe) technologies and NVMe over fabrics (NVMeoF) technologies, and many others. The communications resources 304 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as networking tools and resources that enable secure connections to the cloud as well as tools and resources (e.g., network interfaces, routing tables, gateways) to configure networking resources in a virtual private cloud. Such communications resources may be useful in facilitating data communications between components within the execution environment 300, as well as data communications between the execution environment 300 and computing devices that are outside of the execution environment 300.
  • The execution environment 300 depicted in FIG. 3 also includes processing resources 306 that may be useful in executing computer program instructions and performing other computational tasks within the execution environment 300. The processing resources 306 may include one or more application-specific integrated circuits (ASICs) that are customized for some particular purpose, one or more central processing units (CPUs), one or more digital signal processors (DSPs), one or more field-programmable gate arrays (FPGAs), one or more systems on a chip (SoCs), or other form of processing resources 306. The processing resources 306 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud computing resources such as one or more Amazon Elastic Compute Cloud (EC2) instances, event-driven compute resources such as AWS Lambdas, Azure Virtual Machines, or many others.
  • The execution environment 300 depicted in FIG. 3 also includes software resources 308 that, when executed by processing resources 306 within the execution environment 300, may perform various tasks. The software resources 308 may include, for example, one or more modules of computer program instructions that when executed by processing resources 306 within the execution environment 300 are useful for distributed model execution. The software resources may include one or more models 310 (e.g., models 110 a-n as executed in nodes 108 a-n of FIG. 1). The software resources may also include a management module 312 (e.g., a management module 104 as described in FIG. 1). Accordingly, the execution environment 300 may include one or more of a management node 102 or a model execution environment 106 as described in FIG. 1.
  • For further explanation, FIG. 4 sets forth a flow chart illustrating an example method for distributed model execution that includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n. For example, assume that the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a-n to the model execution environment 106. The request may include the plurality of models 110 a-n. The request may also include identifiers, network addresses, or other data facilitating access to the models 110 a-n. For example, after uploading the plurality of models 110 a-n to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a-n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a-n to which a model 110 a-n will be deployed for execution.
  • The nodes 108 a-n for each model 110 a-n are identified based on one or more execution constraints. The one or more execution constraints for a given model 110 a-n are requirements to be satisfied by a given node 108 a-n in order to execute the given model 110 a-n. The one or more execution constraints may include required constraints, where a node 108 a-n must satisfy a particular constraint for a given model 110 a-n to be deployed there. The one or more execution constraints may also include preferential constraints, where a node 108 a-n is more preferentially selected for deployment of a given model 110 a-n if the constraint is satisfied.
  • The one or more execution constraints may include one or more model dependencies. For example, turning back to the example of FIG. 2, the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input. Similarly, the model 204 c is dependent on models 204 a and 204 b. Accordingly, a node 108 a-n selected for deploying the model 204 b must have a communications pathway to nodes 108 a-n to which the models 204 a and 204 c are deployed. Moreover, in some embodiments, the nodes 108 a-n are selected to reduce or minimize latency between nodes 108 a-n having interdependent models 110 a-n.
  • The one or more execution constraints may also include one or more encryption constraints. The one or more encryption constraints may indicate that data input to or received from a given model 110 a-n must be encrypted if transferred over a network. The one or more encryption constraints may also indicate that data input to or received from a given model 110 a-n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a-n are executed in a same node 108 a-n, or executed within different virtual machine nodes 108 a-n implemented in a same hardware environment). The one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like). Accordingly, a node 108 a-n may be selected based on an encryption constraint by selecting a node 108 a-n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints. For example, a model 110 a-n whose output must be encrypted may be preferentially deployed to a node 108 a-n having greater hardware or processing resources, while a model 110 a-n that needs to neither encrypt output nor decrypt input may be preferentially deployed to a node 108 a-n having lesser hardware or processing resources. As a further example, a model 110 a-n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a-n having even greater hardware or processing resources.
  • The one or more execution constraints may also include one or more authorization constraints. An authorization constraint is a restriction on which entities have access to data input to a model 110 a-n, output by a model 110 a-n, generated by the model 110 a-n (e.g., intermediary data or calculations), and the like. For example, an authorization constraint may indicate that a model 110 a-n should be executed on a private node 108 a-n (e.g., a node 108 a-n not shared by or accessible to another tenant or client of the model execution environment 106). As a further example, an authorization constraint may define access privileges for those users or other entities that may access the node 108 a-n executing a given model 110 a-n. As another example, an authorization constraint may indicate that the input to or output from a given model 110 a-n should be transferred only over a private network. Accordingly, the node 108 a-n for the given model 110 a-n should be selected as having access to a private network connection to nodes 108 a-n executing its dependent models 110 a-n.
  • The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more node characteristics. Node characteristics for a given node 108 a-n may include hardware resources for the node 108 a-n. Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like. Software resources may include particular operating systems, software libraries, applications, and the like. For example, a model 110 a-n processing highly-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a-n having more RAM than another node 108 a-n. As another example, a model 110 a-n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a-n having the requisite libraries for performing the algorithm installed.
  • The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more model characteristics. The model characteristics for a given model 110 a-n describe the data acted upon and the calculations performed by the model 110 a-n. For example, model characteristics may include a data type for data input to the model 110 a-n. A data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like). A data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, and the like). Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like). For example, models 110 a-n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a-n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations.
  • The method of FIG. 4 also includes deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n. Deploying 404 each model 110 a-n may include sending one or more of the models 110 a-n to their respective assigned node 108 a-n. Deploying 404 each model 110 a-n may also include causing a node 108 a-n to acquire or load its assigned model 110 a-n. For example, the management module 104 may issue a command for a given node 108 a-n to load its assigned model 110 a-n from a local or remote storage location.
  • Deploying 404 each model 110 a-n may also include configuring one or more models 110 a-n to receive input from one or more data sources (e.g., data sources other than another model 110 a-n). For example, the management module 104 may provide a node 108 a-n of a given model 110 a-n network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a-n. For example, the management module 104 may provide a node 108 a-n a URL or IP address for a data stream of data to be provided as input to the given model 110 a-n. As another example, the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a-n. The management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
  • Deploying 404 each model 110 a-n may also include configuring one or more models 110 a-n to provide, as output, a prediction generated by the plurality of models 110 a-n. For example, the management module 104 may indicate a storage location or file path for output data. The management module 104 may further provide an indication of the storage location of the output data to the client device 112.
  • One skilled in the art will appreciate that the approaches set forth above with respect to FIG. 4 may be performed repeatedly such that models 110 a-n are redistributed or redeployed according to various circumstances. Such circumstances may include, for example, a user request, a predefined interval passing, an addition or removal of a node 108 a-n or model 110 a-n, a change in available computational resources in nodes 108 a-n, and the like.
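  • A minimal sketch of such repetition, assuming hypothetical identify() and deploy() helpers that stand in for steps 402 and 404 of FIG. 4, might poll for cluster changes and redeploy when one is detected:

        import time

        def cluster_changed(previous_nodes, current_nodes):
            """A node was added, removed, or its available resources changed."""
            return previous_nodes != current_nodes

        def redeploy_loop(get_nodes, get_models, identify, deploy, interval_s=300):
            """Illustrative background loop; a real system might also react to user requests."""
            known_nodes = get_nodes()
            while True:
                time.sleep(interval_s)
                current_nodes = get_nodes()
                if cluster_changed(known_nodes, current_nodes):
                    assignment = identify(get_models(), current_nodes)  # FIG. 4, step 402
                    deploy(assignment)                                  # FIG. 4, step 404
                    known_nodes = current_nodes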
  • For further explanation, FIG. 5 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure. The method of FIG. 5 is similar to that of FIG. 4 in that the method of FIG. 5 also includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n; and deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n.
  • The method of FIG. 5 differs from the method of FIG. 4 in that identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n includes calculating 502, for each model 110 a-n of the plurality of models 110 a-n, a plurality of fitness scores for each of the plurality of nodes 108 a-n. In other words, a given model 110 a-n has a fitness score calculated for each of the nodes 108 a-n indicating a fitness of that node 108 a-n for the given model 110 a-n. Each fitness score for a given model may be calculated based on a degree to which the node 108 a-n satisfies the execution constraints for the model 110 a-n.
  • A node 108 a-n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a-n. For example, assume that a first model 110 a-n is dependent on a second model 110 a-n (e.g., for input or output), and that the first model 110 a-n is selected for deployment to a first node 108 a-n. Further assume that a second node 108 a-n and a third node 108 a-n are both communicatively coupled to the first node 108 a-n, with the second node 108 a-n having a lower latency connection to the first node 108 a-n compared to a connection from the third node 108 a-n to the first node 108 a-n. Accordingly, the second model 110 a-n would have a higher fitness score for the second node 108 a-n than the third node 108 a-n by virtue of the lower latency connection to the first node 108 a-n to which the dependent first model 110 a-n is to be deployed.
  • A node 108 a-n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a-n must be executed on a private node 108 a-n. Any nodes 108 a-n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
  • The fitness score may also be calculated based on the node characteristics of each node 108 a-n or the model characteristics of the model 110 a-n. For example, a model 110 a-n performing calculations on highly-dimensional data may assign a higher fitness score to nodes 108 a-n with greater RAM. As another example, nodes 108 a-n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a-n acting on low-dimensional data or performing more simple arithmetic calculations as such hardware resources would not provide a meaningful benefit when compared to other models 110 a-n.
  • Identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n also includes selecting 504, for each model 110 a-n, based on the plurality of fitness scores, the corresponding node 108 a-n (e.g., the node 108 a-n to which a given model 110 a-n will be deployed). In some embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes selecting, for each model 110 a-n a highest scoring node 108 a-n. For example, a node 108 a-n may be selected for each model 110 a-n by traversing a listing or ordering of models 110 a-n and selecting a node 108 a-n for a currently selected model 110 a-n. In some embodiments, after a model 110 a-n is assigned to a given node 108 a-n, the fitness scores for the given node 108 a-n may be recalculated for each model 110 a-n not having an assigned node 108 a-n. Accordingly, in some embodiments, a node 108 a-n already having an assigned model 110 a-n may still be an optimal selection for deploying another model 110 a-n. In other embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes generating multiple combinations or permutations of assigning models 110 a-n for deployment to nodes 108 a-n and calculating a best fit assignment for all of the plurality of models 110 a-n (e.g., an assignment with a highest total fitness score across all models 110 a-n).
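  • For small deployments, the best-fit alternative described above can be sketched as an exhaustive search over every assignment of models to nodes; the score() table below is a placeholder standing in for the constraint-based fitness calculation, and the approach is only practical for small numbers of models and nodes:

        from itertools import product

        def score(model, node):
            """Placeholder per-pair fitness; a real scorer would apply the constraints above."""
            return {"model_110a": {"node_108a": 5, "node_108b": 0},
                    "model_110b": {"node_108a": 2, "node_108b": 3}}[model][node]

        def best_fit(models, nodes):
            """Try every assignment of models to nodes and keep the highest total fitness."""
            best_total, best_assignment = float("-inf"), None
            for choice in product(nodes, repeat=len(models)):
                assignment = dict(zip(models, choice))
                total = sum(score(m, n) for m, n in assignment.items())
                if total > best_total:
                    best_total, best_assignment = total, assignment
            return best_assignment

        print(best_fit(["model_110a", "model_110b"], ["node_108a", "node_108b"]))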
  • For further explanation, FIG. 6 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure. The method of FIG. 6 is similar to that of FIG. 4 in that the method of FIG. 6 also includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n; and deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n.
  • The method of FIG. 6 differs from the method of FIG. 4 in that deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n includes configuring 602 each node 108 a-n to communicate with at least one other node 108 a-n of the plurality of nodes 108 a-n. Thus, interdependent models 110 a-n may communicate with each other via the configured nodes 108 a-n. For example, the management module 104 may facilitate the exchange of encryption keys between nodes 108 a-n executing dependent models 110 a-n requiring encryption. As another example, the management module 104 may provide, to nodes 108 a-n executing models 110 a-n having dependent models 110 a-n, the URLs, IP addresses, or other identifiers of the nodes 108 a-n executing their respective dependent models 110 a-n. In some embodiments, the management module 104 may allocate or generate communications pathways between nodes 108 a-n via network communications fabrics of the model execution environment 106. In some embodiments, the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a-n.
  • For further explanation, FIG. 7 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure. The method of FIG. 7 is similar to that of FIG. 4 in that the method of FIG. 7 also includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n; and deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n.
  • The method of FIG. 7 differs from the method of FIG. 4 in that the method of FIG. 7 includes generating 702 a prediction based on a distributed execution of the plurality of models 110 a-n. The execution of the plurality of models 110 a-n is considered a distributed execution in that the models 110 a-n are executed across a plurality of distributed nodes 108 a-n. The plurality of models 110 a-n are executed interdependently in that each node 108 a-n provides output to or receives input from at least one other node 108 a-n. The prediction may be generated based on input data provided to one or more of the plurality of models 110 a-n. The prediction may be indicated in, encoded in, or embodied as output from one or more of the plurality of models 110 a-n.
  • In view of the explanations set forth above, readers will recognize that the benefits of distributed model execution include:
      • Improved performance of a computing system by identifying optimal or best fitting nodes for model deployment and execution.
      • Improved performance of a computing system by allowing for remote, distributed execution of models, leveraging more advanced hardware and computational resources than found in client systems.
      • Improved performance of a computing system by deploying models such that model dependencies, encryption relationships, and authorization requirements are preserved.
  • Exemplary embodiments of the present disclosure are described largely in the context of a fully functional computer system for distributed model execution. Readers of skill in the art will recognize, however, that the present disclosure also can be embodied in a computer program product disposed upon computer readable storage media for use with any suitable data processing system. Such computer readable storage media can be any storage medium for machine-readable information, including magnetic media, optical media, or other suitable media. Examples of such media include magnetic disks in hard drives or diskettes, compact disks for optical drives, magnetic tape, and others as will occur to those of skill in the art. Persons skilled in the art will immediately recognize that any computer system having suitable programming means will be capable of executing the steps of the method of the disclosure as embodied in a computer program product. Persons skilled in the art will recognize also that, although some of the exemplary embodiments described in this specification are oriented to software installed and executing on computer hardware, nevertheless, alternative embodiments implemented as firmware or as hardware are well within the scope of the present disclosure.
  • The present disclosure can be a system, a method, and/or a computer program product. The computer program product can include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
  • The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or Flash memory, a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network can include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present disclosure can be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer can be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) can execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
  • Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
  • These computer readable program instructions can be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions can also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein includes an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • The computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams can represent a module, segment, or portion of instructions, which includes one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block can occur out of the order noted in the figures. For example, two blocks shown in succession can, in fact, be executed substantially concurrently, or the blocks can sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
  • It will be understood from the foregoing description that modifications and changes can be made in various embodiments of the present disclosure. The descriptions in this specification are for purposes of illustration only and are not to be construed in a limiting sense. The scope of the present disclosure is limited only by the language of the following claims.

Claims (22)

What is claimed is:
1. A method for distributed model execution, comprising:
identifying, for each model of a plurality of models, based on one or more execution constraints for the plurality of models, a corresponding node of a plurality of nodes, wherein the plurality of nodes each comprise one or more computing devices or one or more virtual machines;
deploying each model of the plurality of models to the corresponding node of the plurality of nodes; and
wherein the plurality of models are configured to generate, based on data input to at least one model of the plurality of models, a prediction associated with the data.
2. The method of claim 1, wherein the one or more execution constraints comprise one or more model dependencies, one or more encryption constraints, or one or more authorization constraints.
3. The method of claim 1, further comprising generating the prediction based on a distributed execution of the plurality of models.
4. The method of claim 1, wherein identifying the corresponding node of the plurality of nodes is based on one or more node characteristics of the plurality of nodes.
5. The method of claim 4, wherein the one or more node characteristics comprise one or more of: one or more hardware resources of one or more of the plurality of nodes, or one or more software resources of one or more of the plurality of nodes.
6. The method of claim 1, wherein identifying the corresponding node of the plurality of nodes is based on one or more model characteristics of the plurality of models.
7. The method of claim 6, wherein the one or more model characteristics comprise one or more of: a data type for input data to one or more of the plurality of models, or a calculation type performed by one or more of the plurality of models.
8. The method of claim 1, wherein identifying, for each model of the plurality of models, the corresponding node of the plurality of nodes comprises:
calculating, for each model of the plurality of models, a plurality of fitness scores for the plurality of nodes; and
selecting, for each model, based on the plurality of fitness scores, the corresponding node.
9. The method of claim 1, further comprising configuring each node to communicate with at least one other node of the plurality of nodes.
10. The method of claim 9, wherein configuring each node to communicate with at least one other node of the plurality of nodes comprises configuring each node to provide output to or receive input from at least one other node.
11. The method of claim 1, further comprising redeploying one or more models of the plurality of models.
12. An apparatus for distributed model execution, the apparatus configured to perform steps comprising:
identifying, for each model of a plurality of models, based on one or more execution constraints for the plurality of models, a corresponding node of a plurality of nodes,
wherein the plurality of nodes each comprise one or more computing devices or one or more virtual machines;
deploying each model of the plurality of models to the corresponding node of the plurality of nodes; and
wherein the plurality of models are configured to generate, based on data input to at least one model of the plurality of models, a prediction associated with the data.
13. The apparatus of claim 12, wherein the one or more execution constraints comprise one or more model dependencies, one or more encryption constraints, or one or more authorization constraints.
14. The apparatus of claim 12, wherein the steps further comprise generating the prediction based on a distributed execution of the plurality of models.
15. The apparatus of claim 12, wherein identifying the corresponding node of the plurality of nodes is based on one or more node characteristics of the plurality of nodes.
16. The apparatus of claim 15, wherein the one or more node characteristics comprise one or more of: one or more hardware resources of one or more of the plurality of nodes, or one or more software resources of one or more of the plurality of nodes.
17. The apparatus of claim 12, wherein identifying the corresponding node of the plurality of nodes is based on one or more model characteristics of the plurality of models.
18. The apparatus of claim 17, wherein the one or more model characteristics comprise one or more of: a data type for input data to one or more of the plurality of models, or a calculation type performed by one or more of the plurality of models.
19. The apparatus of claim 12, wherein identifying, for each model of the plurality of models, the corresponding node of the plurality of nodes comprises:
calculating, for each model of the plurality of models, a plurality of fitness scores for the plurality of nodes; and
selecting, for each model, based on the plurality of fitness scores, the corresponding node.
20. The apparatus of claim 12, wherein the steps further comprise configuring each node to communicate with at least one other node of the plurality of nodes.
21. The apparatus of claim 20, wherein configuring each node to communicate with at least one other node of the plurality of nodes comprises configuring each node to provide output to or receive input from at least one other node.
22. The apparatus of claim 12, wherein the steps further comprise redeploying one or more models of the plurality of models.
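By way of non-limiting illustration of the fitness-score selection and inter-node communication recited in claims 8-10, the following Python sketch shows one way deployed nodes might be configured so that each node provides its model's output as input to the next node, with distributed execution of the plurality of models yielding a single prediction (claim 3). The names (NodeClient, configure_pipeline) and the toy models are hypothetical assumptions introduced only for explanation; they are not drawn from the specification.

```python
# Illustrative sketch only; names and toy models are hypothetical.
from typing import Any, Callable

class NodeClient:
    """Stand-in for a remote node hosting one deployed model."""
    def __init__(self, name: str, model_fn: Callable[[Any], Any]):
        self.name = name
        self.model_fn = model_fn
        self.downstream = None   # next node in the pipeline, if any

    def send(self, data: Any) -> Any:
        """Run this node's model, then forward the output to the next node."""
        output = self.model_fn(data)
        if self.downstream is not None:
            return self.downstream.send(output)
        return output   # last node in the chain returns the prediction

def configure_pipeline(nodes: list) -> NodeClient:
    """Configure each node to provide its output as input to the next node."""
    for upstream, downstream in zip(nodes, nodes[1:]):
        upstream.downstream = downstream
    return nodes[0]

# Usage: three toy "models" deployed to three nodes.
head = configure_pipeline([
    NodeClient("node-a", lambda x: [v * 2.0 for v in x]),                    # preprocessing model
    NodeClient("node-b", lambda x: sum(x) / len(x)),                         # aggregation model
    NodeClient("node-c", lambda x: "anomaly" if x > 10 else "normal"),       # prediction model
])
prediction = head.send([3.0, 6.0, 9.0])   # -> "anomaly" (mean of doubled values is 12.0)
```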
US17/176,906 2020-02-14 2021-02-16 Distributed model execution Pending US20210255886A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/176,906 US20210255886A1 (en) 2020-02-14 2021-02-16 Distributed model execution

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202062976965P 2020-02-14 2020-02-14
US17/176,906 US20210255886A1 (en) 2020-02-14 2021-02-16 Distributed model execution

Publications (1)

Publication Number Publication Date
US20210255886A1 true US20210255886A1 (en) 2021-08-19

Family

ID=77272078

Family Applications (3)

Application Number Title Priority Date Filing Date
US17/176,906 Pending US20210255886A1 (en) 2020-02-14 2021-02-16 Distributed model execution
US17/176,889 Active US11675614B2 (en) 2020-02-14 2021-02-16 Standardized model packaging and deployment
US17/176,898 Active 2041-06-23 US11947989B2 (en) 2020-02-14 2021-02-16 Process flow for model-based applications

Family Applications After (2)

Application Number Title Priority Date Filing Date
US17/176,889 Active US11675614B2 (en) 2020-02-14 2021-02-16 Standardized model packaging and deployment
US17/176,898 Active 2041-06-23 US11947989B2 (en) 2020-02-14 2021-02-16 Process flow for model-based applications

Country Status (1)

Country Link
US (3) US20210255886A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11030565B1 (en) * 2020-05-18 2021-06-08 Grant Thornton Llp System and method for audit report generation from structured data
US20230092247A1 (en) * 2021-09-22 2023-03-23 Rockwell Automation Technologies, Inc. Automated monitoring using image analysis
US20230205917A1 (en) * 2021-12-24 2023-06-29 BeeKeeperAI, Inc. Systems and methods for data validation and transformation of data in a zero-trust environment

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8494824B2 (en) * 2009-02-16 2013-07-23 The Boeing Company Methods and apparatus for integrating engineering models from disparate tools in support of model reuse
US8341193B2 (en) * 2010-01-12 2012-12-25 Microsoft Corporation Data versioning through data transformations
US20150134362A1 (en) * 2010-09-01 2015-05-14 Apixio, Inc. Systems and methods for a medical coder marketplace
US11538561B2 (en) * 2010-09-01 2022-12-27 Apixio, Inc. Systems and methods for medical information data warehouse management
US8805859B2 (en) * 2011-02-21 2014-08-12 General Electric Company Methods and systems for receiving, mapping and structuring data from disparate systems in a healthcare environment
US10114660B2 (en) * 2011-02-22 2018-10-30 Julian Michael Urbach Software application delivery and launching system
US20120290560A1 (en) * 2011-05-13 2012-11-15 Kushal Das Mechanism for efficiently querying application binary interface/application programming interface-related information
US10685314B1 (en) * 2014-07-31 2020-06-16 Open Text Corporation Case leaf nodes pointing to business objects or document types
US11138220B2 (en) * 2016-11-27 2021-10-05 Amazon Technologies, Inc. Generating data transformation workflows
WO2019028468A1 (en) * 2017-08-04 2019-02-07 Fair Ip, Llc Computer system for building, training and productionizing machine learning models
US10467039B2 (en) * 2017-08-07 2019-11-05 Open Data Group Inc. Deployment and management platform for model execution engine containers
US10831519B2 (en) * 2017-11-22 2020-11-10 Amazon Technologies, Inc. Packaging and deploying algorithms for flexible machine learning
US10621513B2 (en) * 2018-03-08 2020-04-14 Capital One Services, Llc System and method for deploying and versioning machine learning models
US10692153B2 (en) * 2018-07-06 2020-06-23 Optum Services (Ireland) Limited Machine-learning concepts for detecting and visualizing healthcare fraud risk
US20210141791A1 (en) * 2019-11-13 2021-05-13 Adobe Inc. Method and system for generating a hybrid data model
AU2020384311B2 (en) * 2019-11-15 2023-04-06 Equinix, Inc. Secure artificial intelligence model training and registration system

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040225952A1 (en) * 2003-03-06 2004-11-11 Microsoft Corporation Architecture for distributed computing system and automated design, deployment, and management of distributed applications
US20160162819A1 (en) * 2014-12-03 2016-06-09 Hakman Labs LLC Workflow definition, orchestration and enforcement via a collaborative interface according to a hierarchical procedure list
US10606660B1 (en) * 2016-04-29 2020-03-31 Architecture Technology Corporation Planned cloud resource management
US20200220901A1 (en) * 2016-06-10 2020-07-09 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US20190043201A1 (en) * 2017-12-28 2019-02-07 Christina R. Strong Analytic image format for visual computing
US20210005292A1 (en) * 2018-09-25 2021-01-07 Patientory, Inc. System and method of utilizing a user's health data stored over a health care network, for disease prevention
US20200174844A1 (en) * 2018-12-04 2020-06-04 Huawei Technologies Canada Co., Ltd. System and method for resource partitioning in distributed computing
US11315253B2 (en) * 2019-01-22 2022-04-26 Kabushiki Kaisha Toshiba Computer vision system and method
US20200274895A1 (en) * 2019-02-25 2020-08-27 Acronis International Gmbh System and method for creating a data protection map and remediating vulnerabilities
US20210019194A1 (en) * 2019-07-16 2021-01-21 Cisco Technology, Inc. Multi-cloud service mesh orchestration platform
US20210029204A1 (en) * 2019-07-24 2021-01-28 Vmware, Inc. Methods and apparatus to generate migration recommendations to migrate services between geographic regions
US10970123B1 (en) * 2019-09-19 2021-04-06 Amazon Technologies, Inc. Determining suitability of computing resources for workloads
US20200027022A1 (en) * 2019-09-27 2020-01-23 Satish Chandra Jha Distributed machine learning in an information centric network
US11429434B2 (en) * 2019-12-23 2022-08-30 International Business Machines Corporation Elastic execution of machine learning workloads using application based profiling

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220256001A1 (en) * 2021-02-09 2022-08-11 Cisco Technology, Inc. Methods for seamless session transfer without re-keying
US11683380B2 (en) * 2021-02-09 2023-06-20 Cisco Technology, Inc. Methods for seamless session transfer without re-keying

Also Published As

Publication number Publication date
US20210255839A1 (en) 2021-08-19
US11947989B2 (en) 2024-04-02
US20210256000A1 (en) 2021-08-19
US11675614B2 (en) 2023-06-13

Similar Documents

Publication Publication Date Title
US20210255886A1 (en) Distributed model execution
US10516623B2 (en) Pluggable allocation in a cloud computing system
US10700866B2 (en) Anonymous encrypted data
US10116742B2 (en) Scalable approach to manage storage volumes across heterogenous cloud systems
US11870650B2 (en) System, method and computer program product for network function optimization based on locality and function type
US10567269B2 (en) Dynamically redirecting affiliated data to an edge computing device
JP2023551527A (en) Secure computing resource placement using homomorphic encryption
Makris et al. Towards a distributed storage framework for edge computing infrastructures
US10554626B2 (en) Filtering of authenticated synthetic transactions
EP2852893A1 (en) Pluggable allocation in a cloud computing system
US11650954B2 (en) Replication continued enhancement method
CN115150117A (en) Maintaining confidentiality in decentralized policies
JP2024501168A (en) Secure memory sharing method
US11349663B2 (en) Secure workload configuration
US11102258B2 (en) Stream processing without central transportation planning
Reali et al. Orchestration of cloud genomic services
US11695552B2 (en) Quantum key distribution in a multi-cloud environment
US11875202B2 (en) Visualizing API invocation flows in containerized environments
US11709607B2 (en) Storage block address list entry transform architecture
US20230403146A1 (en) Smart round robin delivery for hardware security module host requests
WO2024032653A1 (en) Reducing network overhead
US20230222228A1 (en) Database hierarchical encryption for hybrid-cloud environment
Jiang et al. RADU: Bridging the divide between data and infrastructure management to support data-driven collaborations
CN116615891A (en) Key rotation on a publish-subscribe system
Suganya et al. DISTRIBUTED MINING ALGORITHM USING HADOOP ON LARGE DATA SET

Legal Events

Date Code Title Description
AS Assignment

Owner name: SPARKCOGNITION, INC., TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VON NIEDERHAUSERN, EUGENE;GORTI, SREENIVASA;DIVINCENZO, KEVIN W.;AND OTHERS;SIGNING DATES FROM 20210317 TO 20210322;REEL/FRAME:056165/0443

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: ORIX GROWTH CAPITAL, LLC, TEXAS

Free format text: SECURITY INTEREST;ASSIGNOR:SPARKCOGNITION, INC.;REEL/FRAME:059760/0360

Effective date: 20220421

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED