US20210255886A1 - Distributed model execution - Google Patents
- Publication number
- US20210255886A1 (application US17/176,906)
- Authority
- US
- United States
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/211—Schema design and management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/35—Creation or generation of source code model driven
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/433—Dependency analysis; Data or control flow analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/52—Binary to binary
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- FIG. 1 is a block diagram of an example system for distributed model execution according to some embodiments.
- FIG. 2 is a diagram of model dependencies for distributed model execution according to some embodiments.
- FIG. 3 is a block diagram of an example execution environment for distributed model execution according to some embodiments.
- FIG. 4 is a flowchart of an example method for distributed model execution according to some embodiments.
- FIG. 5 is a flowchart of another example method for distributed model execution according to some embodiments.
- FIG. 6 is a flowchart of another example method for distributed model execution according to some embodiments.
- FIG. 7 is a flowchart of another example method for distributed model execution according to some embodiments.
- Machine learning models may be used to perform various data analysis applications. For example, one or more machine learning models may be used to generate predictions or other analysis based on input data. Machine learning models may be logically integrated such that the outputs of some models are provided as input to other models, ultimately resulting in a model providing an output as the prediction.
- a client or consumer may not have the hardware or software resources available on-premises to perform computationally intensive predictions or handle large amounts of data.
- a client may provide the machine learning models used to generate a prediction to off-site or remote resources, such as remote data centers, cloud computing environments, and the like.
- These remote resources may have access to hardware such as Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), or other devices that the models may leverage to accelerate their performance.
- the resulting prediction or output may then be provided back to a client.
- the execution of a given model may be performed by a node.
- a node may include a computing device, a virtual machine, or other device as can be appreciated.
- the models may be deployed for execution to a given node based on various criteria, including the hardware and software resources available to a node, the type of data or calculations used by the model, authorization requirements, model dependencies, and the like. Once deployed, the models may be used for distributed processing of data in order to generate a prediction for a client.
- FIG. 1 is a block diagram of a non-limiting example system for distributed model execution.
- the example system includes a model execution environment 106 .
- the model execution environment 106 includes a plurality of nodes 108 a - n.
- Each node 108 a - n is an allocation of hardware and software resources, including storage resources (e.g., storage devices, memory, and the like), processing resources (e.g., processors, hardware accelerators such as GPUs, FPGAs, and the like), software resources (e.g., operating systems, software applications, and the like), and other resources as can be appreciated to facilitate distributed model execution.
- Each node 108 a - n may include one or more computing devices, one or more virtual machines, or other allocations of resources as can be appreciated.
- Each node 108 a - n may be communicatively coupled to another node 108 a - n using various communications resources, including buses, wired or wireless networks, and the like.
- the system of FIG. 1 also includes a management node 102 .
- the management node 102 is similar to the nodes 108 a - n in that the management node 102 may include a computing device, virtual machine, and the like.
- the management node 102 is communicatively coupled to the model execution environment 106 .
- Although the management node 102 is shown as separate from the model execution environment 106 , it is understood that the management node 102 may be located remote from or proximate to the model execution environment 106 .
- the management node 102 and the model execution environment 106 may be implemented in the same or separate data centers, cloud computing environments, and the like.
- the client device 112 provides, to the model execution environment 106 , a plurality of models 110 a - n for execution in the plurality of nodes 108 a - n.
- Although FIG. 1 shows each model 110 a - n allocated to and executed in a respective node 108 a - n, it is understood that other configurations and allocations of nodes 108 a - n are possible.
- a node 108 a - n may be allocated execution of multiple models 110 a - n.
- multiple nodes 108 a - n may operate in parallel to facilitate the execution of a single model 110 a - n.
- In some embodiments, executing a model at multiple nodes includes assigning different portions of an input data set to the different nodes, where each node executes an entirety of model operations with respect to its respective assigned input data.
- In other embodiments, executing a model at multiple nodes includes executing a first portion of a model (e.g., operations corresponding to one or more first neural network layers) at a first node and a second portion of the model (e.g., operations corresponding to one or more second neural network layers) at a second node, where “intermediate” output from the first node is provided as input to the second node.
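The two multi-node arrangements described above can be contrasted with a minimal sketch. Everything in it is hypothetical and for illustration only: a "model" is reduced to a list of Python callables standing in for layers, a "node" is simulated by an ordinary function call, and the names `run_model`, `data_parallel`, and `model_parallel` do not come from the source.

```python
from typing import Callable, List, Sequence

Layer = Callable[[float], float]


def run_model(layers: Sequence[Layer], x: float) -> float:
    """Apply every layer, in order, to a single input value."""
    for layer in layers:
        x = layer(x)
    return x


def data_parallel(layers: Sequence[Layer], inputs: Sequence[float],
                  num_nodes: int) -> List[float]:
    """Each simulated node runs the entire model on its own slice of the inputs."""
    shards = [list(inputs[i::num_nodes]) for i in range(num_nodes)]
    per_node = [[run_model(layers, x) for x in shard] for shard in shards]
    # Put the per-node results back into the original input order.
    out = [0.0] * len(inputs)
    for node_idx, results in enumerate(per_node):
        for j, value in enumerate(results):
            out[node_idx + j * num_nodes] = value
    return out


def model_parallel(layers: Sequence[Layer], inputs: Sequence[float],
                   split: int) -> List[float]:
    """A first node runs layers[:split]; a second node runs layers[split:]."""
    first, second = layers[:split], layers[split:]
    intermediate = [run_model(first, x) for x in inputs]  # "first node"
    return [run_model(second, h) for h in intermediate]   # "second node"


# Toy three-layer model computing (2x + 1)^2.
layers = [lambda x: x * 2.0, lambda x: x + 1.0, lambda x: x * x]
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]

print(data_parallel(layers, inputs, num_nodes=2))
print(model_parallel(layers, inputs, split=1))
```

Both arrangements produce the same outputs; they differ in what is partitioned (the input data versus the model's operations) and therefore in what must be sent between nodes.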
- the plurality of models 110 a - n may include machine learning models (e.g., trained machine learning models such as neural networks), algorithmic models, and the like each configured to provide some output based on some input data.
- the plurality of models 110 a - n are configured to generate a prediction based on input to one or more of the models 110 a - n. Such predictions may include, for example, classifications for a classification problem, a numerical value for a regression problem, and the like.
- the plurality of models 110 a - n may also output one or more confidence values associated with the prediction. Accordingly, each model 110 a - n is configured to receive, as input, data output by another model 110 a - n, provide output as input data to another model 110 a - n, or both.
- FIG. 2 shows an exemplary arrangement of models and their respective dependencies.
- a model 204 a receives, as input data, input 202 a.
- Input 202 a may include stored data, data from a data stream, or data from another data source as can be appreciated.
- Model 204 a provides output to models 204 b and 204 c.
- Model 204 c receives input from models 204 a and 204 b.
- Model 204 d receives input from model 204 c and input 202 b.
- Model 204 d provides, as output data, output 206 .
- inputs 202 a, b are provided from one or more data sources to models 204 a and 204 d, respectively.
- Data processing is performed through the various model dependencies in order to ultimately generate output 206 .
- the output 206 may include a prediction based on the inputs 202 a, b.
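As an illustrative sketch, the FIG. 2 arrangement can be written down as a dependency graph and evaluated in dependency order. The graph below follows the dependency description in this document (204 b depends on 204 a; 204 c depends on 204 a and 204 b; 204 d depends on 204 c and input 202 b). The stub models (plain sums), the numeric stand-ins for inputs 202 a and 202 b, and the helper names are assumptions, not part of the source. The sketch relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

# Each model maps to the set of models whose output it consumes.
dependencies: Dict[str, Set[str]] = {
    "204a": set(),
    "204b": {"204a"},
    "204c": {"204a", "204b"},
    "204d": {"204c"},
}

# Hypothetical numeric stand-ins for inputs 202a and 202b.
external_inputs: Dict[str, List[float]] = {"204a": [1.0], "204d": [10.0]}


def execution_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every model runs after the models it depends on."""
    return list(TopologicalSorter(deps).static_order())


def run_pipeline(models: Dict[str, Callable[[List[float]], float]],
                 deps: Dict[str, Set[str]],
                 external: Dict[str, List[float]]) -> Dict[str, float]:
    """Evaluate each model on its upstream outputs plus any external input."""
    outputs: Dict[str, float] = {}
    for name in execution_order(deps):
        upstream = [outputs[d] for d in sorted(deps[name])]
        outputs[name] = models[name](upstream + external.get(name, []))
    return outputs


# Stub models: each one simply sums whatever it receives.
stub_models = {name: sum for name in dependencies}
results = run_pipeline(stub_models, dependencies, external_inputs)
print(execution_order(dependencies))
print(results["204d"])  # stands in for output 206
```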
- the management node 102 executes a management module 104 for distributed model execution.
- The management module 104 identifies, for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n. For example, assume that the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a - n to the model execution environment 106 . The request may include the plurality of models 110 a - n.
- The request may also include identifiers, network addresses, or other data facilitating access to the models 110 a - n. For example, after the plurality of models 110 a - n have been uploaded to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a - n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a - n to which a model 110 a - n will be deployed for execution.
- the management module 104 identifies the nodes 108 a - n for each model 110 a - n based on one or more execution constraints.
- the one or more execution constraints for a given model 110 a - n are requirements to be satisfied by a given node 108 a - n in order to execute the given model 110 a - n.
- the one or more execution constraints may include required constraints, where a node 108 a - n must satisfy a particular constraint for a given model 110 a - n to be deployed there.
- the one or more execution constraints may also include preferential constraints, where a node 108 a - n is more preferentially selected for deployment of a given model 110 a - n if the constraint is satisfied.
- the one or more execution constraints may include one or more model dependencies.
- the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input.
- the model 204 c is dependent on models 204 a and 204 b.
- a node 108 a - n selected for deploying the model 204 b must have a communications pathway to (or be the same node as) nodes 108 a - n to which the models 204 a and 204 c are deployed.
- the nodes 108 a - n are selected to reduce or minimize latency between nodes 108 a - n having interdependent models 110 a - n.
- the one or more execution constraints may also include one or more encryption constraints.
- the one or more encryption constraints may indicate data input to or received from a given model 110 a - n must be encrypted if transferred over a network.
- The one or more encryption constraints may also indicate that data input to or received from a given model 110 a - n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a - n are executed in a same node 108 a - n, or executed within different virtual machine nodes 108 a - n implemented in a same hardware environment).
- the one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like).
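The difference between "encrypt if transferred over a network" and "encrypt regardless" can be sketched as a small policy check. This is a simplified illustration under stated assumptions: the `EncryptionPolicy` names, the `Node` type, and the rule that a transfer crosses a network only when the two nodes sit on different hosts are all hypothetical and not taken from the source.

```python
from dataclasses import dataclass
from enum import Enum


class EncryptionPolicy(Enum):
    NONE = "none"                   # no encryption constraint
    NETWORK_ONLY = "network_only"   # encrypt only when data crosses a network
    ALWAYS = "always"               # encrypt regardless of the transfer path


@dataclass(frozen=True)
class Node:
    name: str
    host: str  # hypothetical: the hardware environment the node runs in


def must_encrypt(policy: EncryptionPolicy, source: Node, destination: Node) -> bool:
    """Decide whether data passed between two models' nodes must be encrypted."""
    # Assumption: nodes on the same host (including the same node) exchange
    # data without going over a network.
    crosses_network = source.host != destination.host
    if policy is EncryptionPolicy.ALWAYS:
        return True
    if policy is EncryptionPolicy.NETWORK_ONLY:
        return crosses_network
    return False


vm_a = Node("vm-a", host="rack1-server3")
vm_b = Node("vm-b", host="rack1-server3")   # different VM, same hardware
vm_c = Node("vm-c", host="rack2-server1")

print(must_encrypt(EncryptionPolicy.NETWORK_ONLY, vm_a, vm_b))
print(must_encrypt(EncryptionPolicy.NETWORK_ONLY, vm_a, vm_c))
print(must_encrypt(EncryptionPolicy.ALWAYS, vm_a, vm_a))
```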
- a node 108 a - n may be selected based on an encryption constraint by selecting a node 108 a - n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints.
- A model 110 a - n whose output must be encrypted may be preferentially deployed to a node 108 a - n having greater hardware or processing resources, while a model 110 a - n that needs to neither encrypt output nor decrypt input may be preferentially deployed to a node 108 a - n having lesser hardware or processing resources.
- a model 110 a - n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a - n having even greater hardware or processing resources.
- the one or more execution constraints may also include one or more authorization constraints.
- An authorization constraint is a restriction on which entities have access to data input to a model 110 a - n, output by a model 110 a - n, generated by the model 110 a - n (e.g., intermediary data or calculations), and the like.
- an authorization constraint may indicate that a model 110 a - n should be executed on a private node 108 a - n (e.g., a node 108 a - n not shared by or accessible to another tenant or client of the model execution environment 106 ).
- an authorization constraint may define access privileges for those users or other entities that may access the node 108 a - n executing a given model 110 a - n.
- an authorization constraint may indicate that the input to or output from a given model 110 a - n should be transferred only over a private network. Accordingly, the node 108 a - n for the given model 110 a - n should be selected as having access to a private network connection to nodes 108 a - n executing its dependent models 110 a - n.
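One way to picture authorization constraints is as a filter over candidate nodes. The sketch below is an assumption-laden illustration, not the source's method: the `Node` and `AuthorizationConstraint` types, the tenant and network names, and the rule that two nodes can talk privately when they share at least one private network are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Node:
    name: str
    tenants: FrozenSet[str]                        # clients with access to the node
    private_networks: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class AuthorizationConstraint:
    owner: str                                     # client that owns the model
    require_private_node: bool = False
    require_private_network: bool = False


def satisfies(constraint: AuthorizationConstraint, node: Node,
              peers: Sequence[Node]) -> bool:
    """Check a candidate node against a model's authorization constraint.

    `peers` are the nodes executing the models this model exchanges data with.
    """
    if constraint.require_private_node and node.tenants != frozenset({constraint.owner}):
        return False
    if constraint.require_private_network:
        for peer in peers:
            same_node = peer.name == node.name
            if not same_node and not (node.private_networks & peer.private_networks):
                return False
    return True


def authorized_nodes(constraint: AuthorizationConstraint, candidates: Sequence[Node],
                     peers: Sequence[Node]) -> List[str]:
    return [n.name for n in candidates if satisfies(constraint, n, peers)]


peer = Node("peer", frozenset({"acme"}), frozenset({"vpc-1"}))
candidates = [
    Node("a", frozenset({"acme"}), frozenset({"vpc-1"})),           # private, shared VPC
    Node("b", frozenset({"acme", "other"}), frozenset({"vpc-1"})),  # shared with another tenant
    Node("c", frozenset({"acme"})),                                 # no private network
]
strict = AuthorizationConstraint("acme", require_private_node=True,
                                 require_private_network=True)
print(authorized_nodes(strict, candidates, [peer]))
```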
- the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more node characteristics.
- Node characteristics for a given node 108 a - n may include hardware resources for the node 108 a - n.
- Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like.
- Node characteristics may also include software resources for the node 108 a - n. Software resources may include particular operating systems, software libraries, applications, and the like. For example, a model 110 a - n processing highly-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a - n having more RAM than another node 108 a - n.
- a model 110 a - n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a - n having the requisite libraries for performing the algorithm installed.
- the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more model characteristics.
- the model characteristics for a given model 110 a - n describe the data acted upon and the calculations performed by the model 110 a - n.
- model characteristics may include a data type for data input to the model 110 a - n.
- a data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like).
- a data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, labeled or unlabeled data, time series data, and the like).
- Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like).
- models 110 a - n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a - n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations.
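Model characteristics of this kind can be captured in a small record and used to decide whether a model is likely to benefit from an accelerator-equipped node. The field names and the decision rule below are hypothetical simplifications chosen for illustration; the source does not prescribe a particular rule.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ModelCharacteristics:
    value_type: str              # e.g. "integer", "floating point", "bytes"
    dimensions: int              # 1 for single values, >1 for multidimensional data
    operations: FrozenSet[str]   # e.g. {"matrix", "arithmetic", "boolean"}


def benefits_from_accelerator(c: ModelCharacteristics) -> bool:
    """Assumed rule: matrix operations on multidimensional floating point data
    are the case where a GPU, FPGA, or similar accelerator is preferred."""
    return (c.value_type == "floating point"
            and c.dimensions > 1
            and "matrix" in c.operations)


image_model = ModelCharacteristics("floating point", 3, frozenset({"matrix"}))
counter_model = ModelCharacteristics("integer", 1, frozenset({"arithmetic"}))

print(benefits_from_accelerator(image_model))
print(benefits_from_accelerator(counter_model))
```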
- a neural network model may be deployed to different node(s) based on architectural parameters, such as whether the neural network is a feed-forward network or a recurrent network, whether the neural network exhibits “memory” (e.g., via long short-term memory (LSTM) architecture), etc.
- Identifying, for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n may include calculating, for each model 110 a - n of the plurality of models 110 a - n, a plurality of fitness scores for each of the plurality of nodes 108 a - n.
- a given model 110 a - n has a fitness score calculated for each of the nodes 108 a - n indicating a fitness of that node 108 a - n for the given model 110 a - n.
- Each fitness score for a given model may be calculated based on a degree to which the node 108 a - n satisfies the execution constraints for the model 110 a - n.
- a node 108 a - n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a - n.
- For example, assume that a first model 110 a - n is dependent on a second model 110 a - n (e.g., for input by virtue of receiving input from the second model 110 a - n, or output by virtue of providing output to the second model 110 a - n ), and that the first model 110 a - n is selected for deployment to a first node 108 a - n.
- Further assume that a second node 108 a - n and a third node 108 a - n are both communicatively coupled to the first node 108 a - n, with the second node 108 a - n having a lower latency connection to the first node 108 a - n compared to a connection from the third node 108 a - n to the first node 108 a - n.
- In this case, the second model 110 a - n would have a higher fitness score for the second node 108 a - n than for the third node 108 a - n by virtue of the lower latency connection to the first node 108 a - n to which the dependent first model 110 a - n is to be deployed.
- a node 108 a - n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a - n must be executed on a private node 108 a - n. Any nodes 108 a - n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
- The fitness score may also be calculated based on the node characteristics of each node 108 a - n or the model characteristics of the model 110 a - n. For example, nodes 108 a - n with greater RAM may receive a higher fitness score for a model 110 a - n performing calculations on highly-dimensional data. As another example, nodes 108 a - n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a - n acting on low-dimensional data or performing simpler arithmetic calculations, as such hardware resources would not provide a meaningful benefit when compared to other nodes 108 a - n.
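The fitness scoring described above can be sketched as follows. The source does not give a formula, so the weighted sum used here, the `Constraint` type, the dictionary-based node descriptions, and the latency-to-degree mapping are all assumptions made for illustration. The one behavior taken directly from the text is that failing a required constraint yields a zero score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

NodeInfo = Dict[str, float]


@dataclass
class Constraint:
    name: str
    required: bool
    degree: Callable[[NodeInfo], float]  # how well a node satisfies it, 0.0 to 1.0
    weight: float = 1.0


def fitness(node: NodeInfo, constraints: List[Constraint]) -> float:
    """Score one node for one model; zero if any required constraint fails."""
    score = 0.0
    for c in constraints:
        d = max(0.0, min(1.0, c.degree(node)))
        if c.required and d == 0.0:
            return 0.0
        score += c.weight * d
    return score


constraints = [
    # Required: the node must be private (not shared with other tenants).
    Constraint("private node", True, lambda n: 1.0 if n["private"] else 0.0),
    # Preferential: lower latency to the node of the dependent model is better.
    Constraint("low latency", False, lambda n: 1.0 / (1.0 + n["latency_ms"])),
]

second_node = {"private": 1, "latency_ms": 5.0}
third_node = {"private": 1, "latency_ms": 40.0}
shared_node = {"private": 0, "latency_ms": 1.0}

for label, node in [("second", second_node), ("third", third_node),
                    ("shared", shared_node)]:
    print(label, fitness(node, constraints))
```

Under these assumptions the second node outscores the third because of its lower latency, and the shared node scores zero despite having the lowest latency of all.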
- the management module 104 may then select, for each model 110 a - n, based on the plurality of fitness scores, the corresponding node 108 a - n (e.g., the node 108 a - n to which a given model 110 a - n will be deployed).
- In some embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes selecting, for each model 110 a - n, a highest scoring node 108 a - n.
- a node 108 a - n may be selected for each model 110 a - n by traversing a listing or ordering of models 110 a - n and selecting a node 108 a - n for a currently selected model 110 a - n.
- After a node 108 a - n is selected for a given model 110 a - n, the fitness scores for the given node 108 a - n may be recalculated for each model 110 a - n not having an assigned node 108 a - n.
- a node 108 a - n already having an assigned model 110 a - n may still be an optimal selection for deploying another model 110 a - n.
- selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes generating multiple combinations or permutations of assigning models 110 a - n for deployment to nodes 108 a - n and calculating a best fit assignment for all of the plurality of models 110 a - n (e.g., an assignment with a highest total fitness score across all models 110 a - n ).
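The two selection strategies described above, taking the highest scoring node per model versus searching over whole assignments for the highest total, can be compared in a short sketch. The score table, the function names, and the per-node `capacity` limit are hypothetical; the capacity limit is added only so that the two strategies can disagree. The exhaustive search enumerates every combination and is therefore practical only for small numbers of models and nodes.

```python
from itertools import product
from typing import Dict, Optional, Tuple

Scores = Dict[str, Dict[str, float]]  # scores[model][node] = fitness score


def select_highest(scores: Scores) -> Dict[str, str]:
    """Pick, independently for each model, its highest scoring node."""
    return {model: max(per_node, key=per_node.get)
            for model, per_node in scores.items()}


def select_best_fit(scores: Scores,
                    capacity: int) -> Tuple[Optional[Dict[str, str]], float]:
    """Try every model-to-node combination and keep the highest total score."""
    models = list(scores)
    nodes = sorted({n for per_node in scores.values() for n in per_node})
    best, best_total = None, -1.0
    for combo in product(nodes, repeat=len(models)):
        # Assumed limit on how many models one node may host.
        if any(combo.count(n) > capacity for n in nodes):
            continue
        # A zero (or missing) score means a required constraint is not met.
        if any(scores[m].get(n, 0.0) == 0.0 for m, n in zip(models, combo)):
            continue
        total = sum(scores[m][n] for m, n in zip(models, combo))
        if total > best_total:
            best, best_total = dict(zip(models, combo)), total
    return best, best_total


scores = {
    "m1": {"n1": 0.9, "n2": 0.8},
    "m2": {"n1": 0.7, "n2": 0.1},
}

print(select_highest(scores))
print(select_best_fit(scores, capacity=1))
```

In this toy table both models prefer node `n1`; when each node may host only one model, the best total is reached by giving `n1` to `m2` and moving `m1` to its second choice.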
- the management module 104 then deploys each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
- Deploying each model 110 a - n may include sending one or more of the models 110 a - n to their respective assigned node 108 a - n.
- Deploying each model 110 a - n may also include causing a node 108 a - n to acquire or load its assigned model 110 a - n.
- the management module 104 may issue a command for a given node 108 a - n to load its assigned model 110 a - n from a local or remote storage location.
- Deploying each model 110 a - n may also include configuring one or more models 110 a - n to receive input from one or more data sources (e.g., data sources other than another model 110 a - n ).
- the management module 104 may provide a node 108 a - n of a given model 110 a - n network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a - n.
- the management module 104 may provide a node 108 a - n a URL or IP address for a data stream of data to be provided as input to the given model 110 a - n.
- the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a - n.
- the management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
- Deploying each model 110 a - n may also include configuring one or more models 110 a - n to provide, as output, a prediction generated by the plurality of models 110 a - n.
- the management module 104 may indicate a storage location or file path for output data.
- the management module 104 may further provide an indication of the storage location of the output data to the client device 112 .
- deploying each model 110 a - n includes configuring each node 108 a - n to communicate with at least one other node 108 a - n of the plurality of nodes 108 a - n.
- each interdependent model 110 a - n may communicate with each other via the configured nodes 108 a - n.
- The management module 104 may facilitate the exchange of encryption keys between nodes 108 a - n executing dependent models 110 a - n requiring encryption.
- the management module 104 may provide, to nodes 108 a - n executing models 110 a - n having dependent models 110 a - n, the URLs, IP addresses, or other identifiers of the nodes 108 a - n executing their respective dependent models 110 a - n.
- the management module 104 may allocate or generate communications pathways between nodes 108 a - n via network communications fabrics of the model execution environment 106 .
- the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a - n.
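The deployment configuration steps above (pointing each model at its data sources, telling each node where its upstream and downstream peers live, and naming an output location for the final prediction) can be sketched as a function that builds one configuration record per model. The record layout, the function name, the addresses (taken from the reserved documentation range 192.0.2.0/24 and the example.com domain), and the output path are all hypothetical.

```python
from typing import Dict, List, Set


def build_deployment_configs(assignment: Dict[str, str],
                             dependencies: Dict[str, Set[str]],
                             node_addresses: Dict[str, str],
                             data_sources: Dict[str, List[str]],
                             output_location: str) -> Dict[str, dict]:
    """Build, for every model, the settings its node needs to exchange data."""
    configs: Dict[str, dict] = {}
    for model, node in assignment.items():
        upstream = sorted(dependencies.get(model, set()))
        downstream = sorted(m for m, deps in dependencies.items() if model in deps)
        configs[model] = {
            "node": node,
            "inputs_from": {m: node_addresses[assignment[m]] for m in upstream},
            "outputs_to": {m: node_addresses[assignment[m]] for m in downstream},
            "data_sources": data_sources.get(model, []),
        }
    # Models with no downstream consumer write the final prediction.
    for cfg in configs.values():
        if not cfg["outputs_to"]:
            cfg["output_location"] = output_location
    return configs


dependencies = {"204a": set(), "204b": {"204a"},
                "204c": {"204a", "204b"}, "204d": {"204c"}}
assignment = {"204a": "node-1", "204b": "node-1", "204c": "node-2", "204d": "node-3"}
node_addresses = {"node-1": "192.0.2.10", "node-2": "192.0.2.11", "node-3": "192.0.2.12"}
data_sources = {"204a": ["https://data.example.com/stream/202a"],
                "204d": ["/mnt/inputs/202b.csv"]}

configs = build_deployment_configs(assignment, dependencies, node_addresses,
                                   data_sources, "/mnt/outputs/output-206.json")
for model, cfg in configs.items():
    print(model, cfg)
```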
- a prediction may then be generated by the deployed models 110 a - n. For example, input data may be provided to one or more of the models 110 a - n. A prediction may then be generated as an output of a model 110 a - n by virtue of the distributed and interdependent execution of the models 110 a - n in the nodes 108 a - n. Data indicating the prediction may then be provided or made accessible to the client device 112 .
- By deploying the models 110 a - n to the nodes 108 a - n of the model execution environment 106 as described above, the management module 104 ensures a usable configuration of models 110 a - n as deployed to nodes 108 a - n. Moreover, the management module 104 ensures that the model 110 a - n deployment preserves the hierarchy of dependencies of models 110 a - n, as well as the encryption and authorization requirements of the models 110 a - n.
- FIG. 3 sets forth a diagram of an execution environment 300 in accordance with some embodiments of the present disclosure.
- the execution environment 300 depicted in FIG. 3 may be embodied in a variety of different ways.
- the execution environment 300 may be provided, for example, by one or more cloud computing providers such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and others, including combinations thereof.
- the execution environment 300 may be embodied as a collection of devices (e.g., servers, storage devices, networking devices) and software resources that are included in a private data center.
- the execution environment 300 may be embodied as a combination of cloud resources and private resources that collectively form a hybrid cloud computing environment.
- the execution environment 300 depicted in FIG. 3 may include storage resources 302 , which may be embodied in many forms.
- the storage resources 302 may include flash memory, hard disk drives, nano-RAM, non-volatile memory (NVM), 3D crosspoint non-volatile memory, magnetic random access memory (MRAM), non-volatile phase-change memory (PCM), storage class memory (SCM), or many others, including combinations of the storage technologies described above.
- the storage resources 302 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud storage resources such as Amazon Elastic Block Storage (EBS) block storage, Amazon S3 object storage, Amazon Elastic File System (EFS) file storage, Azure Blob Storage, and many others.
- the example execution environment 300 depicted in FIG. 3 may implement a variety of storage architectures, such as block storage where data is stored in blocks, and each block essentially acts as an individual hard drive, object storage where data is managed as objects, or file storage in which data is stored in a hierarchical structure. Such data may be saved in files and folders, and presented to both the system storing it and the system retrieving it in the same format.
- the execution environment 300 depicted in FIG. 3 also includes communications resources 304 that may be useful in facilitating data communications between components within the execution environment 300 , as well as data communications between the execution environment 300 and computing devices that are outside of the execution environment 300 .
- Such communications resources may be embodied, for example, as one or more routers, network switches, communications adapters, and many others, including combinations of such devices.
- the communications resources 304 may be configured to utilize a variety of different protocols and data communication fabrics to facilitate data communications.
- the communications resources 304 may utilize Internet Protocol (IP) based technologies, fibre channel (FC) technologies, FC over ethernet (FCoE) technologies, InfiniBand (IB) technologies, NVM Express (NVMe) technologies and NVMe over fabrics (NVMeoF) technologies, and many others.
- the communications resources 304 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as networking tools and resources that enable secure connections to the cloud as well as tools and resources (e.g., network interfaces, routing tables, gateways) to configure networking resources in a virtual private cloud.
- the execution environment 300 depicted in FIG. 3 also includes processing resources 306 that may be useful in executing computer program instructions and performing other computational tasks within the execution environment 300 .
- the processing resources 306 may include one or more application-specific integrated circuits (ASICs) that are customized for some particular purpose, one or more central processing units (CPUs), one or more digital signal processors (DSPs), one or more field-programmable gate arrays (FPGAs), one or more systems on a chip (SoCs), or other form of processing resources 306 .
- the processing resources 306 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud computing resources such as one or more Amazon Elastic Compute Cloud (EC2) instances, event-driven compute resources such as AWS Lambdas, Azure Virtual Machines, or many others.
- the execution environment 300 depicted in FIG. 3 also includes software resources 308 that, when executed by processing resources 306 within the execution environment 300 , may perform various tasks.
- the software resources 308 may include, for example, one or more modules of computer program instructions that when executed by processing resources 306 within the execution environment 300 are useful for distributed model execution.
- the software resources may include one or more models 310 (e.g., models 110 a - n as executed in nodes 108 a - n of FIG. 1 ).
- the software resources may also include a management module 312 (e.g., a management module 104 as described in FIG. 1 ). Accordingly, the execution environment 300 may include one or more of a management node 102 or a model execution environment 106 as described in FIG. 1 .
- FIG. 4 sets forth a flow chart illustrating an example method for distributed model execution that includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n.
- the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a - n to the model execution environment 106 .
- the request may include the plurality of models 110 a - n.
- the request may also include identifiers, network addresses, or other data facilitating access to the models 110 a - n. For example, after the plurality of models 110 a - n has been uploaded to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a - n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a - n to which a model 110 a - n will be deployed for execution.
- the nodes 108 a - n for each model 110 a - n are identified based on one or more execution constraints.
- the one or more execution constraints for a given model 110 a - n are requirements to be satisfied by a given node 108 a - n in order to execute the given model 110 a - n.
- the one or more execution constraints may include required constraints, where a node 108 a - n must satisfy a particular constraint for a given model 110 a - n to be deployed there.
- the one or more execution constraints may also include preferential constraints, where a node 108 a - n is more preferentially selected for deployment of a given model 110 a - n if the constraint is satisfied.
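- The distinction between required and preferential constraints can be pictured with a minimal sketch. The names `Constraint`, `eligible`, and `preference_count`, and the dictionary-shaped node descriptions, are illustrative assumptions and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Node = Dict[str, object]  # hypothetical node description, e.g. {"private": True}


@dataclass(frozen=True)
class Constraint:
    name: str
    check: Callable[[Node], bool]  # True when the node satisfies the constraint
    required: bool                 # True = required, False = preferential


def eligible(node: Node, constraints: List[Constraint]) -> bool:
    """A node is eligible only if it satisfies every required constraint."""
    return all(c.check(node) for c in constraints if c.required)


def preference_count(node: Node, constraints: List[Constraint]) -> int:
    """Count how many preferential constraints the node satisfies."""
    return sum(1 for c in constraints if not c.required and c.check(node))


constraints = [
    Constraint("private", lambda n: bool(n.get("private")), required=True),
    Constraint("has_gpu", lambda n: bool(n.get("gpu")), required=False),
]
nodes = {
    "node_a": {"private": True, "gpu": False},
    "node_b": {"private": True, "gpu": True},
    "node_c": {"private": False, "gpu": True},
}

# Required constraints filter the candidates; preferential ones rank them.
candidates = [name for name, n in nodes.items() if eligible(n, constraints)]
best = max(candidates, key=lambda name: preference_count(nodes[name], constraints))
```

- In this sketch, `node_c` is excluded outright because it fails the required constraint, while `node_b` is preferred over `node_a` only because it satisfies one more preferential constraint.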
- the one or more execution constraints may include one or more model dependencies.
- the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input.
- the model 204 c is dependent on models 204 a and 204 b.
- a node 108 a - n selected for deploying the model 204 b must have a communications pathway to nodes 108 a - n to which the models 204 a and 204 c are deployed.
- the nodes 108 a - n are selected to reduce or minimize latency between nodes 108 a - n having interdependent models 110 a - n.
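- As a rough illustration of the dependency constraint, the sketch below checks whether a candidate placement gives every pair of dependent models a communications pathway and totals the latency along those pathways. It assumes the dependencies form the chain 204 a to 204 b to 204 c; the node names, the latency table, and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Assumed dependency chain: 204b consumes 204a's output, 204c consumes 204b's.
dependencies: List[Tuple[str, str]] = [("204a", "204b"), ("204b", "204c")]

# Hypothetical link latencies in milliseconds. A missing entry means the two
# nodes have no communications pathway between them.
latency_ms: Dict[FrozenSet[str], float] = {
    frozenset({"n1", "n2"}): 2.0,
    frozenset({"n2", "n3"}): 9.0,
}


def link_latency(a: str, b: str) -> Optional[float]:
    """Latency between two nodes, 0 if identical, None if unreachable."""
    if a == b:
        return 0.0  # co-located models need no network hop
    return latency_ms.get(frozenset({a, b}))


def placement_latency(placement: Dict[str, str]) -> Optional[float]:
    """Total latency over all dependency edges, or None if any edge lacks a pathway."""
    total = 0.0
    for producer, consumer in dependencies:
        hop = link_latency(placement[producer], placement[consumer])
        if hop is None:
            return None
        total += hop
    return total


chain = placement_latency({"204a": "n1", "204b": "n2", "204c": "n3"})
colocated = placement_latency({"204a": "n1", "204b": "n1", "204c": "n2"})
broken = placement_latency({"204a": "n1", "204b": "n3", "204c": "n2"})
```

- A placement that returns `None` violates the pathway requirement; among the remaining placements, a lower total corresponds to the latency-reducing selection described above.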
- the one or more execution constraints may also include one or more encryption constraints.
- the one or more encryption constraints may indicate data input to or received from a given model 110 a - n must be encrypted if transferred over a network.
- the one or more encryption constraints may also indicate that data input to or received from a given model 110 a - n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a - n are executed in a same node 108 a - n, or executed within different virtual machine nodes 108 a - n implemented in a same hardware environment).
- the one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like).
- Accordingly, a node 108 a - n may be selected based on an encryption constraint by selecting a node 108 a - n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints. For example, a model 110 a - n whose output must be encrypted may be preferentially deployed to a node 108 a - n having greater hardware or processing resources, while a model 110 a - n that needs neither to encrypt output nor to decrypt input may be preferentially deployed to a node 108 a - n having lesser hardware or processing resources. As a further example, a model 110 a - n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a - n having even greater hardware or processing resources.
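- The two encryption modes described above (encrypt only when data crosses a network, or encrypt regardless) can be sketched as a small decision function. The enum `EncryptionMode` and the function `must_encrypt` are hypothetical, and the sketch assumes, for simplicity, that a transfer crosses a network exactly when the source and destination node identifiers differ.

```python
from enum import Enum


class EncryptionMode(Enum):
    NONE = "none"                  # no encryption constraint
    NETWORK_ONLY = "network_only"  # encrypt only when data crosses a network
    ALWAYS = "always"              # encrypt even for same-node transfers


def must_encrypt(mode: EncryptionMode, source_node: str, dest_node: str) -> bool:
    """Decide whether data passed between two models must be encrypted."""
    # Simplifying assumption: different node identifiers imply a network transfer.
    crosses_network = source_node != dest_node
    if mode is EncryptionMode.ALWAYS:
        return True
    if mode is EncryptionMode.NETWORK_ONLY:
        return crosses_network
    return False
```

- For example, `must_encrypt(EncryptionMode.NETWORK_ONLY, "108a", "108a")` evaluates to `False`, whereas the same call with `EncryptionMode.ALWAYS` evaluates to `True`.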
- the one or more execution constraints may also include one or more authorization constraints.
- An authorization constraint is a restriction on which entities have access to data input to a model 110 a - n, output by a model 110 a - n, generated by the model 110 a - n (e.g., intermediary data or calculations), and the like.
- an authorization constraint may indicate that a model 110 a - n should be executed on a private node 108 a - n (e.g., a node 108 a - n not shared by or accessible to another tenant or client of the model execution environment 106 ).
- an authorization constraint may define access privileges for those users or other entities that may access the node 108 a - n executing a given model 110 a - n.
- an authorization constraint may indicate that the input to or output from a given model 110 a - n should be transferred only over a private network. Accordingly, the node 108 a - n for the given model 110 a - n should be selected as having access to a private network connection to nodes 108 a - n executing its dependent models 110 a - n.
- the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more node characteristics.
- Node characteristics for a given node 108 a - n may include hardware resources for the node 108 a - n.
- Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like.
- Software resources may include particular operating systems, software libraries, applications, and the like. For example, a model 110 a - n processing highly-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a - n having more RAM than another node 108 a - n.
- a model 110 a - n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a - n having the requisite libraries for performing the algorithm installed.
- the management module 104 may also identify the nodes 108 a - n for each model 110 a - n based on one or more model characteristics.
- the model characteristics for a given model 110 a - n describe the data acted upon and the calculations performed by the model 110 a - n.
- model characteristics may include a data type for data input to the model 110 a - n.
- a data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like).
- a data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, and the like).
- Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like).
- models 110 a - n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a - n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations.
- the method of FIG. 4 also includes deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
- Deploying 404 each model 110 a - n may include sending one or more of the models 110 a - n to their respective assigned node 108 a - n.
- Deploying 404 each model 110 a - n may also include causing a node 108 a - n to acquire or load its assigned model 110 a - n.
- the management module 104 may issue a command for a given node 108 a - n to load its assigned model 110 a - n from a local or remote storage location.
- Deploying 404 each model 110 a - n may also include configuring one or more models 110 a - n to receive input from one or more data sources (e.g., data sources other than another model 110 a - n ).
- the management module 104 may provide a node 108 a - n of a given model 110 a - n network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a - n.
- the management module 104 may provide a node 108 a - n a URL or IP address for a data stream of data to be provided as input to the given model 110 a - n.
- the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a - n.
- the management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
- Deploying 404 each model 110 a - n may also include configuring one or more models 110 a - n to provide, as output, a prediction generated by the plurality of models 110 a - n.
- the management module 104 may indicate a storage location or file path for output data.
- the management module 104 may further provide an indication of the storage location of the output data to the client device 112 .
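- To make the deployment configuration concrete, the sketch below shows one possible shape for the information a management module might send to a node: where to load the model from, where its input comes from, and where output goes. The class names, field names, file path, and URL are all hypothetical placeholders rather than anything specified in the disclosure.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    kind: str      # "stream" or "stored"
    location: str  # URL, IP address, memory address, or file path
    credentials_ref: Optional[str] = None  # reference to credentials, not the secret itself


@dataclass
class ModelDeployment:
    model_id: str
    node_id: str
    load_from: str  # local or remote location the node loads the model from
    inputs: List[DataSource] = field(default_factory=list)
    output_location: Optional[str] = None  # set for a model that emits the prediction


deployment = ModelDeployment(
    model_id="110a",
    node_id="108a",
    load_from="/models/110a.bin",  # placeholder path
    inputs=[
        DataSource(
            kind="stream",
            location="https://example.invalid/feed",  # placeholder URL
            credentials_ref="feed-token",
        )
    ],
)

# Serialize the descriptor as it might be sent to the node.
message = json.dumps(asdict(deployment), sort_keys=True)
```

- Keeping only a reference to credentials in the descriptor, rather than the credentials themselves, is a design choice of this sketch and not a requirement stated in the text.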
- models 110 a - n may be redistributed or redeployed according to various circumstances.
- Such circumstances may include, for example, a user request, a predefined interval passing, an addition or removal of a node 108 a - n or model 110 a - n, a change in available computational resources in nodes 108 a - n, and the like.
- FIG. 5 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure.
- the method of FIG. 5 is similar to that of FIG. 4 in that the method of FIG. 5 also includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n; and deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
- the method of FIG. 5 differs from that of FIG. 4 in that identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n includes calculating 502 , for each model 110 a - n of the plurality of models 110 a - n, a plurality of fitness scores for each of the plurality of nodes 108 a - n.
- a given model 110 a - n has a fitness score calculated for each of the nodes 108 a - n indicating a fitness of that node 108 a - n for the given model 110 a - n.
- Each fitness score for a given model may be calculated based on a degree to which the node 108 a - n satisfies the execution constraints for the model 110 a - n.
- a node 108 a - n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a - n. For example, assume that a first model 110 a - n is dependent on a second model 110 a - n (e.g., for input or output), and that the first model 110 a - n is selected for deployment to a first node 108 a - n.
- a second node 108 a - n and a third node 108 a - n are both communicatively coupled to the first node 108 a - n, with the second node 108 a - n having a lower latency connection to the first node 108 a - n compared to a connection from the third node 108 a - n to the first node 108 a - n.
- the second model 110 a - n would have a higher fitness score for the second node 108 a - n than the third node 108 a - n by virtue of the lower latency connection to the first node 108 a - n to which the dependent first model 110 a - n is to be deployed.
- a node 108 a - n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a - n must be executed on a private node 108 a - n. Any nodes 108 a - n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
- the fitness score may also be calculated based on the node characteristics of each node 108 a - n or the model characteristics of the model 110 a - n. For example, nodes 108 a - n with greater RAM may receive a higher fitness score for a model 110 a - n performing calculations on highly-dimensional data. As another example, nodes 108 a - n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a - n acting on low-dimensional data or performing simpler arithmetic calculations, as such hardware resources would not provide a meaningful benefit to those models when compared to other models 110 a - n.
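- A fitness score of this kind might be computed as in the following sketch. The weights, the dictionary keys (`requires_private`, `high_dimensional`, `matrix_ops`, `ram_gb`, `accelerator`), and the scoring formula are assumptions chosen for illustration; the disclosure does not prescribe a particular formula.

```python
from typing import Dict


def fitness(model: Dict, node: Dict) -> float:
    """Score how well a node fits a model; 0.0 marks a failed required constraint."""
    # Required constraint: a model needing a private node scores zero on shared nodes.
    if model.get("requires_private") and not node.get("private"):
        return 0.0
    score = 1.0  # baseline for any node meeting the required constraints
    # Preferential: more RAM helps models handling highly-dimensional data.
    if model.get("high_dimensional"):
        score += node.get("ram_gb", 0) / 64.0
    # Preferential: accelerators count only when the model performs matrix operations.
    if model.get("matrix_ops") and node.get("accelerator"):
        score += 2.0
    return score


models = {
    "110a": {"requires_private": True, "high_dimensional": True, "matrix_ops": True},
    "110b": {"matrix_ops": False},
}
nodes = {
    "108a": {"private": True, "ram_gb": 128, "accelerator": True},
    "108b": {"private": False, "ram_gb": 256, "accelerator": True},
    "108c": {"private": True, "ram_gb": 64, "accelerator": False},
}

# One fitness score per (model, node) pair.
scores = {
    m: {n: fitness(m_chars, n_chars) for n, n_chars in nodes.items()}
    for m, m_chars in models.items()
}
```

- Here the shared node `108b` scores zero for model `110a` despite having the most RAM, and the simpler model `110b` scores the same on every node because accelerators give it no benefit.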
- Identifying 402 for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n also includes selecting 504 , for each model 110 a - n, based on the plurality of fitness scores, the corresponding node 108 a - n (e.g., the node 108 a - n to which a given model 110 a - n will be deployed).
- selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes selecting, for each model 110 a - n, a highest scoring node 108 a - n.
- a node 108 a - n may be selected for each model 110 a - n by traversing a listing or ordering of models 110 a - n and selecting a node 108 a - n for a currently selected model 110 a - n.
- the fitness scores for the given node 108 a - n may be recalculated for each model 110 a - n not having an assigned node 108 a - n. Accordingly, in some embodiments, a node 108 a - n already having an assigned model 110 a - n may still be an optimal selection for deploying another model 110 a - n.
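- The traversal with recalculation described above resembles a greedy assignment, sketched below. The function `assign_greedy`, the example scoring function, and its bonus and penalty values are hypothetical; the scoring callback is re-evaluated for each model using the assignments made so far.

```python
from typing import Callable, Dict, List


def assign_greedy(
    models: List[str],
    nodes: List[str],
    score: Callable[[str, str, Dict[str, str]], float],
) -> Dict[str, str]:
    """Walk the model ordering, picking the highest scoring node for each model.

    `score(model, node, assigned)` sees the assignments made so far, so earlier
    choices can raise or lower a node's fitness for later models.
    """
    assigned: Dict[str, str] = {}
    for model in models:
        scored = {node: score(model, node, assigned) for node in nodes}
        best = max(nodes, key=lambda n: scored[n])
        if scored[best] <= 0.0:
            raise ValueError(f"no eligible node for model {model}")
        assigned[model] = best
    return assigned


base = {"n1": 3.0, "n2": 2.0}  # hypothetical baseline fitness per node
upstream = {"m2": "m1"}        # m2 consumes m1's output


def example_score(model: str, node: str, assigned: Dict[str, str]) -> float:
    s = base[node]
    dep = upstream.get(model)
    if dep in assigned and assigned[dep] == node:
        s += 1.5  # same node as the upstream model: no network hop
    s -= 1.0 * sum(1 for n in assigned.values() if n == node)  # load already placed
    return s


result = assign_greedy(["m1", "m2", "m3"], ["n1", "n2"], example_score)
```

- In this example `n1` is chosen for `m2` even though it already hosts `m1`, because co-location with the upstream model outweighs the load penalty; by the time `m3` is placed, `n2` has become the better choice.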
- selecting, based on the plurality of fitness scores, the corresponding node 108 a - n includes generating multiple combinations or permutations of assigning models 110 a - n for deployment to nodes 108 a - n and calculating a best fit assignment for all of the plurality of models 110 a - n (e.g., an assignment with a highest total fitness score across all models 110 a - n ).
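- A best fit assignment over all models can be sketched as an exhaustive search. The sketch assumes, purely for illustration, that each node hosts at most one model (so candidate assignments are permutations of nodes) and that a zero score marks a failed required constraint; the function name `best_fit` is hypothetical.

```python
from itertools import permutations
from typing import Dict, Optional, Tuple


def best_fit(
    scores: Dict[str, Dict[str, float]],
) -> Tuple[Optional[Dict[str, str]], float]:
    """Return the assignment with the highest total fitness across all models."""
    models = list(scores)
    nodes = sorted({n for row in scores.values() for n in row})
    best_assignment: Optional[Dict[str, str]] = None
    best_total = float("-inf")
    # Assumption: one model per node, so try every ordered choice of distinct nodes.
    for combo in permutations(nodes, len(models)):
        per_model = [scores[m].get(n, 0.0) for m, n in zip(models, combo)]
        if any(s <= 0.0 for s in per_model):
            continue  # a zero score marks a failed required constraint
        total = sum(per_model)
        if total > best_total:
            best_assignment, best_total = dict(zip(models, combo)), total
    return best_assignment, best_total


scores = {
    "m1": {"n1": 5.0, "n2": 4.0},
    "m2": {"n1": 5.0, "n2": 1.0},
}
assignment, total = best_fit(scores)
```

- Picking the highest scoring node for `m1` first would yield a total of 6.0, whereas the exhaustive search finds the assignment totalling 9.0. The number of candidate assignments grows factorially, so this approach is only practical for small numbers of models and nodes.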
- FIG. 6 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure.
- the method of FIG. 6 is similar to that of FIG. 4 in that the method of FIG. 6 also includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n; and deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
- the method of FIG. 6 differs from that of FIG. 4 in that deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n includes configuring 602 each node 108 a - n to communicate with at least one other node 108 a - n of the plurality of nodes 108 a - n.
- each interdependent model 110 a - n may communicate with each other via the configured nodes 108 a - n.
- the management module 104 may facilitate the exchange of encryption keys between nodes 108 a - n executing dependent models 110 a - n requiring encryption.
- the management module 104 may provide, to nodes 108 a - n executing models 110 a - n having dependent models 110 a - n, the URLs, IP addresses, or other identifiers of the nodes 108 a - n executing their respective dependent models 110 a - n.
- the management module 104 may allocate or generate communications pathways between nodes 108 a - n via network communications fabrics of the model execution environment 106 .
- the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a - n.
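- One way to derive the per-node communication configuration is to walk the model dependencies and record, for each node, the addresses of the nodes hosting dependent models. The function `peer_config`, the model and node labels, and the IP addresses below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def peer_config(
    dependencies: List[Tuple[str, str]],
    placement: Dict[str, str],
    addresses: Dict[str, str],
) -> Dict[str, List[str]]:
    """Map each node to the addresses of nodes it must exchange data with."""
    peers: Dict[str, Set[str]] = defaultdict(set)
    for producer, consumer in dependencies:
        src, dst = placement[producer], placement[consumer]
        if src != dst:  # co-located models need no peer entry
            peers[src].add(addresses[dst])
            peers[dst].add(addresses[src])
    return {node: sorted(addrs) for node, addrs in peers.items()}


dependencies = [("110a", "110b"), ("110b", "110c")]
placement = {"110a": "108a", "110b": "108b", "110c": "108b"}
addresses = {"108a": "10.0.0.1", "108b": "10.0.0.2"}  # placeholder addresses

config = peer_config(dependencies, placement, addresses)
```

- Because models `110b` and `110c` share node `108b` in this example, only one peer relationship (between `108a` and `108b`) needs to be configured.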
- FIG. 7 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure.
- the method of FIG. 7 is similar to that of FIG. 4 in that the method of FIG. 7 also includes identifying 402 , for each model 110 a - n of a plurality of models 110 a - n, based on one or more execution constraints for the plurality of models 110 a - n, a corresponding node 108 a - n of a plurality of nodes 108 a - n; and deploying 404 each model 110 a - n of the plurality of models 110 a - n to the identified corresponding node 108 a - n of the plurality of nodes 108 a - n.
- the method of FIG. 7 differs from that of FIG. 4 in that the method of FIG. 7 includes generating 702 a prediction based on a distributed execution of the plurality of models 110 a - n.
- the execution of the plurality of models 110 a - n is considered a distributed execution in that the models 110 a - n are executed across a plurality of distributed nodes 108 a - n.
- the plurality of models 110 a - n are executed interdependently in that each node 108 a - n provides output to or receives input from at least one other node 108 a - n.
- the prediction may be generated based on input data provided to one or more of the plurality of models 110 a - n.
- the prediction may be indicated in, encoded in, or embodied as output from one or more of the plurality of models 110 a - n.
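- The interdependent execution can be simulated in a single process to show how a prediction emerges from chained models. In the sketch below, the lambda functions are hypothetical stand-ins for real models, the dependency map is assumed, and in an actual deployment each model would run on its own node and exchange data over the configured pathways rather than through a shared dictionary.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List

# Hypothetical stand-ins for models: each takes a list of inputs and returns a value.
models: Dict[str, Callable[[List[float]], float]] = {
    "110a": lambda xs: sum(xs),        # consumes the external input data
    "110b": lambda xs: xs[0] * 2.0,    # consumes the output of 110a
    "110c": lambda xs: xs[0] + xs[1],  # consumes the outputs of 110a and 110b
}

# Maps each model to the models whose output it takes as input.
inputs_from: Dict[str, List[str]] = {
    "110a": [],
    "110b": ["110a"],
    "110c": ["110a", "110b"],
}


def predict(external_input: List[float], final: str = "110c") -> float:
    """Run every model after its dependencies and return the final model's output."""
    outputs: Dict[str, float] = {}
    # static_order() yields each model only after all models it depends on.
    for name in TopologicalSorter(inputs_from).static_order():
        upstream = [outputs[dep] for dep in inputs_from[name]]
        outputs[name] = models[name](upstream if upstream else external_input)
    return outputs[final]


prediction = predict([1.0, 2.0])
```

- With the input `[1.0, 2.0]`, model `110a` produces 3.0, model `110b` produces 6.0, and model `110c` combines them into the prediction 9.0.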
- Exemplary embodiments of the present disclosure are described largely in the context of a fully functional computer system for distributed model execution. Readers of skill in the art will recognize, however, that the present disclosure also can be embodied in a computer program product disposed upon computer readable storage media for use with any suitable data processing system.
- Such computer readable storage media can be any storage medium for machine-readable information, including magnetic media, optical media, or other suitable media. Examples of such media include magnetic disks in hard drives or diskettes, compact disks for optical drives, magnetic tape, and others as will occur to those of skill in the art.
- Persons skilled in the art will immediately recognize that any computer system having suitable programming means will be capable of executing the steps of the method of the disclosure as embodied in a computer program product. Persons skilled in the art will recognize also that, although some of the exemplary embodiments described in this specification are oriented to software installed and executing on computer hardware, nevertheless, alternative embodiments implemented as firmware or as hardware are well within the scope of the present disclosure.
- the present disclosure can be a system, a method, and/or a computer program product.
- the computer program product can include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
- the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
- the computer readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
- a non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or Flash memory, a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
- a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
- Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
- the network can include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
- a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
- Computer readable program instructions for carrying out operations of the present disclosure can be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
- the computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
- the remote computer can be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet Service Provider).
- electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) can execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
- These computer readable program instructions can be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- These computer readable program instructions can also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein includes an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
- the computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
- each block in the flowchart or block diagrams can represent a module, segment, or portion of instructions, which includes one or more executable instructions for implementing the specified logical function(s).
- the functions noted in the block can occur out of the order noted in the figures.
- two blocks shown in succession can, in fact, be executed substantially concurrently, or the blocks can sometimes be executed in the reverse order, depending upon the functionality involved.
Abstract
Description
- This application is a non-provisional application for patent entitled to a filing date and claiming the benefit of earlier-filed U.S. Provisional Patent Application Ser. No. 62/976,965, filed Feb. 14, 2020.
- This application is related to co-pending U.S. patent application docket Ser. No. SC0010US01, filed Feb. 16, 2021, and co-pending U.S. patent application docket Ser. No. SC0011US01, filed Feb. 16, 2021, each of which is incorporated by reference in its entirety.
- Machine learning models may be used to perform various data analysis applications. A client or consumer may not have the hardware or software resources available on-premises to perform computationally intensive predictions or handle large amounts of data.
FIG. 1 is a block diagram of an example system for distributed model execution according to some embodiments.
FIG. 2 is a diagram of model dependencies for distributed model execution according to some embodiments.
FIG. 3 is a block diagram of an example execution environment for distributed model execution according to some embodiments.
FIG. 4 is a flowchart of another example method for distributed model execution according to some embodiments.
FIG. 5 is a flowchart of another example method for distributed model execution according to some embodiments.
FIG. 6 is a flowchart of another example method for distributed model execution according to some embodiments.
FIG. 7 is a flowchart of another example method for distributed model execution according to some embodiments.
- Machine learning models may be used to perform various data analysis applications. For example, one or more machine learning models may be used to generate predictions or other analysis based on input data. Machine learning models may be logically integrated such that the output of some models are provided as input to other models, ultimately resulting in a model providing an output as the prediction.
- A client or consumer may not have the hardware or software resources available on-premises to perform computationally intensive predictions or handle large amounts of data. To address these shortcomings, a client may provide the machine learning models used to generate a prediction to off-site or remote resources, such as remote data centers, cloud computing environments, and the like. These remote resources may have access to hardware such as Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), or other devices that the models may leverage to accelerate their performance. The resulting prediction or output may then be provided back to a client.
- As will be described in more detail below, the execution of a given model may be performed by a node. Such a node may include a computing device, a virtual machine, or other device as can be appreciated. The models may be deployed for execution to a given node based on various criteria, including the hardware and software resources available to a node, the type of data or calculations used by the model, authorization requirements, model dependencies, and the like. Once deployed, the models may be used for distributed processing of data in order to generate a prediction for a client.
- FIG. 1 is a block diagram of a non-limiting example system for distributed model execution. The example system includes a model execution environment 106. The model execution environment 106 includes a plurality of nodes 108 a-n. Each node 108 a-n is an allocation of hardware and software resources, including storage resources (e.g., storage devices, memory, and the like), processing resources (e.g., processors, hardware accelerators such as GPUs, FPGAs, and the like), software resources (e.g., operating systems, software applications, and the like), and other resources as can be appreciated to facilitate distributed model execution. Each node 108 a-n may include one or more computing devices, one or more virtual machines, or other allocations of resources as can be appreciated. Each node 108 a-n may be communicatively coupled to another node 108 a-n using various communications resources, including buses, wired or wireless networks, and the like.
- The system of FIG. 1 also includes a management node 102. The management node 102 is similar to the nodes 108 a-n in that the management node 102 may include a computing device, virtual machine, and the like. The management node 102 is communicatively coupled to the model execution environment 106. Although the management node 102 is shown as separate from the model execution environment 106, it is understood that the management node 102 may be located remote from or proximate to the model execution environment 106. For example, the management node 102 and the model execution environment 106 may be implemented in the same or separate data centers, cloud computing environments, and the like.
- Also included in the system of FIG. 1 is a client device 112. The client device 112 provides, to the model execution environment 106, a plurality of models 110 a-n for execution in the plurality of nodes 108 a-n. Although FIG. 1 shows each model 110 a-n allocated to and executed in a respective node 108 a-n, it is understood that other configurations and allocations of nodes 108 a-n are possible. For example, a node 108 a-n may be allocated execution of multiple models 110 a-n. As another example, multiple nodes 108 a-n may operate in parallel to facilitate the execution of a single model 110 a-n. In some examples, executing a model at multiple nodes includes assigning different portions of an input data set to the different nodes, where each node executes the entirety of the model operations on its assigned input data. As yet another example, executing a model at multiple nodes includes executing a first portion of a model (e.g., operations corresponding to one or more first neural network layers) at a first node and a second portion of the model (e.g., operations corresponding to one or more second neural network layers) at a second node, where “intermediate” output from the first node is provided as input to the second node.
- The plurality of models 110 a-n may include machine learning models (e.g., trained machine learning models such as neural networks), algorithmic models, and the like, each configured to provide some output based on some input data. In aggregate, the plurality of models 110 a-n are configured to generate a prediction based on input to one or more of the models 110 a-n. Such predictions may include, for example, classifications for a classification problem, a numerical value for a regression problem, and the like. The plurality of models 110 a-n may also output one or more confidence values associated with the prediction. Accordingly, each model 110 a-n is configured to receive, as input, data output by another model 110 a-n, provide output as input data to another model 110 a-n, or both.
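- As a minimal sketch of how interdependent models can be chained so that one model's output becomes another's input, the following runs a set of toy models in dependency order. The function name `run_models`, the dictionary layout, the model names `m1` to `m4`, and the edges between them are all hypothetical and chosen only for illustration; the patent does not prescribe this interface.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_models(models, dependencies, external_inputs):
    """Run each model after the models it depends on.

    models: name -> callable taking a list of input values (hypothetical interface)
    dependencies: name -> list of upstream model names
    external_inputs: name -> list of values supplied by data sources
    """
    outputs = {}
    # static_order() yields every model after all of its upstream models.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = [outputs[dep] for dep in dependencies.get(name, [])]
        outputs[name] = models[name](upstream + external_inputs.get(name, []))
    return outputs

# Hypothetical four-model arrangement; the edges are illustrative only.
dependencies = {"m1": [], "m2": ["m1"], "m3": ["m1", "m2"], "m4": ["m3"]}
models = {name: sum for name in dependencies}  # each toy model just sums its inputs
external_inputs = {"m1": [1, 2], "m4": [10]}
prediction = run_models(models, dependencies, external_inputs)["m4"]
```

- In a distributed deployment each call would instead be dispatched to the node hosting that model; the ordering logic stays the same.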
- Consider the example graph representations of model dependencies shown in FIG. 2. FIG. 2 shows an exemplary arrangement of models and their respective dependencies. One skilled in the art will appreciate that other arrangements or configurations of model dependencies are possible, and that FIG. 2 merely serves as an illustrative example. As shown in FIG. 2, a model 204 a receives, as input data, input 202 a. Input 202 a may include stored data, data from a data stream, or data from another data source as can be appreciated. Model 204 a provides output to models Model 204 b receives input from models Model 204 d receives input from model 204 c and input 202 b. Model 204 d provides, as output data, output 206. In the example of FIG. 2, inputs 202 a, b are provided from one or more data sources to models 204 a and 204 d, respectively. Data processing is performed through the various model dependencies in order to ultimately generate output 206. The output 206 may include a prediction based on the inputs 202 a, b.
- Turning back to FIG. 1, the management node 102 executes a management module 104 for distributed model execution. The management module 104 identifies, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n. For example, assume that the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a-n to the model execution environment 106. The request may include the plurality of models 110 a-n. The request may also include identifiers, network addresses, or other data facilitating access to the models 110 a-n. For example, after the plurality of models 110 a-n has been uploaded to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a-n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a-n to which a model 110 a-n will be deployed for execution.
- The management module 104 identifies the nodes 108 a-n for each model 110 a-n based on one or more execution constraints. The one or more execution constraints for a given model 110 a-n are requirements to be satisfied by a given node 108 a-n in order to execute the given model 110 a-n. The one or more execution constraints may include required constraints, where a node 108 a-n must satisfy a particular constraint for a given model 110 a-n to be deployed there. The one or more execution constraints may also include preferential constraints, where a node 108 a-n is more preferentially selected for deployment of a given model 110 a-n if the constraint is satisfied.
- The one or more execution constraints may include one or more model dependencies. For example, turning back to the example of FIG. 2, the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input. Similarly, the model 204 c is dependent on models model 204 b must have a communications pathway to (or be the same node as) nodes 108 a-n to which the models
- The one or more execution constraints may also include one or more encryption constraints. The one or more encryption constraints may indicate that data input to or received from a given model 110 a-n must be encrypted if transferred over a network. The one or more encryption constraints may also indicate that data input to or received from a given model 110 a-n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a-n are executed in a same node 108 a-n, or executed within different virtual machine nodes 108 a-n implemented in a same hardware environment). The one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like). Accordingly, a node 108 a-n may be selected based on an encryption constraint by selecting a node 108 a-n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints. For example, a model 110 a-n whose output must be encrypted may be preferentially deployed to a node 108 a-n having greater hardware or processing resources, while a model 110 a-n that needs to neither encrypt output nor decrypt input may be preferentially deployed to a node 108 a-n having lesser hardware or processing resources. As a further example, a model 110 a-n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a-n having even greater hardware or processing resources.
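- The encryption constraints above distinguish data that must be encrypted only when it crosses a network from data that must always be encrypted. One possible way to encode that distinction, along with a rough measure of how much cryptographic work a model imposes on its node, is sketched below. The names `EncryptionPolicy`, `must_encrypt`, and `crypto_workload` are hypothetical and not taken from the patent.

```python
from enum import Enum

class EncryptionPolicy(Enum):
    """Hypothetical encoding of the encryption constraints described above."""
    NONE = "none"                  # no encryption required
    NETWORK_ONLY = "network_only"  # encrypt only when data crosses a network
    ALWAYS = "always"              # encrypt even between co-located models

def must_encrypt(policy, crosses_network):
    """Return True if a transfer under this policy has to be encrypted."""
    if policy is EncryptionPolicy.ALWAYS:
        return True
    if policy is EncryptionPolicy.NETWORK_ONLY:
        return crosses_network
    return False

def crypto_workload(decrypts_input, encrypts_output):
    """Count the cryptographic stages (0, 1, or 2) a node performs for a model.

    A higher count suggests preferring a node with more processing resources.
    """
    return int(decrypts_input) + int(encrypts_output)
```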
- The one or more execution constraints may also include one or more authorization constraints. An authorization constraint is a restriction on which entities have access to data input to a model 110 a-n, output by a model 110 a-n, generated by the model 110 a-n (e.g., intermediary data or calculations), and the like. For example, an authorization constraint may indicate that a model 110 a-n should be executed on a private node 108 a-n (e.g., a node 108 a-n not shared by or accessible to another tenant or client of the model execution environment 106). As a further example, an authorization constraint may define access privileges for those users or other entities that may access the node 108 a-n executing a given model 110 a-n. As another example, an authorization constraint may indicate that the input to or output from a given model 110 a-n should be transferred only over a private network. Accordingly, the node 108 a-n for the given model 110 a-n should be selected as having access to a private network connection to nodes 108 a-n executing its dependent models 110 a-n.
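- An authorization check of the kind described above might look like the following sketch. It assumes each node is described by a dictionary with a `shared_with_other_tenants` flag and a `private_links` set naming the nodes it can reach over a private network; both field names, and the function `satisfies_authorization`, are hypothetical.

```python
def satisfies_authorization(node, peer_nodes,
                            require_private_node=False,
                            require_private_network=False):
    """Check a candidate node against hypothetical authorization constraints.

    node: dict with "shared_with_other_tenants" (bool) and "private_links" (set of node names)
    peer_nodes: names of nodes executing the models this model exchanges data with
    """
    # A private-node requirement rules out any node visible to other tenants.
    if require_private_node and node["shared_with_other_tenants"]:
        return False
    # A private-network requirement needs a private link to every peer node.
    if require_private_network:
        return all(peer in node["private_links"] for peer in peer_nodes)
    return True
```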
- The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more node characteristics. Node characteristics for a given node 108 a-n may include hardware resources for the node 108 a-n. Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like. Node characteristics may also include software resources for the node 108 a-n, such as particular operating systems, software libraries, applications, and the like. For example, a model 110 a-n processing highly-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a-n having more RAM than another node 108 a-n. As another example, a model 110 a-n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a-n having the requisite libraries for performing the algorithm installed.
- The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more model characteristics. The model characteristics for a given model 110 a-n describe the data acted upon and the calculations performed by the model 110 a-n. For example, model characteristics may include a data type for data input to the model 110 a-n. A data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like). A data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, labeled or unlabeled data, time series data, and the like). Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like). For example, models 110 a-n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a-n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations. As another example, a neural network model may be deployed to different node(s) based on architectural parameters, such as whether the neural network is a feed-forward network or a recurrent network, whether the neural network exhibits “memory” (e.g., via long short-term memory (LSTM) architecture), etc.
- Identifying, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n may include calculating, for each model 110 a-n of the plurality of models 110 a-n, a plurality of fitness scores for each of the plurality of nodes 108 a-n. In other words, a given model 110 a-n has a fitness score calculated for each of the nodes 108 a-n indicating a fitness of that node 108 a-n for the given model 110 a-n. Each fitness score for a given model may be calculated based on a degree to which the node 108 a-n satisfies the execution constraints for the model 110 a-n.
- A node 108 a-n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a-n. For example, assume that a first model 110 a-n is dependent on a second model 110 a-n (e.g., for input by virtue of receiving input from the second model 110 a-n, or output by virtue of providing output to the second model 110 a-n), and that the first model 110 a-n is selected for deployment to a first node 108 a-n. Further assume that a second node 108 a-n and a third node 108 a-n are both communicatively coupled to the first node 108 a-n, with the second node 108 a-n having a lower latency connection to the first node 108 a-n compared to a connection from the third node 108 a-n to the first node 108 a-n. Accordingly, the second model 110 a-n would have a higher fitness score for the second node 108 a-n than the third node 108 a-n by virtue of the lower latency connection to the first node 108 a-n to which the dependent first model 110 a-n is to be deployed.
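- The latency example above can be sketched as a score that decreases as latency to the peer node grows. The linear mapping, the `worst_ms` cutoff, and the function names are assumptions made for illustration; the patent does not specify how the degree of satisfaction is quantified.

```python
def latency_score(latency_ms, worst_ms=100.0):
    """Map a connection latency to a score in [0, 1]; lower latency scores higher.

    worst_ms is an assumed cutoff at or beyond which the score is 0.
    A latency of None means there is no communication pathway at all.
    """
    if latency_ms is None:
        return 0.0
    return max(0.0, 1.0 - latency_ms / worst_ms)

def rank_candidates(latency_to_peer):
    """Order candidate node names from best to worst by latency to the peer node."""
    return sorted(latency_to_peer,
                  key=lambda name: latency_score(latency_to_peer[name]),
                  reverse=True)
```

- With these assumptions, a candidate with a 2 ms link to the first node ranks ahead of one with a 15 ms link, mirroring the second-node versus third-node comparison in the text.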
- A node 108 a-n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a-n must be executed on a private node 108 a-n. Any nodes 108 a-n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
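- One way to combine required and preferential constraints into a single fitness score is to let any failed required constraint force the score to zero and otherwise add up the preferential contributions. The baseline of 1.0 for eligible nodes and the additive combination are assumptions for this sketch, not details given in the patent.

```python
def fitness(required_checks, preferred_scores):
    """Compute a hypothetical fitness score for one (model, node) pair.

    required_checks: iterable of booleans, one per required constraint
    preferred_scores: iterable of non-negative numbers, one per preferential constraint
    """
    # Failing any required constraint disqualifies the node outright.
    if not all(required_checks):
        return 0.0
    # Eligible nodes start from an assumed baseline so they always outrank
    # disqualified nodes, then gain credit for each preference they satisfy.
    return 1.0 + sum(preferred_scores)
```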
- The fitness score may also be calculated based on the node characteristics of each node 108 a-n or the model characteristics of the model 110 a-n. For example, nodes 108 a-n with greater RAM may receive a higher fitness score for a model 110 a-n performing calculations on highly-dimensional data. As another example, nodes 108 a-n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a-n acting on low-dimensional data or performing simpler arithmetic calculations, as such hardware resources would not provide a meaningful benefit for those models when compared to other models 110 a-n.
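- The interaction between node and model characteristics can be sketched as a score in which a hardware resource only counts when the model can actually use it. The dictionary fields (`ram_gb`, `has_accelerator`, `high_dimensional`, `matrix_heavy`) and the weights are hypothetical.

```python
def characteristic_score(node, model):
    """Score how well a node's hardware matches what a model actually needs.

    node: dict with "ram_gb" (number) and "has_accelerator" (bool)
    model: dict with "high_dimensional" (bool) and "matrix_heavy" (bool)
    """
    score = 0.0
    # Extra RAM is only rewarded for models working on highly-dimensional data.
    if model["high_dimensional"]:
        score += node["ram_gb"] / 64.0  # assumed normalization constant
    # An accelerator is only rewarded for models doing heavy matrix work.
    if model["matrix_heavy"] and node["has_accelerator"]:
        score += 1.0
    return score
```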
- The management module 104 may then select, for each model 110 a-n, based on the plurality of fitness scores, the corresponding node 108 a-n (e.g., the node 108 a-n to which a given model 110 a-n will be deployed). In some embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes selecting, for each model 110 a-n, a highest scoring node 108 a-n. For example, a node 108 a-n may be selected for each model 110 a-n by traversing a listing or ordering of models 110 a-n and selecting a node 108 a-n for a currently selected model 110 a-n. In some embodiments, after a model 110 a-n is assigned to a given node 108 a-n, the fitness scores for the given node 108 a-n may be recalculated for each model 110 a-n not having an assigned node 108 a-n. Accordingly, in some embodiments, a node 108 a-n already having an assigned model 110 a-n may still be an optimal selection for deploying another model 110 a-n. In other embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes generating multiple combinations or permutations of assigning models 110 a-n for deployment to nodes 108 a-n and calculating a best fit assignment for all of the plurality of models 110 a-n (e.g., an assignment with a highest total fitness score across all models 110 a-n).
- The management module 104 then deploys each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n. Deploying each model 110 a-n may include sending one or more of the models 110 a-n to their respective assigned node 108 a-n. Deploying each model 110 a-n may also include causing a node 108 a-n to acquire or load its assigned model 110 a-n. For example, the management module 104 may issue a command for a given node 108 a-n to load its assigned model 110 a-n from a local or remote storage location.
- Deploying each model 110 a-n may also include configuring one or more models 110 a-n to receive input from one or more data sources (e.g., data sources other than another model 110 a-n). For example, the management module 104 may provide, to the node 108 a-n of a given model 110 a-n, network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a-n. For example, the management module 104 may provide a node 108 a-n a URL or IP address for a data stream of data to be provided as input to the given model 110 a-n. As another example, the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a-n. The management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
- Deploying each model 110 a-n may also include configuring one or more models 110 a-n to provide, as output, a prediction generated by the plurality of models 110 a-n. For example, the management module 104 may indicate a storage location or file path for output data. The management module 104 may further provide an indication of the storage location of the output data to the client device 112.
- In some embodiments, deploying each model 110 a-n includes configuring each node 108 a-n to communicate with at least one other node 108 a-n of the plurality of nodes 108 a-n. Thus, interdependent models 110 a-n may communicate with each other via the configured nodes 108 a-n. For example, the management module 104 may facilitate the exchange of encryption keys between nodes 108 a-n executing dependent models 110 a-n requiring encryption. As another example, the management module 104 may provide, to nodes 108 a-n executing models 110 a-n having dependent models 110 a-n, the URLs, IP addresses, or other identifiers of the nodes 108 a-n executing their respective dependent models 110 a-n. In some embodiments, the management module 104 may allocate or generate communications pathways between nodes 108 a-n via network communications fabrics of the model execution environment 106. In some embodiments, the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a-n.
- A prediction may then be generated by the deployed models 110 a-n. For example, input data may be provided to one or more of the models 110 a-n. A prediction may then be generated as an output of a model 110 a-n by virtue of the distributed and interdependent execution of the models 110 a-n in the nodes 108 a-n. Data indicating the prediction may then be provided or made accessible to the client device 112.
- By deploying the models 110 a-n to the nodes 108 a-n of the model execution environment 106 as described above, the management module 104 ensures a useable configuration of models 110 a-n as deployed to nodes 108 a-n. Moreover, the management module 104 ensures that the model 110 a-n deployment preserves the hierarchy of dependencies of models 110 a-n, as well as the encryption and authorization requirements of the models 110 a-n.
- For further explanation,
FIG. 3 sets forth a diagram of an execution environment 300 in accordance with some embodiments of the present disclosure. The execution environment 300 depicted in FIG. 3 may be embodied in a variety of different ways. The execution environment 300 may be provided, for example, by one or more cloud computing providers such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and others, including combinations thereof. Alternatively, the execution environment 300 may be embodied as a collection of devices (e.g., servers, storage devices, networking devices) and software resources that are included in a private data center. In fact, the execution environment 300 may be embodied as a combination of cloud resources and private resources that collectively form a hybrid cloud computing environment.
- The execution environment 300 depicted in FIG. 3 may include storage resources 302, which may be embodied in many forms. For example, the storage resources 302 may include flash memory, hard disk drives, nano-RAM, non-volatile memory (NVM), 3D crosspoint non-volatile memory, magnetic random access memory (MRAM), non-volatile phase-change memory (PCM), storage class memory (SCM), or many others, including combinations of the storage technologies described above. Readers will appreciate that other forms of computer memories and storage devices may be utilized as part of the execution environment 300, including DRAM, static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), universal memory, and many others. The storage resources 302 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud storage resources such as Amazon Elastic Block Store (EBS) block storage, Amazon S3 object storage, Amazon Elastic File System (EFS) file storage, Azure Blob Storage, and many others. The example execution environment 300 depicted in FIG. 3 may implement a variety of storage architectures, such as block storage where data is stored in blocks, and each block essentially acts as an individual hard drive, object storage where data is managed as objects, or file storage in which data is stored in a hierarchical structure. Such data may be saved in files and folders, and presented to both the system storing it and the system retrieving it in the same format.
- The execution environment 300 depicted in FIG. 3 also includes communications resources 304 that may be useful in facilitating data communications between components within the execution environment 300, as well as data communications between the execution environment 300 and computing devices that are outside of the execution environment 300. Such communications resources may be embodied, for example, as one or more routers, network switches, communications adapters, and many others, including combinations of such devices. The communications resources 304 may be configured to utilize a variety of different protocols and data communication fabrics to facilitate data communications. For example, the communications resources 304 may utilize Internet Protocol (IP) based technologies, fibre channel (FC) technologies, FC over ethernet (FCoE) technologies, InfiniBand (IB) technologies, NVM Express (NVMe) technologies and NVMe over fabrics (NVMeoF) technologies, and many others. The communications resources 304 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as networking tools and resources that enable secure connections to the cloud as well as tools and resources (e.g., network interfaces, routing tables, gateways) to configure networking resources in a virtual private cloud. Such communications resources may be useful in facilitating data communications between components within the execution environment 300, as well as data communications between the execution environment 300 and computing devices that are outside of the execution environment 300.
- The execution environment 300 depicted in FIG. 3 also includes processing resources 306 that may be useful in executing computer program instructions and performing other computational tasks within the execution environment 300. The processing resources 306 may include one or more application-specific integrated circuits (ASICs) that are customized for some particular purpose, one or more central processing units (CPUs), one or more digital signal processors (DSPs), one or more field-programmable gate arrays (FPGAs), one or more systems on a chip (SoCs), or other forms of processing resources 306. The processing resources 306 may also be embodied, in embodiments where the execution environment 300 includes resources offered by a cloud provider, as cloud computing resources such as one or more Amazon Elastic Compute Cloud (EC2) instances, event-driven compute resources such as AWS Lambdas, Azure Virtual Machines, or many others.
- The execution environment 300 depicted in FIG. 3 also includes software resources 308 that, when executed by processing resources 306 within the execution environment 300, may perform various tasks. The software resources 308 may include, for example, one or more modules of computer program instructions that when executed by processing resources 306 within the execution environment 300 are useful for distributed model execution. The software resources may include one or more models 310 (e.g., models 110 a-n as executed in nodes 108 a-n of FIG. 1). The software resources may also include a management module 312 (e.g., a management module 104 as described in FIG. 1). Accordingly, the execution environment 300 may include one or more of a management node 102 or a model execution environment 106 as described in FIG. 1.
- For further explanation,
FIG. 4 sets forth a flow chart illustrating an example method for distributed model execution that includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n. For example, assume that the management module 104 receives a request from the client device 112 to deploy a plurality of models 110 a-n to the model execution environment 106. The request may include the plurality of models 110 a-n. The request may also include identifiers, network addresses, or other data facilitating access to the models 110 a-n. For example, after the plurality of models 110 a-n has been uploaded to the model execution environment 106 or another storage location, the request may identify the plurality of models 110 a-n for deployment to the model execution environment 106 for execution. Accordingly, the management module 104 identifies each node 108 a-n to which a model 110 a-n will be deployed for execution.
- The nodes 108 a-n for each model 110 a-n are identified based on one or more execution constraints. The one or more execution constraints for a given model 110 a-n are requirements to be satisfied by a given node 108 a-n in order to execute the given model 110 a-n. The one or more execution constraints may include required constraints, where a node 108 a-n must satisfy a particular constraint for a given model 110 a-n to be deployed there. The one or more execution constraints may also include preferential constraints, where a node 108 a-n is more preferentially selected for deployment of a given model 110 a-n if the constraint is satisfied.
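- The identifying step can be illustrated with a short sketch that, for each model, discards nodes failing any required constraint and then prefers the node satisfying the most preferential constraints. Representing constraints as predicate functions over a node's attribute dictionary, and the name `identify_nodes`, are assumptions made for this example only.

```python
def identify_nodes(models, nodes):
    """Pick one node per model from required and preferential constraints.

    models: name -> {"required": [predicate, ...], "preferred": [predicate, ...]}
    nodes: name -> attribute dict passed to each predicate (hypothetical layout)
    """
    assignment = {}
    for model_name, spec in models.items():
        # Required constraints act as a filter on the candidate nodes.
        candidates = [name for name, attrs in nodes.items()
                      if all(check(attrs) for check in spec["required"])]
        if not candidates:
            raise ValueError(f"no node satisfies the required constraints of {model_name}")
        # Preferential constraints only rank the remaining candidates.
        assignment[model_name] = max(
            candidates,
            key=lambda name: sum(check(nodes[name]) for check in spec["preferred"]))
    return assignment
```

- This sketch treats each model independently; it does not account for a node's remaining capacity after earlier assignments.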
- The one or more execution constraints may include one or more model dependencies. For example, turning back to the example of FIG. 2, the model 204 b is dependent on the output of the model 204 a as the model 204 b accepts the output of the model 204 a as its input. Similarly, the model 204 c is dependent on models model 204 b must have a communications pathway to nodes 108 a-n to which the models
- The one or more execution constraints may also include one or more encryption constraints. The one or more encryption constraints may indicate that data input to or received from a given model 110 a-n must be encrypted if transferred over a network. The one or more encryption constraints may also indicate that data input to or received from a given model 110 a-n must be encrypted regardless of whether it is transferred over a network (e.g., if the source and destination models 110 a-n are executed in a same node 108 a-n, or executed within different virtual machine nodes 108 a-n implemented in a same hardware environment). The one or more encryption constraints may indicate a type of encryption to be used (e.g., symmetric vs. asymmetric, particular algorithms, and the like). Accordingly, a node 108 a-n may be selected based on an encryption constraint by selecting a node 108 a-n having hardware accelerators, processors, or other resources to facilitate satisfaction of the particular encryption constraints. For example, a model 110 a-n whose output must be encrypted may be preferentially deployed to a node 108 a-n having greater hardware or processing resources, while a model 110 a-n that needs to neither encrypt output nor decrypt input may be preferentially deployed to a node 108 a-n having lesser hardware or processing resources. As a further example, a model 110 a-n whose input must be decrypted and whose output must be encrypted may be preferentially deployed to a node 108 a-n having even greater hardware or processing resources.
- The one or more execution constraints may also include one or more authorization constraints. An authorization constraint is a restriction on which entities have access to data input to a model 110 a-n, output by a model 110 a-n, generated by the model 110 a-n (e.g., intermediary data or calculations), and the like. For example, an authorization constraint may indicate that a model 110 a-n should be executed on a private node 108 a-n (e.g., a node 108 a-n not shared by or accessible to another tenant or client of the model execution environment 106). As a further example, an authorization constraint may define access privileges for those users or other entities that may access the node 108 a-n executing a given model 110 a-n. As another example, an authorization constraint may indicate that the input to or output from a given model 110 a-n should be transferred only over a private network. Accordingly, the node 108 a-n for the given model 110 a-n should be selected as having access to a private network connection to nodes 108 a-n executing its dependent models 110 a-n.
- The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more node characteristics. Node characteristics for a given node 108 a-n may include hardware resources for the node 108 a-n. Such hardware resources may include storage devices, memory (e.g., random access memory (RAM)), processors, hardware accelerators, network interfaces, and the like. Node characteristics may also include software resources, such as particular operating systems, software libraries, applications, and the like. For example, a model 110 a-n processing high-dimensional data or large amounts of data at a time may be preferentially deployed to a node 108 a-n having more RAM than another node 108 a-n. As another example, a model 110 a-n that uses a particular encryption algorithm for encrypting output data or decrypting input data may be preferentially deployed to a node 108 a-n having the requisite libraries for performing the algorithm installed.
- The management module 104 may also identify the nodes 108 a-n for each model 110 a-n based on one or more model characteristics. The model characteristics for a given model 110 a-n describe the data acted upon and the calculations performed by the model 110 a-n. For example, model characteristics may include a data type for data input to the model 110 a-n. A data type for input data may describe a type of value included in the input data (e.g., integer, floating point, bytes, and the like). A data type for input data may also describe a data structure or class of the input data (e.g., single values, multidimensional data structures, and the like). Model characteristics may also include types of calculations or transformations performed by the model (e.g., arithmetic calculations, floating point operations, matrix operations, Boolean operations, and the like). For example, models 110 a-n performing complex matrix operations on multidimensional floating point data may be preferentially deployed to nodes 108 a-n with GPUs, FPGAs, or other hardware accelerators to facilitate execution of such operations.
- The method of FIG. 4 also includes deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n. Deploying 404 each model 110 a-n may include sending one or more of the models 110 a-n to their respective assigned node 108 a-n. Deploying 404 each model 110 a-n may also include causing a node 108 a-n to acquire or load its assigned model 110 a-n. For example, the management module 104 may issue a command for a given node 108 a-n to load its assigned model 110 a-n from a local or remote storage location.
- Deploying 404 each model 110 a-n may also include configuring one or more models 110 a-n to receive input from one or more data sources (e.g., data sources other than another model 110 a-n). For example, the management module 104 may provide a node 108 a-n of a given model 110 a-n network addresses (e.g., Uniform Resource Locators (URLs), Internet Protocol (IP) addresses) or other identifiers for data sources of input data to the given model 110 a-n. For example, the management module 104 may provide a node 108 a-n a URL or IP address for a data stream of data to be provided as input to the given model 110 a-n. As another example, the management module 104 may provide a node a URL, IP address, memory address, or file path to stored data to be provided as input to the given model 110 a-n. The management module 104 may also provide authentication credentials, login credentials, or other data facilitating access to stored data or data streams.
- Deploying 404 each model 110 a-n may also include configuring one or more models 110 a-n to provide, as output, a prediction generated by the plurality of models 110 a-n. For example, the management module 104 may indicate a storage location or file path for output data. The management module 104 may further provide an indication of the storage location of the output data to the client device 112.
- One skilled in the art will appreciate that the approaches set forth above with respect to FIG. 4 may be performed repeatedly such that models 110 a-n are redistributed or redeployed according to various circumstances. Such circumstances may include, for example, a user request, a predefined interval passing, an addition or removal of a node 108 a-n or model 110 a-n, a change in available computational resources in nodes 108 a-n, and the like.
- For further explanation, FIG. 5 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure. The method of FIG. 5 is similar to that of FIG. 4 in that the method of FIG. 5 also includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n; and deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n.
- The method of FIG. 5 differs from FIG. 4 in that identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n includes calculating 502, for each model 110 a-n of the plurality of models 110 a-n, a plurality of fitness scores for each of the plurality of nodes 108 a-n. In other words, a given model 110 a-n has a fitness score calculated for each of the nodes 108 a-n indicating a fitness of that node 108 a-n for the given model 110 a-n. Each fitness score for a given model may be calculated based on a degree to which the node 108 a-n satisfies the execution constraints for the model 110 a-n.
- A node 108 a-n may receive a higher fitness score for satisfying an execution constraint to a greater degree than another node 108 a-n. For example, assume that a first model 110 a-n is dependent on a second model 110 a-n (e.g., for input or output), and that the first model 110 a-n is selected for deployment to a first node 108 a-n. Further assume that a second node 108 a-n and a third node 108 a-n are both communicatively coupled to the first node 108 a-n, with the second node 108 a-n having a lower latency connection to the first node 108 a-n compared to a connection from the third node 108 a-n to the first node 108 a-n. Accordingly, the second model 110 a-n would have a higher fitness score for the second node 108 a-n than for the third node 108 a-n by virtue of the lower latency connection to the first node 108 a-n to which the dependent first model 110 a-n is to be deployed.
- A node 108 a-n may receive a null or zero fitness score for failing to satisfy a required execution constraint. For example, assume that a given model 110 a-n must be executed on a private node 108 a-n. Any nodes 108 a-n accessible to other tenants may receive a zero fitness score for failing to meet the privacy requirement.
- The fitness score may also be calculated based on the node characteristics of each node 108 a-n or the model characteristics of the model 110 a-n. For example, nodes 108 a-n with greater RAM may receive a higher fitness score for a model 110 a-n performing calculations on high-dimensional data. As another example, nodes 108 a-n having advanced processors or hardware accelerators may not receive higher fitness scores for models 110 a-n acting on low-dimensional data or performing simpler arithmetic calculations, as such hardware resources would not provide a meaningful benefit when compared to other models 110 a-n.
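- By way of illustration only, a fitness score of the kind described above might be computed as in the following sketch. The types `Node` and `ModelProfile`, the function `fitness_score`, the particular fields, and the weights are all assumptions introduced here; the disclosure does not prescribe a formula.

```python
from dataclasses import dataclass

RAM_CAP_GB = 64.0  # assumed cap so that RAM contributes at most 1.0 to the score


@dataclass(frozen=True)
class Node:
    name: str
    private: bool          # not shared with other tenants
    ram_gb: float
    has_accelerator: bool  # e.g. GPU or FPGA present
    latency_ms: float      # latency to the node hosting the dependent model


@dataclass(frozen=True)
class ModelProfile:
    name: str
    requires_private_node: bool  # a required authorization constraint
    high_dimensional: bool       # model characteristic: benefits from more RAM
    uses_matrix_ops: bool        # model characteristic: benefits from accelerators


def fitness_score(model: ModelProfile, node: Node) -> float:
    """Score how well `node` fits `model`; 0.0 means the node is ineligible."""
    # A required constraint that is not met yields a zero score.
    if model.requires_private_node and not node.private:
        return 0.0
    score = 1.0
    # A lower-latency connection to the dependent model's node scores higher.
    score += 1.0 / (1.0 + node.latency_ms)
    # RAM only raises the score for models acting on high-dimensional data.
    if model.high_dimensional:
        score += min(node.ram_gb, RAM_CAP_GB) / RAM_CAP_GB
    # Accelerators only raise the score for models that perform matrix operations.
    if model.uses_matrix_ops and node.has_accelerator:
        score += 1.0
    return score
```

Under these assumed weights, a node with a hardware accelerator scores no higher than an otherwise identical node for a model performing only simple arithmetic, which mirrors the second example in the paragraph above.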
- Identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n also includes selecting 504, for each model 110 a-n, based on the plurality of fitness scores, the corresponding node 108 a-n (e.g., the node 108 a-n to which a given model 110 a-n will be deployed). In some embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes selecting, for each model 110 a-n, a highest scoring node 108 a-n. For example, a node 108 a-n may be selected for each model 110 a-n by traversing a listing or ordering of models 110 a-n and selecting a node 108 a-n for a currently selected model 110 a-n. In some embodiments, after a model 110 a-n is assigned to a given node 108 a-n, the fitness scores for the given node 108 a-n may be recalculated for each model 110 a-n not having an assigned node 108 a-n. Accordingly, in some embodiments, a node 108 a-n already having an assigned model 110 a-n may still be an optimal selection for deploying another model 110 a-n. In other embodiments, selecting, based on the plurality of fitness scores, the corresponding node 108 a-n includes generating multiple combinations or permutations of assigning models 110 a-n for deployment to nodes 108 a-n and calculating a best fit assignment for all of the plurality of models 110 a-n (e.g., an assignment with a highest total fitness score across all models 110 a-n).
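- The two selection approaches described above can be contrasted in a short sketch. The functions `select_greedy` and `select_best_fit`, the nested-dictionary score layout, and the `capacity` parameter (a per-node limit on assigned models) are hypothetical and introduced only for illustration. The greedy version omits the optional recalculation of fitness scores after each assignment, and the exhaustive version enumerates every combination, so it is practical only for small numbers of models and nodes.

```python
from itertools import product


def select_greedy(scores):
    """For each model in listing order, pick its highest-scoring node."""
    assignment = {}
    for model, node_scores in scores.items():
        best = max(node_scores, key=node_scores.get)
        if node_scores[best] <= 0:
            raise ValueError(f"no eligible node for model {model!r}")
        assignment[model] = best
    return assignment


def select_best_fit(scores, capacity=1):
    """Enumerate all assignments and keep the one with the highest total score.

    `capacity` is an assumed limit on how many models one node may host.
    Returns None if no assignment satisfies every model.
    """
    models = list(scores)
    nodes = sorted({n for node_scores in scores.values() for n in node_scores})
    best_total, best = -1.0, None
    for combo in product(nodes, repeat=len(models)):
        # Skip assignments that overload any node.
        if any(combo.count(n) > capacity for n in nodes):
            continue
        per_model = [scores[m].get(n, 0.0) for m, n in zip(models, combo)]
        # A zero score marks a node that fails a required constraint.
        if any(s <= 0 for s in per_model):
            continue
        total = sum(per_model)
        if total > best_total:
            best_total, best = total, dict(zip(models, combo))
    return best


scores = {
    "a": {"n1": 3.0, "n2": 2.9},
    "b": {"n1": 3.0, "n2": 1.0},
}
print(select_greedy(scores))                # {'a': 'n1', 'b': 'n1'}
print(select_best_fit(scores, capacity=1))  # {'a': 'n2', 'b': 'n1'}
```

In this example the greedy pass places both models on the same node, while the exhaustive pass, limited to one model per node, moves model "a" to its second choice because doing so yields the higher total fitness score.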
- For further explanation, FIG. 6 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure. The method of FIG. 6 is similar to that of FIG. 4 in that the method of FIG. 6 also includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n; and deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n.
- The method of FIG. 6 differs from FIG. 4 in that deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n includes configuring 602 each node 108 a-n to communicate with at least one other node 108 a-n of the plurality of nodes 108 a-n. Thus, interdependent models 110 a-n may communicate with one another via the configured nodes 108 a-n. For example, the management module 104 may facilitate the exchange of encryption keys between nodes 108 a-n executing dependent models 110 a-n requiring encryption. As another example, the management module 104 may provide, to nodes 108 a-n executing models 110 a-n having dependent models 110 a-n, the URLs, IP addresses, or other identifiers of the nodes 108 a-n executing their respective dependent models 110 a-n. In some embodiments, the management module 104 may allocate or generate communications pathways between nodes 108 a-n via network communications fabrics of the model execution environment 106. In some embodiments, the management module 104 may configure Application Program Interface (API) calls or queries facilitating communications between any of the nodes 108 a-n.
- For further explanation, FIG. 7 sets forth a flow chart illustrating another example method for distributed model execution according to embodiments of the present disclosure. The method of FIG. 7 is similar to that of FIG. 4 in that the method of FIG. 7 also includes identifying 402, for each model 110 a-n of a plurality of models 110 a-n, based on one or more execution constraints for the plurality of models 110 a-n, a corresponding node 108 a-n of a plurality of nodes 108 a-n; and deploying 404 each model 110 a-n of the plurality of models 110 a-n to the identified corresponding node 108 a-n of the plurality of nodes 108 a-n.
- The method of FIG. 7 differs from FIG. 4 in that the method of FIG. 7 includes generating 702 a prediction based on a distributed execution of the plurality of models 110 a-n. The execution of the plurality of models 110 a-n is considered a distributed execution in that the models 110 a-n are executed across a plurality of distributed nodes 108 a-n. The plurality of models 110 a-n are executed interdependently in that each node 108 a-n provides output to or receives input from at least one other node 108 a-n. The prediction may be generated based on input data provided to one or more of the plurality of models 110 a-n. The prediction may be indicated in, encoded in, or embodied as output from one or more of the plurality of models 110 a-n.
- In view of the explanations set forth above, readers will recognize that the benefits of distributed model execution include:
- Improved performance of a computing system by identifying optimal or best fitting nodes for model deployment and execution.
- Improved performance of a computing system by allowing for remote, distributed execution of models, leveraging more advanced hardware and computational resources than found in client systems.
- Improved performance of a computing system by deploying models such that model dependencies, encryption relationships, and authorization requirements are preserved.
- Exemplary embodiments of the present disclosure are described largely in the context of a fully functional computer system for distributed model execution. Readers of skill in the art will recognize, however, that the present disclosure also can be embodied in a computer program product disposed upon computer readable storage media for use with any suitable data processing system. Such computer readable storage media can be any storage medium for machine-readable information, including magnetic media, optical media, or other suitable media. Examples of such media include magnetic disks in hard drives or diskettes, compact disks for optical drives, magnetic tape, and others as will occur to those of skill in the art. Persons skilled in the art will immediately recognize that any computer system having suitable programming means will be capable of executing the steps of the method of the disclosure as embodied in a computer program product. Persons skilled in the art will recognize also that, although some of the exemplary embodiments described in this specification are oriented to software installed and executing on computer hardware, nevertheless, alternative embodiments implemented as firmware or as hardware are well within the scope of the present disclosure.
- The present disclosure can be a system, a method, and/or a computer program product. The computer program product can include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
- The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or Flash memory, a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
- Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network can include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
- Computer readable program instructions for carrying out operations of the present disclosure can be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer can be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) can execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
- Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
- These computer readable program instructions can be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions can also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein includes an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
- The computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
- The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams can represent a module, segment, or portion of instructions, which includes one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block can occur out of the order noted in the figures. For example, two blocks shown in succession can, in fact, be executed substantially concurrently, or the blocks can sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
- It will be understood from the foregoing description that modifications and changes can be made in various embodiments of the present disclosure. The descriptions in this specification are for purposes of illustration only and are not to be construed in a limiting sense. The scope of the present disclosure is limited only by the language of the following claims.
Claims (22)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/176,906 US20210255886A1 (en) | 2020-02-14 | 2021-02-16 | Distributed model execution |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202062976965P | 2020-02-14 | 2020-02-14 | |
US17/176,906 US20210255886A1 (en) | 2020-02-14 | 2021-02-16 | Distributed model execution |
Publications (1)
Publication Number | Publication Date |
---|---|
US20210255886A1 true US20210255886A1 (en) | 2021-08-19 |
Family
ID=77272078
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/176,906 Pending US20210255886A1 (en) | 2020-02-14 | 2021-02-16 | Distributed model execution |
US17/176,889 Active US11675614B2 (en) | 2020-02-14 | 2021-02-16 | Standardized model packaging and deployment |
US17/176,898 Active 2041-06-23 US11947989B2 (en) | 2020-02-14 | 2021-02-16 | Process flow for model-based applications |
Country Status (1)
Country | Link |
---|---|
US (3) | US20210255886A1 (en) |
Also Published As
Publication number | Publication date |
---|---|
US20210255839A1 (en) | 2021-08-19 |
US11947989B2 (en) | 2024-04-02 |
US20210256000A1 (en) | 2021-08-19 |
US11675614B2 (en) | 2023-06-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210255886A1 (en) | | Distributed model execution |
US10516623B2 (en) | | Pluggable allocation in a cloud computing system |
US10700866B2 (en) | | Anonymous encrypted data |
US10116742B2 (en) | | Scalable approach to manage storage volumes across heterogenous cloud systems |
US11870650B2 (en) | | System, method and computer program product for network function optimization based on locality and function type |
US10567269B2 (en) | | Dynamically redirecting affiliated data to an edge computing device |
JP2023551527A (en) | | Secure computing resource placement using homomorphic encryption |
Makris et al. | | Towards a distributed storage framework for edge computing infrastructures |
US10554626B2 (en) | | Filtering of authenticated synthetic transactions |
EP2852893A1 (en) | | Pluggable allocation in a cloud computing system |
US11650954B2 (en) | | Replication continued enhancement method |
CN115150117A (en) | | Maintaining confidentiality in decentralized policies |
JP2024501168A (en) | | Secure memory sharing method |
US11349663B2 (en) | | Secure workload configuration |
US11102258B2 (en) | | Stream processing without central transportation planning |
Reali et al. | | Orchestration of cloud genomic services |
US11695552B2 (en) | | Quantum key distribution in a multi-cloud environment |
US11875202B2 (en) | | Visualizing API invocation flows in containerized environments |
US11709607B2 (en) | | Storage block address list entry transform architecture |
US20230403146A1 (en) | | Smart round robin delivery for hardware security module host requests |
WO2024032653A1 (en) | | Reducing network overhead |
US20230222228A1 (en) | | Database hierarchical encryption for hybrid-cloud environment |
Jiang et al. | | RADU: Bridging the divide between data and infrastructure management to support data-driven collaborations |
CN116615891A (en) | | Key rotation on a publish-subscribe system |
Suganya et al. | | DISTRIBUTED MINING ALGORITHM USING HADOOP ON LARGE DATA SET |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: SPARKCOGNITION, INC., TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VON NIEDERHAUSERN, EUGENE;GORTI, SREENIVASA;DIVINCENZO, KEVIN W.;AND OTHERS;SIGNING DATES FROM 20210317 TO 20210322;REEL/FRAME:056165/0443 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
| STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| AS | Assignment | Owner name: ORIX GROWTH CAPITAL, LLC, TEXAS Free format text: SECURITY INTEREST;ASSIGNOR:SPARKCOGNITION, INC.;REEL/FRAME:059760/0360 Effective date: 20220421 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |