EP4374256A1 - Techniques for implementing container-based software services - Google Patents

Techniques for implementing container-based software services

Info

Publication number
EP4374256A1
Authority
EP
European Patent Office
Prior art keywords
interface
service
request
shim
container
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP22754268.5A
Other languages
German (de)
English (en)
Inventor
Kevin Frederick DUNNELL
Tom Martin
Vishal Inder Sikka
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vianai Systems Inc
Original Assignee
Vianai Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/867,540 (US20230021412A1)
Application filed by Vianai Systems Inc
Publication of EP4374256A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/541Interprogram communication via adapters, e.g. between incompatible applications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/542Event management; Broadcasting; Multicasting; Notifications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment

Definitions

  • Embodiments of the present disclosure relate generally to computer science and software architecture and, more specifically, to techniques for providing container-based software services.
  • Each technology stack is composed of discrete services that provide different subsets of functionality associated with a corresponding software-based workflow.
  • a technology stack for a machine learning workflow could include services that are used to store machine learning models and/or features, select or engineer features, create or train machine learning models, deploy machine learning models in various environments, monitor the execution of the machine learning models in the various environments, and/or perform other tasks related to machine learning.
  • a given technology stack may change over time as services are added, upgraded, replaced, or deprecated within a corresponding software-based workflow to reflect changes to the underlying technology.
  • a first service within a technology stack could be replaced with a second service when the first service is no longer able to scale to meet the needs of the user of the technology stack, when the second service improves the functionality imparted by the first service, and/or when the first service is no longer supported.
  • One drawback to using conventional technology stack architectures is that the services within a given technology stack are typically “hardcoded” to interoperate with one another. Accordingly, when a service is added, modified, or replaced within a technology stack, additional components have to be added to the technology stack to adapt the other services within the technology stack to the interfaces and features implemented by the added or modified service.
  • adding a new service to a technology stack could require a separate “driver” to be created and installed within the technology stack, where the driver allows the other services in the technology stack to make calls to the interface implemented by the new service.
  • modifying the interface implemented by an existing service within a technology stack could require code-based adaptations that allow the other services within the technology stack to interact with the modified service.
  • One embodiment of the present invention sets forth a technique for processing requests associated with one or more services.
  • the technique includes deploying a first container within an environment, wherein the first container includes a first service that implements a first interface and a first shim that implements a second interface.
  • the technique also includes receiving, at the first shim, a first request associated with the second interface.
  • the technique further includes converting the first request into a second request associated with the first interface, and transmitting the second request over the first interface to the first service, wherein the second request is processed by the first service.
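  • As a concrete illustration of this flow, the following minimal Python sketch shows a shim that receives a request associated with the second interface, converts it into a request associated with the first interface, and transmits it to the service. All names and request formats here are illustrative assumptions, not the claimed implementation.

      class Service:
          # Stand-in for the first service; its process() method and the
          # request format of the first interface are assumptions.
          def process(self, first_interface_request):
              return {"handled": first_interface_request["op"]}

      class Shim:
          # Implements the second interface and forwards translated
          # requests over the first interface to the service.
          def __init__(self, service):
              self.service = service  # deployed in the same container

          def handle(self, second_interface_request):
              # Convert the first request (second interface) into a
              # second request (first interface).
              first_interface_request = {
                  "op": second_interface_request["action"],
                  "args": second_interface_request.get("params", {}),
              }
              # Transmit the converted request to the service for processing.
              return self.service.process(first_interface_request)

      shim = Shim(Service())
      print(shim.handle({"action": "train", "params": {"epochs": 5}}))
      # {'handled': 'train'}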
  • One technical advantage of the disclosed techniques relative to the prior art is that, with the disclosed techniques, the same interface can be used to access multiple services that provide similar functionality within a given technology stack. Accordingly, the disclosed techniques enable a service to be added, updated, upgraded, or replaced within the technology stack without having to create and install a custom “driver” to accommodate the modified or added service or change the interfaces of the other services implemented in the technology stack, as is normally required with prior art approaches.
  • Another technical advantage of the disclosed techniques is that, because a service and a corresponding shim are packaged together within the same container, the service and shim are isolated from other components executing within the same environment. This isolation allows the service and shim to be deployed, moved, and removed as a single self-contained unit, which is more efficient relative to prior art approaches where services and components are packaged, deployed, and managed separately.
  • Figure 1 illustrates a system configured to implement one or more aspects of the various embodiments.
  • Figure 2 is a more detailed illustration of the AI design application of Figure 1, according to various embodiments.
  • Figure 3 is a more detailed illustration of the network generator of Figure 2, according to various embodiments.
  • Figure 4 is a more detailed illustration of the compiler engine and the synthesis engine of Figure 3, according to various embodiments.
  • Figure 5 illustrates an implementation of the system of Figure 1 that includes container-based abstractions of services, according to various embodiments.
  • Figure 6 sets forth a flow diagram of method steps for implementing one or more services in a technology stack, according to various embodiments.
  • Figure 7 sets forth a flow diagram of method steps for processing a request associated with a service implemented in a technology stack, according to various embodiments.
  • FIG. 1 illustrates a system 100 configured to implement one or more aspects of the various embodiments.
  • system 100 includes a client 110 and a server 130 coupled together via network 150.
  • Client 110 or server 130 may be any technically feasible type of computer system, including a desktop computer, a laptop computer, a mobile device, a virtualized instance of a computing device, a distributed and/or cloud-based computer system, and so forth.
  • Network 150 may be any technically feasible set of interconnected communication links, including a local area network (LAN), wide area network (WAN), the World Wide Web, or the Internet, among others.
  • Client 110 and server 130 are configured to communicate via network 150.
  • client 110 includes processor 112, input/output (I/O) devices 114, and memory 116, coupled together.
  • processor 112 includes any technically feasible set of hardware units configured to process data and execute software applications.
  • processor 112 could include one or more central processing units (CPUs), one or more graphics processing units (GPUs), and/or one or more parallel processing units (PPUs).
  • I/O devices 114 include any technically feasible set of devices configured to perform input and/or output operations, including, for example, a display device, a keyboard, and a touchscreen, among others.
  • Memory 116 includes any technically feasible storage media configured to store data and software applications, such as, for example, a hard disk, a random-access memory (RAM) module, and a read-only memory (ROM).
  • Memory 116 includes a database 118(0), an artificial intelligence (AI) design application 120(0), a machine learning model 122(0), and a graphical user interface (GUI) 124(0).
  • Database 118(0) is a file system and/or data storage application that stores various types of data.
  • AI design application 120(0) is a software application that, when executed by processor 112, interoperates with a corresponding software application executing on server 130 to generate, analyze, evaluate, and describe one or more machine learning models.
  • Machine learning model 122(0) includes one or more artificial neural networks, decision trees, random forests, gradient boosted trees, regression models, support vector machines, Bayesian networks, hierarchical models, ensemble models, and/or other types of machine learning models configured to perform general-purpose or specialized artificial intelligence-oriented operations.
  • GUI 124(0) allows a user to interface with AI design application 120(0).
  • Server 130 includes processor 132, I/O devices 134, and memory 136, coupled together.
  • Processor 132 includes any technically feasible set of hardware units configured to process data and execute software applications, such as one or more CPUs, one or more GPUs, and/or one or more PPUs.
  • I/O devices 134 include any technically feasible set of devices configured to perform input and/or output operations, such as a display device, a keyboard, or a touchscreen, among others.
  • Memory 136 includes any technically feasible storage media configured to store data and software applications, such as, for example, a hard disk, a RAM module, and a ROM.
  • Memory 136 includes database 118(1), AI design application 120(1), machine learning model 122(1), and GUI 124(1).
  • Database 118(1) is a file system and/or data storage application that stores various types of data, similar to database 118(0).
  • databases 118(0)-(1) could include (but are not limited to) feature repositories that store features used in machine learning, model repositories that store machine learning models 122(0)-(1), and/or data stores for other types of data related to machine learning.
  • AI design application 120(1) is a software application that, when executed by processor 132, interoperates with AI design application 120(0) to generate, analyze, evaluate, and describe one or more machine learning models.
  • Machine learning model 122(1) includes one or more artificial neural networks, decision trees, random forests, gradient boosted trees, regression models, support vector machines, Bayesian networks, hierarchical models, ensemble models, and/or other types of machine learning models configured to perform general-purpose or specialized artificial intelligence-oriented operations.
  • GUI 124(1) allows a user to interface with AI design application 120(1).
  • databases 118(0) and 118(1) represent separate portions of a distributed storage entity.
  • databases 118(0) and 118(1) are collectively referred to herein as database 118.
  • AI design applications 120(0) and 120(1) represent separate portions of a distributed software entity that is configured to perform any and all of the inventive operations described herein. As such, AI design applications 120(0) and 120(1) are collectively referred to hereinafter as AI design application 120.
  • Machine learning models 122(0) and 122(1) likewise represent a distributed machine learning model that includes one or more artificial neural networks, decision trees, random forests, gradient boosted trees, regression models, support vector machines, Bayesian networks, hierarchical models, ensemble models, and/or other types of machine learning models. Accordingly, machine learning models 122(0) and 122(1) are collectively referred to herein as machine learning model 122.
  • GUIs 124(0) and 124(1) similarly represent distributed portions of one or more GUIs. GUIs 124(0) and 124(1) are collectively referred to herein as GUI 124.
  • AI design application 120 generates machine learning model 122 based on user input that is received via GUI 124.
  • GUI 124 exposes design and analysis tools that allow the user to create and edit machine learning model 122, explore the functionality of machine learning model 122, evaluate machine learning model 122 relative to training data, and generate various data describing and/or constraining the performance and/or operation of machine learning model 122, among other operations.
  • Various modules within AI design application 120 that perform the above operations are described in greater detail below in conjunction with Figure 2.
  • FIG. 2 is a more detailed illustration of AI design application 120 of Figure 1, according to various embodiments.
  • AI design application 120 includes network generator 200, network analyzer 210, network evaluator 220, and network descriptor 230.
  • machine learning model 122 includes one or more agents 240, and GUI 124 includes overview GUI 206, feature engineering GUI 204, network generation GUI 202, network analysis GUI 212, network evaluation GUI 222, and network description GUI 232.
  • network generator 200 renders network generation GUI 202 to provide the user with tools for designing and connecting agents 240 within machine learning model 122.
  • a given agent 240 may include a neural network 242 that performs various AI-oriented tasks.
  • a given agent 240 may also include other types of functional elements that perform generic tasks.
  • Network generator 200 trains neural networks 242 included in specific agents 240 based on training data 250.
  • Training data 250 can include any technically feasible type of data for training neural networks.
  • training data 250 could include the Modified National Institute of Standards and Technology (MNIST) digits training set.
  • When training is complete, network analyzer 210 renders network analysis GUI 212 to provide the user with tools for analyzing and understanding how a neural network within a given agent 240 operates.
  • network analyzer 210 causes network analysis GUI 212 to display various connections and weights within a given neural network 242 and to simulate the response of the given neural network 242 to various inputs, among other operations.
  • network evaluator 220 renders network evaluation GUI 222 to provide the user with tools for evaluating a given neural network 242 relative to training data 250. More specifically, network evaluator 220 receives user input via network evaluation GUI 222 indicating a particular portion of training data 250. Network evaluator 220 then simulates how the given neural network 242 responds to that portion of training data 250. Network evaluator 220 can also cause network evaluation GUI 222 to filter specific portions of training data 250 that cause the given neural network 242 to generate certain types of outputs.
  • network descriptor 230 analyzes a given neural network 242 associated with an agent 240 and generates a natural language expression that describes the performance of the neural network 242 to the user.
  • Network descriptor 230 can also provide various “common sense” facts to the user related to how the neural network 242 interprets training data 250.
  • Network descriptor 230 outputs this data to the user via network description GUI 232.
  • network descriptor 230 can obtain rule-based expressions from the user via network description GUI 232 and then constrain network behavior based on these expressions. Further, network descriptor 230 can generate metrics that quantify various aspects of network performance and then display these metrics to the user via network description GUI 232.
  • GUI 124 additionally includes overview GUI 206 and feature engineering GUI 204, which may be rendered by AI design application 120 and/or another component of the system.
  • Overview GUI 206 includes one or more user-interface elements for viewing, setting, and/or otherwise managing objectives associated with projects or experiments involving neural network 242 and/or other machine learning models 122.
  • Feature engineering GUI 204 includes one or more user-interface elements for viewing, organizing, creating, and/or otherwise managing features inputted into neural network 242 and/or other machine learning models 122.
  • AI design application 120 advantageously provides the user with various tools for generating, analyzing, evaluating, and describing neural network behavior.
  • the disclosed techniques differ from conventional approaches to generating neural networks, which generally obfuscate network training and subsequent operation from the user.
  • FIG. 3 is a more detailed illustration of the network generator of Figure 2, according to various embodiments.
  • network generator 200 includes compiler engine 300, synthesis engine 310, training engine 320, and visualization engine 330.
  • visualization engine 330 generates network generation GUI 202 and obtains agent definitions 340 from the user via network generation GUI 202.
  • Compiler engine 300 compiles program code included in a given agent definition 340 to generate compiled code 302.
  • Compiler engine 300 is configured to parse, compile, and/or interpret any technically feasible programming language, including C, C++, Python and associated frameworks, JavaScript and associated frameworks, and so forth.
  • Synthesis engine 310 generates initial network 312 based on compiled code 302 and one or more parameters that influence how that code executes. Initial network 312 is untrained and may not perform one or more intended operations with a high degree of accuracy.
  • Training engine 320 trains initial network 312 based on training data 250 to generate trained network 322.
  • Trained network 322 may perform the one or more intended operations with a higher degree of accuracy than initial network 312.
  • Training engine 320 may perform any technically feasible type of training operation, including backpropagation, gradient descent, and so forth.
  • Visualization engine 330 updates network generation GUI 202 in conjunction with the above operations to graphically depict the network architecture defined via agent definition 340 as well as to illustrate various performance attributes of trained network 322.
  • neural networks can be created only by a small set of developers who have expertise in the various tools and libraries. Further, because the underlying details of a network architecture are nested deep within the frameworks of the tools and libraries, a developer may not understand how the architecture functions or how to change or improve upon the architecture. To address these and other deficiencies in the neural network definition paradigm, a mathematics-based programming and execution framework for defining neural network architectures is discussed below.
  • the source code for a neural network agent definition in a mathematics-based programming language is a pipeline of linked mathematical expressions.
  • the source code is compiled into machine code without needing any intermediary libraries, where the machine code is representative of a trainable and executable neural network.
  • the mathematics-based programming language exposes several building blocks.
  • Each layer of a neural network is defined in the mathematics-based programming language as one or more mathematical expressions using the building blocks discussed above.
  • a convolution layer may be defined using the following source code that includes a set of mathematical expressions: [listing garbled in this copy; per the description below, it defines a CONVOLUTION operation with input X and output Y]
  • the first line of the source code indicates that the subsequent lines of the source code are related to a CONVOLUTION operation that has an input X and an output Y.
  • the subsequent lines of the source code include a sequence of mathematical expressions that define the mathematical operations performed on the input X to generate the output Y.
  • Each mathematical expression includes a right-hand side portion and a left-hand side portion. The right-hand side portion specifies a value that is determined when a mathematics operation specified by the left-hand side portion is evaluated.
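  • The listing above is illegible in this copy of the document. As a hedged illustration only (this is standard convolution notation, not the patent's actual syntax), a convolution layer written as a pipeline of linked mathematical expressions might read:

      $\mathrm{CONVOLUTION}(X \rightarrow Y):$
      $\quad S_{i,j} = \sum_{m=0}^{k-1} \sum_{n=0}^{k-1} W_{m,n}\, X_{i+m,\, j+n}$
      $\quad Y_{i,j} = S_{i,j} + b$

  • In this sketch, the kernel weights $W_{m,n}$ and the bias $b$ are never assigned a value or a source, which, as described below, makes them candidates to be learned during training.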
  • the values of variables included in the source code of a neural network agent are either assigned when the neural network is instantiated or are learned during training of the neural network.
  • a developer of a neural network agent defined using the mathematics- based programming language has control over which variables are to be learned during training (referred to herein as “learned variables”).
  • the variables that are to be learned during training can remain uninitialized (i.e. without being assigned a value or a source of a value) even when the neural network is instantiated.
  • the techniques for handling these learned variables during the compilation and training of a neural network are discussed below in detail in conjunction with Figures 4-6.
  • FIG 4 is a more detailed illustration of compiler engine 300 and synthesis engine 310 of Figure 3, according to various embodiments.
  • compiler engine 300 includes syntax tree generator 406, instantiator 408, and compiled code 302.
  • Synthesis engine 310 includes network builder 412 and initial network 312, which includes learned variables 410.
  • each layer specification includes one or more mathematical expressions 404 (individually referred to as mathematical expression 404) defined using the mathematics-based programming language.
  • each mathematical expression 404 includes a right-hand side portion that specifies a value that is determined when a mathematics operation specified by the left-hand side portion is evaluated.
  • Mathematical expressions 404 may be grouped, such that each group corresponds to a different layer of a neural network architecture.
  • the source code of agent definition 402 specifies the links between different groups of mathematical expressions 404.
  • Compiler engine 300 compiles the source code of agent definition 402 into compiled code 302.
  • the compiler engine 300 includes syntax tree generator 406 and instantiator 408.
  • Syntax tree generator 406 parses the source code of the agent definition 402 and generates an abstract syntax tree (AST) representation of the source code.
  • the AST representation includes a tree structure of nodes, where constants and variables are child nodes to parent nodes including operators or statements.
  • the AST encapsulates the syntactical structure of the source code, i.e., the statements, the mathematical expressions, the variables, and the relationships between those contained within the source code.
  • Instantiator 408 processes the AST to generate compiled code 302.
  • instantiator 408 performs semantic analysis on the AST, generates intermediate representations of the code, performs optimizations, and generates machine code that comprises compiled code 302.
  • instantiator 408 checks the source code for semantic correctness. In various embodiments, a semantic check determines whether variables and types included in the AST are properly declared and that the types of operators and objects match.
  • instantiator 408 instantiates all of the instances of a given object or function type that are included in the source code. Further, instantiator 408 generates a symbol table representing all the named objects (classes, variables, and functions), which is used to perform the semantic check on the source code.
  • Instantiator 408 performs a mapping operation for each variable in the symbol table to determine whether the value of the variable is assigned to a source identified in the source code.
  • Instantiator 408 flags the variables that do not have an assigned source as potential learned variables, i.e., the variables that are to be learned during the training process. In various embodiments, these variables do not have a special type indicating that the variables are learned variables. Further, the source code does not expressly indicate that the variables are learned variables.
  • Instantiator 408 automatically identifies those variables as potential variables that are to be learned by virtue of those variables not being assigned to a source. Thus, instantiator 408 operates differently from traditional compilers and interpreters, which would not allow for a variable to be unassigned, undeclared, or otherwise undefined and raise an error during the compilation process.
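  • The patent's mathematics-based language is not reproduced here, but the detection step can be illustrated with Python's own ast module. In this hedged sketch (the function name and the use of Python syntax are assumptions), names that are read but never assigned are flagged, analogous to how instantiator 408 identifies potential learned variables:

      import ast

      def flag_potential_learned_variables(source):
          # Parse the source code into an abstract syntax tree.
          tree = ast.parse(source)
          assigned, loaded = set(), set()
          for node in ast.walk(tree):
              if isinstance(node, ast.Name):
                  if isinstance(node.ctx, ast.Store):
                      assigned.add(node.id)   # variable given a value
                  elif isinstance(node.ctx, ast.Load):
                      loaded.add(node.id)     # variable read somewhere
          # Variables that are used but never assigned a source are
          # flagged as potential learned variables.
          return loaded - assigned

      # Flags W, X, and b; X would later be resolved as a layer input
      # rather than a learned variable, as described below.
      print(flag_potential_learned_variables("S = W * X + b"))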
  • Instantiator 408 transmits compiled code 302 and a list of potential learned variables to synthesis engine 310.
  • synthesis engine 310 generates initial network 312 based on compiled code 302 and one or more parameters that influence how that compiled code 302 executes.
  • network builder 412 analyzes the structure of the compiled code 302 to determine the different layers of the neural network architecture and how the outputs of a given layer are linked into inputs of one or more subsequent layers.
  • network builder 412 also receives, via user input for example, values for certain variables included in the compiled code.
  • Learned variable identifier 414 included in network builder 412 identifies learned variables 410 within initial network 312.
  • learned variable identifier 414 analyzes the list of potential learned variables received from instantiator 408 in view of the structure of the layers of the neural network architecture determined by network builder 412 and any values for variables received by network builder 412. For each of the potential learned variables, learned variable identifier 414 determines whether the source of the potential learned variable in a given layer of the neural network architecture is an output from a prior layer of the neural network architecture. If such a source exists, then the potential learned variable is not a variable that is to be learned during training of the neural network.
  • learned variable identifier 414 determines whether a value for a potential learned variable has been expressly provided to network builder 412. If such a value has been provided, then the potential learned variable is not a variable that is to be learned during training of the neural network. In such a manner, learned variable identifier 414 processes each of the potential learned variables to determine whether the potential learned variable is truly a variable that is to be learned during training. Once all of the potential learned variables have been processed, learned variable identifier 414 identifies any of the potential learned variables for which a source was not determined. These variables make up learned variables 410 of initial network 312.
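  • A hedged sketch of that filtering step, continuing the example above (the helper name and data shapes are assumptions):

      def resolve_learned_variables(candidates, layer_inputs, provided_values):
          # Keep only candidates with no source: not the output of a
          # prior layer and not expressly provided to network builder 412.
          learned = set()
          for name in candidates:
              if name in layer_inputs:       # fed by a prior layer's output
                  continue
              if name in provided_values:    # value supplied via user input
                  continue
              learned.add(name)              # truly learned during training
          return learned

      # X is the output of a prior layer, so only W and b remain as
      # learned variables 410.
      print(resolve_learned_variables({"W", "X", "b"},
                                      layer_inputs={"X": "prev_layer.Y"},
                                      provided_values={}))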
  • learned variable identifier 414 causes the network generation GUI 202 to display learned variables 410 identified by learned variable identifier 414. Learned variables 410 can then be confirmed by or otherwise modified by a user of the GUI 202, such as the developer of the neural network architecture.
  • training engine 320 trains initial network 312 based on training data 250 to generate trained network 322.
  • Trained network 322 includes values for the learned variables 410 that are learned during the training process.
  • Trained network 322 may perform the one or more intended operations with a higher degree of accuracy than initial network 312.
  • Training engine 320 may perform any technically feasible type of training operation, including backpropagation, gradient descent, and so forth.
  • the above techniques provide the user with a convenient mechanism for creating and updating neural networks that are integrated into potentially complex machine learning models 122 that include numerous agents 240. Further, these techniques allow the user to modify program code that defines a given agent 240 via straightforward interactions with a graphical depiction of the corresponding network architecture.
  • Network generator 200 performs the various operations described above based on user interactions conducted via network generation GUI 202.
  • the disclosed techniques provide the user with convenient tools for designing and interacting with neural networks that expose network information to the user rather than allowing that information to remain hidden, as generally found with prior art techniques.
  • the techniques described above for generating and modifying neural networks allow users to design and modify neural networks much faster than conventional approaches permit.
  • network generator 200 provides simple and intuitive tools for performing complex tasks associated with network generation. Additionally, network generator 200 conveniently allows modifications made to a network architecture to be seamlessly propagated back to a corresponding agent definition. Once the network is trained in the manner described, network analyzer 210 performs various techniques for analyzing network functionality.
  • AI design application 120, database 118, GUI 124, network generator 200, network analyzer 210, network evaluator 220, network descriptor 230, compiler engine 300, synthesis engine 310, training engine 320, visualization engine 330, and/or other components of system 100 of Figure 1, AI design application 120 of Figure 2, and/or network generator 200 of Figure 3 are implemented as services within a machine learning workflow.
  • each of these components could be deployed within a cloud computing environment, a local environment that is geographically in proximity to an entity using the components (e.g., on computers that are “on premises” with respect to a person or organization using the components), and/or another type of environment or platform.
  • Each component could provide a different subset of functionality associated with the machine learning workflow.
  • the disclosed techniques package each service with a corresponding shim into an executable image.
  • the image is used to deploy a container that includes the service and shim within an environment.
  • the shim implements a standardized interface (e.g., an application programming interface (API)) for accessing the functionality of the service.
  • the shim also converts between the standardized interface and an implementation-specific interface provided by the service. Consequently, the image, container, and shim provide an abstraction of the functionality provided by the service and allow the service to be updated or replaced without requiring other components to be adapted to the implementation-specific interface provided by the service, as described in further detail below.
  • FIG. 5 illustrates an implementation of system 100 of Figure 1 that includes container-based abstractions of services, according to various embodiments.
  • system 100 includes a number of containers 508(1)-(Z) (each of which is referred to individually as container 508), a load balancer 504, and a messaging system 530.
  • Containers 508, load balancer 504, and messaging system 530 are stored or loaded in memory 136 on one or more instances of server 130.
  • Containers 508, load balancer 504, and messaging system 530 can also be read from memory 136 and executed by one or more processors 132 within these instance(s) of server 130. Each of these components is described in further detail below.
  • Containers 508(1)-(Z) are used to deploy and execute services 512(1)-(Z) (each of which is referred to individually as service 512) within system 100.
  • Each container 508 corresponds to an autonomous, isolated runtime environment for components residing within that container 508.
  • each container 508 could be deployed in a separate physical or virtualized server 130 within a remote cloud computing environment, an on-premises environment, and/or another type of environment. Once a given container 508 is deployed, a separate instance of service 512 could be executed within that container.
  • the network, storage, or other resources used by each container 508 could additionally be isolated from other containers and/or the computer system on which that container 508 runs. Further, containers 508 could be independently created, executed, stopped, moved, copied, snapshotted, and/or deleted.
  • service 512 can include one or more components of a machine learning workflow.
  • Service 512 can also, or instead, include one or more components of another type of software workflow and/or technology stack.
  • service 512 could include (but is not limited to) a messaging service, email service, database, data warehouse, document management system, graphics editor, graphics renderer, enterprise application, mobile application, analytics service, web server, content management system, customer relationship management system, and/or identity management system.
  • Services 512(1)-(Z) implement interfaces 522(1)-(Z) (each of which is referred to individually as interface 522).
  • Interfaces 522(1)-(Z) expose functions 524(1)-(Z) (referred to individually as functions 524) and objects 526(1)-(Z) (referred to individually as objects 526) implemented by services 512(1)-(Z) to other components or services.
  • interface 522 could include an application programming interface (API) for a model training service 512 in a machine learning workflow.
  • the API could be called by other components or services to access functions 524 that are used to assign CPU, GPU, or other resources to a training task; select a training dataset or model architecture for a machine learning model to be trained using the training task; specify hyperparameters associated with the machine learning model; execute the training task; and/or export or save the trained machine learning model at the end of the training task.
  • the API could also, or instead, be called by the other components or services to create or access objects 526 representing compute resources, training datasets, hyperparameters, machine learning models, and/or other entities used in the training task.
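  • As an illustration of the kinds of functions such an interface might expose, the following Python sketch lists the categories of operations described above; every name and signature here is an assumption, not the API of any particular service:

      from typing import Any, Protocol

      class ModelTrainingAPI(Protocol):
          # Hypothetical surface of interface 522 for a model training
          # service 512; names and signatures are illustrative only.
          def assign_resources(self, cpus: int, gpus: int) -> None: ...
          def select_dataset(self, dataset_id: str) -> None: ...
          def select_architecture(self, architecture: str) -> None: ...
          def set_hyperparameters(self, params: dict[str, Any]) -> None: ...
          def execute_training(self) -> str: ...          # returns a job id
          def export_model(self, job_id: str, path: str) -> None: ...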
  • Containers 508(1)-(Z) are also used to deploy and execute shims 510(1)-(Z) (each of which is referred to individually as shim 510) associated with service 512.
  • Each shim 510 includes one or more software components that provide a standardized representation of the functionality provided by service 512.
  • shims 510(1)-(Z) implement interfaces 516(1)-(Z) (each of which is referred to individually as interface 516) that correspond to abstractions of interfaces 522(1)-(Z) implemented by services 512(1)-(Z).
  • interfaces 516(1)-(Z) include functions 518(1)-(Z) (referred to individually as functions 518) and objects 520(1)-(Z) (referred to individually as objects 520) that are service-agnostic versions of functions 524 and objects 526, respectively, in interfaces 522(1)-(Z) implemented by services 512(1)-(Z).
  • interface 516 could include a representational state transfer (REST) API that can be called by other components or services executing on one or more clients 110, one or more servers 130, and/or other types of computing devices.
  • the REST API could include “generic” versions of functions 524 and objects 526 used to perform a training task in a machine learning workflow. These generic functions 524 and objects 526 could be used by the other components or services to access the functionality provided by the model training service 512 in lieu of the service-specific functions 524 and objects 526 in interface 522 implemented by service 512.
  • Shim 510 additionally converts between requests and responses associated with interface 522 and requests and responses associated with interface 516.
  • shim 510 When shim 510 receives a request over interface 516, shim 510 “translates” the request into one or more requests to interface 522 (e.g., by converting the parameters of the request into parameters of the request(s) to interface 522). Shim 510 also transmits the translated request(s) to interface 522 to cause service 512 to process the translated request(s). When service 512 generates one or more responses to the translated request(s), shim 510 receives the response(s) over interface 52 and “translates” the response(s) into one or more corresponding responses that adhere to interface 516. Shim 510 then transmits the translated response(s) over interface 516 to the service or component from which the original request was received.
  • interface 516 includes a function named “get_paginated_featuresets.”
  • the “get_paginated_featuresets” function can be invoked using a number of parameters (e.g., “request,” “setname,” “page,” “pageSize,” “search,” “orderBy,” “orderDirection,” etc.).
  • the “get_paginated_featuresets” function uses some of the parameters to generate a call to a “find_featureset_pages” function that is included in interface 522 implemented by the corresponding service 512 (e.g., a feature repository service), thereby translating the invocation of the “get_paginated_featuresets” function by another component into an invocation of the “find_featureset_pages” function provided by service 512.
  • the “get_paginated_featuresets” function additionally converts the response returned by the “find_featureset_pages” function into a corresponding “JSONResponse” that is transmitted to the caller of the “get_paginated_featuresets” function.
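  • A minimal sketch of such a shim endpoint, assuming a FastAPI-style framework and a hypothetical feature-repository client module; the parameter mapping and the response shape are assumptions based on the description above:

      from fastapi import Request
      from fastapi.responses import JSONResponse

      # Hypothetical client for the underlying feature repository service;
      # find_featureset_pages is the service-specific function named above.
      from feature_repo_client import find_featureset_pages  # assumed module

      async def get_paginated_featuresets(request: Request, setname: str,
                                          page: int = 0, pageSize: int = 20,
                                          search: str = "", orderBy: str = "name",
                                          orderDirection: str = "asc"):
          # Translate the generic parameters of interface 516 into a call
          # on interface 522 (this mapping is an assumption).
          result = find_featureset_pages(
              set_name=setname,
              offset=page * pageSize,
              limit=pageSize,
              filter_text=search,
              sort=(orderBy, orderDirection),
          )
          # Convert the service-specific response into the standardized
          # JSONResponse returned over interface 516.
          return JSONResponse(content={"featuresets": result.items,
                                       "total": result.total})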
  • shim 510 exposes the functionality of service 512 to other components or services without requiring the other components or services to be hardcoded or customized to use interface 522 provided by service 512. Instead, shim 510 provides another interface 516 that abstracts away the implementation details of service 512 or interface 522.
  • the other service can also be deployed with a corresponding shim that implements interface 516 and “translates” between requests and responses associated with interface 516 and requests and responses associated with an implementation-specific interface implemented by the other service. Consequently, other components or services that use the functionality provided by service 512 and/or the other service do not need to be modified to accommodate the replacement of service 512 with the other service.
  • shim 510 and service 512 are packaged together into an image that is deployed and executed within container 508.
  • the image could be built as a series of layers, where each layer applies a different set of changes to the image. This series of layers could be used to add service 512 and shim 510 to the image.
  • a writable container layer could be added to allow modifications to the running image within container 508.
  • container 508 would isolate the running image from the underlying environment and/or other services, shims, or containers running within the same environment.
  • This packaging, deployment, and execution of shim 510 and service 512 within the same container 508 allows different services that provide similar functionality and corresponding shims to be added to or removed from the environment in a seamless, self-contained manner.
  • a first service and a first shim executing in a first container could be replaced with a second service and a second shim executing in a second container (e.g., when the second service constitutes an improvement, update, or upgrade over the first service).
  • Other services that accessed the functionality provided by the first service via the interface implemented by the first shim would be able to use the same interface, as implemented by the second shim, to access the functionality provided by the second service.
  • An example script for building an image that includes shim 510 and service 512 includes the following representation:
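  • The original listing is not reproduced in this copy. The following Dockerfile-style sketch is a hypothetical reconstruction consistent with the description in the next paragraph; the image name, paths, and commands are all assumptions:

      # Hypothetical reconstruction; names and paths are assumptions.
      ARG REPO_PREFIX
      # The first lines create the base image represented by ${REPO_PREFIX}_python.
      FROM ${REPO_PREFIX}_python AS base
      # Subsequent instructions create new layers applied to the image.
      # Add service 512 to the image.
      COPY service/ /opt/service/
      # Add shim 510 to the image.
      COPY shim/ /opt/shim/
      RUN pip install --no-cache-dir -r /opt/shim/requirements.txt
      # The shim fronts the service when container 508 runs the image.
      CMD ["python", "/opt/shim/main.py"]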
  • the first two lines of the script are used to create a base image represented by “${REPO_PREFIX}_python.” Subsequent lines of the script are used to create new layers that are applied to the image. These layers are used to add shim 510, service 512, and/or other components to the image.
  • Load balancer 504 receives requests 502(1)-(X) (each of which is referred to individually as request 502) to interface 516 from other components or services. Load balancer 504 routes these requests 502 to different containers 508 on which shim 510 and service 512 execute. For example, load balancer 504 could receive requests 502 to a REST API corresponding to interface 516. Load balancer 504 could also use a load-balancing technique (e.g., round robin, weighted round robin, least loaded, sticky sessions, etc.) to route requests 502 to different containers 508 for subsequent processing of requests 502 by shim 510 and service 512 executing within those containers 508.
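  • A minimal Python sketch of round-robin routing, one of the techniques named above (container addresses and the request format are assumptions):

      import itertools

      class RoundRobinBalancer:
          # Simplified stand-in for load balancer 504.
          def __init__(self, container_urls):
              self._cycle = itertools.cycle(container_urls)

          def route(self, request):
              # Forward each incoming request 502 to the next container 508
              # in the rotation, where shim 510 and service 512 process it.
              return next(self._cycle), request

      balancer = RoundRobinBalancer(["http://container-1:8080",
                                     "http://container-2:8080",
                                     "http://container-3:8080"])
      target, _ = balancer.route({"path": "/featuresets", "setname": "demo"})
      print(target)  # http://container-1:8080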
  • instances of shim 510 communicate with one another using messaging system 530.
  • messaging system 530 includes a publish-subscribe messaging system that stores messages in various topics 506(1)-(Y) (each of which is referred to individually as topic 506), which can also be referred to herein as queues.
  • Topic 506(1) is associated with a set of messages 532(1)-(M), and topic 506(Y) is associated with a different set of messages 532(M+1)-(N).
  • Each of messages 532(1)-(M) and messages 532(M+1)-(N) is referred to individually as message 532.
  • Shims 510(1)-(Z) include messaging modules 514(1)-(Z) (each of which is referred to individually as messaging module 514) that subscribe to and read messages from certain topics 506 within messaging system 530. Messaging modules 514 can also be used to write messages to the same topics 506 or different topics 506 in messaging system 530.
  • messaging system 530 allows multiple containers 508 that are deployed within the environment to horizontally scale the functionality provided by service 512 and to coordinate with one another during processing of requests 502.
  • When a given shim 510 receives a certain request 502 to interface 516 (e.g., from load balancer 504), that shim 510 attempts to process the request using the corresponding service 512 within the same container 508. If that shim 510 determines that the corresponding service 512 cannot process the request (e.g., if the corresponding service 512 lacks data and/or objects 526 required to process the request), that shim 510 uses messaging module 514 to publish a message to one or more topics 506 within messaging system 530.
  • the message can include the parameters of the request, an indication that the request cannot be processed by that shim 510 and/or corresponding service 512, and/or one or more reasons for the inability to process the request by that shim 510 and/or corresponding service 512.
  • Messaging modules 514 in other shims 510 can be configured to subscribe to these topics 506, read the message from these topics 506, and attempt to process the request within the message.
  • One or more of these other shims 510 can additionally determine that the corresponding services 512 include the data and/or objects required to process the request (e.g., by converting the request into one or more corresponding requests to interface 522 and transmitting the corresponding request(s) over interface 522 to the corresponding services 512).
  • shims 510 can also use the corresponding services 512 to generate a response to the request and transmit the response over interface 516 to the component from which the request originated. Consequently, shims 510 can use messaging modules 514 and messaging system 530 to “fan out” requests 502 to one another, thereby providing asynchronous communication, alerting, and reporting across containers 508 even when the underlying services 512 are not implemented to support horizontal scaling or asynchronous interactions.
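  • A hedged sketch of this fan-out behavior, using an in-memory queue as a stand-in for one topic 506 (a real deployment would use a publish-subscribe broker; all names and request shapes are assumptions):

      import queue

      # Stand-in for a topic 506 in messaging system 530.
      topic = queue.Queue()

      class StubService:
          # Hypothetical wrapper around a service 512; can_process() and
          # process() stand in for calls made over interface 522.
          def __init__(self, known_jobs):
              self.known_jobs = set(known_jobs)
          def can_process(self, request):
              return request["job_id"] in self.known_jobs
          def process(self, request):
              return {"job_id": request["job_id"], "status": "done"}

      def handle_request(shim_name, service, request):
          # Shim-side logic: process locally if possible; otherwise publish
          # the request to the topic so other shims can attempt it.
          if service.can_process(request):
              return service.process(request)
          topic.put({"request": request,
                     "unprocessed_by": shim_name,
                     "reason": "required data/objects not present"})
          return None

      def consume_and_retry(shim_name, service):
          # Another shim's messaging module 514 reads the topic and retries.
          message = topic.get()
          return handle_request(shim_name, service, message["request"])

      # Shim A cannot process job 42, so it fans the request out; shim B can.
      service_a, service_b = StubService([]), StubService([42])
      handle_request("shim-a", service_a, {"job_id": 42})
      print(consume_and_retry("shim-b", service_b))  # {'job_id': 42, 'status': 'done'}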
  • Only fragments of the “update_status” code listing described below survive in this copy; the recoverable lines are:

      "job_id": job_id,
      "timestamp": datetime.now().timestamp() * 1000,
  • the “update_status” function generates a JavaScript Object Notation (JSON) object that stores fields related to the status of a job (e.g., “job_id,” “status,” “finished,” “statusdetails,” “result,” “exception,” etc.) that is processed using shim 510 and/or the corresponding service 512.
  • the “update_status” function also calls a “send_event_bus_status_sync” function to publish the JSON object as a message to one or more topics 506 within messaging system 530.
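  • A hedged sketch that completes the surviving fragments above into a full function; the field names come from the description above, while the overall signature and the “send_event_bus_status_sync” signature are assumptions:

      import json
      from datetime import datetime

      def send_event_bus_status_sync(message):
          # Stand-in publisher; a real messaging module 514 would write the
          # message to one or more topics 506 in messaging system 530.
          print("publish:", message)

      def update_status(job_id, status, finished=False,
                        statusdetails=None, result=None, exception=None):
          # Build the JSON status object described above.
          payload = {
              "job_id": job_id,
              "timestamp": datetime.now().timestamp() * 1000,  # epoch millis
              "status": status,
              "finished": finished,
              "statusdetails": statusdetails,
              "result": result,
              "exception": exception,
          }
          # Publish the status message so other shims can react to it.
          send_event_bus_status_sync(json.dumps(payload))

      update_status("job-123", status="running")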
  • Other shims 510 that subscribe to these topics 506 can retrieve the message from these topics 506 and determine, based on the status fields in the message, whether or not the job was completed successfully on shim 510. If the other shims 510 determine that the job was not completed successfully on shim 510, the other shims 510 can attempt to perform the job using other status fields in the message.
  • system 100 could include multiple model training services 512 and corresponding shims 510 that are deployed within multiple sets of containers 508.
  • Each model training service 512 could include features or performance characteristics that are optimized for certain types of machine learning models (e.g., neural networks, regression models, tree-based models, support vector machines, etc.), model sizes, training datasets (e.g., unstructured data, structured data, text-based data, images, video, etc.), hyperparameters, training techniques, and/or other factors related to training machine learning models.
  • Load balancer 504 could be configured to route messages that include requests 502 to train machine learning models to the corresponding containers 508 and/or shims 510 based on fields in requests 502 associated with these factors, so that a training task represented by each request is executed by a model training service 512 that is most suited for that training task.
  • Load balancer 504 could also, or instead, be configured to route these types of messages to certain containers 508 and/or shims 510 based on fields in requests 502 that include identifiers for containers 508, shims 510, and/or the underlying services 512 that are selected by users that generated these requests 502.
  • Figure 6 sets forth a flow diagram of method steps for implementing one or more services in a technology stack, according to various embodiments. Although the method steps are described in conjunction with the systems of Figures 1-5, persons skilled in the art will understand that any system configured to perform the method steps in any order falls within the scope of the present disclosure.
  • system 100 builds 602 an image that includes a service that implements a first interface and a shim that implements a second interface.
  • system 100 could start with a base image and use a series of layers to apply a series of changes to the image.
  • One or more layers could be used to add one or more components of the service to the image, and one or more other layers could be used to add one or more components of the shim to the image.
  • system 100 deploys 604 one or more containers that include the image within an environment.
  • system 100 could create each container as an isolated environment within a larger cloud-based or on-premises environment. After a given container is created, system 100 could run the image within the container.
  • System 100 also routes 606 requests associated with the second interface to the container(s) based on a load-balancing technique.
  • system 100 could include a load balancer that receives the requests over a REST API.
  • the load balancer could also use a round robin, weighted round robin, sticky sessions, and/or another type of load balancing technique to distribute the requests across the container(s).
  • the requests could then be processed by instances of the shim and service in the corresponding containers, as described in further detail below with respect to Figure 7.
  • While the load balancer is used to route requests to the container(s), system 100 and/or another entity determine 608 whether or not the service is to be replaced.
  • an administrator could determine that the service is to be replaced when a newer version of the service is available, an upgrade to the service is available, a configuration associated with system 100 is updated, a different service that provides the same functionality is available, and/or another condition is met.
  • the administrator could also transmit one or more commands to system 100, update a configuration associated with system 100, and/or otherwise indicate to system 100 that the service is to be replaced.
  • system 100 builds 610 an additional image that includes another service and another shim that implements the second interface.
  • system 100 could package the other service and the other shim into the additional image.
  • the other service could provide functionality that is similar to the service that is currently used to process requests, and the other shim could translate between requests and responses associated with the interface implemented by the other service and the second interface.
  • System 100 additionally deploys 604 one or more containers that include the newly built image within the environment.
  • System 100 further repeats operations 606-612 to route requests to the other shim and service executing within the corresponding container(s) and/or replace the other service.
  • system 100 determines 612 whether or not functionality associated with the service (or similar services) should continue to be provided. While functionality associated with the service(s) continues to be provided, system 100 uses the load balancer to route 606 requests associated with the second interface to the currently deployed container(s) and/or performs operations 608-610 and 604 to replace the service as the need arises. Once system 100 determines that functionality associated with the service(s) is no longer to be provided, system 100 can stop the container(s) in which the service(s) are deployed and/or discontinue routing requests associated with the second interface to the container(s).
  • Figure 7 sets forth a flow diagram of method steps for processing a request associated with a service implemented in a technology stack, according to various embodiments. Although the method steps are described in conjunction with the systems of Figures 1-5, persons skilled in the art will understand that any system configured to perform the method steps in any order falls within the scope of the present disclosure.
  • shim 510 receives 702 a first request associated with an interface implemented by the shim.
  • shim 510 could receive the first request from a load balancer and/or over a REST API corresponding to the interface.
  • shim 510 converts 704 the first request into a second request associated with another interface implemented by a service. For example, shim 510 could obtain a set of parameters from a call to a function within the first request. Shim 510 could use the parameters to generate a call to a different function within the other interface implemented by the service. Shim 510 also transmits 706 the second request over the other interface to the service.
  • Shim 510 subsequently receives 708 a first response to the second request over the other interface.
  • shim 510 could receive the first response after the service has processed the second request.
  • Shim 510 determines 710 whether the first response indicates successful processing of the second request. For example, shim 510 could determine that the second request was processed successfully when the first response includes a status field that indicates completion of a job associated with processing the second request. On the other hand, shim 510 could determine that the second request was not processed successfully when the first response includes one or more errors and/or other indicators of a lack of completion of the job.
  • shim 510 publishes 712 a message to one or more topics in a messaging system indicating that the second request was unsuccessfully processed. For example, shim 510 could generate a message that includes the first request, the second request, and/or status fields associated with the first response. Shim 510 could also write the message to the topic(s) within a publish-subscribe messaging system. Other instances of shim 510 executing within other containers could read the message from the topic(s) and attempt to process one or both requests.
  • This fan-out of the request(s) to the other instances of shim 510 allows an instance of shim 510 that is coupled to an instance of the service that includes data and/or objects needed to process the request(s) to successfully process the request(s) and transmit a response to the component from which the first request was received.
  • shim 510 converts 714 the first response into a second response associated with the interface implemented by the shim.
  • shim 510 could convert the objects and/or format associated with the first response into objects and/or format associated with the second response.
  • Shim 510 then transmits 716 the second response over the second interface.
  • the disclosed techniques provide container-based service abstractions within a cloud-based environment, an on-premises environment, and/or another type of environment that hosts running services.
  • Each service is packaged with a corresponding shim into an executable image, and a container that runs the image is deployed within the environment.
  • the shim implements a standardized interface for accessing the functionality of the service and converts between the standardized interface and an interface that is specific to the service.
  • another container that includes the other service and a different shim that converts between the standardized interface and another interface that is specific to the other service is deployed within the environment. Requests to the standardized interface are then routed to the other container.
  • One technical advantage of the disclosed techniques relative to the prior art is that, with the disclosed techniques, the same interface can be used to access multiple services that provide similar functionality within a given technology stack. Accordingly, the disclosed techniques enable a service to be added, updated, upgraded, or replaced within the technology stack without having to create and install a custom “driver” to accommodate the modified or added service or change the interfaces of the other services implemented in the technology stack, as is normally required with prior art approaches.
  • Another technical advantage of the disclosed techniques is that, because a service and a corresponding shim are packaged together within the same container, the service and shim are isolated from other components executing within the same environment. This isolation allows the service and shim to be deployed, moved, and removed as a single self-contained unit, which is more efficient relative to prior art approaches where services and components are packaged, deployed, and managed separately.
  • A computer-implemented method for processing requests associated with one or more services comprises deploying a first container within an environment, wherein the first container includes a first service that implements a first interface and a first shim that implements a second interface; receiving, at the first shim, a first request associated with the second interface; converting the first request into a second request associated with the first interface; and transmitting the second request over the first interface to the first service, wherein the second request is processed by the first service.
  • One or more non-transitory computer-readable media store instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of deploying a first container within an environment, wherein the first container includes a first service that implements a first interface and a first shim that implements a second interface; receiving, at the first shim, a first request associated with the second interface; converting the first request into a second request associated with the first interface; and transmitting the second request over the first interface to the first service, wherein the second request is processed by the first service.
  • A system comprises one or more memories that store instructions, and one or more processors that are coupled to the one or more memories and, when executing the instructions, are configured to perform the steps of deploying a first container within an environment, wherein the first container includes a first service that implements a first interface and a first shim that implements a second interface; receiving, at the first shim, a first request associated with the second interface; converting the first request into a second request associated with the first interface; and transmitting the second request over the first interface to the first service, wherein the second request is processed by the first service.
  • Aspects of the present embodiments may be embodied as a system, method or computer program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “module,” a “system,” or a “computer.” In addition, any hardware and/or software technique, process, function, component, engine, module, or system described in the present disclosure may be implemented as a circuit or set of circuits. Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
  • The computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • A computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
  • The functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Stored Programmes (AREA)

Abstract

One embodiment of the present invention sets forth a technique for processing requests associated with one or more services. The technique includes deploying a first container within an environment, where the first container includes a first service that implements a first interface and a first shim that implements a second interface. The technique also includes receiving, at the first shim, a first request associated with the second interface. The technique further includes converting the first request into a second request associated with the first interface, and transmitting the second request over the first interface to the first service, where the second request is processed by the first service.
EP22754268.5A 2021-07-19 2022-07-19 Techniques for implementing container-based software services Pending EP4374256A1 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202163223412P 2021-07-19 2021-07-19
US17/867,540 US20230021412A1 (en) 2021-07-19 2022-07-18 Techniques for implementing container-based software services
PCT/US2022/073886 WO2023004310A1 (fr) Techniques for implementing container-based software services

Publications (1)

Publication Number Publication Date
EP4374256A1 true EP4374256A1 (fr) 2024-05-29

Family

ID=82851670

Family Applications (1)

Application Number Title Priority Date Filing Date
EP22754268.5A Pending EP4374256A1 (fr) 2021-07-19 2022-07-19 Techniques d'implémentation des services logiciels basés sur des conteneurs

Country Status (2)

Country Link
EP (1) EP4374256A1 (fr)
WO (1) WO2023004310A1 (fr)

Also Published As

Publication number Publication date
WO2023004310A1 (fr) 2023-01-26

Similar Documents

Publication Publication Date Title
US11593084B2 (en) Code development for deployment on a cloud platform
US10185558B2 (en) Language-independent program composition using containers
WO2016048732A1 (fr) Cloud-based parallel computing using actor modules
US20150347101A1 (en) R-language integration with a declarative machine learning language
US11061739B2 (en) Dynamic infrastructure management and processing
US20210125082A1 (en) Operative enterprise application recommendation generated by cognitive services from unstructured requirements
US11893367B2 (en) Source code conversion from application program interface to policy document
US11221846B2 (en) Automated transformation of applications to a target computing environment
US20200026798A1 (en) Identification and curation of application programming interface data from different sources
US11610134B2 (en) Techniques for defining and executing program code specifying neural network architectures
EP4124946A1 (fr) Optimized software delivery to air-gapped robotic process automation (RPA) hosts
CN113448678A (zh) Application information generation method, deployment method and apparatus, system, and storage medium
Dagkakis et al. ManPy: an open‐source software tool for building discrete event simulation models of manufacturing systems
US20220147831A1 (en) Automatic and unsupervised detached subgraph detection in deep learning programs
Mahapatra et al. Graphical flow-based spark programming
US20230028635A1 (en) Techniques for managing container-based software services
EP4374256A1 (fr) Techniques for implementing container-based software services
US20220067502A1 (en) Creating deep learning models from kubernetes api objects
US11775655B2 (en) Risk assessment of a container build
Ahlbrecht et al. Scalable multi-agent simulation based on MapReduce
Chang et al. Support NNEF execution model for NNAPI
Kimura et al. A javascript transpiler for escaping from complicated usage of cloud services and apis
Madushan Cloud Native Applications with Ballerina: A guide for programmers interested in developing cloud native applications using Ballerina Swan Lake
de Oliveira et al. Clouds and Reproducibility: A Way to Go to Scientific Experiments?
US11775293B1 (en) Deploying a static code analyzer based on program synthesis from input-output examples

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20240131

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR