US11977934B2 - Automation solutions for event logging and debugging on KUBERNETES
- Publication number: US11977934B2 (application US 17/525,749)
- Authority: United States (US)
- Prior art keywords: bpf, computing environment, user, data, probe
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F9/542—Event management; Broadcasting; Multicasting; Notifications
- G06F9/545—Interprogram communication where tasks reside in different layers, e.g. user- and kernel-space
- G06F8/60—Software deployment
- G06F11/0709—Error or fault processing not based on redundancy, in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
- G06F11/3006—Monitoring arrangements where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
- G06F11/3089—Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
- G06F11/3093—Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
- G06F11/323—Visualisation of programs or trace data
- G06F11/3476—Data logging
- G06F11/3495—Performance evaluation by tracing or monitoring for systems
- G06F11/362—Software debugging
- G06F11/3636—Software debugging by tracing the execution of the program
- G06F16/2455—Query execution
- G06F16/26—Visual data mining; Browsing structured data
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
- G06N20/00—Machine learning
- G06N5/02—Knowledge representation; Symbolic representation
- H04L63/02—Network security for separating internal from external traffic, e.g. firewalls
- H04L63/0428—Network security for providing a confidential data exchange wherein the data content is protected, e.g. by encrypting or encapsulating the payload
- G06F2201/86—Event-based monitoring
Definitions
- The present disclosure relates to the field of event logging and debugging on container orchestration platforms, and more specifically to automated event logging and debugging on KUBERNETES.
- Distributed software architectures often have complex designs where one software application is implemented as numerous containerized microservices. Multiple instances of the containers may be hosted by many different computing nodes in a cluster of computing nodes. The number of instances of the containers deployed within the cluster may vary on a per-container basis responsive to throughput of (e.g., demand for) the one or more microservices within the container, and can vary over time.
- Container orchestration systems automate deployment, scaling, and management of the software application upon the cluster. Real-time or near real-time visualization of the cluster and its often-changing components, particularly to monitor performance, can be difficult.
- A system for providing no-instrumentation telemetry for a distributed application cluster includes at least one processor for executing computer-executable instructions stored in a memory.
- The instructions, when executed, instruct the at least one processor to provide an edge module configured to deploy a Berkeley Packet Filter (BPF) probe and a corresponding BPF program in the computing environment.
- The BPF probe is triggered based on an event associated with a distributed application running in a user space of the computing environment.
- Data associated with the event is captured in a kernel space of the computing environment via the BPF program.
- The captured data is transferred from the kernel space of the computing environment to the user space of the computing environment.
- At least one aspect of the present disclosure is directed to a system for providing no-instrumentation telemetry for a distributed application cluster.
- The system includes at least one memory storing computer-executable instructions, and at least one processor for executing the computer-executable instructions stored in the memory.
- The instructions, when executed, instruct the at least one processor to: provide an edge module configured to deploy a Berkeley Packet Filter (BPF) probe and a corresponding BPF program in the computing environment, trigger the BPF probe based on an event associated with a distributed application running in a user space of the computing environment, capture data associated with the event in a kernel space of the computing environment via the BPF program, and transfer the captured data from the kernel space of the computing environment to the user space of the computing environment.
- BPF Berkeley Packet Filter
- Transferring the captured data from the kernel space to the user space includes transferring the captured data from the BPF program to the edge module.
- The instructions, when executed, instruct the at least one processor to: analyze, via the BPF program, the captured data to infer a protocol associated with the captured data, determine whether the inferred protocol is a protocol of interest, and transfer, in response to a determination that the inferred protocol is a protocol of interest, the captured data from the BPF program to the edge module.
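The "infer protocol, then transfer only if of interest" flow can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the BPF program exposes the first bytes of each captured payload, and it classifies payloads by well-known wire prefixes (the `infer_protocol` and `maybe_forward` names and the `PROTOCOLS_OF_INTEREST` set are hypothetical).

```python
# Hypothetical sketch of protocol inference on captured payload bytes.
# The byte signatures are standard wire prefixes (HTTP methods, TLS
# handshake record byte); everything else is illustrative.

HTTP_METHODS = (b"GET ", b"POST", b"PUT ", b"HEAD", b"DELE", b"PATC", b"OPTI")

def infer_protocol(payload: bytes) -> str:
    """Guess the application protocol from the first bytes of a capture."""
    if payload.startswith(HTTP_METHODS) or payload.startswith(b"HTTP/"):
        return "http"
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"  # TLS record: handshake type (0x16), version 3.x
    return "unknown"

PROTOCOLS_OF_INTEREST = {"http"}

def maybe_forward(payload: bytes, edge_module_queue: list) -> bool:
    """Transfer the capture to the edge module only if its protocol matters."""
    if infer_protocol(payload) in PROTOCOLS_OF_INTEREST:
        edge_module_queue.append(payload)
        return True
    return False
```

Filtering in (or near) the kernel in this way keeps uninteresting traffic from ever crossing into user space, which is the main cost saving the claim describes.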
- The edge module is configured to run on the computing environment with the distributed application.
- The event that triggers the BPF probe corresponds to a configuration of the edge module.
- Deploying the BPF probe in the computing environment includes deploying at least one kernel BPF probe.
- Triggering the BPF probe based on the event includes triggering the at least one kernel BPF probe based on the occurrence of at least one kernel function.
- Deploying the BPF probe in the computing environment includes deploying at least one user BPF probe.
- Triggering the BPF probe based on the event includes triggering the at least one user BPF probe based on the occurrence of at least one function in the distributed application.
- The at least one user BPF probe is deployed upstream from an encryption library associated with the distributed application.
- Another aspect of the present disclosure is directed to a method for providing no-instrumentation telemetry for a distributed application cluster.
- The method includes providing an edge module configured to deploy a Berkeley Packet Filter (BPF) probe and a corresponding BPF program in the computing environment, triggering the BPF probe based on an event associated with a distributed application running in a user space of the computing environment, capturing data associated with the event in a kernel space of the computing environment via the BPF program, and transferring the captured data from the kernel space of the computing environment to the user space of the computing environment.
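The four method steps (deploy, trigger, capture, transfer) can be modeled as a toy in-process simulation. All class and attribute names below are illustrative assumptions, not the patent's implementation; the "kernel buffer" merely stands in for a BPF map or perf buffer.

```python
# Toy end-to-end simulation of the claimed method: an "edge module" deploys a
# probe/program pair, an application event triggers a capture in simulated
# kernel space, and the capture is transferred back to user space.

class SimulatedBpfProgram:
    def __init__(self):
        self.kernel_buffer = []  # stands in for a BPF map / perf buffer

    def capture(self, event_data: bytes) -> None:
        """Record event data on the kernel side."""
        self.kernel_buffer.append(event_data)

class SimulatedEdgeModule:
    def __init__(self):
        self.user_space_data = []
        self.program = None

    def deploy(self) -> SimulatedBpfProgram:
        """Deploy the probe and its corresponding program."""
        self.program = SimulatedBpfProgram()
        return self.program

    def transfer(self) -> None:
        """Move captured data from 'kernel space' to 'user space'."""
        self.user_space_data.extend(self.program.kernel_buffer)
        self.program.kernel_buffer.clear()

def on_application_event(program: SimulatedBpfProgram, data: bytes) -> None:
    """The probe fires on an application event; the program captures data."""
    program.capture(data)
```

The point of the shape is that the application itself is never modified: the probe and program observe it from outside, which is what "no-instrumentation telemetry" refers to.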
- Transferring the captured data from the kernel space to the user space includes transferring the captured data from the BPF program to the edge module.
- The method includes analyzing the captured data via the BPF program to infer a protocol associated with the captured data, determining whether the inferred protocol is a protocol of interest, and transferring, in response to a determination that the inferred protocol is a protocol of interest, the captured data from the BPF program to the edge module.
- The edge module is configured to run on the computing environment with the distributed application.
- The event that triggers the BPF probe corresponds to a configuration of the edge module.
- Deploying the BPF probe in the computing environment includes deploying at least one kernel BPF probe.
- Triggering the BPF probe based on the event includes triggering the at least one kernel BPF probe based on the occurrence of at least one kernel function.
- Deploying the BPF probe in the computing environment includes deploying at least one user BPF probe.
- Triggering the BPF probe based on the event includes triggering the at least one user BPF probe based on the occurrence of at least one function in the distributed application.
- The at least one user BPF probe is deployed upstream from an encryption library associated with the distributed application.
- FIG. 1 is a block diagram of a system for providing and monitoring a distributed application cluster.
- FIG. 2 is a functional block diagram of a distributed application cluster.
- FIG. 3 is a functional block diagram of a computing environment.
- FIG. 4 A is a functional block diagram of a protocol tracing architecture in accordance with aspects described herein.
- FIG. 4 B is another block diagram of the protocol tracing architecture of FIG. 4 A .
- FIG. 5 A is a functional block diagram of a protocol tracing architecture in accordance with aspects described herein.
- FIG. 5 B is another block diagram of the protocol tracing architecture of FIG. 5 A .
- FIG. 6 is a flow diagram of a protocol tracing method in accordance with aspects described herein.
- FIG. 7 A is a functional block diagram of a function tracing architecture in accordance with aspects described herein.
- FIG. 7 B is another block diagram of the function tracing architecture of FIG. 7 A .
- FIG. 8 is a flow diagram of a function tracing method in accordance with aspects described herein.
- FIG. 9 is a functional block diagram of a distributed application monitoring system in accordance with aspects described herein.
- FIG. 10 is a flow diagram of a method for identifying and clustering events on a distributed application cluster in accordance with aspects described herein.
- FIG. 11 is a flow diagram of a method for linking and navigating data collected from a distributed application cluster in accordance with aspects described herein.
- FIG. 12 is a flow diagram of a method for navigating data associated with a distributed application cluster in accordance with aspects described herein.
- FIG. 13 A is a functional block diagram of a hybrid architecture operating in a direct mode of operation in accordance with aspects described herein.
- FIG. 13 B is a functional block diagram of a hybrid architecture operating in a passthrough mode of operation in accordance with aspects described herein.
- FIG. 14 is a flow diagram of a method for event logging and debugging on a distributed application cluster in accordance with aspects described herein.
- FIG. 15 is a block diagram of an example computer system in accordance with aspects described herein.
- FIG. 1 is a block diagram illustrating a system 100 for providing and monitoring a distributed application cluster 120 .
- The system 100 includes a client device 110, the distributed application cluster 120, a monitoring server 130, and an end device 150 connected by a network 140.
- The distributed application cluster 120 is a cluster of nodes 122, each running one or more pods. Each pod includes one or more containers running microservices that collectively provide a distributed application. In some examples, the pods may be containers or virtual machines. In certain examples, one or more pods may not include any containers (e.g., upon initialization, before containers are added). As part of running microservices that collectively provide the distributed application, containers may additionally run services such as databases or internal container orchestration platform services.
- The cluster of nodes 122 is managed by a container orchestration platform, such as KUBERNETES. The container orchestration platform operates upon the distributed application cluster 120, and may additionally operate at the monitoring server 130 and/or the client device 110, depending upon the embodiment.
- The client device 110 may be a personal computer, laptop, mobile device, or other computing device that includes a visual interface (e.g., a display).
- The client device 110 displays, at the visual interface, one or more user interfaces visualizing the structure, health, and/or performance of the distributed application cluster 120.
- The client device 110 accesses the distributed application cluster 120 over the network 140 and can manage the distributed application cluster 120.
- The client device 110 may be used to send instructions to the distributed application cluster 120 to control operation and/or configuration of the distributed application cluster 120.
- The end device 150 accesses and uses the distributed application hosted at the distributed application cluster 120 via the network 140. For example, the end device 150 sends a request for data to the distributed application cluster 120, which forwards the request to a pertinent node 122 (e.g., one of nodes 122 A to 122 N), where a containerized microservice processes the request and then sends the requested data to the end device 150.
- The pertinent node 122 is a node 122 with a pod running an instance of the containerized microservice requisite for responding to the data request, and may be selected from multiple nodes running instances of the containerized microservice using a selection process, such as a round robin algorithm, or by ranking the multiple nodes by resource use (e.g., processor, memory, non-transitory storage) and selecting the least-used node.
- Some or all of the functionality of the end device 150 may also or instead be performed at the client device 110, and the system 100 may not include an end device 150.
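The two node-selection strategies mentioned above can be sketched briefly. This is an illustrative sketch only; the data shapes (a list of node names, a dict of resource-usage fractions) are assumptions, not the patent's interfaces.

```python
# Sketch of the two selection processes described in the text:
# round-robin rotation, and least-used ranking by resource consumption.
from itertools import cycle

def make_round_robin(node_names):
    """Return a callable that yields nodes in rotating order."""
    rotation = cycle(node_names)
    return lambda: next(rotation)

def pick_least_used(usage_by_node):
    """Pick the node with the lowest combined resource use.

    usage_by_node maps node name -> (cpu, memory, storage) fractions in [0, 1].
    """
    return min(usage_by_node, key=lambda n: sum(usage_by_node[n]))
```

Round robin spreads load evenly regardless of node capacity, while least-used ranking adapts to uneven load at the cost of tracking per-node usage.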
- The monitoring server 130 retrieves data from the distributed application cluster 120 and generates the one or more user interfaces, which the monitoring server 130 sends to the client device 110 for display.
- The generated one or more user interfaces include graphical elements representative of the structure and health of the distributed application cluster 120.
- Some or all of the monitoring server 130 functionality may instead be performed at the client device 110, and the system 100 may not include the monitoring server 130.
- In some embodiments, the system 100 includes more than one client device 110, distributed application cluster 120, monitoring server 130, and/or end device 150.
- The monitoring server 130 may itself be a distributed application cluster that provides monitoring server 130 functionality as the distributed application.
- The monitoring server 130 can access and/or send instructions to the distributed application cluster 120.
- The client device 110, nodes 122, monitoring server 130, and end device 150 are configured to communicate via the network 140, which may comprise any combination of local area and/or wide area networks, using both wired and/or wireless communication systems.
- The network 140 uses standard communications technologies and/or protocols.
- The network 140 includes communication links using technologies such as Ethernet, 802.11, worldwide interoperability for microwave access (WiMAX), 3G, 4G, code division multiple access (CDMA), digital subscriber line (DSL), etc.
- Networking protocols used for communicating via the network 140 include multiprotocol label switching (MPLS), transmission control protocol/Internet protocol (TCP/IP), hypertext transport protocol (HTTP), simple mail transfer protocol (SMTP), and file transfer protocol (FTP).
- Data exchanged over the network 140 may be represented using any suitable format, such as hypertext markup language (HTML) or extensible markup language (XML).
- All or some of the communication links of the network 140 may be encrypted using any suitable technique or techniques.
- FIG. 2 is a block diagram illustrating an example of the distributed application cluster 120 .
- The distributed application cluster 120 includes a master 210 and two nodes 122, and is connected to an application monitor 250.
- In some embodiments, the application monitor 250 is not part of the distributed application cluster 120, and instead resides at the monitoring server 130, where it receives data from the distributed application cluster 120, e.g., from the master 210.
- Node 122 A includes two pods 224 A,B and node 122 N includes pod 224 C.
- Each pod 224 includes a container 226 with a microservice 228 .
- The distributed application cluster 120 can include fewer or more than two nodes 122, and each node 122 may include one, two, or more than two pods 224.
- Some nodes 122 may include no pods 224, e.g., nodes 122 that have recently been added to the distributed application cluster 120, to which pods are to be added by the container orchestration platform. Pods that have yet to be added to a node 122 by the container orchestration platform are "unassigned" or "pending" pods.
- Microservice 228 A and microservice 228 C are different instantiations of a first microservice that provides first functionality, while microservice 228 B is a second microservice that provides second functionality.
- Together, the microservices 228 provide a distributed application. For example, microservice 228 A could be a function to query a database, microservice 228 B could be a function to add or remove data in the database, and microservice 228 C could be a function to generate graphs based on retrieved database data, cumulatively providing a database interface application.
- The master 210 is a component of the container orchestration platform that manages the distributed application cluster 120. It monitors microservice 228 usage and adds and removes pods 224 to the nodes 122 in response to the usage. The master 210 also monitors the nodes 122 and reacts to downed (e.g., broken) nodes. For example, if node 122 N loses its network connection, the master 210 instructs node 122 A to add an instance of pod 224 C, thereby restoring the functionality lost when node 122 N went offline. The master 210 may add or remove nodes 122 based on node usage, e.g., how much of each node's 122 processing units, memory, and persistent storage is in use.
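The master's reaction to a downed node can be sketched as a small rescheduling step: pods that were hosted only on the lost node are re-placed on the least-loaded surviving node. The function name and data shapes are illustrative assumptions, not KUBERNETES internals.

```python
# Hedged sketch of restoring functionality after a node goes offline:
# move the downed node's pods onto surviving nodes, least-loaded first.

def reschedule_lost_pods(pods_by_node: dict, downed_node: str) -> dict:
    """Return an updated placement with the downed node's pods moved elsewhere.

    pods_by_node maps node name -> list of pod names hosted on that node.
    """
    placement = {n: list(p) for n, p in pods_by_node.items() if n != downed_node}
    survivors = sorted(placement, key=lambda n: len(placement[n]))
    for pod in pods_by_node.get(downed_node, []):
        target = survivors[0]  # least-loaded surviving node
        placement[target].append(pod)
        survivors.sort(key=lambda n: len(placement[n]))
    return placement
```

A real orchestrator reconciles toward a declared desired state rather than reacting imperatively, but the net effect on placement is the same as in this toy version.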
- The application monitor 250 monitors the distributed application cluster 120, collecting cluster data for the distributed application cluster 120.
- The application monitor 250 tracks the distributed application cluster's nodes 122, the usage of each node 122 (e.g., in terms of processor use, memory use, and persistent storage use), which pods 224 are on each node, the usage of each pod 224 (e.g., in terms of microservice use), which if any pods are unassigned, and so on.
- The application monitor 250 may reside upon the distributed application cluster 120 as in the figure, and/or upon the monitoring server 130 and/or client device 110.
- The automated solutions include event logging and debugging on the KUBERNETES platform.
- The solutions include no-instrumentation telemetry, an edge intel platform, entity linking and navigation, command-driven navigation, and a hybrid cloud/customer architecture.
- FIG. 3 is a functional block diagram of an example computing environment 300 .
- The computing environment 300 corresponds to an operating system, such as a Linux operating system.
- The nodes 122 of the distributed application cluster 120 each correspond to the computing environment 300.
- The computing environment 300 includes a user space 302 and a kernel space 304.
- The user space 302 is a set of memory locations where user processes are run (e.g., user programs, microservices, etc.).
- The kernel space 304 is a set of memory locations where system processes are run (e.g., device drivers, memory management, etc.).
- The user space 302 and the kernel space 304 are separated to protect the kernel (i.e., the system core) from any malicious or errant software behavior that may occur in the user space 302.
- Although the user space 302 is separated from the kernel space 304 to protect the kernel, in some cases it may be necessary for the user (or a user program) to have access to the kernel. For example, access to the kernel space 304 may be necessary to analyze network traffic or for other performance monitoring applications.
- the kernel space 304 includes a BPF program 306 .
- the BPF program 306 is a user developed program (or module) configured to perform one or more functions within the kernel space 304 .
- the BPF program 306 can be configured to provide one or more functions associated with performance monitoring (e.g., network traffic analysis).
- the BPF program 306 may be developed in a user-friendly programming language before being compiled into machine language and deployed in the kernel space 304 .
- the Linux BPF architecture includes a BPF verifier configured to ensure the BPF program 306 is incapable of malicious or errant software behavior within the kernel space 304 .
- BPF maps are used as global shared memory structures that can be accessed from the user space 302 and the kernel space 304 .
- the BPF map 308 is used to transfer data between the user space 302 and the kernel space 304 .
- the computing environment 300 includes a BPF map 308 .
- the BPF map 308 can be accessed within the kernel space 304 by the BPF program 306 .
- the BPF map 308 can be accessed from the user space 302 via system calls that are native to the computing environment 300 .
- Linux operating systems include system calls that provide different BPF map operations (e.g., read, write, clear, etc.).
- the native system calls may function similarly to an application programming interface (API) between the user space 302 and the BPF map 308 .
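The map operations exposed through those system calls behave like a small keyed API between the two spaces. As a rough user-space analogy only (this is not the real bpf(2) syscall interface; real programs use libbpf or similar wrappers), the core operations can be modeled as:

```python
# Rough user-space model of the BPF map operations exposed through
# native system calls (lookup, update, delete). Analogy only: real
# programs invoke the bpf(2) syscall or a wrapper such as libbpf.
class BpfMapModel:
    def __init__(self, max_entries=1024):
        self.max_entries = max_entries
        self._store = {}

    def update_elem(self, key, value):
        # Mirrors BPF_MAP_UPDATE_ELEM: fails when the map is full.
        if key not in self._store and len(self._store) >= self.max_entries:
            raise MemoryError("map full")
        self._store[key] = value

    def lookup_elem(self, key):
        # Mirrors BPF_MAP_LOOKUP_ELEM: returns None when absent.
        return self._store.get(key)

    def delete_elem(self, key):
        # Mirrors BPF_MAP_DELETE_ELEM.
        self._store.pop(key, None)

# The kernel-side BPF program and the user program share one map:
shared = BpfMapModel()
shared.update_elem("rx_bytes", 4096)    # written "in kernel space"
value = shared.lookup_elem("rx_bytes")  # read "from user space"
```

The point of the model is the shared-structure semantics: both sides address the same keyed store, which is how data crosses the user/kernel boundary without custom plumbing.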
- a user program 310 may be developed by the user and configured to call one or more of the BPF system calls from the user space 302 .
- the Linux BPF architecture supports the use of BPF probes configured to interrupt the user space 302 in favor of BPF programs within the kernel space 304 .
- the BPF probes can be configured to trigger based on various events (e.g., user functions, timers, kernel processes, etc.).
- FIG. 4 A is a functional block diagram of a protocol tracing architecture 400 in accordance with aspects described herein.
- the protocol tracing architecture 400 includes the use of kernel BPF probes (“kprobes”) to trace Linux system calls (“syscalls”). By tracing the Linux system calls, send and receive messages (or requests) can be traced to infer the protocol in use.
- the protocol tracing architecture 400 includes a user application 402 , a Linux environment 404 , a plurality of BPF probes 406 , and an edge module 408 .
- the user application 402 corresponds to one of the microservices 228 of FIG. 2 (e.g., microservice 228 A) and the Linux environment 404 corresponds to the computing environment running on one of the nodes 122 of FIG. 2 (e.g., node 122 A).
- the edge module 408 is configured to run on the node 122 A with the application 402 .
- the configuration of the edge module 408 may be determined and/or adjusted by a user (e.g., via the client device 110 ).
- the edge module 408 can be instantiated on the node 122 A without disrupting the other applications running on the node 122 A.
- the edge module 408 is configured to operate in a user space 404 a of the Linux environment 404 .
- the edge module 408 is configured to communicate (e.g., receive data) from at least one BPF program 410 operating in a kernel space 404 b of the Linux environment 404 .
- the at least one BPF program 410 is included in the edge module 408 (e.g., as a kernel space portion of the edge module 408 ).
- the edge module 408 is configured to deploy the plurality of BPF probes 406 to trace data sent between the user application 402 and the kernel space 404 b of the Linux environment 404 .
- the plurality of BPF probes 406 include kprobes configured to trigger based on the occurrence of certain kernel functions (e.g., received syscalls).
- the plurality of BPF probes 406 are configured to trigger on specific system calls based on the configuration (e.g., user configuration) of the edge module 408 .
- the designated system calls may include system calls used for networking.
- the BPF probes 406 may be registered to trigger on “connect,” “send,” “recv,” and “close” system calls sent from the user application 402 to the kernel space 404 b of the Linux environment 404 .
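Conceptually, the edge module registers handlers only for the networking syscalls named in its configuration, and all other syscalls pass through untraced. A minimal sketch of that dispatch (the names and structure are illustrative, not the kernel's probe mechanism):

```python
# Illustrative dispatch table: kprobes fire only for the syscalls the
# edge-module configuration designates (here, the networking calls).
TRACED_SYSCALLS = {"connect", "send", "recv", "close"}

captured = []

def on_syscall(name, payload=b""):
    """Called for every syscall; forwards only designated ones to the
    stand-in for the BPF program (the `captured` list)."""
    if name in TRACED_SYSCALLS:
        captured.append((name, payload))
        return True   # probe fired
    return False      # syscall ignored

on_syscall("send", b"GET / HTTP/1.1\r\n")
on_syscall("open")   # not a networking syscall: no probe fires
on_syscall("close")
```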
- the BPF probes 406 can trigger the BPF program 410 to capture raw message data.
- the BPF program 410 is configured to analyze the raw message data to determine the protocol associated with the data. If the protocol is of interest, the raw message data is transferred by the BPF program 410 to the user space 404 a of the Linux environment 404 .
- the BPF program 410 is configured to transfer the raw message data to the edge module 408 via a buffer 412 (e.g., a perf buffer).
- the raw message data is subsequently parsed by a protocol parser into well-formed/structured data, which is pushed and stored into data tables for querying.
- the protocol parser is included in the edge module 408 .
- the protocol inference can occur outside of the BPF program 410 (e.g., in the user space 404 a ). In such examples, the protocol inference can be moved into the user space 404 a by sending full or sample data for each protocol connection to the edge module 408 . If a connection is inferred not to be a protocol of interest, the edge module 408 can send a command (or signal) back to the BPF program 410 to discontinue tracing the connection.
- the protocol tracing architecture 400 may be used with HTTP, MySQL, PostgreSQL, CQL and DNS protocols, or other types of data protocols.
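The protocol-inference step can be approximated by inspecting the first bytes of a captured message. The heuristics below are a simplified illustration, not the patented classifier, and cover only two of the protocols listed above:

```python
def infer_protocol(data: bytes) -> str:
    """Guess the application protocol from a message prefix.
    Deliberately simple heuristics for illustration only."""
    http_methods = (b"GET ", b"POST ", b"PUT ", b"DELETE ", b"HEAD ")
    if data.startswith(http_methods) or data.startswith(b"HTTP/"):
        return "HTTP"
    # A PostgreSQL startup message begins with a 4-byte length followed
    # by the protocol version 3.0 (0x00030000).
    if len(data) >= 8 and data[4:8] == b"\x00\x03\x00\x00":
        return "PostgreSQL"
    return "unknown"

protocol = infer_protocol(b"GET /index HTTP/1.1\r\n")
```

A connection whose inferred protocol is "unknown" would, per the architecture above, be reported back so the BPF program can discontinue tracing it.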
- the user application 402 may correspond to multiple applications running on the same node (e.g., microservices 228 A, 228 B of FIG. 2 ).
- the edge module 408 can be configured to trace protocols across multiple applications simultaneously.
- the protocol tracing architecture 400 can be adapted for protocol tracing over encrypted channels.
- FIG. 5 A is a functional block diagram of a protocol tracing architecture 500 in accordance with aspects described herein.
- the protocol tracing architecture 500 includes the use of user BPF probes (“uprobes”) and kernel BPF probes (“kprobes”) to trace Linux system calls. By tracing the Linux system calls, send and receive messages (or requests) can be traced to infer the protocol in use over encrypted channels.
- the protocol tracing architecture 500 is substantially similar to the protocol tracing architecture 400 of FIG. 4 A , except the protocol tracing architecture 500 includes an encryption library 509 .
- the edge module 508 is configured to deploy a plurality of BPF probes 506 to trace data sent between the user application 502 , the kernel space 504 b of the Linux environment 504 , and the encryption library 509 .
- the encryption library 509 corresponds to the OpenSSL library, the GoTLS library, and/or other encryption libraries.
- the plurality of BPF probes 506 includes uprobes configured to trigger based on the occurrence of certain activity between the user application 502 and the encryption library 509 .
- the uprobes are configured to trigger on functions of the application 502 based on a configuration (e.g., user configuration) of the edge module 508 .
- the designated functions may include writing data to the encryption library 509 and reading data from the encryption library 509 .
- the plurality of BPF probes 506 includes kprobes configured to trigger based on the occurrence of certain kernel functions (e.g., received syscalls).
- the plurality of BPF probes 506 are configured to trigger on specific system calls based on the configuration of the edge module 508 .
- the BPF probes 506 can be used to trace data higher up (i.e., upstream) in the application stack prior to being encrypted. As shown in FIG. 5 B , the BPF probes 506 can trigger the BPF program 510 to capture raw message data.
- the BPF program 510 can be triggered by one or more uprobes 506 a to capture data at the encryption library 509 .
- the BPF program 510 can be triggered by one or more kprobes 506 b to capture data associated with kernel functions (e.g., syscalls).
- the BPF program 510 is configured to analyze the raw message data to determine the protocol associated with the data.
- the raw message data is transferred by the BPF program 510 to the user space 504 a of the Linux environment 504 .
- the BPF program 510 is configured to transfer the raw message data to the edge module 508 via the buffer 512 .
- the raw message data is subsequently parsed by a protocol parser into well-formed/structured data, which is pushed and stored into data tables for querying.
- the protocol parser is included in the edge module 508 .
- the protocol inference can occur outside of the BPF program 510 (e.g., in the user space 504 a ).
- the protocol inference can be moved into the user space 504 a by sending full or sample data for each protocol connection (e.g., at the encryption library 509 ) to the edge module 508 (e.g., via the buffer 512 ). If a connection is inferred not to be a protocol of interest, the edge module 508 can send a command (or signal) back to the BPF program 510 to discontinue tracing the connection.
- the protocol tracing architecture 500 allows data tracing for protocols such as HTTPS or other protocols operating over encrypted channels (e.g., SSL/TLS).
- Certain protocols may require a state for interpretation.
- the HTTP2 protocol uses a stateful compression scheme for its headers.
- decoding captured messages is not possible without knowing the compression state.
- uprobes included in the plurality of BPF probes 506 can be used to directly trace the HTTP2 library (e.g., encryption library 509 ) and capture the messages before the compression is applied.
- the uprobes can be used to trace multiple, different HTTP2 libraries.
- the uprobes can be used to trace multiple, different Golang HTTP2 libraries.
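The reason a uprobe on the library's write path sees usable data is that it fires before encryption is applied. That interposition can be mimicked in user space by wrapping the call boundary; in this sketch the XOR "cipher" is only a placeholder for a real library such as OpenSSL:

```python
captured_plaintext = []

def fake_encrypt(data: bytes) -> bytes:
    """Placeholder for a real cipher (e.g., OpenSSL's SSL_write path)."""
    return bytes(b ^ 0x5A for b in data)

def traced_ssl_write(data: bytes) -> bytes:
    # A uprobe on the library write function fires here, before
    # encryption, so the tracer sees cleartext even though the bytes
    # on the wire are opaque.
    captured_plaintext.append(data)
    return fake_encrypt(data)

wire = traced_ssl_write(b"GET /secret HTTP/1.1")
```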
- FIG. 6 is a flow diagram of a protocol tracing method 600 in accordance with aspects described herein.
- the method 600 corresponds to the operation of protocol tracing architectures 400 , 500 of FIGS. 4 A- 5 B .
- an edge module is provided and configured to deploy a BPF probe (e.g., kprobe) and a corresponding BPF program in the computing environment.
- the edge module is configured to run on the computing environment with a distributed application (e.g., application 402 or 502 ).
- the computing environment corresponds to a node of the distributed application cluster.
- the BPF probe is triggered based on an event associated with the distributed application running in a user space of the computing environment.
- the event (or events) that trigger the BPF probe are identified in a configuration of the edge module (e.g., a user configuration, system configuration, etc.).
- the events may correspond to “syscalls” that are sent from the user space to the kernel space.
- the events are selected by a user for monitoring or debugging purposes. In other examples, the events may be automatically selected to provide monitoring and/or event logging that is representative of the distributed application.
- data associated with the event is captured (or collected) in a kernel space of the computing environment via the BPF program.
- the captured data is analyzed via the BPF program to infer a protocol associated with the captured data.
- a determination is made as to whether the inferred protocol is a protocol of interest.
- the protocol(s) of interest are identified in the edge module configuration.
- the captured data is transferred from the kernel space of the computing environment to the user space of the computing environment.
- the captured data is transferred from the BPF program to the edge module.
- the BPF probe can be reset and the method 600 returns to block 604 .
- the tracing may be discontinued in response to a determination that the protocol is not of interest.
- BPF probes can be used to trace or log events associated with distributed applications.
- Such tracing or logging can be used by developers (or other users) to determine the root causes of functional and/or performance issues.
- an end-to-end system enables a user to dynamically inject a trace point on an actively running application using a high-level specification.
- the dynamic injection of trace points can be achieved by (i) taking the high-level specification to automatically generate the BPF uprobe code (e.g., configuration code) to collect the desired information (e.g., trace latency, the inputs and outputs to a function every time it is called, etc.), and (ii) deploying the uprobes.
- the generation of the BPF uprobe code includes the use of (or reference to) a BPF Compiler Collection (BCC) toolkit.
- the automatic generation of BPF uprobe code includes the use of debug symbols to locate the variables of interest in memory.
- the variables of interest can be subsequently extracted and exported.
- the aforementioned approach may be used to generate code whose manual development would otherwise be a time-consuming, tedious, and error-prone process.
- the captured data can be subsequently outputted into a structured format (e.g., into data tables) for easy querying. This can provide visibility to desired application functions without the need to recompile and/or redeploy the application.
- basic types and/or complex structs can be traced using the generated BPF uprobes.
- debug symbols can be utilized to trace all the members of the struct from memory as raw bytes, and then cast them back into the defined structure in user space after copying them from kernel space.
- Golang interfaces can be traced by detecting the run-time type against a set of potential interface candidates extracted from the compiled code. Subsequently, the run-time type is checked against a list in the kernel space (e.g., in a BPF map). Further, raw bytes are sent for the type of interest to the user space with an indicator of the type so that it can be decoded in the user space.
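The raw-bytes round trip described above can be illustrated with Python's struct module: the kernel side copies the struct's members out as opaque bytes, and user space casts them back using a layout that would, in the real system, come from debug symbols. The two-field layout here is an assumed example:

```python
import struct

# Layout recovered (in principle) from debug symbols: a struct with a
# 32-bit id and a 64-bit latency in nanoseconds, little-endian.
LAYOUT = "<Iq"

def kernel_copy(record_id: int, latency_ns: int) -> bytes:
    """Kernel side: copy the struct's members out as raw bytes."""
    return struct.pack(LAYOUT, record_id, latency_ns)

def user_cast(raw: bytes) -> dict:
    """User side: cast the raw bytes back into the defined structure."""
    record_id, latency_ns = struct.unpack(LAYOUT, raw)
    return {"id": record_id, "latency_ns": latency_ns}

event = user_cast(kernel_copy(7, 1_250_000))
```

For the Golang-interface case, the same raw bytes would additionally carry a type indicator so user space knows which layout to apply when decoding.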
- FIG. 7 A is a functional block diagram of a function tracing architecture 700 in accordance with aspects described herein.
- the function tracing architecture 700 includes the use of user BPF probes (“uprobes”) to capture desired information from one or more application functions (e.g., trace latency, the inputs and outputs to a function every time it is called, etc.).
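In pure user space, the information such a uprobe collects per call (latency, inputs, and outputs) can be mimicked with a tracing wrapper. This is only an analogy for what the generated probes gather; the `traced` helper is not part of the described system:

```python
import time

trace_log = []

def traced(fn):
    """Record inputs, output, and latency for every call, the way a
    generated uprobe would for an instrumented application function."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        trace_log.append({
            "function": fn.__name__,
            "args": args,
            "result": result,
            "latency_s": time.perf_counter() - start,
        })
        return result
    return wrapper

@traced
def handle_request(path):
    return f"200 {path}"

handle_request("/health")
```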
- the function tracing architecture 700 includes an application 702 , a Linux environment 704 , a plurality of BPF probes 706 , and an edge module 708 .
- the user may use an end-to-end system to generate BPF probes 706 and inject trace points in functions of the application 702 .
- the end-to-end system corresponds, at least in part, to the system 100 of FIG. 1 .
- the user may use the client device 110 to specify functions and variables included in the deployed application 702 for inspection.
- the memory addresses of the specified functions are used to link (or register) the functions to the BPF probes 706 .
- the memory addresses of the specified functions can be included in the configuration code generated for the plurality of BPF probes 706 .
- the BPF probes 706 are triggered whenever the memory addresses of the functions are reached.
- the BPF probes 706 can trigger the BPF program 710 to capture raw data.
- the BPF program 710 can be triggered by a uprobe 706 to capture the desired data.
- data associated with specified variables may be captured by the BPF program 710 .
- the raw data is transferred by the BPF program 710 to the user space 704 a of the Linux environment 704 .
- the BPF program 710 is configured to transfer the raw message data to the edge module 708 via the buffer 712 .
- the raw data may be parsed, sorted, and/or processed into well-formed/structured data, which is pushed and stored into data tables for querying.
- the application 702 may correspond to multiple applications (e.g., microservices 228 A, 228 B of FIG. 2 ).
- the edge module 708 can be configured to trace data across functions included in multiple applications simultaneously.
- FIG. 8 is a flow diagram of a function tracing method 800 in accordance with aspects described herein.
- the method 800 corresponds to the operation of function tracing architecture 700 of FIGS. 7 A, 7 B .
- an edge module is provided and configured to deploy a BPF probe (e.g., uprobe) and a corresponding BPF program in the computing environment.
- the edge module is configured to run on the computing environment with a distributed application (e.g., application 702 ).
- the computing environment corresponds to a node of the distributed application cluster.
- the BPF probe is triggered based on an event associated with the distributed application running in a user space of the computing environment.
- the event (or events) that trigger the BPF probe are identified in a configuration of the edge module (e.g., a user configuration, system configuration, etc.).
- the events correspond to specific functions of the application 702 that are selected by a user for monitoring or debugging purposes. In other examples, the events correspond to functions that are automatically selected to provide monitoring and/or event logging that is representative of the distributed application.
- data associated with the event is captured (or collected) in a kernel space of the computing environment via the BPF program.
- the captured data is transferred from the kernel space of the computing environment to the user space of the computing environment.
- the captured data is transferred from the BPF program to the edge module.
- BPF probes can be automatically deployed and registered to corresponding BPF programs (e.g., BPF program 410 ) to provide no-instrumentation telemetry.
- BPF programs may be developed using BPF specific languages and toolkits.
- BPFTrace is a high-level tracing language for Linux eBPF that can be used for BPF program development.
- it can be challenging to deploy and monitor BPFTrace scripts across an entire cluster (e.g., cluster 120 ).
- a distributed BPF code management system can be used to automatically deploy BPFTrace scripts across the cluster 120 .
- the distributed BPF code management system includes a specification (e.g., in the PxL language) that provides for the automatic deployment of BPFTrace scripts.
- the code (i.e., the BPFTrace script) is analyzed to detect outputs such that the collection of the data from the BPF program can be automated (e.g., via edge module 408 ).
- the data from each deployed BPF program on the cluster can be formatted into a structured record for easy querying.
- the distributed BPF code management can be applied to BCC, GoBPF and other BPF front-ends.
- the edge module can be configured with a flexible architecture that accepts data from a plurality of sources.
- FIG. 9 illustrates a distributed application monitoring system 900 in accordance with aspects described herein.
- the system 900 includes a plurality of sources 906 , an edge module 908 , a distributed agent 910 , and a plurality of interface tools 912 .
- the edge module 908 may correspond to each of the edge modules 408 , 508 , and 708 of FIGS. 4 A- 5 B and 7 A- 7 B .
- the edge module 908 is configured to run on a node included in a distributed application cluster (e.g., cluster 120 ).
- the edge module 908 includes a data collector 914 , a plurality of data tables 916 , and a query engine 918 .
- the distributed agent 910 provides an interface between the edge module 908 and the plurality of interface tools 912 .
- the plurality of interface tools includes a command line interface (CLI) 912 a and a user interface (UI) 912 b.
- the plurality of sources 906 can include Linux kernel data exports (e.g., CPU, IO, memory usage), eBPF data exports (e.g., outputs from BPF programs), Linux APIs, Java Virtual Machines (JVM), and other sources.
- the edge module 908 (or the data collector 914 ) includes an API that allows for the addition of new data sources in a flexible manner.
- the data collector 914 can parse, sort, and/or process the collected data into well-formed/structured data, which is pushed and stored into the plurality of data tables 916 .
- the query engine 918 enables the plurality of data tables 916 to be searched.
- developers can use the interface tools 912 to engage with the query engine 918 via the distributed agent 910 .
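The data-table/query-engine pairing can be sketched as a list of structured records with a small filter function. The field names and the equality-only filter are illustrative stand-ins for the actual table schema and query language:

```python
# Structured records as pushed into the edge module's data tables.
http_events = [
    {"pod": "checkout-1", "path": "/cart", "status": 200, "latency_ms": 12},
    {"pod": "checkout-1", "path": "/cart", "status": 500, "latency_ms": 87},
    {"pod": "auth-0", "path": "/login", "status": 200, "latency_ms": 5},
]

def query(table, **filters):
    """Minimal stand-in for the query engine: equality filters only."""
    return [row for row in table
            if all(row.get(k) == v for k, v in filters.items())]

errors = query(http_events, status=500)
```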
- the distributed agent 910 is configured to run on the distributed application cluster (e.g., cluster 120 ) and is responsible for query execution and managing each edge module 908 .
- a distributed agent can be executed on the distributed application cluster 120 .
- the distributed agent 910 includes an “edgeML” system that uses the distributed data across all of the nodes 122 on the cluster 120 to train an unsupervised model used for clustering events.
- the edgeML system is configured to train one or more machine learning (ML) or artificial intelligence (AI) models.
- each edge device (i.e., node 122 ) keeps track of its own "coreset," a small subset of the data that is mathematically guaranteed to be a representative sample of the total data on the node 122 .
- the coresets are merged together by a central node (e.g., node 122 A), and the ML/AI model for data clustering is trained using the resulting unified coreset of events.
- the data clustering includes automatically clustering events collected without knowledge or guidance about the nature of those events.
- HTTP request data can be clustered by the edgeML system based on the semantic similarity of the requests to provide usable metrics.
- a coreset algorithm is applied to achieve optimized, streamed semantic clustering of the coreset data.
- a kmeans coreset algorithm can be applied for semantic clustering of HTTP request data. The application of the kmeans coreset algorithm can provide clustering on streaming data with only O(log N) memory.
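The merge step can be sketched with a toy "coreset" that keeps a bounded summary per node: merging two summaries and re-reducing keeps memory bounded, which is the merge-and-reduce idea behind streaming with logarithmic memory. Note the sampling rule below is a naive placeholder, not the actual k-means coreset construction (which uses importance sampling):

```python
def reduce_coreset(points, k):
    """Naive placeholder for coreset reduction: keep k evenly spaced
    points. A real k-means coreset uses importance sampling."""
    if len(points) <= k:
        return list(points)
    step = len(points) / k
    return [points[int(i * step)] for i in range(k)]

def merge_coresets(a, b, k):
    """Merge-and-reduce: union the two summaries, then re-reduce,
    so the summary size stays bounded at k."""
    return reduce_coreset(sorted(a + b), k)

node_a = reduce_coreset(list(range(100)), 8)      # coreset on node A
node_b = reduce_coreset(list(range(100, 200)), 8) # coreset on node B
unified = merge_coresets(node_a, node_b, 8)       # central node
```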
- using a query language (e.g., PxL), feature generation and inference can be invoked as a user-defined function, which allows integration of results directly in the data analysis, cleaning, and structuring phases.
- FIG. 10 is a flow diagram of a method 1000 for identifying and clustering events on a distributed application cluster in accordance with aspects described herein.
- the method 1000 can be carried out, at least in part, by the distributed application monitoring system 900 of FIG. 9 .
- a distributed agent (e.g., agent 910 ) is provided and configured to run on the distributed application cluster.
- the distributed application cluster includes a plurality of nodes and at least one distributed application runs on each node of the plurality of nodes.
- each edge module is configured to run on a corresponding node of the plurality of nodes. In some examples, each edge module is configured to deploy at least one BPF probe and at least one corresponding BPF program on the corresponding node.
- each data coreset includes data associated with the distributed applications running on the corresponding node.
- the data included in each data coreset may be a representative sample of the corresponding node's total data.
- each data coreset is tracked by triggering the at least one BPF probe and collecting data via the at least one corresponding BPF program associated with the edge module on each node.
- a unified data coreset is generated by merging the plurality of data coresets.
- merging the plurality of data coresets includes transferring the plurality of data coresets to the edge module of a central node of the plurality of nodes. The central node may then transfer the unified data coreset to the distributed agent.
- the unified data coreset is updated (or remerged) at periodic intervals.
- the unified data coreset can be generated and/or updated in response to data queries received at the distributed agent (e.g., from the CLI 912 a or the UI 912 b ).
- the unified data coreset is transferred to the distributed agent to train an unsupervised model configured to identify and cluster events across the distributed application cluster.
- the unsupervised model is an ML and/or AI model.
- semantic types can be used to track contextual information about collected data over time.
- relevant entity semantic types (e.g., pod, microservice, etc.) can be referenced via the CLI 912 a and/or the UI 912 b to provide contextual displays. For example, a value with a semantic type representing latency quantiles can be rendered as a box whisker plot and displayed to the user via the UI 912 b.
- semantic types can be used to create contextual “deep links” to dedicated views for a given entity type in both the CLI 912 a and the UI 912 b .
- a value annotated with the “Pod” semantic type will automatically be linked to a dedicated view for that specific pod. Such linking can be accomplished automatically without any input from the user.
- the query engine 918 automatically propagates the semantic types used by a client to create the “deep link.”
- URLs can be generated that are “entity-centric.” Each entity may have a hierarchical URL that can be used to see its landing page or view. For example, the URL to navigate to the default view for a pod may be:
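The concrete URL is elided in the text above. Purely as an illustration of a hierarchical, entity-centric scheme (the path segments here are hypothetical, not the product's actual routes), such a URL could be assembled like:

```python
def entity_url(cluster, namespace=None, pod=None):
    """Build a hierarchical URL for an entity's default view.
    Path scheme is hypothetical; deeper entities extend the path."""
    parts = ["", "clusters", cluster]
    if namespace:
        parts += ["namespaces", namespace]
    if pod:
        parts += ["pods", pod]
    return "/".join(parts)

url = entity_url("prod", namespace="shop", pod="checkout-1")
```

Because the path is hierarchical, truncating it yields the landing page of the enclosing entity (cluster, then namespace, then pod).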
- FIG. 11 is a flow diagram of a method 1100 for linking and navigating data collected from a distributed application cluster in accordance with aspects described herein.
- the method 1100 can be carried out, at least in part, by the distributed application monitoring system 900 of FIG. 9 .
- an edge module is provided and configured to deploy a BPF probe and a corresponding BPF program in a computing environment.
- the edge module is configured to run on the computing environment with a distributed application.
- the computing environment corresponds to a node of the distributed application cluster.
- the BPF probe is triggered based on an event associated with the distributed application running in a user space of the computing environment.
- the event (or events) that trigger the BPF probe are identified in a configuration of the edge module (e.g., a user configuration, system configuration, etc.).
- data associated with the event is collected (or captured) in a kernel space of the computing environment via the BPF program.
- the collected data is transferred from the kernel space of the computing environment to the user space of the computing environment.
- the captured data is transferred from the BPF program to the edge module.
- one or more semantic labels are assigned to the collected data.
- the collected data is labeled to indicate a source of the data (e.g., source within the distributed application cluster).
- the semantic labels correspond to types of entities supported by each node in the cluster (e.g., node name, pod name, microservice name, etc.).
- Data queries including at least one of the semantic labels may be received from a user interface (e.g., CLI 912 a or UI 912 b ).
- data from the collected data associated with the semantic labels in the query may be returned to the user interface (e.g., via the agent 910 and/or the edge module 908 ).
- URL links corresponding to the collected data can be generated.
- the URL links may include at least one semantic label assigned to the collected data.
- the collected data can be displayed (e.g., via UI 912 b ) in response to a user accessing the URL link(s).
- the collected data is displayed as a data table and/or a graphical visualization (e.g., chart, plot, etc.).
- a main mode for navigating through an interface is achieved by typing autocompleted commands. Possible commands range from navigating to other pages/views to performing specific actions on the current page.
- “fuzzy” searches are performed to determine which entities best match what has been typed. Matches are determined by how closely the entity name/description matches the user's input, and how relevant the entity is according to the user's current context.
- the searchable data can be indexed in order for the searches to be performed quickly. For example, the data can be indexed or filtered by a cluster ID.
- autocomplete for entity (pod, service, etc.) names is based on context provided from a knowledge graph.
- the knowledge graph is based on relationships between entities (how much they communicate, how often are they linked together, hierarchical organization) as well as which entities have the most interesting behavior within relevant time windows. This can also include recommending different actions that can be taken depending on the current page and entities involved.
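A simplified scoring function shows how name similarity and contextual relevance can be combined to rank autocomplete candidates. The subsequence heuristic, the 0.7/0.3 weights, and the relevance inputs are all illustrative assumptions:

```python
def name_score(query: str, name: str) -> float:
    """Crude subsequence match: fraction of query characters found
    in order inside the candidate name (case-insensitive)."""
    q, n = query.lower(), name.lower()
    i, hits = 0, 0
    for ch in q:
        j = n.find(ch, i)
        if j >= 0:
            hits += 1
            i = j + 1
    return hits / len(q) if q else 0.0

def rank(query, entities):
    """entities: list of (name, relevance), where relevance would come
    from the knowledge graph (communication volume, co-occurrence, etc.)."""
    scored = [(0.7 * name_score(query, name) + 0.3 * rel, name)
              for name, rel in entities]
    return [name for score, name in sorted(scored, reverse=True)]

candidates = [("checkout-service", 0.9), ("cart-service", 0.2),
              ("auth-service", 0.1)]
best = rank("chk", candidates)[0]
```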
- the views presented to the user are entirely based on code.
- code serves as a “visualization specification” that describes the layout of tables, charts, and other visuals in any given view.
- the view code includes a declarative message format and a PxL script.
- the PxL script may be a declarative Python script.
- These views can be registered in a public repository (e.g., GitHub).
- the registered views can be accessed by ID using hyperlinking and keyboard shortcuts.
- users can extend or modify these views, or register their own views to the public repository.
- FIG. 12 is a flow diagram of a method 1200 for navigating data associated with a distributed application cluster in accordance with aspects described herein.
- the method 1200 can be carried out, at least in part, by the distributed application monitoring system 900 of FIG. 9 .
- an edge module is provided and configured to deploy a BPF probe and a corresponding BPF program in a computing environment.
- the edge module is configured to run on the computing environment with a distributed application.
- the computing environment corresponds to a node of the distributed application cluster.
- the distributed application corresponds to one entity of a plurality of entities on the distributed application cluster.
- the plurality of entities can include nodes, pods, and services (or microservices) running on the distributed application cluster.
- the BPF probe is triggered based on an event associated with the distributed application running in a user space of the computing environment.
- the event (or events) that trigger the BPF probe are identified in a configuration of the edge module (e.g., a user configuration, system configuration, etc.).
- data associated with the event is collected (or captured) in a kernel space of the computing environment via the BPF program.
- the collected data is transferred from the BPF program to the edge module.
- at block 1208 , at least one relationship is identified between the distributed application and at least one entity of the plurality of entities based on the collected data.
- identifying the at least one relationship includes generating a knowledge graph from the collected data that represents relationships between the plurality of entities.
- the relationships represented in the knowledge graph may correspond to interactions between two or more entities of the plurality of entities.
- at block 1210 , at least one recommended data set is provided (e.g., to a user) based on the at least one identified relationship.
- the recommended data set(s) includes at least a portion of the collected data.
- recommended data set(s) include a data table and/or a graphical visualization representing the data set.
- the UI 912 b includes a command entry field where users can enter commands to perform various functions with the UI 912 b .
- the commands may instruct the UI 912 b to display a data table and/or a graphical visualization representing the recommended data set(s).
- the user may enter a partially completed command.
- at least one command corresponding to the recommended data set(s) may be suggested (e.g., via a drop down list) based on the partially completed command.
- the suggested command(s) are provided via the edge module (or the agent 910 ).
- a hybrid architecture is used to separate control functionality (e.g., operations for handling API requests, overall management of the system) and data functionality (e.g., collecting, managing, and executing queries on data).
- the separation of functionality is split between a self-hosted cloud service and customer environment.
- data can be processed entirely in the customer environment. Requests pertaining to the data are made to the application(s) running on the customer environment. All other operations (e.g., control functionality) can be handled entirely in the self-hosted cloud service.
- FIG. 13 A is a functional block diagram of a hybrid architecture 1300 operating in a direct mode in accordance with aspects described herein.
- the hybrid architecture 1300 includes a customer environment 1302 , a satellite application 1304 , a UI 1306 , and a cloud service 1308 .
- the customer environment 1302 corresponds to a node (e.g., node 122 A) of the distributed application cluster 120
- the satellite application 1304 corresponds to a distributed application or microservice running on the node (e.g., microservice 228 A)
- the UI 1306 corresponds to the UI 912 b
- the cloud service 1308 is configured to communicate with the satellite application 1304 via the distributed agent 910 .
- the UI 1306 may communicate with the satellite application 1304 via an API.
- the UI 1306 is configured to send queries (or requests) directly to the satellite application 1304 . Likewise, the UI 1306 is configured to receive responses directly from the satellite application 1304 . In one example, the UI 1306 is configured to retrieve the address (e.g., IP address) of the satellite application 1304 from the cloud service 1308 . In some examples, the UI 1306 may also retrieve a status of the satellite application 1304 (e.g., via a heartbeat sequence).
- the satellite application 1304 running on the customer environment 1302 may include a proxy service which handles requests. In some examples, the proxy service can be configured to serve pre-generated SSL certificates to satisfy browser security requirements. Because the UI 1306 communicates directly with the satellite application 1304, data can be kept behind a firewall 1310.
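- Direct mode can be sketched as a two-step exchange: the UI asks the cloud service only for the satellite's address (a control-plane lookup), then queries the satellite directly so data never leaves the customer environment. All class names, addresses, and data rows below are hypothetical.

```python
# Sketch of direct mode: address lookup via the cloud service,
# then a direct query to the satellite. Names are illustrative assumptions.
class CloudService:
    def __init__(self, address_book):
        self.address_book = address_book
    def lookup(self, satellite_id):
        return self.address_book[satellite_id]  # control-plane lookup only

class Satellite:
    def __init__(self, rows):
        self.rows = rows
    def query(self, predicate):
        return [r for r in self.rows if predicate in r]

cloud = CloudService({"sat-1": "10.0.0.5"})
satellites = {"10.0.0.5": Satellite(["pod-a: ok", "pod-b: oomkilled"])}

addr = cloud.lookup("sat-1")          # step 1: resolve the satellite's address
print(satellites[addr].query("oom"))  # step 2: query the satellite directly
```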
- FIG. 13 B is a functional block diagram of the hybrid architecture 1300 operating in a passthrough mode in accordance with aspects described herein.
- the data request is made to the cloud service 1308 .
- the cloud service 1308 is responsible for forwarding the data request to the customer environment 1302, and subsequently sending any results back to the requestor (i.e., the UI 1306). Because the data flows through the cloud service 1308, it can be accessed from outside the network.
- a message-bus based system can be used for proxying messages between the cloud service 1308 and the customer environment 1302 .
- the message-bus system can be used to direct other control messages to/from the cloud service 1308 and the customer environment 1302 .
- Such control messages can include notifying the customer of upgrades or possible configuration changes.
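- Passthrough mode can be sketched as a simple relay: the cloud service receives the UI's data request, forwards it to the customer environment, and relays the result back. The class and method names are hypothetical; a real deployment would use the message-bus system described above rather than direct method calls.

```python
# Sketch of passthrough mode: the cloud service proxies data requests
# so results can be accessed from out-of-network. Names are illustrative.
class CustomerEnvironment:
    def __init__(self, data):
        self.data = data
    def handle(self, query):
        return [row for row in self.data if query in row]

class CloudService:
    """Relays a request downstream and the response back to the requestor."""
    def __init__(self, customer_env):
        self.customer_env = customer_env
    def forward(self, query):
        result = self.customer_env.handle(query)  # forward request to customer env
        return result                             # relay response back to the UI

env = CustomerEnvironment(["pod-a: ok", "pod-b: crashloop"])
cloud = CloudService(env)
print(cloud.forward("crashloop"))  # ['pod-b: crashloop']
```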
- FIG. 14 is a flow diagram of a method 1400 for event logging and debugging on a distributed application cluster in accordance with aspects described herein.
- the method 1400 corresponds to the operation of the hybrid architecture 1300 in the direct and passthrough modes.
- an edge module is provided and configured to deploy a BPF probe and a corresponding BPF program in a computing environment (e.g., customer environment 1302 ).
- the edge module is configured to run on the computing environment with a distributed application (e.g., satellite application 1304 ).
- the computing environment corresponds to a node of the distributed application cluster.
- the distributed application may run in a user space of the computing environment.
- the BPF probe is triggered based on an event associated with the distributed application.
- the event (or events) that trigger the BPF probe are identified in a configuration of the edge module (e.g., a user configuration, system configuration, etc.).
- data associated with the event is collected (or captured) in a kernel space of the computing environment via the BPF program.
- transferring the collected data from the BPF program to the edge module includes transferring the collected data from the kernel space of the computing environment to the user space of the computing environment.
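- The kernel-to-user transfer at this step can be illustrated with a user-space simulation: the BPF program appends event records to a shared buffer (standing in for a BPF perf/ring buffer), and the edge module drains it into user space. This is purely illustrative; a real deployment would use BPF map and perf-buffer APIs, not a Python queue.

```python
# User-space simulation of the kernel-to-user data transfer.
# The deque stands in for a BPF perf/ring buffer; record fields are hypothetical.
from collections import deque

kernel_buffer = deque()  # stands in for the BPF ring buffer in kernel space

def bpf_program_emit(event):
    """'Kernel side': record data associated with the triggering event."""
    kernel_buffer.append(event)

def edge_module_drain():
    """'User side': transfer all pending records into user space."""
    records = []
    while kernel_buffer:
        records.append(kernel_buffer.popleft())
    return records

bpf_program_emit({"pid": 4242, "event": "tcp_connect"})
bpf_program_emit({"pid": 4242, "event": "http_request"})
print(edge_module_drain())  # both records, in order; the buffer is now empty
```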
- a query request associated with the collected data is sent via UI 1306 to the edge module.
- the cloud service 1308 may query an address (e.g., IP address) associated with the distributed application (or the computing environment).
- the UI 1306 may request the address associated with the distributed application (or the edge module) from the cloud service 1308 and send the query request directly to the address.
- the UI 1306 sends the query request to the cloud service 1308 and the cloud service 1308 directs (or forwards) the query request to the edge module of the distributed application.
- a response corresponding to the collected data is received from the edge module at the UI 1306 .
- the response includes at least a portion of the collected data.
- the cloud service 1308 may receive the response directly from the distributed application.
- the distributed application provides the response to the cloud service 1308 and the cloud service 1308 directs (or forwards) the response to the UI 1306 .
- the UI 1306 is configured to generate a data table and/or a graphical visualization based on the received response corresponding to the collected data.
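- Generating a data table from the received response can be sketched as simple row/column formatting. The column names and row contents below are hypothetical, not the UI's actual schema.

```python
# Sketch of the UI rendering a query response as a plain-text data table.
# Column names and row data are illustrative assumptions.
def render_table(rows, columns):
    header = " | ".join(columns)
    lines = [header, "-" * len(header)]
    for row in rows:
        lines.append(" | ".join(str(row[c]) for c in columns))
    return "\n".join(lines)

rows = [{"pod": "pod-a", "status": "ok"}, {"pod": "pod-b", "status": "oomkilled"}]
print(render_table(rows, ["pod", "status"]))
```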
- FIG. 15 is a block diagram of an example computer system 1500 that may be used in implementing the systems and methods described herein.
- General-purpose computers, network appliances, mobile devices, or other electronic systems may also include at least portions of the system 1500 .
- the system 1500 includes a processor 1510 , a memory 1520 , a storage device 1530 , and an input/output device 1540 .
- Each of the components 1510 , 1520 , 1530 , and 1540 may be interconnected, for example, using a system bus 1550 .
- the processor 1510 is capable of processing instructions for execution within the system 1500 .
- the processor 1510 is a single-threaded processor.
- the processor 1510 is a multi-threaded processor.
- the processor 1510 is capable of processing instructions stored in the memory 1520 or on the storage device 1530 .
- the memory 1520 stores information within the system 1500 .
- the memory 1520 is a non-transitory computer-readable medium.
- the memory 1520 is a volatile memory unit.
- the memory 1520 is a non-volatile memory unit.
- some or all of the data described above can be stored on a personal computing device, in data storage hosted on one or more centralized computing devices, or via cloud-based storage.
- some data are stored in one location and other data are stored in another location.
- quantum computing can be used.
- functional programming languages can be used.
- electrical memory such as flash-based memory, can be used.
- the storage device 1530 is capable of providing mass storage for the system 1500 .
- the storage device 1530 is a non-transitory computer-readable medium.
- the storage device 1530 may include, for example, a hard disk device, an optical disk device, a solid-state drive, a flash drive, or some other large capacity storage device.
- the storage device may store long-term data (e.g., database data, file system data, etc.).
- the input/output device 1540 provides input/output operations for the system 1500 .
- the input/output device 1540 may include one or more network interface devices, e.g., an Ethernet card; a serial communication device, e.g., an RS-232 port; and/or a wireless interface device, e.g., an 802.11 card, a 3G wireless modem, or a 4G wireless modem.
- the input/output device may include driver devices configured to receive input data and send output data to other input/output devices, e.g., keyboard, printer and display devices 1560 .
- mobile computing devices, mobile communication devices, and other devices may be used.
- At least a portion of the approaches described above may be realized by instructions that upon execution cause one or more processing devices to carry out the processes and functions described above.
- Such instructions may include, for example, interpreted instructions such as script instructions, or executable code, or other instructions stored in a non-transitory computer readable medium.
- the storage device 1530 may be implemented in a distributed way over a network, such as a server farm or a set of widely distributed servers, or may be implemented in a single computing device.
- Although an example processing system has been described in FIG. 15, embodiments of the subject matter, functional operations, and processes described in this specification can be implemented in other types of digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible nonvolatile program carrier for execution by, or to control the operation of, data processing apparatus.
- the program instructions can be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus.
- the computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
- a processing system may encompass all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
- a processing system may include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
- a processing system may include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
- a computer program (which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a computer program may, but need not, correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code).
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
- the processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output.
- the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
- Computers suitable for the execution of a computer program can include, by way of example, general or special purpose microprocessors or both, or any other kind of central processing unit.
- a central processing unit will receive instructions and data from a read-only memory or a random access memory or both.
- a computer generally includes a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data.
- a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
- a computer need not have such devices.
- a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a universal serial bus (USB) flash drive), to name just a few.
- Computer readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD-ROM and DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back end, middleware, or front end components.
- the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), e.g., the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- the automated solutions include event logging and debugging on the KUBERNETES platform.
- the solutions include the use of no-instrumentation telemetry, an edge intel platform, entity linking and navigation, command driven navigation, and a hybrid-cloud/customer architecture.
- The phrases "X has a value of approximately Y" or "X is approximately equal to Y" should be understood to mean that one value (X) is within a predetermined range of another value (Y). The predetermined range may be plus or minus 20%, 10%, 5%, 3%, 1%, 0.1%, or less than 0.1%, unless otherwise indicated.
- a reference to “A and/or B”, when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
- the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements.
- This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified.
- “at least one of A and B” can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
- The use of ordinal terms such as "first," "second," "third," etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another, or the temporal order in which acts of a method are performed. Ordinal terms are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for the use of the ordinal term).
Description
- “/cluster/:cluster_id/ns/:ns/pod/:pod_name.” Each URL for an entity or set of entities can be backed by a live view that is the default for the entity type. Additional live views can be registered as sub-properties of such entities. For example, a non-default view called “pod_node_stats” could be written as:
- "/cluster/:cluster_id/ns/:ns/pod/:pod_name?script=pod_node_stats." In some examples, user-defined scripts can be automatically translated into entity-centric URLs based on defined variables in the script. For example, reserved variable names such as "namespace," "pod_name," and "service_name" may automatically be translated. As such, a script that contains a "namespace" variable and a "pod_name" variable can be inferred to be about the pod in "pod_name." In some examples, the entity-centric URLs exist within the confines of the CLI 912 a (or the UI 912 b). The UI 912 b may be responsible for mapping the URLs to the views that should be loaded for the user.
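- The inference from reserved script variables to an entity-centric URL can be sketched as a lookup rule. The URL template follows the pod example given above; the helper name and the error-handling behavior are hypothetical.

```python
# Sketch of translating a user-defined script's reserved variables into an
# entity-centric URL. The inference rule follows the pod example in the text;
# the helper name and ValueError behavior are illustrative assumptions.
def script_to_url(script_vars):
    """Infer the entity URL for a script defining reserved variables."""
    if "pod_name" in script_vars and "namespace" in script_vars:
        # a script with "namespace" and "pod_name" is inferred to be about that pod
        return "/cluster/{cluster_id}/ns/{namespace}/pod/{pod_name}".format(**script_vars)
    raise ValueError("cannot infer entity from script variables")

url = script_to_url({"cluster_id": "c1", "namespace": "default", "pod_name": "web-0"})
print(url)  # /cluster/c1/ns/default/pod/web-0
```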
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/525,749 US11977934B2 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on KUBERNETES |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063113112P | 2020-11-12 | 2020-11-12 | |
US17/525,749 US11977934B2 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on KUBERNETES |
Publications (2)
Publication Number | Publication Date |
---|---|
US20220147408A1 US20220147408A1 (en) | 2022-05-12 |
US11977934B2 true US11977934B2 (en) | 2024-05-07 |
Family
ID=81454453
Family Applications (5)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/525,749 Active US11977934B2 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on KUBERNETES |
US17/525,755 Pending US20220147407A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
US17/525,757 Pending US20220147542A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
US17/525,760 Pending US20220147433A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
US17/525,767 Pending US20220147434A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
Family Applications After (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/525,755 Pending US20220147407A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
US17/525,757 Pending US20220147542A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
US17/525,760 Pending US20220147433A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
US17/525,767 Pending US20220147434A1 (en) | 2020-11-12 | 2021-11-12 | Automation solutions for event logging and debugging on kubernetes |
Country Status (1)
Country | Link |
---|---|
US (5) | US11977934B2 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11977934B2 (en) | 2020-11-12 | 2024-05-07 | New Relic, Inc. | Automation solutions for event logging and debugging on KUBERNETES |
US11928529B2 (en) * | 2021-10-21 | 2024-03-12 | New Relic, Inc. | High-throughput BPF map manipulations with uprobes |
US11659027B1 (en) * | 2022-01-06 | 2023-05-23 | Vmware, Inc. | Multi-network/domain service discovery in a container orchestration platform |
US20240104221A1 (en) * | 2022-09-23 | 2024-03-28 | International Business Machines Corporation | AUTOMATED TESTING OF OPERATING SYSTEM (OS) KERNEL HELPER FUNCTIONS ACCESSIBLE THROUGH EXTENDED BPF (eBPF) FILTERS |
CN116257841B (en) * | 2023-02-16 | 2024-01-26 | 北京未来智安科技有限公司 | Function processing method and device based on Kubernetes |
Citations (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020007468A1 (en) | 2000-05-02 | 2002-01-17 | Sun Microsystems, Inc. | Method and system for achieving high availability in a networked computer system |
US20020019870A1 (en) | 2000-06-29 | 2002-02-14 | International Business Machines Corporation | Proactive on-line diagnostics in a manageable network |
US20020174218A1 (en) * | 2001-05-18 | 2002-11-21 | Dick Kevin Stewart | System, method and computer program product for analyzing data from network-based structured message stream |
US20030028509A1 (en) | 2001-08-06 | 2003-02-06 | Adam Sah | Storage of row-column data |
US20030220740A1 (en) | 2000-04-18 | 2003-11-27 | Intriligator Devrie S. | Space weather prediction system and method |
US20030220984A1 (en) | 2001-12-12 | 2003-11-27 | Jones Paul David | Method and system for preloading resources |
US20050210133A1 (en) | 2004-03-12 | 2005-09-22 | Danilo Florissi | Method and apparatus for determining monitoring locations in distributed systems |
US20060056285A1 (en) * | 2004-09-16 | 2006-03-16 | Krajewski John J Iii | Configuring redundancy in a supervisory process control system |
US20060092179A1 (en) | 2004-10-18 | 2006-05-04 | Xanavi Informatics Corporation | Navigation apparatus, map data distribution apparatus, map data distribution system and map display method |
US20060182034A1 (en) * | 2002-12-13 | 2006-08-17 | Eric Klinker | Topology aware route control |
US20090287791A1 (en) * | 2008-05-19 | 2009-11-19 | Timothy Mackey | Systems and methods for automatically testing an application |
US20130054603A1 (en) | 2010-06-25 | 2013-02-28 | U.S. Govt. As Repr. By The Secretary Of The Army | Method and apparatus for classifying known specimens and media using spectral properties and identifying unknown specimens and media |
US20130097320A1 (en) | 2011-10-14 | 2013-04-18 | Sap Ag | Business Network Access Protocol for the Business Network |
US20140040275A1 (en) | 2010-02-09 | 2014-02-06 | Siemens Corporation | Semantic search tool for document tagging, indexing and search |
US20140136726A1 (en) * | 2007-10-24 | 2014-05-15 | Social Communications Company | Realtime kernel |
US20140181274A1 (en) | 2012-12-10 | 2014-06-26 | Guillaume Yves Bernard BAZIN | System and method for ip network semantic label storage and management |
US20140359719A1 (en) | 2012-10-22 | 2014-12-04 | Panasonic Corporation | Content management device, content management method, and integrated circuit |
US20150019553A1 (en) | 2013-07-11 | 2015-01-15 | Neura, Inc. | Data consolidation mechanisms for internet of things integration platform |
US20150261886A1 (en) | 2014-03-13 | 2015-09-17 | International Business Machines Corporation | Adaptive sampling schemes for clustering streaming graphs |
US20150293660A1 (en) | 2014-04-10 | 2015-10-15 | Htc Corporation | Method And Device For Managing Information |
US20150363702A1 (en) | 2014-06-16 | 2015-12-17 | Eric Burton Baum | System, apparatus and method for supporting formal verification of informal inference on a computer |
US20160026919A1 (en) | 2014-07-24 | 2016-01-28 | Agt International Gmbh | System and method for social event detection |
US20180285744A1 (en) | 2017-04-04 | 2018-10-04 | Electronics And Telecommunications Research Institute | System and method for generating multimedia knowledge base |
US20190140983A1 (en) | 2017-11-09 | 2019-05-09 | Nicira, Inc. | Extensible virtual switch datapath |
US20190173841A1 (en) * | 2017-12-06 | 2019-06-06 | Nicira, Inc. | Load balancing ipsec tunnel processing with extended berkeley packet filer (ebpf) |
US20190324882A1 (en) | 2018-04-20 | 2019-10-24 | Draios, Inc. | Programmatic container monitoring |
US20200145337A1 (en) | 2019-12-20 | 2020-05-07 | Brian Andrew Keating | Automated platform resource management in edge computing environments |
US20200193017A1 (en) * | 2016-10-24 | 2020-06-18 | Nubeva, Inc. | Leveraging Instrumentation Capabilities to Enable Monitoring Services |
US20200220794A1 (en) * | 2019-01-08 | 2020-07-09 | International Business Machines Corporation | Method and system for monitoing communication in a network |
US10747875B1 (en) | 2020-03-19 | 2020-08-18 | Cyberark Software Ltd. | Customizing operating system kernels with secure kernel modules |
US20200389531A1 (en) | 2018-07-13 | 2020-12-10 | Samsung Electronics Co., Ltd. | Method and electronic device for edge computing service |
US20200409780A1 (en) | 2019-06-27 | 2020-12-31 | Capital One Services, Llc | Baseline modeling for application dependency discovery, reporting, and management tool |
US20210058424A1 (en) * | 2019-08-21 | 2021-02-25 | Nokia Solutions And Networks Oy | Anomaly detection for microservices |
US20220147542A1 (en) | 2020-11-12 | 2022-05-12 | New Relic, Inc. | Automation solutions for event logging and debugging on kubernetes |
US20230104007A1 (en) | 2021-10-06 | 2023-04-06 | Cisco Technology, Inc. | Policy-based failure handling for edge services |
US20230168986A1 (en) | 2019-07-25 | 2023-06-01 | Deepfactor, Inc. | Systems, methods, and computer-readable media for analyzing intercepted telemetry events to generate vulnerability reports |
US20230231830A1 (en) | 2022-01-18 | 2023-07-20 | Korea Advanced Institute Of Science And Technology | High-speed network packet payload inspection system based on ebpf (extended berkeley packet filter)/xdp (express data path) for container environment |
US11709720B1 (en) | 2022-02-25 | 2023-07-25 | Datadog, Inc. | Protocol for correlating user space data with kernel space data |
2021
- 2021-11-12 US US17/525,749 patent/US11977934B2/en active Active
- 2021-11-12 US US17/525,755 patent/US20220147407A1/en active Pending
- 2021-11-12 US US17/525,757 patent/US20220147542A1/en active Pending
- 2021-11-12 US US17/525,760 patent/US20220147433A1/en active Pending
- 2021-11-12 US US17/525,767 patent/US20220147434A1/en active Pending
Patent Citations (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030220740A1 (en) | 2000-04-18 | 2003-11-27 | Intriligator Devrie S. | Space weather prediction system and method |
US20020007468A1 (en) | 2000-05-02 | 2002-01-17 | Sun Microsystems, Inc. | Method and system for achieving high availability in a networked computer system |
US20020019870A1 (en) | 2000-06-29 | 2002-02-14 | International Business Machines Corporation | Proactive on-line diagnostics in a manageable network |
US20020174218A1 (en) * | 2001-05-18 | 2002-11-21 | Dick Kevin Stewart | System, method and computer program product for analyzing data from network-based structured message stream |
US20030028509A1 (en) | 2001-08-06 | 2003-02-06 | Adam Sah | Storage of row-column data |
US20030220984A1 (en) | 2001-12-12 | 2003-11-27 | Jones Paul David | Method and system for preloading resources |
US20060182034A1 (en) * | 2002-12-13 | 2006-08-17 | Eric Klinker | Topology aware route control |
US20050210133A1 (en) | 2004-03-12 | 2005-09-22 | Danilo Florissi | Method and apparatus for determining monitoring locations in distributed systems |
US20060056285A1 (en) * | 2004-09-16 | 2006-03-16 | Krajewski John J Iii | Configuring redundancy in a supervisory process control system |
US20060092179A1 (en) | 2004-10-18 | 2006-05-04 | Xanavi Informatics Corporation | Navigation apparatus, map data distribution apparatus, map data distribution system and map display method |
US20140136726A1 (en) * | 2007-10-24 | 2014-05-15 | Social Communications Company | Realtime kernel |
US20090287791A1 (en) * | 2008-05-19 | 2009-11-19 | Timothy Mackey | Systems and methods for automatically testing an application |
US20140040275A1 (en) | 2010-02-09 | 2014-02-06 | Siemens Corporation | Semantic search tool for document tagging, indexing and search |
US20130054603A1 (en) | 2010-06-25 | 2013-02-28 | U.S. Govt. As Repr. By The Secretary Of The Army | Method and apparatus for classifying known specimens and media using spectral properties and identifying unknown specimens and media |
US20130097320A1 (en) | 2011-10-14 | 2013-04-18 | Sap Ag | Business Network Access Protocol for the Business Network |
US20140359719A1 (en) | 2012-10-22 | 2014-12-04 | Panasonic Corporation | Content management device, content management method, and integrated circuit |
US20140181274A1 (en) | 2012-12-10 | 2014-06-26 | Guillaume Yves Bernard BAZIN | System and method for ip network semantic label storage and management |
US20150019553A1 (en) | 2013-07-11 | 2015-01-15 | Neura, Inc. | Data consolidation mechanisms for internet of things integration platform |
US20150261886A1 (en) | 2014-03-13 | 2015-09-17 | International Business Machines Corporation | Adaptive sampling schemes for clustering streaming graphs |
US20150293660A1 (en) | 2014-04-10 | 2015-10-15 | Htc Corporation | Method And Device For Managing Information |
US20150363702A1 (en) | 2014-06-16 | 2015-12-17 | Eric Burton Baum | System, apparatus and method for supporting formal verification of informal inference on a computer |
US20160026919A1 (en) | 2014-07-24 | 2016-01-28 | Agt International Gmbh | System and method for social event detection |
US20200193017A1 (en) * | 2016-10-24 | 2020-06-18 | Nubeva, Inc. | Leveraging Instrumentation Capabilities to Enable Monitoring Services |
US20180285744A1 (en) | 2017-04-04 | 2018-10-04 | Electronics And Telecommunications Research Institute | System and method for generating multimedia knowledge base |
US20190140983A1 (en) | 2017-11-09 | 2019-05-09 | Nicira, Inc. | Extensible virtual switch datapath |
US20190173841A1 (en) * | 2017-12-06 | 2019-06-06 | Nicira, Inc. | Load balancing ipsec tunnel processing with extended berkeley packet filer (ebpf) |
US20190324882A1 (en) | 2018-04-20 | 2019-10-24 | Draios, Inc. | Programmatic container monitoring |
US20200389531A1 (en) | 2018-07-13 | 2020-12-10 | Samsung Electronics Co., Ltd. | Method and electronic device for edge computing service |
US20200220794A1 (en) * | 2019-01-08 | 2020-07-09 | International Business Machines Corporation | Method and system for monitoring communication in a network |
US20200409780A1 (en) | 2019-06-27 | 2020-12-31 | Capital One Services, Llc | Baseline modeling for application dependency discovery, reporting, and management tool |
US20230168986A1 (en) | 2019-07-25 | 2023-06-01 | Deepfactor, Inc. | Systems, methods, and computer-readable media for analyzing intercepted telemetry events to generate vulnerability reports |
US20210058424A1 (en) * | 2019-08-21 | 2021-02-25 | Nokia Solutions And Networks Oy | Anomaly detection for microservices |
US20200145337A1 (en) | 2019-12-20 | 2020-05-07 | Brian Andrew Keating | Automated platform resource management in edge computing environments |
US10747875B1 (en) | 2020-03-19 | 2020-08-18 | Cyberark Software Ltd. | Customizing operating system kernels with secure kernel modules |
US20220147542A1 (en) | 2020-11-12 | 2022-05-12 | New Relic, Inc. | Automation solutions for event logging and debugging on kubernetes |
US20230104007A1 (en) | 2021-10-06 | 2023-04-06 | Cisco Technology, Inc. | Policy-based failure handling for edge services |
US20230231830A1 (en) | 2022-01-18 | 2023-07-20 | Korea Advanced Institute Of Science And Technology | High-speed network packet payload inspection system based on ebpf (extended berkeley packet filter)/xdp (express data path) for container environment |
US11709720B1 (en) | 2022-02-25 | 2023-07-25 | Datadog, Inc. | Protocol for correlating user space data with kernel space data |
Non-Patent Citations (2)
Title |
---|
Asgar, Zain Mohamed; Issue Notification for U.S. Appl. No. 17/525,755, filed Nov. 12, 2021, mailed Mar. 13, 2024, 2 pgs. |
Asgar, Zain Mohamed; Notice of Allowance for U.S. Appl. No. 17/525,755, filed Nov. 12, 2021, mailed Dec. 28, 2023, 13 pgs. |
Also Published As
Publication number | Publication date |
---|---|
US20220147433A1 (en) | 2022-05-12 |
US20220147407A1 (en) | 2022-05-12 |
US20220147542A1 (en) | 2022-05-12 |
US20220147408A1 (en) | 2022-05-12 |
US20220147434A1 (en) | 2022-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11977934B2 (en) | Automation solutions for event logging and debugging on KUBERNETES | |
US11226795B2 (en) | Efficient state machines for real-time dataflow programming | |
CN108475360B (en) | Distributed computing dependency management system | |
US20190095478A1 (en) | Information technology networked entity monitoring with automatic reliability scoring | |
US10949178B1 (en) | Method and system for decomposing a global application programming interface (API) graph into an application-specific API subgraph | |
US20180314745A1 (en) | Dynamically-generated files for visualization sharing | |
US11323463B2 (en) | Generating data structures representing relationships among entities of a high-scale network infrastructure | |
US20220414119A1 (en) | Data source metric visualizations | |
US11669533B1 (en) | Inferring sourcetype based on match rates for rule packages | |
US10225375B2 (en) | Networked device management data collection | |
US20130268314A1 (en) | Brand analysis using interactions with search result items | |
WO2021072742A1 (en) | Assessing an impact of an upgrade to computer software | |
US20240137278A1 (en) | Cloud migration data analysis method using system process information, and system thereof | |
US11676345B1 (en) | Automated adaptive workflows in an extended reality environment | |
Baresi et al. | Microservice architecture practices and experience: a focused look on docker configuration files | |
US10782944B2 (en) | Optimizing a cache of compiled expressions by removing variability | |
CN112187509A (en) | Multi-architecture cloud platform execution log management method, system, terminal and storage medium | |
US11734297B1 (en) | Monitoring platform job integration in computer analytics system | |
Eriksson et al. | A comparative analysis of log management solutions: ELK stack versus PLG stack | |
Mengistu | Distributed Microservice Tracing Systems: Open-source tracing implementation for distributed Microservices build in Spring framework | |
US11704285B1 (en) | Metrics and log integration | |
US11487602B2 (en) | Multi-tenant integration environment | |
Sun et al. | Design and Development of a Log Management System Based on Cloud Native Architecture | |
CN115757570A (en) | Log data analysis method and device, electronic equipment and medium | |
Kancherla | A Smart Web Crawler for a Concept Based Semantic Search Engine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
AS | Assignment |
Owner name: NEW RELIC, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ASGAR, ZAIN MOHAMED;AZIZI, OMID JALAL;BARTLETT, JAMES MICHAEL;AND OTHERS;SIGNING DATES FROM 20210804 TO 20210915;REEL/FRAME:059429/0753 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
AS | Assignment |
Owner name: BLUE OWL CAPITAL CORPORATION, AS COLLATERAL AGENT, NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:NEW RELIC, INC.;REEL/FRAME:065491/0507 Effective date: 20231108 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
ZAAB | Notice of allowance mailed |
Free format text: ORIGINAL CODE: MN/=. |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |