CN117743181B - System for constructing observable control surface - Google Patents

Info

Publication number
CN117743181B
Authority
CN
China
Prior art keywords
automatic
cluster
automatic measurement
data
cluster resources
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311794383.XA
Other languages
Chinese (zh)
Other versions
CN117743181A (en)
Inventor
操润贴
张新铭
王徐
鲁源源
汪勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Yunzhe Technology Co ltd
Original Assignee
Hangzhou Yunzhe Technology Co ltd
Filing date
Publication date
Application filed by Hangzhou Yunzhe Technology Co ltd
Priority to CN202311794383.XA
Publication of CN117743181A
Application granted
Publication of CN117743181B
Legal status: Active
Anticipated expiration

Abstract

The application discloses a system for constructing an observable control plane, which relates to the technical field of application observability and comprises a console, an instrumentor, an autoscaler and a scheduler. The console provides a user management interface and, for cluster resources for which automatic measurement is to be enabled, an operation interface for automatic instrumentation and automatic measurement together with data pipeline configuration. The instrumentor monitors the original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detects the programming language of those cluster resources, and executes automatic instrumentation and automatic measurement according to the programming language. The autoscaler deploys and installs data collectors to create data pipelines. The scheduler allocates the cluster resources for which automatic measurement is to be enabled to the data pipelines. The application achieves automatic instrumentation and automatic measurement without intruding on application code or deployment modes.

Description

System for constructing observable control surface
Technical Field
The application relates to the technical field of application observability, and in particular to a system for constructing an observable control plane.
Background
Application observability is one link in the IT construction of a modern enterprise, and its importance is self-evident. In actual production environments, however, enterprises mostly adopt open-source schemes such as zipkin and jaeger, or commercial products such as arms and datadog. The solutions these products provide have, to a greater or lesser degree, the following problems in collection access and data delivery pipelines:
first, they are intrusive: the related system must be modified or its deployment mode adjusted, and the code often has to be changed as well, so the investment is large;
second, they are not language-independent: these schemes are essentially bound to a particular programming language, giving priority to JVM-based development languages such as java, with limited support for other languages;
third, they are bound to a technology stack: they are difficult to make compatible with existing open-source schemes or commercial products, and pluggable flexibility cannot be achieved.
Disclosure of Invention
The application provides a system for constructing an observable control plane, which aims to solve the problems that prior-art application observability schemes have in collection access and data delivery pipelines.
In order to achieve the above purpose, the present application adopts the following technical scheme:
The application discloses a system for constructing an observable control plane, which comprises a console, an instrumentor, an autoscaler and a scheduler, wherein:
the console is used for providing a user management interface and, for cluster resources for which automatic measurement is to be enabled, an operation interface for automatic instrumentation and automatic measurement together with data pipeline configuration;
the instrumentor is used for monitoring original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detecting the programming language of those cluster resources, and executing automatic instrumentation and automatic measurement according to the programming language;
the autoscaler is used for deploying and installing a data collector to create a data pipeline;
the scheduler is used for allocating the cluster resources for which automatic measurement is to be enabled to the data pipeline.
Preferably, the console provides two data pipeline configuration modes: the first is a system default mode, in which a default data collector is configured for all cluster resources for which automatic measurement is to be enabled; the second is a user selection mode, in which different data collectors are allocated to the cluster resources for which automatic measurement is to be enabled.
Preferably, in the user selection mode of the console, cluster resources for which automatic measurement is to be enabled can be selected in batches, and a corresponding data receiver is selected for them.
Preferably, the console submits the configuration of the current data collector and calls the application program interface of the cluster to attach a label to the cluster resources for which automatic measurement is to be enabled, indicating that automatic measurement is enabled.
Preferably, the instrumentor supports two triggering modes: the first is active triggering when a user operates the console to adjust the data pipeline configuration; the second is rescheduling by the scheduler when the data pipeline changes.
Preferably, the instrumentor invokes the Reconcile interface of the cluster controller.
Preferably, monitoring the original cluster resources in real time includes monitoring, in real time, creation, update or deletion events and state changes of the original cluster resources.
Preferably, the instrumentor calls the application program interface of the cluster to create a language detection pod and deploy it on the same node as the target instance, so that it can view the file system of the target pod.
Preferably, detecting the programming language of the cluster resources for which automatic measurement is to be enabled includes the following steps:
acquiring and traversing the process information under the host directory of the original cluster resource whose automatic measurement label is enabled;
matching the corresponding process according to the podId and container name for which a change was observed, and obtaining the cmdline command line of the matched process;
parsing the cmdline command line to obtain cmdline features, and determining the programming language of the cluster resource according to the cmdline features.
Preferably, the triggering conditions of the autoscaler include a data receiving back-end service being added at the console, or the data transmission volume of a program being too large.
Compared with the prior art, the invention has the following beneficial effects:
1. it is non-intrusive: automatic instrumentation and automatic measurement are achieved without intruding on application code or deployment modes;
2. it is compatible with a variety of popular observability products, tools and terminals, and can connect quickly to the data collectors of various observability products;
3. it interfaces with any existing solution that supports opentelemetry, with no configuration and zero deployment.
Drawings
To illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a system architecture diagram of the present application;
FIG. 2 is a data interaction diagram of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The embodiments described are obviously only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the application.
The terms "first", "second" and the like in the claims and the description of the application are used to distinguish between similar objects and not necessarily to describe a particular sequential or chronological order. It is to be understood that terms so used are interchangeable where appropriate; they merely describe how objects of the same nature are distinguished in the embodiments of the application. Furthermore, the terms "comprise" and "have" and any variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, system, article or apparatus that comprises a list of elements is not necessarily limited to those elements but may include other elements not expressly listed or inherent to such process, method, article or apparatus.
As shown in FIG. 1, the present application provides a system for constructing an observable control plane, comprising a console, an instrumentor, an autoscaler and a scheduler, wherein:
the console is used for providing a user management interface and, for cluster resources for which automatic measurement is to be enabled, an operation interface for automatic instrumentation and automatic measurement together with data pipeline configuration;
the instrumentor is used for monitoring original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detecting the programming language of those cluster resources, and executing automatic instrumentation and automatic measurement according to the programming language;
the autoscaler is used for deploying and installing a data collector to create a data pipeline;
the scheduler is used for allocating the cluster resources for which automatic measurement is to be enabled to the data pipeline.
The system for constructing an observable control plane provided in this embodiment includes four modules, namely the console (Console), the instrumentor (Instrumentor), the autoscaler (Autoscaler) and the scheduler (Scheduler), all of which are installed from a Helm Chart and communicate with one another through the Kubernetes API Server, as shown in FIG. 2.
The console mainly provides the user with a concise, easy-to-use UI management interface, and provides an operation interface and data pipeline configuration for the Kubernetes cluster resources, i.e. application systems, for which automatic measurement is to be enabled. The operation interface is only the trigger entry; the actual instrumentation and measurement logic is implemented by the instrumentor. The console also provides two data pipeline configuration modes: a system default mode, in which all collected application systems use the default data collector, and a user selection mode, in which different data collectors are configured for different application systems or namespaces, giving the application systems for which automatic measurement is to be enabled a more flexible configuration. After the configuration is submitted, the console calls the kubernetes api to attach a label to the workloads or namespaces for which automatic measurement is to be enabled, indicating that automatic measurement is enabled for the application system.
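The labeling step performed by the console can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the label key `observability.example.com/auto-instrument` and the helper names are hypothetical, and the patch body would be sent to the Kubernetes API server by whatever client the console actually uses.

```python
# Hypothetical label key; the patent does not name the actual label.
AUTO_LABEL = "observability.example.com/auto-instrument"


def build_label_patch(enabled: bool) -> dict:
    """Build a JSON merge patch that sets or clears the auto-measurement label.

    In a JSON merge patch, a null value removes the key.
    """
    value = "enabled" if enabled else None
    return {"metadata": {"labels": {AUTO_LABEL: value}}}


def apply_merge_patch(resource: dict, patch: dict) -> dict:
    """Apply a JSON merge patch to a copy of a manifest (local simulation only)."""
    result = dict(resource)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        elif isinstance(value, dict):
            base = result.get(key)
            result[key] = apply_merge_patch(base if isinstance(base, dict) else {}, value)
        else:
            result[key] = value
    return result


def is_marked(resource: dict) -> bool:
    """Return True if a manifest carries the auto-measurement label."""
    labels = resource.get("metadata", {}).get("labels") or {}
    return labels.get(AUTO_LABEL) == "enabled"
```

Applying `build_label_patch(True)` to a Deployment or Namespace manifest marks it; applying `build_label_patch(False)` removes the mark again while leaving other labels untouched.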
The core responsibility of the instrumentor (Instrumentor) is to dynamically discover the Kubernetes cluster resources, i.e. applications, for which automatic measurement is to be enabled, to detect their development language, and to execute automatic instrumentation and automatic measurement. Specifically, the instrumentor supports two trigger logics: one is triggered when the user operates the console to adjust the data pipeline configuration, and the other when the scheduler reschedules because the data pipeline has changed. The instrumentor can collect all applications in the cluster together with information such as their language and namespace and save it in Kubernetes persistent storage; the information can also be stored in the console database, for example a transactional database such as mysql.
To implement automatic measurement for each newly created application, the instrumentor must include a key component, instrumentlet, which is deployed to each node of the Kubernetes cluster as a DaemonSet. One of its tasks is to detect the programming language of the application; once the language is detected, different automatic instrumentation and automatic measurement are performed for different languages, for example using the OpenTelemetry SDK for runtime languages and eBPF for compiled languages such as golang. To detect the programming language of the application for which automatic measurement is to be enabled, the module uses the application program interface of the cluster, i.e. the Kubernetes api, to create a language detection pod and deploy it on the same node as the target instance; this pod can view the file system of the target pod.
The instrumentor also invokes the Reconcile interface of the Kubernetes cluster controller and, through this interface, monitors in real time events such as the creation, update or deletion of Kubernetes cluster resources, as well as state changes. When it observes a change to a cluster resource or namespace, i.e. the labeling operation of the console, it creates different automatic instrumentation and automatic measurement approaches for applications of different languages.
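The per-language dispatch described above can be illustrated with a small lookup. This is a sketch only: the `Strategy` enum, the language keys and `choose_strategy` are hypothetical names introduced for illustration, and the mapping simply restates the pairing given in the text (java agent for java, eBPF for golang, opentelemetry-instrumentation for scripting languages).

```python
from enum import Enum


class Strategy(Enum):
    """Instrumentation approaches named in the description (labels are illustrative)."""
    JAVA_AGENT = "javaagent injected through an init container"
    EBPF = "eBPF probes"
    OTEL_AUTO = "opentelemetry-instrumentation auto-injection"
    UNSUPPORTED = "no automatic instrumentation"


# Assumed language identifiers; the patent does not fix their spelling.
STRATEGY_BY_LANGUAGE = {
    "java": Strategy.JAVA_AGENT,
    "go": Strategy.EBPF,
    "golang": Strategy.EBPF,
    "nodejs": Strategy.OTEL_AUTO,
    "python": Strategy.OTEL_AUTO,
}


def choose_strategy(language: str) -> Strategy:
    """Pick the instrumentation approach for a detected language."""
    return STRATEGY_BY_LANGUAGE.get(language.strip().lower(), Strategy.UNSUPPORTED)
```

Keeping the mapping in a table rather than in branching code makes it straightforward to add a language later without touching the dispatch logic.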
Creating different automatic instrumentation and automatic measurement approaches for applications of different languages specifically includes the following:
For a java application, the java agent (probe) is used, and non-intrusive, non-disruptive instrumentation is achieved by means of kubernetes features and paradigms. The process is as follows:
a) modify the resource definition of the observed service using the kubernetes webhook mechanism, injecting environment variables and an init container;
b) mount a directory through the init container so that javaagent is shared with the observed service;
c) load javaagent automatically by means of the jvm environment variables JAVA_TOOL_OPTIONS and JAVA_OPTS;
d) in combination with the user-configured data pipeline or the system default data pipeline, add the variable -Dotel.exporter.otlp.endpoint=http://%s:%d to the jvm startup command line, i.e. add a collector that meets the otel specifications and standards.
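Steps a) to d) can be sketched as a pure function that rewrites a pod spec, which is roughly what a mutating webhook would return. This is an illustration under assumptions, not the actual webhook: the volume name, mount path, agent image and the `cp` command are hypothetical, while `JAVA_TOOL_OPTIONS`, `-javaagent` and `otel.exporter.otlp.endpoint` are the names used by the JVM and the OpenTelemetry Java agent.

```python
import copy

# Hypothetical names chosen for this sketch only.
AGENT_DIR = "/otel-agent"
AGENT_JAR = AGENT_DIR + "/javaagent.jar"
AGENT_IMAGE = "example.com/otel-javaagent:latest"


def inject_java_agent(pod_spec: dict, collector_host: str, collector_port: int) -> dict:
    """Return a copy of pod_spec with the java agent wired in."""
    spec = copy.deepcopy(pod_spec)
    mount = {"name": "otel-agent", "mountPath": AGENT_DIR}

    # a) + b): an init container copies the agent into a volume shared with the app.
    spec.setdefault("volumes", []).append({"name": "otel-agent", "emptyDir": {}})
    spec.setdefault("initContainers", []).append({
        "name": "otel-agent-init",
        "image": AGENT_IMAGE,
        "command": ["cp", "/javaagent.jar", AGENT_JAR],
        "volumeMounts": [dict(mount)],
    })

    # c) + d): the JVM picks up the agent and the collector endpoint from the environment.
    options = "-javaagent:%s -Dotel.exporter.otlp.endpoint=http://%s:%d" % (
        AGENT_JAR, collector_host, collector_port)
    for container in spec.get("containers", []):
        container.setdefault("volumeMounts", []).append(dict(mount))
        env = container.setdefault("env", [])
        existing = next((e for e in env if e.get("name") == "JAVA_TOOL_OPTIONS"), None)
        if existing is None:
            env.append({"name": "JAVA_TOOL_OPTIONS", "value": options})
        else:
            # Preserve options the application already set.
            existing["value"] = (existing.get("value", "") + " " + options).strip()
    return spec
```

Because the function works on a deep copy, the original manifest is left unchanged, which mirrors the non-intrusive goal: the application's own definition is never edited by hand.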
For a golang application, ebpf is used to implement automatic instrumentation and measurement. Specifically, probe plugins for different protocols such as http and grpc are developed on the basis of opentelemetry-go-instrumentation in combination with ebpf. Secondary development is also supported: plugins for other protocols or middleware only need to be developed according to the ebpf specifications.
For other scripting languages, such as nodejs and python, the automatic measurement of opentelemetry-instrumentation can be used directly, with automatic injection performed by the instrumentor (Instrumentor).
The specific process of detecting the programming language of the application for which automatic measurement is to be enabled is as follows: when the instrumentlet component observes, through the Reconcile interface of the Kubernetes cluster controller, that any cluster resource such as a deployment, statefulset, daemonset or namespace has the automatic measurement label enabled, it acquires the process information under the proc directory of the node host and traverses all of it; it uses the podId and container name for which the change was observed to match the corresponding process, obtains the cmdline command line of the matched process, parses the cmdline, and determines the programming language of the application from the cmdline features obtained by parsing.
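The cmdline-parsing part of this process can be sketched as below. The patent does not list the cmdline features it uses, so the heuristics here (executable name, `-jar`, `-javaagent:`) are assumptions for illustration; matching a process to a podId and container name is omitted. The only fact relied on is that Linux exposes a process's arguments in `/proc/<pid>/cmdline` as NUL-separated bytes.

```python
import os
import re


def read_cmdline(pid: int, proc_root: str = "/proc") -> bytes:
    """Read the raw cmdline of a process; return b"" if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as handle:
            return handle.read()
    except OSError:
        return b""


def parse_cmdline(raw: bytes) -> list:
    """Split the NUL-separated cmdline into a list of arguments."""
    return [part.decode("utf-8", "replace") for part in raw.split(b"\0") if part]


def detect_language(argv: list) -> str:
    """Guess the language from cmdline features (illustrative heuristics only)."""
    if not argv:
        return "unknown"
    exe = os.path.basename(argv[0]).lower()
    if exe == "java" or any(a == "-jar" or a.startswith("-javaagent:") for a in argv[1:]):
        return "java"
    if re.fullmatch(r"python(\d+(\.\d+)*)?", exe):
        return "python"
    if exe in ("node", "nodejs"):
        return "nodejs"
    # Compiled binaries (e.g. golang) expose no interpreter in the cmdline;
    # this sketch leaves them undetermined.
    return "unknown"
```

A caller would combine the three, for example `detect_language(parse_cmdline(read_cmdline(pid)))`, for each process matched to the changed pod.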
The autoscaler (Autoscaler) is mainly responsible for the automatic deployment and installation of data collectors. It is triggered in two cases: one is when a data receiving back-end service is added at the console, which generally means filling in an endpoint; the other is when the data transmission volume of some programs is observed to be so large that it puts excessive pressure on the data receiving service, in which case a traffic analysis program in the autoscaler decides to automatically add a dedicated data collection service for those programs.
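The two trigger conditions can be expressed as a simple decision function. This is a sketch under assumptions: the patent does not say how traffic is measured or what counts as too large, so `ProgramTraffic`, the bytes-per-second unit and the threshold parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgramTraffic:
    """Observed telemetry volume of one program (unit is an assumption)."""
    name: str
    bytes_per_second: float


def programs_needing_dedicated_collector(traffic, threshold_bytes_per_second):
    """Return the names of programs whose volume exceeds the threshold."""
    return sorted(t.name for t in traffic
                  if t.bytes_per_second > threshold_bytes_per_second)


def should_scale(new_backend_endpoints, traffic, threshold_bytes_per_second):
    """True if either trigger applies: a new back end was added, or a program is too chatty."""
    heavy = programs_needing_dedicated_collector(traffic, threshold_bytes_per_second)
    return bool(new_backend_endpoints) or bool(heavy)
```

In this sketch the first trigger is represented by a non-empty list of newly added endpoints, and the second by any program exceeding the threshold.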
The scheduler (Scheduler) is mainly responsible for allocating the cluster resources for which automatic measurement is to be enabled to the data pipelines created by the autoscaler, where a data pipeline is the channel from a collected data source to the final observability service.
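The allocation can be sketched in terms of the two console modes: every resource falls back to the default pipeline unless the user selected one for the resource or its namespace. The function and pipeline names are hypothetical, and the rule that a per-resource choice overrides a per-namespace choice is an assumption made for this sketch.

```python
DEFAULT_PIPELINE = "default-collector"  # hypothetical pipeline name


def assign_pipelines(resources, user_choices=None, default=DEFAULT_PIPELINE):
    """Map each (namespace, name) resource to a data pipeline.

    user_choices may be keyed by a (namespace, name) tuple or by a namespace string.
    """
    user_choices = user_choices or {}
    assignment = {}
    for namespace, name in resources:
        # Assumed precedence: resource-level choice, then namespace-level, then default.
        pipeline = (user_choices.get((namespace, name))
                    or user_choices.get(namespace)
                    or default)
        assignment[(namespace, name)] = pipeline
    return assignment
```

Calling `assign_pipelines(resources)` with no choices corresponds to the system default mode; passing `user_choices` corresponds to the user selection mode.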
The system provided by this embodiment is mainly based on the kubernetes cloud-native environment and makes full use of kubernetes CRDs, design paradigms and the openness of the kubernetes api to address observability for distributed microservices. No code needs to be modified and no application deployment mode adjusted: in combination with the data collector configuration of the console, the configuration is read from kubernetes storage and the delivery variables or configuration (configmap) are injected into the application's collection probe, so that automatic discovery, automatic instrumentation and automatic measurement of applications are achieved.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The foregoing is merely illustrative of specific embodiments of the present invention, and the scope of the present invention is not limited thereto, but any changes or substitutions within the technical scope of the present invention should be covered by the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A system for constructing an observable control plane, comprising a console, an instrumentor, an autoscaler and a scheduler, wherein:
the console is used for providing a user management interface, providing, for cluster resources for which automatic measurement is to be enabled, an operation interface for automatic instrumentation and automatic measurement and data pipeline configuration, submitting the configuration of the current data collector, and calling an application program interface of the cluster to attach a label to the cluster resources for which automatic measurement is to be enabled, indicating that automatic measurement is enabled;
the instrumentor is used for monitoring original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detecting the programming language of those cluster resources, and executing different automatic instrumentation and automatic measurement according to different programming languages, including:
for runtime languages, using the OpenTelemetry SDK to implement automatic measurement;
for compiled languages, using eBPF to implement automatic measurement;
the autoscaler is used for deploying and installing a data collector to create a data pipeline;
the scheduler is used for allocating the cluster resources for which automatic measurement is to be enabled to the data pipeline;
wherein detecting the programming language of the cluster resources for which automatic measurement is to be enabled comprises the following steps:
acquiring and traversing the process information under the host directory of the original cluster resource whose automatic measurement label is enabled;
matching the corresponding process according to the podId and container name for which a change was observed, and obtaining the cmdline command line of the matched process;
parsing the cmdline command line to obtain cmdline features, and determining the programming language of the cluster resource according to the cmdline features.
2. The system for constructing an observable control plane of claim 1, wherein the console provides two data pipeline configuration modes: the first is a system default mode, in which a default data collector is configured for all cluster resources for which automatic measurement is to be enabled; the second is a user selection mode, in which different data collectors are allocated to the cluster resources for which automatic measurement is to be enabled.
3. The system for constructing an observable control plane of claim 2, wherein, in the user selection mode of the console, cluster resources for which automatic measurement is to be enabled can be selected in batches and a corresponding data receiver is selected for them.
4. The system for constructing an observable control plane of claim 1, wherein the instrumentor supports two triggering modes: the first is active triggering when a user operates the console to adjust the data pipeline configuration; the second is rescheduling by the scheduler when the data pipeline changes.
5. The system for constructing an observable control plane of claim 1, wherein the instrumentor invokes a Reconcile interface of a cluster controller.
6. The system for constructing an observable control plane of claim 1, wherein monitoring the original cluster resources in real time includes monitoring, in real time, creation, update or deletion events and state changes of the original cluster resources.
7. The system for constructing an observable control plane of claim 1, wherein the instrumentor calls an application program interface of the cluster to create a language detection pod and deploy it on the same node as the target instance, so that it can view the file system of the target pod.
8. The system for constructing an observable control plane of claim 1, wherein the triggering conditions of the autoscaler include a data receiving back-end service being added at the console, or the data transmission volume of a program being too large.
CN202311794383.XA 2023-12-25 System for constructing observable control surface Active CN117743181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311794383.XA CN117743181B (en) 2023-12-25 System for constructing observable control surface

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311794383.XA CN117743181B (en) 2023-12-25 System for constructing observable control surface

Publications (2)

Publication Number Publication Date
CN117743181A (en) 2024-03-22
CN117743181B (en) 2024-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant