CN117743181A - System for constructing observable control surface - Google Patents

Publication number
CN117743181A
Authority
CN
China
Prior art keywords
automatic
automatic measurement
cluster resources
data
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311794383.XA
Other languages
Chinese (zh)
Other versions
CN117743181B (en)
Inventor
操润贴
张新铭
王徐
鲁源源
汪勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Yunzhe Technology Co ltd
Original Assignee
Hangzhou Yunzhe Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Yunzhe Technology Co ltd
Priority to CN202311794383.XA
Priority claimed from CN202311794383.XA
Publication of CN117743181A
Application granted
Publication of CN117743181B
Legal status: Active
Anticipated expiration

Abstract

The application discloses a system for constructing an observable control plane, which relates to the technical field of application observability and comprises a console, an instrumenter, an autoscaler and a scheduler. The console provides a user management interface, an operation interface for automatic instrumentation and automatic measurement, and data pipeline configuration for the cluster resources for which automatic measurement is to be enabled. The instrumenter monitors original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detects the programming language of those cluster resources, and performs automatic instrumentation and automatic measurement according to the programming language. The autoscaler deploys and installs data collectors to create data pipelines. The scheduler allocates the cluster resources for which automatic measurement is to be enabled to the data pipelines. The system achieves automatic instrumentation and automatic measurement without intruding into application code or deployment modes.

Description

System for constructing an observable control plane
Technical Field
The present application relates to the field of application observability, and in particular, to a system for constructing an observable control plane.
Background
Application observability is an essential part of IT construction in modern enterprises. In actual production environments, however, enterprises mostly adopt open-source solutions such as Zipkin and Jaeger, or commercial products such as ARMS and Datadog, and these solutions all have, to a greater or lesser extent, the following problems in collection access and data delivery pipelines:
first, they are invasive: the system concerned must be modified or its deployment mode adjusted, and code changes are often needed as well, so the investment is large;
second, they are not language independent: these solutions are essentially bound to a particular programming language, with priority support for JVM languages such as Java and limited support for other languages;
third, they are bound to a technology stack: they are difficult to make compatible with existing open-source solutions or commercial products, so pluggable flexibility cannot be achieved.
Disclosure of Invention
The present application provides a system for constructing an observable control plane, which aims to solve the problems that prior-art application observability solutions have in collection access and data delivery pipelines.
In order to achieve the above purpose, the present application adopts the following technical solution:
a system for constructing an observable control plane, comprising a console, an instrumenter, an autoscaler, and a scheduler, wherein:
the console is used for providing a user management interface, an operation interface for automatic instrumentation and automatic measurement, and data pipeline configuration for the cluster resources for which automatic measurement is to be enabled;
the instrumenter is used for monitoring original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detecting the programming language of those cluster resources, and performing automatic instrumentation and automatic measurement according to the programming language;
the autoscaler is used for deploying and installing a data collector to create a data pipeline;
the scheduler is used for allocating the cluster resources for which automatic measurement is to be enabled to the data pipeline.
Preferably, the console provides two data pipeline configuration modes: the first is a system default mode, in which a default data collector is configured for all cluster resources for which automatic measurement is to be enabled; the second is a user selection mode, in which different data collectors are allocated to the cluster resources for which automatic measurement is to be enabled.
Preferably, in the user selection mode of the console, batch selection of the cluster resources for which automatic measurement is to be enabled is supported, and a corresponding data receiver is selected for them.
Preferably, the console submits the current data collector configuration and invokes the application program interface of the cluster to label the cluster resources for which automatic measurement is to be enabled, indicating that automatic measurement is enabled.
Preferably, the instrumenter supports two triggering modes: first, it is actively triggered when a user operates the console to adjust the data pipeline configuration; second, it is triggered when the scheduler reschedules because the data pipeline has changed.
Preferably, the instrumenter invokes the Reconcile interface of the cluster controller.
Preferably, the real-time monitoring of the original cluster resources includes real-time monitoring of creation, update or deletion events and state changes of the original cluster resources.
Preferably, the instrumenter invokes the application program interface of the cluster to create a language detection pod and deploy it on the same node as the target instance, so that the file system of the target pod can be viewed.
Preferably, detecting the programming language of the cluster resources for which automatic measurement is to be enabled comprises the following steps:
acquiring and traversing the process information under the host directory of the original cluster resources on which the automatic measurement mark is enabled;
matching the corresponding process according to the podId and container name of the monitored change, and obtaining the cmdline command line of the matched process;
parsing the cmdline command line to obtain cmdline features, and determining the programming language of the cluster resources for which automatic measurement is to be enabled according to the cmdline features.
Preferably, the triggering conditions of the autoscaler include a data receiving back-end service being added at the console, or the data transmission volume of a program being too large.
Compared with the prior art, the invention has the following beneficial effects:
1. it is non-interfering: automatic instrumentation and automatic measurement are achieved without intruding into application code or deployment modes; 2. it is compatible with a variety of popular observability products, tools and terminals, and can be quickly connected to the data collectors of various observability products; 3. it can be docked with any existing solution that supports OpenTelemetry, with zero configuration and zero deployment.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a system architecture diagram of the present application;
FIG. 2 is a data interaction diagram of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "first", "second" and the like in the claims and the description of the present application are used to distinguish between similar objects and do not necessarily describe a particular order or sequence. It should be understood that terms so used are interchangeable where appropriate; they merely distinguish objects of the same nature in the description of the embodiments. Furthermore, the terms "comprise" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion, so that a process, method, system, article or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such a process, method, article or apparatus.
As shown in fig. 1, the present application provides a system for constructing an observable control plane, comprising a console, an instrumenter, an autoscaler, and a scheduler, wherein:
the console is used for providing a user management interface, an operation interface for automatic instrumentation and automatic measurement, and data pipeline configuration for the cluster resources for which automatic measurement is to be enabled;
the instrumenter is used for monitoring original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detecting the programming language of those cluster resources, and performing automatic instrumentation and automatic measurement according to the programming language;
the autoscaler is used for deploying and installing a data collector to create a data pipeline;
the scheduler is used for allocating the cluster resources for which automatic measurement is to be enabled to the data pipeline.
The system for constructing the observable control plane provided in this embodiment comprises four modules: the console (Console), the instrumenter (Instrumenter), the autoscaler (Autoscaler) and the scheduler (Scheduler). All four modules are installed from a Helm chart and communicate with one another through the Kubernetes API server, as shown in fig. 2.
The console mainly provides a UI management interface for the user, and provides the operation interface for automatic instrumentation and automatic measurement, together with the data pipeline configuration, for the Kubernetes cluster resources (that is, the application systems) for which automatic measurement is to be enabled. The operation interface is only the trigger entry; the actual instrumentation and measurement logic is implemented by the instrumenter. The console provides two data pipeline configuration modes. The first is the system default mode, in which all collected application systems use the default data collector. The second is the user selection mode, in which data collectors are configured for different application systems or for the applications under a namespace. The user selection mode is more flexible, because different application systems, or the application systems under different namespaces, can be given different collectors; in this mode, automatic instrumentation is enabled on the selected application systems or namespaces.
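The two configuration modes can be illustrated with a minimal Python sketch. The names `PipelineConfig` and `resolve_collector`, the collector names, and the rule that an application-level choice takes precedence over a namespace-level choice are all assumptions made for illustration; only the default mode and the per-application or per-namespace selection come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PipelineConfig:
    """Hypothetical console-side data pipeline configuration."""
    default_collector: str
    # User selection mode: collectors chosen per application or per namespace.
    per_app: Dict[Tuple[str, str], str] = field(default_factory=dict)
    per_namespace: Dict[str, str] = field(default_factory=dict)


def resolve_collector(cfg: PipelineConfig, namespace: str, app: str) -> str:
    """Return the collector name for one application.

    Assumed precedence: application choice, then namespace choice,
    then the system default collector.
    """
    if (namespace, app) in cfg.per_app:
        return cfg.per_app[(namespace, app)]
    if namespace in cfg.per_namespace:
        return cfg.per_namespace[namespace]
    return cfg.default_collector


cfg = PipelineConfig(
    default_collector="default-collector",
    per_app={("shop", "checkout"): "tracing-backend-a"},
    per_namespace={"billing": "tracing-backend-b"},
)
```

With this configuration, `resolve_collector(cfg, "shop", "cart")` falls back to the default collector, while anything in the `billing` namespace uses the namespace-level choice.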
The core responsibility of the instrumenter is to dynamically discover the Kubernetes cluster resources, that is, the applications, for which automatic measurement is to be enabled, to detect their development language, and to perform automatic instrumentation and automatic measurement. The instrumenter supports two trigger paths: it is triggered when a user operates the console to adjust the data pipeline configuration, and it is triggered when the scheduler reschedules because a data pipeline has changed. It collects all applications in the cluster, together with information such as their language and namespace, and stores this information in Kubernetes persistent storage; the information can also be stored in the console database, for example a transactional database such as MySQL.
To enable automatic measurement for every newly created application, the instrumenter includes a key component, the instrumentlet, which is deployed on every node of the Kubernetes cluster as a DaemonSet. One of its responsibilities is to detect the programming language of an application. Once the language is detected, a language-specific form of automatic instrumentation and automatic measurement is applied: for example, the OpenTelemetry SDK is used for runtime languages, and eBPF is used for compiled languages such as Go. To detect the programming language of an application for which automatic measurement is to be enabled, the module also uses the application program interface of the cluster, that is, the Kubernetes API, to create a language detection pod and deploy it on the same node as the target instance; this pod can view the file system of the target pod.
The instrumenter also invokes the Reconcile interface of the Kubernetes cluster controller. Through this interface it monitors, in real time, creation, update and deletion events and state changes of Kubernetes cluster resources, and it monitors changes to cluster resources or namespaces, that is, the operations performed at the console. Different automatic instrumentation and automatic measurement approaches are then created for applications written in different languages.
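As a rough illustration of the event monitoring described here, the following Python sketch filters watch events down to those that should trigger instrumentation. The label key `example.io/auto-measure`, its value `enabled`, and the plain-dict event shape are assumptions; the application does not name the label it uses, and this sketch does not call any Kubernetes client library.

```python
from typing import Any, Dict

# Hypothetical label key and value; not taken from the application.
AUTO_MEASURE_LABEL = "example.io/auto-measure"
WATCHED_KINDS = frozenset({"Deployment", "StatefulSet", "DaemonSet", "Namespace"})


def should_instrument(event: Dict[str, Any]) -> bool:
    """Return True when a create/update event carries the enabling label.

    `event` is assumed to look like {"type": "ADDED", "object": {...}},
    where the object has "kind" and "metadata.labels".
    """
    if event.get("type") == "DELETED":
        # Nothing left to instrument once the resource is gone.
        return False
    obj = event.get("object", {})
    if obj.get("kind") not in WATCHED_KINDS:
        return False
    labels = obj.get("metadata", {}).get("labels") or {}
    return labels.get(AUTO_MEASURE_LABEL) == "enabled"
```

A real controller would also need to handle the case where the label is removed from a previously instrumented resource; that is left out here to keep the sketch short.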
Creating different automatic instrumentation and automatic measurement approaches for applications in different languages specifically includes the following.
For a Java application, the Java agent (probe) is used, and the features and paradigms of Kubernetes are used to achieve non-invasive, non-interfering instrumentation. The process is as follows:
a) the resource definition of the observed service is modified through the Kubernetes webhook mechanism, injecting environment variables and an init container;
b) the Java agent is shared with the observed service through a directory mounted by the init container;
c) the Java agent is loaded automatically through the environment variables JAVA_TOOL_OPTIONS and JAVA_OPTS;
d) according to the user-configured data pipeline or the system default data pipeline, the variable -Dotel.exporter.otlp.endpoint=http://%s:%d is added to the JVM startup command line, that is, a collector conforming to the OTel specification and standard is added.
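Steps a) to d) can be sketched as a pure function over a pod spec represented as a Python dict, which is roughly what a mutating webhook would do. The mount path, jar file name, image name and init container command below are hypothetical placeholders, and only `JAVA_TOOL_OPTIONS` is set here; the sketch is not the applicant's implementation.

```python
import copy
from typing import Any, Dict

AGENT_DIR = "/otel-agent"                            # hypothetical shared mount path
AGENT_JAR = AGENT_DIR + "/javaagent.jar"             # hypothetical jar file name
AGENT_IMAGE = "example.registry/java-agent:latest"   # hypothetical image


def inject_java_agent(pod_spec: Dict[str, Any], host: str, port: int) -> Dict[str, Any]:
    """Return a copy of pod_spec with the Java agent wiring added."""
    spec = copy.deepcopy(pod_spec)
    # b) Shared volume through which the init container hands over the agent.
    spec.setdefault("volumes", []).append({"name": "otel-agent", "emptyDir": {}})
    mount = {"name": "otel-agent", "mountPath": AGENT_DIR}
    # a) Init container that copies the agent jar into the shared volume.
    spec.setdefault("initContainers", []).append({
        "name": "copy-java-agent",
        "image": AGENT_IMAGE,
        "command": ["cp", "/javaagent.jar", AGENT_JAR],
        "volumeMounts": [dict(mount)],
    })
    # c) and d) The JVM reads the agent flag and OTLP endpoint from the environment.
    options = "-javaagent:%s -Dotel.exporter.otlp.endpoint=http://%s:%d" % (
        AGENT_JAR, host, port)
    for container in spec.get("containers", []):
        container.setdefault("volumeMounts", []).append(dict(mount))
        env = container.setdefault("env", [])
        existing = next(
            (e for e in env if e.get("name") == "JAVA_TOOL_OPTIONS"), None)
        if existing is None:
            env.append({"name": "JAVA_TOOL_OPTIONS", "value": options})
        else:
            # Keep flags the service already set and append the agent flags.
            existing["value"] = (existing.get("value", "") + " " + options).strip()
    return spec
```

Working on a deep copy keeps the original resource definition untouched, so the caller can compute a patch from the difference between the two.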
For an application written in Go, eBPF is used to implement automatic instrumentation and measurement. Specifically, probe plugins for different protocols such as HTTP and gRPC are developed on the basis of opentelemetry-go-instrumentation in combination with eBPF. Secondary development is also supported: plugins for other protocols or middleware only need to be developed according to the eBPF specification.
For other scripting languages, such as Node.js and Python, the automatic measurement provided by opentelemetry-instrumentation can be used directly, and automatic injection is performed by the instrumenter.
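The per-language choice described above amounts to a dispatch table. The Python sketch below shows one possible shape; the strategy names are illustrative labels rather than identifiers from the application, and treating `golang` as an alias for `go` is an assumption.

```python
from typing import Dict

# Strategy names are illustrative labels only.
STRATEGIES: Dict[str, str] = {
    "java": "javaagent-via-webhook",
    "go": "ebpf-probes",
    "nodejs": "opentelemetry-auto-instrumentation",
    "python": "opentelemetry-auto-instrumentation",
}


def choose_strategy(language: str) -> str:
    """Map a detected language to an instrumentation approach."""
    key = language.strip().lower()
    if key == "golang":
        key = "go"
    if key not in STRATEGIES:
        raise ValueError("no instrumentation strategy for %r" % language)
    return STRATEGIES[key]
```

Raising an error for an unknown language, rather than silently picking a default, makes unsupported applications visible to the operator.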
The specific process of detecting the programming language of an application for which automatic measurement is to be enabled is as follows. When the instrumentlet component observes, through the Reconcile interface of the Kubernetes cluster controller, that the automatic measurement mark has been enabled on any cluster resource such as a Deployment, StatefulSet, DaemonSet or Namespace, it acquires the process information under the /proc directory of the node host and traverses all of it. It matches the corresponding process by the podId and container name of the observed change, obtains the cmdline command line of the matched process, parses the cmdline, and determines the programming language of the application from the cmdline features obtained by parsing.
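The traversal and cmdline analysis can be sketched in Python as follows. The heuristics in `detect_language` are assumptions chosen for illustration, since the application does not list the cmdline features it uses. How a process is matched to a podId and container name is also not described, so that step is omitted, and a compiled Go binary is not recognisable from its command line in this simple sketch.

```python
import os
from typing import Iterator, List, Optional, Tuple


def parse_cmdline(raw: bytes) -> List[str]:
    """Split the NUL-separated content of /proc/<pid>/cmdline into arguments."""
    return [part.decode("utf-8", "replace") for part in raw.split(b"\0") if part]


def detect_language(argv: List[str]) -> Optional[str]:
    """Guess the language from the command line (deliberately simple heuristics)."""
    if not argv:
        return None
    exe = os.path.basename(argv[0]).lower()
    if exe == "java" or any(arg.endswith(".jar") for arg in argv[1:]):
        return "java"
    if exe.startswith("python"):
        return "python"
    if exe in ("node", "nodejs"):
        return "nodejs"
    return None


def iter_cmdlines(proc_root: str = "/proc") -> Iterator[Tuple[int, List[str]]]:
    """Yield (pid, argv) for every numeric entry under proc_root."""
    for name in sorted(os.listdir(proc_root)):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "cmdline"), "rb") as fh:
                yield int(name), parse_cmdline(fh.read())
        except OSError:
            # The process may have exited between listing and opening.
            continue
```

On a Linux node, `[(pid, detect_language(argv)) for pid, argv in iter_cmdlines()]` gives a first-pass language guess for each running process.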
The autoscaler is mainly responsible for automatically deploying and installing data collectors. It is triggered in two cases. The first is when a data receiving back-end service is added at the console, which generally means filling in an endpoint. The second is when the data transmission volume of some programs is observed to be so large that it puts excessive pressure on the data receiving service; in that case the traffic analysis program in the autoscaler decides to automatically add a dedicated data collection service for those programs.
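The second trigger, a traffic analysis that singles out heavy senders, might look like the following Python sketch. The threshold value and the bytes-per-second metric are assumptions; the application only states that the transmission volume is too large.

```python
from typing import Dict, List

# Hypothetical threshold; the application does not give a number or a unit.
MAX_BYTES_PER_SECOND = 5_000_000


def apps_needing_dedicated_collector(
    bytes_per_second: Dict[str, float],
    threshold: float = MAX_BYTES_PER_SECOND,
) -> List[str]:
    """Return the applications whose telemetry volume exceeds the threshold."""
    return sorted(app for app, rate in bytes_per_second.items() if rate > threshold)
```

Each name returned would then receive its own data collection service from the Autoscaler module, leaving the shared collector for lighter workloads.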
The scheduler is mainly responsible for allocating the cluster resources for which automatic measurement is to be enabled to the data pipelines created by the autoscaler, where a data pipeline is the channel from a collected data source to the final observability service.
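A minimal Python sketch of this allocation, and of finding which resources are affected when pipelines change, is given below. The function names and the rule "use a dedicated pipeline if one exists, otherwise the default" are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def assign_pipelines(
    resources: Iterable[str],
    dedicated: Dict[str, str],
    default_pipeline: str,
) -> Dict[str, str]:
    """Assign each resource to its dedicated pipeline, or to the default one."""
    return {name: dedicated.get(name, default_pipeline) for name in resources}


def changed_assignments(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """List resources whose pipeline differs between two scheduling rounds."""
    return sorted(name for name, pipe in after.items() if before.get(name) != pipe)
```

Only the resources returned by `changed_assignments` would need to be re-instrumented after a reschedule, which keeps unaffected applications untouched.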
The system provided by this embodiment is based mainly on the Kubernetes cloud-native environment and makes full use of the features, design paradigm and openness provided by Kubernetes CRDs to address observability for distributed microservices. No code needs to be modified and no application deployment mode needs to be adjusted. Using the data collector configuration from the console, the system obtains the configuration from Kubernetes storage and injects delivery variables or configuration (ConfigMap) into the application collection probe, thereby achieving automatic discovery, automatic instrumentation and automatic measurement of applications.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The foregoing is merely illustrative of specific embodiments of the present invention, and the scope of the present invention is not limited thereto, but any changes or substitutions within the technical scope of the present invention should be covered by the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A system for constructing an observable control plane, comprising a console, an instrumenter, an autoscaler, and a scheduler, wherein:
the console is used for providing a user management interface, an operation interface for automatic instrumentation and automatic measurement, and data pipeline configuration for the cluster resources for which automatic measurement is to be enabled;
the instrumenter is used for monitoring original cluster resources in real time to determine the cluster resources for which automatic measurement is to be enabled, detecting the programming language of those cluster resources, and performing automatic instrumentation and automatic measurement according to the programming language;
the autoscaler is used for deploying and installing a data collector to create a data pipeline;
the scheduler is used for allocating the cluster resources for which automatic measurement is to be enabled to the data pipeline.
2. The system for constructing an observable control plane of claim 1, wherein the console provides two data pipeline configuration modes: the first is a system default mode, in which a default data collector is configured for all cluster resources for which automatic measurement is to be enabled; the second is a user selection mode, in which different data collectors are allocated to the cluster resources for which automatic measurement is to be enabled.
3. The system for constructing an observable control plane of claim 2, wherein, in the user selection mode of the console, batch selection of the cluster resources for which automatic measurement is to be enabled is supported, and a corresponding data receiver is selected for them.
4. The system for constructing an observable control plane of claim 3, wherein the console submits the current data collector configuration and invokes the application program interface of the cluster to label the cluster resources for which automatic measurement is to be enabled, indicating that automatic measurement is enabled.
5. The system for constructing an observable control plane of claim 1, wherein the instrumenter supports two triggering modes: first, it is actively triggered when a user operates the console to adjust the data pipeline configuration; second, it is triggered when the scheduler reschedules because the data pipeline has changed.
6. The system for constructing an observable control plane of claim 1, wherein the instrumenter invokes a Reconcile interface of a cluster controller.
7. The system for constructing an observable control plane of claim 1, wherein the real-time monitoring of original cluster resources includes real-time monitoring of creation, update or deletion events and state changes of the original cluster resources.
8. The system for constructing an observable control plane of claim 1, wherein the instrumenter invokes an application program interface of the cluster to create a language detection pod and deploy it on the same node as the target instance, so that the file system of the target pod can be viewed.
9. The system for constructing an observable control plane of claim 4, wherein detecting the programming language of the cluster resources for which automatic measurement is to be enabled comprises the following steps:
acquiring and traversing the process information under the host directory of the original cluster resources on which the automatic measurement mark is enabled;
matching the corresponding process according to the podId and container name of the monitored change, and obtaining the cmdline command line of the matched process;
parsing the cmdline command line to obtain cmdline features, and determining the programming language of the cluster resources for which automatic measurement is to be enabled according to the cmdline features.
10. The system for constructing an observable control plane of claim 1, wherein the triggering conditions of the autoscaler include a data receiving back-end service being added at the console, or the data transmission volume of a program being too large.
CN202311794383.XA 2023-12-25 System for constructing observable control surface Active CN117743181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311794383.XA CN117743181B (en) 2023-12-25 System for constructing observable control surface

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311794383.XA CN117743181B (en) 2023-12-25 System for constructing observable control surface

Publications (2)

Publication Number Publication Date
CN117743181A true CN117743181A (en) 2024-03-22
CN117743181B CN117743181B (en) 2024-07-09

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080320120A1 (en) * 2007-06-22 2008-12-25 John Elliott Arwe Apparatus and method for visualization of web services distributed management (wsdm) resources
US20220215101A1 (en) * 2017-11-27 2022-07-07 Lacework, Inc. Dynamically generating monitoring tools for software applications
US20200092180A1 (en) * 2018-09-14 2020-03-19 Capital One Services, Llc Methods and systems for microservices observability automation
CN109495543A (en) * 2018-10-16 2019-03-19 新华三技术有限公司 The management method and device of monitor in a kind of ceph cluster
US20210397465A1 (en) * 2020-06-22 2021-12-23 Hewlett Packard Enterprise Development Lp Container-as-a-service (caas) controller for monitoring clusters and implemeting autoscaling policies
CN113742033A (en) * 2021-09-08 2021-12-03 广西东信数建信息科技有限公司 Kubernetes cluster federal system and implementation method thereof
CN114143169A (en) * 2021-11-24 2022-03-04 浙江大学 Micro-service application observability system
CN114490264A (en) * 2022-01-28 2022-05-13 中国工商银行股份有限公司 File monitoring method and device of application system, electronic equipment and storage medium
US20230315397A1 (en) * 2022-03-31 2023-10-05 Accenture Global Solutions Limited Serverless environment-based provisioning and deployment system
CN116016702A (en) * 2022-12-26 2023-04-25 浪潮云信息技术股份公司 Application observable data acquisition processing method, device and medium
CN116225911A (en) * 2023-01-30 2023-06-06 上海观测未来信息技术有限公司 Function test method and device for observability platform
CN116520876A (en) * 2023-04-12 2023-08-01 北京理工大学 Guidance law design method for optimizing observability

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Tencent Cloud (腾讯云): "Sending observability data to SigNoz using the OpenTelemetry Operator", pages 1 - 16, Retrieved from the Internet <URL: https://cloud.tencent.com/developer/article/2327987> *
运维之美: "Odigos: a tool that helps you quickly build an end-to-end, non-intrusive observability solution on Kubernetes", pages 1 - 7, Retrieved from the Internet <URL: https://blog.csdn.net/easylife206/article/details/130652425> *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant