CN116166181A - Cloud monitoring method and cloud management platform
- Publication number
- CN116166181A (application CN202210142252.2A)
- Authority
- CN
- China
- Prior art keywords
- tenant
- management platform
- cloud management
- cloud
- end storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0653—Monitoring storage devices or systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0656—Data buffering arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0679—Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45595—Network integration; Enabling network access in virtual machine instances
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Embodiments of the present application disclose a cloud monitoring method and a cloud management platform, and relate to the technical field of data processing. The method includes: the cloud management platform determines service level objective (SLO) information input or selected by a tenant at the cloud management platform, where the SLO information represents the tenant's usage requirement for index data, and the index data is generated when a cloud monitoring service provided by the cloud management platform monitors cloud resources purchased by the tenant at the cloud management platform; and the cloud management platform selects, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data. The index data of tenants is thus classified according to each tenant's usage requirement for the index data of the cloud monitoring service, so that the different usage requirements of different tenants for cloud monitoring data can be met while the effective utilization of cloud resources is improved.
Description
The present application claims priority to patent application No. 202111399739.0, entitled "A cloud monitoring method and system" and filed on November 24, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The application relates to the technical field of data processing, in particular to a cloud monitoring method and a cloud management platform.
Background
Cloud computing is a type of distributed computing in which a large data-processing job is broken down by the network "cloud" into numerous small programs; a system of multiple servers then processes and analyzes these small programs, and the results are returned to users (including enterprise users and individual users).
The cloud monitoring service is a general basic capability offered by cloud vendors. It is responsible for the access, processing, aggregation, storage, and the like of index data generated by monitoring the cloud resources purchased by a tenant, and it provides an application programming interface (API) through which the tenant can query and use the index data of the cloud monitoring service.
With the development of cloud computing, more and more tenants are migrating their services to the cloud, and the growing data scale makes monitoring cloud resources more challenging. At present, however, the APIs that cloud vendors open to all tenants for querying cloud monitoring data have the same default capabilities. They therefore cannot meet the different usage requirements that different types of tenants have for the index data of cloud monitoring services, nor do they take the effective utilization of cloud resources into account.
Disclosure of Invention
The embodiment of the application provides a cloud monitoring method and a cloud management platform, which are beneficial to guaranteeing different use requirements of different types of tenants on index data of cloud monitoring service and improving the effective utilization rate of cloud resources.
In a first aspect, an embodiment of the present application provides a cloud monitoring method. The method may be performed by a cloud management platform, and the cloud management platform may be a cloud server. The method may include: the cloud management platform determines service level objective (SLO) information input or selected by a tenant at the cloud management platform, where the SLO information represents the tenant's usage requirement for index data, and the index data is generated when a cloud monitoring service provided by the cloud management platform monitors cloud resources purchased by the tenant at the cloud management platform; and the cloud management platform selects, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data, where the data read rates of the at least two back-end storage modules are different and/or the aging times of the at least two back-end storage modules are different.
With this method, the cloud management platform can store a tenant's index data in different back-end storage modules according to the tenant's usage requirement for the index data, thereby offering tenants different SLO capabilities in terms of data read rate, data aging time, and the like. This helps meet the different usage requirements that different types of tenants have for the index data of the cloud monitoring service while improving the effective utilization of cloud resources.
With reference to the first aspect, in one possible implementation, the at least two back-end storage modules include a memory-type back-end storage module whose type is memory, an SSD-type back-end storage module whose type is solid state disk (SSD), and an OBS-type back-end storage module whose type is object storage service (OBS) bucket. The memory-type back-end storage module reads data at the fastest rate and the OBS-type back-end storage module reads data at the slowest rate; the memory-type back-end storage module ages data the soonest and the OBS-type back-end storage module ages data the latest.
In this way, the cloud management platform can be connected to the at least two back-end storage modules through an internal network, and the at least two back-end storage modules can offer tenants different SLO capabilities in terms of data read rate, data aging time, and the like. This helps meet the different usage requirements that different types of tenants have for the index data of the cloud monitoring service while improving the effective utilization of cloud resources.
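The ordering of the three back-end storage modules described above can be illustrated with a minimal Python sketch. The class name, field names, and all numeric values are hypothetical and chosen only so that the ordering matches the text (memory reads fastest and ages data soonest; OBS reads slowest and ages data latest); the application itself does not specify any of these numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendStorage:
    name: str
    read_latency_ms: float  # hypothetical typical read latency
    aging_seconds: int      # hypothetical time after which data is aged out


# Illustrative values only: the text fixes the ordering, not the numbers.
MEMORY = BackendStorage("memory", 1.0, 3600)
SSD = BackendStorage("ssd", 10.0, 7 * 24 * 3600)
OBS = BackendStorage("obs", 200.0, 365 * 24 * 3600)
BACKENDS = [MEMORY, SSD, OBS]


def is_aged_out(backend: BackendStorage, written_at: float, now: float) -> bool:
    """Return True when a data point has outlived the backend's aging time."""
    return now - written_at >= backend.aging_seconds


fastest = min(BACKENDS, key=lambda b: b.read_latency_ms)
longest_lived = max(BACKENDS, key=lambda b: b.aging_seconds)
```

Under these assumed values, a data point written one hour ago would already be aged out of the memory module but would still be retained by the SSD and OBS modules.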
It may be understood that this embodiment of the present application uses only the data read rate and the data aging time as examples for configuring the at least two back-end storage modules connected to the cloud management platform, and does not limit the specific types or capabilities of the at least two back-end storage modules. In other embodiments, a cloud vendor may deploy back-end storage modules according to service needs, which is not limited in this embodiment of the present application. In an optional implementation, the cloud management platform may further provide data processing capabilities other than data storage for the index data, for example, access, processing, and aggregation. Accordingly, the cloud management platform may also be connected to at least two back-end processing modules that provide any one of these capabilities, and may select, from the at least two back-end processing modules, one back-end processing module that matches the tenant's SLO information to process (for example, access, process, or aggregate) the index data. The matching process is similar to the matching process used for the storage capability; the two implementations may be cross-referenced, and details are not repeated here.
With reference to the first aspect, in a possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is high priority, the cloud management platform selecting, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data includes: the cloud management platform selects the memory-type back-end storage module to store the index data.
With this method, high priority can represent high requirements for the read rate, aging time, and the like of the index data. When the tenant's usage requirement for the index data is high priority, the cloud management platform selects the memory-type back-end storage module to store the index data. The query response of memory-type storage can reach a latency of P99 < 50 ms, so when the tenant reads index data, the cloud management platform can read the required index data from the memory-type back-end storage module and return it with a latency of P99 < 50 ms, which is lower than the latency of the other types of back-end storage modules.
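As a worked illustration of what "P99 < 50 ms" means, the following Python sketch computes the 99th percentile of a set of query latencies and compares it with the 50 ms target. The function names are hypothetical, and the nearest-rank percentile definition is an assumption; the application does not say how P99 is measured.

```python
def p99(latencies_ms: list[float]) -> float:
    """99th percentile using the nearest-rank method (an assumed definition)."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # 1-based rank = ceil(0.99 * n), computed with integers to avoid float error.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_target(latencies_ms: list[float], target_ms: float = 50.0) -> bool:
    """True when the P99 query latency is strictly below the target."""
    return p99(latencies_ms) < target_ms
```

With 100 samples, the P99 value is the 99th smallest sample, so a single slow outlier among 100 queries does not by itself violate the target.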
With reference to the first aspect, in one possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is medium priority, the cloud management platform selecting, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data includes: the cloud management platform selects the memory-type back-end storage module to store the index data.
With this method, medium priority can represent medium requirements for the read rate, aging time, and the like of the index data. When the tenant's usage requirement for the index data is medium priority, the cloud management platform may, in an optional implementation, select the memory-type back-end storage module to store the index data. For example, the cloud management platform can dynamically adjust the tenant's SLO information according to the actual usage of cloud resources, so that cloud resources are used as fully as possible while the tenant's usage requirement for the index data is still met. This reduces the overall query response latency, storage cost, and the like, and improves user experience.
With reference to the first aspect, in one possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is medium priority, the cloud management platform selecting, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data includes: the cloud management platform selects the SSD-type back-end storage module to store the index data.
With this method, when the tenant's usage requirement for the index data is medium priority, the cloud management platform can select the SSD-type back-end storage module to store the index data. For example, medium priority places a looser requirement on query response latency than high priority does, so storing the index data in the SSD-type back-end storage module reduces the storage cost while still meeting the tenant's usage requirement.
With reference to the first aspect, in a possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is low priority, the cloud management platform selecting, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data includes: the cloud management platform selects the OBS-type back-end storage module, whose type is OBS bucket, to store the index data.
With this method, low priority can represent low requirements for the read rate, aging time, and the like of the index data. When the tenant's usage requirement for the index data is low priority, the cloud management platform can select the OBS-type back-end storage module to store the index data, which further reduces the storage cost.
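The priority-based selection described in the implementations above can be summarized in a short Python sketch. The enum, the function name, and the `medium_in_memory` flag are hypothetical; the flag models the two alternative implementations for medium priority (memory-type or SSD-type storage), and the application does not prescribe how a platform chooses between them.

```python
from enum import Enum


class Priority(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def select_backend(priority: Priority, medium_in_memory: bool = False) -> str:
    """Pick the back-end storage module type that matches the SLO priority."""
    if priority is Priority.HIGH:
        return "memory"
    if priority is Priority.MEDIUM:
        # Two optional implementations exist for medium priority.
        return "memory" if medium_in_memory else "ssd"
    return "obs"
```

For example, `select_backend(Priority.MEDIUM)` returns `"ssd"`, while `select_backend(Priority.MEDIUM, medium_in_memory=True)` returns `"memory"`, which a platform might prefer when memory capacity is underused.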
With reference to the first aspect, in a possible implementation, the cloud management platform determining the service level objective (SLO) information input or selected by the tenant at the cloud management platform includes: the cloud management platform provides an application programming interface (API) to the tenant, where the API indicates a plurality of fields representing different attributes of the SLO information; and the cloud management platform receives the SLO information sent by the tenant, where the SLO information includes the plurality of fields and the parameter input by the tenant for each field.
With this method, the cloud management platform can provide the API to the tenant together with the API fields used to set the different attributes of the SLO information. The tenant inputs a parameter for each field, and the fields and the parameters input for them are sent to the cloud management platform as the SLO information set by the tenant, which completes the configuration of the tenant's SLO information.
With reference to the first aspect, in one possible implementation manner, the plurality of fields include: namespaces, group names, instance names, monitoring item names, time periods, or SLO information desired by the tenant.
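A request carrying these fields might be modeled as follows. This is a minimal Python sketch: the attribute names, the use of seconds for the time period, and the set of accepted SLO values are all assumptions made for illustration, since the application lists the fields but not their wire format.

```python
from dataclasses import dataclass

# Assumed SLO levels, mirroring the high/medium/low priorities in the text.
ALLOWED_SLO = {"high", "medium", "low"}


@dataclass
class SloSetting:
    namespace: str
    group_name: str
    instance_name: str
    metric_name: str      # the "monitoring item name"
    period_seconds: int   # the "time period" (unit assumed)
    slo: str              # the SLO information desired by the tenant


def validate(setting: SloSetting) -> list[str]:
    """Return a list of human-readable problems; empty means the setting is valid."""
    problems = []
    if not setting.namespace:
        problems.append("namespace must not be empty")
    if not setting.metric_name:
        problems.append("metric_name must not be empty")
    if setting.period_seconds <= 0:
        problems.append("period_seconds must be positive")
    if setting.slo not in ALLOWED_SLO:
        problems.append(f"slo must be one of {sorted(ALLOWED_SLO)}")
    return problems
```

A tenant-side call would fill in one `SloSetting` per monitored item and send it only when `validate` returns an empty list.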
With reference to the first aspect, in a possible implementation, the cloud management platform determining the service level objective (SLO) information input or selected by the tenant at the cloud management platform includes: the cloud management platform provides a console interface to the tenant; and the cloud management platform determines the SLO information input or selected by the tenant at the console interface, where the SLO information includes a plurality of index attribute configuration items provided by the console interface and the parameter input or selected by the tenant for each index attribute configuration item.
With this method, the cloud management platform can obtain the tenant's SLO information through the console interface.
With reference to the first aspect, in one possible implementation manner, the plurality of index attribute configuration items include: hierarchical name, hierarchical resource scope, hierarchical policy.
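One way to read these three configuration items is as a rule that names a tier, lists the resources it covers, and states the policy applied to them. The Python sketch below follows that reading; the class, the representation of the resource scope as a set of resource IDs, the use of a priority label as the policy, and the default value are all assumptions rather than details given in the application.

```python
from dataclasses import dataclass


@dataclass
class TierRule:
    name: str                 # the "hierarchical name"
    resource_scope: set[str]  # the "hierarchical resource scope" (assumed: resource IDs)
    policy: str               # the "hierarchical policy" (assumed: a priority label)


def policy_for(resource_id: str, rules: list[TierRule], default: str = "low") -> str:
    """Return the policy of the first rule whose scope contains the resource."""
    for rule in rules:
        if resource_id in rule.resource_scope:
            return rule.policy
    return default  # assumed fallback for resources no rule covers
```

Rules are evaluated in order, so a tenant who lists the same resource in two rules gets the policy of the earlier one under this sketch.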
In a second aspect, an embodiment of the present application provides a cloud management platform, including: an SLO storage unit, configured to determine service level objective (SLO) information input or selected by a tenant at the cloud management platform, where the SLO information represents the tenant's usage requirement for index data, and the index data is generated when a cloud monitoring service provided by the cloud management platform monitors cloud resources purchased by the tenant at the cloud management platform; and an analysis unit, configured to select, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data, where the data read rates of the at least two back-end storage modules are different and/or the aging times of the at least two back-end storage modules are different.
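The division of work between the two units can be sketched end to end in Python. The class and method names, the in-memory lists standing in for real storage, the medium-to-SSD route, and the fallback to low priority for tenants without a stored SLO are all assumptions for illustration.

```python
class SloStorageUnit:
    """Keeps the SLO priority that each tenant has configured."""

    def __init__(self) -> None:
        self._priority_by_tenant: dict[str, str] = {}

    def save(self, tenant_id: str, priority: str) -> None:
        self._priority_by_tenant[tenant_id] = priority

    def lookup(self, tenant_id: str, default: str = "low") -> str:
        # Assumed fallback: tenants without a stored SLO are treated as low priority.
        return self._priority_by_tenant.get(tenant_id, default)


class AnalysisUnit:
    """Routes each data point to the back-end that matches the tenant's SLO."""

    ROUTES = {"high": "memory", "medium": "ssd", "low": "obs"}

    def __init__(self, slo_unit: SloStorageUnit) -> None:
        self._slo_unit = slo_unit
        # Plain lists stand in for the real back-end storage modules.
        self.backends: dict[str, list] = {name: [] for name in self.ROUTES.values()}

    def store(self, tenant_id: str, data_point: dict) -> str:
        backend = self.ROUTES[self._slo_unit.lookup(tenant_id)]
        self.backends[backend].append((tenant_id, data_point))
        return backend
```

Keeping the SLO lookup separate from the routing means a tenant's SLO can change without touching the analysis logic; only data stored after the change is routed differently.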
With reference to the second aspect, in one possible implementation, the at least two back-end storage modules include a memory-type back-end storage module whose type is memory, an SSD-type back-end storage module whose type is SSD, and an OBS-type back-end storage module whose type is OBS bucket. The memory-type back-end storage module reads data at the fastest rate and the OBS-type back-end storage module reads data at the slowest rate; the memory-type back-end storage module ages data the soonest and the OBS-type back-end storage module ages data the latest.
With reference to the second aspect, in a possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is high priority, the analysis unit is configured to select the memory-type back-end storage module to store the index data.
With reference to the second aspect, in a possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is medium priority, the analysis unit is configured to select the memory-type back-end storage module to store the index data.
With reference to the second aspect, in a possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is medium priority, the analysis unit is configured to select the SSD-type back-end storage module to store the index data.
With reference to the second aspect, in a possible implementation, when the SLO information indicates that the tenant's usage requirement for the index data is low priority, the analysis unit is configured to select the OBS-type back-end storage module, whose type is OBS bucket, to store the index data.
With reference to the second aspect, in one possible implementation, the SLO storage unit determining the service level objective (SLO) information input or selected by the tenant at the cloud management platform includes: providing an application programming interface (API) to the tenant, where the API indicates a plurality of fields representing different attributes of the SLO information; and, in response to the API being called, receiving the SLO information sent by the tenant, where the SLO information includes the plurality of fields and the parameter input by the tenant for each field.
With reference to the second aspect, in one possible implementation manner, the plurality of fields include: namespaces, group names, instance names, monitoring item names, time periods, or SLO information desired by the tenant.
With reference to the second aspect, in one possible implementation, the SLO storage unit determining the service level objective (SLO) information input or selected by the tenant at the cloud management platform includes: providing a console interface to the tenant; and determining the SLO information input or selected by the tenant at the console interface, where the SLO information includes a plurality of index attribute configuration items provided by the console interface and the parameter input or selected by the tenant for each index attribute configuration item.
With reference to the second aspect, in one possible implementation manner, the plurality of index attribute configuration items include: hierarchical name, hierarchical resource scope, hierarchical policy.
In a third aspect, embodiments of the present application provide a communication device including one or more processors and one or more memories. The one or more memories are coupled to the one or more processors and are configured to store computer program code including computer instructions that, when executed by the one or more processors, cause the communication device to perform the method of the first aspect or any of the possible designs of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium for storing a computer program which, when run on a computing device, causes the device to perform the method of the first aspect or any one of the possible designs of the first aspect.
Further combinations of embodiments of the present application may be made to provide further implementations based on the implementations provided in the above aspects.
For the technical effects achievable by any possible implementation of any one of the second to fourth aspects, reference may be made to the description of the technical effects achievable by the corresponding possible implementation of the first aspect; the description is not repeated here.
Drawings
FIG. 1 illustrates an architecture diagram of a cloud service system;
FIG. 2 illustrates an architectural schematic diagram of a cloud monitoring system of an embodiment of the present application;
FIG. 3 shows a schematic diagram of configuring SLO information through a console user interface in an embodiment of the present application;
FIG. 4 shows a flow diagram of a cloud monitoring method according to an embodiment of the present application;
FIG. 5 shows a flow diagram of a cloud monitoring method according to an embodiment of the present application;
FIG. 6 shows a flow diagram of a cloud monitoring method according to an embodiment of the present application;
FIG. 7 shows a schematic structural diagram of a cloud management platform according to an embodiment of the present application;
FIG. 8 shows a schematic structural diagram of a communication device according to an embodiment of the present application.
Detailed Description
In the following, some terms in the embodiments of the present application are explained for easy understanding by those skilled in the art.
1. Timeline (Time Line): a record, arranged in a single direction, of what has happened in the past and what will happen in the future. In a cloud monitoring scenario, a timeline refers to the presentation of different index data as it changes over time.
2. Object storage service (Object Storage Service, OBS): an object-based massive storage service that provides tenants with massive, secure, highly reliable, and low-cost data storage capability.
The basic components of OBS are the OBS bucket and the object. An OBS bucket is a container for storing objects in OBS. Each bucket has its own storage class, access permissions, region, and so on, and tenants locate a bucket on the internet through the bucket's access domain name. The object is the basic unit of data storage in OBS. An object is the combination of a file's data and its related attribute information, and consists of three parts: a key (Key), metadata (Metadata), and data (Data).
3. Namespaces (namespaces): is an abstract integration of a set of resources and objects. Different namespaces can be created within the same cluster, with data in the different namespaces isolated from each other so that they can either share the services of the same cluster or do not interfere with each other.
4. Work order: and when the cloud service and/or the resources on the cloud fail, the tenant sends a notice to the cloud manufacturer, and the cloud manufacturer decides whether to accept or not according to actual conditions after receiving the work order.
5. Cloud resources: cloud management platform provides cloud resources for tenant, including cloud services and cloud instances, wherein cloud services are, for example, VPC network providing services, gateway providing services, firewall services, NAT services, cloud disk, elastic public network IP (EIP), cloud monitoring services and cloud services provided by various other cloud manufacturers, and cloud instances are, for example, virtual machines, containers or bare metal servers, which are virtual instances provided by cloud manufacturers for tenant in data centers of cloud manufacturers.
The present application is described in detail below with reference to the accompanying drawings and examples.
Fig. 1 shows a schematic architecture diagram of a cloud service system.
As shown in fig. 1, the cloud service system 100 may include a cloud service module 110, a cloud management platform 120, and a back-end memory 130 provided by a cloud vendor.
The cloud management platform 120 is externally connected with terminal equipment operated by a tenant through the internet, and the tenant can subscribe to cloud service on a cloud manufacturer side, so that the cloud service module 110 can provide corresponding service for the tenant according to the cloud service subscribed by the tenant. The cloud management platform 120 may be connected with the cloud service module 110 and the back-end memory 130 through an internal network in the system 100, and the back-end memory 130 may provide a data storage service for data from terminal devices of tenants or data from the cloud service module 110.
The cloud management platform 120 may provide an API interface for the cloud service module 110 to call. The cloud service module 110 may provide cloud services for tenants and generate relevant service data in the service process, and at the same time, the cloud service module 110 may call an API provided by the cloud management platform 120 to execute a write command, report the generated relevant service data to the cloud management platform 120, and store the relevant service data to the back-end memory 130 by the cloud management platform 120. The cloud management platform 120 may provide an API interface for a tenant to call, and access the backend memory 130 according to a command input to the API interface by the tenant, and provide the access result to the tenant.
In this embodiment of the present application, the back-end memory 130 may include at least two back-end storage modules, for example a memory-type back-end storage module and a persistent-type back-end storage module. The persistent-type back-end storage module may include, for example, an SSD-type back-end storage module based on a solid state disk (Solid State Disk or Solid State Drive, SSD) and an OBS-type back-end storage module based on an object storage service (Object Storage Service, OBS) bucket. The at least two back-end storage modules have different capabilities, such as different data read rates and/or different data aging times. Taking data aging as an example, in which the modules retain data for different lengths of time, the memory-type back-end storage module may retain data for only 3 hours, the SSD-type back-end storage module for 1 month, and the OBS-type back-end storage module for 1 year. The cloud management platform 120 may select a target back-end storage module from the at least two back-end storage modules to store real-time data from the cloud service module 110, and may use the different back-end storage modules to provide different types of tenants with different capabilities for using the relevant service data.
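The tiering and aging behaviour described above can be sketched as a small data model. This is a minimal sketch rather than the platform's implementation: the class, constant and function names are hypothetical, and "1 month" and "1 year" are approximated as 30 and 365 days. Only the three retention periods come from the example in the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackendModule:
    """One back-end storage module (hypothetical model)."""
    name: str
    retention: timedelta  # how long data is kept before it is aged out


# Retention periods follow the example in the text; 1 month and 1 year
# are approximated as 30 and 365 days (an assumption).
MEMORY = BackendModule("memory", timedelta(hours=3))
SSD = BackendModule("ssd", timedelta(days=30))
OBS = BackendModule("obs", timedelta(days=365))


def is_aged(module: BackendModule, written_at: datetime, now: datetime) -> bool:
    """Return True if data written at `written_at` has exceeded the retention."""
    return now - written_at > module.retention


def modules_holding(age: timedelta, modules=(MEMORY, SSD, OBS)) -> list[str]:
    """List the modules whose retention still covers data of the given age."""
    return [m.name for m in modules if age <= m.retention]
```

Under these assumptions, data that is 2 days old would no longer be held by the memory-type module but would still be covered by the SSD-type and OBS-type modules.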
A tenant may purchase cloud resources on the cloud management platform 120. After the purchase, the cloud management platform 120 notifies the cloud service system 100 to create the cloud resources and provides the tenant with a suitable way to access and use them remotely. For example, if the cloud resource is a virtual machine, the tenant may select the specification of the virtual machine (memory, processor and disk) on the cloud management platform 120. After the tenant pays successfully, the cloud management platform 120 notifies the cloud service system 100 to create a virtual machine of that specification and to open the remote desktop of the virtual machine, and provides the tenant with the account and password for connecting to the remote desktop, so that the tenant can log in to the virtual machine remotely with the account and password.
The cloud resources may also be various cloud services such as a container, a bare metal server, and an elastic public network IP (Elastic IP Address, EIP), which are not limited in the embodiment of the present application.
The cloud service system 100 may further provide cloud monitoring service for tenants. The tenant may subscribe to a cloud monitoring service at a cloud vendor side, and the cloud service system 100 is configured according to the cloud monitoring service subscribed by the tenant, so as to monitor the running cloud service and/or cloud resources provided by the cloud service module 110 in the process that the cloud service module 110 provides the cloud service for the tenant, and generate index data (collectively referred to as cloud monitoring data) of the cloud service and/or cloud resources of the tenant.
The tenant may learn the working state of the cloud resources from the index data, and a preset action may be triggered on the cloud management platform 120 according to the index data. Take as an example a virtual machine that provides a web page for internet users to access; the virtual machine may receive a plurality of service requests, specifically web page access requests. The monitored index data is, for example, the CPU occupancy rate of the virtual machine. If the CPU occupancy rate exceeds a threshold, the cloud management platform 120 notifies the cloud service system 100 to increase the number of virtual machines, for example from 1 to 2, which may be done by copying the image of the virtual machine. The 2 virtual machines then receive different service requests under a load balancing policy preset by the tenant on the cloud management platform 120 and process them separately, so that the CPU occupancy rate of the original virtual machine is reduced.
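The threshold-triggered action in this example can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names, the 0.8 threshold and the cap of 10 virtual machines are hypothetical, and round-robin merely stands in for whatever load balancing policy the tenant has preset.

```python
def desired_vm_count(cpu_usage: float, current_vms: int,
                     threshold: float = 0.8, max_vms: int = 10) -> int:
    """Return the VM count after applying the threshold rule once.

    The threshold and max_vms defaults are illustrative assumptions.
    """
    if not 0.0 <= cpu_usage <= 1.0:
        raise ValueError("cpu_usage must be a fraction between 0 and 1")
    if cpu_usage > threshold and current_vms < max_vms:
        return current_vms + 1  # e.g. scale out from 1 to 2
    return current_vms


def round_robin(requests: list[str], vm_count: int) -> dict[int, list[str]]:
    """Spread requests across VMs; round-robin is a stand-in policy."""
    assignment: dict[int, list[str]] = {vm: [] for vm in range(vm_count)}
    for i, request in enumerate(requests):
        assignment[i % vm_count].append(request)
    return assignment
```

For instance, `desired_vm_count(0.9, 1)` scales out to 2 virtual machines, after which `round_robin` would split incoming requests between them.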
However, different types of tenants often have different capability requirements for services such as querying and using index data, and the requirements also differ between business scenarios. For example, in a cloud monitoring scenario, most queries (e.g., 90% or more) are queries for real-time data from the most recent 1 hour; most queries (e.g., more than 90%) are resident queries, and the timelines involved in resident queries account for less than 1% of the timelines written; most resident queries (e.g., 95% or more) are initiated by larger-scale tenants, such as tenants that have purchased a large number of virtual machines. In this embodiment of the present application, a larger-scale tenant is, for example, a tenant that has purchased more cloud resources, and a smaller-scale tenant is, for example, a tenant that has purchased fewer cloud resources.
Large-scale tenants have higher demands on service capabilities and are willing to pay for better service capabilities. For example, a large-scale tenant may lease cloud resources from a plurality of cloud vendors and call the open API provided by each cloud vendor to obtain index data for secondary processing by the tenant, for example analyzing and presenting the index data on an operation and maintenance platform built by the tenant. Such a tenant generally needs the cloud service system 100 to provide faster query responses over a larger amount of resources, for example a millisecond-level P99 latency (for example, 99% of responses have a latency less than or equal to 50 milliseconds (ms)) and hundreds of queries per second (Queries Per Second, QPS). A small-scale tenant has a small amount of resources and does not perform secondary data processing, and therefore does not have high requirements on response latency; it only needs to query and analyze on demand, and can accept, for example, a second-level P99 latency and a lower QPS.
Currently, the default capabilities of the APIs that each cloud vendor opens to all tenants are the same. Tenants cannot communicate their own service capability requirements to the cloud vendor, and the cloud vendor cannot learn the different service capability requirements of tenants of different scales. Only when traffic control (flow control for short) is applied or service is impaired can a tenant notify the cloud vendor, by submitting a work order, of the service capability that needs to be adjusted. Because cloud vendors cannot know the query and use requirements of different tenants, all index data can only be processed uniformly, for example by storing all of it in the memory-type back-end storage module.
However, the data involved in queries and use is generally less than 1% of the total data, so 99% of the index data does not need to occupy the storage space of the memory-type back-end storage module. For example, with 1 million timelines, 0.03 KB per data point after time-series compression, one data point reported per timeline every minute, and a retention period of 3 hours, 515G of memory storage space is needed, yet 510G of that memory storage space is never queried or used. In addition, the cost of the memory-type, SSD-type and OBS-type back-end storage modules decreases in that order, while their response latency increases in that order. If the cloud service system 100 ignores the query and use requirements of tenants and continues to process all index data uniformly, it can neither match the actual requirements of tenants nor guarantee the effective utilization of cloud resources.
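The figures in this example can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical and binary units (1 G = 1024 × 1024 KB) are an assumption. The per-point size, reporting period and retention are taken from the text. The timeline count is left as a parameter because, under these assumptions, a footprint of about 515G corresponds to roughly 100 million timelines, whereas 1 million timelines would need only about 5.15G.

```python
def memory_footprint_gib(timelines: int, kb_per_point: float = 0.03,
                         report_interval_min: int = 1,
                         retention_hours: int = 3) -> float:
    """Memory needed to retain every data point, in GiB (binary units assumed)."""
    points_per_timeline = retention_hours * 60 // report_interval_min  # 180 points
    total_kb = timelines * points_per_timeline * kb_per_point
    return total_kb / (1024 * 1024)


total = memory_footprint_gib(100_000_000)
never_queried = total * 0.99  # 99% of the data is not queried, per the text
print(f"total: {total:.0f}G, never queried: {never_queried:.0f}G")
```

With 100 million timelines the sketch gives a total of roughly 515G, of which roughly 510G is never queried, which matches the two storage figures quoted in the example.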
To address the above problems, the embodiment of the present application provides a cloud monitoring method and a cloud management platform, which help guarantee the different requirements that different types of tenants have for using index data and improve the effective utilization of cloud resources. The method and the device are based on the same technical concept; because they solve the problems on similar principles, the implementations of the device and the method may refer to each other, and repeated parts are not described again.
The following describes a system architecture applicable to the embodiment of the present application by taking a cloud monitoring service as an example.
Fig. 2 shows a schematic architecture diagram of a cloud monitoring system according to an embodiment of the present application.
As shown in fig. 2, the cloud monitoring system 200 may include a cloud service module 210, a cloud management platform 220, and a back-end memory 230, where the cloud service module 210 and/or a tenant may communicate with the cloud management platform 220 by calling an API provided by the cloud management platform 220, and the cloud management platform 220 and a back-end processing module (including but not limited to the back-end memory module shown in fig. 1) connected thereto may provide data processing functions such as accessing, processing, aggregating, storing, etc. for index data from the cloud service module 210, and provide functions such as querying and using stored index data for the tenant.
It should be understood that the functional modules that the system 200 may include are shown in fig. 2 by way of example only, and are not limited to the number and functions of the functional modules, for example, the system 200 may include multiple types of back-end processing modules, and any back-end processing module may also include a respective sub-module, which is not limited in this embodiment of the present application.
Illustratively, the cloud management platform 220 may include a service level objective (service level objective, SLO) storage unit 250 and an analysis unit 260. Optionally, the cloud management platform 220 may further include a caching unit 270. The SLO storage unit 250, the analysis unit 260 and the caching unit 270 may be independent modules in the cloud management platform 220 or sub-modules of other modules, which is not limited in the embodiment of the present application. The dashed boxes in the drawing only indicate that the SLO storage unit 250, the analysis unit 260 and the caching unit 270 are optional independent modules.
The SLO storage unit 250, the analysis unit 260 and the caching unit 270 according to the embodiment of the present application are described below with reference to the accompanying drawings.
1. SLO storage unit 250
The SLO storage unit 250 may be configured to persistently store tenant-customized SLO information. In an alternative implementation, the SLO storage unit 250 may belong to the back-end memory 230.
In this embodiment of the present application, when a tenant subscribes to a cloud monitoring service, the cloud management platform 220 may provide the tenant with an entry for configuring SLO information, for example an API through which SLO information is configured. The tenant may complete the custom configuration of SLO information through this entry, and the SLO information custom-configured by the tenant is persistently stored in the SLO storage unit 250. Subsequently, other functional modules of the cloud monitoring system 200 may obtain the SLO information of the tenant from the SLO storage unit 250 and, based on the SLO information, provide the corresponding tenant with processing functions and with query and use functions for the index data of the cloud monitoring service.
In a specific implementation process, the cloud management platform 220 may provide the tenant with SLO configuration items with which the tenant custom-configures SLO information. The SLO information may include the query range of the index data desired by the tenant and query capability parameters. By way of example, the query range may include at least one of: the namespace of the index to be queried, a monitoring item name, resource grouping information, a resource identifier, and a time period. The query capability parameters may include at least one of: delay, query rate, single query volume upper limit, and SLO level.
In the query range, the namespace of the index to be queried is the namespace to which the index data to be queried belongs; the monitoring item name is the index name corresponding to the index to be queried; the resource grouping information indicates the resource group corresponding to the index to be queried; the resource identifier is the identifier of the resource corresponding to the index to be queried; and the time period is the time range corresponding to the index to be queried. In the query capability parameters, the delay may include a maximum response delay, an average response delay and the like for query requests; the query rate may be, for example, the query rate that query requests can reach, such as QPS; the single query volume upper limit represents the maximum volume of a single query; and the SLO level may be, for example, any of high priority, medium priority and low priority.
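The structure of the SLO information described above can be modelled as follows. This is a minimal sketch, assuming one possible in-memory representation: all class and attribute names are hypothetical, and the units chosen (hours, milliseconds) are assumptions. The set of items follows the query range and query capability parameters listed in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SloLevel(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class QueryRange:
    namespace: Optional[str] = None        # namespace of the index to be queried
    metric_name: Optional[str] = None      # monitoring item name
    resource_group: Optional[str] = None   # resource grouping information
    resource_id: Optional[str] = None      # resource identifier
    period_hours: Optional[float] = None   # time period, e.g. the last 1 hour


@dataclass(frozen=True)
class QueryCapability:
    max_latency_ms: Optional[float] = None         # delay
    qps: Optional[int] = None                      # query rate
    max_records_per_query: Optional[int] = None    # single query volume upper limit
    level: SloLevel = SloLevel.LOW                 # SLO level


@dataclass(frozen=True)
class SloInfo:
    tenant_id: str
    query_range: QueryRange
    capability: QueryCapability

    def __post_init__(self) -> None:
        # The query range includes "at least one of" its items.
        if all(value is None for value in vars(self.query_range).values()):
            raise ValueError("query range needs at least one item")
```

A tenant interested in the last hour of EIP data at high priority could then be represented as `SloInfo("tenant-a", QueryRange(namespace="EIP", period_hours=1), QueryCapability(max_latency_ms=50, level=SloLevel.HIGH))`.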
In a possible implementation manner, when subscribing to the cloud monitoring service on the cloud management platform 220, the tenant may send a cloud monitoring service configuration request to the cloud management platform 220 through the terminal device that the tenant operates. The cloud management platform 220 may receive the request from the terminal device and, in response, return a cloud monitoring service configuration response to the terminal device. The cloud monitoring service configuration response may indicate alternative configuration information, and the tenant may determine target configuration information according to the alternative configuration information and the tenant's own requirements for using the index data of the cloud monitoring service. The terminal device may then send the target configuration information to the cloud management platform 220, so that the cloud management platform 220 obtains the SLO information of the tenant from the target configuration information and completes the custom configuration of the tenant's SLO. It may be understood that, in a specific implementation, the target configuration information may be the SLO information of the tenant itself, or may be information from which the SLO information of the tenant is obtained; the embodiment of the present application does not limit how this configuration is implemented.
Taking the example that the tenant configures SLO information through an API, the above alternative configuration information may have at least one implementation manner, and a specific configuration process is illustrated as follows:
the API format provided by the cloud management platform 220 is as follows:
Specifically, the cloud management platform 220 may display the above API format on a web page provided on the internet and annotate the usage of each field, for example with the prompts that follow "//" in the format above. After seeing the API format, the tenant fills in the corresponding parameters. For example, filling in "EIP" after "Namespace":, that is, "Namespace": "EIP", indicates that the service involved is the EIP service purchased by the tenant; alternatively, "x" may be filled in after "Namespace":, that is, "Namespace": "x", which refers to all cloud resources purchased by the tenant.
For example, the tenant may populate the parameters for the API format as follows:
The tenant may send the API with the filled-in parameters to the cloud management platform 220 over the internet in the form of a template, and the cloud management platform 220 detects the parameters corresponding to the different fields of the API to obtain the tenant's requirement for each field. Thus, in this embodiment, the SLO information includes the API fields and the parameters entered by the tenant. Further, the cloud management platform 220 stores the SLO information of the tenant in the SLO storage unit 250.
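The handling of a filled-in template can be sketched as below. This is a hedged illustration, not the platform's API: a JSON body is assumed, `SloStore` is an in-memory stand-in for the SLO storage unit 250, and every field name other than `Namespace` is hypothetical.

```python
import json

# Only "Namespace" appears in the text; the other field names are hypothetical.
KNOWN_FIELDS = {"Namespace", "MetricName", "Period", "LatencyP99Ms", "Qps", "Level"}


class SloStore:
    """In-memory stand-in for the SLO storage unit 250."""

    def __init__(self) -> None:
        self._by_tenant: dict[str, dict] = {}

    def save(self, tenant_id: str, slo: dict) -> None:
        self._by_tenant[tenant_id] = dict(slo)

    def load(self, tenant_id: str):
        return self._by_tenant.get(tenant_id)


def parse_slo_template(body: str) -> dict:
    """Keep the fields the tenant filled in; reject unknown field names."""
    filled = json.loads(body)
    if not isinstance(filled, dict):
        raise ValueError("template must be a JSON object")
    unknown = set(filled) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Fields left empty by the tenant carry no requirement and are dropped.
    return {key: value for key, value in filled.items() if value not in (None, "")}


store = SloStore()
store.save("tenant-a", parse_slo_template('{"Namespace": "EIP", "LatencyP99Ms": 50, "Qps": null}'))
```

In this sketch the stored SLO information is exactly the pairing of API fields and tenant-entered parameters, so other modules can later look it up by tenant.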
In addition to providing APIs, as shown in fig. 3, cloud management platform 220 may also provide a Console (Console) interface for tenant configuration. The terminal equipment of the tenant can display the Console interface, and the tenant can input or select parameters similar to the API in the relevant attribute configuration items of the Console interface according to the use requirement of the tenant on the index data.
For example, the cloud monitoring service provided by the cloud vendor may include, but is not limited to, resource grouping, alarm, host monitoring, cloud service monitoring, hierarchical processing, event monitoring, and the like, and after the tenant selects any one of the specific cloud monitoring services, the relevant index attribute configuration item of the cloud monitoring service may be presented by the Console interface, so that the tenant can select or input the relevant index attribute configuration item to complete the relevant parameter configuration of the cloud monitoring service.
Thus, in this embodiment, the SLO information includes the index attribute configuration items provided by the console interface and parameters input or selected by the tenant for each index attribute configuration item.
Taking a hierarchical processing service as the specific cloud monitoring service, as shown in FIG. 3, the index attribute configuration items related to the hierarchical processing service may include a plurality of pieces of grading information, and each piece of grading information may be associated with a level name (or ID), the range of resources covered by the level, a hierarchical policy, and the like. The tenant may determine target grading information among the plurality of pieces of grading information according to the tenant's requirements for using the index data of the cloud monitoring service, where the target grading information is the grading information that meets those requirements.
For example, the tenant may select the service involved to be "EIP", and the policy includes: inquiring the data of the last 1 hour; the maximum response time delay P99 is less than 50ms; by default, high priority processing. Alternatively, the tenant may select the service involved to be "EIP", and the policy includes: inquiring the data of the last 1 hour; the maximum response time delay P99 is less than 500ms; by default, low priority processing.
Through the terminal device that it operates, the tenant returns to the cloud management platform 220 a cloud monitoring service configuration response that carries target configuration information, where the target configuration information may indicate the target grading information among the plurality of pieces of grading information. Further, the cloud management platform 220 may determine the SLO information of the tenant according to the target configuration information, and store the SLO information of the tenant in the SLO storage unit 250.
It should be noted that the above examples only illustrate the configuration process of SLO information and do not limit the specific way in which a tenant custom-configures SLO information; in other embodiments, SLO information may also be configured in other ways, which are not described herein. It will be appreciated that, in the example of configuring SLO information through the Console interface, each hierarchical policy may be implemented through the cooperation of at least one API, so a tenant selecting a hierarchical policy is in essence configuring the APIs that the tenant requires.
It may be understood that, in the embodiment of the present application, a cloud vendor may charge a tenant according to the SLO information custom-configured by the tenant. The basic fees corresponding to different service levels may differ, and the cloud vendor may apply different charging rules according to the service level involved in the tenant's SLO information. For example, a high-priority SLO may be billed at 100 yuan/day, a medium-priority SLO at 10 yuan/day, and a low-priority SLO may be free of charge. It should be understood that these charging rules are shown only as an example and do not limit the specific fee for each level of SLO.
2. Analysis unit 260
The analysis unit 260 may read SLO information of the tenant from the SLO storage unit 250, select one of the back-end storage modules matching with the SLO information of the tenant from the at least two back-end storage modules to store the index data, and route the index data reported to the cloud management platform 220 to the selected back-end storage module for storage. The analysis unit 260 may also route query requests from tenants to selected back-end storage modules to read index data required by the tenants.
In an alternative implementation manner, the analysis unit 260 may obtain the query dotting statistics of the tenant, that is, statistics recorded at instrumentation points on the query path, and analyze them to obtain a query model and a hierarchical processing policy of the tenant, which may be used to indicate how the index data associated with the tenant is processed.
Taking acquisition of the query dotting statistics through the API as an example, the analysis unit 260 may be connected to the entry of each API provided by the cloud vendor. When the tenant triggers a query, the invoked API may record the key dotting data of the query, such as the query time range, the number of associated timelines, the response delay and the number of returned records, and send the query dotting statistics to the analysis unit 260.
The analysis unit 260 may obtain the SLO information of the tenant from the SLO storage unit 250, and obtain the query model and the hierarchical processing policy of the tenant by analyzing the SLO information together with the tenant's query dotting statistics. The query model and/or the hierarchical processing policy obtained by the analysis unit 260 may be cached as analysis data in the caching unit 270. The query model of the tenant may be used to indicate the index metadata involved in the tenant's queries, and the hierarchical processing policy may be used to indicate the processing manner suited to the tenant. The analysis unit 260 may determine, according to the query model, the hotspot metadata involved in the tenant's queries, and determine according to the hotspot metadata whether, and how, to cache the index metadata associated with the tenant's SLO information in the caching unit 270. The cached index metadata may be matched against the index data from the cloud service module 210 or against the metadata carried in a query request from a tenant, so as to implement the storage, query and use functions for the index data. The analysis unit 260 may also determine, according to the tenant's hierarchical processing policy, how to process the index data from the cloud service module 210 and the query requests from the tenant, for example which of the at least two back-end storage modules serves as the target back-end storage module for storing the index data, and from which back-end storage module the index data is read.
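One simple way to derive hotspot metadata from query dotting statistics is to count how often each index is queried. The sketch below assumes this frequency-based approach, which the text does not prescribe; the record fields mirror the dotting data listed above, while the class name, function name and metric keys such as "EIP/bandwidth" are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryStat:
    """One dotting record reported by an API when a tenant queries."""
    tenant_id: str
    metric_key: str          # hypothetical "namespace/monitoring item" key
    time_range_hours: float  # query time range
    timelines: int           # number of associated timelines
    latency_ms: float        # response delay
    records: int             # number of returned records


def hotspot_metadata(stats: list[QueryStat], tenant_id: str, top_n: int = 2) -> list[str]:
    """Return the tenant's most frequently queried metric keys (assumed heuristic)."""
    counts = Counter(s.metric_key for s in stats if s.tenant_id == tenant_id)
    return [key for key, _ in counts.most_common(top_n)]
```

The resulting keys would be candidates for caching in the caching unit 270; a real query model could also weigh latency, time range or SLO level.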
It should be noted that, in the embodiment of the present application, when a tenant has not custom-configured SLO information, the cloud management platform 220 may implement the storage, query and use functions for the tenant's index data according to the system's default lowest-level SLO policy, for example by writing the index data into different back-end storage modules according to time range, as in the scheme illustrated in FIG. 1. In an alternative implementation manner, the cloud management platform 220 may give priority to the requirements of tenants that have SLO information, and provide such a tenant with storage, query and use services for index data according to the SLO information custom-configured by the tenant. In another alternative implementation manner, when the system has remaining cloud resources (e.g., storage capacity), the cloud management platform 220 may order the index metadata related to tenants' hotspot queries from high to low SLO priority, cache the index metadata related to SLO queries of the different priorities in the caching unit 270 in that order, and handle the index metadata that is not written into the caching unit 270 according to the system's default lowest-level SLO policy.
The query hotspots of a tenant may change dynamically, for example with adjustments to the tenant's business, to the tenant's requirements for using cloud monitoring data, or to the usage of cloud resources. Accordingly, the analysis unit 260 may dynamically determine and/or adjust, according to the updated query hotspots of the tenant, the priority of the index metadata associated with the tenant's SLO information, the caching policy for the index metadata, the tenant's hierarchical processing policy, and so on. The details are described below with reference to the method flowcharts and are not repeated here.
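The priority-ordered caching in the last implementation can be sketched as below. This is a minimal sketch under stated assumptions: cache capacity is expressed as a number of metadata entries, and the function and constant names are hypothetical.

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def plan_cache(entries: list[tuple[str, str]], capacity: int) -> tuple[list[str], list[str]]:
    """Split metadata keys into those cached and those left to the default policy.

    `entries` holds (metadata_key, slo_level) pairs; `capacity` is the number
    of entries the cache can still hold (an assumed unit of capacity).
    """
    # sorted() is stable, so entries with equal priority keep their input order.
    ordered = sorted(entries, key=lambda entry: PRIORITY_ORDER[entry[1]])
    cached = [key for key, _ in ordered[:capacity]]
    default_policy = [key for key, _ in ordered[capacity:]]
    return cached, default_policy
```

With a capacity of 2 and four entries of which two are high priority, only the two high-priority entries are cached; the remainder fall back to the system's default lowest-level SLO policy.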
3. Caching unit 270
The caching unit 270 may communicate with the analysis unit 260 and is used to cache analysis data from the analysis unit 260, e.g., the index metadata associated with the SLO information of a tenant and/or the hierarchical processing policy suited to the tenant. In an alternative implementation, the caching unit 270 may belong to the back-end memory 230.
After receiving the index data from the cloud service module 210 and/or the query request from the tenant, the cloud management platform 220 may obtain index metadata associated with SLO information of the tenant from the cache unit 270, match the obtained index metadata with metadata carried in the index data reported by the cloud service module 210 and cloud monitoring index metadata carried in the query request of the tenant, select, according to the matching result, one of the at least two back-end storage modules (the at least two back-end storage modules correspond to different processing levels) that is matched with the SLO information of the tenant as a target back-end storage module, and store, query, and respond to the corresponding index data by using the target back-end storage module.
Specifically, the cloud management platform 220 loads the metadata associated with the tenant's SLO information from the caching unit 270 (for example, by loading it periodically or by subscribing to change notifications), and writes the index data that the cloud service module 210 reports by calling the API into different back-end storage modules in real time according to metadata type. For example, the index data related to a high-priority SLO requirement is written into the memory-type back-end storage module, the index data related to a medium-priority SLO requirement is written into the memory-type or SSD-type back-end storage module, and the index data related to the default lowest-priority SLO requirement is written into the OBS-type back-end storage module.
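The write routing described above can be sketched as follows. This is an illustrative sketch only: the `Router` class and its method are hypothetical, the cached metadata is reduced to a mapping from metric key to SLO level, and the medium-priority level is mapped to the SSD-type module although the text also allows the memory-type module.

```python
from collections import defaultdict

# Medium priority could also go to "memory" per the text; "ssd" is chosen here.
TARGET_BY_LEVEL = {"high": "memory", "medium": "ssd", "low": "obs"}


class Router:
    """Routes reported index data to a back-end storage module by SLO level."""

    def __init__(self, cached_levels: dict[str, str]) -> None:
        # Metadata loaded from the caching unit: metric key -> SLO level.
        self.cached_levels = dict(cached_levels)
        self.backends: dict[str, list] = defaultdict(list)

    def write(self, metric_key: str, point: float) -> str:
        """Store one data point and return the name of the module used."""
        # Metadata with no cached SLO falls back to the default lowest level.
        level = self.cached_levels.get(metric_key, "low")
        target = TARGET_BY_LEVEL[level]
        self.backends[target].append((metric_key, point))
        return target
```

A query for the same metric key would use the same mapping to decide which module to read from.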
Because the data is stored differently, the query response of the memory-type back-end storage module can reach a latency of P99 < 50 ms, and the query response of the SSD-type back-end storage module can reach a latency of P99 < 500 ms. When the tenant sets a target of P99 < 50 ms, the cloud management platform 220 uses the cache or the memory-type back-end storage module as the target back-end storage module for the index data, which reduces latency, better matches the tenant's requirements for using the index data, and helps ensure the effective utilization of cloud resources. When the tenant sets a target of P99 < 500 ms, the cloud management platform 220 uses the SSD-type back-end storage module as the target back-end storage module for the index data, which reduces storage cost.
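The selection rule in this paragraph amounts to picking the cheapest module whose achievable latency meets the tenant's target. A minimal sketch, assuming that rule: the function name is hypothetical, the latency bounds are the two figures given above, and the OBS-type module is omitted because no latency figure is given for it.

```python
from typing import Optional

# (module name, achievable P99 latency bound in ms), cheapest first.
MODULES_BY_COST = [("ssd", 500.0), ("memory", 50.0)]


def pick_module(target_p99_ms: float) -> Optional[str]:
    """Return the cheapest module that meets the target, or None if none does."""
    for name, achievable_ms in MODULES_BY_COST:
        if achievable_ms <= target_p99_ms:
            return name
    return None
```

Here `pick_module(500)` selects the SSD-type module and `pick_module(50)` selects the memory-type module, consistent with the two cases described above.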
It is to be understood that, in the embodiment of the present application, the buffer unit 270 may also be used to buffer other information, and the specific function of the buffer unit is not limited in the embodiment of the present application.
Thus far, the cloud monitoring system 200 and its functional modules according to the embodiments of the present application have been described with reference to the drawings. The cloud management platform can provide a channel through which the tenant customizes the service level target information it expects, so that the tenant can configure SLO information in a customized manner. For a tenant that sets SLO information through the API, the cloud management platform 220 performs hierarchical processing on the index data associated with the tenant according to the tenant's customized SLO information, so as to better match the tenant's use requirement for the index data. For example, in the case where the tenant sets a target of P99 latency less than 50 ms, the cloud management platform 220 uses the cache or the memory-type back-end storage module to reduce latency and meet the tenant's requirement. The cloud management platform 220 may also provide an analysis unit 260. The analysis unit 260 may obtain a query model of the tenant from the tenant's query dotting statistical data, and dynamically adjust the hierarchical processing policy corresponding to the tenant in combination with the setting parameters of the cloud management platform 220 and the actual usage of cloud resources. In this way, cloud resources are utilized as fully as possible while the tenant's use requirement for index data is guaranteed, the overall query latency and storage cost are reduced, and user experience is improved.
It should be noted that the foregoing embodiments of the present application use only the storage module and the storage function as examples to illustrate how index data is processed, and this does not constitute any limitation. In other embodiments, the cloud management platform may further implement other processing functions for the index data, including but not limited to access, processing, and aggregation. Accordingly, any one processing function may correspond to at least two back-end processing modules. For a given processing function, the cloud management platform may select one of the at least two back-end processing modules as a target processing module, and the target processing module performs the corresponding processing of the index data. For the detailed implementation, refer to the foregoing description of the storage function; details are not repeated here.
The cloud monitoring method according to the embodiment of the application is described below with reference to the accompanying drawings and the embodiments.
Fig. 4 shows a flow chart of a cloud monitoring method according to an embodiment of the present application. The method can be implemented by the terminal equipment of the tenant, the cloud management platform shown in fig. 2 and the submodules thereof. As shown in fig. 4, the cloud monitoring method may include the steps of:
S410: The cloud management platform determines the SLO information input or selected by the tenant at the cloud management platform.
In this embodiment of the present application, the SLO information is custom-configured by the tenant and represents the tenant's use requirement for index data, where the index data is data generated when the cloud monitoring service provided by the cloud management platform monitors the cloud resources that the tenant purchased at the cloud management platform. The SLO information may be persisted in the SLO storage unit. When S410 is implemented, the cloud management platform may obtain the SLO information of the tenant from the SLO storage unit, for example according to the unique identifier of the tenant (or of the tenant's terminal device).
In an optional implementation manner, when the tenant self-defines and configures SLO information, the tenant may send a cloud monitoring service configuration request to the cloud management platform through the terminal device. Accordingly, the cloud management platform may receive a cloud monitoring service configuration request from the tenant, and respond to the cloud monitoring service configuration request to feed back a cloud monitoring service configuration response to the terminal device, where the cloud monitoring service configuration response may be used to indicate the alternative configuration information. The tenant can determine target configuration information according to the alternative configuration information, and send the target configuration information to the cloud management platform through the terminal equipment, wherein the target configuration information can be used for acquiring SLO information of the tenant. Details of the detailed configuration may be found in the related description above in connection with the API format and console interface, and will not be described in detail herein.
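As a rough illustration of S410, the sketch below persists one tenant-configured SLO record and reads it back by the tenant's unique identifier. The field list loosely follows the API fields named in the claims (namespace, group name, instance name, monitoring item name, time period, and the SLO the tenant expects); the class names, the concrete field types, and the use of an in-memory dictionary in place of the SLO storage unit are all assumptions of this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SloConfig:
    tenant_id: str        # unique identifier of the tenant
    namespace: str
    group_name: str
    instance_name: str
    metric_name: str      # "monitoring item name"
    period_s: int         # "time period", in seconds (assumed unit)
    p99_target_ms: float  # SLO expected by the tenant (assumed form)


class SloStore:
    """In-memory stand-in for the SLO storage unit (hypothetical)."""

    def __init__(self) -> None:
        self._by_tenant = {}

    def put(self, cfg: SloConfig) -> None:
        # One record per (tenant, metric); a later write replaces the earlier one.
        self._by_tenant.setdefault(cfg.tenant_id, {})[cfg.metric_name] = cfg

    def get(self, tenant_id: str, metric_name: str) -> Optional[SloConfig]:
        return self._by_tenant.get(tenant_id, {}).get(metric_name)
```

A real SLO storage unit would persist the record durably; the dictionary is used here only to keep the example self-contained.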
S420: The cloud management platform selects, from at least two back-end storage modules, one back-end storage module matched with the SLO information to store the index data.
In this embodiment, the index data of the tenant may be generated by the monitored cloud resource.
In general, a tenant may subscribe to a cloud monitoring service (including various cloud monitoring metrics related to the cloud monitoring service) from a cloud monitoring system. The cloud management platform can monitor cloud resources according to cloud monitoring services subscribed by tenants and generate corresponding index data. In the implementation S420, the index data may be reported to the cloud management platform by the monitored cloud service module through calling an API (e.g., a data communication API), or the cloud management platform may monitor the cloud resource to obtain the index data.
Illustratively, in the embodiment of the present application, each cloud monitoring index related to the cloud monitoring service may include, but is not limited to, at least one of the following: write traffic/read traffic, raw data size, overall QPS, number of operations, service status, UE resolution success traffic, UE resolution success line number, UE resolution failure line number, UE error number, error occurrence IP statistics. The index data to be stored in S420 may include cloud monitoring data corresponding to the at least one cloud monitoring index.
In S420, the cloud management platform uses the selected back-end storage module as the target back-end storage module, and routes the index data to be stored to the target back-end storage module for storage.
The method steps of the cloud monitoring method shown in fig. 4 will be described in detail with reference to fig. 5 and 6.
S501: the tenant self-defines configuration SLO information by calling an API (e.g., a management class API) through the terminal device.
S502: the called API persistently stores the SLO information of the tenant custom configuration to an SLO storage unit.
It should be noted that, in the embodiment of the present application, SLO information of the customized configuration of the tenant may include a query range and a query capability parameter desired by the tenant, and the above optional implementation is merely an example of the customized configuration process and not any limitation, and in other embodiments, the tenant may also customize to configure the SLO information in other manners, which is not described herein.
S503: a data source (e.g., including the aforementioned cloud service module) invokes an API (e.g., a data communication class API) to report the tenant's index data to the cloud management platform.
S504: In response to the API being called, the analysis unit obtains the index metadata cached for the tenant from the cache unit. Specifically, S504 may include: S504a: the analysis unit sends a read request to the cache unit; S504b: the cache unit returns, in response to the read request, a read response to the analysis unit, where the read response includes the index metadata cached for the tenant.
S505: the analysis unit analyzes and matches the metadata carried in the index data with the cached index metadata, and determines one back-end storage module matched with the SLO information of the tenant from at least two back-end storage modules as a target back-end storage module according to the matching result.
S506: the analysis unit routes the index data to the target back-end storage module so as to store the index data by using the target back-end storage module.
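The write path in S503 to S506 can be sketched as a lookup followed by a routed write. In this simplified example the cached index metadata is modelled as a mapping from (tenant, metric) to a back-end name, and unmatched data falls back to the OBS-type module as the default lowest tier; the class `Backend`, the function `route_write`, and the record layout are hypothetical.

```python
class Backend:
    """Toy back-end storage module that just collects written records."""

    def __init__(self, name):
        self.name = name
        self.rows = []

    def write(self, record):
        self.rows.append(record)


def route_write(record, cached_metadata, backends, default="obs"):
    """Match the record's metadata against the cache and store the record.

    record: dict with "tenant", "metric", and "value" keys (assumed layout).
    cached_metadata: {(tenant, metric): back-end name}, derived from SLO info.
    Returns the name of the target back end that received the record.
    """
    key = (record["tenant"], record["metric"])      # metadata carried in the data
    target = cached_metadata.get(key, default)      # S505: match and select
    backends[target].write(record)                  # S506: route and store
    return target
```

Keeping the selection in a single function means the caller never needs to know which module actually holds the data.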
As shown in fig. 5, the at least two back-end storage modules (also referred to as hierarchical back-end) may include a plurality of levels of back-end storage modules obtained by dividing back-end storage, for example, level 1 back-end, level 2 back-end, level 3 back-end, … …, level n back-end, where n represents the hierarchical number of back-end storage. After the analysis unit makes a decision according to the cached index metadata, at least one target back-end storage module can be determined in the n back-end storage modules according to a processing strategy corresponding to the index data, and the index data is sent to the at least one target back-end storage module so as to store the index data by using the at least one target back-end storage module.
It can be understood that fig. 5 only schematically illustrates that the analysis unit can send the index data to the target back-end storage module of the n back-end storage modules, so as to implement storage of the index data, and the n back-end storage modules are not limited to be the target back-end storage modules.
In an alternative implementation, the cloud management platform may also provide query and use functions for the tenant for the index data.
Illustratively, when a tenant triggers a query request, as shown in fig. 5, the query request process may include the steps of:
S507: The terminal device of the tenant calls an API to report the query request.
S508: In response to the API being called, the analysis unit obtains the index metadata cached for the tenant from the cache unit. Specifically, S508 may include, for example: S508a: the analysis unit sends a read request to the cache unit; S508b: the cache unit returns, in response to the read request, a read response to the analysis unit, where the read response includes the index metadata cached for the tenant.
S509: The analysis unit matches the cloud monitoring index metadata carried in the query request with the cached index metadata, and selects, according to the matching result, a target back-end storage module matched with the SLO information of the tenant from the at least two back-end storage modules.
S510: The analysis unit routes the query request to the target back-end storage module, and the target back-end storage module feeds back the target index data corresponding to the cloud monitoring index metadata, that is, the access result, to the terminal device.
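The query steps S507 to S510 can be illustrated in a similar way: the same cached metadata that decides where data is written also decides where a query is answered from. The sketch assumes a toy store with one dictionary per tier; `TieredStore` and `handle_query` are hypothetical names, and defaulting to the OBS tier for unmatched metadata is an assumption.

```python
class TieredStore:
    """Toy tiered back end: one dict per tier, keyed by (tenant, metric)."""

    def __init__(self, tiers):
        self.data = {tier: {} for tier in tiers}

    def query(self, tier, key):
        # Return the stored samples, or an empty list if nothing is stored.
        return self.data[tier].get(key, [])


def handle_query(request, cached_metadata, store, default_tier="obs"):
    """Select the tier for the requested metric, then read from that tier."""
    key = (request["tenant"], request["metric"])
    tier = cached_metadata.get(key, default_tier)  # S508/S509: match and select
    return tier, store.query(tier, key)            # S510: route and answer
```

The returned pair exposes both the selected tier and the access result, which makes the routing decision easy to inspect.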
In an optional implementation manner, the analysis unit can dynamically determine and/or adjust SLO information of the tenant, so that cloud resources are fully utilized as much as possible under the condition that the use requirement of the tenant on index data is guaranteed, and overall query response time delay, storage cost and the like are reduced, so that user experience is improved. By way of example, the process may include the steps of:
S511: When the tenant triggers a query request, the called API obtains the query dotting statistical data of the tenant and sends it to the analysis unit.
S512: The analysis unit acquires SLO information of the tenant from the SLO storage unit.
S513: The analysis unit determines hot spot metadata according to the query dotting statistical data of the tenant, and analyzes the SLO information of the tenant according to the hot spot metadata to obtain an analysis result, where the analysis result may indicate whether to cache the index metadata associated with the SLO information of the tenant.
S514: The analysis unit caches the index metadata that needs to be cached for the tenant into the cache unit.
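Steps S511 to S514 can be pictured as counting which metrics a tenant queries most often and caching the metadata only for those. The sketch below treats the query dotting statistical data as a flat list of queried metric keys and uses a fixed hit threshold to define a hot spot; both choices, and the names `hot_metadata` and `refresh_cache`, are assumptions made for illustration.

```python
from collections import Counter


def hot_metadata(query_log, min_hits=3):
    """Return the set of metric keys queried at least `min_hits` times."""
    counts = Counter(query_log)
    return {key for key, hits in counts.items() if hits >= min_hits}


def refresh_cache(query_log, slo_by_key, cache, min_hits=3):
    """Cache SLO-derived metadata only for hot keys that have SLO info.

    slo_by_key: {metric key: back-end name}, derived from the tenant's SLO info.
    cache: dict updated in place and returned for convenience.
    """
    for key in hot_metadata(query_log, min_hits):
        if key in slo_by_key:
            cache[key] = slo_by_key[key]
    return cache
```

Metrics that are rarely queried, or that have no SLO information, are left out of the cache in this sketch.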
In another optional implementation manner, the SLO information stored in the SLO storage unit may be replaced by a hierarchical processing policy of the tenant, where the hierarchical processing policy may be dynamically adjusted by a scheduler of the cloud management platform according to the SLO information of the tenant and the cloud resources.
In a specific embodiment, as shown in fig. 6, the hierarchical processing policy of the tenant may be obtained by the following steps:
S601: According to the cloud service and/or cloud monitoring service subscribed to by the tenant, a system administrator allocates to the tenant, through the management class API, a proportion threshold for exclusive cloud resources and a proportion threshold for shared cloud resources, and persists these thresholds in the SLO storage unit as the initial hierarchical processing policy.
S602: the hierarchical back end (comprising at least two back end storage modules) sends each cloud monitoring index written by the tenant in real time to the analysis unit, and each cloud monitoring index can be used for indicating a query hotspot of the tenant.
S603: the analysis unit can combine the updated result of the query hotspot of the tenant and the initial hierarchical processing strategy set by the system administrator to generate an adaptive hierarchical processing strategy.
S604: the scheduler acquires the hierarchical processing strategy from the analysis unit, and updates the priority of index metadata associated with the SLO information of the tenant and the caching strategy of the index metadata according to the hierarchical processing strategy.
S605: the scheduler sends the priority of index metadata associated with the SLO information of the tenant and the caching strategy of the index metadata to the caching unit so that the caching unit executes the caching step of the index metadata associated with the SLO information of the tenant according to the priority and the caching strategy.
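A highly simplified reading of S601 to S605 is sketched below: the administrator's proportion thresholds bound how many cache entries a tenant may occupy, and the observed query hot spots decide which index metadata receives those entries. Expressing the thresholds as integer percentages of a fixed cache capacity, ranking by hit count, and the names `InitialPolicy` and `plan_cache` are all assumptions of this example rather than details given in the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitialPolicy:
    exclusive_pct: int  # share of cache capacity reserved for the tenant
    shared_pct: int     # maximum share the tenant may borrow from the shared pool


def plan_cache(policy, hotspot_hits, capacity, shared_free):
    """Derive a priority rank and a cache decision for each metadata key.

    hotspot_hits: {metadata key: recent query count}.
    capacity: total cache entries; shared_free: shared entries currently free.
    Returns {key: (priority_rank, should_cache)}, where rank 0 is highest.
    """
    exclusive = capacity * policy.exclusive_pct // 100
    shared = min(capacity * policy.shared_pct // 100, shared_free)
    budget = exclusive + shared
    # Most-queried keys first; ties broken by key name for determinism.
    ranked = sorted(hotspot_hits, key=lambda k: (-hotspot_hits[k], k))
    return {key: (rank, rank < budget) for rank, key in enumerate(ranked)}
```

Re-running `plan_cache` whenever the hot spot counts change gives a simple form of the periodic adjustment performed by the analysis unit and the scheduler.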
In an optional implementation manner, the analysis unit and the scheduler may periodically update the priority of the index metadata associated with the SLO information of the tenant and the caching policy of the index metadata, so that the cached index metadata is updated according to the updated priority and caching policy. In this way, the processing policies of the different hierarchical back ends are dynamically adjusted, and the effective utilization of cloud resources is improved while the different requirements of different tenants for the index data of the cloud monitoring service are guaranteed.
In combination with the above method embodiment, the embodiment of the present application further provides a cloud management platform, where a specific structure of the cloud management platform may be shown with reference to fig. 2, and may be used to execute the method executed by the cloud management platform and each sub-module thereof in the above method embodiment.
As shown in fig. 7, the cloud management platform 700 may include: an SLO storage unit 701, configured to determine service level target SLO information input or selected by a tenant at the cloud management platform, where the SLO information is used to represent a use requirement of the tenant on index data, and the index data is data generated by monitoring cloud resources purchased by the tenant at the cloud management platform by a cloud monitoring service provided by the cloud management platform; and an analysis unit 702, configured to select, from at least two back-end storage modules, one back-end storage module that matches the SLO information to store the index data, where the at least two back-end storage modules have different data reading rates and/or different aging times.
For example, the SLO storage unit 701 and the analysis unit 702 may be integrated in the cloud management platform shown in fig. 2, and the product forms of the SLO storage unit 701 and the analysis unit 702 are not limited in this embodiment, and details of the implementation of the functions of the SLO storage unit 701 and the analysis unit 702 may be referred to the related description of the above method embodiments and are not repeated herein.
It should be noted that, in the embodiment of the present application, the division of the units is schematic, which is merely a logic function division, and other division manners may be implemented in actual practice. The functional units in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution, in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In a simple embodiment, a person skilled in the art will appreciate that the cloud management platform or the terminal device in the above embodiments may take the form shown in fig. 8. The communication device 800 shown in fig. 8 includes at least one processor 810 and a memory 820, and optionally a communication interface 830.
Memory 820 may be a volatile memory such as a random access memory; the memory may also be a non-volatile memory such as, but not limited to, read-only memory, flash memory, hard disk (HDD) or Solid State Drive (SSD), or the memory 820 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Memory 820 may be a combination of the above.
The specific connection medium between the processor 810 and the memory 820 is not limited in the embodiments of the present application.
In the apparatus of fig. 8, a communication interface 830 is further included, and the processor 810 may perform data transmission through the communication interface 830 when communicating with other devices.
When the cloud management platform takes the form shown in fig. 8, the processor 810 in fig. 8 may cause the apparatus 800 to perform the method performed by the cloud management platform in any of the method embodiments described above by invoking computer-executable instructions stored in the memory 820.
When the tenant-side terminal device takes the form shown in fig. 8, the processor 810 in fig. 8 may cause the device 800 to execute the method executed by the tenant-side terminal device in any of the above-described method embodiments by invoking computer-executable instructions stored in the memory 820.
Embodiments of the present application also relate to a chip system including a processor for invoking a computer program or computer instructions stored in a memory to cause the processor to perform the above-described method embodiments.
In one possible implementation, the processor is coupled to the memory through an interface.
In one possible implementation, the system on a chip further includes a memory having a computer program or computer instructions stored therein.
Embodiments of the present application also relate to a processor for invoking a computer program or computer instructions stored in a memory to cause the processor to perform the above-described method embodiments.
The processor referred to in any of the above may be a general purpose central processing unit, a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the program execution of the method in the embodiment shown in fig. 8. The memory mentioned in any of the above may be a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (random access memory, RAM), etc.
It should be appreciated that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present application without departing from the scope of the embodiments of the present application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to encompass such modifications and variations.
Claims (22)
1. A cloud monitoring method, the method comprising:
the cloud management platform determines service level target SLO information input or selected by a tenant at the cloud management platform, wherein the SLO information is used for representing the use requirement of the tenant on index data, and the index data is data generated by monitoring cloud resources purchased by the tenant at the cloud management platform by cloud monitoring service provided by the cloud management platform;
the cloud management platform selects one back-end storage module matched with the SLO information from at least two back-end storage modules to store the index data, wherein the data reading rates of the at least two back-end storage modules are different and/or the aging times of the at least two back-end storage modules are different.
2. The method of claim 1, wherein the at least two back-end storage modules include a memory-type back-end storage module whose type is memory, an SSD-type back-end storage module whose type is solid state drive (SSD), and an OBS-type back-end storage module whose type is object storage service (OBS) bucket, wherein the memory-type back-end storage module has the fastest data reading rate and the shortest data aging time, and the OBS-type back-end storage module has the slowest data reading rate and the longest data aging time.
3. The method of claim 2, wherein, in the case where the SLO information indicates that the tenant's use requirement for index data is high priority, the cloud management platform selects one back-end storage module that matches the SLO information from at least two back-end storage modules to store the index data, comprising:
and the cloud management platform selects a memory type back-end storage module with a memory type to store the index data.
4. The method of claim 2 or 3, wherein, in a case where the SLO information indicates that a requirement of the tenant for use of index data is a medium priority, the cloud management platform selects one back-end storage module matched with the SLO information from at least two back-end storage modules to store the index data, including:
and the cloud management platform selects a memory type back-end storage module to store the index data.
5. The method of any of claims 2-4, wherein, in the case where the SLO information indicates that the tenant's use requirement for index data is of medium priority, the cloud management platform selects one back-end storage module from at least two back-end storage modules that matches the SLO information to store the index data, comprising:
and the cloud management platform selects an SSD type back-end storage module with SSD type to store the index data.
6. The method of any of claims 2-5, wherein, in a case where the SLO information is used to indicate that a requirement of the tenant for use of index data is low priority, the cloud management platform selects one back-end storage module that matches the SLO information from at least two back-end storage modules to store the index data, including:
and the cloud management platform selects an OBS type back-end storage module with the type of the OBS bucket to store the index data.
7. The method of any of claims 1-6, wherein the cloud management platform determining service level target SLO information entered or selected by a tenant at the cloud management platform comprises:
the cloud management platform providing an application programming interface API to the tenant, the application programming interface API for indicating a plurality of fields representing different attributes of the SLO information;
the cloud management platform receives the SLO information sent by the tenant, wherein the SLO information comprises the plurality of fields and parameters input by the tenant for each field.
8. The method of claim 7, wherein the plurality of fields comprise: namespaces, group names, instance names, monitoring item names, time periods, or SLO information desired by the tenant.
9. The method of any of claims 1-6, wherein the cloud management platform determining service level target SLO information entered or selected by a tenant at the cloud management platform comprises:
the cloud management platform provides a console interface for the tenant;
the cloud management platform determines the SLO information input or selected by the tenant at the console interface, wherein the SLO information comprises a plurality of index attribute configuration items provided by the console interface and parameters input or selected by the tenant for each index attribute configuration item.
10. The method of claim 9, wherein the plurality of index property configuration items comprises: hierarchical name, hierarchical resource scope, hierarchical policy.
11. A cloud management platform, comprising:
an SLO storage unit, configured to determine service level target SLO information input or selected by a tenant at the cloud management platform, wherein the SLO information is used for representing the use requirement of the tenant on index data, and the index data is data generated by monitoring cloud resources purchased by the tenant at the cloud management platform by cloud monitoring service provided by the cloud management platform;
and an analysis unit, configured to select one back-end storage module matched with the SLO information from at least two back-end storage modules to store the index data, wherein the data reading rates of the at least two back-end storage modules are different and/or the aging times of the at least two back-end storage modules are different.
12. The cloud management platform of claim 11, wherein said at least two back-end storage modules comprise a memory-type back-end storage module whose type is memory, an SSD-type back-end storage module whose type is solid state drive (SSD), and an OBS-type back-end storage module whose type is object storage service (OBS) bucket, wherein said memory-type back-end storage module has the fastest data reading rate and the shortest data aging time, and said OBS-type back-end storage module has the slowest data reading rate and the longest data aging time.
13. The cloud management platform of claim 12, wherein in a case where the SLO information indicates that the tenant's use requirement for index data is high priority, the analysis unit is configured to:
and selecting a memory type back-end storage module with a memory type to store the index data.
14. The cloud management platform of claim 12 or 13, wherein, in a case where the SLO information indicates that the tenant's use requirement for index data is of medium priority, the analysis unit is configured to:
and selecting a memory type back-end storage module to store the index data.
15. The cloud management platform of any of claims 12-14, wherein, in a case where the SLO information indicates that the tenant's use requirement for index data is of medium priority, the analysis unit is configured to:
and selecting an SSD type back-end storage module with the SSD type to store the index data.
16. The cloud management platform of any of claims 12-15, wherein, in a case where the SLO information is used to indicate that the tenant's use requirement for index data is low priority, the analysis unit is configured to:
and selecting an OBS type back-end storage module with the type of the OBS bucket to store the index data.
17. The cloud management platform of any of claims 11-16, wherein the SLO storage unit determining service level target SLO information entered or selected by a tenant at the cloud management platform comprises:
providing an application programming interface API to the tenant, the application programming interface API for indicating a plurality of fields representing different attributes of the SLO information;
and receiving the SLO information sent by the tenant in response to the API being called, wherein the SLO information comprises the plurality of fields and parameters input by the tenant for each field.
18. The cloud management platform of claim 17, wherein said plurality of fields comprises: namespaces, group names, instance names, monitoring item names, time periods, or SLO information desired by the tenant.
19. The cloud management platform of any of claims 11-16, wherein the SLO storage unit determining service level target SLO information entered or selected by a tenant at the cloud management platform comprises:
providing a console interface to the tenant;
and determining the SLO information input or selected by the tenant at the console interface, wherein the SLO information comprises a plurality of index attribute configuration items provided by the console interface and parameters input or selected by the tenant for each index attribute configuration item.
20. The cloud management platform of claim 19, wherein said plurality of index attribute configuration items comprises: hierarchical name, hierarchical resource scope, hierarchical policy.
21. A communication device, comprising: one or more processors and one or more memories;
the one or more memories are coupled to the one or more processors and are configured to store computer program code comprising computer instructions which, when executed by the one or more processors, cause the communication device to perform the method of any of claims 1-10.
22. A computer readable storage medium for storing a computer program which, when run on a computing device, causes the device to perform the method of any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/116891 WO2023093194A1 (en) | 2021-11-24 | 2022-09-02 | Cloud monitoring method and cloud management platform |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2021113997390 | 2021-11-24 | ||
CN202111399739 | 2021-11-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116166181A (en) | 2023-05-26 |
Family
ID=86413761
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210142252.2A Pending CN116166181A (en) | 2021-11-24 | 2022-02-16 | Cloud monitoring method and cloud management platform |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN116166181A (en) |
WO (1) | WO2023093194A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116956363B (en) * | 2023-09-20 | 2023-12-05 | 微网优联科技(成都)有限公司 | Data management method and system based on cloud computer technology |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8468241B1 (en) * | 2011-03-31 | 2013-06-18 | Emc Corporation | Adaptive optimization across information technology infrastructure |
US10078533B2 (en) * | 2014-03-14 | 2018-09-18 | Amazon Technologies, Inc. | Coordinated admission control for network-accessible block storage |
WO2017074320A1 (en) * | 2015-10-27 | 2017-05-04 | Hewlett Packard Enterprise Development Lp | Service scaling for batch processing |
CN108462596B (en) * | 2017-02-21 | 2021-02-23 | 华为技术有限公司 | SLA decomposition method, equipment and system |
CN109451008B (en) * | 2018-10-31 | 2021-05-28 | 中国人民大学 | Multi-tenant bandwidth guarantee framework and cost optimization method under cloud platform |
2022
- 2022-02-16 CN CN202210142252.2A patent/CN116166181A/en active Pending
- 2022-09-02 WO PCT/CN2022/116891 patent/WO2023093194A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
WO2023093194A1 (en) | 2023-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11010188B1 (en) | Simulated data object storage using on-demand computation of data objects | |
US11861405B2 (en) | Multi-cluster container orchestration | |
US9755990B2 (en) | Automated reconfiguration of shared network resources | |
US10021037B2 (en) | Provisioning cloud resources | |
CN108776934B (en) | Distributed data calculation method and device, computer equipment and readable storage medium | |
US10078533B2 (en) | Coordinated admission control for network-accessible block storage | |
EP2763043A1 (en) | A system, method and apparatus for determining virtual machine performance | |
US10609118B2 (en) | Adaptive communication control device | |
US20120221730A1 (en) | Resource control system and resource control method | |
CN109981702B (en) | File storage method and system | |
US10712958B2 (en) | Elastic storage volume type selection and optimization engine for public cloud environments | |
CN111522636A (en) | Application container adjusting method, application container adjusting system, computer readable medium and terminal device | |
US10439901B2 (en) | Messaging queue spinning engine | |
US10616134B1 (en) | Prioritizing resource hosts for resource placement | |
US10250673B1 (en) | Storage workload management using redirected messages | |
US10063601B2 (en) | Client identification for enforcing computer resource quotas | |
EP2757474A2 (en) | Adaptive virtualization | |
CN108874502B (en) | Resource management method, device and equipment of cloud computing cluster | |
EP4170491A1 (en) | Resource scheduling method and apparatus, electronic device, and computer-readable storage medium | |
US10642585B1 (en) | Enhancing API service schemes | |
US20220283871A1 (en) | Multi-Account Cloud Service Usage Package Sharing Method and Apparatus, and Related Device | |
US11442632B2 (en) | Rebalancing of user accounts among partitions of a storage service | |
CN113377866A (en) | Load balancing method and device for virtualized database proxy service | |
CN116301568A (en) | Data access method, device and equipment | |
US11354164B1 (en) | Robotic process automation system with quality of service based automation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||