WO2015196885A1 - Method and device for collecting and storing performance data of a cloud computing system - Google Patents

Method and device for collecting and storing performance data of a cloud computing system

Info

Publication number
WO2015196885A1
Authority
WO
WIPO (PCT)
Prior art keywords
function
performance
data
database
cloud computing
Prior art date
Application number
PCT/CN2015/079695
Other languages
English (en)
French (fr)
Inventor
秦承刚
黄江伟
唐珂
Original Assignee
阿里巴巴集团控股有限公司
秦承刚
黄江伟
唐珂
Application filed by 阿里巴巴集团控股有限公司, 秦承刚, 黄江伟, 唐珂
Publication of WO2015196885A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units

Definitions

  • the present invention relates to the field of cloud computing, and in particular, to a method and device for collecting and storing performance data of a cloud computing system, and a method, device and system for detecting performance of a cloud computing system.
  • Cloud computing systems include massive basic hardware resources such as servers, storage, and networks, as well as massive basic software resources such as stand-alone operating systems, middleware, and databases.
  • The hardware of a cloud computing system can be viewed as a computer cluster that may involve thousands of machines.
  • For users of a cloud computing system, the cost is usually proportional to the hardware resources occupied by their applications: the more hardware resources used, the higher the cost. Providing performance analysis tools for cloud computing systems therefore helps developers and users optimize the performance of system software and application software, reduces the resource overhead of that software, and has the practical benefit of saving cost.
  • Applications running on a cloud computing system are mostly distributed applications, and the modules that such an application deploys on different machines may be homogeneous or heterogeneous. Performance analysis tools for cloud computing systems must therefore support performance analysis of distributed software.
  • The technical problem to be solved by the present application is to provide a method for collecting and storing performance data of a cloud computing system, so as to solve the problem that the prior art has no solution capable of accurately collecting and storing the performance data of a cloud computing system and, further, to enable performance analysis of the cloud computing system based on the collected and stored performance data.
  • The present application also provides a device for collecting and storing performance data of a cloud computing system, and a method, device and system for detecting the performance of a cloud computing system, which ensure that the above method can be implemented and applied in practice.
  • To solve the above problem, the present application discloses a method for collecting and storing performance data of a cloud computing system, applied to each host in the cloud computing system, including:
  • collecting performance data of the local machine according to a preset sampling period, where the performance data includes: register values of the local CPU, the process identifier (PID) of the running process, the name of the process, and the user stack of the process;
  • parsing the user stack of the process by using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain;
  • saving, in a first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function as the value; and saving, in a second database, the SHA1 code of the DSO file corresponding to each function as the key and the storage location on disk of the function address table in the DSO file as the value, where the function address table stores each function name together with the start and end addresses of that function.
  • the present application discloses a device for collecting and storing performance data of a cloud computing system, including:
  • the collecting unit is configured to collect performance data of the local machine according to a preset sampling period, where the performance data includes: a register value of the local CPU, a running process identifier PID, a name of the process, and a user stack of the process;
  • a first parsing unit configured to parse a user stack of the process by using a register value of the local CPU to obtain a function call chain of the process at the collection time and a DSO file corresponding to each function in the function call chain;
  • a first saving unit configured to save, in the first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function as the value;
  • a second saving unit configured to save, in the second database, the SHA1 code of the DSO file corresponding to each function as the key and the storage location on disk of the function address table in the DSO file as the value, where the function address table stores each function name together with the start and end addresses of that function.
  • the present application discloses a performance detection method for a cloud computing system, including:
  • receiving a user's performance detection request concerning the cloud computing system, where the performance detection request includes: the performance detection target host, the name of the target process running on the target host, the DSO file involved in the target process, and a time range;
  • sending the performance detection request to the performance detection target host, and receiving the target data returned by the performance detection target host; the target data is the data that the performance detection target host retrieves from the preset first database according to the name of the target process, the DSO file involved in the target process, and the time range, and it includes: the sampling period, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function in the function call chain; in the first database, the collection time is saved as the key, and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function are saved as the value;
  • matching the corresponding function address table from the preset second database according to the SHA1 code in the target data; in the second database, the SHA1 code of the DSO file corresponding to each function is saved as the key, and the storage location on disk of the function address table in the DSO file is saved as the value;
  • parsing the function call chain addresses in the target data by using the function address table to obtain the names of the functions called in the current process;
  • calculating, by using the sampling period, the execution time ratio of the function corresponding to each function name.
  • the present application discloses a performance detecting apparatus for a cloud computing system, including:
  • a receiving request unit configured to receive a user's performance detection request concerning the cloud computing system, where the performance detection request includes: the performance detection target host, the name of the target process running on the target host, the DSO file involved in the target process, and a time range;
  • a sending unit configured to send the performance detection request to the performance detection target host;
  • a receiving data unit configured to receive the target data returned by the performance detection target host; the target data is the data that the performance detection target host retrieves from the preset first database according to the name of the target process, the DSO file involved in the target process, and the time range, and it includes: the sampling period, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function in the function call chain; in the first database, the collection time is saved as the key, and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function are saved as the value;
  • a matching unit configured to match the corresponding function address table from the preset second database according to the SHA1 code in the target data; in the second database, the SHA1 code of the DSO file corresponding to each function is saved as the key, and the storage location on disk of the function address table in the DSO file is saved as the value;
  • a second parsing unit configured to parse the function call chain addresses in the target data by using the function address table to obtain the names of the functions called in the current process;
  • a calculating unit configured to calculate, by using the sampling period, the execution time ratio of the function corresponding to each function name.
  • Compared with the prior art, the present application has the following advantages:
  • In the embodiments of the present application, the performance data of each host in the cloud computing system is collected and stored separately, in two Key-Value databases, so that each host saves its own performance data and the corresponding Value can be retrieved conveniently and efficiently by its Key. This realizes simple distributed data storage, avoids the unnecessary network overhead that centralized data storage would cause, and makes full use of the storage resources on each host.
  • Storing the data this way also makes it easier to analyze the performance of the cloud computing system based on the performance data in the databases. Further, the performance analysis results can be used for targeted optimization of the software on each host.
  • The embodiments of the present application can aggregate and analyze performance data from multiple hosts, realizing performance analysis of large-scale computer clusters and distributed applications.
  • FIG. 1 is a flowchart of an embodiment of a method for collecting and storing performance data of a cloud computing system according to the present application
  • FIG. 2 is a structural block diagram of an embodiment of a device for collecting and storing performance data of a cloud computing system according to the present application
  • FIG. 3 is a flowchart of an embodiment of a performance detecting method of a cloud computing system according to the present application
  • FIG. 4 is a structural block diagram of an embodiment of a performance detecting apparatus of a cloud computing system according to the present application.
  • FIG. 5 is a structural block diagram of an embodiment of a performance detecting system of a cloud computing system according to the present application.
  • This application can be used in a variety of general-purpose or special-purpose computing device environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multi-processor devices, and distributed computing environments that include any of the above devices.
  • the application can be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • The present application can also be practiced in distributed computing environments, in which tasks are performed by connected remote processing devices.
  • In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
  • Referring to FIG. 1, a flowchart of an embodiment of a method for collecting and storing performance data of a cloud computing system is shown. This embodiment may be applied to each host in a cloud computing system and may include the following steps:
  • Step 101: Collect performance data of the local machine according to a preset sampling period, where the performance data includes: register values of the local CPU, the identifier of the running process, the name of the process, and the user stack of the process.
  • Each host may collect and store performance data according to the method in this embodiment.
  • The sampling period can be set in advance by a person skilled in the art, for example 20 ms (that is, a sampling frequency of 50 Hz).
  • When a sampling time arrives, each host reads the register values of the local CPU, such as the values of the instruction pointer register IP, the stack pointer register SP, and the base pointer register BP.
  • This data can be obtained through interfaces provided by the operating system.
  • The actual sampling interval can also be collected. The sampling period is a preset fixed value, but during actual collection the time between two collections is not necessarily equal to the sampling period and may deviate from it; the actual sampling interval can therefore be measured with a high-precision system clock.
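  • The difference between the nominal sampling period and the measured interval can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the applicant's implementation: `sample_loop` and `collect` are hypothetical names, and `collect` stands in for whatever operating-system interface supplies the register values, PID, process name and user stack.

```python
import time

def sample_loop(period_s, n_samples, collect):
    """Call collect() roughly every period_s seconds.

    Each record keeps the interval actually measured with a monotonic
    clock, since it may deviate from the nominal period.
    """
    samples = []
    last = time.monotonic()
    for _ in range(n_samples):
        time.sleep(period_s)          # nominal sampling period
        now = time.monotonic()
        samples.append({
            "interval_s": now - last,  # measured, not nominal, interval
            "data": collect(),         # hypothetical collector callback
        })
        last = now
    return samples
```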
  • Step 102: Parse the user stack of the current process by using the register values of the local CPU to obtain the function call chain addresses of the process at the collection time and the DSO file corresponding to each function in the function call chain.
  • The register values of the local CPU can be used to parse the user stack of the current process, yielding the function call chain of the current process at the collection time and the DSO (Dynamic Shared Object) file corresponding to each function in the chain. The function call chain is the function the CPU is currently executing together with the functions that were called, level by level, to reach it.
  • The function call chain addresses are the addresses of the functions in the function call chain.
  • The current process may correspond to multiple DSO files, so this step must also determine which DSO file each function in the function call chain belongs to.
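  • The text does not say how an address is attributed to a DSO file. One plausible approach, shown here purely as an assumption, is to compare each call chain address against the address ranges at which the process has its DSO files loaded (on Linux such ranges are listed in /proc/<pid>/maps). The names `load_map`, `find_dso` and `attribute_chain` are hypothetical.

```python
def find_dso(load_map, address):
    """Return the DSO whose [start, end) range contains address, or None.

    load_map is a list of (start, end, dso_path) tuples; this layout is
    an assumption made for the sketch.
    """
    for start, end, dso_path in load_map:
        if start <= address < end:
            return dso_path
    return None

def attribute_chain(load_map, chain_addresses):
    """Pair every address of a function call chain with its DSO file."""
    return [(addr, find_dso(load_map, addr)) for addr in chain_addresses]
```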
  • Step 103: Save the collection time as the key, and the sampling period, the PID of the current process, the process name, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function as the value, in the first database.
  • In this step, the collection time is used as the Key, and the sampling period, the PID of the current process, the process name, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function as the Value, and the pair is saved in the first Key-Value database.
  • Step 104: Save the SHA1 code of the DSO file corresponding to each function as the key, and the storage location on disk of the function address table in the DSO file as the value, in the second database, where the function address table stores each function name together with the start and end addresses of the function.
  • In this step, the SHA1 code of the DSO file corresponding to each function is used as the Key and the storage location of the function address table in the DSO file as the Value, and the pair is saved in the second Key-Value database.
  • The DSO file contains a function address table, which stores each function name together with the start address and end address of the corresponding function. Because one DSO file may contain multiple functions, one SHA1 code may also correspond to multiple functions.
  • The order of step 103 and step 104 can also be reversed.
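  • Steps 103 and 104 can be sketched with two in-memory dictionaries standing in for the two Key-Value databases. This is a minimal sketch under assumptions: the record field names, the `save_sample` helper and the way the DSO contents and table locations are passed in are all hypothetical; only the key/value roles and the use of SHA1 come from the text.

```python
import hashlib

def save_sample(first_db, second_db, collect_time, period_ms, pid, name,
                chain, dso_contents, table_locations):
    """Store one sample in the two key-value stores.

    chain:           list of (address, dso_path) pairs
    dso_contents:    dso_path -> bytes of the DSO file (hashed with SHA1)
    table_locations: dso_path -> location of its function address table
    """
    digests = {path: hashlib.sha1(data).hexdigest()
               for path, data in dso_contents.items()}
    # First database: collection time -> sampled record.
    first_db[collect_time] = {
        "period_ms": period_ms,
        "pid": pid,
        "name": name,
        "chain": [addr for addr, _ in chain],
        "dso_sha1": [digests[path] for _, path in chain],
    }
    # Second database: SHA1 of DSO file -> location of its address table.
    for path, digest in digests.items():
        second_db[digest] = table_locations[path]
```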
  • With the above method, the performance data of each host in the cloud computing system is collected and stored separately, in two Key-Value databases, so that each host saves its own performance data and the corresponding Value can be retrieved conveniently and efficiently by its Key. This realizes simple distributed data storage, avoids the unnecessary network overhead that centralized data storage would cause, and makes full use of the storage resources on each host. It also makes it easier to analyze the performance of the cloud computing system based on the performance data in the databases.
  • the method may further include:
  • Step 105: Determine whether the storage time of the data in the first database or the second database exceeds a preset time threshold, and if so, proceed to step 106.
  • The preset time threshold can be expressed as N days, where N is a natural number whose value can be set by a person skilled in the art according to the actual storage space or technical requirements of each host, for example 7 days. The host determines whether the storage time of the data in the first database or the second database has exceeded the preset time threshold; if it has, the data is no longer kept, and if it has not, the data is left untouched.
  • Step 106: Delete the data whose storage time exceeds the preset time threshold.
  • The host deletes the data it has stored for longer than the preset time threshold, which further saves storage space on the host.
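  • Steps 105 and 106 amount to a retention check. The sketch below assumes, for illustration only, that each entry is stored together with the time at which it was written; `purge_expired` and the `(stored_at, value)` layout are hypothetical.

```python
def purge_expired(db, now, max_age_s):
    """Delete entries stored more than max_age_s seconds before now.

    db maps key -> (stored_at, value); returns the number of deletions.
    """
    expired = [key for key, (stored_at, _) in db.items()
               if now - stored_at > max_age_s]
    for key in expired:
        del db[key]
    return len(expired)
```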
  • In practice, "one day" can also be set as the time attribute of the data tables in the first database and the second database; this time attribute indicates that the performance data of each day is stored in its own table in the first database or the second database.
  • Referring to FIG. 3, a flowchart of an embodiment of a performance detection method of a cloud computing system is shown.
  • The embodiment can be applied to hosts in the cloud computing system that are dedicated to performance detection; these hosts may be different from the hosts that collect and store the performance data.
  • The embodiment may include the following steps:
  • Step 201: Receive a user's performance detection request concerning the cloud computing system, where the request includes: the performance detection target host, the name of the target process running on the target host, the DSO file involved in the target process, and a time range.
  • The host dedicated to performance detection responds to the performance detection request and obtains from it the performance detection target host, the name of the target process running on the target host, the DSO file involved in the target process, and the time range.
  • The performance detection target host is the host in the cloud computing system whose performance the user wants to examine.
  • The name of the target process identifies the process to be examined on that host. Because a process may involve multiple DSO files, the user specifies in the request which DSO file is of interest. The time range determines which collection times' performance data is examined.
  • Step 202: Send the performance detection request to the performance detection target host, and receive the target data returned by the performance detection target host.
  • The target data is the data that the performance detection target host retrieves from the preset first database according to the name of the target process, the DSO file involved in the target process, and the time range. The target data includes: the sampling period, the function call chain, and the SHA1 code of the DSO file corresponding to each function in the function call chain.
  • The host dedicated to detection forwards the request to the performance detection target host named in it; on receiving the request, the target host filters the target data out of the first database according to the name of the target process, the DSO file involved in the target process, and the time range.
  • The first database is established in advance by the method shown in FIG. 1. Specifically, the performance detection target host may first use the time range as the key to retrieve all values in that range from the first database, and then use the name of the target process and the DSO file involved in the target process to filter out the final target data.
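  • The two-stage retrieval (time range first, then process name and DSO file) might look like the following sketch. It assumes the first database is a dictionary keyed by collection time whose values carry `name` and `dso_sha1` fields, and that the requested DSO file is identified by its SHA1 code; these are illustrative assumptions, as is the function name `query_first_db`.

```python
def query_first_db(first_db, start, end, process_name, dso_sha1):
    """Return the records matching a performance detection request."""
    # Stage 1: use the time range on the key (collection time).
    in_range = {t: rec for t, rec in first_db.items() if start <= t <= end}
    # Stage 2: filter the values by process name and DSO file.
    return {t: rec for t, rec in in_range.items()
            if rec["name"] == process_name and dso_sha1 in rec["dso_sha1"]}
```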
  • Step 203: Match the corresponding function address table from the preset second database according to the SHA1 code in the target data.
  • The SHA1 code is used as the key to look up, in the second database, the storage location on disk of the matching function address table, and the function address table is then read from that location.
  • The second database is established in advance by the method shown in FIG. 1.
  • Step 204: Parse the function call chain addresses in the target data by using the function address table to obtain the names of the functions called in the current process.
  • The function address table holds each function name together with the start address and end address of the corresponding function, and the function call chain addresses give an instruction address for each function in the chain. The function name for an address can therefore be found by checking which function's start and end addresses the instruction address falls between.
  • For example, if one entry of the function address table says that function A occupies addresses 0x00000001 to 0x00000005 and an address in the function call chain is 0x00000002, then that address belongs to function A.
  • The work of this step can be divided among different hosts; after each host completes its address comparison, the results of the hosts are aggregated to obtain the overall result.
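  • The address comparison in step 204 can be sketched as a range lookup. The sketch assumes the function address table has been loaded as a list of `(start, end, name)` rows sorted by start address, with both ends inclusive as in the function A example; the binary search is an implementation choice of this sketch, not something the text prescribes.

```python
from bisect import bisect_right

def resolve(table, address):
    """Return the name of the function whose range contains address.

    table: list of (start, end, name) sorted by start, ends inclusive.
    """
    starts = [row[0] for row in table]
    i = bisect_right(starts, address) - 1   # last row starting <= address
    if i >= 0 and table[i][0] <= address <= table[i][1]:
        return table[i][2]
    return None

def resolve_chain(table, chain_addresses):
    """Translate every address of a call chain into a function name."""
    return [resolve(table, addr) for addr in chain_addresses]
```

  • With a table containing the row `(0x00000001, 0x00000005, "A")`, the address `0x00000002` resolves to `"A"`, matching the example in the text.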
  • Step 205: Calculate, by using the sampling period, the execution time ratio of the function corresponding to each function name.
  • The sampling period can be used to calculate the execution time ratio of the function corresponding to each function name. The same address may appear repeatedly in the function call chain addresses, that is, a function may be called multiple times; the total execution time of the function is obtained by accumulating the time of each occurrence, which equals the number of occurrences multiplied by the sampling period. Dividing the total execution time of each function by the total sampled time gives its execution time ratio, a value between 0 and 1.
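  • The arithmetic of step 205 can be sketched as follows. Two assumptions are made and should be read as such: each sample contributes the one function that was executing when it was taken, and the denominator is the total sampled time (number of samples multiplied by the sampling period), which is what keeps every ratio between 0 and 1. The name `execution_time_ratios` is hypothetical.

```python
from collections import Counter

def execution_time_ratios(sampled_functions, period_ms):
    """Map each function name to its share of the sampled time.

    sampled_functions: one function name per sample (assumption).
    """
    if not sampled_functions:
        return {}
    counts = Counter(sampled_functions)
    total_ms = len(sampled_functions) * period_ms
    # Per-function time = occurrences * sampling period.
    return {name: (n * period_ms) / total_ms for name, n in counts.items()}
```

  • For four samples `["A", "A", "B", "A"]` taken every 20 ms, function A accounts for 60 ms of 80 ms (0.75) and function B for 20 ms (0.25).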
  • the method may further include:
  • Step 206: Generate a performance map of the cloud computing system from the execution time ratios of the functions and the function call chains.
  • The performance map reflects how hot each function is in execution, as well as the calling relationships between functions.
  • The performance map can also be returned to the user who initiated the performance detection request, so that the user can quickly locate the software's hotspots by consulting the map and carry out targeted performance optimization.
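  • The text does not define the format of the performance map. As one possible reading, the sketch below reduces resolved call chains to two structures: a per-function heat value and a count of caller-to-callee edges. The chain orientation (outermost caller first, executing function last) and the name `build_performance_map` are assumptions of this sketch.

```python
from collections import Counter

def build_performance_map(chains):
    """Summarize call chains as (heat, edges).

    chains: list of call chains of function names, outermost caller
            first and the currently executing function last.
    heat:   function -> share of samples in which it was executing.
    edges:  (caller, callee) -> number of samples showing that call.
    """
    executing = Counter(chain[-1] for chain in chains if chain)
    total = sum(executing.values())
    heat = {f: n / total for f, n in executing.items()} if total else {}
    edges = Counter()
    for chain in chains:
        edges.update(zip(chain, chain[1:]))   # adjacent caller/callee pairs
    return heat, dict(edges)
```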
  • In this embodiment, the performance of each host in the cloud computing system can be conveniently detected, so that the hotspots of the software running on each host can be found by analysis, realizing performance analysis of each host in the cloud computing system. Further, the performance analysis results can be used for targeted optimization of the software on each host.
  • the present application further provides an apparatus for collecting and storing performance data of a cloud computing system.
  • the device may include:
  • the collecting unit 301 is configured to collect performance data of the local machine according to a preset sampling period, where the performance data includes: a register value of the local CPU, a running process identifier, a name of the process, and a user stack of the process.
  • the first parsing unit 302 is configured to parse the user stack of the process by using the register value of the local CPU to obtain a function call chain of the process at the acquisition time and a DSO file corresponding to each function in the function call chain.
  • The first saving unit 303 is configured to save, in the first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain, and the SHA1 code of the DSO file corresponding to each function as the value.
  • The second saving unit 304 is configured to save, in the second database, the SHA1 code of the DSO file corresponding to each function as the key and the storage location of the function address table in the DSO file as the value, where the function address table stores each function name together with the start and end addresses of the function.
  • The device in this embodiment collects and stores the performance data of each host in the cloud computing system separately, in two Key-Value databases, so that each host saves its own performance data and the corresponding Value can be retrieved conveniently and efficiently by its Key. This realizes simple distributed data storage, avoids the unnecessary network overhead that centralized data storage would cause, and makes full use of the storage resources on each host.
  • the apparatus may further include:
  • the determining unit 305 is configured to determine whether the storage time of the data in the first database or the second database exceeds a preset time threshold.
  • The deleting unit 306 is configured to delete the data whose storage time exceeds the preset time threshold when the result of the determining unit 305 is yes.
  • The apparatus may further include:
  • a setting module configured to set one day as the time attribute of the data tables in the first database and the second database, the time attribute indicating that the performance data of each day is stored in its own table in the first database or the second database.
  • the present application further provides an embodiment of a performance detecting device of a cloud computing system.
  • the performance detecting device may include:
  • The receiving requesting unit 401 is configured to receive a user's performance detection request concerning the cloud computing system, where the request includes: the performance detection target host, the name of the target process running on the target host, the DSO file involved in the target process, and a time range.
  • The sending unit 402 is configured to send the performance detection request to the performance detection target host.
  • The receiving data unit 403 is configured to receive the target data returned by the performance detection target host. The target data is the data that the performance detection target host retrieves from the preset first database according to the name of the target process, the DSO file involved in the target process, and the time range, and it includes: the sampling period, the function call chain, and the SHA1 code of the DSO file corresponding to each function in the function call chain.
  • In the first database, the collection time is saved as the key, and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 code of the DSO file corresponding to each function are saved as the value.
  • The matching unit 404 is configured to match the corresponding function address table from the preset second database according to the SHA1 code in the target data.
  • In the second database, the SHA1 code of the DSO file corresponding to each function is saved as the key, and the storage location of the function address table in the DSO file is saved as the value.
  • the second parsing unit 405 is configured to parse the function call chain address in the target data by using the function address table to obtain each function name called in the current process.
  • the calculating unit 406 is configured to calculate, by using the sampling period, an execution time ratio of each function corresponding to each function name.
  • the apparatus may further include:
  • The generating unit 407 is configured to generate a performance map of the cloud computing system from the execution time ratios of the functions and the function call chains.
  • a performance detection system of a cloud computing system may specifically include: a performance data collection and storage device 501 of a cloud computing system, and a performance detection device 502 of the cloud computing system.
  • The device 501 for collecting and storing performance data of the cloud computing system may include: a collecting unit 301 configured to collect performance data of the local machine according to a preset sampling period, where the performance data includes: register values of the local CPU, the identifier of the running process, the name of the process, and the user stack of the process.
  • The first parsing unit 302 is configured to parse the user stack of the process by using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain.
  • The first saving unit 303 is configured to save, in the first database, the collection time as the key and the sampling period, the PID of the current process, the process name, the function call chain, and the SHA1 code of the DSO file corresponding to each function as the value.
  • The second saving unit 304 is configured to save, in the second database, the SHA1 code of the DSO file corresponding to each function as the key and the storage location on disk of the function address table in the DSO file as the value, where the function address table stores each function name together with the start and end addresses of the function.
  • The performance detecting apparatus 502 of the cloud computing system may include: a receiving requesting unit 401 configured to receive a user's performance detection request concerning the cloud computing system, where the request includes: the performance detection target host, the name of the target process running on the target host, the DSO file involved in the target process, and a time range.
  • The sending unit 402 is configured to send the performance detection request to the performance detection target host.
  • The receiving data unit 403 is configured to receive the target data returned by the performance detection target host. The target data is the data that the performance detection target host retrieves from the first database according to the name of the target process, the DSO file involved in the target process, and the time range, and it includes: the sampling period, the function call chain, and the SHA1 code of the DSO file corresponding to each function in the function call chain.
  • the matching unit 404 is configured to match the corresponding function address table from the second database according to the SHA1 encoding in the target data.
  • the second parsing unit 405 is configured to parse the function call chain address in the target data by using the function address table to obtain each function name called in the current process.
  • the calculating unit 406 is configured to calculate, by using the sampling period, an execution time ratio of each function corresponding to each function name.

Abstract

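The present application provides a method and device for collecting and storing performance data of a cloud computing system. The method is applied to each host in the cloud computing system and includes: collecting performance data of the local machine according to a preset sampling period; parsing the user stack of a process by using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function; and saving the collection time as the key and the performance data as the value in a database. The present application also provides a performance detection method, device and system for a cloud computing system, the method including: in response to a user's performance detection request concerning the cloud computing system, obtaining the corresponding target data and performing performance detection on the cloud computing system according to the target data. The present application realizes distributed data storage, avoids unnecessary network overhead, and can aggregate and analyze performance data from multiple machines, thereby realizing performance analysis of the cloud computing system.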

Description

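Method and Device for Collecting and Storing Performance Data of a Cloud Computing System
Technical Field
The present application relates to the field of cloud computing, and in particular to a method and device for collecting and storing performance data of a cloud computing system, and to a method, device and system for detecting the performance of a cloud computing system.
Background
A cloud computing system includes massive basic hardware resources such as servers, storage and networks, as well as massive basic software resources such as single-machine operating systems, middleware and databases. The hardware of a cloud computing system can be regarded as a computer cluster that may involve thousands of machines. For users of a cloud computing system, the fee paid is usually proportional to the hardware resources occupied by their applications: the more hardware resources are used, the higher the cost. Providing performance analysis tools for cloud computing systems therefore helps developers and users optimize the performance of system software and application software and reduce their resource overhead, which has the practical benefit of saving cost.
Most applications running on a cloud computing system are distributed applications, and the modules that such an application deploys on different machines may be homogeneous or heterogeneous. A performance analysis tool for cloud computing systems must therefore support performance analysis of distributed software.
Furthermore, because cloud computing systems are very large, the system software and application software running on them experience many performance anomalies every day, especially before the software has stabilized. To locate the cause of a performance anomaly quickly, it is necessary to retain the software's performance data within the cloud computing system.
In the prior art, because of the above characteristics of cloud computing systems, there is as yet no effective scheme that can accurately collect, store and analyze the performance data of distributed applications.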
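Summary
The technical problem to be solved by the present application is to provide a method for collecting and storing performance data of a cloud computing system, so as to address the lack, in the prior art, of a scheme that can accurately collect and store such performance data; further, the performance of the cloud computing system can be analyzed based on the collected and stored performance data.
The present application also provides a device for collecting and storing performance data of a cloud computing system, as well as a performance detection method, device and system for a cloud computing system, so as to ensure the implementation and application of the above method in practice.
To solve the above problem, the present application discloses a method for collecting and storing performance data of a cloud computing system, applied on each host in the cloud computing system, including:
collecting the performance data of the local host according to a preset sampling period, the performance data including: the register values of the local CPU, the process identifier (PID) of the running process, the name of the process, and the user stack of the process;
parsing the user stack of the process using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain;
saving, in a first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function as the value; and saving, in a second database, the SHA1 encoding of the DSO file corresponding to each function as the key and the on-disk storage location of the function address table of the DSO file as the value, the function address table storing each function name together with the start and end addresses of the function.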
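The present application discloses a device for collecting and storing performance data of a cloud computing system, including:
a collecting unit, configured to collect the performance data of the local host according to a preset sampling period, the performance data including: the register values of the local CPU, the process identifier (PID) of the running process, the name of the process, and the user stack of the process;
a first parsing unit, configured to parse the user stack of the process using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain;
a first saving unit, configured to save, in a first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function as the value;
a second saving unit, configured to save, in a second database, the SHA1 encoding of the DSO file corresponding to each function as the key and the on-disk storage location of the function address table of the DSO file as the value, the function address table storing each function name together with the start and end addresses of the function.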
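The present application discloses a performance detection method for a cloud computing system, including:
receiving a user's performance detection request concerning the cloud computing system, the performance detection request including: a performance detection target host, the name of a target process running on the target host, the DSO files involved in the target process, and a time range;
sending the performance detection request to the performance detection target host, and receiving target data returned by the performance detection target host; the target data being the data that the performance detection target host retrieves from a preset first database according to the name of the target process, the DSO files involved in the target process, and the time range, and including: the sampling period, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain; wherein, in the first database, the collection time is stored as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function are stored as the value;
matching the corresponding function address table from a preset second database according to the SHA1 encoding in the target data; wherein, in the second database, the SHA1 encoding of the DSO file corresponding to each function is stored as the key and the on-disk storage location of the function address table of the DSO file is stored as the value;
parsing the function call chain addresses in the target data using the function address table to obtain the name of each function called in the current process;
calculating, using the sampling period, the execution time ratio of each function corresponding to each function name.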
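The present application discloses a performance detection device for a cloud computing system, including:
a receiving requesting unit, configured to receive a user's performance detection request concerning the cloud computing system, the performance detection request including: a performance detection target host, the name of a target process running on the target host, the DSO files involved in the target process, and a time range;
a sending unit, configured to send the performance detection request to the performance detection target host;
a receiving data unit, configured to receive target data returned by the performance detection target host; the target data being the data that the performance detection target host retrieves from a preset first database according to the name of the target process, the DSO files involved in the target process, and the time range, and including: the sampling period, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain; wherein, in the first database, the collection time is stored as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function are stored as the value;
a matching unit, configured to match the corresponding function address table from a preset second database according to the SHA1 encoding in the target data; wherein, in the second database, the SHA1 encoding of the DSO file corresponding to each function is stored as the key and the on-disk storage location of the function address table of the DSO file is stored as the value;
a second parsing unit, configured to parse the function call chain addresses in the target data using the function address table to obtain the name of each function called in the current process;
a calculating unit, configured to calculate, using the sampling period, the execution time ratio of each function corresponding to each function name.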
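Compared with the prior art, the present application has the following advantages:
In the present application, the performance data of each host in the cloud computing system can be collected and stored separately and saved in two key-value databases, so that each host keeps its own performance data and the corresponding value can be retrieved conveniently and efficiently by its key. This achieves simple distributed data storage, avoids the unnecessary network overhead of centralized data storage, and makes full use of the storage resources on each host. In addition, aggregating and analyzing the performance data from multiple machines makes it possible to analyze the performance of the cloud computing system from the performance data in the databases. Further, the performance analysis results can be used to optimize the performance of each piece of software on the hosts in a targeted manner.
In addition, applying the life-cycle management described in the embodiments to the first database and the second database allows more valuable data to be stored more effectively, while also saving storage space on each host in the cloud computing system. Moreover, embodiments of the present application can aggregate and analyze performance data from multiple hosts, enabling performance analysis of large-scale computer clusters and distributed applications.
Of course, a product implementing the present application does not necessarily need to achieve all of the above advantages at the same time.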
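Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flow chart of an embodiment of a method for collecting and storing performance data of a cloud computing system according to the present application;
Figure 2 is a structural block diagram of an embodiment of a device for collecting and storing performance data of a cloud computing system according to the present application;
Figure 3 is a flow chart of an embodiment of a performance detection method for a cloud computing system according to the present application;
Figure 4 is a structural block diagram of an embodiment of a performance detection device for a cloud computing system according to the present application;
Figure 5 is a structural block diagram of an embodiment of a performance detection system for a cloud computing system according to the present application.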
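Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
The present application can be used in numerous general-purpose or special-purpose computing device environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor devices, and distributed computing environments that include any of the above devices.
The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.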
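Referring to Figure 1, which shows a flow chart of an embodiment of a method for collecting and storing performance data of a cloud computing system according to the present application, this embodiment can be applied on each host in the cloud computing system and may include the following steps:
Step 101: Collect the performance data of the local host according to a preset sampling period, the performance data including: the register values of the local CPU, the process identifier of the running process, the name of the process, and the user stack of the process.
In embodiments of the present application, the cloud computing system may contain thousands of hosts, and each host can collect and store performance data according to the method of this embodiment. The sampling period can be set in advance by a person skilled in the art, for example 20 ms (that is, a sampling frequency of 50 Hz). When a sampling time arrives, each host collects the register values of its local CPU, for example the values of the instruction pointer register IP, the stack pointer register SP, and the base pointer register BP. In addition, each host also needs to collect the process identifier (PID) of the process running on it at the collection time, the name of that process, and the user stack of that process. These data can be obtained through interfaces provided by the operating system.
Of course, other data may also be collected along with the performance data. One example is the privilege level of the CPU, which indicates whether the current sample belongs to the operating system or to user level. Another is the sampling interval: the sampling period is a fixed value set in advance, but in practice the time elapsed between two actual collections may not exactly equal the sampling period and some deviation may exist, so the actual sampling interval can also be obtained from a high-precision system clock.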
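Step 102: Parse the user stack of the current process using the register values of the local CPU to obtain the function call chain addresses of the process at the collection time and the DSO file corresponding to each function in the function call chain.
After the register values of the local CPU have been collected, they can be used to parse the user stack of the current process, yielding the function call chain of the current process at the collection time and the DSO (Dynamic Shared Object) file corresponding to each function in the call chain. The function call chain consists of the function that the CPU is currently executing together with the set of functions that were called, level by level, to reach it. The function call chain addresses are the addresses of the functions in the call chain. Because the current process may correspond to multiple DSO files, this step needs to determine which DSO file each function in the call chain belongs to.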
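The stack parsing in step 102 can be illustrated with a minimal sketch. It assumes a conventional frame-pointer layout with 8-byte words (the word at BP holds the caller's saved BP and the word at BP + 8 holds the return address), which the application does not specify; the names `walk_user_stack`, `dso_for_address`, the dictionary-based stack snapshot, and the mapping tuples are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

WORD = 8  # assumed word size in bytes (64-bit host)


def walk_user_stack(ip: int, bp: int, stack: Dict[int, int],
                    max_depth: int = 64) -> List[int]:
    """Return the call chain addresses, innermost frame first.

    `stack` is a snapshot of the user stack mapping address -> stored word.
    """
    chain = [ip]  # the function currently executing
    while bp in stack and bp + WORD in stack and len(chain) < max_depth:
        chain.append(stack[bp + WORD])  # return address inside the caller
        bp = stack[bp]                  # follow the saved frame pointer
    return chain


def dso_for_address(addr: int,
                    mappings: List[Tuple[int, int, str]]) -> Optional[str]:
    """Find which DSO an address belongs to.

    `mappings` holds (start, end, path) ranges with `end` exclusive.
    """
    for start, end, path in mappings:
        if start <= addr < end:
            return path
    return None
```

Each address returned by `walk_user_stack` can then be passed to `dso_for_address` to decide which DSO file the corresponding function belongs to.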
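Step 103: Save, in a first database, the collection time as the key and the sampling period, the PID of the current process, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function as the value.
After the performance data has been collected, the collection time is first used as the key, and the sampling period, the PID of the current process, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function are used as the value; the pair is saved in a first key-value database.
Step 104: Save, in a second database, the SHA1 encoding of the DSO file corresponding to each function as the key and the on-disk storage location of the function address table of the DSO file as the value, the function address table storing each function name together with the start and end addresses of the function.
In addition, the SHA1 encoding of the DSO file corresponding to each function is used as the key and the on-disk storage location of the function address table of the DSO file is used as the value, and the pair is saved in a second key-value database. The DSO file contains a function address table, which stores each function name along with the start address and end address of the corresponding function. Because one DSO file may contain multiple functions, one SHA1 encoding may also correspond to multiple functions.
It can be understood that the order of step 103 and step 104 may also be reversed.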
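The two key-value layouts of steps 103 and 104 can be sketched as follows. Plain dictionaries stand in for the two databases, and the record field names, `sha1_of_bytes`, and `store_sample` are illustrative assumptions; only `hashlib.sha1` is a real library call.

```python
import hashlib
from typing import Dict, List, Tuple


def sha1_of_bytes(data: bytes) -> str:
    """SHA1 encoding of a DSO file's contents, as a hex string."""
    return hashlib.sha1(data).hexdigest()


def store_sample(first_db: Dict, second_db: Dict, collected_at: float,
                 period_ms: int, pid: int, name: str,
                 chain: List[Tuple[int, str]],
                 table_locations: Dict[str, str]) -> None:
    """Save one sample.

    `chain` holds (address, dso_sha1) pairs, innermost frame first.
    `table_locations` maps a DSO SHA1 to the on-disk location of that
    DSO's function address table.
    """
    # First database: collection time -> sample record.
    first_db[collected_at] = {
        "period_ms": period_ms,
        "pid": pid,
        "name": name,
        "chain": [addr for addr, _ in chain],
        "dso_sha1": [sha for _, sha in chain],
    }
    # Second database: DSO SHA1 -> location of the function address table.
    for _, sha in chain:
        second_db.setdefault(sha, table_locations[sha])
```

Keying the second database by the SHA1 of the DSO contents means the same library version is recorded once, no matter how many samples refer to it.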
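With the above method, the performance data of each host in the cloud computing system is collected and stored separately and saved in two key-value databases, so that each host keeps its own performance data and the corresponding value can be retrieved conveniently and efficiently by its key. This achieves simple distributed data storage, avoids the unnecessary network overhead of centralized data storage, and makes full use of the storage resources on each host. It also makes it possible to analyze the performance of the cloud computing system from the performance data in the databases.
In other embodiments, the following may also be included after step 104:
Step 105: Determine whether the storage time of data in the first database or the second database exceeds a preset time threshold; if so, proceed to step 106.
It can be understood that, because the storage space of each host is limited and the volume of performance data is large, the first database or the second database may retain only N days of data, where N is a natural number whose value can be set by a person skilled in the art according to the actual storage space of each host or the technical requirements. N is then the preset time threshold, for example 7 days. The host can determine whether the storage time of data in the first database or the second database has exceeded the preset time threshold; data that has exceeded it is no longer retained, and data that has not is left untouched.
Step 106: Delete the data whose storage time exceeds the preset time threshold.
The host can delete the data it stores that exceeds the preset time threshold, which further saves the host's storage space.
It can be understood that, in practical applications, one calendar day can also be set as the time attribute of the data tables in the first database and the second database; the time attribute means that each day's performance data is stored separately in its own table in the first database or the second database.
Applying this life-cycle management to the first database and the second database allows more valuable data to be stored more effectively, while also saving storage space on each host in the cloud computing system.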
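Steps 105 and 106, combined with per-day tables, can be sketched as below. The sketch assumes each database is represented as a dictionary mapping a calendar day to that day's table; `expire_old_tables` and `keep_days` are hypothetical names.

```python
from datetime import date, timedelta
from typing import Dict, List


def expire_old_tables(tables: Dict[date, dict], today: date,
                      keep_days: int) -> List[date]:
    """Delete per-day tables stored for longer than `keep_days` days.

    Returns the list of days whose tables were removed. A table that is
    exactly `keep_days` old has not exceeded the threshold and is kept.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = sorted(day for day in tables if day < cutoff)
    for day in expired:
        del tables[day]
    return expired
```

Storing one table per day makes the deletion in step 106 a whole-table drop rather than a scan over individual records.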
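Referring to Figure 3, which shows a flow chart of an embodiment of a performance detection method for a cloud computing system according to the present application, this embodiment can be applied on several hosts in the cloud computing system that are dedicated to performance detection; the hosts that perform performance detection may be different from the hosts that collect and store performance data. This embodiment may include the following steps:
Step 201: Receive a user's performance detection request concerning the cloud computing system, the performance detection request including: a performance detection target host, the name of a target process running on the target host, the DSO files involved in the target process, and a time range.
When a user triggers a performance detection request concerning the cloud computing system, a host dedicated to performance detection can respond to the request and obtain the items it specifies: the performance detection target host, the name of the target process running on the target host, the DSO files involved in the target process, and the time range. The performance detection target host indicates which host in the cloud computing system the user wants to examine. The name of the target process running on the target host identifies the process to be examined on that host. As for the DSO files involved in the target process, because one process may involve multiple DSO files, the user can specify one or more of them in the request. The time range limits which collection times' performance data is examined.
Step 202: Send the performance detection request to the performance detection target host, and receive target data returned by the performance detection target host; the target data is the data that the performance detection target host retrieves from a preset first database according to the name of the target process, the DSO files involved in the target process, and the time range, and includes: the sampling period, the function call chain, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain.
In this embodiment, after the host dedicated to detection receives the performance detection request, it forwards the request to the performance detection target host named in the request. After receiving the request, the performance detection target host filters target data out of the first database according to the name of the target process, the DSO files involved in the target process, and the time range; the target data includes: the sampling period, the function call chain, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain. The first database is built in advance using the method shown in Figure 1. Specifically, the performance detection target host can first use the time range as the key to retrieve from the first database all values within that range, and then use the name of the target process and the DSO files involved in the target process to filter out the final target data.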
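The two-stage retrieval on the target host (select by time range on the key, then filter by process name and DSO) can be sketched as follows. The record layout matches no documented format; the field names, `query_first_db`, and the choice to identify requested DSO files by their SHA1 encodings are assumptions made for illustration.

```python
from typing import Dict, List, Set


def query_first_db(first_db: Dict[float, dict], start: float, end: float,
                   process_name: str, dso_sha1s: Set[str]) -> List[dict]:
    """Return target data for samples in [start, end].

    An empty `dso_sha1s` set means no filtering by DSO file.
    """
    results = []
    for collected_at in sorted(first_db):
        # Stage 1: the collection time is the key, so select by time range.
        if not (start <= collected_at <= end):
            continue
        record = first_db[collected_at]
        # Stage 2: filter the retrieved values by process name and DSO.
        if record["name"] != process_name:
            continue
        if dso_sha1s and not (set(record["dso_sha1"]) & dso_sha1s):
            continue
        results.append({
            "period_ms": record["period_ms"],
            "chain": record["chain"],
            "dso_sha1": record["dso_sha1"],
        })
    return results
```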
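Step 203: Match the corresponding function address table from a preset second database according to the SHA1 encoding in the target data.
After the SHA1 encoding of the DSO file corresponding to each function has been obtained in step 202, that SHA1 encoding is used as the key to look up, in the second database, the on-disk storage location of the matching function address table, and the function address table is then read from that location. The second database is built in advance using the method shown in Figure 1.
Step 204: Parse the function call chain addresses in the target data using the function address table to obtain the name of each function called in the current process.
Because the function address table stores each function name with the start and end addresses of the corresponding function, while the function call chain addresses are the instruction addresses of the individual functions, it suffices to check, one by one, whether each instruction address falls between the start address and end address of some function. For example, if one entry in the function address table gives the address range 0x00000001 to 0x00000005 for function A, and the instruction address of a function in the call chain is 0x00000002, then that function can be identified as function A.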
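The address comparison in step 204 can be sketched with a sorted table and binary search instead of a one-by-one scan. This is one possible implementation, not the one described in the application; `build_index` and `resolve` are hypothetical names, and the end address is treated as inclusive, matching the example above.

```python
import bisect
from typing import List, Optional, Tuple

Entry = Tuple[str, int, int]  # (function name, start address, end address)


def build_index(table: List[Entry]) -> Tuple[List[int], List[Entry]]:
    """Sort the function address table by start address for binary search."""
    entries = sorted(table, key=lambda e: e[1])
    starts = [e[1] for e in entries]
    return starts, entries


def resolve(addr: int, index: Tuple[List[int], List[Entry]]) -> Optional[str]:
    """Return the name of the function whose [start, end] range holds addr."""
    starts, entries = index
    i = bisect.bisect_right(starts, addr) - 1  # last entry with start <= addr
    if i >= 0:
        name, start, end = entries[i]
        if start <= addr <= end:
            return name
    return None
```

With the table from the example, `resolve(0x00000002, build_index([("A", 0x00000001, 0x00000005)]))` returns `"A"`.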
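It can be understood that, in practical applications, the work of this step can be divided evenly among different hosts; once every host has finished its address comparisons, the results from the hosts are aggregated to obtain the overall result.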
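The aggregation of per-host results can be sketched as a merge of per-function sample counts. The assumption that each host reports a mapping from function name to sample count is illustrative, as is the name `merge_host_results`.

```python
from collections import Counter
from typing import Iterable, Mapping


def merge_host_results(per_host_counts: Iterable[Mapping[str, int]]) -> Counter:
    """Sum the per-function sample counts reported by each host."""
    total: Counter = Counter()
    for counts in per_host_counts:
        total.update(counts)  # Counter.update adds counts from a mapping
    return total
```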
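Step 205: Calculate, using the sampling period, the execution time ratio of each function corresponding to each function name.
Finally, the sampling period can be used to calculate the execution time ratio of each function corresponding to each function name. The same instruction address may appear repeatedly among the function call chain addresses, meaning that a function was called multiple times. The total execution time of such a function is obtained by accumulating the time of each occurrence, that is, by multiplying the number of times it appears by the sampling period. Dividing this total execution time by the total sampled time (the sampling period multiplied by the number of samples in the time range) gives the function's execution time ratio, a value between 0 and 1.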
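The calculation in step 205 can be sketched as follows, under the assumption that the ratio is taken against the total sampled time so that it falls between 0 and 1. The input is assumed to be one resolved function name per sample; `execution_time_ratios` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, List


def execution_time_ratios(sampled_functions: List[str],
                          period_ms: float) -> Dict[str, float]:
    """Estimate each function's share of the sampled time.

    Each appearance of a function in a sample is charged one sampling period.
    """
    counts = Counter(sampled_functions)
    total_ms = len(sampled_functions) * period_ms
    if total_ms == 0:
        return {}
    return {name: (n * period_ms) / total_ms for name, n in counts.items()}
```

For four samples `["A", "A", "B", "A"]` at a 20 ms period, function A is charged 60 ms of the 80 ms sampled, giving a ratio of 0.75, and function B gets 0.25.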
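In other embodiments, the following may also be included after step 205:
Step 206: Generate a performance map of the cloud computing system from the execution time ratio of each function and the function call chain.
From the execution time ratio of each function obtained in step 205 and the function call chain, which represents the calling relationships among the functions, a performance map of the cloud computing system can be generated. The performance map reflects how hot each function is in terms of execution, as well as the calling relationships among functions. It can also be returned to the user who initiated the performance detection request, so that the user can use it to quickly locate hot spots in the software and optimize performance in a targeted manner.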
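One way to represent such a performance map is as a pair of structures: per-function heat and caller-to-callee edge counts. The application does not define the map's format, so this representation, the name `build_performance_map`, and the choice to charge each sample to its innermost function are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_performance_map(
    chains: List[List[str]],
) -> Tuple[Dict[str, float], Dict[Tuple[str, str], int]]:
    """Build (heat, edges) from resolved call chains, innermost function first.

    heat:  function name -> share of samples in which it was executing
    edges: (caller, callee) -> number of samples showing that call
    """
    self_samples: Counter = Counter()
    edges: Counter = Counter()
    for chain in chains:
        if not chain:
            continue
        self_samples[chain[0]] += 1  # the function on top of the stack
        for callee, caller in zip(chain, chain[1:]):
            edges[(caller, callee)] += 1
    total = sum(self_samples.values())
    heat = {name: n / total for name, n in self_samples.items()} if total else {}
    return heat, dict(edges)
```

The heat values identify hot functions, and the edge counts show through which call paths they were reached.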
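In this embodiment, using the data stored in advance in the first database and the second database, the performance of each host in the cloud computing system can be examined conveniently and the hot spots of the software running on each host can be identified, thereby achieving performance analysis of each host in the cloud computing system. Further, the performance analysis results can be used to optimize each piece of software on the hosts in a targeted manner.
For simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions. A person skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps can be performed in other orders or simultaneously. A person skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.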
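Corresponding to the method provided by the above embodiment of a method for collecting and storing performance data of a cloud computing system, and referring to Figure 2, the present application also provides an embodiment of a device for collecting and storing performance data of a cloud computing system. In this embodiment, the device may include:
a collecting unit 301, configured to collect the performance data of the local host according to a preset sampling period, the performance data including: the register values of the local CPU, the process identifier of the running process, the name of the process, and the user stack of the process.
a first parsing unit 302, configured to parse the user stack of the process using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain.
a first saving unit 303, configured to save, in a first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain, and the SHA1 encoding of the DSO file corresponding to each function as the value.
a second saving unit 304, configured to save, in a second database, the SHA1 encoding of the DSO file corresponding to each function as the key and the on-disk storage location of the function address table of the DSO file as the value, the function address table storing each function name together with the start and end addresses of the function.
The device of this embodiment collects and stores the performance data of each host in the cloud computing system separately and saves it in two key-value databases, so that each host keeps its own performance data and the corresponding value can be retrieved conveniently and efficiently by its key. This achieves simple distributed data storage, avoids the unnecessary network overhead of centralized data storage, and makes full use of the storage resources on each host.
In other embodiments, the device may also include:
a judging unit 305, configured to determine whether the storage time of data in the first database or the second database exceeds a preset time threshold.
a deleting unit 306, configured to delete the data whose storage time exceeds the preset time threshold when the result of the judging unit is yes.
In other embodiments, the device may also include:
a setting module, configured to set one day as the time attribute of the data tables in the first database and the second database, the time attribute meaning that each day's performance data is stored separately in its own table in the first database or the second database.
In addition, applying this life-cycle management to the first database and the second database allows the most valuable data to be stored more effectively, while also saving storage space on each host in the cloud computing system.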
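The present application also provides an embodiment of a performance detection device for a cloud computing system. In this embodiment, referring to Figure 4, the performance detection device may include:
a receiving requesting unit 401, configured to receive a user's performance detection request concerning the cloud computing system, the performance detection request including: a performance detection target host, the name of a target process running on the target host, the DSO files involved in the target process, and a time range.
a sending unit 402, configured to send the performance detection request to the performance detection target host.
a receiving data unit 403, configured to receive target data returned by the performance detection target host; the target data is the data that the performance detection target host retrieves from a preset first database according to the name of the target process, the DSO files involved in the target process, and the time range, and includes: the sampling period, the function call chain, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain. In the first database, the collection time is stored as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function are stored as the value.
a matching unit 404, configured to match the corresponding function address table from a preset second database according to the SHA1 encoding in the target data. In the second database, the SHA1 encoding of the DSO file corresponding to each function is stored as the key and the on-disk storage location of the function address table of the DSO file is stored as the value.
a second parsing unit 405, configured to parse the function call chain addresses in the target data using the function address table to obtain the name of each function called in the current process.
a calculating unit 406, configured to calculate, using the sampling period, the execution time ratio of each function corresponding to each function name.
In other embodiments, the device may also include:
a generating unit 407, configured to generate a performance map of the cloud computing system from the execution time ratio of each function and the function call chain.
In this embodiment, using the data stored in advance in the first database and the second database, the performance of each host in the cloud computing system can be examined conveniently and the hot spots of the software running on each host can be identified, thereby achieving performance analysis of each host in the cloud computing system. Users can optimize each piece of software on the hosts in a targeted manner according to the performance analysis results.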
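Referring to Figure 5, a performance detection system for a cloud computing system may specifically include: a device 501 for collecting and storing performance data of the cloud computing system, and a performance detection device 502 for the cloud computing system.
The device 501 for collecting and storing performance data of the cloud computing system may specifically include: a collecting unit 301, configured to collect the performance data of the local host according to a preset sampling period, the performance data including: the register values of the local CPU, the process identifier of the running process, the name of the process, and the user stack of the process. A first parsing unit 302, configured to parse the user stack of the process using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain. A first saving unit 303, configured to save, in a first database, the collection time as the key and the sampling period, the PID of the current process, the process name, the function call chain, and the SHA1 encoding of the DSO file corresponding to each function as the value. A second saving unit 304, configured to save, in a second database, the SHA1 encoding of the DSO file corresponding to each function as the key and the on-disk storage location of the function address table of the DSO file as the value, the function address table storing each function name together with the start and end addresses of the function.
The performance detection device 502 of the cloud computing system may specifically include: a receiving requesting unit 401, configured to receive a user's performance detection request concerning the cloud computing system, the performance detection request including: a performance detection target host, the name of a target process running on the target host, the DSO files involved in the target process, and a time range. A sending unit 402, configured to send the performance detection request to the performance detection target host. A receiving data unit 403, configured to receive target data returned by the performance detection target host; the target data is the data that the performance detection target host retrieves from the first database according to the name of the target process, the DSO files involved in the target process, and the time range, and includes: the sampling period, the function call chain, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain. A matching unit 404, configured to match the corresponding function address table from the second database according to the SHA1 encoding in the target data. A second parsing unit 405, configured to parse the function call chain addresses in the target data using the function address table to obtain the name of each function called in the current process. A calculating unit 406, configured to calculate, using the sampling period, the execution time ratio of each function corresponding to each function name.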
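It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and for parts that are the same or similar the embodiments can be referred to one another. Because the device embodiments are basically similar to the method embodiments, they are described more briefly; for related details, refer to the corresponding parts of the method embodiments.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article or device that includes the element.
The method and device for collecting and storing performance data of a cloud computing system, and the performance detection method, device and system for a cloud computing system, provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. A person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.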

Claims (10)

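  1. A method for collecting and storing performance data of a cloud computing system, characterized in that it is applied on each host in the cloud computing system, the method including:
    collecting the performance data of the local host according to a preset sampling period, the performance data including: the register values of the local CPU, the process identifier (PID) of the running process, the name of the process, and the user stack of the process;
    parsing the user stack of the process using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain;
    saving, in a first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function as the value; and saving, in a second database, the SHA1 encoding of the DSO file corresponding to each function as the key and the on-disk storage location of the function address table of the DSO file as the value, the function address table storing each function name together with the start and end addresses of the function.
  2. The method according to claim 1, characterized in that it further includes:
    determining whether the storage time of data in the first database or the second database exceeds a preset time threshold, and if so, deleting the data whose storage time exceeds the preset time threshold.
  3. The method according to claim 1, characterized in that it further includes:
    setting one day as the time attribute of the data tables in the first database and the second database, the time attribute meaning that each day's performance data is stored separately in its own table in the first database or the second database.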
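  4. A performance detection method for a cloud computing system, characterized in that the method includes:
    receiving a user's performance detection request concerning the cloud computing system, the performance detection request including: a performance detection target host, the name of a target process running on the target host, the DSO files involved in the target process, and a time range;
    sending the performance detection request to the performance detection target host, and receiving target data returned by the performance detection target host; the target data being the data that the performance detection target host retrieves from a preset first database according to the name of the target process, the DSO files involved in the target process, and the time range, and including: the sampling period, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain; wherein, in the first database, the collection time is stored as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function are stored as the value;
    matching the corresponding function address table from a preset second database according to the SHA1 encoding in the target data; wherein, in the second database, the SHA1 encoding of the DSO file corresponding to each function is stored as the key and the on-disk storage location of the function address table of the DSO file is stored as the value;
    parsing the function call chain addresses in the target data using the function address table to obtain the name of each function called in the current process;
    calculating, using the sampling period, the execution time ratio of each function corresponding to each function name.
  5. The method according to claim 4, characterized in that it further includes:
    generating a performance map of the cloud computing system from the execution time ratio of each function and the function call chain.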
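  6. A device for collecting and storing performance data of a cloud computing system, characterized in that it includes:
    a collecting unit, configured to collect the performance data of the local host according to a preset sampling period, the performance data including: the register values of the local CPU, the process identifier (PID) of the running process, the name of the process, and the user stack of the process;
    a first parsing unit, configured to parse the user stack of the process using the register values of the local CPU to obtain the function call chain of the process at the collection time and the DSO file corresponding to each function in the function call chain;
    a first saving unit, configured to save, in a first database, the collection time as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function as the value;
    a second saving unit, configured to save, in a second database, the SHA1 encoding of the DSO file corresponding to each function as the key and the on-disk storage location of the function address table of the DSO file as the value, the function address table storing each function name together with the start and end addresses of the function.
  7. The device according to claim 6, characterized in that it further includes:
    a judging module, configured to determine whether the storage time of data in the first database or the second database exceeds a preset time threshold;
    a deleting module, configured to delete the data whose storage time exceeds the preset time threshold when the result of the judging module is yes.
  8. The device according to claim 6, characterized in that it further includes:
    a setting module, configured to set one day as the time attribute of the data tables in the first database and the second database, the time attribute meaning that each day's performance data is stored separately in its own table in the first database or the second database.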
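  9. A performance detection device for a cloud computing system, characterized in that the performance detection device includes:
    a receiving requesting unit, configured to receive a user's performance detection request concerning the cloud computing system, the performance detection request including: a performance detection target host, the name of a target process running on the target host, the DSO files involved in the target process, and a time range;
    a sending unit, configured to send the performance detection request to the performance detection target host;
    a receiving data unit, configured to receive target data returned by the performance detection target host; the target data being the data that the performance detection target host retrieves from a preset first database according to the name of the target process, the DSO files involved in the target process, and the time range, and including: the sampling period, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function in the function call chain; wherein, in the first database, the collection time is stored as the key and the sampling period, the PID, the process name, the function call chain addresses, and the SHA1 encoding of the DSO file corresponding to each function are stored as the value;
    a matching unit, configured to match the corresponding function address table from a preset second database according to the SHA1 encoding in the target data; wherein, in the second database, the SHA1 encoding of the DSO file corresponding to each function is stored as the key and the on-disk storage location of the function address table of the DSO file is stored as the value;
    a second parsing unit, configured to parse the function call chain addresses in the target data using the function address table to obtain the name of each function called in the current process;
    a calculating unit, configured to calculate, using the sampling period, the execution time ratio of each function corresponding to each function name.
  10. The device according to claim 9, characterized in that the detection device further includes:
    a generating unit, configured to generate a performance map of the cloud computing system from the execution time ratio of each function and the function call chain.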
PCT/CN2015/079695 2014-06-27 2015-05-25 Method and device for collecting and storing performance data of a cloud computing system WO2015196885A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410301908.6A CN105242873B (zh) 2014-06-27 2014-06-27 Method and device for collecting and storing performance data of a cloud computing system
CN201410301908.6 2014-06-27

Publications (1)

Publication Number Publication Date
WO2015196885A1 true WO2015196885A1 (zh) 2015-12-30

Family

ID=54936727

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/079695 WO2015196885A1 (zh) 2014-06-27 2015-05-25 Method and device for collecting and storing performance data of a cloud computing system

Country Status (3)

Country Link
CN (1) CN105242873B (zh)
HK (1) HK1215735A1 (zh)
WO (1) WO2015196885A1 (zh)

Also Published As

Publication number Publication date
CN105242873A (zh) 2016-01-13
HK1215735A1 (zh) 2016-09-09
CN105242873B (zh) 2018-06-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15812087

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15812087

Country of ref document: EP

Kind code of ref document: A1