CN108959014B - Method and apparatus for monitoring a platform

Info

Publication number: CN108959014B
Application number: CN201710349504.8A
Authority: CN
Inventors: 高婧; 阎华�; 付学良; 王少华; 胡文萍; 孔祥云; 田静
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2017-05-17
Filing date: 2017-05-17
Publication date: 2022-04-12
Anticipated expiration: 2037-05-17
Also published as: CN108959014A

Abstract

Methods and apparatus for monitoring a platform are disclosed. One embodiment of the method comprises: acquiring the number of calls and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted; performing at least one calculation operation based on the number of calls and the availability of each key in each unit acquisition period to generate at least one piece of availability information for the duration to be counted; when a monitoring data query request from a user is received, obtaining the availability information that matches the request from the generated availability information; and displaying the queried availability information. This embodiment displays the availability information of the monitored platform in each dimension within the duration to be counted.

Description

Method and apparatus for monitoring a platform

Technical Field

The present application relates to the field of computer technologies, specifically to the field of computer operation and maintenance, and more particularly to a method and an apparatus for monitoring a platform.

Background

Computer operation and maintenance management refers to the comprehensive management of information technology operating environments (such as hardware and software environments, network environments and the like) and service systems by the information technology department of an organization, using relevant methods, means, technologies, systems, processes, documents and the like. Monitoring the operating state of the system is an important part of operation and maintenance management.

At present, when a system platform is monitored in the prior art, the collected raw monitoring data is simply displayed without any further processing, so that the monitoring data finally presented to operation and maintenance personnel is poorly targeted and cannot meet diversified monitoring requirements.

Disclosure of Invention

The object of the present application is to propose an improved method and apparatus for monitoring a platform to solve the technical problems mentioned in the background section above.

In a first aspect, an embodiment of the present application provides a method for monitoring a platform, where the method includes: acquiring the called times and the availability ratio of each key in the monitored platform in each unit acquisition period within the time length to be counted; executing at least one calculation operation based on the called times and the availability of each key in each unit acquisition period, and generating at least one availability information in the time length to be counted, wherein each availability information is any one of the following: the availability of each key in the monitored platform, the availability of each service in the monitored platform, the availability of each application in the monitored platform, the availability of each center in the monitored platform and the availability of the monitored platform; when a monitoring data query request of a user is received, acquiring availability information matched with the monitoring data query request from the generated availability information; and displaying the inquired availability information.

In some embodiments, the at least one computing operation comprises: for each key in the monitored platform, determining the total calling times of the key in the time length to be counted based on the called times of the key in each unit acquisition period in the time length to be counted and the number of the unit acquisition periods contained in the time length to be counted; determining the single-period availability ratio of the key in each unit acquisition period based on the called times and the availability ratio of the key in each unit acquisition period in the to-be-counted time length and the total calling times of the key in the to-be-counted time length, summing the single-period availability ratios in each unit acquisition period in the to-be-counted time length to obtain the availability ratio of the key in the to-be-counted time length, wherein the single-period availability ratio is positively correlated with the called times and the availability ratios in the unit acquisition periods and inversely correlated with the total calling times of the key in the to-be-counted time length.

In some embodiments, the at least one computing operation further comprises: for each service in the monitored platform, determining the total calling times of the service in the duration to be counted based on the total calling times of each key in the service in the duration to be counted; determining the single-key availability ratio of each key in the time length to be counted based on the total calling times and the availability ratios of the keys in the time length to be counted and the total calling times of the service in the time length to be counted, and summing the single-key availability ratios of the keys in the service to obtain the availability ratio of the service in the time length to be counted, wherein the single-key availability ratio is positively correlated with the called times and the availability ratios of the keys in the time length to be counted and inversely correlated with the total calling times of the service in the time length to be counted.

In some embodiments, the at least one computing operation further comprises: for each application in the monitored platform, determining the total calling times of the application in the duration to be counted based on the total calling times of each service in the application in the duration to be counted; determining a single-service level weight of each service based on the service level and the service level weight of the service and the total service level of all services in the application, wherein the single-service level weight is positively correlated with the service level and the service level weight of the service and inversely correlated with the total service level of all services in the application; determining a single-service calling-number weight of each service based on the total calling times and the calling-number weight of the service in the duration to be counted and the total calling times of the application, wherein the single-service calling-number weight is positively correlated with the total calling times and the calling-number weight of the service in the duration to be counted and inversely correlated with the total calling times of the application; and taking the sum of the single-service level weight and the single-service calling-number weight as the availability weight of each service, and weighting the availabilities of all services in the application to obtain the availability of the application.

In some embodiments, the at least one computing operation further comprises: aiming at each center in the monitored platform, determining the total calling times of the center in the duration to be counted based on the total calling times of each application in the center in the duration to be counted; determining the availability ratio of each single application in the duration to be counted based on the total calling times and the availability ratios of the applications in the duration to be counted and the total calling times of the center in the duration to be counted, summing the availability ratios of the single applications of each application in the center to obtain the availability ratio of the center in the duration to be counted, wherein the availability ratio of the single application is positively correlated with the called times and the availability ratios of the applications in the duration to be counted and inversely correlated with the total calling times of the center in the duration to be counted.

In some embodiments, the at least one computing operation further comprises: determining the total calling times of the monitored platform in the duration to be counted based on the total calling times of each center in the monitored platform in the duration to be counted; determining the single-center availability of each center in the duration to be counted based on the total calling times and the availability of the center in the duration to be counted and the total calling times of the monitored platform in the duration to be counted, and summing the single-center availabilities of the centers of the monitored platform to obtain the availability of the monitored platform in the duration to be counted, wherein the single-center availability is positively correlated with the total calling times and the availability of the center in the duration to be counted and inversely correlated with the total calling times of the monitored platform in the duration to be counted.

In some embodiments, obtaining the called times and the available rate of each key in the monitored platform in each unit acquisition period within the time length to be counted includes: storing the called times and the availability ratio of each key in the monitored platform in each unit acquisition period within the duration to be counted in a distributed publishing and subscribing message system; and executing at least one calculation operation based on the called times and the availability of each key in each unit acquisition period, and generating at least one availability information in the time length to be counted, wherein the calculation operation comprises the following steps: using a distributed real-time computing system to consume the messages in the distributed publish-subscribe message system, and executing at least one computing operation on the consumed messages in parallel to generate at least one availability information; storing the generated at least one availability information to a distributed storage system; and when receiving a monitoring data query request of a user, acquiring availability information matched with the monitoring data query request from the generated availability information, wherein the availability information comprises: when a monitoring data query request of a user is received, obtaining availability information matched with the monitoring data query request from the availability information stored in the distributed storage system.
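The flow described above (publish raw samples, consume them in parallel, store the computed availability) can be illustrated with a minimal in-process sketch. It is only an assumption-laden stand-in: `queue.Queue` plays the role of the distributed publish-subscribe message system, worker threads play the role of the distributed real-time computing system, and a plain dictionary plays the role of the distributed storage system. The function name `run_pipeline` and the sample layout `(key, calls, availability)` are hypothetical.

```python
import queue
import threading
from collections import defaultdict


def run_pipeline(samples, workers=2):
    """Consume (key, calls, availability) samples in parallel and
    return a dict mapping each key to its call-weighted availability."""
    messages = queue.Queue()                 # stand-in for the publish-subscribe system
    partial = defaultdict(lambda: [0, 0.0])  # key -> [total calls, sum of calls * availability]
    lock = threading.Lock()

    def consume():
        while True:
            item = messages.get()
            if item is None:                 # sentinel: no more messages for this worker
                break
            key, calls, availability = item
            with lock:
                partial[key][0] += calls
                partial[key][1] += calls * availability

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for sample in samples:                   # "publish" the raw monitoring data
        messages.put(sample)
    for _ in threads:                        # one sentinel per worker
        messages.put(None)
    for thread in threads:
        thread.join()

    # Stand-in for the distributed storage system: key -> availability.
    return {
        key: (weighted / total if total else None)
        for key, (total, weighted) in partial.items()
    }
```

A real deployment would replace each stand-in with an actual distributed component; the sketch only shows the order of the three stages.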

In some embodiments, the monitoring data query request includes different time dimensions as query parameters; and obtaining, when a monitoring data query request of a user is received, availability information matched with the request from the availability information stored in the distributed storage system includes: when a monitoring data query request is received, assembling a query primary key using the query parameters of the request; and parsing the query primary key to query the availability information of the different time dimensions in the distributed storage system.
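As a rough illustration of assembling and parsing a query primary key, the sketch below joins the query parameters with a separator and uses the parsed time dimension to select a table. The key layout (`level|name|dimension|period_start`), the separator, and all function names are assumptions made for this example; the source does not specify the key format.

```python
SEP = "|"  # hypothetical separator; parameter values must not contain it


def assemble_query_key(level, name, dimension, period_start):
    """Build a primary key from the query parameters."""
    parts = [level, name, dimension, period_start]
    if any(SEP in part for part in parts):
        raise ValueError("query parameters must not contain the separator")
    return SEP.join(parts)


def parse_query_key(primary_key):
    """Split a primary key back into its named parts."""
    level, name, dimension, period_start = primary_key.split(SEP)
    return {
        "level": level,
        "name": name,
        "dimension": dimension,
        "period_start": period_start,
    }


def query(store, params):
    """Look up availability in a dict-based stand-in for the storage system.

    `store` maps a time dimension (for example "5min" or "hour") to a
    table of primary key -> availability.
    """
    primary_key = assemble_query_key(
        params["level"], params["name"], params["dimension"], params["period_start"]
    )
    parsed = parse_query_key(primary_key)
    table = store.get(parsed["dimension"], {})
    return table.get(primary_key)
```

Keeping the time dimension inside the key lets the same lookup path serve queries at different granularities.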

In some embodiments, presenting the queried availability information includes: determining, according to the availability information, whether the availability is lower than an availability threshold; and if the availability is lower than the availability threshold, displaying the availability information in a preset graphic style.
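The threshold check can be sketched in a few lines. The style names `"alert"` and `"normal"` and the default threshold are hypothetical placeholders; the source only says that a preset graphic style is used when the availability falls below the threshold.

```python
def display_style(availability, threshold=0.999):
    """Return the (hypothetical) style name to use for an availability value."""
    return "alert" if availability < threshold else "normal"


def styled_records(records, threshold=0.999):
    """Attach a style to each (label, availability) record before display."""
    return [
        {"label": label, "availability": value, "style": display_style(value, threshold)}
        for label, value in records
    ]
```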

In a second aspect, an embodiment of the present application provides an apparatus for monitoring a platform, where the apparatus includes: the acquisition unit is used for acquiring the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted; the calculation unit is used for executing at least one calculation operation based on the called times and the availability of each key in each unit acquisition cycle, and generating at least one availability information in the time length to be counted, wherein each availability information is any one of the following: the availability of each key in the monitored platform, the availability of each service in the monitored platform, the availability of each application in the monitored platform, the availability of each center in the monitored platform and the availability of the monitored platform; the query unit is used for acquiring the availability rate information matched with the monitoring data query request from the generated availability rate information when the monitoring data query request of the user is received; and the display unit is used for displaying the inquired availability information.

In some embodiments, the acquisition unit is further configured to: store the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted in a distributed publish-subscribe message system; the computing unit is further configured to: use a distributed real-time computing system to consume the messages in the distributed publish-subscribe message system, and perform the at least one computing operation on the consumed messages in parallel to generate the at least one availability information; and store the generated at least one availability information in a distributed storage system; and the query unit is further configured to: when a monitoring data query request of a user is received, obtain availability information matched with the monitoring data query request from the availability information stored in the distributed storage system.

In some embodiments, the monitoring data query request includes different time dimensions as query parameters; and the query unit is further configured to: when a monitoring data query request is received, assemble a query primary key using the query parameters of the monitoring data query request; and parse the query primary key to query the availability information of the different time dimensions in the distributed storage system.

In some embodiments, the display unit is further configured to: determine, according to the availability information, whether the availability is lower than an availability threshold; and if the availability is lower than the availability threshold, display the availability information in a preset graphic style.

In a third aspect, an embodiment of the present application provides an apparatus, including: one or more processors; storage means for storing one or more programs which, when executed by one or more processors, cause the one or more processors to carry out the method as described in any one of the first aspects.

In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is configured to, when executed by a processor, implement the method described in any one of the first aspect.

According to the method and apparatus for monitoring a platform provided by the embodiments of the present application, calculations are performed on the originally acquired called times and availability of each key in the monitored platform in each unit acquisition period within the duration to be counted, and the results are displayed, so that operation and maintenance personnel can quickly understand the overall operating condition of the platform in each dimension within the duration to be counted.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a flow diagram of one embodiment of a method for monitoring a platform according to the present application;

FIG. 3 is a flow diagram of yet another embodiment of a method for monitoring a platform according to the present application;

FIG. 4 is a schematic block diagram of one embodiment of an apparatus for monitoring a platform according to the present application;

FIG. 5 is a block diagram of a computer system suitable for use in implementing the apparatus of an embodiment of the present application.

Detailed Description

The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.

Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the present method for monitoring a platform or apparatus for monitoring a platform may be applied.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The user can use the terminal devices 101, 102 and 103 to interact with the server 105 through the network 104 to send a monitoring data query request, receive monitoring data returned by the server 105 and display the monitoring data. The terminal devices 101, 102, 103 may have installed thereon a client application that presents the monitoring data.

The server 105 may be a server that provides a service of monitoring data, including computational processing and storage of the monitoring data. The server 105 may analyze the received monitoring data query request and feed back a processing result (e.g., monitoring data) to the terminal device.

It should be noted that the method for monitoring a platform provided in the embodiment of the present application is generally performed by the server 105, and some steps may also be performed by the terminal devices 101, 102, and 103; accordingly, the apparatus for monitoring the platform is typically provided in the server 105, and some units may also be provided in the terminal devices 101, 102, 103.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to FIG. 2, a flow 200 of one embodiment of a method for monitoring a platform according to the present application is shown. The method for monitoring the platform comprises the following steps:

Step 201, obtaining the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted.

In this embodiment, an electronic device (for example, the server shown in fig. 1) on which the method for monitoring a platform runs may obtain, through a wired or wireless connection, raw data from a Unified Monitoring Platform (UMP), such as the number of calls and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted. In general, the UMP collects data according to its own unit acquisition period, which may be, for example, 1 minute or 5 minutes. When the duration to be counted is 5 minutes and the unit acquisition period of the UMP is 1 minute, the data collected in the last 5 unit acquisition periods of the UMP needs to be gathered to obtain the number of calls and the availability of the key in each unit acquisition period within the duration to be counted. When the duration to be counted is 5 minutes and the unit acquisition period of the UMP is also 5 minutes, only the data collected by the UMP in the current unit acquisition period needs to be obtained.
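The relationship between the duration to be counted and the unit acquisition period can be sketched as follows. The function names and the requirement that the duration be an exact multiple of the unit period are assumptions made for this example.

```python
def periods_to_collect(duration_min, unit_period_min):
    """Number of unit acquisition periods covered by the duration to be counted."""
    if unit_period_min <= 0 or duration_min % unit_period_min != 0:
        raise ValueError("duration must be a positive multiple of the unit period")
    return duration_min // unit_period_min


def latest_periods(samples, duration_min, unit_period_min):
    """Select the most recent samples covering the duration to be counted.

    `samples` is a list ordered from oldest to newest, one entry per
    unit acquisition period.
    """
    count = periods_to_collect(duration_min, unit_period_min)
    return samples[-count:]
```

With a 5-minute duration, a 1-minute unit period selects the last five samples, while a 5-minute unit period selects only the current one.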

Generally, one monitored platform may include one or more centers, each center may include one or more applications, each application may include one or more services, and each service may correspond to one or more keys. In general, the UMP may collect data such as the number of calls and the availability of an interface (corresponding to a key) by instrumenting the system interface with monitoring points.
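The platform, center, application, service and key hierarchy can be modelled with a few nested data classes. This is only an illustrative sketch; the class and field names are assumptions and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    name: str
    keys: List[str] = field(default_factory=list)  # one key per monitored interface


@dataclass
class Application:
    name: str
    services: List[Service] = field(default_factory=list)


@dataclass
class Center:
    name: str
    applications: List[Application] = field(default_factory=list)


@dataclass
class Platform:
    name: str
    centers: List[Center] = field(default_factory=list)


def all_keys(platform):
    """Flatten the hierarchy into the list of keys whose data must be fetched."""
    return [
        key
        for center in platform.centers
        for application in center.applications
        for service in application.services
        for key in service.keys
    ]
```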

In practice, when obtaining the data such as the called times and the availability of each key in each unit acquisition period within the time length to be counted, all the key values can be obtained from the database, and then corresponding data can be obtained from the UMP according to the key values.

Step 202, executing at least one calculation operation based on the called times and the availability of each key in each unit acquisition cycle, and generating at least one availability information in the time length to be counted.

In this embodiment, after obtaining the called number and the availability of each key in each unit collection period based on the step 201, the electronic device (for example, the server shown in fig. 1) may perform at least one calculation operation on the data to calculate at least one availability information in the time length to be counted. Wherein each of the available rate information is any one of: the availability of each key in the monitored platform, the availability of each service in the monitored platform, the availability of each application in the monitored platform, the availability of each center in the monitored platform, and the availability of the monitored platform.

In some optional implementations of the embodiment, the at least one computing operation includes: for each key in the monitored platform, determining the total calling times of the key in the time length to be counted based on the called times of the key in each unit acquisition period in the time length to be counted and the number of the unit acquisition periods contained in the time length to be counted; determining the single-period availability ratio of the key in each unit acquisition period based on the called times and the availability ratio of the key in each unit acquisition period in the to-be-counted time length and the total calling times of the key in the to-be-counted time length, summing the single-period availability ratios in each unit acquisition period in the to-be-counted time length to obtain the availability ratio of the key in the to-be-counted time length, wherein the single-period availability ratio is positively correlated with the called times and the availability ratios in the unit acquisition periods and inversely correlated with the total calling times of the key in the to-be-counted time length.

The number of calls to the i-th Key in the statistical time can be calculated by formula (1), where T is the number of UMP unit acquisition periods contained in the statistical time; for a 5-minute statistical time, T is 5 when the unit acquisition period is 1 minute and 1 when it is 5 minutes:

$$N^{\mathrm{key}}_i = \sum_{t=1}^{T} n_{i,t} \tag{1}$$

where $N^{\mathrm{key}}_i$ represents the number of calls to the i-th Key in the statistical time, and $n_{i,t}$ represents the number of calls to the i-th Key of the same service in the t-th UMP unit acquisition period. It should be noted that the same symbols in different formulas represent the same data.

The availability of the i-th Key in the statistical time can be calculated by formula (2):

$$A^{\mathrm{key}}_i = \sum_{t=1}^{T} \frac{n_{i,t}\, a_{i,t}}{N^{\mathrm{key}}_i} \tag{2}$$

where $a_{i,t}$ represents the availability of the i-th Key of the same service in the t-th UMP unit acquisition period, and $A^{\mathrm{key}}_i$ represents the availability of the i-th Key in the statistical time.

In this implementation, the availability index value of each Key can be calculated from the number of calls and the availability data of that Key in each unit acquisition period acquired from the UMP, so that system stability can be evaluated and measured quantitatively.
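The key-level calculation amounts to a call-weighted average of the per-period availability. A minimal sketch follows, assuming each sample is a `(calls, availability)` pair for one unit acquisition period; the function name and the choice to return `None` when there were no calls are assumptions of this example.

```python
def key_availability(samples):
    """Return (total calls, availability) of one key over the statistical time.

    `samples` holds one (calls, availability) pair per unit acquisition period.
    """
    total_calls = sum(calls for calls, _ in samples)
    if total_calls == 0:
        return 0, None  # no calls: availability is undefined in this sketch
    # Each term is a single-period availability: it grows with the period's
    # calls and availability and shrinks with the key's total calls.
    availability = sum(calls * avail / total_calls for calls, avail in samples)
    return total_calls, availability
```

For example, periods with 100 calls at availability 1.0 and 300 calls at availability 0.9 give 400 total calls and an availability of approximately 0.925.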

In some optional implementations of this embodiment, the at least one computing operation further includes: for each service in the monitored platform, determining the total calling times of the service in the duration to be counted based on the total calling times of each key in the service in the duration to be counted; determining the single-key availability ratio of each key in the time length to be counted based on the total calling times and the availability ratios of the keys in the time length to be counted and the total calling times of the service in the time length to be counted, and summing the single-key availability ratios of the keys in the service to obtain the availability ratio of the service in the time length to be counted, wherein the single-key availability ratio is positively correlated with the called times and the availability ratios of the keys in the time length to be counted and inversely correlated with the total calling times of the service in the time length to be counted.

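The availability of the j-th service in the statistical time can be calculated by formula (3) and formula (4).

$$N^{\mathrm{svc}}_j = \sum_{i} N^{\mathrm{key}}_i \tag{3}$$

$$A^{\mathrm{svc}}_j = \sum_{i} \frac{N^{\mathrm{key}}_i\, A^{\mathrm{key}}_i}{N^{\mathrm{svc}}_j} \tag{4}$$

where the sums run over the Keys of the j-th service; $N^{\mathrm{key}}_i$ and $A^{\mathrm{key}}_i$ represent the number of calls and the availability of the i-th Key in the statistical time; $N^{\mathrm{svc}}_j$ represents the number of times the j-th service is called within the statistical time; and $A^{\mathrm{svc}}_j$ represents the availability of the j-th service within the statistical time.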

In this implementation, the availability index value of each service can be calculated layer by layer from the number of calls and the availability data of each Key in each unit acquisition period acquired from the UMP, so that system stability can be evaluated and measured quantitatively.
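The service-level step can be sketched as a roll-up of key-level results. Here `key_stats` is a hypothetical mapping from key name to `(total calls, availability)` over the statistical time; keys with no calls are assumed to carry no weight.

```python
def service_availability(key_stats):
    """Return (total calls, availability) of one service.

    `key_stats` maps each key of the service to its
    (total calls, availability) over the statistical time.
    """
    total_calls = sum(calls for calls, _ in key_stats.values())
    if total_calls == 0:
        return 0, None  # no calls on any key: undefined in this sketch
    # Sum of single-key availabilities, each weighted by the key's share of calls.
    availability = sum(
        calls * avail / total_calls
        for calls, avail in key_stats.values()
        if calls > 0
    )
    return total_calls, availability
```

For example, keys with (400 calls, 0.925) and (100 calls, 1.0) give a service with 500 calls and an availability of approximately 0.94.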

In some optional implementations of this embodiment, the at least one computing operation further includes: for each application in the monitored platform, determining the total calling times of the application in the duration to be counted based on the total calling times of each service in the application in the duration to be counted; determining a single-service level weight of each service based on the service level and the service level weight of the service and the total service level of all services in the application, wherein the single-service level weight is positively correlated with the service level and the service level weight of the service and inversely correlated with the total service level of all services in the application; determining a single-service calling-number weight of each service based on the total calling times and the calling-number weight of the service in the duration to be counted and the total calling times of the application, wherein the single-service calling-number weight is positively correlated with the total calling times and the calling-number weight of the service in the duration to be counted and inversely correlated with the total calling times of the application; and taking the sum of the single-service level weight and the single-service calling-number weight as the availability weight of each service, and weighting the availabilities of all services in the application to obtain the availability of the application.

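The availability of the h-th application in the statistical time can be calculated by formulas (5) and (6).

$$N^{\mathrm{app}}_h = \sum_{j} N^{\mathrm{svc}}_j \tag{5}$$

$$A^{\mathrm{app}}_h = \sum_{j} \left( \alpha_j \frac{L_j}{\sum_{j'} L_{j'}} + \beta_j \frac{N^{\mathrm{svc}}_j}{N^{\mathrm{app}}_h} \right) A^{\mathrm{svc}}_j \tag{6}$$

where the sums run over the services of the h-th application; $N^{\mathrm{svc}}_j$ and $A^{\mathrm{svc}}_j$ represent the number of calls and the availability of the j-th service in the statistical time; $N^{\mathrm{app}}_h$ represents the number of times the h-th application is called within the statistical time; $A^{\mathrm{app}}_h$ represents the availability of the h-th application in the statistical time; $\alpha_j$ represents the weight of the service level of the j-th service, which may be defined manually; $L_j$ represents the service level of the j-th service, which may be rated manually; and $\beta_j$ represents the weight of the number of calls of the j-th service, which may also be defined manually.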

In this implementation, the availability index value of each application can be calculated layer by layer from the number of calls and the availability data of each Key in each unit acquisition period acquired from the UMP, so that system stability can be evaluated and measured quantitatively.
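The application-level weighting described above combines a service-level term and a call-count term for each service. The sketch below is one reading of that description under stated assumptions: each service carries its own manually defined `level_weight` and `calls_weight`, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ServiceStat:
    level: float         # manually rated service level
    level_weight: float  # manually defined weight of the service level
    calls: int           # total calls in the statistical time
    calls_weight: float  # manually defined weight of the call count
    availability: float  # availability in the statistical time


def application_availability(services: List[ServiceStat]) -> Tuple[int, Optional[float]]:
    """Return (total calls, availability) of one application."""
    total_calls = sum(s.calls for s in services)
    total_level = sum(s.level for s in services)
    if total_calls == 0 or total_level == 0:
        return total_calls, None  # undefined in this sketch
    availability = 0.0
    for s in services:
        level_term = s.level_weight * s.level / total_level  # single-service level weight
        calls_term = s.calls_weight * s.calls / total_calls  # single-service call-count weight
        availability += (level_term + calls_term) * s.availability
    return total_calls, availability
```

With two services, `(level 3, 500 calls, availability 0.94)` and `(level 1, 500 calls, availability 1.0)`, and both weights set to 0.5, the per-service weights come to 0.625 and 0.375 and the application availability is approximately 0.9625. The weights sum to 1 here only because the two manual weights do; the source does not state that they must.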

In some optional implementations of this embodiment, the at least one computing operation further includes: aiming at each center in the monitored platform, determining the total calling times of the center in the duration to be counted based on the total calling times of each application in the center in the duration to be counted; determining the availability ratio of each single application in the duration to be counted based on the total calling times and the availability ratios of the applications in the duration to be counted and the total calling times of the center in the duration to be counted, summing the availability ratios of the single applications of each application in the center to obtain the availability ratio of the center in the duration to be counted, wherein the availability ratio of the single application is positively correlated with the called times and the availability ratios of the applications in the duration to be counted and inversely correlated with the total calling times of the center in the duration to be counted.

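The availability of the i-th center in the statistical time can be calculated by the following formula (7):

$$A^{\mathrm{ctr}}_i = \sum_{h} \frac{N^{\mathrm{app}}_h\, A^{\mathrm{app}}_h}{\sum_{h'} N^{\mathrm{app}}_{h'}} \tag{7}$$

where the sums run over the applications of the i-th center; $N^{\mathrm{app}}_h$ and $A^{\mathrm{app}}_h$ represent the number of calls and the availability of the h-th application in the statistical time; and $A^{\mathrm{ctr}}_i$ represents the availability of the i-th center within the statistical time.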

In this implementation, the availability index value of each center can be calculated layer by layer from the number of calls and the availability data of each Key in each unit acquisition period acquired from the UMP, so that system stability can be evaluated and measured quantitatively.

In some optional implementations of this embodiment, the at least one computing operation further includes: determining the total calling times of the monitored platform in the duration to be counted based on the total calling times of each center in the monitored platform in the duration to be counted; determining the single-center availability of each center in the duration to be counted based on the total calling times and the availability of the center in the duration to be counted and the total calling times of the monitored platform in the duration to be counted, and summing the single-center availabilities of the centers of the monitored platform to obtain the availability of the monitored platform in the duration to be counted, wherein the single-center availability is positively correlated with the total calling times and the availability of the center in the duration to be counted and inversely correlated with the total calling times of the monitored platform in the duration to be counted.

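The overall availability of the platform system within the statistical time can be calculated by formula (8).

$$A^{\mathrm{plat}} = \sum_{i} \frac{N^{\mathrm{ctr}}_i\, A^{\mathrm{ctr}}_i}{\sum_{i'} N^{\mathrm{ctr}}_{i'}} \tag{8}$$

where the sums run over the centers of the monitored platform; $A^{\mathrm{plat}}$ represents the overall availability of the platform within the statistical time (5 minutes in this example); $A^{\mathrm{ctr}}_i$ represents the availability of the i-th center within the statistical time; and $N^{\mathrm{ctr}}_i$ represents the number of times the i-th center is called within the statistical time.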

In this implementation, the index value of the availability of the whole platform can be calculated layer by layer from the called times and availability data of each key in each unit acquisition period, as acquired from the UMP, so that system stability is evaluated and measured quantitatively.
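The layer-by-layer roll-up for centers and for the platform can be sketched as follows. This is a minimal illustration that assumes each level is a call-weighted average of the level below it, which is one reading of the correlations stated above. The function name `roll_up` and the `(total_calls, availability)` tuple layout are hypothetical and do not come from the original text.

```python
from typing import Sequence, Tuple

# (total calling times, availability) of one child node; hypothetical layout.
Node = Tuple[int, float]


def roll_up(children: Sequence[Node]) -> Node:
    """Aggregate child nodes into their parent as a call-weighted average.

    Assumption: parent availability = sum(calls_j * avail_j / parent_calls).
    """
    total_calls = sum(calls for calls, _ in children)
    if total_calls == 0:
        # Behaviour for an idle parent is not specified in the text.
        raise ValueError("no calls recorded in the duration to be counted")
    availability = sum(calls * avail / total_calls for calls, avail in children)
    return total_calls, availability


# Applications -> centers -> platform.
center_1 = roll_up([(100, 1.0), (100, 0.5)])
center_2 = roll_up([(600, 1.0)])
platform = roll_up([center_1, center_2])
```

The same function serves both levels because the center and platform calculations described above share the same structure; only the inputs differ.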

Step 203, when a monitoring data query request of a user is received, obtaining the availability information requested by the monitoring data query request from the generated availability information.

In this embodiment, the electronic device on which the method for monitoring the platform operates can continuously detect the monitoring data query request sent by the user. When receiving the monitoring data query request, the electronic device may obtain the availability information requested by the monitoring data query request from the availability information generated in step 202.

Step 204, displaying the queried availability information.

In this embodiment, based on the availability information obtained in step 203, the electronic device may display the availability information in a graphical interface manner, so as to facilitate understanding of the availability of the monitored platform.

Optionally, the graphical display may be performed by calling a JSF interface provided by JAVASCRIPT. The JSF interface can set the horizontal coordinate data xlist (linklist) required for displaying the graph, and the availability data can be used as the vertical coordinate data, so that each data point is positioned in the graphical interface according to its value. The front-end page can use a number of open-source element components and can also extend their functions. The JSF interface can provide all data required for page display, which keeps the page system tidy; it also provides all keys to be collected and the association relations among the data used when calculating the availability information.

The method provided by this embodiment of the application calculates and displays the availability information of the monitored platform in each dimension within the duration to be counted, based on the originally acquired called times and availability of each key in the monitored platform in each unit acquisition period within the duration to be counted. This helps operation and maintenance personnel quickly understand the overall operating condition of the platform in each dimension within the duration to be counted.

With further reference to FIG. 3, a flow 300 of yet another embodiment of a method for monitoring a platform is shown. The process 300 of the method for monitoring a platform includes the steps of:

Step 301, storing the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted in the distributed publish-subscribe message system.

In this embodiment, when the called times and the availability of each key in each unit acquisition period within the duration to be counted are obtained, the obtained information may be stored in the distributed publish-subscribe message system. Optionally, the distributed publish-subscribe message system may be a Kafka message queue.

Step 302, using a distributed real-time computing system to consume the messages in the distributed publish-subscribe message system, and performing at least one computing operation on the consumed messages in parallel to generate at least one availability information.

In this embodiment, the electronic device may consume the messages in the distributed publish-subscribe message system using a distributed real-time computing system. The distributed real-time computing system has a plurality of computing units that can be executed concurrently and can be used to execute the at least one computing operation on the consumed messages in parallel to generate at least one type of availability information. Optionally, after the messages are consumed from the distributed publish-subscribe message system and before the computing units perform the computation, the messages may be converted to a uniform data format for subsequent computation. In addition, the data can be deduplicated, which prevents the data stored in the distributed storage system mentioned below from being computed repeatedly and thus avoids data confusion and wasted computing capacity. Optionally, the distributed real-time computing system may be Storm, and during the computation, the computing operations may be performed concurrently by the bolt computing units in Storm. The bolt computing units may include a bolt for time dimension calculation and a bolt for system dimension calculation.
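The format conversion and deduplication described above can be sketched in plain Python, independent of any particular messaging or stream-processing library. The raw field names (`keyId`, `time`, `calls`, `availability`) and the choice of `(key_id, period_start)` as the duplicate identity are assumptions made for illustration; the original text does not specify the message layout.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple


def normalize(raw: Dict[str, object]) -> Dict[str, object]:
    """Convert one consumed message to a uniform record (hypothetical fields)."""
    return {
        "key_id": int(raw["keyId"]),
        "period_start": str(raw["time"]),
        "calls": int(raw["calls"]),
        "availability": float(raw["availability"]),
    }


def deduplicate(messages: Iterable[Dict[str, object]]) -> Iterator[Dict[str, object]]:
    """Yield each (key, acquisition period) pair at most once.

    Assumption: a repeated delivery carries the same key id and period start.
    """
    seen: Set[Tuple[int, str]] = set()
    for raw in messages:
        record = normalize(raw)
        identity = (record["key_id"], record["period_start"])
        if identity in seen:
            continue  # already counted; skip to avoid double computation
        seen.add(identity)
        yield record


consumed = [
    {"keyId": "12351", "time": "2016-10-11T20:55", "calls": "40", "availability": "0.99"},
    {"keyId": "12351", "time": "2016-10-11T20:55", "calls": "40", "availability": "0.99"},
    {"keyId": "12352", "time": "2016-10-11T20:55", "calls": "10", "availability": "1.0"},
]
unique_records = list(deduplicate(consumed))
```

An in-memory set only works within a single process; in a distributed deployment the seen-set would need to live in shared state, which this sketch does not attempt.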

Step 303, storing the generated at least one type of availability information into a distributed storage system.

In this embodiment, the electronic device may store the at least one availability information generated in step 302 in a distributed storage system. Optionally, the distributed storage system may be HBase.

Step 304, when a monitoring data query request of a user is received, obtaining availability information matched with the monitoring data query request from the availability information stored in the distributed storage system.

In this embodiment, when receiving the monitoring data query request, the electronic device may obtain availability information matched with the monitoring data query request from the availability information stored in the distributed storage system.

Step 305, displaying the queried availability information.

In this embodiment, the specific processing of step 305 may refer to step 204 in the corresponding embodiment of fig. 2, which is not described herein again.

In some optional implementations of this embodiment, the monitoring data query request includes different time dimensions as query parameters. In this case, step 304 may specifically include: when a monitoring data query request is received, assembling a query primary key using the query parameters of the monitoring data query request; and parsing the query primary key to query the availability information of different time dimensions in the distributed storage system. For example, the query primary key may be named rowkey, an example of which is key_12351_2016_10_41_11_20_55. Parsing this example rowkey gives key (query identification) _12351 (keyId) _2016 (year) _10 (month) _41 (week) _11 (day) _20 (hour) _55 (minute). The different time dimensions may include time dimensions with a single time span, such as real time, the current day, the current week and the current month. In addition, when the rowkey is spliced according to time, the implementation may also handle the year (year, month and week), month (month and week) and week (week) cases so as to work out the correct values and splice a valid rowkey and rowkeyList, so that data can be queried smoothly in the distributed storage system. When the distributed storage system is HBase, it supports two types of queries: a get, which retrieves the required fields of a single record by rowKey, and a scan, which retrieves the required fields of multiple records by rowKeyList.
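The rowkey layout in the example above can be assembled and parsed as in the following sketch. The function names are hypothetical. Two points are assumptions rather than facts from the text: the week field is taken to be the ISO week number, which matches the example (11 October 2016 falls in ISO week 41), and the fields are written without zero padding, which the example cannot confirm because all its values have two or more digits.

```python
from datetime import datetime
from typing import Dict

TIME_FIELDS = ("year", "month", "week", "day", "hour", "minute")


def build_rowkey(prefix: str, key_id: int, ts: datetime) -> str:
    """Splice prefix, key id and time fields with underscores."""
    week = ts.isocalendar()[1]  # assumption: ISO week number
    parts = [prefix, key_id, ts.year, ts.month, week, ts.day, ts.hour, ts.minute]
    return "_".join(str(part) for part in parts)


def parse_rowkey(rowkey: str) -> Dict[str, object]:
    """Split a rowkey back into its named fields."""
    prefix, key_id, *time_parts = rowkey.split("_")
    if len(time_parts) != len(TIME_FIELDS):
        raise ValueError(f"unexpected rowkey layout: {rowkey!r}")
    parsed: Dict[str, object] = {"prefix": prefix, "key_id": int(key_id)}
    parsed.update(zip(TIME_FIELDS, (int(part) for part in time_parts)))
    return parsed


rowkey = build_rowkey("key", 12351, datetime(2016, 10, 11, 20, 55))
fields = parse_rowkey(rowkey)
```

The parser assumes the prefix itself contains no underscore. Near a year boundary the ISO week can belong to the neighbouring ISO year, which is one reason the year, month and week cases may need separate handling as noted above.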

In some optional implementations of this embodiment, displaying the queried availability information includes: judging, according to the availability information, whether the availability is lower than an availability threshold; and if the availability is lower than the availability threshold, displaying the availability information in a preset graphic style. In this implementation, availability information whose availability is lower than the availability threshold is displayed in a pre-designed graphic style, so that operation and maintenance personnel can quickly identify abnormal data and respond as early as possible.
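The threshold check can be sketched as a single function. The default threshold of 0.999 and the style names `"alert"` and `"normal"` are hypothetical placeholders; the original text does not give concrete values.

```python
def display_style(availability: float, threshold: float = 0.999) -> str:
    """Pick the graphic style for one availability value.

    Values strictly below the threshold get the preset alert style.
    """
    return "alert" if availability < threshold else "normal"


styles = [display_style(value) for value in (1.0, 0.999, 0.95)]
```

A value exactly equal to the threshold is treated as normal here, following the "lower than" wording above.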

As can be seen from fig. 3, compared with the embodiment corresponding to fig. 2, the flow 300 of the method for monitoring a platform in this embodiment implements distributed data transmission, calculation and storage during data processing through the distributed publish-subscribe message system, the distributed real-time computing system and the distributed storage system, thereby improving the efficiency of data processing.

With further reference to fig. 4, as an implementation of the method shown in the above-mentioned figures, the present application provides an embodiment of an apparatus for monitoring a platform, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be applied to various electronic devices.

As shown in fig. 4, the apparatus 400 for monitoring a platform of the present embodiment includes: an acquisition unit 401, a calculation unit 402, a query unit 403 and a presentation unit 404. The acquiring unit 401 is configured to acquire the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted; the calculating unit 402 is configured to execute at least one calculating operation based on the called times and the availability of each key in each unit acquisition cycle, and generate at least one availability information in a to-be-counted duration, where each availability information is any of the following: the availability of each key in the monitored platform, the availability of each service in the monitored platform, the availability of each application in the monitored platform, the availability of each center in the monitored platform and the availability of the monitored platform; the query unit 403 is configured to, when receiving a monitoring data query request from a user, obtain availability information that matches the monitoring data query request from the generated availability information; and the presentation unit 404 is used for presenting the queried availability information.

In this embodiment, the specific processing of the acquisition unit 401, the calculation unit 402, the query unit 403 and the presentation unit 404 of the apparatus 400 for monitoring a platform may refer to step 201, step 202, step 203 and step 204 in the corresponding embodiment of fig. 2, and is not described herein again.

In some optional implementations of this embodiment, the at least one computing operation includes: for each key in the monitored platform, determining the total calling times of the key in the time length to be counted based on the called times of the key in each unit acquisition period in the time length to be counted and the number of the unit acquisition periods contained in the time length to be counted; determining the single-period availability ratio of the key in each unit acquisition period based on the called times and the availability ratio of the key in each unit acquisition period in the to-be-counted time length and the total calling times of the key in the to-be-counted time length, summing the single-period availability ratios in each unit acquisition period in the to-be-counted time length to obtain the availability ratio of the key in the to-be-counted time length, wherein the single-period availability ratio is positively correlated with the called times and the availability ratios in the unit acquisition periods and inversely correlated with the total calling times of the key in the to-be-counted time length.
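The key-level calculation described above can be sketched as follows, assuming that the single-period availability is the period's called times multiplied by its availability and divided by the key's total calling times. This is one form that satisfies the stated correlations. The function name and the `(called_times, availability)` tuple layout are hypothetical.

```python
from typing import Sequence, Tuple


def key_availability(periods: Sequence[Tuple[int, float]]) -> float:
    """Availability of one key over the duration to be counted.

    Each item is (called_times, availability) for one unit acquisition period.
    """
    total_calls = sum(calls for calls, _ in periods)
    if total_calls == 0:
        # Behaviour for a key that was never called is not specified in the text.
        raise ValueError("key was not called in the duration to be counted")
    # Sum of single-period availabilities, each weighted by its share of calls.
    return sum(calls * avail / total_calls for calls, avail in periods)


result = key_availability([(100, 1.0), (300, 0.9)])
```

In this example the second period carries three quarters of the calls, so the result (0.925) sits closer to 0.9 than to 1.0.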

In some optional implementations of this embodiment, the at least one computing operation further includes: for each service in the monitored platform, determining the total calling times of the service in the duration to be counted based on the total calling times of each key in the service in the duration to be counted; determining the single-key availability ratio of each key in the time length to be counted based on the total calling times and the availability ratios of the keys in the time length to be counted and the total calling times of the service in the time length to be counted, and summing the single-key availability ratios of the keys in the service to obtain the availability ratio of the service in the time length to be counted, wherein the single-key availability ratio is positively correlated with the called times and the availability ratios of the keys in the time length to be counted and inversely correlated with the total calling times of the service in the time length to be counted. The specific processing of this implementation may refer to a corresponding implementation in the corresponding embodiment of fig. 2, which is not described herein again.

In some optional implementations of this embodiment, the at least one computing operation further includes: for each application in the monitored platform, determining the total calling times of all services in the application in the duration to be counted based on the total calling times of each service in the application in the duration to be counted; determining a single-service level weight of each service based on the service level and the service level weight of the service and the total service level of the services in the application, wherein the single-service level weight is positively correlated with the service level and the service level weight of the service within the duration to be counted and inversely correlated with the total service level of the services in the application; determining a single-service calling times weight of each service based on the total calling times and the calling times weight of the service within the duration to be counted and the total calling times of all services in the application, wherein the single-service calling times weight is positively correlated with the total calling times and the calling times weight of the service in the duration to be counted and inversely correlated with the total calling times of all services in the application; and taking the sum of the single-service level weight and the single-service calling times weight as the availability weight of each service, and weighting the availability of all services in the application to obtain the availability of the application. The specific processing of this implementation may refer to a corresponding implementation in the corresponding embodiment of fig. 2, which is not described herein again.
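The application-level weighting can be sketched as below. The class `ServiceStats`, the numeric service level scale, and the default weights of 0.5 each are assumptions for illustration only; the original text does not state concrete weight values. The sketch also assumes that "weighting" means a weighted sum of the service availabilities.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ServiceStats:
    level: float         # service level on a hypothetical numeric scale
    total_calls: int     # total calling times in the duration to be counted
    availability: float  # availability of the service in that duration


def application_availability(
    services: Sequence[ServiceStats],
    level_weight: float = 0.5,  # assumed service level weight
    calls_weight: float = 0.5,  # assumed calling times weight
) -> float:
    total_level = sum(s.level for s in services)
    total_calls = sum(s.total_calls for s in services)
    if total_level == 0 or total_calls == 0:
        raise ValueError("application has no service level or no calls")
    result = 0.0
    for s in services:
        single_level_weight = s.level * level_weight / total_level
        single_calls_weight = s.total_calls * calls_weight / total_calls
        # Availability weight = level share + call share, then weighted sum.
        result += (single_level_weight + single_calls_weight) * s.availability
    return result


app = application_availability(
    [ServiceStats(level=3, total_calls=100, availability=1.0),
     ServiceStats(level=1, total_calls=300, availability=0.8)]
)
```

When the two weights sum to 1, the per-service availability weights also sum to 1, so the result stays within the range of the input availabilities. Here a rarely called but high-level service counts as much as a heavily called low-level one.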

In some optional implementations of the present embodiment, the acquisition unit 401 is further configured to: store the called times and the availability ratio of each key in the monitored platform in each unit acquisition period within the duration to be counted in a distributed publish-subscribe message system; the calculation unit 402 is further configured to: use a distributed real-time computing system to consume the messages in the distributed publish-subscribe message system, execute at least one computing operation on the consumed messages in parallel to generate at least one availability information, and store the generated at least one availability information in a distributed storage system; and the query unit 403 is further configured to: when a monitoring data query request of a user is received, obtain availability information matched with the monitoring data query request from the availability information stored in the distributed storage system.

In some optional implementations of this embodiment, the monitoring data query request includes different time dimensions as query parameters; and the query unit 403 is further configured to: when a monitoring data query request is received, assemble a query primary key using the query parameters of the monitoring data query request; and parse the query primary key to query the availability information of different time dimensions in the distributed storage system.

In some optional implementations of the present embodiment, the presentation unit 404 is further configured to: judge, according to the availability information, whether the availability is lower than an availability threshold; and if the availability is lower than the availability threshold, display the availability information in a preset graphic style.

In addition, the present application also provides an apparatus, comprising: one or more processors; a storage device for storing one or more programs which, when executed by one or more processors, cause the one or more processors to implement the method as described in the embodiments and any one of the implementations of fig. 2 or fig. 3.

Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for implementing the apparatus of an embodiment of the present application. The computer system shown in fig. 5 is only an example and should not impose any limitation on the functions or the scope of use of the embodiments of the present application.

As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.

The following components are connected to the I/O interface 505: an input portion 506; an output portion 507; a storage portion 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 is mounted on the drive 510 as needed, so that a computer program read out therefrom is installed into the storage section 508 as needed.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program performs the above-described functions defined in the method of the present application when executed by the Central Processing Unit (CPU) 501.

It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including an acquisition unit, a calculation unit, a query unit, and a presentation unit. The names of the units do not, under certain conditions, limit the units themselves; for example, the acquisition unit may also be described as a unit for acquiring the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted.

As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquiring the called times and the availability ratio of each key in the monitored platform in each unit acquisition period within the time length to be counted; executing at least one calculation operation based on the called times and the availability of each key in each unit acquisition period, and generating at least one availability information in the time length to be counted, wherein each availability information is any one of the following: the availability of each key in the monitored platform, the availability of each service in the monitored platform, the availability of each application in the monitored platform, the availability of each center in the monitored platform, and the availability of the monitored platform; when a monitoring data query request of a user is received, acquiring availability information matched with the monitoring data query request from the generated availability information; and displaying the inquired availability information.

The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims

1. A method for monitoring a platform, the method comprising:

acquiring the called times and the availability ratio of each key in the monitored platform in each unit acquisition period within the time length to be counted;

executing at least one calculation operation based on the called times and the availability of each key in each unit acquisition period, and generating at least one availability information in the time length to be counted, wherein each availability information is any one of the following: the availability of each key in the monitored platform, the availability of each service in the monitored platform, the availability of each application in the monitored platform, the availability of each center in the monitored platform, and the availability of the monitored platform;

when a monitoring data query request of a user is received, acquiring availability information matched with the monitoring data query request from the generated availability information;

displaying the queried availability information; and

the at least one computing operation comprising:

for each key in the monitored platform, determining the total calling times of the key in the duration to be counted based on the called times of the key in each unit acquisition period in the duration to be counted and the number of the unit acquisition periods contained in the duration to be counted;

determining the single-period availability of the key in each unit acquisition period based on the called times and the availability of the key in each unit acquisition period in the to-be-counted time length and the total calling times of the key in the to-be-counted time length, summing the single-period availability of the key in each unit acquisition period in the to-be-counted time length to obtain the availability of the key in the to-be-counted time length, wherein the single-period availability of the key in each unit acquisition period is positively correlated with the called times of the key in each unit acquisition period in the to-be-counted time length and the availability of the key in each unit acquisition period in the to-be-counted time length, and is inversely correlated with the total calling times of the key in the to-be-counted time length.

2. The method of claim 1, wherein the at least one computing operation further comprises:

for each service in the monitored platform, determining the total calling times of the service in the duration to be counted based on the total calling times of each key in the service in the duration to be counted;

determining the single-key availability ratio of each key in the duration to be counted based on the total calling times and the availability ratio of the key in the duration to be counted and the total calling times of the service in the duration to be counted, and summing the single-key availability ratios of the keys in the service to obtain the availability ratio of the service in the duration to be counted, wherein the single-key availability ratio is positively correlated with the called times and the availability ratio of the key in the duration to be counted and inversely correlated with the total calling times of the service in the duration to be counted.

3. The method of claim 2, wherein the at least one computing operation further comprises:

for each application in the monitored platform, determining the total calling times of all services in the application in the duration to be counted based on the total calling times of each service in the application in the duration to be counted;

determining a single-service level weight of each service based on the service level and the service level weight of the service and the total service level of the services in the application, wherein the single-service level weight is positively correlated with the service level and the service level weight of the service within the duration to be counted and inversely correlated with the total service level of the services in the application;

determining a single-service calling times weight of each service based on the total calling times and the calling times weight of the service within the duration to be counted and the total calling times of all services in the application, wherein the single-service calling times weight is positively correlated with the total calling times and the calling times weight of the service within the duration to be counted and inversely correlated with the total calling times of all services in the application;

and taking the sum of the single-service level weight and the single-service calling times weight as the availability weight of each service, and weighting the availability of all services in the application to obtain the availability of the application.

4. The method of claim 3, wherein the at least one computing operation further comprises:

for each center in the monitored platform, determining the total calling times of the center in the duration to be counted based on the total calling times of each application in the center in the duration to be counted;

determining the single-application availability ratio of each application in the duration to be counted based on the total calling times and the availability ratio of the application in the duration to be counted and the total calling times of the center in the duration to be counted, and summing the single-application availability ratios of the applications in the center to obtain the availability ratio of the center in the duration to be counted, wherein the single-application availability ratio is positively correlated with the called times and the availability ratio of the application in the duration to be counted and inversely correlated with the total calling times of the center in the duration to be counted.

5. The method of claim 4, wherein the at least one computing operation further comprises:

determining the total calling times of the monitored platform in the duration to be counted based on the total calling times of each center in the monitored platform in the duration to be counted;

determining the single-center availability ratio of each center in the duration to be counted based on the total calling times and the availability ratio of the center in the duration to be counted and the total calling times of the monitored platform in the duration to be counted, and summing the single-center availability ratios of the centers of the monitored platform to obtain the availability ratio of the monitored platform in the duration to be counted, wherein the single-center availability ratio is positively correlated with the called times and the availability ratio of the center in the duration to be counted and inversely correlated with the total calling times of the monitored platform in the duration to be counted.

6. The method of claim 1, wherein the acquiring of the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted comprises:

storing the called times and the availability ratio of each key in the monitored platform in each unit acquisition period within the duration to be counted in a distributed publishing and subscribing message system; and

the executing of at least one calculation operation based on the called times and the availability of each key in each unit acquisition period to generate at least one piece of availability information in the duration to be counted comprises:

using a distributed real-time computing system to consume the messages in the distributed publish-subscribe message system and perform the at least one computing operation on the consumed messages in parallel to generate the at least one availability information;

storing the generated at least one availability information to a distributed storage system; and

when a monitoring data query request of a user is received, obtaining availability information matched with the monitoring data query request from the generated availability information, including:

when a monitoring data query request of a user is received, obtaining availability information matched with the monitoring data query request from the availability information stored in the distributed storage system.
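The data flow of this claim can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: `queue.Queue` stands in for the distributed publish-subscribe message system, a thread pool for the distributed real-time computing system, a `dict` for the distributed storage system, and the message fields (`key`, `calls`, `availability`) are hypothetical.

```python
import queue
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def key_availability(samples):
    """Call-weighted availability of one key over its (calls, availability) samples."""
    total = sum(calls for calls, _ in samples)
    if total == 0:
        return None  # undefined when the key was never called
    return sum(calls * avail for calls, avail in samples) / total


def run_pipeline(message_queue, store):
    """Consume all queued messages, compute per-key availability in parallel, store results."""
    grouped = defaultdict(list)
    while True:
        try:
            msg = message_queue.get_nowait()
        except queue.Empty:
            break
        grouped[msg["key"]].append((msg["calls"], msg["availability"]))
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so results line up with the keys.
        results = pool.map(key_availability, grouped.values())
        for key, value in zip(grouped.keys(), results):
            store[key] = value
```

A query handler would then read from `store` only, so queries never touch the raw per-period samples.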

7. The method of claim 6, wherein the monitoring data query request includes different time dimensions as query parameters; and

when a monitoring data query request of a user is received, obtaining availability information matched with the monitoring data query request from the availability information stored in the distributed storage system, including:

when the monitoring data query request is received, assembling a query primary key from the query parameters of the monitoring data query request;

and parsing the query primary key so as to query the availability information of different time dimensions in the distributed storage system.
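The claim does not fix a key layout. As a hedged sketch, assume a hypothetical primary key of the form `level|name|time_dimension|time_bucket` and a store that keeps one table per time dimension (both are illustrative choices, not taken from the claim):

```python
from typing import Dict, NamedTuple, Optional

SEP = "|"  # hypothetical field separator


class QueryKey(NamedTuple):
    level: str           # e.g. "key", "service", "app", "center", "platform"
    name: str            # name of the monitored entity
    time_dimension: str  # e.g. "minute", "hour", "day"
    time_bucket: str     # e.g. "2017051710"


def assemble_key(level: str, name: str, time_dimension: str, time_bucket: str) -> str:
    """Assemble the query primary key from the request's query parameters."""
    return SEP.join((level, name, time_dimension, time_bucket))


def parse_key(primary_key: str) -> QueryKey:
    """Parse a primary key back into its fields."""
    parts = primary_key.split(SEP)
    if len(parts) != 4:
        raise ValueError(f"malformed query primary key: {primary_key!r}")
    return QueryKey(*parts)


def lookup(store: Dict[str, Dict[str, float]], primary_key: str) -> Optional[float]:
    """Use the parsed time dimension to pick the table, then fetch the row."""
    parsed = parse_key(primary_key)
    return store.get(parsed.time_dimension, {}).get(primary_key)
```

Because the time dimension is part of the key, the same handler can serve minute, hour, and day queries without separate code paths.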

8. The method according to claim 7, wherein the displaying of the queried availability information comprises:

judging whether the availability is lower than an availability threshold value according to the availability information;

and if the availability is lower than the availability threshold value, displaying the availability information in a preset pattern.
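A minimal sketch of the threshold check, assuming the preset pattern is a plain-text `[LOW]` marker (the marker and the function name are hypothetical; a real display might use a color or icon instead):

```python
def render_availability(availability: float, threshold: float) -> str:
    """Format an availability value, marking it when it falls below the threshold."""
    text = f"{availability:.2%}"
    if availability < threshold:
        # Preset pattern for low availability (illustrative choice).
        return f"[LOW] {text}"
    return text
```

For example, `render_availability(0.95, 0.99)` yields `[LOW] 95.00%`, while a value at or above the threshold is shown unmarked.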

9. An apparatus for monitoring a platform, the apparatus comprising:

the acquisition unit is used for acquiring the called times and the availability of each key in the monitored platform in each unit acquisition period within the duration to be counted;

the calculation unit is used for executing at least one calculation operation based on the called times and the availability of each key in each unit acquisition period, and generating at least one piece of availability information in the duration to be counted, wherein each piece of availability information is any one of the following: the availability of each key in the monitored platform, the availability of each service in the monitored platform, the availability of each application in the monitored platform, the availability of each center in the monitored platform, and the availability of the monitored platform; the at least one calculation operation comprises: for each key in the monitored platform, determining the total calling times of the key in the duration to be counted based on the called times of the key in each unit acquisition period in the duration to be counted and the number of unit acquisition periods contained in the duration to be counted; determining the single-period availability of the key in each unit acquisition period based on the called times, the availability and the total calling times of the key in each unit acquisition period in the duration to be counted; and summing the single-period availability in each unit acquisition period in the duration to be counted to obtain the availability of the key in the duration to be counted, wherein the single-period availability of the key in each unit acquisition period is positively correlated with the called times of the key in each unit acquisition period in the duration to be counted and the availability of the key in each unit acquisition period in the duration to be counted, and is inversely correlated with the total calling times of the key in the duration to be counted;

the query unit is used for acquiring the availability rate information matched with the monitoring data query request from the generated availability rate information when the monitoring data query request of a user is received;

and the display unit is used for displaying the inquired availability information.

10. The apparatus of claim 9, wherein the acquisition unit is further configured to:

the calculation unit is further configured to:

the query unit is further configured to:

11. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.

12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 8.