WO2015085963A1 - Monitoring method, device and system based on a distributed system - Google Patents

Monitoring method, device and system based on a distributed system

Info

Publication number
WO2015085963A1
WO2015085963A1 PCT/CN2015/072372 CN2015072372W WO2015085963A1 WO 2015085963 A1 WO2015085963 A1 WO 2015085963A1 CN 2015072372 W CN2015072372 W CN 2015072372W WO 2015085963 A1 WO2015085963 A1 WO 2015085963A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
report record
statistical
report
record
Prior art date
Application number
PCT/CN2015/072372
Other languages
English (en)
French (fr)
Inventor
程章敏
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2015085963A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0686 Additional information in the notification, e.g. enhancement of specific meta-data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation

Definitions

  • the present invention relates to the field of distributed system monitoring technologies, and in particular, to a monitoring method, device and system based on a distributed system.
  • The monitoring process of a traditional distributed system is as follows: a process in each service terminal periodically outputs its own operation (that is, the events that occur on the statistical interfaces of the service terminal, where each event includes multiple dimensions related to the statistical interface) to a log.
  • A script on the service terminal uploads the data in the log to a unified statistics node.
  • The statistics node can then compile statistics for each statistical interface and each dimension according to the data in the log of each service terminal, and thereby learn the operation of the system.
  • To make the monitoring process easier to follow, a simple example illustrates it. A process in a service terminal can output a login event occurring on a service login interface to a log, where the login event includes dimensions such as the login time and the login location.
  • For the login event "Log in to the application space in Shanghai at 14:30 on September 23", "14:30 on September 23" is the time dimension of the login event, and
  • "Shanghai" is the location dimension of the login event.
  • The statistics node performs related statistics based on these login events. For example, it can count the number of login events at a certain location.
  • the embodiment of the present invention provides a monitoring method, device and system based on the distributed system, and the technical solution is as follows.
  • a monitoring method based on a distributed system comprising:
  • receiving a report record that is sent in real time by at least one service terminal, where the report record is used to describe an event that occurs on a statistical interface of the service terminal, and the report record includes an identifier of the statistical interface, an occurrence time of the event, and at least one dimension used to describe the event;
  • selecting the report records that meet a configuration rule, where a report record that meets the configuration rule is a report record that includes the identifier of a specified statistical interface and a specified dimension.
  • a monitoring method based on a distributed system which is applied to a service terminal, and the method includes:
  • generating a report record, where the report record is used to describe an event that occurs on the statistical interface of the service terminal, and the report record includes an identifier of the statistical interface, an occurrence time of the event, and at least one dimension used to describe the event;
  • the report record is sent to the statistics node in real time.
  • a monitoring device based on a distributed system comprising:
  • a receiving module configured to receive a report record that is sent in real time by at least one service terminal, where the report record is used to describe an event that occurs on the statistical interface of the service terminal, and the report record includes an identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • a selection module configured to select the report records that meet the configuration rule, where a report record that meets the configuration rule is a report record that includes the identifier of the specified statistical interface and the specified dimension;
  • a statistics module configured to count, according to the occurrence time of the event included in each report record selected by the selection module, the number of those report records in a specified time period.
  • a fourth aspect provides a monitoring device based on a distributed system, which is applied to a service terminal, where the device includes:
  • a collecting module configured to collect an event that occurs on the statistical interface when the statistical interface is invoked;
  • a generating module configured to generate a report record according to the event collected by the collecting module, where the report record is used to describe an event that occurs on the statistical interface of the service terminal, and the report record includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • a sending module configured to send the report record to the statistical node in real time.
  • In a fifth aspect, a monitoring system is provided, which includes a statistical node and at least one service terminal;
  • the statistical node includes the distributed-system-based monitoring device described in the third aspect;
  • the service terminal includes the distributed-system-based monitoring device described in the fourth aspect.
  • By receiving a report record sent by at least one service terminal, selecting the report records that meet the configuration rule, and counting, according to the occurrence time, the number of report records that meet the configuration rule within a specified time period, the solution addresses the relatively poor real-time performance with which the prior art obtains the operation of the system. Because the statistical node counts the specified events from report records that the service terminals report in real time, the length of the specified time period can be set arbitrarily. Monitoring can therefore be carried out at second-level granularity, which achieves real-time, rapid monitoring of the operation of the system.
  • FIG. 1 is a schematic diagram of an implementation environment involved in a monitoring method based on a distributed system provided in an embodiment of the present invention
  • FIG. 2 is a flowchart of a distributed-system-based monitoring method provided in an embodiment of the present invention.
  • FIG. 3 is a flowchart of a distributed-system-based monitoring method according to another embodiment of the present invention.
  • FIG. 4 is a flowchart of a distributed-system-based monitoring method according to still another embodiment of the present invention.
  • FIG. 5 is a schematic diagram showing the internal structure relationship of a statistical node provided in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a monitoring system provided in an embodiment of the present invention.
  • FIG. 7 is a schematic illustration of a monitoring system provided in another embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a service terminal according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a statistical node provided in an embodiment of the present invention.
  • At least one as used herein includes one, two or more.
  • the implementation environment may include a statistical node 120 and at least one service terminal 140, and the statistical node 120 may be connected to the service terminal 140 by wire or wirelessly.
  • the statistics node 120 can obtain the service running data reported by the service terminal in real time, and perform statistics according to the data.
  • the statistical node 120 can be a server, or a server cluster consisting of several servers, or a cloud computing service center.
  • the service terminal 140 runs a service, and can report the running status of the service to the statistics node 120.
  • The term "service terminal" as used herein may include, but is not limited to, a smartphone, a tablet, a PDA (Personal Digital Assistant, a handheld computer), an e-reader, a multimedia TV, an MP4 (Moving Picture Experts Group Audio Layer IV) player, and so on.
  • Referring to FIG. 2, a flowchart of a distributed-system-based monitoring method provided in an embodiment of the present invention is shown.
  • The method is described using its application to the statistical node 120 in the implementation environment shown in FIG. 1 as an example.
  • the distributed system based monitoring method can include the following steps.
  • S201: Receive a report record that is sent in real time by at least one service terminal, where the report record is used to describe an event that occurs on the statistical interface of the service terminal, and the report record includes an identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event.
  • the events mentioned here are usually the business behavior of the user.
  • A statistical interface is generally an interface in the service that is used for statistics and can be called by the service process. When such an interface is called, the service terminal learns of the event that occurs on it.
  • the dimension may be attribute information related to the event, such as location information, access method, attribution information, and the like.
  • The report records that meet the configuration rule are selected, where a report record that meets the configuration rule is a report record that includes the identifier of the specified statistical interface and the specified dimension.
  • The statistics node can flexibly specify which statistical interfaces and dimensions described in the report records (hereinafter referred to as specified events) are counted. Since each event can be described by a statistical interface and dimensions, specifying a statistical interface and dimensions for the statistics also specifies the event that they describe.
  • The statistics node counts the number of report records in the specified time period according to the occurrence time of the event described by each selected report record.
  • The specified time period is set by the statistics node according to the purpose of monitoring and the stability of the service. For example, when the statistics node needs to monitor the running status of the distributed system in real time, the specified time period can be set to a small value, such as 1 second; when the statistics node needs to count the daily running of the distributed system, the specified time period can be set to 1 day, and so on.
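  • The counting step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `count_in_period`, the sample timestamps, and the choice of a half-open interval (start included, end excluded) are all assumptions made for the example. It shows the same occurrence times counted once with a 1-second period and once with a 1-day period.

```python
from datetime import datetime, timedelta


def count_in_period(occurrence_times, period_start, period_length):
    """Count events whose occurrence time lies in [period_start, period_start + period_length)."""
    period_end = period_start + period_length
    return sum(1 for t in occurrence_times if period_start <= t < period_end)


# Hypothetical occurrence times taken from already-selected report records.
times = [
    datetime(2013, 7, 15, 8, 0, 0, 200000),
    datetime(2013, 7, 15, 8, 0, 0, 900000),
    datetime(2013, 7, 15, 8, 0, 1, 500000),
    datetime(2013, 7, 15, 17, 30, 0),
]
start = datetime(2013, 7, 15, 8, 0, 0)

per_second = count_in_period(times, start, timedelta(seconds=1))
per_day = count_in_period(times, start, timedelta(days=1))
print(per_second, per_day)
```

  • Only the period length changes between the two calls, which is why the granularity of monitoring can be chosen freely once records arrive in real time.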
  • The distributed-system-based monitoring method provided by the embodiment of the present invention receives a report record sent by at least one service terminal, selects the report records that meet the configuration rule, and counts the number of those report records in a specified time period according to the occurrence time, which addresses the relatively poor real-time performance with which the prior art obtains the operation of the system. Because the statistical node counts the specified events from report records that the service terminals report in real time, the length of the specified time period can be set arbitrarily. Monitoring can therefore be carried out at second-level granularity, which achieves real-time, rapid monitoring of the operation of the system.
  • Referring to FIG. 3, a flowchart of a distributed-system-based monitoring method provided in another embodiment is shown. The method is described using its application to the service terminal 140 in the implementation environment shown in FIG. 1 as an example.
  • the distributed system based monitoring method can include the following steps.
  • the events mentioned here are usually the business behavior of the user.
  • A statistical interface is generally an interface in the service that is used for statistics and can be called by the service process. When such an interface is called, the service terminal learns of the event that occurs on it.
  • The report record includes the identifier of the statistical interface, the time when the event occurs, and at least one dimension used to describe the event.
  • the dimension may be attribute information related to the event, such as location information, access method, attribution information, and the like.
  • After the service terminal generates the report record according to the collected event, it sends the report record to the statistics node in real time.
  • After receiving the report records, the statistics node selects the report records that meet the configuration rule and counts the number of those report records in the specified time period according to the occurrence time of the events they describe.
  • The distributed-system-based monitoring method collects the event occurring on the statistical interface, generates a report record according to the collected event, and sends the report record to the statistical node in real time, so that the statistics node can select the report records that meet the configuration rule and count the number of those report records in the specified time period according to the occurrence time of the events they describe. This addresses the relatively poor real-time performance with which the prior art obtains the operation of the system. Because the statistical node counts the specified events from report records that the service terminal reports in real time, the length of the specified time period can be set arbitrarily. Monitoring can therefore be carried out at second-level granularity, which achieves real-time, rapid monitoring of the operation of the system.
  • Referring to FIG. 4, a flowchart of a distributed-system-based monitoring method provided in another embodiment is shown.
  • The method is described using its application to the implementation environment shown in FIG. 1 as an example.
  • the distributed system based monitoring method can include the following steps.
  • the service terminal collects an event that occurs on the statistical interface when the statistical interface is invoked.
  • the events mentioned here are usually the business behavior of the user.
  • A statistical interface is generally an interface in the service that is used for statistics and can be called by the service process.
  • When such an interface is called, the service terminal learns of the event that occurs on it.
  • For example, the interfaces of a service may include the login interface of the service, the access interface of the service, and so on.
  • When logging in to the service, the login interface needs to be invoked; when the service is accessed, the access interface needs to be invoked.
  • The service terminal generates a report record according to the collected event, where the report record is used to describe an event that occurs on the statistical interface of the service terminal, and includes an identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event.
  • A dimension can be attribute information related to the event, such as location information, access method, and attribution information.
  • the report record usually contains the following fields: the IP address of the service terminal, the port number of the service terminal, the interface identifier, and the timestamp. The other fields are added as needed according to the specific conditions of each interface.
  • the format of a report record can be as follows:
  • The report record can also use other formats, but the format of the report records sent by the service terminal must be agreed with the statistical node, so that the statistical node can parse the information carried in a report record after receiving it.
  • the event is usually described by multiple dimensions.
  • An event describes the execution of a certain service by a service terminal. For example, for the event "Log in to the specified chat application via a wireless network in Shenzhen at 8:00 am on July 15, 2013", "8:00 am on July 15, 2013" is the time dimension of the event, "Shenzhen" is the address dimension of the event, and "wireless network mode" is the access mode dimension of the event.
  • The login here can be regarded as the business behavior determined by a statistical interface of the chat application, and the report record corresponding to the event can be expressed as <Login ID, 8:00 am on July 15, 2013, Shenzhen, wireless network mode>.
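  • One possible shape for such a report record is sketched below. The class name `ReportRecord`, the field names, and the use of JSON as the agreed wire format are assumptions made for illustration; the source only requires that the terminal and the statistical node agree on a format that carries the interface identifier, the occurrence time, and the dimensions, optionally with the terminal's IP address and port.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime


@dataclass
class ReportRecord:
    interface_id: str        # identifier of the statistical interface
    occurred_at: str         # occurrence time of the event, as ISO 8601 text
    dimensions: dict = field(default_factory=dict)  # e.g. address, access mode
    terminal_ip: str = ""    # optional: IP address of the service terminal
    terminal_port: int = 0   # optional: port number of the service terminal

    def to_wire(self) -> str:
        """Serialize to the (assumed) JSON format agreed with the statistical node."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_wire(cls, payload: str) -> "ReportRecord":
        """Parse a record on the statistical node side."""
        return cls(**json.loads(payload))


# The login event from the example above.
record = ReportRecord(
    interface_id="login",
    occurred_at=datetime(2013, 7, 15, 8, 0).isoformat(),
    dimensions={"address": "Shenzhen", "access_mode": "wireless network"},
)
parsed = ReportRecord.from_wire(record.to_wire())
print(parsed)
```

  • Because both sides share the same serialization, the statistical node recovers exactly the fields the terminal sent.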
  • A statistical node may pre-configure the interfaces that need to be monitored (that is, the statistical interfaces).
  • A service includes several interfaces that are invoked during service execution. When the service terminal determines at run time that a statistical interface is called, it learns that a certain event has occurred on that statistical interface.
  • The service terminal then sends a report record describing the event of invoking the statistical interface to the statistics node, and the statistics node can correspondingly receive these report records.
  • the report record may also store other information, such as the identifier of the service terminal, which may be an IP address or a port number of the service terminal.
  • the service terminal sends the report record to the statistics node in real time.
  • Once the report record is generated, it is sent to the statistics node immediately, so that the statistics node can monitor the event corresponding to the report record in real time.
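  • The terminal-side flow (collect the event when the statistical interface is invoked, generate a report record, send it at once) can be sketched as below. The decorator `statistical_interface`, the in-memory list `sent` that stands in for the network, and the convention of passing dimensions as keyword arguments are hypothetical choices for this example only.

```python
import functools
from datetime import datetime

sent = []  # stand-in for the network link to the statistical node


def send_to_statistics_node(record):
    """Hypothetical transport: a real terminal would send over the network here."""
    sent.append(record)


def statistical_interface(interface_id, send=send_to_statistics_node):
    """Mark a function as a statistical interface that reports every invocation."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **dimensions):
            result = func(*args, **dimensions)
            # Generate the report record and send it immediately, with no log file
            # or periodic upload in between.
            send({
                "interface_id": interface_id,
                "occurred_at": datetime.now(),
                "dimensions": dict(dimensions),
            })
            return result
        return wrapper
    return decorator


@statistical_interface("login")
def login(user, **dimensions):
    return f"{user} logged in"


message = login("alice", address="Shenzhen", access_mode="wireless network")
print(message, len(sent))
```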
  • the statistic node receives a report record sent by at least one service terminal in real time.
  • The statistics node can receive the report records sent by many service terminals. In practice, if only the service running on a certain service terminal needs to be monitored, the statistics node may receive only the report records of that service terminal, filter out the report records sent by other service terminals, or count only the report records sent by that service terminal.
  • The statistical node selects the report records that meet the configuration rule, where a report record that meets the configuration rule is a report record that includes the identifier of the specified statistical interface and the specified dimension.
  • The statistics node can flexibly specify which statistical interfaces and dimensions in the report records are counted. Because events are described by statistical interfaces and dimensions, specifying statistical interfaces and dimensions for the statistics amounts to counting the specified events that correspond to them.
  • For example, the interface included in report record 1 is the login program interface, the time dimension is "November 12", and the address dimension is "United States".
  • The interface included in report record 2 is the access space interface, the time dimension is "November 13", and the address dimension is "China".
  • The interface included in report record 3 is the login program interface, the time dimension is "November 16", and the address dimension is "United States".
  • If the configuration rule specifies that the interface is the login program interface and the address dimension is "United States", the selected report records are report record 1 and report record 3.
  • If the configuration rule specifies that the interface is the access space interface and the address dimension is "China", that is, the specified event "access the space in China", the selected report record is report record 2.
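  • The selection in this example can be reproduced with a short sketch. The dictionary layout and the helper name `select_records` are assumptions for illustration; the three records and the two configuration rules are the ones given in the text.

```python
# Report records 1 to 3 from the example, keyed by record number.
records = {
    1: {"interface": "login program", "time": "November 12", "address": "United States"},
    2: {"interface": "access space", "time": "November 13", "address": "China"},
    3: {"interface": "login program", "time": "November 16", "address": "United States"},
}


def select_records(all_records, interface, **specified_dimensions):
    """Return the numbers of the records matching the specified interface and dimensions."""
    return [
        number
        for number, record in all_records.items()
        if record["interface"] == interface
        and all(record.get(name) == value for name, value in specified_dimensions.items())
    ]


login_in_us = select_records(records, "login program", address="United States")
access_in_china = select_records(records, "access space", address="China")
print(login_in_us, access_in_china)
```

  • A rule may name any subset of dimensions; dimensions that the rule does not mention, such as the time dimension here, are simply not checked.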
  • For example, the statistical interfaces include statistical interface 1, statistical interface 2, and statistical interface 3, and the dimensions include the time dimension and the address dimension;
  • the address dimension specifies an address.
  • The statistical node determines the report records whose occurrence time falls within the specified time period.
  • The specified time period is set by the statistics node according to the monitoring purpose and the stability of the service. For example, when the statistics node needs to monitor the running status of the distributed system in real time, the specified time period can be set to a relatively small value, such as 1 second; when the statistical node needs to count the daily running of the distributed system, the specified time period can be set to 1 day, and so on.
  • Once the specified time period is set, the statistical node can determine the report records whose occurrence time falls within it.
  • For example, the report records received by the statistics node are report record 1, report record 2, report record 3, report record 4, report record 5, and report record 6.
  • The field information contained in these report records can be found in Table 2 below:
  • The report records determined by the statistical node according to the data in Table 2 above are report record 1, report record 2, report record 5, and report record 6.
  • The statistics node counts the determined report records to obtain the number of report records that meet the configuration rule in the specified time period.
  • This number can be taken as the number of specified events in the specified time period, where a specified event is an event determined by the configuration rule.
  • For example, when the specified time period is October 1 to November 15 and the specified event determined by the configuration rule is "access the space interface in Guangzhou", the report records selected by the statistical node are report record 2 and report record 6; that is, from October 1 to November 15, the number of specified events is two.
  • S408: The statistics node continuously displays, at the times indicated by the specified time periods, the number of report records counted in each of the consecutive specified time periods.
  • The statistical node may select consecutive specified time periods, perform the statistics described above for each specified time period, and then, during the time indicated by each specified time period, continuously display the statistical result corresponding to that period.
  • For example, the specified time period can be set to 1 second. In this case, the number of specified events can be counted for each of n consecutive 1-second periods (that is, one statistical result per second).
  • Each per-second result is displayed for 1 second. That is, the number of specified events counted for the first second is displayed during the first second, the number counted for the second second is displayed during the second second, and so on.
  • For example, the statistical results for 4 consecutive seconds counted by the statistics node are N1, N2, N3, and N4. If the time starts from 8:00:00, the statistics node continuously displays N1 from 8:00:00 to 8:00:01, N2 from 8:00:01 to 8:00:02, N3 from 8:00:02 to 8:00:03, and N4 from 8:00:03 to 8:00:04.
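  • The per-period results N1 to N4 can be computed by bucketing occurrence times into consecutive periods, as in the sketch below. The function name `counts_per_period` and the sample timestamps are made up for illustration; the sketch prints each result next to the interval it belongs to rather than modelling a real display.

```python
from collections import Counter
from datetime import datetime, timedelta


def counts_per_period(occurrence_times, start, period, n_periods):
    """Return one count for each of n_periods consecutive periods beginning at start."""
    buckets = Counter()
    for t in occurrence_times:
        index = (t - start) // period  # timedelta // timedelta yields an int
        if 0 <= index < n_periods:
            buckets[index] += 1
    return [buckets[i] for i in range(n_periods)]


start = datetime(2013, 7, 15, 8, 0, 0)
# Hypothetical occurrence times of the selected report records.
times = [start + timedelta(seconds=s) for s in (0.1, 0.5, 1.2, 3.0, 3.9, 3.95, 4.0)]

results = counts_per_period(times, start, timedelta(seconds=1), 4)
for i, count in enumerate(results):
    begin = start + timedelta(seconds=i)
    end = begin + timedelta(seconds=1)
    print(f"{begin:%H:%M:%S} to {end:%H:%M:%S}: N{i + 1} = {count}")
```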
  • The statistical result currently displayed is usually obtained from the report records received before the current display time. Strictly speaking, therefore, the displayed result lags slightly behind the time at which it is displayed.
  • The specified time period can be set to different values, for example 1 second, 5 minutes, 5 hours, 1 day, 1 week, or 1 month. Statistics for specified time periods of different lengths can be performed in parallel: one process in the statistical node handles the statistics whose specified time period is 1 second, another process handles the statistics whose specified time period is 5 minutes, and a third handles the statistics whose specified time period is 5 hours. The three processes do not interfere with each other.
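  • A rough sketch of running the statistics for several period lengths side by side follows. The text describes separate processes; this example substitutes threads from Python's `concurrent.futures.ThreadPoolExecutor` purely to keep the sketch short and self-contained, and the function name `count_by_period` and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta


def count_by_period(occurrence_times, start, period):
    """Map period index -> number of events that fall in that period."""
    counts = {}
    for t in occurrence_times:
        index = (t - start) // period
        counts[index] = counts.get(index, 0) + 1
    return counts


start = datetime(2013, 7, 15, 8, 0, 0)
times = [start + timedelta(seconds=s) for s in (0, 0.5, 1, 400, 20000)]
periods = {
    "1 second": timedelta(seconds=1),
    "5 minutes": timedelta(minutes=5),
    "5 hours": timedelta(hours=5),
}

# One worker per period length; each worker only reads the shared input.
with ThreadPoolExecutor(max_workers=len(periods)) as pool:
    futures = {name: pool.submit(count_by_period, times, start, period)
               for name, period in periods.items()}
    results = {name: future.result() for name, future in futures.items()}

print(results)
```

  • Each worker keeps its own result dictionary, so the statistics for one period length cannot disturb those for another.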
  • The statistical node detects whether the number of report records meets the warning condition, where the warning condition is that the absolute value of the difference between the numbers of report records in two adjacent time periods is greater than a predetermined threshold, and each time period includes at least one of the consecutive specified time periods.
  • the statistics node can also monitor the statistical results, monitor whether the statistics meet the warning conditions, and provide a warning when the statistics meet the warning conditions.
  • The warning condition can be that the absolute value of the difference between the numbers of report records in two adjacent time periods is greater than a predetermined threshold. That is, if the statistical results of two adjacent time periods differ considerably, this usually indicates an abnormality, and an alarm may be generated.
  • For example, if the numbers of report records counted in two adjacent time periods are 500 and 512 and the predetermined threshold is 200, then |500 - 512| = 12, which is much less than 200, so no abnormality is considered to have occurred and no alarm is required. If the numbers of report records counted in two adjacent time periods are 500 and 853 and the predetermined threshold is 200, then |500 - 853| = 353, which is greater than 200, so an abnormality is considered to have occurred and an alarm is required.
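  • The warning condition reduces to a single comparison, sketched here with the two worked examples from the text. The function name `needs_alarm` is an assumption for illustration.

```python
def needs_alarm(previous_count, current_count, threshold):
    """True when the counts of two adjacent time periods differ by more than threshold."""
    return abs(current_count - previous_count) > threshold


calm = needs_alarm(500, 512, 200)      # |500 - 512| = 12
abnormal = needs_alarm(500, 853, 200)  # |500 - 853| = 353
print(calm, abnormal)
```

  • Taking the absolute value means that a sudden drop in the count triggers the alarm just as a sudden rise does.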
  • the statistical node displays the warning information.
  • the warning information may be displayed on the display end of the statistical node, or the warning information may be sent to the communication device or communication program of the manager.
  • The distributed-system-based monitoring method provided by the embodiment of the present invention receives a report record sent by at least one service terminal, selects the report records that meet the configuration rule, and counts the number of those report records in a specified time period according to the occurrence time, which addresses the relatively poor real-time performance with which the prior art obtains the operation of the system. Because the statistical node counts the specified events from report records that the service terminals report in real time, the length of the specified time period can be set arbitrarily. Monitoring can therefore be carried out at second-level granularity, which achieves real-time, rapid monitoring of the operation of the system.
  • steps S401 to S403 can be separately implemented as a distributed system-based monitoring method on the service terminal side
  • steps S404 to S410 can be separately implemented as a distributed system-based monitoring method on the statistical node side.
  • FIG. 5 is a schematic diagram showing the internal structure relationship of a statistical node provided in an embodiment of the present invention.
  • The distributed cluster includes at least one service terminal 140, and the statistical node 120 may include a record collection unit 52, a statistics unit 54, a storage database 56, an alarm unit 58, and a display unit 510.
  • the record collection unit 52 is configured to collect the report records sent by the service terminal 140.
  • The statistics unit 54 may perform statistics on the report records collected by the record collection unit 52 according to the set configuration rules and statistical conditions, obtain statistical results, and store the statistical results in the storage database 56.
  • The alarm unit 58 can monitor the statistical results in the storage database 56. When a monitored result meets the warning condition, it notifies the display unit 510 to display the warning information.
  • The statistical node can be a server or a server cluster composed of multiple servers. The record collection unit 52, the statistics unit 54, the storage database 56, the alarm unit 58, and the display unit 510 can be concentrated in one server or distributed among multiple servers.
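  • How the five units might cooperate inside one process is sketched below. The class `StatisticalNode`, its method names, and the use of plain lists for the storage database and the display are all hypothetical simplifications; a real statistical node could place each unit on a different server.

```python
class StatisticalNode:
    def __init__(self, rule, threshold):
        self.rule = rule            # configuration rule: record -> bool
        self.threshold = threshold  # predetermined threshold of the warning condition
        self.pending = []           # record collection unit 52
        self.storage = []           # storage database 56: one count per period
        self.display = []           # display unit 510: warning messages

    def collect(self, record):
        """Record collection unit: accept a report record from a service terminal."""
        self.pending.append(record)

    def close_period(self):
        """Statistics unit 54: count matching records for the period that just ended."""
        count = sum(1 for record in self.pending if self.rule(record))
        self.pending.clear()
        self.storage.append(count)
        self._check_alarm()
        return count

    def _check_alarm(self):
        """Alarm unit 58: compare the two most recent stored results."""
        if len(self.storage) < 2:
            return
        previous, current = self.storage[-2], self.storage[-1]
        if abs(current - previous) > self.threshold:
            self.display.append(f"warning: count changed from {previous} to {current}")


node = StatisticalNode(rule=lambda r: r["interface"] == "login", threshold=2)
node.collect({"interface": "login"})
first = node.close_period()
for _ in range(5):
    node.collect({"interface": "login"})
node.collect({"interface": "access"})
second = node.close_period()
print(first, second, node.display)
```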
  • Referring to FIG. 6, a schematic diagram of a monitoring system provided in an embodiment of the present invention is shown, described using the implementation environment shown in FIG. 1 as an example.
  • the monitoring system can include a statistical node 620 and at least one service terminal 640.
  • The service terminal 640 includes a distributed-system-based monitoring device, which may include a collecting module 642, a generating module 644, and a sending module 646.
  • The collecting module 642 can be configured to collect an event that occurs on the statistical interface when the statistical interface is invoked;
  • The generating module 644 can be configured to generate a report record according to the event collected by the collecting module 642, where the report record is used to describe an event that occurs on the statistical interface of the service terminal, and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • The sending module 646 can be configured to send the report record to the statistical node in real time, so that the statistical node selects the report records that meet the configuration rule from the received report records and counts the number of those report records in the specified time period according to the occurrence time of the events they describe, where a report record that meets the configuration rule is a report record that includes the identifier of the specified statistical interface and the specified dimension.
  • The statistical node 620 can include a distributed-system-based monitoring device, which can include a receiving module 622, a selection module 624, and a statistics module 626.
  • The receiving module 622 is configured to receive the report records sent in real time by the sending module 646 of the at least one service terminal 640, where a report record describes an event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • The selection module 624 is configured to select the report records that meet the configuration rule, where a report record that meets the configuration rule is one that contains the identifier of the specified statistical interface and the specified dimension;
  • The statistics module 626 can be configured to count, according to the occurrence time of the events described by the report records selected by the selection module 624, the number of report records within the specified time period.
  • In summary, the monitoring system provided in this embodiment of the present invention receives the report records sent by at least one service terminal, selects the report records that meet the configuration rule, and counts, according to occurrence time, the number of report records within the specified time period. This solves the prior-art problem that the operating status of the system is obtained with poor real-time performance.
  • Because the statistical node counts specified events from report records that the service terminals report in real time, it can set the length of the specified time period arbitrarily. Monitoring can therefore be performed at second-level granularity, which achieves the effect that the operation of the system can be monitored quickly and in real time.
  • Referring to FIG. 7, a schematic diagram of a monitoring system provided in another embodiment of the present invention is shown; the monitoring system is described using the implementation environment shown in FIG. 1 as an example.
  • The monitoring system can include a statistical node 720 and at least one service terminal 740.
  • The service terminal 740 includes a distributed-system-based monitoring device, which may include a collecting module 742, a generating module 744, and a sending module 746.
  • The collecting module 742 can be configured to collect, when a statistical interface is invoked, the event that occurs on the statistical interface;
  • The generating module 744 is configured to generate a report record according to the event collected by the collecting module 742, where the report record describes an event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • The sending module 746 can be configured to send the report record to the statistical node 720 in real time, so that the statistical node selects, from the received report records, those that meet the configuration rule and counts, according to the occurrence time of the events described by the selected report records, the number of report records within a specified time period, where a report record that meets the configuration rule is one that contains the identifier of the specified statistical interface and the specified dimension.
  • The statistical node 720 can include a distributed-system-based monitoring device, which can include a receiving module 721, a selection module 722, and a statistics module 723.
  • The receiving module 721 can be configured to receive the report records sent in real time by the sending module 746 of the at least one service terminal 740, where a report record describes an event that occurs on the statistical interface of the service terminal 740 and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • The selection module 722 is configured to select the report records that meet the configuration rule, where a report record that meets the configuration rule is one that contains the identifier of the specified statistical interface and the specified dimension;
  • The statistics module 723 can be configured to count, according to the occurrence time of the events described by the report records selected by the selection module 722, the number of report records within the specified time period.
  • The statistics module 723 may include a determining submodule 723a and a statistic submodule 723b.
  • The determining submodule 723a is configured to determine the report records whose included occurrence time falls within the specified time period;
  • The statistic submodule 723b is configured to count the report records determined by the determining submodule 723a and to take that number as the number of report records that meet the configuration rule within the specified time period.
  • The distributed-system-based monitoring device located in the statistical node 720 may further include a first display module 724.
  • The first display module 724 is configured to display, in sequence and continuously, according to the time indicated by each specified time period, the number of report records counted in each of the consecutive specified time periods.
  • The distributed-system-based monitoring device located in the statistical node 720 may further include a detecting module 725 and a second display module 726.
  • The detecting module 725 is configured to detect whether the counted number of report records meets the warning condition, where the warning condition is that the absolute value of the difference between the numbers of report records in two adjacent time segments is greater than a predetermined threshold, and each time segment includes at least one consecutive specified time period;
  • The second display module 726 is configured to display warning information when the detection result of the detecting module 725 is that the counted number of report records meets the warning condition.
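The warning condition checked by the detecting module can be illustrated with a short sketch. It assumes that per-period counts are already available as a list, and that a trailing group of periods too short to form a full time segment is ignored; the function names `segment_totals` and `find_alarms` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def segment_totals(counts: Sequence[int], periods_per_segment: int) -> List[int]:
    """Sum per-period counts into time segments of consecutive periods.

    Assumption: an incomplete trailing segment is dropped.
    """
    if periods_per_segment < 1:
        raise ValueError("a time segment must contain at least one specified time period")
    usable = len(counts) - len(counts) % periods_per_segment
    return [
        sum(counts[i:i + periods_per_segment])
        for i in range(0, usable, periods_per_segment)
    ]


def find_alarms(counts: Sequence[int], periods_per_segment: int, threshold: int) -> List[int]:
    """Return the indices of segments whose total differs from the previous
    segment's total by more than `threshold` in absolute value."""
    totals = segment_totals(counts, periods_per_segment)
    return [
        i for i in range(1, len(totals))
        if abs(totals[i] - totals[i - 1]) > threshold
    ]
```

With one period per segment and a threshold of 50, the counts `[100, 102, 98, 20, 101]` would raise a warning both for the sudden drop and for the recovery, because the condition uses the absolute difference.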
  • In summary, the monitoring system provided in this embodiment of the present invention receives the report records sent by at least one service terminal, selects the report records that meet the configuration rule, and counts, according to occurrence time, the number of report records within the specified time period. This solves the prior-art problem that the operating status of the system is obtained with poor real-time performance.
  • Because the statistical node counts specified events from report records that the service terminals report in real time, it can set the length of the specified time period arbitrarily. Monitoring can therefore be performed at second-level granularity, which achieves the effect that the operation of the system can be monitored quickly and in real time.
  • It should be noted that when the distributed-system-based monitoring system provided in the foregoing embodiment monitors the services of the distributed system, the division into the foregoing functional modules is used only as an example.
  • In actual applications, the above functions may be assigned to different functional modules as required, that is, the internal structures of the statistical node and the service terminal are divided into different functional modules to complete all or part of the functions described above.
  • In addition, the distributed-system-based monitoring device provided by the foregoing embodiment belongs to the same concept as the distributed-system-based monitoring method embodiments; the specific implementation process is described in detail in the method embodiments, and details are not described herein again.
  • The service terminal 800 is configured to implement the distributed-system-based monitoring method provided by the foregoing embodiments.
  • The service terminal 800 in the present invention may include one or more of the following components: a processor for executing computer program instructions to perform various processes and methods, random access memory (RAM) and read-only memory (ROM) for storing information and program instructions, memory for storing data and information, I/O devices, interfaces, antennas, and the like.
  • The service terminal 800 may include an RF (Radio Frequency) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a WiFi (Wireless Fidelity) module 870, a processor 880, a power supply 882, a camera 890, and other components.
  • The service terminal structure shown in FIG. 8 does not constitute a limitation on the service terminal, which may include more or fewer components than those illustrated, combine some components, or use a different arrangement of components.
  • The components of the service terminal 800 are described below with reference to FIG. 8:
  • The RF circuit 810 can be used to receive and transmit signals while information is being sent or received or during a call. Specifically, after receiving downlink information from a base station, it passes the information to the processor 880 for processing; in addition, it sends uplink data to the base station.
  • RF circuits include, but are not limited to, an antenna, at least one amplifier, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and the like.
  • The RF circuit 810 can also communicate with networks and other devices via wireless communication.
  • The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like.
  • The memory 820 can be used to store software programs and modules; the processor 880 executes the various functional applications and data processing of the service terminal 800 by running the software programs and modules stored in the memory 820.
  • The memory 820 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created through use of the service terminal 800 (such as audio data or a phone book), and the like.
  • The memory 820 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
  • The input unit 830 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the service terminal 800.
  • The input unit 830 may include a touch panel 831 and other input devices 832.
  • The touch panel 831, also referred to as a touch screen, can collect touch operations performed by the user on or near it (such as operations performed by the user on or near the touch panel 831 with a finger, a stylus, or the like) and drive the corresponding connecting device according to a preset program.
  • The touch panel 831 can include two parts: a touch detection device and a touch controller.
  • The touch detection device detects the touch orientation of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 880, and can receive commands from the processor 880 and execute them.
  • In addition, the touch panel 831 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave.
  • In addition to the touch panel 831, the input unit 830 may also include other input devices 832.
  • The other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons and switch buttons), trackballs, mice, joysticks, and the like.
  • The display unit 840 can be used to display information input by the user or information provided to the user, as well as the various menus of the service terminal 800.
  • The display unit 840 can include a display panel 841.
  • The display panel 841 can be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
  • The touch panel 831 can cover the display panel 841. When the touch panel 831 detects a touch operation on or near it, it transmits the operation to the processor 880 to determine the type of the touch event, and the processor 880 then provides a corresponding visual output on the display panel 841 according to the type of the touch event.
  • Although in FIG. 8 the touch panel 831 and the display panel 841 are two independent components that implement the input and output functions of the service terminal 800, in some embodiments the touch panel 831 and the display panel 841 may be integrated to implement the input and output functions of the service terminal 800.
  • The service terminal 800 may also include at least one type of sensor 850, such as a gyro sensor, a magnetic induction sensor, a light sensor, a motion sensor, and other sensors.
  • The light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor may adjust the brightness of the display panel 841 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 841 and/or the backlight when the service terminal 800 is moved to the ear.
  • The acceleration sensor can detect the magnitude of acceleration in all directions (usually three axes) and, when stationary, the magnitude and direction of gravity.
  • It can be used in applications that identify the posture of the service terminal (such as switching between landscape and portrait orientation, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). The service terminal 800 can also be configured with other sensors such as barometers, hygrometers, thermometers, and infrared sensors, which are not described here.
  • The audio circuit 860, the speaker 861, and the microphone 862 can provide an audio interface between the user and the service terminal 800.
  • On the one hand, the audio circuit 860 can transmit the electrical signal converted from received audio data to the speaker 861, which converts it into a sound signal for output. On the other hand, the microphone 862 converts a collected sound signal into an electrical signal, which the audio circuit 860 receives and converts into audio data; the audio data is then output to the processor 880 for processing and sent, for example, to another terminal via the RF circuit 810, or output to the memory 820 for further processing.
  • WiFi is a short-range wireless transmission technology.
  • Through the WiFi module 870, the service terminal 800 can help users send and receive e-mail, browse web pages, and access streaming media; the module provides users with wireless broadband Internet access.
  • Although FIG. 8 shows the WiFi module 870, it can be understood that the module is not an essential part of the service terminal 800 and may be omitted as needed without changing the essence of the disclosure.
  • The processor 880 is the control center of the service terminal 800. It connects the various parts of the entire service terminal using various interfaces and lines, and performs the various functions of the service terminal 800 and processes data by running or executing the software programs and/or modules stored in the memory 820 and invoking the data stored in the memory 820, thereby monitoring the service terminal as a whole.
  • The processor 880 may include one or more processing units; preferably, the processor 880 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, applications, and the like, and the modem processor mainly handles wireless communications.
  • It will be appreciated that the modem processor may also not be integrated into the processor 880.
  • The service terminal 800 further includes a power source 882 (such as a battery) for supplying power to the various components.
  • Preferably, the power source can be logically connected to the processor 880 through a power management system, so that functions such as charging, discharging, and power consumption management are handled through the power management system.
  • The camera 890 is generally composed of a lens, an image sensor, an interface, a digital signal processor, a CPU, a display screen, and the like.
  • The lens is fixed above the image sensor, and the focus can be changed by manually adjusting the lens;
  • The image sensor is equivalent to the "film" of a conventional camera and is the core component with which the camera captures images;
  • The interface is used to connect the camera to the main board of the service terminal by means of a flat cable, a board-to-board connector, or a spring-type connector, and to send the acquired image to the memory 820;
  • The digital signal processor processes the acquired image by mathematical operations, converts the collected analog image into a digital image, and sends it to the memory 820 through the interface.
  • The service terminal 800 may further include a Bluetooth module or the like, and details are not described herein again.
  • In addition to the one or more processors 880, the service terminal 800 includes a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors.
  • The one or more programs have the following functions:
  • when a statistical interface is invoked, the event that occurs on the statistical interface is collected;
  • a report record is generated according to the collected event, where the report record describes an event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • the report record is sent to the statistical node in real time, so that the statistical node selects, from the received report records, those that meet the configuration rule and counts, according to the occurrence time of the events described by the selected report records, the number of report records within the specified time period, where a report record that meets the configuration rule is one that contains the identifier of the specified statistical interface and the specified dimension.
  • The statistical node 900 is configured to implement the distributed-system-based monitoring method provided by the foregoing embodiments.
  • The statistical node 900 includes a central processing unit (CPU) 901, a system memory 904 including a random access memory (RAM) 902 and a read-only memory (ROM) 903, and a system bus 905 that connects the system memory 904 and the central processing unit 901.
  • The statistical node 900 also includes a basic input/output system (I/O system) 906 that facilitates the transfer of information between devices within the computer, and a mass storage device 907 for storing the operating system 913, applications 914, and other program modules 915.
  • The basic input/output system 906 includes a display 908 for displaying information and an input device 909, such as a mouse or keyboard, for user input of information. The display 908 and the input device 909 are both connected to the central processing unit 901 via an input/output controller 910 that is coupled to the system bus 905.
  • The basic input/output system 906 can also include the input/output controller 910 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 910 also provides output to a display screen, printer, or other type of output device.
  • The mass storage device 907 is connected to the central processing unit 901 by a mass storage controller (not shown) connected to the system bus 905.
  • The mass storage device 907 and its associated computer-readable medium provide non-volatile storage for the statistical node 900. That is, the mass storage device 907 can include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • Computer-readable media may comprise computer storage media and communication media.
  • Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include RAM, ROM, EPROM (erasable programmable read-only memory), EEPROM (electrically erasable programmable read-only memory), flash memory or other solid state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, magnetic disk storage, or other magnetic storage devices.
  • The statistical node 900 can also be run by a remote computer connected through a network such as the Internet. That is, the statistical node 900 can be connected to the network 912 via a network interface unit 911 connected to the system bus 905, or the network interface unit 911 can be used to connect to other types of networks or remote computer systems (not shown).
  • The memory also includes one or more programs, which are stored in the memory and configured to be executed by the one or more central processing units 901.
  • The one or more central processing units 901 have the following functions:
  • receiving a report record sent in real time by at least one service terminal, where the report record describes an event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
  • selecting the report records that meet the configuration rule, where a report record that meets the configuration rule is one that contains the identifier of the specified statistical interface and the specified dimension;
  • counting, according to the occurrence time of the events described by the selected report records, the number of report records within the specified time period.
  • When counting the number of report records within the specified time period according to the occurrence time of the events described by the selected report records, the one or more central processing units 901 perform the following:
  • determining the report records whose included occurrence time falls within the specified time period;
  • counting the determined report records and taking that number as the number of report records that meet the configuration rule within the specified time period.
  • The one or more central processing units 901 further have the following function:
  • displaying, in sequence and continuously, according to the time indicated by each specified time period, the number of report records counted in each of the consecutive specified time periods.
  • The one or more central processing units 901 further have the following functions:
  • detecting whether the counted number of report records meets the warning condition, where the warning condition is that the absolute value of the difference between the numbers of report records in two adjacent time segments is greater than a predetermined threshold, and each time segment includes at least one consecutive specified time period;
  • displaying warning information when the counted number of report records meets the warning condition.
  • A person skilled in the art may understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware, and the program may be stored in a computer-readable storage medium.
  • The storage medium mentioned may be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Debugging And Monitoring (AREA)

Abstract

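The present invention discloses a distributed-system-based monitoring method, device, and system, and belongs to the field of computer technology. The method includes: receiving a report record sent by at least one service terminal, where the report record describes an event that occurs on a statistical interface of the service terminal and includes an identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event; selecting the report records that meet a configuration rule, the configuration rule being that a report record contains the identifier of a specified statistical interface and a specified dimension; and counting, according to the occurrence time of the events described by the selected report records, the number of report records within a specified time period. By receiving in real time the report records sent by at least one service terminal and counting specified events according to those records, the invention solves the prior-art problem that the operating status of the system is obtained with poor real-time performance, and achieves the effect that the operation of the system can be monitored quickly and in real time.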

Description

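Distributed-system-based monitoring method, device, and system
This application claims priority to Chinese Patent Application No. 201310689969.X, entitled "Distributed-system-based monitoring method, device, and system" and filed with the Chinese Patent Office on December 13, 2013, which is incorporated herein by reference in its entirety.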
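Technical Field
The present invention relates to the technical field of distributed system monitoring, and in particular to a distributed-system-based monitoring method, device, and system.
Background
In a distributed system, the service volume is relatively large, so many service terminals must be deployed to support the services. When the number of service terminals is large, monitoring the operation of the distributed system becomes complicated.
The traditional monitoring procedure for a distributed system is as follows: the process in each service terminal periodically writes its own operating status (that is, the events that occur on the statistical interfaces of the service terminal, each event containing multiple dimensions related to the statistical interface) to a log on the service terminal; a script on the service terminal uploads the data in the log to a unified statistical node; and the statistical node can compute statistics for each statistical interface and each dimension from the log data of each service terminal, and thereby obtain the operating status of the system. To make this procedure easier to understand, a simple example follows. A process in a service terminal can write the login events that occur on a service login interface to the log. A login event includes dimensions such as login location, login time, login object, and login user. For example, for the login event "logged in to the application space in Shanghai at 14:30 on September 23", "14:30 on September 23" is the time dimension of the login event and "Shanghai" is its location dimension. These logs are eventually uploaded to the statistical node, which performs the relevant statistics on the login events, for example counting the login events that took place at a given location.
However, because of the limits of script performance and service terminal performance, the period at which each service terminal writes its operating status to the log is usually set relatively long, so the operating status of the system is obtained with poor real-time performance.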
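Summary
To solve the prior-art problem that the operating status of the system is obtained with poor real-time performance, embodiments of the present invention provide a distributed-system-based monitoring method, device, and system. The technical solutions are as follows.
In a first aspect, a distributed-system-based monitoring method is provided, the method including:
receiving a report record sent in real time by at least one service terminal, where the report record describes an event that occurs on a statistical interface of the service terminal and includes an identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
selecting the report records that meet a configuration rule, where a report record that meets the configuration rule is one that contains the identifier of a specified statistical interface and a specified dimension;
counting, according to the occurrence time of the event included in each selected report record, the number of the report records within a specified time period.
In a second aspect, a distributed-system-based monitoring method applied in a service terminal is provided, the method including:
collecting, when a statistical interface is invoked, the event that occurs on the statistical interface;
generating a report record according to the collected event, where the report record describes the event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
sending the report record to a statistical node in real time.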
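In a third aspect, a distributed-system-based monitoring device is provided, the device including:
a receiving module, configured to receive a report record sent in real time by at least one service terminal, where the report record describes an event that occurs on a statistical interface of the service terminal and includes an identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
a selection module, configured to select the report records that meet a configuration rule, where a report record that meets the configuration rule is one that contains the identifier of a specified statistical interface and a specified dimension;
a statistics module, configured to count, according to the occurrence time of the event included in each report record selected by the selection module, the number of the report records within a specified time period.
In a fourth aspect, a distributed-system-based monitoring device applied in a service terminal is provided, the device including:
a collecting module, configured to collect, when a statistical interface is invoked, the event that occurs on the statistical interface;
a generating module, configured to generate a report record according to the event collected by the collecting module, where the report record describes the event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event;
a sending module, configured to send the report record to a statistical node in real time.
In a fifth aspect, a monitoring system is provided, the monitoring system including a statistical node and at least one service terminal;
the statistical node includes the distributed-system-based monitoring device according to the third aspect;
the service terminal includes the distributed-system-based monitoring device according to the fourth aspect.
The technical solutions provided by the embodiments of the present invention bring the following beneficial effects:
By receiving the report records sent by at least one service terminal, selecting the report records that meet the configuration rule, and counting, according to occurrence time, the number of report records that meet the configuration rule within the specified time period, the prior-art problem that the operating status of the system is obtained with poor real-time performance is solved. Because the statistical node counts specified events from report records that the service terminals report in real time, it can set the length of the specified time period arbitrarily, so monitoring can be performed at second-level granularity, which achieves the effect that the operation of the system can be monitored quickly and in real time.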
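Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the implementation environment involved in the distributed-system-based monitoring method provided in an embodiment of the present invention;
FIG. 2 is a flowchart of the distributed-system-based monitoring method provided in an embodiment of the present invention;
FIG. 3 is a flowchart of the distributed-system-based monitoring method provided in another embodiment of the present invention;
FIG. 4 is a flowchart of the distributed-system-based monitoring method provided in yet another embodiment of the present invention;
FIG. 5 is a schematic diagram of the internal structural relationships of the statistical node provided in an embodiment of the present invention;
FIG. 6 is a schematic diagram of the monitoring system provided in an embodiment of the present invention;
FIG. 7 is a schematic diagram of the monitoring system provided in another embodiment of the present invention;
FIG. 8 is a schematic structural diagram of the service terminal provided in an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of the statistical node provided in an embodiment of the present invention.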
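Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings. "At least one" as used herein covers one, two, or more than two.
Referring to FIG. 1, a schematic diagram of the implementation environment involved in the distributed-system-based monitoring method provided in an embodiment of the present invention is shown. The implementation environment may include a statistical node 120 and at least one service terminal 140, and the statistical node 120 may be connected to the service terminal 140 in a wired or wireless manner.
The statistical node 120 can obtain, in real time, the service operation data reported by the service terminals and compute statistics from that data. The statistical node 120 may be one server, a server cluster composed of several servers, or a cloud computing service center.
Services run on the service terminal 140, which can report the operating status of the services to the statistical node 120. The "service terminal" referred to here may include, but is not limited to, a smartphone, a tablet computer, a PDA (Personal Digital Assistant), an e-reader, a multimedia television, an MP4 (Moving Picture Experts Group Audio Layer IV) player, and the like.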
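Referring to FIG. 2, a flowchart of the distributed-system-based monitoring method provided in an embodiment of the present invention is shown. The method is described using its application in the statistical node 120 in the implementation environment shown in FIG. 1 as an example. The method may include the following steps.
S201: Receive a report record sent in real time by at least one service terminal, where the report record describes an event that occurs on a statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event.
The events referred to here are usually the user's service behaviors. A statistical interface is generally an interface in a service that is designated for statistics and can be invoked by a service process; when such an interface is invoked, the service terminal can learn what event occurred during the invocation. A dimension may be attribute information related to the event, such as location information, access mode, or attribution information.
S202: Select the report records that meet the configuration rule, where a report record that meets the configuration rule is one that contains the identifier of the specified statistical interface and the specified dimension.
In practice, the statistical node can flexibly specify which statistical interfaces and dimensions in the report records describe the events to be counted (hereinafter referred to as specified events). Because every event can be described by a statistical interface and dimensions, when the server specifies a statistical interface and dimensions for the statistics, it thereby specifies the event that they describe.
S203: Count, according to the occurrence time of the events described by the selected report records, the number of report records within the specified time period.
The statistical node counts, according to the occurrence time of the events described by the selected report records, the number of report records within the specified time period.
The specified time period is set by the statistical node according to the monitoring purpose and the stability of the service. For example, when the statistical node needs to monitor the operation of the distributed system in real time, the specified time period can be set to a small value, such as 1 second; when the statistical node needs daily statistics on the operation of the distributed system, the specified time period can be set to 1 day, and so on.
In summary, the distributed-system-based monitoring method provided in this embodiment of the present invention receives the report records sent by at least one service terminal, selects the report records that meet the configuration rule, and counts, according to occurrence time, the number of report records within the specified time period, which solves the prior-art problem that the operating status of the system is obtained with poor real-time performance. Because the statistical node counts specified events from report records that the service terminals report in real time, it can set the length of the specified time period arbitrarily, so monitoring can be performed at second-level granularity, which achieves the effect that the operation of the system can be monitored quickly and in real time.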
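Referring to FIG. 3, a flowchart of the distributed-system-based monitoring method provided in another embodiment is shown. The method is described using its application in the service terminal 140 in the implementation environment shown in FIG. 1 as an example. The method may include the following steps.
S301: When a statistical interface is invoked, collect the event that occurs on the statistical interface.
The events referred to here are usually the user's service behaviors. A statistical interface is generally an interface in a service that is designated for statistics and can be invoked by a service process; when such an interface is invoked, the service terminal can learn what event occurred during the invocation.
S302: Generate a report record according to the collected event, where the report record describes the event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event.
A dimension may be attribute information related to the event, such as location information, access mode, or attribution information.
S303: Send the report record to the statistical node in real time.
After generating the report record according to the collected event, the service terminal sends the report record to the statistical node in real time.
After receiving the report records, the statistical node selects, from the received report records, those that meet the configuration rule and counts, according to the occurrence time of the events described by the selected report records, the number of the report records within the specified time period.
In summary, the distributed-system-based monitoring method provided in this embodiment of the present invention collects the event that occurs on the statistical interface, generates a report record according to the collected event, and sends the report record to the statistical node in real time, so that the statistical node can select, from the received report records, those that meet the configuration rule and count, according to the occurrence time of the events described by the selected report records, the number of the report records within the specified time period. This solves the prior-art problem that the operating status of the system is obtained with poor real-time performance. Because the statistical node counts specified events from report records that the service terminals report in real time, it can set the length of the specified time period arbitrarily, so monitoring can be performed at second-level granularity, which achieves the effect that the operation of the system can be monitored quickly and in real time.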
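Referring to FIG. 4, a flowchart of the distributed-system-based monitoring method provided in yet another embodiment is shown. The method is described using its application in the implementation environment shown in FIG. 1 as an example. The method may include the following steps.
S401: When a statistical interface is invoked, the service terminal collects the event that occurs on the statistical interface.
The events referred to here are usually the user's service behaviors. A statistical interface is generally an interface in a service that is designated for statistics and can be invoked by a service process; when such an interface is invoked, the service terminal can learn what event occurred during the invocation. For example, the interfaces of a service may include a login interface and an access interface; when the service performs a login, the login interface must be invoked, and when the service performs an access, the access interface must be invoked.
S402: The service terminal generates a report record according to the collected event, where the report record describes the event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event.
The report record referred to here describes the event that occurs on the statistical interface of the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension used to describe the event. A dimension may be attribute information related to the event, such as location information, access mode, or attribution information.
A report record usually contains the following fields: the IP address of the service terminal, the port number of the service terminal, the interface identifier, and a timestamp. Other fields are added as needed for each interface. The format is usually key=value, meaning that a field (key) carries some data item (value), and different dimensions are separated by &. For example, a report record may have the following format:
buID=XX&sysID=XX&intfID=XX&set=XX&module=XX&city=XX&dstIpPort=XX&retType=XX&errCode=XX&timeStamp=XX&&TableName=XX&holdErrCode=XX&latency=XX
Evidently, in practice a report record may take other formats, but the format of the report records sent by the service terminal must be agreed with the statistical node in advance; only then can the statistical node parse the information carried in a report record after receiving it.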
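As an illustration of the key=value format described above, the following is a minimal sketch of how a report record might be serialized on the service terminal and parsed on the statistical node. It assumes that keys and values contain neither `&` nor `=`, and it tolerates empty segments such as the `&&` that appears in the sample format; `encode_record` and `decode_record` are hypothetical helper names, not part of the patent.

```python
from typing import Dict


def encode_record(fields: Dict[str, object]) -> str:
    """Serialize fields as key=value pairs joined by '&'.

    Assumption: keys and values contain neither '&' nor '='.
    """
    return "&".join(f"{key}={value}" for key, value in fields.items())


def decode_record(line: str) -> Dict[str, str]:
    """Parse a key=value&key=value line back into a dictionary."""
    fields: Dict[str, str] = {}
    for pair in line.split("&"):
        if not pair:
            continue  # skip empty segments such as the one produced by "&&"
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields
```

For instance, `decode_record("intfID=login&city=Shenzhen&&latency=12")` yields a dictionary with the keys `intfID`, `city`, and `latency`; all values are kept as strings, so converting the timestamp or latency to numbers is left to the caller.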
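An event is usually described by multiple dimensions and describes how a service terminal executed a particular service. For example, for the event "logged in to a specified chat application in Shenzhen over a wireless network at 8:00 a.m. on July 15, 2013", "8:00 a.m. on July 15, 2013" is the time dimension of the event, "Shenzhen" is its address dimension, and "wireless network" is its access mode dimension. The login here can be regarded as the service behavior determined by one statistical interface of the chat application, and the report record corresponding to the event can be expressed as <login identifier, 8:00 a.m. on July 15, 2013, Shenzhen, wireless network>.
In practice, the statistical node can preset the interfaces of a service that need to be monitored (that is, the statistical interfaces). Generally, a service contains several interfaces that are invoked while the service executes. When the service terminal determines at run time that these statistical interfaces have been invoked, that is, the service learns that certain events have occurred on these statistical interfaces, the service terminal sends the statistical node report records describing the events that occurred when the service terminal invoked these statistical interfaces; correspondingly, the statistical node can receive these report records.
In one possible implementation, the report record may also hold other information, such as an identifier of the service terminal, which may specifically be the IP address or port number of the service terminal.
S403: The service terminal sends the report record to the statistical node in real time.
That is, as soon as the service terminal generates a report record, it immediately sends the record to the statistical node, so that the statistical node monitors the event corresponding to the report record in real time.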
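Steps S401 to S403 can be sketched as a decorator that wraps a business function: each invocation collects the event, builds a report record, and hands it to a sender at once instead of buffering it in a log. This is only one possible shape, under stated assumptions: the decorator `statistical_interface`, the `send` callback, and the `outbox` list standing in for the connection to the statistical node are all hypothetical, and the field names `intfID` and `timeStamp` are borrowed from the sample record format.

```python
import functools
import time
from typing import Callable, Dict


def statistical_interface(intf_id: str, send: Callable[[Dict[str, str]], None]):
    """Mark a business function as a statistical interface (hypothetical helper).

    Every call produces one report record that is passed to `send` immediately.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **dimensions):
            result = func(*args, **dimensions)
            # Keyword arguments play the role of the dimensions describing the event.
            record = {name: str(value) for name, value in dimensions.items()}
            # Fixed fields are written last so that a dimension cannot overwrite them.
            record["intfID"] = intf_id
            record["timeStamp"] = str(int(time.time()))
            send(record)  # sent as soon as it is generated, not buffered in a log
            return result
        return wrapper
    return decorator


outbox = []  # stands in for the connection to the statistical node


@statistical_interface("login", outbox.append)
def login(user: str, **dimensions) -> bool:
    return True  # the real business logic would go here
```

Calling `login("alice", city="Shenzhen", access="wifi")` appends one record to `outbox` containing the interface identifier, a timestamp, and the two dimensions. In a real deployment the `send` callback would write to a network connection agreed with the statistical node.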
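S404: The statistics node receives report records sent in real time by at least one service terminal.
Given the nature of distributed systems, when service volume is high or there are many services, the distributed system also contains many service terminals, so the statistics node may receive report records from a large number of service terminals. Obviously, in practice, if only the services running on one particular service terminal need to be monitored, the statistics node may receive report records from that terminal only, discard the report records sent by all other service terminals, or count only the report records sent by that terminal.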
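S405: The statistics node selects the report records that satisfy the configuration rule, namely the report records that contain the identifier of a specified statistical interface and a specified dimension.
In practice, the statistics node can flexibly specify which statistical interfaces and which dimensions in the report records are to be counted. Because an event can be described by a statistical interface and dimensions, specifying a statistical interface and dimensions at the statistics node amounts to counting the specified event that corresponds to them.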
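For example, suppose there are report records 1, 2, and 3. Report record 1 contains the login program interface, the time dimension "November 12", and the address dimension "United States"; report record 2 contains the access space interface, the time dimension "November 13", and the address dimension "China"; report record 3 contains the login program interface, the time dimension "November 16", and the address dimension "United States". The fields contained in these report records are listed in Table 1:
                | Service interface       | Occurrence time | Occurrence address
Report record 1 | Login program interface | November 12     | United States
Report record 2 | Access space interface  | November 13     | China
Report record 3 | Login program interface | November 16     | United States
Table 1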
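For instance, if the configuration rule specifies the login program interface and the address dimension "United States", that is, the specified event "logging in to the program in the United States", the selected report records are report record 1 and report record 3. Likewise, if the configuration rule specifies the access space interface and the address dimension "China", that is, the specified event "accessing the space in China", the selected report record is report record 2.
As another example, when a service has statistical interfaces 1, 2, and 3 and its dimensions include a time dimension and an address dimension, the rule may specify statistical interface 2, a particular time for the time dimension, and a particular address for the address dimension.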
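The selection in S405 can be sketched in Python using the three records of Table 1. The dictionary keys (`name`, `interface`, `time`, `address`) and the function names are hypothetical, and representing a configuration rule as a dictionary of required field values is an assumption made for illustration.

```python
def matches_rule(record, rule):
    """True if the record carries every field value that the rule specifies."""
    return all(record.get(key) == value for key, value in rule.items())


def select_records(records, rule):
    """Return the report records that satisfy the configuration rule."""
    return [record for record in records if matches_rule(record, rule)]


# The three report records of Table 1.
TABLE_1 = [
    {"name": "report record 1", "interface": "login program",
     "time": "November 12", "address": "United States"},
    {"name": "report record 2", "interface": "access space",
     "time": "November 13", "address": "China"},
    {"name": "report record 3", "interface": "login program",
     "time": "November 16", "address": "United States"},
]
```

With the rule `{"interface": "login program", "address": "United States"}` the sketch selects report records 1 and 3, which matches the first example in the text.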
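S406: The statistics node determines the report records whose occurrence time falls within the specified time period.
The specified time period is set by the statistics node according to the monitoring purpose and the stability of the service. For example, when the statistics node needs to monitor the running state of the distributed system in real time, the period may be set to a small value such as 1 second; when it needs daily statistics on the running state of the distributed system, the period may be set to 1 day.
Because a report record usually contains the occurrence time of the event it describes, the statistics node can determine which report records have an occurrence time within the specified time period.
For example, suppose there are six report records, report records 1 through 6, whose fields are listed in Table 2: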
[Table 2 appears only as images in the original publication: Figure PCTCN2015072372-appb-000001 and Figure PCTCN2015072372-appb-000002.]
Table 2
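If the specified time period is October 1 to November 15, the report records that the statistics node determines from the data in Table 2 are report records 1, 2, 5, and 6.
S407: The statistics node counts the determined report records and takes the count as the number of report records that satisfy the configuration rule within the specified time period.
In other words, the count obtained by the statistics node can be taken as the number of occurrences of the specified event within the specified time period, where the specified event is the event determined by the configuration rule.
Still referring to Table 2, suppose the configuration rule is: service interface "access space interface", occurrence address "Guangzhou". The specified event determined by this rule is "accessing the space interface in Guangzhou". When the specified time period is October 1 to November 15, the statistics node selects report records 2 and 6, so the number of specified events counted between October 1 and November 15 is 2.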
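Steps S405 to S407 together can be sketched in Python as follows. Table 2 is available only as images, so the six records below are hypothetical values chosen solely to be consistent with the outcomes stated in the text (records 1, 2, 5, and 6 fall in the period; records 2 and 6 match the rule); the year 2013 and all field names are likewise assumptions.

```python
from datetime import date


def in_period(records, start, end):
    """S406: keep the records whose occurrence time lies within [start, end]."""
    return [r for r in records if start <= r["time"] <= end]


def count_in_period(records, rule, start, end):
    """S405 to S407: count the records that match the rule within the period."""
    selected = [r for r in records
                if all(r.get(k) == v for k, v in rule.items())]
    return len(in_period(selected, start, end))


# Hypothetical stand-in for Table 2 (the real table is not reproduced).
HYPOTHETICAL_RECORDS = [
    {"id": 1, "interface": "login program", "time": date(2013, 10, 5), "address": "Shenzhen"},
    {"id": 2, "interface": "access space", "time": date(2013, 10, 20), "address": "Guangzhou"},
    {"id": 3, "interface": "access space", "time": date(2013, 9, 12), "address": "Guangzhou"},
    {"id": 4, "interface": "login program", "time": date(2013, 11, 20), "address": "Guangzhou"},
    {"id": 5, "interface": "login program", "time": date(2013, 11, 2), "address": "Beijing"},
    {"id": 6, "interface": "access space", "time": date(2013, 11, 10), "address": "Guangzhou"},
]
```

In this made-up data, record 3 matches the rule but lies outside the period, which shows why both the rule check and the time check are needed before counting.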
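S408: The statistics node displays, in sequence and each for the duration indicated by the specified time period, the number of report records counted in each of the consecutive specified time periods.
In practice, so that the number of specified events counted in different periods can be observed in real time, the statistics node may take consecutive specified time periods, perform the above counting for each one, and then display the result for each period for the duration that the period indicates.
For example, with the specified time period set to 1 second, the number of specified events is counted for each of n consecutive seconds (one result per second), and each per-second result is displayed for 1 second. That is, the count for the first second is displayed during the first second of display, the count for the second second is displayed during the next second, and so on. For instance, if the results for four consecutive seconds are N1, N2, N3, and N4 and timing starts at 8:00:00, the statistics node displays N1 from 8:00:00 to 8:00:01, N2 from 8:00:01 to 8:00:02, N3 from 8:00:02 to 8:00:03, and N4 from 8:00:03 to 8:00:04.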
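Counting over consecutive specified time periods can be sketched in Python by assigning each occurrence time to the start of the period that contains it. The function name `count_per_period` and the representation of occurrence times as seconds since some reference point are assumptions for illustration.

```python
from collections import Counter


def count_per_period(timestamps, period_seconds):
    """Map each period start to the number of events in [start, start + period)."""
    counts = Counter()
    for ts in timestamps:
        # Round the occurrence time down to the start of its period.
        start = int(ts // period_seconds) * period_seconds
        counts[start] += 1
    return dict(counts)
```

Periods in which no event occurred do not appear in the result, so a display loop built on this sketch would treat a missing key as a count of zero.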
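Note that, in practice, the result currently displayed is computed from report records received before the moment of display; strictly speaking, therefore, the display lags behind the result it shows.
In practice, the specified time period may be set to different values, such as 1 second, 5 minutes, 5 hours, 1 day, 1 week, or 1 month. Counting can be performed for each period length, and the counting for different lengths can run in parallel: for example, one process in a statistics node counts with a 1-second period, another with a 5-minute period, and a third with a 5-hour period, and the three processes do not interfere with one another.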
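S409: The statistics node checks whether the counted number of report records satisfies an alert condition. The alert condition is that the absolute value of the difference between the numbers of report records in two adjacent time periods is greater than a predetermined threshold, where each time period consists of at least one consecutive specified time period.
In practice, the statistics node may also monitor the counted results and issue an alert when a result satisfies the alert condition.
In one possible case, the alert condition is that the absolute value of the difference between the numbers of report records in two adjacent time periods is greater than a predetermined threshold; a large difference between the results for two adjacent time periods usually indicates an anomaly, and an alarm can then be raised. For example, if the numbers of report records counted in two adjacent time periods are 500 and 512 and the predetermined threshold is 200, then |500-512| = 12, which is far below 200, so no anomaly is assumed and no alarm is needed. If the numbers are 500 and 853 with the same threshold of 200, then |500-853| = 353, which is greater than 200, so an anomaly is assumed and an alarm is needed.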
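The alert condition of S409 is simple enough to state directly in Python. The function names are hypothetical; the numbers in the usage note are the ones given in the text.

```python
def should_alert(previous_count, current_count, threshold):
    """True if two adjacent periods differ by more than the threshold."""
    return abs(previous_count - current_count) > threshold


def find_alerts(counts, threshold):
    """Return the indexes of periods whose count jumps relative to the previous one."""
    return [i for i in range(1, len(counts))
            if should_alert(counts[i - 1], counts[i], threshold)]
```

With a threshold of 200, `should_alert(500, 512, 200)` is false and `should_alert(500, 853, 200)` is true, in line with the two worked examples above.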
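S410: If the check shows that the counted number of report records satisfies the alert condition, the statistics node displays alert information.
In practice, the alert information may be shown on the display of the statistics node, or sent to an administrator's communication device or communication program.
It follows that, because report records for consecutive specified time periods can be counted and events monitored, the running state of the whole system can be observed in real time. In particular, when the specified time period is very short, for example 1 second, the delay of monitoring alarms can be reduced to 1 second, which greatly improves monitoring performance.
In summary, in the distributed-system-based monitoring method provided in this embodiment of the present invention, the statistics node receives report records sent by at least one service terminal, selects those that satisfy the configuration rule, and counts by occurrence time how many fall within a specified time period. This solves the prior-art problem that the running state of a system is obtained with poor timeliness. Because the statistics node can set the length of the specified time period arbitrarily when counting specified events from the report records that service terminals send in real time, monitoring can be performed at second-level granularity, so that the running state of the system can be monitored quickly and in real time.
Note that steps S401 to S403 may be implemented separately as the distributed-system-based monitoring method on the service terminal side, and steps S404 to S410 may be implemented separately as the distributed-system-based monitoring method on the statistics node side.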
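In one possible implementation, referring to FIG. 5, a schematic diagram of the internal structure of a statistics node provided in one embodiment of the present invention is shown. The distributed cluster contains at least one service terminal 140, and the statistics node 120 may include a record collection unit 52, a statistics unit 54, a storage database 56, an alarm unit 58, and a display unit 510.
The record collection unit 52 collects the report records sent by the service terminal 140. The statistics unit 54 may count the report records collected by the record collection unit 52 according to the configured rule and the statistical conditions, obtain the results, and store them in the storage database 56. The alarm unit 58 may monitor the results in the storage database 56 and, when a result satisfies the alert condition, notify the display unit 510 to display alert information.
Note that, because the statistics node may be a single server or a service cluster made up of multiple servers, the record collection unit 52, statistics unit 54, storage database 56, alarm unit 58, and display unit 510 may reside on one server or be distributed across several servers.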
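The cooperation of the five units in FIG. 5 can be sketched as a single in-process Python class. This is a simplification under stated assumptions: the class and attribute names are hypothetical, the storage database is a plain dictionary, the display unit is a list of alert tuples, and occurrence times are integer seconds in a `timeStamp` field.

```python
class StatisticsNode:
    """Single-process sketch of units 52, 54, 56, 58, and 510."""

    def __init__(self, rule, period_seconds, threshold):
        self.rule = rule
        self.period_seconds = period_seconds
        self.threshold = threshold
        self.collected = []   # record collection unit 52
        self.database = {}    # storage database 56: period start -> count
        self.displayed = []   # display unit 510: (period start, before, after)

    def collect(self, record):
        self.collected.append(record)

    def run_statistics(self):
        """Statistics unit 54: select by rule, then count per period."""
        self.database = {}
        for record in self.collected:
            if all(record.get(k) == v for k, v in self.rule.items()):
                ts = record["timeStamp"]
                start = ts - ts % self.period_seconds
                self.database[start] = self.database.get(start, 0) + 1

    def check_alarm(self):
        """Alarm unit 58: compare each pair of adjacent periods."""
        if not self.database:
            return
        start, last = min(self.database), max(self.database)
        while start < last:
            following = start + self.period_seconds
            before = self.database.get(start, 0)
            after = self.database.get(following, 0)
            if abs(before - after) > self.threshold:
                self.displayed.append((following, before, after))
            start = following
```

A deployment spread over several servers, as the text allows, would replace the in-memory list and dictionary with network transport and a real database; the sketch only shows the order in which the units hand data to one another.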
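It should also be added that, in a distributed system with high service volume, more service terminals need to be deployed, so the statistics node receives more report records and its counting and monitoring capacity must grow accordingly. As the distributed system scales up, counting and monitoring performance can be improved by adding statistics nodes.
The following are apparatus embodiments of the present invention. For details not fully described here, refer to the corresponding method embodiments above.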
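Referring to FIG. 6, a schematic diagram of a monitoring system provided in one embodiment of the present invention is shown. The monitoring system is described by way of example as applied to the implementation environment shown in FIG. 1, and may include a statistics node 620 and at least one service terminal 640.
The service terminal 640 includes a distributed-system-based monitoring apparatus, which may include a collection module 642, a generation module 644, and a sending module 646.
The collection module 642 may be configured to collect, when a statistical interface is called, the event that occurs on the statistical interface;
the generation module 644 may be configured to generate a report record from the event collected by the collection module 642, where the report record describes the event that occurred on the statistical interface at the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
the sending module 646 may be configured to send the report record to the statistics node in real time, so that the statistics node selects, from the report records it receives, those that satisfy the configuration rule and counts, by the occurrence times of the events they describe, the number of report records within a specified time period, where a report record satisfying the configuration rule is one that contains the identifier of a specified statistical interface and a specified dimension.
The statistics node 620 may include a distributed-system-based monitoring apparatus, which may include a receiving module 622, a selection module 624, and a statistics module 626.
The receiving module 622 may be configured to receive the report records sent in real time by the sending module 646 of at least one service terminal 640, where a report record describes the event that occurred on a statistical interface at the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
the selection module 624 may be configured to select the report records that satisfy the configuration rule, namely those that contain the identifier of a specified statistical interface and a specified dimension;
the statistics module 626 may be configured to count, by the occurrence times of the events described by the report records that the selection module 624 selected, the number of report records within a specified time period.
In summary, in the monitoring system provided in this embodiment of the present invention, the statistics node receives report records sent by at least one service terminal, selects those that satisfy the configuration rule, and counts by occurrence time how many fall within a specified time period. This solves the prior-art problem that the running state of a system is obtained with poor timeliness. Because the statistics node can set the length of the specified time period arbitrarily when counting specified events from the report records that service terminals send in real time, monitoring can be performed at second-level granularity, so that the running state of the system can be monitored quickly and in real time.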
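Referring to FIG. 7, a schematic diagram of a monitoring system provided in another embodiment of the present invention is shown. The monitoring system is described by way of example as applied to the implementation environment shown in FIG. 1, and may include a statistics node 720 and at least one service terminal 740.
The service terminal 740 includes a distributed-system-based monitoring apparatus, which may include a collection module 742, a generation module 744, and a sending module 746.
The collection module 742 may be configured to collect, when a statistical interface is called, the event that occurs on the statistical interface;
the generation module 744 may be configured to generate a report record from the event collected by the collection module 742, where the report record describes the event that occurred on the statistical interface at the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
the sending module 746 may be configured to send the report record to the statistics node 720 in real time, so that the statistics node selects, from the report records it receives, those that satisfy the configuration rule and counts, by the occurrence times of the events they describe, the number of report records within a specified time period, where a report record satisfying the configuration rule is one that contains the identifier of a specified statistical interface and a specified dimension.
The statistics node 720 may include a distributed-system-based monitoring apparatus, which may include a receiving module 721, a selection module 722, and a statistics module 723.
The receiving module 721 may be configured to receive the report records sent in real time by the sending module 746 of at least one service terminal 740, where a report record describes the event that occurred on a statistical interface at the service terminal 740 and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
the selection module 722 may be configured to select the report records that satisfy the configuration rule, namely those that contain the identifier of a specified statistical interface and a specified dimension;
the statistics module 723 may be configured to count, by the occurrence times of the events described by the report records that the selection module 722 selected, the number of report records within a specified time period.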
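In a first possible implementation of this embodiment, the statistics module 723 may include a determining submodule 723a and a counting submodule 723b.
The determining submodule 723a is configured to determine the report records whose occurrence time falls within the specified time period;
the counting submodule 723b is configured to count the report records determined by the determining submodule 723a and take the count as the number of report records that satisfy the configuration rule within the specified time period.
In a second possible implementation of this embodiment, the distributed-system-based monitoring apparatus in the statistics node 720 may further include a first display module 724.
The first display module 724 is configured to display, in sequence and each for the duration indicated by the specified time period, the number of report records counted in each of the consecutive specified time periods.
In a third possible implementation of this embodiment, the distributed-system-based monitoring apparatus in the statistics node 720 may further include a detection module 725 and a second display module 726.
The detection module 725 is configured to check whether the counted number of report records satisfies an alert condition, the alert condition being that the absolute value of the difference between the numbers of report records in two adjacent time periods is greater than a predetermined threshold, where each time period consists of at least one consecutive specified time period;
the second display module 726 is configured to display alert information when the detection module 725 finds that the counted number of report records satisfies the alert condition.
In summary, in the monitoring system provided in this embodiment of the present invention, the statistics node receives report records sent by at least one service terminal, selects those that satisfy the configuration rule, and counts by occurrence time how many fall within a specified time period. This solves the prior-art problem that the running state of a system is obtained with poor timeliness. Because the statistics node can set the length of the specified time period arbitrarily when counting specified events from the report records that service terminals send in real time, monitoring can be performed at second-level granularity, so that the running state of the system can be monitored quickly and in real time.
Note that the division into functional modules described above is only an example of how the distributed-system-based monitoring apparatus of the foregoing embodiments monitors the services of a distributed system. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the statistics node and the service terminal may be divided into different functional modules to perform all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments of distributed-system-based monitoring belong to the same concept; for the specific implementation, see the method embodiments, which are not repeated here.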
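Referring to FIG. 8, a structural block diagram of a service terminal provided in one embodiment of the present invention is shown. The service terminal 800 is configured to implement the distributed-system-based monitoring method provided in the foregoing embodiments. The service terminal 800 in the present invention may include one or more of the following components: a processor for executing computer program instructions to carry out various processes and methods, random access memory (RAM) and read-only memory (ROM) for storing information and program instructions, memory for storing data and information, I/O devices, interfaces, antennas, and the like. Specifically:
The service terminal 800 may include an RF (Radio Frequency) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a WiFi (wireless fidelity) module 870, a processor 880, a power supply 882, a camera 890, and other components. A person skilled in the art will understand that the structure shown in FIG. 8 does not limit the service terminal, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The components of the service terminal 800 are described in detail below with reference to FIG. 8:
The RF circuit 810 may be used to receive and send signals while sending or receiving information or during a call. In particular, it receives downlink information from a base station and passes it to the processor 880 for processing, and it sends uplink data to the base station. An RF circuit typically includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, an LNA (Low Noise Amplifier), and a duplexer. The RF circuit 810 may also communicate with networks and other devices by wireless communication, which may use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, and SMS (Short Messaging Service).
The memory 820 may be used to store software programs and modules; the processor 880 performs the various functional applications and data processing of the service terminal 800 by running the software programs and modules stored in the memory 820. The memory 820 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as sound playback or image playback); the data storage area may store data created through use of the service terminal 800 (such as audio data or a phone book). In addition, the memory 820 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.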
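The input unit 830 may be used to receive entered digits or characters and to generate key signal input related to user settings and function control of the service terminal 800. Specifically, the input unit 830 may include a touch panel 831 and other input devices 832. The touch panel 831, also called a touch screen, collects touch operations by the user on or near it (for example, operations performed on or near the touch panel 831 with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connected apparatus according to a preset program. Optionally, the touch panel 831 may include a touch detection apparatus and a touch controller. The touch detection apparatus detects the position of the user's touch and the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 880, and can receive and execute commands from the processor 880. The touch panel 831 may be implemented as a resistive, capacitive, infrared, or surface-acoustic-wave panel, among other types. Besides the touch panel 831, the input unit 830 may include other input devices 832, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume keys or a power key), a trackball, a mouse, and a joystick.
The display unit 840 may be used to display information entered by the user, information provided to the user, and the various menus of the service terminal 800. The display unit 840 may include a display panel 841, which may optionally be configured as an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or the like. Further, the touch panel 831 may cover the display panel 841; when the touch panel 831 detects a touch operation on or near it, it passes the operation to the processor 880 to determine the type of touch event, and the processor 880 then provides the corresponding visual output on the display panel 841 according to that type. Although in FIG. 8 the touch panel 831 and the display panel 841 are shown as two separate components that implement the input and output functions of the service terminal 800, in some embodiments the touch panel 831 and the display panel 841 may be integrated to implement the input and output functions of the service terminal 800.
The service terminal 800 may also include at least one sensor 850, such as a gyroscope sensor, a magnetic induction sensor, a light sensor, a motion sensor, or other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel 841 according to ambient light, and the proximity sensor can switch off the display panel 841 and/or the backlight when the service terminal 800 is moved to the ear. As one kind of motion sensor, an acceleration sensor can detect the magnitude of acceleration in every direction (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications that recognize the terminal's posture (such as switching between portrait and landscape, related games, and magnetometer posture calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors with which the service terminal 800 may be equipped, such as a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described here.
The audio circuit 860, a speaker 861, and a microphone 862 may provide an audio interface between the user and the service terminal 800. The audio circuit 860 may convert received audio data into an electrical signal and transmit it to the speaker 861, which converts it into a sound signal for output. Conversely, the microphone 862 converts a collected sound signal into an electrical signal, which the audio circuit 860 receives and converts into audio data; the audio data is then output to the processor 880 for processing and sent through the RF circuit 810 to, for example, another terminal, or output to the memory 820 for further processing.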
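WiFi is a short-range wireless transmission technology. Through the WiFi module 870, the service terminal 800 can help the user send and receive e-mail, browse web pages, and access streaming media; the module provides the user with wireless broadband Internet access. Although FIG. 8 shows the WiFi module 870, it is understood that the module is not an essential part of the service terminal 800 and may be omitted as needed without changing the essence of the disclosure.
The processor 880 is the control center of the service terminal 800. It connects all parts of the service terminal through various interfaces and lines, and performs the various functions of the service terminal 800 and processes data by running or executing the software programs and/or modules stored in the memory 820 and calling the data stored in the memory 820, thereby monitoring the service terminal as a whole. Optionally, the processor 880 may include one or more processing units; preferably, the processor 880 may integrate an application processor, which mainly handles the operating system, the user interface, and application programs, and a modem processor, which mainly handles wireless communication. It is understood that the modem processor need not be integrated into the processor 880.
The service terminal 800 also includes a power supply 882 (such as a battery) that powers the components. Preferably, the power supply may be logically connected to the processor 880 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
The camera 890 generally consists of a lens, an image sensor, an interface, a digital signal processor, a CPU, a display screen, and so on. The lens is fixed above the image sensor, and the focus can be changed by adjusting the lens manually. The image sensor is the equivalent of the film in a traditional camera and is the heart of image capture. The interface connects the camera to the main board of the service terminal by a flat cable, a board-to-board connector, or a spring-type connection, and sends captured images to the memory 820. The digital signal processor processes captured images by mathematical operations, converts the captured analog image into a digital image, and sends it to the memory 820 through the interface.
Although not shown, the service terminal 800 may also include a Bluetooth module and the like, which are not described here.
In addition to one or more processors 880, the service terminal 800 includes a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors. The one or more programs have the following functions:
collecting, when a statistical interface is called, the event that occurs on the statistical interface;
generating a report record from the collected event, where the report record describes the event that occurred on the statistical interface at the service terminal and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
sending the report record to the statistics node in real time, so that the statistics node selects, from the report records it receives, those that satisfy the configuration rule, namely those that contain the identifier of a specified statistical interface and a specified dimension, and counts, by the occurrence times of the events described by the selected report records, the number of report records within a specified time period.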
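Referring to FIG. 9, a structural block diagram of a statistics node provided in one embodiment of the present invention is shown. The statistics node 900 is configured to implement the distributed-system-based monitoring method provided in the foregoing embodiments. The statistics node 900 includes a central processing unit (CPU) 901, a system memory 904 that includes a random access memory (RAM) 902 and a read-only memory (ROM) 903, and a system bus 905 that connects the system memory 904 and the central processing unit 901. The statistics node 900 also includes a basic input/output system (I/O system) 906, which helps transfer information between the devices in the computer, and a mass storage device 907 for storing an operating system 913, application programs 914, and other program modules 915.
The basic input/output system 906 includes a display 908 for displaying information and an input device 909, such as a mouse or keyboard, through which the user enters information. The display 908 and the input device 909 are both connected to the central processing unit 901 through an input/output controller 910 connected to the system bus 905. The basic input/output system 906 may also include the input/output controller 910 for receiving and processing input from a keyboard, a mouse, an electronic stylus, or other devices. Similarly, the input/output controller 910 also provides output to a display screen, a printer, or another type of output device.
The mass storage device 907 is connected to the central processing unit 901 through a mass storage controller (not shown) connected to the system bus 905. The mass storage device 907 and its associated computer-readable media provide non-volatile storage for the statistics node 900. That is, the mass storage device 907 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, a person skilled in the art will know that computer storage media are not limited to these. The system memory 904 and the mass storage device 907 may be collectively referred to as the memory.
According to various embodiments of the present invention, the statistics node 900 may also operate through a remote computer connected to a network such as the Internet. That is, the statistics node 900 may connect to a network 912 through a network interface unit 911 connected to the system bus 905; the network interface unit 911 may also be used to connect to other types of networks or remote computer systems (not shown).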
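The memory further includes one or more programs, which are stored in the memory and configured to be executed by one or more central processing units 901. The one or more central processing units 901 have the following functions:
receiving report records sent by at least one service terminal, where a report record is sent by the service terminal in real time, describes the event that occurred on a statistical interface at the service terminal, and includes the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
selecting the report records that satisfy the configuration rule, namely those that contain the identifier of a specified statistical interface and a specified dimension;
counting, by the occurrence times of the events described by the selected report records, the number of report records within a specified time period.
In a first possible implementation of this embodiment, when counting, by the occurrence times of the events described by the selected report records, the number of report records within the specified time period, the one or more central processing units 901 perform the following:
determining the report records whose occurrence time falls within the specified time period;
counting the determined report records and taking the count as the number of report records that satisfy the configuration rule within the specified time period.
In a second possible implementation of this embodiment, the one or more central processing units 901 further have the following function:
displaying, in sequence and each for the duration indicated by the specified time period, the number of report records counted in each of the consecutive specified time periods.
In a third possible implementation of this embodiment, the one or more central processing units 901 further have the following functions:
checking whether the counted number of report records satisfies an alert condition, the alert condition being that the absolute value of the difference between the numbers of report records in two adjacent time periods is greater than a predetermined threshold, where each time period consists of at least one consecutive specified time period;
displaying alert information if the check shows that the counted number of report records satisfies the alert condition.
The sequence numbers of the foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the foregoing embodiments may be carried out by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.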

Claims (11)

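  1. A distributed-system-based monitoring method, characterized in that the method comprises:
    receiving report records sent in real time by at least one service terminal, wherein a report record describes an event that occurred on a statistical interface at the service terminal and comprises the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
    selecting the report records that satisfy a configuration rule, wherein a report record that satisfies the configuration rule is a report record that contains the identifier of a specified statistical interface and a specified dimension;
    counting, according to the occurrence times of the events included in the selected report records, the number of the report records within a specified time period.
  2. The method according to claim 1, characterized in that the counting, according to the occurrence times of the events included in the selected report records, the number of the report records within a specified time period comprises:
    determining the report records in which the occurrence time of the event falls within the specified time period;
    counting the determined report records, and taking the count as the number of the report records that satisfy the configuration rule within the specified time period.
  3. The method according to claim 1, characterized in that, after the counting, according to the occurrence times of the events included in the report records, the number of the report records within a specified time period, the method further comprises:
    displaying, in sequence and each for the duration indicated by the specified time period, the number of the report records counted in each of the consecutive specified time periods.
  4. The method according to any one of claims 1 to 3, characterized in that, after the counting, according to the occurrence times of the events included in the report records, the number of the report records within a specified time period, the method further comprises:
    checking whether the counted number of the report records satisfies an alert condition, wherein the alert condition is that the absolute value of the difference between the numbers of the report records in two adjacent time periods is greater than a predetermined threshold, and each time period comprises at least one consecutive specified time period;
    displaying alert information if the check shows that the counted number of the report records satisfies the alert condition.
  5. A distributed-system-based monitoring method, applied in a service terminal, characterized in that the method comprises:
    collecting, when a statistical interface is called, the event that occurs on the statistical interface;
    generating a report record from the collected event, wherein the report record describes the event that occurred on the statistical interface at the service terminal and comprises the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
    sending the report record to a statistics node in real time.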
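  6. A distributed-system-based monitoring apparatus, characterized in that the apparatus comprises:
    a receiving module, configured to receive report records sent in real time by at least one service terminal, wherein a report record describes an event that occurred on a statistical interface at the service terminal and comprises the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
    a selection module, configured to select the report records that satisfy a configuration rule, wherein a report record that satisfies the configuration rule is a report record that contains the identifier of a specified statistical interface and a specified dimension;
    a statistics module, configured to count, according to the occurrence times of the events included in the report records selected by the selection module, the number of the report records within a specified time period.
  7. The apparatus according to claim 6, characterized in that the statistics module comprises:
    a determining submodule, configured to determine the report records in which the occurrence time of the event falls within the specified time period;
    a counting submodule, configured to count the report records determined by the determining submodule and take the count as the number of the report records that satisfy the configuration rule within the specified time period.
  8. The apparatus according to claim 6, characterized in that the apparatus further comprises:
    a first display module, configured to display, in sequence and each for the duration indicated by the specified time period, the number of the report records counted in each of the consecutive specified time periods.
  9. The apparatus according to any one of claims 6 to 8, characterized in that the apparatus further comprises:
    a detection module, configured to check whether the counted number of the report records satisfies an alert condition, wherein the alert condition is that the absolute value of the difference between the numbers of the report records in two adjacent time periods is greater than a predetermined threshold, and each time period comprises at least one consecutive specified time period;
    a second display module, configured to display alert information when the detection module finds that the counted number of the report records satisfies the alert condition.
  10. A distributed-system-based monitoring apparatus, applied in a service terminal, characterized in that the apparatus comprises:
    a collection module, configured to collect, when a statistical interface is called, the event that occurs on the statistical interface;
    a generation module, configured to generate a report record from the event collected by the collection module, wherein the report record describes the event that occurred on the statistical interface at the service terminal and comprises the identifier of the statistical interface, the occurrence time of the event, and at least one dimension describing the event;
    a sending module, configured to send the report record to a statistics node in real time.
  11. A monitoring system, characterized in that the monitoring system comprises a statistics node and at least one service terminal;
    the statistics node comprises the distributed-system-based monitoring apparatus according to any one of claims 6 to 9;
    the service terminal comprises the distributed-system-based monitoring apparatus according to claim 10.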
PCT/CN2015/072372 2013-12-13 2015-02-06 Distributed-system-based monitoring method, apparatus, and system WO2015085963A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310689969.XA CN104092556A (zh) 2013-12-13 2013-12-13 Distributed-system-based monitoring method, apparatus, and system
CN201310689969.X 2013-12-13

Publications (1)

Publication Number Publication Date
WO2015085963A1 true WO2015085963A1 (zh) 2015-06-18

Family

ID=51640238

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/072372 WO2015085963A1 (zh) 2013-12-13 2015-02-06 Distributed-system-based monitoring method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN104092556A (zh)
WO (1) WO2015085963A1 (zh)

Also Published As

Publication number Publication date
CN104092556A (zh) 2014-10-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15727860

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.10.16)

122 Ep: pct application non-entry in european phase

Ref document number: 15727860

Country of ref document: EP

Kind code of ref document: A1