WO2017092600A1 - Pointer counting method and device - Google Patents

Pointer counting method and device

Info

Publication number
WO2017092600A1
Authority
WO
WIPO (PCT)
Prior art keywords
indicator
level
name
structured
lowest
Prior art date
Application number
PCT/CN2016/107017
Other languages
French (fr)
Chinese (zh)
Inventor
王逸
武翀
刘键
方孝健
封仲淹
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2017092600A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • G06Q30/0202Market predictions or forecasting for commercial activities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network

Definitions

  • the present application relates to the field of real-time computing technologies, and in particular, to an indicator statistical method and an indicator statistical device.
  • Storm is a distributed, open-source real-time computing system under the Apache community, written in Clojure. Clojure is a Lisp dialect that runs on the Java platform; Lisp is a programming language known for its expressiveness and power, and Java is an object-oriented programming language for writing cross-platform applications.
  • Storm can be used for "stream processing", handling messages in real time; for "continuous computation", continuously processing data streams and outputting the results to users as streams; and for "Remote Procedure Call Protocol" scenarios, performing operations in parallel.
  • JStorm is a real-time computing system based on Storm, which is compatible with Storm.
  • embodiments of the present application have been made in order to provide an indicator statistical method and a corresponding indicator statistical apparatus that overcome the above problems or at least partially solve the above problems.
  • an indicator statistical method including:
  • a structured indicator name is created for each level node corresponding to the indicator; wherein, the hierarchical relationship between the nodes of each level is determined by the name of the structured indicator;
  • the application also discloses an indicator statistical device, comprising:
  • a structured identifier creation module configured to create a structured indicator name for each level node corresponding to the indicator for an indicator of a topology operation; wherein, the hierarchical relationship between the nodes of each level is determined by the structured indicator name;
  • the bottom layer indicator monitoring module is configured to, after monitoring the lowest-level data corresponding to the indicator, record statistics under the corresponding lowest-level structured indicator name;
  • the layer-by-layer summary module is configured to summarize the statistical data under the lowest-level structured indicator name, level by level according to the hierarchical relationship between the structured indicator names, under the structured indicator name of the level above.
  • a structured indicator name is created for each level node corresponding to the indicator, and the structured indicator names determine the hierarchical relationship between the hierarchical nodes. The embodiment of the present application then monitors the lowest-level data of the indicator, records it under the lowest-level structured indicator name, and, based on those lowest-level statistics, summarizes the data level by level under the structured indicator name of the level above according to the hierarchical relationship between the structured indicator names.
  • because the hierarchical relationship is carried by the structured indicator names, the embodiment of the present application can compute statistics for the indicators of each level with simple logic and reduced system consumption; and because the hierarchy is constructed from the structured indicator names, one or several levels can be added or removed conveniently, which facilitates expansion.
  • FIG. 3 is a structural block diagram of an embodiment of an indicator statistical device of the present application.
  • FIG. 4 is a structural block diagram of an embodiment of an indicator statistical system of the present application.
  • a jstorm or storm real-time computing system is taken as an example to introduce related terms involved in the embodiments of the present application.
  • Topology: a topology job, that is, an application running on a jstorm or storm system. After a topology job is submitted to the storm or jstorm real-time computing system, it runs without interruption.
  • a topology consists of multiple components, each called a component.
  • the components of storm and jstorm are divided into spout and bolt.
  • the spout component represents the source of the processed data, such as a spout that can fetch data from an external message component, or retrieve data from a database.
  • a spout can continuously acquire data from any external data source and send it downstream, for example to a bolt; the bolt receives the data from the spout and processes it.
  • Task represents a logical processing unit, which is an instance of the implemented spout/bolt.
  • a component may include multiple Tasks.
  • Stream: a data stream. Stream is the smallest unit of indicator statistics in jstorm and storm. A Task may include multiple Streams.
  • jstorm and storm real-time computing systems have a hierarchical structure, such as stream → task → component → topology (from the lowest level to the highest). The data of each indicator is counted at the stream level.
  • One of the core concepts of the embodiments of the present application is as follows. A real-time computing system processes data in a hierarchical structure and needs to produce statistics quickly for the indicators at every level, so for each indicator the embodiments work with the hierarchical nodes of every level.
  • a structured indicator name is created for each hierarchical node, and the hierarchical relationship between the hierarchical nodes is determined by the structured indicator names themselves. Therefore, only the data under the lowest-level structured indicator names needs to be counted; the data can then be summarized level by level according to the hierarchical relationship between the structured indicator names, yielding the data of every level for the indicator.
  • because the hierarchy is carried by the structured indicator names, the indicators of each level can be counted with simple logic and reduced system consumption, and one or several levels can be added or removed conveniently, which facilitates expansion.
  • Referring to FIG. 1, a flow chart of the steps of an embodiment of an indicator statistical method of the present application is shown, which may specifically include the following steps:
  • Step 110 Create, for an indicator of a topology job, a structured indicator name for each level node corresponding to the indicator; wherein, the hierarchical relationship between the hierarchical nodes is determined by the structured indicator name;
  • taking real-time data processing in jstorm as an example, jstorm first receives a topology, that is, starts an application. The embodiment of the present application then needs to count various indicators at each level during topology processing, such as the number of messages sent (Emitted) at each level, the number of transmissions per second (TPS), and the like. To do so, the embodiment of the present application may create a structured indicator name for each level node corresponding to the indicator, and determine the hierarchical relationship between the nodes of each level from the structured indicator names.
  • Topology, component, task, and stream indicate the location of the node identifier, and name indicates the location of the indicator identifier.
  • the node identifier of the topology is tp1.
  • the topology includes a hierarchical node component whose node identifier is spout.
  • the component includes two hierarchical nodes, Task0 and Task1.
  • Task0 includes two hierarchical nodes, Stream0 and Stream1.
  • Task1 includes two hierarchical nodes, Stream2 and Stream3.
  • the indicator identifier of the counted indicator is Emitted.
  • the structured indicator name for Stream0 is:
  • tp1@spout@Task0@Stream0@Emitted, which holds the Emitted value counted for Stream0.
  • the structured indicator name for Stream1 is:
  • tp1@spout@Task0@Stream1@Emitted, which holds the Emitted value counted for Stream1.
  • the structured indicator name corresponding to Task0 is:
  • tp1@spout@Task0@@Emitted, which holds the Emitted value counted for Task0.
  • the structured indicator name for Stream2 is:
  • tp1@spout@Task1@Stream2@Emitted, which holds the Emitted value counted for Stream2.
  • the structured indicator name for Stream3 is:
  • tp1@spout@Task1@Stream3@Emitted, which holds the Emitted value counted for Stream3.
  • the structured indicator name corresponding to Task1 is:
  • tp1@spout@Task1@@Emitted, which holds the Emitted value counted for Task1.
  • the structured indicator name corresponding to spout is:
  • tp1@spout@@@Emitted, which holds the Emitted value counted for spout.
  • the structured indicator name corresponding to tp1 is:
  • tp1@@@@Emitted, which holds the Emitted value counted for tp1.
  • the structured indicator name corresponding to Task0 is obtained by reducing the structured indicator names corresponding to Stream0 and Stream1, so the parent-child relationship between them is explicit.
  • tp1@spout@Task0@Stream0@Emitted and tp1@spout@Task0@Stream1@Emitted are at the same level.
  • tp1@spout@@@Emitted is obtained by reducing the structured indicator names of Task0 and Task1, again with an explicit parent-child relationship.
  • tp1@spout@Task0@@Emitted and tp1@spout@Task1@@Emitted are at the same level.
  • the above structured indicator names thus make the hierarchical relationship between the nodes of each level explicit.
  • step 110 includes:
  • Sub-step A11: for an indicator of a topology job, the node identifiers of the hierarchical nodes from the top level down to the lowest level and the indicator identifier of the indicator are combined, in order, into the lowest-level structured indicator name;
  • that is, the node identifiers along the path from the top-level hierarchical node to the lowest-level hierarchical node, followed by the indicator identifier, are combined in order into the lowest-level structured indicator name.
  • Stream0 is a lowest-level node, and its structured indicator name is set first: tp1@spout@Task0@Stream0@Emitted.
  • Several other lowest level hierarchy nodes are similar.
  • the bottom-level structured indicator name represents the hierarchical path from the top to the bottom.
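  • As a minimal sketch of sub-step A11, the lowest-level name can be built by joining the node identifiers (top level first) and the indicator identifier with the separator. The function name lowest_level_name and its parameters are illustrative assumptions, not part of the source.

```python
SEPARATOR = "@"  # the separator used in the examples of the source text


def lowest_level_name(node_ids, indicator_id, sep=SEPARATOR):
    """Join node identifiers (top level first) and the indicator identifier.

    `node_ids` is the path from the topology down to the stream,
    e.g. ["tp1", "spout", "Task0", "Stream0"].
    """
    return sep.join([*node_ids, indicator_id])


name = lowest_level_name(["tp1", "spout", "Task0", "Stream0"], "Emitted")
print(name)  # tp1@spout@Task0@Stream0@Emitted
```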
  • Sub-step A12: starting from the lowest-level structured indicator name, for the structured indicator name of each level, set the hierarchical node of the current level in the structured indicator name to empty to obtain the structured indicator name of the level above.
  • the level above Stream is Task.
  • the embodiment of the present application merges the Streams that share the same Task at the level above. For example, setting Stream0 in tp1@spout@Task0@Stream0@Emitted, or Stream1 in tp1@spout@Task0@Stream1@Emitted, to empty yields the structured indicator name tp1@spout@Task0@@Emitted of the hierarchical node Task0. In the same way, the structured indicator name of Task1, tp1@spout@Task1@@Emitted, is obtained.
  • setting Task0 in tp1@spout@Task0@@Emitted, or Task1 in tp1@spout@Task1@@Emitted, to empty yields the structured indicator name of spout, tp1@spout@@@Emitted. This continues until the top-level structured indicator name is generated.
  • the symbol @ is used as a separator, which makes it easy to merge structured indicator names and generate the structured indicator name of each level.
  • a separator such as @ may also be omitted.
  • the node identifier of each hierarchical node and the associated hierarchical level may be provided to the real-time computing system, so that the real-time computing system can perform sub-step A12 according to the node identifier of each hierarchical level.
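  • Sub-step A12 can be sketched as repeatedly blanking the deepest node field that is still filled in while keeping the separators. The helper names parent_name and ancestor_names are assumptions made for illustration, and the sketch assumes the last field is always the indicator identifier.

```python
def parent_name(name, sep="@"):
    """Blank the deepest non-empty node field; return None at the top level."""
    *nodes, indicator = name.split(sep)
    # Positions of node fields that are still filled in.
    filled = [i for i, node in enumerate(nodes) if node]
    if len(filled) <= 1:
        return None  # only the topology identifier remains
    nodes[filled[-1]] = ""
    return sep.join([*nodes, indicator])


def ancestor_names(name, sep="@"):
    """List the names of all levels above `name`, nearest level first."""
    result = []
    current = parent_name(name, sep)
    while current is not None:
        result.append(current)
        current = parent_name(current, sep)
    return result


for ancestor in ancestor_names("tp1@spout@Task0@Stream0@Emitted"):
    print(ancestor)
```

  • For the Stream0 name this prints tp1@spout@Task0@@Emitted, tp1@spout@@@Emitted, and tp1@@@@Emitted, in that order.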
  • sub-step A11 includes:
  • the hierarchical node corresponding to the current-level separator in the structured indicator name is set to empty, yielding the structured indicator name of the level above.
  • in the embodiment of the present application, directly concatenating the full node identifiers and the indicator identifier could produce duplicate or ambiguous names. To prevent this, a separator is added between the node identifiers of adjacent levels, and also between the lowest-level node identifier and the indicator identifier. The @ in tp1@spout@Task0@Stream0@Emitted above is this separator. When generating the structured indicator name of the level above, only the node identifier is set to empty and the separator is retained, which makes the hierarchical relationship between the structured indicator names easier to determine.
  • for example, for tp1@spout@Task0@@Emitted, generated from tp1@spout@Task0@Stream0@Emitted, a subsequent merge only needs to see that nothing lies between the 3rd and 4th separators to conclude that tp1@spout@Task0@@Emitted holds the Emitted value of all Streams under Task0.
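  • Because the separators are retained, the level of a name and the parent-child relationship between two names can be read off the empty fields. The sketch below assumes the four-level layout used in the examples (topology, component, task, stream); the names LEVELS, level_of, and covers are hypothetical.

```python
LEVELS = ["topology", "component", "task", "stream"]  # assumed layout


def level_of(name, sep="@"):
    """Return the level a name summarizes, based on how many node fields are filled."""
    *nodes, _indicator = name.split(sep)
    depth = sum(1 for node in nodes if node)
    return LEVELS[depth - 1]


def covers(summary, detail, sep="@"):
    """True if every non-empty field of `summary` matches the same field of `detail`."""
    s, d = summary.split(sep), detail.split(sep)
    return len(s) == len(d) and all(a == b or a == "" for a, b in zip(s, d))


print(level_of("tp1@spout@Task0@@Emitted"))  # task
print(covers("tp1@spout@Task0@@Emitted", "tp1@spout@Task0@Stream1@Emitted"))  # True
```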
  • before step 110, the method further includes:
  • B11: register the indicator identifier corresponding to the lowest-level hierarchical node with the system.
  • the worker executed by each computing node may register the indicator identifier corresponding to the lowest-level hierarchical node in the system of that computing node as required, for example the indicator identifier of each hierarchical node at the Stream level. Once the indicator identifier of a lowest-level hierarchical node is registered, the computing node can automatically generate the structured indicator names of the hierarchical nodes of every level according to the structural definition of the structured indicator name. For example, for Stream0 at the stream level, step A11 generates tp1@spout@Task0@Stream0@Emitted, and step A12 then generates the structured indicator names of the levels above until every hierarchical node has one.
  • in this way, the computing node only needs to be told to register the indicator identifier corresponding to the lowest-level hierarchical node with the system. A simple notification suffices, and the structured indicator names of every hierarchical node do not have to be transmitted to the computing node, which reduces transmission overhead.
  • the structural definition of the structured indicator name may be configured in a scheduling server of the real-time computing system, and then transmitted to each computing node by the scheduling server.
  • the technician can change the hierarchical structure of the structured metric names configured in the dispatch server as needed to change the hierarchy and change the structured metric names for each level node accordingly.
  • the structure of the structured indicator name can be defined as:
  • the group field partitions the Emitted statistics according to the requirements of different services. For example, if service A sends 10 messages, the value under its structured indicator name is incremented by 1 for each message, and if service B sends 1 message, the value under its structured indicator name is incremented by 1. Different groups therefore accumulate different values under their corresponding structured indicator names.
  • the embodiment of the present invention may also change the structural definition of the foregoing structured indicator name according to actual needs, which is not limited in this application.
  • Step 120: after monitoring the lowest-level data corresponding to the indicator, record statistics under the corresponding lowest-level structured indicator name;
  • when a topology is created in jstorm, the jstorm scheduling system can divide the topology into multiple workers, each of which is a process that performs a specific task.
  • these workers are distributed across different computing nodes of the jstorm computing cluster and executed in parallel. All actual data processing is ultimately done in the workers. Each computing node therefore processes the topology in a hierarchical structure, so in the embodiment of the present application each computing node obtains the structured indicator names corresponding to the hierarchical nodes of the indicator, and then counts the indicator under each structured indicator name.
  • each worker can run at least one spout and/or at least one bolt.
  • a spout or bolt is divided into tasks for execution, and each task processes data in the form of streams.
  • the embodiment of the present application monitors the indicator-related data that appears in a stream. For example, for the number of messages sent (Emitted), when a Tuple (the basic unit of message delivery) is observed passing through the stream-level node Stream0, the value under the structured indicator name tp1@spout@Task0@Stream0@Emitted corresponding to Stream0 is updated to 1.
  • the system can be notified to monitor only the indicators of the lowest-level hierarchical nodes; levels other than the lowest are not monitored.
  • Step 130: based on the statistical data under the lowest-level structured indicator name, summarize the data level by level, according to the hierarchical relationship between the structured indicator names, under the structured indicator name of the level above.
  • for the structured indicator name above, the structured indicator names of the levels above it are found: tp1@spout@Task0@@Emitted, tp1@spout@@@Emitted, and tp1@@@@Emitted. The values of all three are then updated to 1.
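  • The level-by-level summary of Step 130 can be sketched as follows, assuming every lowest-level name has the same depth and that summarizing means summing counts (natural for Emitted, but an assumption for other indicators). The helpers parent_name and roll_up are illustrative, not taken from the source.

```python
from collections import defaultdict


def parent_name(name, sep="@"):
    """Blank the deepest non-empty node field; return None at the top level."""
    *nodes, indicator = name.split(sep)
    filled = [i for i, node in enumerate(nodes) if node]
    if len(filled) <= 1:
        return None
    nodes[filled[-1]] = ""
    return sep.join([*nodes, indicator])


def roll_up(lowest_counts, sep="@"):
    """Sum lowest-level counts into every level above, one level at a time."""
    totals = dict(lowest_counts)
    current = dict(lowest_counts)
    while current:
        parents = defaultdict(int)
        for name, value in current.items():
            parent = parent_name(name, sep)
            if parent is not None:
                parents[parent] += value
        totals.update(parents)
        current = parents  # the next pass summarizes the level just built
    return totals


totals = roll_up({
    "tp1@spout@Task0@Stream0@Emitted": 2,
    "tp1@spout@Task1@Stream3@Emitted": 1,
})
print(totals["tp1@spout@@@Emitted"])  # 3
```

  • Only the stream-level counters are written while tuples flow; every higher level is derived from them, which is the property the structured names are meant to provide.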
  • in step 120, another Tuple is observed passing through the stream-level node Stream0, and the value under the structured indicator name tp1@spout@Task0@Stream0@Emitted corresponding to Stream0 is updated to 2.
  • in step 130, the data is summarized level by level according to the hierarchical relationship, and the summary order and results are as follows:
  • in step 120, a Tuple is observed passing through the stream-level node Stream3, and the value under the structured indicator name tp1@spout@Task1@Stream3@Emitted corresponding to Stream3 is updated to 1.
  • step 130 according to the hierarchical relationship, the layers are summarized step by step, and the summary order and results are as follows:
  • the embodiment of the present application aggregates the records under the structural indicator names of the respective computing nodes.
  • the record for compute node 2 is:
  • the real-time computing system counts the data within one minute for each structured indicator, and can then output statistics continuously every minute, for example in the form of a log.
  • the method further includes:
  • step 140 the statistical data under each structured indicator name is exported to a database for storage.
  • because the real-time computing system does not provide database functionality, the statistical results are inconvenient to query.
  • the present application can export the statistical data under each structured indicator name to HBase, Hadoop, Hive and other databases.
  • step 140 includes:
  • Sub-step C11: the statistical data under each structured indicator name is exported to the database, with the structured indicator name and timestamp as the key and the statistical data as the value.
  • at the end of each time period, for each structured indicator name and its statistical data, the structured indicator name and the timestamp are used as the key and the statistical data as the value, and the pair is stored in a database such as HBase. The timestamp then makes it easy to look up the indicator values of each level over a period of time. This timestamp is the system time at the end of each time period.
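  • The key-value layout of sub-step C11 can be sketched with an in-memory dict standing in for the database. The "#" between name and timestamp, and the function names make_key, export_period, and query, are assumptions for illustration; the source only states that name and timestamp together form the key.

```python
KEY_SEP = "#"  # hypothetical separator between indicator name and timestamp


def make_key(name, period_end):
    """Build the storage key from the structured indicator name and timestamp."""
    return f"{name}{KEY_SEP}{period_end}"


def export_period(store, stats, period_end):
    """Write one period's statistics; `period_end` is the time the period ended."""
    for name, value in stats.items():
        store[make_key(name, period_end)] = value


def query(store, name, start, end):
    """Return (timestamp, value) pairs for one name with start <= timestamp <= end."""
    rows = []
    for key, value in store.items():
        key_name, _, ts = key.rpartition(KEY_SEP)
        if key_name == name and start <= int(ts) <= end:
            rows.append((int(ts), value))
    return sorted(rows)


store = {}
export_period(store, {"tp1@@@@Emitted": 3, "tp1@spout@@@Emitted": 3}, 60)
export_period(store, {"tp1@@@@Emitted": 5}, 120)
print(query(store, "tp1@@@@Emitted", 0, 120))  # [(60, 3), (120, 5)]
```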
  • the topology identifier of the topology is tp1; tp1 has one component, whose hierarchical identifier is spout; there are 5 tasks under the component, each with its own id, and their hierarchical identifiers are Task0 to Task4; each task in turn has 2 streams, whose hierarchical identifiers are Stream0 and Stream1. Its hierarchical relationship is then:
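  • As an illustration only, the structured indicator names implied by the topology just described (one component, five tasks, two streams per task) can be enumerated as below. The Emitted indicator is borrowed from the earlier examples, and the function all_names is hypothetical; this is not a reproduction of the source's own listing.

```python
def all_names(topology, component, tasks, streams, indicator, sep="@"):
    """Enumerate the names of every level, parents before their children."""
    names = [
        sep.join([topology, "", "", "", indicator]),
        sep.join([topology, component, "", "", indicator]),
    ]
    for task in tasks:
        names.append(sep.join([topology, component, task, "", indicator]))
        for stream in streams:
            names.append(sep.join([topology, component, task, stream, indicator]))
    return names


names = all_names(
    "tp1", "spout",
    [f"Task{i}" for i in range(5)],
    ["Stream0", "Stream1"],
    "Emitted",
)
print(len(names))  # 17: 1 topology + 1 component + 5 tasks + 10 streams
```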
  • the traditional indicator statistical method does not reflect the above hierarchical relationship and calculation logic. If you need to implement this hierarchical logic, you need to do a lot of additional complex logic judgments and calculations. At the same time, the traditional method also needs to carefully select the indicator name to avoid duplication and result in inaccurate data.
  • in the embodiment of the present application, a real-time computing system processes data in a hierarchical structure and needs to produce statistics quickly for the indicators at every level.
  • the embodiment of the present application therefore creates a structured indicator name for each hierarchical node, and the hierarchical relationship between the hierarchical nodes, and with it the summary relationship, is determined by the structured indicator names themselves. Only the data under the lowest-level structured indicator names needs to be counted; the data can then be summarized level by level according to the hierarchical relationship between the structured indicator names, yielding the data of every level for the indicator.
  • because the hierarchy is carried by the structured indicator names, the indicators of each level can be counted with simple logic and reduced system consumption, and one or several levels can be added or removed conveniently, which facilitates expansion.
  • Referring to FIG. 2, a flow chart of the steps of a preferred embodiment of an indicator statistical method of the present application is shown. Specifically, the method may include the following steps:
  • Step 210: each computing node registers the indicator identifier corresponding to the lowest-level hierarchical node with the system of that computing node.
  • the real-time computing system may adopt a distributed computing system, where the distributed computing system includes a scheduling server and each computing node.
  • the structural definition of the structured indicator name can be configured in the scheduling server of the real-time computing system and then delivered by the scheduling server to each computing node, so that each computing node can process the lowest-level structured indicator names according to that definition.
  • the worker executed by each computing node can register the indicator identifier corresponding to the lowest-level hierarchical node in the system of that computing node as required.
  • the indicators of a topology can be divided into two kinds: 1. system indicators already defined inside the JStorm computing framework; 2. user-defined, business-related indicators.
  • if the user selects a system indicator, it can be registered in the system of the computing node when the worker is initialized. If the user selects a user-defined business indicator, it can be registered in the system of the computing node when the worker initializes the user code.
  • taking jstorm as an example of registering an indicator on each computing node: for a stream, the worker calls a global static method provided by jstorm, registerStreamMetrics (metric related parameter), which registers the indicator identifier of the stream in the system according to the metric-related parameters. The method then proceeds to step 220 to generate the structured indicator names level by level.
  • Step 220: for each indicator of a topology job, each computing node uses a separator to splice the node identifiers of the hierarchical nodes from the top level down to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name;
  • after a computing node registers the lowest-level indicator identifier, it can generate the lowest-level structured indicator name according to the definition of the structured indicator name and the parent-child relationships of the hierarchical nodes recorded in the system.
  • Step 230: starting from the lowest-level structured indicator name, for the structured indicator name of each level, each computing node sets the hierarchical node of the current level in the structured indicator name to empty and obtains the structured indicator name of the level above.
  • Step 240: after monitoring the lowest-level data corresponding to the indicator, each computing node records statistics under the corresponding lowest-level structured indicator name;
  • Step 250: based on the statistical data under the lowest-level structured indicator name, each computing node summarizes the data level by level, according to the hierarchical relationship between the structured indicator names, under the structured indicator name of the level above.
  • Each computing node collects data under the name of each structured indicator according to a time period, for example, a period of 1 minute, and at the end of the time period, sends statistical data of each structured indicator name of the period to the scheduling server.
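  • A minimal sketch of per-period collection on a computing node follows. It assumes fixed windows aligned to multiples of the period length and closes a window when the first event of a later window arrives; the class PeriodCollector and its method names are hypothetical, and sending the report to the scheduling server is left to the caller.

```python
from collections import Counter


class PeriodCollector:
    """Count events per structured indicator name within fixed time periods."""

    def __init__(self, period_seconds=60):
        self.period_seconds = period_seconds
        self.counts = Counter()
        self.current_index = None  # index of the period being filled

    def record(self, name, now, amount=1):
        """Count one event at time `now` (seconds).

        Returns (period_end, stats) for the finished period when `now` falls
        into a later period than the previous event, otherwise None.
        """
        index = int(now // self.period_seconds)
        report = None
        if self.current_index is not None and index != self.current_index:
            period_end = (self.current_index + 1) * self.period_seconds
            report = (period_end, dict(self.counts))
            self.counts.clear()
        self.current_index = index
        self.counts[name] += amount
        return report


collector = PeriodCollector(period_seconds=60)
collector.record("tp1@spout@Task0@Stream0@Emitted", now=0)
collector.record("tp1@spout@Task0@Stream0@Emitted", now=30)
report = collector.record("tp1@spout@Task0@Stream0@Emitted", now=61)
print(report)  # (60, {'tp1@spout@Task0@Stream0@Emitted': 2})
```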
  • Step 260 The scheduling server acquires statistics of each structured indicator name from each computing node, and performs aggregation.
  • the scheduling server obtains from each computing node the statistical data under each structured indicator name, and can then aggregate it.
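  • Aggregation on the scheduling server can be sketched as merging the per-node reports by structured indicator name. Summing the values is an assumption that suits counters such as Emitted; the source does not specify the aggregation function, and the name aggregate is illustrative.

```python
from collections import Counter


def aggregate(node_reports):
    """Sum the statistics reported by each computing node, keyed by name."""
    total = Counter()
    for report in node_reports:
        total.update(report)  # Counter.update adds values for matching keys
    return dict(total)


node1 = {"tp1@spout@Task0@@Emitted": 2, "tp1@spout@@@Emitted": 2, "tp1@@@@Emitted": 2}
node2 = {"tp1@spout@Task1@@Emitted": 1, "tp1@spout@@@Emitted": 1, "tp1@@@@Emitted": 1}
cluster = aggregate([node1, node2])
print(cluster["tp1@@@@Emitted"])  # 3
```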
  • Step 270: the scheduling server exports the aggregated statistical data under each structured indicator name to the database, storing it with the structured indicator name and timestamp as the key and the statistical data as the value.
  • the indicator statistics of every level on each computing node can be aggregated to the scheduling server of the cluster.
  • each computing node sends its indicator statistics to the scheduling server once per time period, for example every minute. Because the scheduling server does not act as a storage server, the data is continuously overwritten by new indicator statistics, so only the aggregated indicator statistics for the most recent time period can be seen.
  • the scheduling server of the embodiment of the present application stores the aggregated indicator statistics in an external database.
  • the structured indicator name + time stamp is the key, and the statistical data is the value and stored in the database.
  • Referring to FIG. 3, a structural block diagram of an embodiment of an indicator statistical device of the present application is shown, which may specifically include the following modules:
  • the structured identifier creation module 310 is configured to: for an indicator of a topology job, create a structured indicator name for each level node corresponding to the indicator; wherein, the hierarchical relationship between the hierarchical nodes is determined by the structured indicator name;
  • the bottom layer indicator monitoring module 320 is configured to, after monitoring the lowest-level data corresponding to the indicator, record statistics under the corresponding lowest-level structured indicator name;
  • the layer-by-layer summary module 330 is configured to summarize the statistical data under the lowest-level structured indicator name, level by level according to the hierarchical relationship between the structured indicator names, under the structured indicator name of the level above.
  • the structured identifier creation module 310 includes:
  • the bottom layer indicator creation sub-module is configured to, for an indicator of a topology job, combine in order the node identifiers of the hierarchical nodes from the top level down to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name;
  • the upper layer indicator creation sub-module is configured to, starting from the lowest-level structured indicator name, set the hierarchical node of the current level to empty in the structured indicator name of each level, and obtain the structured indicator name of the level above.
  • the bottom layer indicator creation sub-module includes:
  • the bottom layer indicator separation creation sub-module, configured to use a separator to splice the node identifiers of the hierarchical nodes from the top level down to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name.
  • the upper layer indicator creation sub-module includes:
  • the upper layer indicator separation creation sub-module, configured to, starting from the lowest-level structured indicator name, set to empty the hierarchical node corresponding to the current-level separator in the structured indicator name of each level, and obtain the structured indicator name of the level above.
  • before the structured identifier creation module 310, the device further includes:
  • the registration module, configured to register the indicator identifier corresponding to the lowest-level hierarchical node with the system.
  • the device further includes:
  • a data storage module for exporting statistics under each structured indicator name to a database for storage.
  • the data storage module includes:
  • the data storage sub-module is configured to export the statistical data under each structured indicator name to the database, storing it with the structured indicator name and timestamp as the key and the statistical data as the value.
  • In the embodiments of the present application, a real-time computing system has a hierarchical structure when it processes data, and the aim is to let the real-time computing system quickly collect statistics on an indicator at all levels.
  • To that end, the embodiments of the present application create a structured indicator name for each hierarchical node; the structured indicator names themselves determine the hierarchical relationship between the hierarchical nodes, and therefore the aggregation relationship. Only the data under the lowest-level structured indicator names needs to be counted; it can then be aggregated level by level according to the hierarchical relationship between the structured indicator names, yielding the data of the indicator at every level.
  • The hierarchical relationship of the structured indicator names thus allows the indicators of every level to be counted with simple logic and reduced system consumption, and because the hierarchy is built from the structured indicator names, one or several levels can easily be added or removed, which makes the scheme easy to extend.
  • Referring to FIG. 4, a structural block diagram of an embodiment of an indicator statistics system of the present application is shown, which may specifically include:
  • the scheduling server 410, the computing nodes 420, and the database 430.
  • The computing nodes 420 in FIG. 4 are shown by way of example; in practice, the computing nodes may be set up according to the needs of the cluster.
  • Each computing node includes a registration module 421, a bottom-level indicator separator creation module 422, an upper-level indicator separator creation module 423, a bottom-level indicator monitoring module 424, and a layer-by-layer summary module 425.
  • The scheduling server includes a summary module 411 and a data storage sub-module 412.
  • Each computing node may also include other required modules, which are not limited in the embodiments of the present application.
  • the above scheduling server 410 includes:
  • the summary module 411 is configured to obtain the statistical data of each structured indicator name from each computing node and summarize it;
  • the data storage sub-module 412 is configured to export the statistical data under each structured indicator name to the database 430.
  • the database 430 stores the data with the structured indicator name and timestamp as the key and the statistical data as the value.
  • Each compute node 420 includes:
  • the registration module 421 is configured to register, to the system of the computing node, an indicator identifier corresponding to the lowest level hierarchical node.
  • the bottom-level indicator separator creation module 422 is configured to use a separator to splice, in order, the node identifiers from the top-level hierarchical node to the lowest-level hierarchical node and the indicator identifier of the indicator into the lowest-level structured indicator name.
  • the upper-level indicator separator creation module 423 is configured to, starting from the lowest-level structured indicator name and for the structured indicator name of each level, set to empty the hierarchical node that corresponds to the separator of the current level in that name, obtaining the structured indicator name of the next level up.
  • the bottom-level indicator monitoring module 424 is configured to, after lowest-level data corresponding to the indicator is detected, record statistics under the corresponding lowest-level structured indicator name;
  • the layer-by-layer summary module 425 is configured to, based on the statistical data under the lowest-level structured indicator names and according to the hierarchical relationship between the structured indicator names, aggregate the data level by level under the structured indicator name of the next level up.
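The summary performed by the scheduling server (summary module 411) can be sketched as a merge of the per-node results keyed by structured indicator name. This is a hedged illustration: the function name `summarize`, the sample figures, and the choice of addition as the merge operation (reasonable for a counter such as Emitted, but not stated in this application) are all assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize(per_node_stats: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Merge statistics reported by several computing nodes, name by name."""
    total: Counter = Counter()
    for stats in per_node_stats:
        total.update(stats)  # Counter.update adds values for matching names
    return dict(total)


# Hypothetical reports from two computing nodes, each already aggregated locally.
node_a = {"tp1@spout@Task0@@Emitted": 7, "tp1@spout@@@Emitted": 7, "tp1@@@@Emitted": 7}
node_b = {"tp1@spout@Task1@@Emitted": 11, "tp1@spout@@@Emitted": 11, "tp1@@@@Emitted": 11}
merged = summarize([node_a, node_b])
```

Because identical structured indicator names denote the same hierarchical node regardless of which computing node reported them, the merge needs no knowledge of the hierarchy itself.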
  • The description here is relatively brief; for the relevant parts, refer to the description of the method embodiments.
  • The embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.).
  • the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media includes both permanent and non-persistent, removable and non-removable media.
  • Information storage can be implemented by any method or technology. The information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • Computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the technical field of real-time computation, and provides a pointer counting method and device. The method comprises: creating, according to nodes that are in respective hierarchical levels and corresponding to a pointer of a topological operation, a structured pointer name for the pointer, wherein a hierarchical relationship between the nodes in the respective hierarchical levels can be determined via the structured pointer name (110); upon detection of lowest-level data corresponding to the pointer, performing a counting operation for structured pointer names corresponding to the lowest level (120); and based on data from the counting operation for the structured pointer names in the lowest level, aggregating, level by level and according to the hierarchical relationship between respective structured pointer names, the data in structured pointer names in successive upper levels (130). The present invention can reduce system costs, and allows convenient addition or deletion of one or more levels because a hierarchical relationship is built in association with structured pointer names, thereby facilitating expansion.

Description

Indicator statistics method and apparatus
The present application claims priority to Chinese patent application No. 201510886186.X, filed on December 4, 2015 and entitled "Indicator statistics method and apparatus", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of real-time computing technologies, and in particular to an indicator statistics method and an indicator statistics apparatus.
Background
With the rapid development of information technology, the volume of information has grown explosively, and people have more diverse and more convenient ways to obtain information; at the same time, the demands on the timeliness of information keep rising. Take a search scenario as an example: when a seller on an e-commerce website publishes a piece of product information, the seller naturally wants buyers to be able to search for, click on, and purchase that product immediately. If instead the product information can only be found the next day or later, it is far too stale for the seller, and the real-time nature of the product information suffers badly. This demand gave rise to real-time computing systems, such as the hierarchical real-time computing systems jstorm and storm.
Storm is a distributed, open-source real-time computing system under the Apache community, developed in the Clojure language (Clojure is a Lisp language that runs on the Java platform; Java is an object-oriented programming language for writing cross-platform applications, and Lisp is a programming language known for its expressiveness and power). Storm can be used for "stream processing", handling messages in real time; for "continuous computation", continuously processing a data stream and delivering the results to users as a stream while computing; and for "distributed RPC" (Remote Procedure Call Protocol), performing operations in parallel. JStorm is a real-time computing system developed on the basis of Storm and is compatible with Storm.
In a real-time computing system, measuring the health and performance of an application usually requires measuring and collecting statistics on various indicators of the application, such as the number of messages the application emits (Emitted) and the number sent per second (TPS).
Many real-time computing systems, including layered systems such as jstorm/storm, have a hierarchical structure. With traditional statistical methods, however, an indicator can only be counted at a single level. To collect data for all levels, the indicator has to be defined separately at each specific level, and aggregating and merging the indicator data across levels requires additional complex logic; the computation is complicated and consumes substantial system resources.
Summary of the invention
In view of the above problems, embodiments of the present application are proposed to provide an indicator statistics method and a corresponding indicator statistics apparatus that overcome, or at least partially solve, the above problems.
To solve the above problems, the present application discloses an indicator statistics method, including:
for an indicator of a topology job, creating structured indicator names for the hierarchical nodes of each level corresponding to the indicator, wherein the hierarchical relationship between the hierarchical nodes is determined by the structured indicator names;
after lowest-level data corresponding to the indicator is detected, recording statistics under the corresponding lowest-level structured indicator name;
based on the statistical data under the lowest-level structured indicator names, and according to the hierarchical relationship between the structured indicator names, aggregating the data level by level under the structured indicator names of the next level up.
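The counting and level-by-level aggregation steps can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the "@"-separated name form from the embodiments of this application, the helper names `parent_name` and `roll_up` and the sample values are hypothetical, and aggregation is taken to be summation, which suits a counter such as Emitted but is not prescribed by the text.

```python
from collections import defaultdict
from typing import Dict, Optional

SEP = "@"  # separator used in the examples of this application


def parent_name(name: str) -> Optional[str]:
    """Blank the deepest non-empty node field; return None at the top level."""
    *nodes, indicator = name.split(SEP)
    for i in range(len(nodes) - 1, 0, -1):  # nodes[0] is the top level and is kept
        if nodes[i]:
            nodes[i] = ""
            return SEP.join(nodes + [indicator])
    return None


def roll_up(lowest: Dict[str, int]) -> Dict[str, int]:
    """Sum lowest-level statistics into each upper level in turn."""
    totals = defaultdict(int, lowest)
    level = dict(lowest)
    while level:
        upper = defaultdict(int)
        for name, value in level.items():
            parent = parent_name(name)
            if parent is not None:
                upper[parent] += value
        for name, value in upper.items():
            totals[name] += value
        level = upper  # continue from the level just computed
    return dict(totals)


# Counting happens only under the lowest-level names (hypothetical sample data).
lowest = defaultdict(int)
for name, emitted in [
    ("tp1@spout@Task0@Stream0@Emitted", 3),
    ("tp1@spout@Task0@Stream1@Emitted", 4),
    ("tp1@spout@Task1@Stream2@Emitted", 5),
    ("tp1@spout@Task1@Stream3@Emitted", 6),
    ("tp1@spout@Task0@Stream0@Emitted", 2),
]:
    lowest[name] += emitted

totals = roll_up(lowest)
```

Note that nothing in `roll_up` refers to a specific level: the parent of each name is derived from the name alone, which is what allows levels to be added or removed without changing the aggregation logic.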
The present application also discloses an indicator statistics apparatus, including:
a structured identifier creation module, configured to, for an indicator of a topology job, create structured indicator names for the hierarchical nodes of each level corresponding to the indicator, wherein the hierarchical relationship between the hierarchical nodes is determined by the structured indicator names;
a bottom-level indicator monitoring module, configured to, after lowest-level data corresponding to the indicator is detected, record statistics under the corresponding lowest-level structured indicator name;
a layer-by-layer summary module, configured to, based on the statistical data under the lowest-level structured indicator names and according to the hierarchical relationship between the structured indicator names, aggregate the data level by level under the structured indicator names of the next level up.
Embodiments of the present application include the following advantages:
For an indicator to be counted for a topology job of a real-time computing system, the embodiments of the present application create structured indicator names for the hierarchical nodes of each level corresponding to the indicator, and these structured indicator names are used to determine the hierarchical relationship between the hierarchical nodes. The embodiments then monitor the lowest-level data of the indicator, record it under the lowest-level structured indicator names, and, according to the hierarchical relationship between the structured indicator names and starting from the lowest-level statistics, aggregate it level by level under the structured indicator names of the next level up. In this way, the hierarchical relationship of the structured indicator names allows the indicators of every level to be counted with simple logic and reduced system consumption, and because the hierarchy is built from the structured indicator names, one or several levels can easily be added or removed, which makes the scheme easy to extend.
Brief description of the drawings
FIG. 1 is a flow chart of the steps of an embodiment of an indicator statistics method of the present application;
FIG. 2 is a flow chart of the steps of an embodiment of an indicator statistics method of the present application;
FIG. 3 is a structural block diagram of an embodiment of an indicator statistics apparatus of the present application;
FIG. 4 is a structural block diagram of an embodiment of an indicator statistics system of the present application.
Detailed description
To make the above objects, features, and advantages of the present application clearer and easier to understand, the present application is described in further detail below with reference to the accompanying drawings and specific embodiments.
To describe the embodiments of the present application more conveniently, the jstorm or storm real-time computing system is taken as an example to introduce the terms used in the embodiments.
topology: a topology job, that is, an application that runs on a jstorm or storm system. Once a topology job has been submitted to the storm or jstorm real-time computing system, it can run without interruption.
component: a topology consists of multiple components, each of which is called a component. The components of storm and jstorm are of two kinds, spout and bolt. A spout component represents the source of the data being processed; for example, a spout may fetch data from an external message component or from a database. Broadly speaking, a spout can continuously fetch data from any external data source and send it downstream, for example to a bolt; a bolt receives data from a spout and processes it.
Task: a task. A Task represents one logical processing unit, that is, an instance of an implemented spout or bolt. A component may include multiple Tasks.
Stream: a data stream. The Stream is the smallest unit for indicator statistics in jstorm and storm. A Task may include multiple Streams.
In practice, the jstorm and storm real-time computing systems have a hierarchical structure such as stream→task→component→topology, and the data of each indicator is counted at the stream level.
One of the core ideas of the embodiments of the present application is as follows. A real-time computing system has a hierarchical structure when it processes data, and the system should be able to collect statistics on an indicator at all levels quickly. For the hierarchical nodes of an indicator at each level, the embodiments of the present application therefore create a structured indicator name for each hierarchical node, and the structured indicator names themselves determine the hierarchical relationship between the hierarchical nodes. Only the data under the lowest-level structured indicator names needs to be counted; it can then be aggregated level by level according to the hierarchical relationship between the structured indicator names, yielding the data of the indicator at every level. The hierarchical relationship of the structured indicator names thus allows the indicators of every level to be counted with simple logic and reduced system consumption, and because the hierarchy is built from the structured indicator names, one or several levels can easily be added or removed, which makes the scheme easy to extend.
Embodiment 1
Referring to FIG. 1, a flow chart of the steps of an embodiment of an indicator statistics method of the present application is shown, which may specifically include the following steps:
Step 110: for an indicator of a topology job, create structured indicator names for the hierarchical nodes of each level corresponding to the indicator, wherein the hierarchical relationship between the hierarchical nodes is determined by the structured indicator names;
In the embodiments of the present application, take real-time data processing by jstorm as an example. jstorm first receives a topology job (topology), that is, it starts an application. The embodiments of the present application then need to collect statistics on the various indicators at every level during the processing of the topology, such as the number of messages emitted at each level (Emitted), the number sent per second at each level (TPS), and so on. To do so, the embodiments may create structured indicator names for the hierarchical nodes of each level corresponding to the indicator, and use the structured indicator names to determine the hierarchical relationship between the hierarchical nodes.
Take the hierarchy stream→task→component→topology described above as an example. The present application may predefine the structure of a structured indicator name as:
topology@component@Task@Stream@name
where topology, component, Task, and Stream mark the positions of the node identifiers, and name marks the position of the indicator identifier.
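The predefined structure can be illustrated by a small value type that splits a structured indicator name into its five positions and joins them back. The class name `StructuredName`, its `parse` and `render` methods, and the sample name are assumptions made for this sketch, not part of this application.

```python
from typing import NamedTuple


class StructuredName(NamedTuple):
    """The five positions of topology@component@Task@Stream@name."""
    topology: str
    component: str
    task: str
    stream: str
    name: str  # indicator identifier, for example "Emitted"

    @classmethod
    def parse(cls, text: str, sep: str = "@") -> "StructuredName":
        parts = text.split(sep)
        if len(parts) != len(cls._fields):
            raise ValueError(f"expected {len(cls._fields)} fields, got {len(parts)}")
        return cls(*parts)

    def render(self, sep: str = "@") -> str:
        # An empty position is kept as an empty string between two separators.
        return sep.join(self)


parsed = StructuredName.parse("tp1@spout@Task0@@Emitted")
```

An empty position, as in the `stream` field here, means the name refers to a node above that level.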
Suppose the hierarchical node at the topology level is tp1. The topology includes one hierarchical node at the component level, whose node identifier is spout. The component includes two hierarchical nodes, Task0 and Task1. Task0 includes two hierarchical nodes, Stream0 and Stream1. Task1 includes two hierarchical nodes, Stream2 and Stream3. The indicator identifier being counted is Emitted.
Structured indicator names can then be created for the hierarchical nodes as follows:
The structured indicator name corresponding to Stream0 is:
tp1@spout@Task0@Stream0@Emitted, which counts the Emitted value of Stream0.
The structured indicator name corresponding to Stream1 is:
tp1@spout@Task0@Stream1@Emitted, which counts the Emitted value of Stream1.
The structured indicator name corresponding to Task0 is:
tp1@spout@Task0@@Emitted, which counts the Emitted value of Task0.
The structured indicator name corresponding to Stream2 is:
tp1@spout@Task1@Stream2@Emitted, which counts the Emitted value of Stream2.
The structured indicator name corresponding to Stream3 is:
tp1@spout@Task1@Stream3@Emitted, which counts the Emitted value of Stream3.
The structured indicator name corresponding to Task1 is:
tp1@spout@Task1@@Emitted, which counts the Emitted value of Task1.
The structured indicator name corresponding to spout is:
tp1@spout@@@Emitted, which counts the Emitted value of spout.
The structured indicator name corresponding to tp1 is:
tp1@@@@Emitted, which counts the Emitted value of tp1.
The structured indicator name of Task0 is thus obtained by reducing the structured indicator names of Stream0 and Stream1, which gives a clear parent-child relationship, while tp1@spout@Task0@Stream0@Emitted and tp1@spout@Task0@Stream1@Emitted are siblings. Likewise, tp1@spout@@@Emitted is obtained by reducing the structured indicator names of Task0 and Task1, again a clear parent-child relationship, while tp1@spout@Task0@@Emitted and tp1@spout@Task1@@Emitted are siblings. The structured indicator names above therefore make the hierarchical relationship between the hierarchical nodes explicit.
In another preferred embodiment of the present application, step 110 includes:
Sub-step A11: for an indicator of a topology job, combine in order the node identifiers from the top-level hierarchical node to the lowest-level hierarchical node, together with the indicator identifier of the indicator, into the lowest-level structured indicator name;
In the embodiments of the present application, for ease of computation, for an indicator of a topology the node identifiers from the top-level hierarchical node to the lowest-level hierarchical node and the indicator identifier of the indicator are first combined in order, following the predefined structure of the structured indicator name, into the lowest-level structured indicator name. For example, Stream0 above is a lowest-level hierarchical node, so its structured indicator name is set first: tp1@spout@Task0@Stream0@Emitted. The other lowest-level hierarchical nodes are handled in the same way.
The lowest-level structured indicator name represents the hierarchical path from the top level to the lowest level.
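Sub-step A11 can be sketched by walking a description of the hierarchy and joining the identifiers along each path. The nested-dictionary representation of the hierarchy and the function name `lowest_level_names` are assumptions chosen for this illustration; the identifiers are those of the example above.

```python
from typing import Dict, List


def lowest_level_names(topology: str,
                       components: Dict[str, Dict[str, List[str]]],
                       indicator: str,
                       sep: str = "@") -> List[str]:
    """Join the node identifiers from top to bottom, then the indicator identifier."""
    names = []
    for component, tasks in components.items():
        for task, streams in tasks.items():
            for stream in streams:
                names.append(sep.join([topology, component, task, stream, indicator]))
    return names


# Hypothetical description of the example hierarchy: component -> task -> streams.
hierarchy = {
    "spout": {
        "Task0": ["Stream0", "Stream1"],
        "Task1": ["Stream2", "Stream3"],
    }
}
names = lowest_level_names("tp1", hierarchy, "Emitted")
```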
Sub-step A12: starting from the lowest-level structured indicator name, for the structured indicator name of each level, set the hierarchical node of the current level in that name to empty, obtaining the structured indicator name of the next level up.
After the lowest-level structured indicator names have been set, the structured indicator name corresponding to each hierarchical node is computed upward, level by level.
For example, the level above Stream is Task, so the embodiments of the present application merge the Streams that share the same Task at the level above. Setting Stream0 in tp1@spout@Task0@Stream0@Emitted to empty, or Stream1 in tp1@spout@Task0@Stream1@Emitted to empty, yields the structured indicator name of the hierarchical node Task0, tp1@spout@Task0@@Emitted. The structured indicator name of Task1, tp1@spout@Task1@@Emitted, is obtained in the same way. Moving up one level from the Task nodes, setting Task0 in tp1@spout@Task0@@Emitted or Task1 in tp1@spout@Task1@@Emitted to empty yields the structured indicator name of spout, tp1@spout@@@Emitted. This continues until the top-level structured indicator name has been generated.
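Sub-step A12 can be sketched as repeatedly blanking the deepest remaining node identifier. The function name `upper_level_names` is hypothetical, and the sketch assumes that the first field is the top-level node and the last field is the indicator identifier, as in the example names.

```python
from typing import List


def upper_level_names(lowest_name: str, sep: str = "@") -> List[str]:
    """Return the names of every upper level, from the nearest level to the top."""
    fields = lowest_name.split(sep)
    chain = []
    # fields[0] is the top-level node and fields[-1] the indicator; both are kept.
    for i in range(len(fields) - 2, 0, -1):
        fields[i] = ""
        chain.append(sep.join(fields))
    return chain


chain0 = upper_level_names("tp1@spout@Task0@Stream0@Emitted")
chain1 = upper_level_names("tp1@spout@Task0@Stream1@Emitted")
```

The two chains are identical, which reflects the merging described above: sibling Streams reduce to the same Task name.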
It will be understood that in the above example the symbol @ serves as a separator so that structured indicator names can be merged more conveniently to generate the structured indicator name of the next level up. In practice, a symbol such as @ need not be used; in that case, the node identifier of each hierarchical node and the level it belongs to can be provided to the real-time computing system, so that the real-time computing system can perform sub-step A12 based on the node identifiers of each level.
Preferably, in another preferred embodiment of the present application, sub-step A11 includes:
A111: starting from the lowest-level structured indicator name, for the structured indicator name of each level, set to empty the hierarchical node that corresponds to the separator of the current level in that name, obtaining the structured indicator name of the next level up.
In the embodiments of the present application, to prevent duplicate names and ambiguity in the full name obtained by directly concatenating the node identifiers of the hierarchical nodes and the indicator identifier, a separator is added between the node identifiers of any two adjacent levels, and also between the lowest-level node identifier and the indicator identifier. The @ in tp1@spout@Task0@Stream0@Emitted above is such a separator. When the structured indicator name of the next level up is generated, only the node identifier of the current level is set to empty and the separators are retained, which makes it easier to determine the hierarchical relationship between the structured indicator names. For example, in tp1@spout@Task0@@Emitted, generated from tp1@spout@Task0@Stream0@Emitted, a later merge only needs to see that the field between the third and fourth separators is empty to determine that the name tp1@spout@Task0@@Emitted counts the Emitted values of all Streams under Task0.
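Because the separators are retained, the level a name refers to can be read off from which fields are empty. The sketch below illustrates this; the function name `level_of`, the `LEVELS` tuple, and the validation rules are assumptions made for the example.

```python
LEVELS = ("topology", "component", "task", "stream")


def level_of(name: str, sep: str = "@") -> str:
    """Return the level a structured indicator name refers to."""
    *nodes, _indicator = name.split(sep)
    if len(nodes) != len(LEVELS):
        raise ValueError(f"expected {len(LEVELS)} node fields, got {len(nodes)}")
    depth = sum(1 for node in nodes if node)
    # A well-formed name has its non-empty fields first and only empty ones after.
    if depth == 0 or any(nodes[depth:]):
        raise ValueError(f"malformed structured indicator name: {name!r}")
    return LEVELS[depth - 1]


task_level = level_of("tp1@spout@Task0@@Emitted")
```

In `tp1@spout@Task0@@Emitted` the field between the third and fourth separators is empty, so the name is recognised as a Task-level name without any lookup in a separate hierarchy table.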
Of course, the separator in the embodiment of the present application may also be another symbol; the embodiment of the present application does not limit this.
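The rule of sub-step A111 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `parent_name` and `ancestor_names` are hypothetical and do not come from the application, and the separator is a parameter because the text allows symbols other than @.

```python
from typing import List, Optional


def parent_name(name: str, sep: str = "@") -> Optional[str]:
    """Return the structured indicator name one level up, or None at the top level."""
    fields = name.split(sep)
    # fields[0] is the topmost node and fields[-1] is the indicator identifier;
    # both are always kept. Scan the node fields from the lowest level upwards
    # and blank the first one that is still filled in.
    for i in range(len(fields) - 2, 0, -1):
        if fields[i]:
            fields[i] = ""
            return sep.join(fields)
    return None


def ancestor_names(name: str, sep: str = "@") -> List[str]:
    """List all upper-level names, from the nearest level up to the topmost."""
    result = []
    current = parent_name(name, sep)
    while current is not None:
        result.append(current)
        current = parent_name(current, sep)
    return result


if __name__ == "__main__":
    for upper in ancestor_names("tp1@spout@Task0@Stream0@Emitted"):
        print(upper)
```

Because the separators are retained, the level of a name can be read off from the position of its empty fields, without any additional lookup table.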
Preferably, in another preferred embodiment of the present application, before step 110, the method further includes:
B11: register with the system the indicator identifier corresponding to the lowest-level node.
In practice, when the scheduling server assigns the topology to the computing nodes for execution, the worker executed on each computing node can, as required, register with the system of that computing node the indicator identifier corresponding to the lowest-level node, for example the indicator identifier Emitted for the level nodes of the Stream level. After the indicator identifier of the lowest-level node has been registered, the computing node can automatically generate the structured indicator names of the level nodes of every level according to the aforementioned structure definition of the structured indicator name. For example, for Stream0 of the stream level, step A11 generates tp1@spout@Task0@Stream0@Emitted as described above, and step A12 then generates the names of the upper levels, until the structured indicator names of the level nodes of all levels have been generated.
With this approach, the computing node only needs to be notified to register with the system the indicator identifier corresponding to the lowest-level node. A simple notification to the computing node suffices, and the structured indicator data of the level nodes of every level need not be transmitted to the computing node, which reduces transmission overhead.
In the embodiment of the present invention, the structure definition of the structured indicator name may be configured in the scheduling server of the real-time computing system and then transmitted by the scheduling server to each computing node. A technician can change the structure definition configured in the scheduling server as needed, thereby changing the hierarchy and, correspondingly, the structured indicator names of the level nodes.
For example, if an indicator group (group) needs to be added above the indicator identifier in the normal hierarchy, the structure of the structured indicator name can be defined as:
topology@component@Task@Stream@group@name
Taking Emitted as an example, the group can be used to group the Emitted statistics according to the requirements of different services. For example, service A requires the value of the structured indicator to be incremented by 1 for every 10 messages sent, while service B requires it to be incremented by 1 for every message sent. The values of the corresponding structured indicators therefore differ between groups.
It can be understood that the above structure definition of the structured indicator name can be updated on the scheduling server and then distributed by the scheduling server to each computing node. It can of course also be used together with the original structured indicator definition.
Of course, the embodiment of the present invention may also change the above structure definition of the structured indicator name according to actual needs; the present application does not limit this.
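A configurable structure definition of this kind might be modeled as an ordered list of field names. The sketch below is an assumption made for illustration: the constants, the `build_name` helper, and the group value `bizA` are hypothetical and not part of the application.

```python
from typing import Dict, List

# Two structure definitions: the original one and the one extended with a group.
DEFAULT_STRUCTURE: List[str] = ["topology", "component", "Task", "Stream", "name"]
GROUPED_STRUCTURE: List[str] = ["topology", "component", "Task", "Stream", "group", "name"]


def build_name(structure: List[str], values: Dict[str, str], sep: str = "@") -> str:
    """Join the field values in the order given by the structure definition.

    A field that is absent from `values` is emitted as an empty string, which
    is how upper-level names are written in the text.
    """
    return sep.join(values.get(field, "") for field in structure)


if __name__ == "__main__":
    values = {
        "topology": "tp1",
        "component": "spout",
        "Task": "Task0",
        "Stream": "Stream0",
        "group": "bizA",  # hypothetical group identifier
        "name": "Emitted",
    }
    print(build_name(DEFAULT_STRUCTURE, values))
    print(build_name(GROUPED_STRUCTURE, values))
```

Changing the hierarchy then amounts to distributing a different list; the code that builds names does not change.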
Step 120: when lowest-level data corresponding to the indicator is detected, count it under the corresponding lowest-level structured indicator name;
In the embodiment of the present invention, taking the jstorm real-time computing system as an example, when a topology is created in jstorm, the scheduling system of jstorm can divide the topology into multiple workers, each worker representing a process that performs specific tasks. The workers are distributed over different computing nodes of the jstorm computing cluster and execute in parallel; all actual data processing is ultimately carried out in the workers. Each computing node therefore processes the topology according to the hierarchy, so in the embodiment of the present application every computing node obtains the structured indicator names of the level nodes of each level corresponding to the indicator, and then counts the indicator under each structured indicator name.
In practice, each worker can run at least one spout and/or at least one bolt. Within a worker, a spout or bolt is divided among tasks for execution, and each task processes data in the form of streams.
The embodiment of the present application then monitors the indicator-related data appearing in the stream. For example, for the number of emitted messages (Emitted), when the stream-level node Stream0 is observed to pass one Tuple (the basic unit of message passing), the value of the structured indicator name tp1@spout@Task0@Stream0@Emitted corresponding to Stream0 is updated to 1.
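Counting at the lowest level, as described for step 120, could look roughly like the following. The class `LeafCounter` and its method `on_tuple` are hypothetical names chosen for this sketch; they are not JStorm APIs, and the real system would hook into the stream itself.

```python
from collections import Counter


class LeafCounter:
    """Counts events only under lowest-level structured indicator names."""

    def __init__(self, topology: str, component: str, sep: str = "@") -> None:
        self.topology = topology
        self.component = component
        self.sep = sep
        self.counts: Counter = Counter()

    def on_tuple(self, task: str, stream: str, indicator: str = "Emitted") -> str:
        """Record one Tuple passed on the given stream and return the name used."""
        name = self.sep.join([self.topology, self.component, task, stream, indicator])
        self.counts[name] += 1
        return name


if __name__ == "__main__":
    counter = LeafCounter("tp1", "spout")
    counter.on_tuple("Task0", "Stream0")
    print(dict(counter.counts))
```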
It can be understood that, in the embodiment of the present application, the system can be instructed to monitor only the indicators of the lowest-level nodes; the levels above the lowest need not be monitored.
Step 130: based on the statistics under the lowest-level structured indicator names, and according to the hierarchical relationship between the structured indicator names, aggregate the statistics level by level into the structured indicator names of the next level up.
Assume this is the first record and the initial values of the structured indicator names of the levels above the lowest are all 0. Then, based on the above record tp1@spout@Task0@Stream0@Emitted:1,
the structured names of the upper levels can be found from the structure of tp1@spout@Task0@Stream0@Emitted, namely tp1@spout@Task0@@Emitted, tp1@spout@@@Emitted and tp1@@@@Emitted, and the values of these three are updated to 1.
Assume further that, in step 120, the stream-level node Stream0 is observed to pass another Tuple (the basic unit of message passing); the value of the structured indicator name tp1@spout@Task0@Stream0@Emitted corresponding to Stream0 is then updated to 2.
In step 130, the values are then aggregated upwards level by level according to the hierarchical relationship. The aggregation order and results are as follows:
tp1@spout@Task0@@Emitted:2
tp1@spout@@@Emitted:2
tp1@@@@Emitted:2
Assume further that, in step 120, the stream-level node Stream3 is observed to pass one Tuple (the basic unit of message passing); the value of the structured indicator name tp1@spout@Task1@Stream3@Emitted corresponding to Stream3 is then updated to 1.
In step 130, the values are then aggregated upwards level by level according to the hierarchical relationship. The aggregation order and results are as follows:
tp1@spout@Task1@@Emitted:1
tp1@spout@@@Emitted:3
tp1@@@@Emitted:3
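The level-by-level aggregation of step 130 can be reproduced with a short batch computation over the lowest-level counts. This is a sketch under the assumption that @ is the separator and that the first and last fields (topology and indicator identifier) are never blanked; the function name `roll_up` is hypothetical.

```python
from collections import Counter
from typing import Mapping


def roll_up(leaf_counts: Mapping[str, int], sep: str = "@") -> Counter:
    """Return the counts for every level, derived from lowest-level counts only."""
    totals: Counter = Counter(leaf_counts)
    for name, value in leaf_counts.items():
        fields = name.split(sep)
        # Blank the node fields one at a time, from the lowest level upwards.
        # After each blanking, the joined name is the next level up, and the
        # leaf value is added to it.
        for i in range(len(fields) - 2, 0, -1):
            fields[i] = ""
            totals[sep.join(fields)] += value
    return totals


if __name__ == "__main__":
    # State after the three Tuples described above: two on Task0/Stream0,
    # one on Task1/Stream3.
    leaves = {
        "tp1@spout@Task0@Stream0@Emitted": 2,
        "tp1@spout@Task1@Stream3@Emitted": 1,
    }
    for name, value in roll_up(leaves).items():
        print(f"{name}:{value}")
```

For the example state this yields 2 for tp1@spout@Task0@@Emitted, 1 for tp1@spout@Task1@@Emitted, and 3 for both tp1@spout@@@Emitted and tp1@@@@Emitted, matching the values listed in the text.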
In the embodiment of the present application, each computing node of the real-time computing system keeps its own statistics under the structured indicator names of the level nodes. To obtain overall statistics for every level of the entire real-time computing system, the embodiment of the present application therefore aggregates the records under the structured indicator names of the individual computing nodes.
For example, with two computing nodes 1 and 2, the records of computing node 1 are:
tp1@spout@Task0@Stream0@Emitted:10
tp1@spout@Task1@Stream3@Emitted:10
tp1@spout@Task0@@Emitted:10
tp1@spout@Task1@@Emitted:10
tp1@spout@@@Emitted:20
tp1@@@@Emitted:20
and the records of computing node 2 are:
tp1@spout@Task0@Stream1@Emitted:20
tp1@spout@Task1@Stream3@Emitted:10
tp1@spout@Task0@@Emitted:20
tp1@spout@Task1@@Emitted:10
tp1@spout@@@Emitted:30
tp1@@@@Emitted:30
Aggregation then gives the following Emitted statistics for every level of tp1 across the entire real-time computing system:
tp1@spout@Task0@Stream0@Emitted:10
tp1@spout@Task0@Stream1@Emitted:20
tp1@spout@Task1@Stream3@Emitted:20
tp1@spout@Task0@@Emitted:30
tp1@spout@Task1@@Emitted:20
tp1@spout@@@Emitted:50
tp1@@@@Emitted:50
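Since every computing node uses the same structured names, cross-node aggregation reduces to adding values with equal names. The following sketch replays the two-node example; `merge_reports` is a hypothetical helper name, and the use of `collections.Counter` is a choice made for this illustration.

```python
from collections import Counter
from typing import Iterable, Mapping


def merge_reports(reports: Iterable[Mapping[str, int]]) -> Counter:
    """Add up the values reported under identical structured indicator names."""
    merged: Counter = Counter()
    for report in reports:
        merged.update(report)  # Counter.update adds counts instead of replacing
    return merged


NODE_1 = {
    "tp1@spout@Task0@Stream0@Emitted": 10,
    "tp1@spout@Task1@Stream3@Emitted": 10,
    "tp1@spout@Task0@@Emitted": 10,
    "tp1@spout@Task1@@Emitted": 10,
    "tp1@spout@@@Emitted": 20,
    "tp1@@@@Emitted": 20,
}
NODE_2 = {
    "tp1@spout@Task0@Stream1@Emitted": 20,
    "tp1@spout@Task1@Stream3@Emitted": 10,
    "tp1@spout@Task0@@Emitted": 20,
    "tp1@spout@Task1@@Emitted": 10,
    "tp1@spout@@@Emitted": 30,
    "tp1@@@@Emitted": 30,
}

if __name__ == "__main__":
    for name, value in merge_reports([NODE_1, NODE_2]).items():
        print(f"{name}:{value}")
```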
In practice, the real-time computing system counts the data of each structured indicator within one minute and can then continuously output the statistics for each minute, for example in the form of logs.
Preferably, in another embodiment of the present application, after step 130, the method further includes:
Step 140: export the statistics under each structured indicator name to a database for storage.
In the embodiment of the present application, the real-time computing system does not provide database functionality, so its statistical results are inconvenient to query.
Moreover, the structured form of the structured indicator names is well suited to processing by many big data tools and frameworks, such as HBase, Hadoop and Hive. The present application can therefore export the statistics under each structured indicator name to a data store such as HBase, Hadoop or Hive.
Preferably, in another preferred embodiment of the present application, step 140 includes:
Sub-step C11: export the statistics under each structured indicator name to the database, and store them with the structured indicator name and a timestamp as the key and the statistics as the value.
In practice, the real-time computing system counts the data of each structured indicator within one time period, for example 1 minute. When the time period ends, the record under the structured indicator name is reset and recording starts again. Each structured indicator name therefore has a timestamp at the end of the time period. At the end of the time period, the embodiment of the present application stores the structured indicator name and its statistics in a database, for example an HBase database, with the structured indicator name and the timestamp as the key and the statistics as the value. The indicator values of every level over a span of time can then be looked up conveniently by timestamp. The timestamp is the system time at the end of each time period.
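The key/value layout of sub-step C11 can be illustrated with an in-memory dictionary standing in for the database. This is not HBase client code: `store_period`, `query` and the tuple key `(name, timestamp)` are assumptions made for the sketch, and a real store would need its own key encoding.

```python
from typing import Dict, List, Mapping, Tuple

Key = Tuple[str, int]  # (structured indicator name, period-end timestamp)


def store_period(db: Dict[Key, int], period_end: int, stats: Mapping[str, int]) -> None:
    """Store one period's statistics, keyed by name and period-end timestamp."""
    for name, value in stats.items():
        db[(name, period_end)] = value


def query(db: Mapping[Key, int], name: str, start: int, end: int) -> List[Tuple[int, int]]:
    """Return (timestamp, value) pairs for one name within [start, end], oldest first."""
    return sorted(
        (ts, value)
        for (stored_name, ts), value in db.items()
        if stored_name == name and start <= ts <= end
    )


if __name__ == "__main__":
    db: Dict[Key, int] = {}
    store_period(db, 60, {"tp1@@@@Emitted": 50, "tp1@spout@@@Emitted": 50})
    store_period(db, 120, {"tp1@@@@Emitted": 70, "tp1@spout@@@Emitted": 70})
    print(query(db, "tp1@@@@Emitted", 0, 120))
```

Including the timestamp in the key means that a later period never overwrites an earlier one, which is what makes the history queryable.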
To further illustrate the advantages of the embodiment of the present application, take a jstorm topology as an example. The level identifier of the topology is tp1; tp1 has one component, whose level identifier is spout; the component has 5 tasks, each with a corresponding id, whose level identifiers are Task0 to Task4; and each task has 2 streams, whose level identifiers are Stream0 and Stream1. The hierarchical relationship is then:
Stream[0~1]→Task[0~4]→spout→tp1
In the conventional approach, counting the messages of spout requires defining an indicator named SpoutEmitted and updating its value every time a message is sent; counting the messages of task0 requires defining an indicator named Task0Emitted and updating it in the same way; counting the messages of stream0 in task0 requires defining an indicator named Stream0Emitted; and so on. However, SpoutEmitted is in fact hierarchically related to Task0Emitted through Task4Emitted: SpoutEmitted = Task0Emitted + Task1Emitted + Task2Emitted + Task3Emitted + Task4Emitted. Similarly, Task0Emitted = Stream0Emitted + Stream1Emitted. The conventional indicator statistics method reflects neither this hierarchical relationship nor this calculation logic; implementing such layered logic requires a great deal of additional, complex logical judgment and calculation. In addition, the conventional method requires indicator names to be chosen carefully, because duplicate names would make the data inaccurate.
In the embodiment of the present application, by contrast, the real-time computing system has a hierarchical structure when processing data. So that the real-time computing system can quickly compute statistics for the indicators of all levels, the embodiment of the present application creates a structured indicator name for the level node of each level of an indicator, and the structured indicator names themselves determine the hierarchical relationship between the level nodes and thus the aggregation relationship. Only the data under the lowest-level structured indicator names needs to be counted; it can then be aggregated level by level according to the hierarchical relationship between the structured indicator names to obtain the data of every level for the indicator. The embodiment of the present application can therefore compute the indicators of every level simply through the hierarchical relationship of the structured indicator names. The logic is simple and system consumption can be reduced. Moreover, because the hierarchical relationship is built from the structured indicator names, one or more levels can easily be added or removed, which makes the scheme easy to extend.
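The way the name itself encodes the aggregation relationship can be made concrete by treating an empty field as a wildcard. The sketch below builds the example topology (5 tasks with 2 streams each) with made-up counts and checks that the spout total equals the sum of the task totals; `covers` and `total_under` are hypothetical helper names, and the counts are illustrative only.

```python
from typing import Mapping


def covers(upper: str, leaf: str, sep: str = "@") -> bool:
    """True if `upper` aggregates `leaf`: every non-empty field must match."""
    upper_fields, leaf_fields = upper.split(sep), leaf.split(sep)
    return len(upper_fields) == len(leaf_fields) and all(
        u == "" or u == l for u, l in zip(upper_fields, leaf_fields)
    )


def total_under(leaf_counts: Mapping[str, int], upper: str, sep: str = "@") -> int:
    """Sum the lowest-level values that fall under an upper-level name."""
    return sum(v for name, v in leaf_counts.items() if covers(upper, name, sep))


# Illustrative lowest-level counts: stream s of task t has emitted t + s + 1 messages.
LEAVES = {
    f"tp1@spout@Task{t}@Stream{s}@Emitted": t + s + 1
    for t in range(5)
    for s in range(2)
}

if __name__ == "__main__":
    spout_total = total_under(LEAVES, "tp1@spout@@@Emitted")
    task_totals = [total_under(LEAVES, f"tp1@spout@Task{t}@@Emitted") for t in range(5)]
    # The relation SpoutEmitted = Task0Emitted + ... + Task4Emitted holds by construction.
    print(spout_total, task_totals, spout_total == sum(task_totals))
```

No separate SpoutEmitted or Task0Emitted counters have to be named or maintained by hand; the relation between them follows from the shared name structure.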
Embodiment 2
Referring to FIG. 2, a flowchart of the steps of a preferred embodiment of an indicator statistics method of the present application is shown. The method may specifically include the following steps:
Step 210: each computing node registers with its own system the indicator identifier corresponding to the lowest-level node.
In the embodiment of the present application, the real-time computing system may be a distributed computing system that includes a scheduling server and computing nodes.
The structure definition of the structured indicator name may be configured in the scheduling server of the real-time computing system and then distributed by the scheduling server to each computing node, so that each computing node can process the lowest-level structured indicator names according to that definition.
In the embodiment of the present application, taking Jstorm as an example, when the scheduling server assigns the topology to the computing nodes for execution, the worker executed on each computing node can, as required, register with the system of that computing node the indicator identifier corresponding to the lowest-level node.
The indicators of a topology can in fact be divided into two parts: 1. system indicators already defined inside the Jstorm computing framework; 2. user-defined, business-related indicators.
If the user selects a system indicator, it can be registered in the system of the computing node when the worker is initialized. If the user selects a user-defined business indicator, it can be registered in the system of the computing node when the worker initializes the user code.
Indicators are registered on each computing node. Taking jstorm as an example, for a stream, a global static method provided by jstorm inside the worker is called: registerStreamMetrics(metric-related parameters). The system then registers the indicator identifier of the stream internally according to the metric-related parameters. The method then proceeds to step 220 to generate the structured indicator names of each level, level by level.
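The effect of such a registration can be sketched as follows. The text names registerStreamMetrics but does not give its parameters, so the sketch does not attempt to reproduce it: `MetricRegistry` and `register_stream_metric` are hypothetical Python stand-ins that only illustrate how one lowest-level registration can yield the names of all levels.

```python
from typing import Dict, List


class MetricRegistry:
    """Hypothetical per-node registry; not the JStorm API."""

    def __init__(self, topology: str, component: str, sep: str = "@") -> None:
        self.topology = topology
        self.component = component
        self.sep = sep
        self.values: Dict[str, int] = {}  # structured indicator name -> count

    def register_stream_metric(self, task: str, stream: str, indicator: str) -> List[str]:
        """Register a stream-level indicator and create the names of all levels."""
        fields = [self.topology, self.component, task, stream, indicator]
        names = [self.sep.join(fields)]  # lowest-level name
        # Upper-level names: blank the node fields from the lowest level upwards.
        for i in range(len(fields) - 2, 0, -1):
            fields[i] = ""
            names.append(self.sep.join(fields))
        for name in names:
            self.values.setdefault(name, 0)  # keep counts of already known names
        return names


if __name__ == "__main__":
    registry = MetricRegistry("tp1", "spout")
    print(registry.register_stream_metric("Task0", "Stream0", "Emitted"))
```

Registering a second stream under the same task adds only one new name, because the upper-level names are shared.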
Step 220: for an indicator of a topology job, each computing node uses a separator to concatenate, in order, the node identifiers of the level nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name;
After a computing node has registered the lowest-level indicator identifier, it can generate the lowest-level structured indicator name according to the structure of the structured indicator name and the parent-child relationships of the level nodes recorded in the system.
Step 230: starting from the lowest-level structured indicator name, each computing node, for the structured indicator name of each level, sets the level node of the current level in that name to empty, obtaining the structured indicator name of the next level up.
Step 240: when a computing node detects lowest-level data corresponding to the indicator, it counts the data under the corresponding lowest-level structured indicator name;
Step 250: based on the statistics under the lowest-level structured indicator names, and according to the hierarchical relationship between the structured indicator names, each computing node aggregates the statistics level by level into the structured indicator names of the next level up.
Each computing node counts the data under each structured indicator name per time period, for example with a period of 1 minute. At the end of the time period, it sends the statistics of each structured indicator name for that period to the scheduling server.
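Per-period counting with a reset at the end of each period might be structured as below. The class `PeriodStats` is hypothetical, and time is passed in explicitly (as seconds) to keep the sketch deterministic; a real node would read the system clock and send the returned report to the scheduling server.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


class PeriodStats:
    """Accumulates counts for one time period and hands them over when it ends."""

    def __init__(self, start: int, period: int = 60) -> None:
        self.period = period
        self.period_end = start + period
        self.counts: Counter = Counter()

    def add(self, name: str, amount: int = 1) -> None:
        self.counts[name] += amount

    def flush_if_due(self, now: int) -> Optional[Tuple[int, Dict[str, int]]]:
        """Return (period-end timestamp, snapshot) once the period is over, else None."""
        if now < self.period_end:
            return None
        report = (self.period_end, dict(self.counts))
        self.counts.clear()  # the records are reset and counting starts again
        while self.period_end <= now:
            self.period_end += self.period
        return report


if __name__ == "__main__":
    stats = PeriodStats(start=0)
    stats.add("tp1@spout@Task0@Stream0@Emitted")
    print(stats.flush_if_due(30))  # still inside the first period
    print(stats.flush_if_due(60))  # first period is over
```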
Step 260: the scheduling server obtains the statistics of each structured indicator name from each computing node and aggregates them.
Once the scheduling server has obtained the statistics of each structured indicator name from each computing node, it can aggregate them.
Step 270: the scheduling server exports the aggregated statistics under each structured indicator name to a database, where they are stored with the structured indicator name and a timestamp as the key and the statistics as the value.
In practice, the indicator statistics of every level from each computing node can be aggregated on the scheduling server of the cluster. Each computing node sends its indicator statistics to the scheduling server once per time period, for example every minute. Because the scheduling server does not act as a storage server, the data is continually overwritten by new indicator statistics, so only the aggregated indicator statistics of the most recent time period can be seen.
Therefore, to make the indicator statistics of a longer time span, or even of the entire history, visible, the scheduling server of the embodiment of the present application stores the aggregated indicator statistics in an external database. For example, the aggregated statistics under each structured indicator name are stored in the database with structured indicator name + timestamp as the key and the statistics as the value.
The embodiment of the present application has the following advantages:
1. Because the present application combines, along the node path from the topmost level to the lowest level, the level identifiers of the level nodes on that path and the indicator identifier with separators into structured indicator names, the indicators of every level can be counted simply through the hierarchical relationship of the structured indicator names. The logic is simple and system consumption can be reduced.
2. Because of the structured form of the structured indicator names described above, and because the identifier of every topology differs and the node identifiers of the level nodes also differ, users do not need to choose indicator names carefully when defining them, which reduces the chance of error.
3. Because the hierarchical relationship is built from the structured indicator names, one or more levels can easily be added or removed, which makes the scheme easy to extend.
4. The present application only needs to register with the system of the computing node the indicator identifier corresponding to the lowest-level node; the structured indicator names of the level nodes of every level are then generated automatically. The transmission overhead is small and the operation is simple.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions involved are not necessarily required by the embodiments of the present application.
Embodiment 3
Referring to FIG. 3, a structural block diagram of an embodiment of an indicator statistics device of the present application is shown. The device may specifically include the following modules:
a structured identifier creation module 310, configured to create, for an indicator of a topology job, structured indicator names for the level nodes of each level corresponding to the indicator, wherein the hierarchical relationship between the level nodes is determined by the structured indicator names;
a lowest-level indicator monitoring module 320, configured to count, when lowest-level data corresponding to the indicator is detected, the data under the corresponding lowest-level structured indicator name;
a level-by-level aggregation module 330, configured to aggregate, based on the statistics under the lowest-level structured indicator names and according to the hierarchical relationship between the structured indicator names, the statistics level by level into the structured indicator names of the next level up.
In another preferred embodiment of the present application, the structured identifier creation module 310 includes:
a lowest-level indicator creation submodule, configured to combine in order, for an indicator of a topology job, the node identifiers from the topmost level node to the lowest level node and the indicator identifier of the indicator into the lowest-level structured indicator name;
an upper-level indicator creation submodule, configured to set, starting from the lowest-level structured indicator name and for the structured indicator name of each level, the level node of the current level in that name to empty, obtaining the structured indicator name of the next level up.
In another preferred embodiment of the present application, the lowest-level indicator creation submodule includes:
a lowest-level indicator separator creation submodule, configured to use a separator to concatenate in order, for an indicator of a topology job, the node identifiers of the level nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name.
In another preferred embodiment of the present application, the upper-level indicator creation submodule includes:
an upper-level indicator separator creation submodule, configured to set, starting from the lowest-level structured indicator name and for the structured indicator name of each level, the level node corresponding to the separator of the current level in that name to empty, obtaining the structured indicator name of the next level up.
In another preferred embodiment of the present application, the device further includes, before the structured identifier creation module 310:
a registration module, configured to register with the system the indicator identifier corresponding to the lowest-level node.
In another preferred embodiment of the present application, the device further includes, after the level-by-level aggregation module 330:
a data storage module, configured to export the statistics under each structured indicator name to a database for storage.
In another preferred embodiment of the present application, the data storage module includes:
a data storage submodule, configured to export the statistics under each structured indicator name to the database and to store them with the structured indicator name and a timestamp as the key and the statistics as the value.
In the embodiment of the present application, the real-time computing system has a hierarchical structure when processing data. So that the real-time computing system can quickly compute statistics for the indicators of all levels, the embodiment of the present application creates a structured indicator name for the level node of each level of an indicator, and the structured indicator names themselves determine the hierarchical relationship between the level nodes and thus the aggregation relationship. Only the data under the lowest-level structured indicator names needs to be counted; it can then be aggregated level by level according to the hierarchical relationship between the structured indicator names to obtain the data of every level for the indicator. The embodiment of the present application can therefore compute the indicators of every level simply through the hierarchical relationship of the structured indicator names. The logic is simple and system consumption can be reduced. Moreover, because the hierarchical relationship is built from the structured indicator names, one or more levels can easily be added or removed, which makes the scheme easy to extend.
Embodiment 4
Referring to FIG. 4, a structural block diagram of an embodiment of an indicator statistics system of the present application is shown. The system may specifically include:
A scheduling server 410, computing nodes 420, and a database 430.
FIG. 4 shows the computing nodes 420 by way of example; in practice, the computing nodes can be set up according to the needs of the cluster. Each computing node includes a registration module 421, a lowest-level indicator separator creation module 422, an upper-level indicator separator creation module 423, a lowest-level indicator monitoring module 424, and a layer-by-layer aggregation module 425. The scheduling server includes an aggregation module 411 and a data storage sub-module 412. Each computing node may of course include other modules as required, which the embodiments of the present application do not limit.
The scheduling server 410 includes:
An aggregation module 411, configured to obtain the statistical data for each structured indicator name from each computing node and aggregate it.
A data storage sub-module 412, configured to export the statistical data under each structured indicator name to the database 430, where it is stored with the structured indicator name and a timestamp as the key and the statistical data as the value.
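The two scheduling-server modules can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the source says "database" without naming one, so SQLite stands in here; the function names `merge_node_stats` and `store`, the table layout, the `@` separator in the sample names, and the use of summation to combine per-node values are all hypothetical.

```python
import sqlite3
from collections import Counter


def merge_node_stats(per_node_stats):
    """Combine the statistics reported by each computing node, name by name.

    Assumption: values are counts, so combining them means summing them.
    """
    merged = Counter()
    for stats in per_node_stats:
        merged.update(stats)  # Counter.update adds values for matching keys
    return dict(merged)


def store(conn, stats, timestamp):
    """Write one row per structured name, keyed by (name, timestamp)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metric ("
        "name TEXT, ts INTEGER, value INTEGER, PRIMARY KEY (name, ts))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO metric (name, ts, value) VALUES (?, ?, ?)",
        [(name, timestamp, value) for name, value in stats.items()],
    )
    conn.commit()
```

A caller would pass one dictionary per computing node to `merge_node_stats` and then call `store` with the merged result and the timestamp of the reporting interval. Using the name and timestamp together as the key means each interval gets its own row per structured name.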
Each computing node 420 includes:
A registration module 421, configured to register, with the system of the computing node, the indicator identifier corresponding to the lowest-level hierarchical nodes.
A lowest-level indicator separator creation module 422, configured to, for an indicator of a topology job, use a separator to concatenate, in order, the node identifiers of the hierarchical nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name.
An upper-level indicator separator creation module 423, configured to, starting from the lowest-level structured indicator name, take the structured indicator name of each level and set the hierarchical node corresponding to the separator of the current level in that name to empty, obtaining the structured indicator name of the next level up.
A lowest-level indicator monitoring module 424, configured to, when lowest-level data corresponding to the indicator is observed, count it under the corresponding lowest-level structured indicator name.
A layer-by-layer aggregation module 425, configured to aggregate the statistical data under the lowest-level structured indicator names, level by level, into the structured indicator names of the next level up, according to the hierarchical relationship between the structured indicator names.
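The naming and roll-up performed by modules 422, 423, and 425 can be sketched in Python as follows. This is a hedged illustration, not the applicant's implementation: the `@` separator, the function names, the sample identifiers, and the choice of summation as the aggregation are assumptions, and the sketch assumes that identifiers never contain the separator.

```python
from collections import defaultdict

SEP = "@"  # hypothetical separator; this section does not fix a particular character


def lowest_level_name(node_ids, metric_id, sep=SEP):
    """Join node identifiers (topmost level first) and the indicator identifier."""
    return sep.join([*node_ids, metric_id])


def parent_name(name, level, sep=SEP):
    """Blank the node at position `level` to obtain the name one level up."""
    parts = name.split(sep)
    parts[level] = ""
    return sep.join(parts)


def roll_up(lowest_stats, depth, sep=SEP):
    """Sum lowest-level counts upward, one level at a time.

    `depth` is the number of node levels in a name (the indicator identifier
    is the extra, final field). The topmost node (index 0) is never blanked.
    """
    all_stats = dict(lowest_stats)
    current = dict(lowest_stats)
    for level in range(depth - 1, 0, -1):
        parent = defaultdict(int)
        for name, value in current.items():
            parent[parent_name(name, level, sep)] += value
        all_stats.update(parent)
        current = parent
    return all_stats
```

With three node levels (say topology, component, task), `lowest_level_name(["topo1", "compA", "task1"], "emitted")` gives `topo1@compA@task1@emitted`. Blanking the task field gives the component-level name `topo1@compA@@emitted`, and blanking the component field as well gives the topology-level name `topo1@@@emitted`. Because the parent of a name is derived from the name itself, no separate table of parent-child relationships is needed.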
The embodiments of the present application have the following advantages:
1. A structured indicator name is formed from the path of hierarchical nodes from the topmost level to the lowest level: the identifiers of the nodes on that path and the indicator identifier are combined using separators. The hierarchical relationship of the structured indicator names makes it simple to compute statistics for the indicator at every level; the logic is simple and system overhead is reduced.
2. Because of the structured form of the names, and because each topology has a different identifier and the hierarchical nodes at each level also differ, users do not need to choose indicator names with particular care when defining them, which reduces the chance of error.
3. Because the hierarchy is built from the structured indicator names, one or more levels can easily be added or removed, which makes the scheme easy to extend.
4. Only the indicator identifier corresponding to the lowest-level hierarchical nodes needs to be registered with the system of the computing node; the structured indicator names for the hierarchical nodes at every level are then generated automatically. Transmission overhead is small and operation is simple.
Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments; for the parts that are the same or similar, the embodiments may be referred to one another.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a device, or a computer program product. Accordingly, the embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
In a typical configuration, the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include non-permanent memory in a computer-readable medium, in the form of random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium. Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, such that a series of operational steps is performed on the computer or other programmable terminal device to produce computer-implemented processing, and the instructions executed on the computer or other programmable terminal device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The indicator statistics method, indicator statistics device, and indicator statistics system provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (14)

  1. An indicator statistics method, characterized by comprising:
    for an indicator of a topology job, creating a structured indicator name for each hierarchical node corresponding to the indicator, wherein the hierarchical relationship between the hierarchical nodes is determined by the structured indicator names;
    when lowest-level data corresponding to the indicator is observed, performing statistics under the corresponding lowest-level structured indicator name;
    based on the statistical data under the lowest-level structured indicator names, aggregating level by level into the structured indicator names of the next level up, according to the hierarchical relationship between the structured indicator names.
  2. The method according to claim 1, characterized in that the step of creating, for an indicator of a topology job, a structured indicator name for each hierarchical node corresponding to the indicator comprises:
    for an indicator of a topology job, combining, in order, the node identifiers of the hierarchical nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name;
    starting from the lowest-level structured indicator name, for the structured indicator name of each level, setting the hierarchical node of the current level in the structured indicator name to empty, to obtain the structured indicator name of the next level up.
  3. The method according to claim 2, characterized in that the step of combining, in order, the node identifiers of the hierarchical nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name comprises:
    for an indicator of a topology job, using a separator to concatenate, in order, the node identifiers of the hierarchical nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name.
  4. The method according to claim 3, characterized in that the step of setting, starting from the lowest-level structured indicator name and for the structured indicator name of each level, the hierarchical node of the current level in the structured indicator name to empty, to obtain the structured indicator name of the next level up, comprises:
    starting from the lowest-level structured indicator name, for the structured indicator name of each level, setting the hierarchical node corresponding to the separator of the current level in the structured indicator name to empty, to obtain the structured indicator name of the next level up.
  5. The method according to any one of claims 2 to 4, characterized in that, before the creating, for an indicator of a topology job, of a structured indicator name for each hierarchical node corresponding to the indicator, the method further comprises:
    registering, with the system, the indicator identifier corresponding to the lowest-level hierarchical nodes.
  6. The method according to any one of claims 1 to 4, characterized by further comprising:
    exporting the statistical data under each structured indicator name to a database for storage.
  7. The method according to claim 6, characterized in that the step of exporting the statistical data under each structured indicator name to a database for storage comprises:
    exporting the statistical data under each structured indicator name to the database, and storing it with the structured indicator name and a timestamp as the key and the statistical data as the value.
  8. An indicator statistics device, characterized by comprising:
    a structured identifier creation module, configured to, for an indicator of a topology job, create a structured indicator name for each hierarchical node corresponding to the indicator, wherein the hierarchical relationship between the hierarchical nodes is determined by the structured indicator names;
    a lowest-level indicator monitoring module, configured to, when lowest-level data corresponding to the indicator is observed, perform statistics under the corresponding lowest-level structured indicator name;
    a layer-by-layer aggregation module, configured to aggregate the statistical data under the lowest-level structured indicator names, level by level, into the structured indicator names of the next level up, according to the hierarchical relationship between the structured indicator names.
  9. The device according to claim 8, characterized in that the structured identifier creation module comprises:
    a lowest-level indicator creation sub-module, configured to, for an indicator of a topology job, combine, in order, the node identifiers of the hierarchical nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name;
    an upper-level indicator creation sub-module, configured to, starting from the lowest-level structured indicator name, for the structured indicator name of each level, set the hierarchical node of the current level in the structured indicator name to empty, to obtain the structured indicator name of the next level up.
  10. The device according to claim 9, characterized in that the lowest-level indicator creation sub-module comprises:
    a lowest-level indicator separator creation sub-module, configured to, for an indicator of a topology job, use a separator to concatenate, in order, the node identifiers of the hierarchical nodes from the topmost level to the lowest level and the indicator identifier of the indicator into the lowest-level structured indicator name.
  11. The device according to claim 10, characterized in that the upper-level indicator creation sub-module comprises:
    an upper-level indicator separator creation sub-module, configured to, starting from the lowest-level structured indicator name, for the structured indicator name of each level, set the hierarchical node corresponding to the separator of the current level in the structured indicator name to empty, to obtain the structured indicator name of the next level up.
  12. The device according to any one of claims 9 to 11, characterized by further comprising, before the structured identifier creation module:
    a registration module, configured to register, with the system, the indicator identifier corresponding to the lowest-level hierarchical nodes.
  13. The device according to any one of claims 8 to 11, characterized by further comprising:
    a data storage module, configured to export the statistical data under each structured indicator name to a database for storage.
  14. The device according to claim 13, characterized in that the data storage module comprises:
    a data storage sub-module, configured to export the statistical data under each structured indicator name to the database, and store it with the structured indicator name and a timestamp as the key and the statistical data as the value.
PCT/CN2016/107017 2015-12-04 2016-11-24 Pointer counting method and device WO2017092600A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510886186.XA CN106846021A (en) 2015-12-04 2015-12-04 A kind of indicator-specific statistics method and apparatus
CN201510886186.X 2015-12-04

Publications (1)

Publication Number Publication Date
WO2017092600A1 true WO2017092600A1 (en) 2017-06-08

Family

ID=58796252

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/107017 WO2017092600A1 (en) 2015-12-04 2016-11-24 Pointer counting method and device

Country Status (2)

Country Link
CN (1) CN106846021A (en)
WO (1) WO2017092600A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107832267A (en) * 2017-11-06 2018-03-23 中国银行股份有限公司 A kind of statistics method of summary and device
CN109963293A (en) * 2017-12-25 2019-07-02 中国移动通信集团上海有限公司 A kind of performance indicator optimization method and device
CN113177725A (en) * 2021-05-18 2021-07-27 浙江捷创智能技术有限公司 Energy statistical method capable of configuring tree structure
CN115965296A (en) * 2023-03-17 2023-04-14 建信金融科技有限责任公司 Assessment data processing method, device, equipment, product and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060285543A1 (en) * 2003-12-12 2006-12-21 British Telecommunications Public Limited Company Distributed computer system
CN102637200A (en) * 2012-03-07 2012-08-15 江苏引跑网络科技有限公司 Method for distributing multi-level associated data to same node of cluster
CN102841891A (en) * 2011-06-21 2012-12-26 金蝶软件(中国)有限公司 Method and device for ordering tree structure nodes, and enquiry system
CN102867059A (en) * 2012-09-19 2013-01-09 浪潮(北京)电子信息产业有限公司 Method and system for processing data in treelike structures
CN102929587A (en) * 2012-09-28 2013-02-13 用友软件股份有限公司 Data processing system and data processing method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102306199A (en) * 2011-09-22 2012-01-04 用友软件股份有限公司 Data management device and data management method

Also Published As

Publication number Publication date
CN106846021A (en) 2017-06-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16869910

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16869910

Country of ref document: EP

Kind code of ref document: A1