CN111352800A - Big data cluster monitoring method and related equipment - Google Patents

Big data cluster monitoring method and related equipment

Info

Publication number
CN111352800A
Authority
CN
China
Prior art keywords
monitoring
big data
monitoring index
data cluster
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010114524.9A
Other languages
Chinese (zh)
Inventor
佟铁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JD Digital Technology Holdings Co Ltd
Original Assignee
JD Digital Technology Holdings Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JD Digital Technology Holdings Co Ltd
Priority to CN202010114524.9A
Publication of CN111352800A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data

Abstract

The embodiment of the disclosure provides a big data cluster monitoring method and device, a computer readable storage medium and electronic equipment, and belongs to the technical field of computers and communication. The method comprises the following steps: collecting monitoring indexes of the big data cluster through a collector; writing the monitoring index into a time sequence database; comparing the monitoring index written into the time sequence database with an alarm rule; when the monitoring index reaches the alarm rule, alarming; or when the monitoring index does not reach the alarm rule, continuing monitoring. The technical scheme of the embodiment of the disclosure provides a big data cluster monitoring method, which can realize the monitoring of a big data cluster and is easy to expand and use.

Description

Big data cluster monitoring method and related equipment
Technical Field
The present disclosure relates to the field of computer and communication technologies, and in particular, to a big data cluster monitoring method and apparatus, a computer-readable storage medium, and an electronic device.
Background
In the operation of existing big data clusters, prior-art monitoring and alarm methods are difficult to extend through secondary development and do not scale easily. Their alarm configuration is also complex, tedious, and hard to use. As big data clusters are developed and applied more widely, a new technical method is needed to help operation and maintenance personnel keep large numbers of clusters running healthily and to spare them heavy, repetitive work.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The embodiment of the disclosure provides a big data cluster monitoring method and device, a computer readable storage medium and an electronic device, which can improve the efficiency and accuracy of big data cluster processing.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a monitoring method for a large data cluster, including:
collecting monitoring indexes of the big data cluster through a collector;
writing the monitoring index into a time sequence database;
comparing the monitoring index written into the time sequence database with an alarm rule;
when the monitoring index reaches the alarm rule, alarming; or
when the monitoring index does not reach the alarm rule, continuing monitoring.
In one embodiment, the collector is a client data collector, and the method further comprises:
and installing the client data collector to the target equipment of the big data cluster.
In one embodiment, further comprising:
running the client data collector with the interpreter of an interpreted programming language that ships with the system of the target device;
wherein client data collectors can be added dynamically.
In one embodiment, the time series database is a distributed time series database, and writing the monitoring index into the time series database includes:
writing, by the client data collector, the monitoring index over a socket into the distributed time series database, whose underlying storage is scalable;
wherein the distributed time series database uses a distributed columnar database cluster for background storage.
In one embodiment, the alarm rule is that the number of times the monitoring index is greater than or equal to 90% of the maximum value is greater than or equal to one half of the number of monitoring times, and alarming when the monitoring index reaches the alarm rule includes:
alarming when, within a specific number of monitoring times, the number of times the monitoring index is greater than or equal to 90% of the maximum value is greater than or equal to one half of that specific number;
wherein the specific number is an even number greater than or equal to 2.
In one embodiment, writing the monitoring index into the time series database comprises:
and writing the name, the numerical value, the acquisition time, the cluster name and the address of the monitoring index into the time sequence database.
In one embodiment, collecting monitoring metrics for large data clusters by a collector includes:
and collecting the monitoring index of the big data cluster at a specific frequency through the client data collector.
According to an aspect of the present disclosure, there is also provided a monitoring apparatus for a large data cluster, including:
a collection module configured to collect the monitoring index of the big data cluster through the collector;
a writing module configured to write the monitoring index into a time series database;
the comparison module is configured to compare the monitoring index written into the time sequence database with an alarm rule; and
and the alarm module is configured to alarm when the monitoring index reaches the alarm rule.
According to an aspect of the present disclosure, there is also provided an electronic device, including:
one or more processors;
a storage device configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of the above.
According to an aspect of the present disclosure, there is also provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the method of any of the above.
According to the embodiments of the present application, the monitoring index of the big data cluster is collected through a client data collector; the monitoring index is written into a time series database; the monitoring index written into the time series database is compared with an alarm rule; an alarm is raised when the monitoring index reaches the alarm rule; and monitoring continues when it does not. The client data collector of the embodiment is implemented in an interpreted programming language and can run on the interpreter that ships with the system, so it is minimally intrusive to the target machine; collectors can be added dynamically without restarting the client data collector. The client data collector writes the collected monitoring index into the distributed time series database over a socket, and the written data include the name and value of the monitoring index, the time, the cluster name, the IP address, and the like. The distributed time series database uses a distributed columnar database cluster for background storage, so its underlying storage is scalable.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:
fig. 1 shows a schematic diagram of an exemplary system architecture of a monitoring method of a big data cluster or a monitoring apparatus of a big data cluster to which an embodiment of the present disclosure may be applied;
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use with the electronic device implementing embodiments of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a large data cluster monitoring method according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a block diagram of a big data cluster monitoring apparatus according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a big data cluster monitoring apparatus according to another embodiment of the present invention;
FIG. 6 schematically shows a block diagram of a big data cluster monitoring apparatus according to another embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 shows a schematic diagram of an exemplary system architecture 100 of a monitoring method of a big data cluster or a monitoring apparatus of a big data cluster to which the embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may be various electronic devices having display screens including, but not limited to, smart phones, tablets, portable and desktop computers, digital cinema projectors, and the like.
The server 105 may be a server that provides various services. For example, a user sends a monitoring request of a large data cluster to the server 105 by using the terminal device 103 (or the terminal device 101 or 102). Or the terminal device 103 automatically acquires the monitoring index of the big data cluster and sends the monitoring index to the server 105. The server 105 may write the monitoring index into a time sequence database based on a monitoring index of a big data cluster, compare the monitoring index written into the time sequence database with an alarm rule, and alarm when the monitoring index reaches the alarm rule; or when the monitoring index does not reach the alarm rule, continuing monitoring.
As another example, the terminal device 103 (or the terminal device 101 or 102) may be a smart TV, a VR (Virtual Reality)/AR (Augmented Reality) head-mounted display, or a mobile terminal such as a smart phone or tablet computer on which a navigation, ride-hailing, instant messaging, or video application (APP) is installed, and the user may send a monitoring request for a big data cluster to the server 105 through the smart TV, the VR/AR head-mounted display, or one of these APPs. Based on the monitoring request, the server 105 may collect the monitoring index of the big data cluster through the collector; write the monitoring index into a time series database; compare the monitoring index written into the time series database with an alarm rule; raise an alarm when the monitoring index reaches the alarm rule; and continue monitoring when it does not.
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.
It should be noted that the computer system 200 of the electronic device shown in fig. 2 is only an example, and should not bring any limitation to the functions and the scope of the application of the embodiments of the present disclosure.
As shown in fig. 2, the computer system 200 includes a Central Processing Unit (CPU)201 that can perform various appropriate actions and processes in accordance with a program stored in a Read-Only Memory (ROM) 202 or a program loaded from a storage section 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data necessary for system operation are also stored. The CPU 201, ROM 202, and RAM 203 are connected to each other via a bus 204. An input/output (I/O) interface 205 is also connected to bus 204.
The following components are connected to the I/O interface 205: an input portion 206 including a keyboard, a mouse, and the like; an output section 207 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 208 including a hard disk and the like; and a communication section 209 including a Network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 209 performs communication processing via a network such as the internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 210 as necessary, so that a computer program read out therefrom is installed into the storage section 208 as necessary.
In particular, the processes described below with reference to the flowcharts may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication section 209 and/or installed from the removable medium 211. The computer program, when executed by a Central Processing Unit (CPU)201, performs various functions defined in the methods and/or apparatus of the present application.
It should be noted that the computer readable medium shown in the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable compact disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules and/or units and/or sub-units described in the embodiments of the present disclosure may be implemented by software, or may be implemented by hardware, and the described modules and/or units and/or sub-units may also be disposed in a processor. Wherein the names of such modules and/or units and/or sub-units in some cases do not constitute a limitation on the modules and/or units and/or sub-units themselves.
As another aspect, the present application also provides a computer-readable storage medium, which may be contained in the electronic device described in the above embodiment; or may exist separately without being assembled into the electronic device. The computer-readable storage medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the embodiments below. For example, the electronic device may implement the steps shown in fig. 3.
In the related art, for example, a machine learning method, a deep learning method, or the like may be used to perform large data cluster monitoring, and the application range of different methods is different.
FIG. 3 schematically shows a flow diagram of a big data cluster monitoring method according to an embodiment of the present disclosure. The method steps of the embodiment may be executed by a terminal device, by a server, or by the two together; for example, they may be executed by the server 105 in FIG. 1, but the present disclosure is not limited thereto.
In step S310, a monitoring index of the big data cluster is collected by the collector.
In this step, the server or the terminal collects the monitoring index of the big data cluster through the collector. In one embodiment, big data is a huge and complex data set, including data of various types such as text, pictures, video, and audio, especially from entirely new data sources, whose scale overwhelms traditional data processing software. In one embodiment, the big data cluster is a cluster of application software deployed on a large number of computer servers for storing, computing, and processing big data, where the software may include a distributed system infrastructure (Hadoop), a distributed application coordination service (ZooKeeper), a distributed columnar database (HBase), a data warehouse system (Hive), a distributed publish-subscribe messaging system (Kafka), a large-scale data computing engine (Spark), a distributed full-text search engine (Elasticsearch), and a large-scale data stream processing framework (Flink). The monitoring indexes are, for example, memory utilization, network traffic, number of threads, transactions per second (TPS), and disk read/write throughput.
In one embodiment, the collector is a client data collector (Tcollector).
In one embodiment, step S310 is preceded by a step of installing the client data collector (Tcollector) on a target device of the big data cluster. In one embodiment, the client data collector is run by the interpreter of an interpreted programming language (Python) that ships with the system of the target device, and collectors can be added dynamically.
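As an illustration only, a collector of this kind can be a small Python script that prints one sample per line in a "metric timestamp value tag=value" form for the collector framework to forward. The sketch below assumes that line format and a Linux-style /proc/meminfo text as input; the metric name, tag names, and both helper functions are hypothetical and are not taken from the patent.

```python
import time


def format_sample(metric, value, tags, timestamp=None):
    """Render one sample as a 'metric timestamp value tag=value ...' line."""
    if timestamp is None:
        timestamp = int(time.time())
    # Sort tags so the output is stable from run to run.
    tag_text = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"{metric} {timestamp} {value} {tag_text}".rstrip()


def read_memory_utilization(meminfo_text):
    """Compute the used-memory percentage from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            # The first token after the colon is the size in kB.
            fields[name.strip()] = int(rest.split()[0])
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return round(100.0 * (total - available) / total, 2)


# Hypothetical sample input; a real collector would read /proc/meminfo instead.
sample_text = "MemTotal: 1000 kB\nMemFree: 100 kB\nMemAvailable: 250 kB\n"
usage = read_memory_utilization(sample_text)
print(format_sample("cluster.mem.used_percent", usage, {"cluster": "demo"}))
```

Because each collector is an independent script, a new monitoring index can be added by dropping in another script of the same shape, which is one plausible reading of "collectors can be added dynamically".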
In step S320, the monitoring index is written into a time series database.
In this step, the server or the terminal writes the monitoring index collected in step S310 into the time series database. In one embodiment, the time series database is a distributed time series database (OpenTSDB). In one embodiment, the client data collector writes the monitoring index over a socket into the distributed time series database, whose underlying storage is scalable; the distributed time series database uses a distributed columnar database (HBase) cluster for background storage. In one embodiment, the name, value, collection time, cluster name, and address of the monitoring index are written into the time series database.
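A minimal sketch of the socket write, assuming OpenTSDB's telnet-style `put` line protocol on its default port 4242. The patent names the five fields (name, value, collection time, cluster name, address) but not how they map to tags, so the tag keys `cluster` and `host`, the function names, and the example values are assumptions.

```python
import socket


def build_put_line(name, timestamp, value, cluster, address):
    """Build one OpenTSDB 'put' line carrying the five fields named in the text."""
    # Tag keys 'cluster' and 'host' are illustrative choices, not from the patent.
    return f"put {name} {int(timestamp)} {value} cluster={cluster} host={address}\n"


def send_lines(lines, host="localhost", port=4242, timeout=5.0):
    """Send pre-built put lines to a time series database over a TCP socket."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)


line = build_put_line("hbase.rs.memory", 1582600000, 71.5, "hbase-prod", "10.0.0.7")
print(line, end="")
# send_lines([line])  # requires a reachable OpenTSDB instance
```

Keeping line construction separate from the network call makes the format easy to check without a running database.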
In step S330, the monitoring index written into the time-series database is compared with an alarm rule.
In this step, the server or the terminal compares the monitoring index written into the time series database with an alarm rule. In one embodiment, the alarm rule is that the number of times the monitoring index is greater than or equal to 90% of the maximum value is greater than or equal to one half of the number of monitoring times.
In step S340, when the monitoring index reaches the alarm rule, an alarm is performed; or when the monitoring index does not reach the alarm rule, continuing monitoring.
In this step, the server or the terminal raises an alarm when the monitoring index reaches the alarm rule, and otherwise continues monitoring. In one embodiment, an alarm is raised when, within a specific number of monitoring times, the number of times the monitoring index is greater than or equal to 90% of the maximum value is greater than or equal to one half of that specific number; the specific number is an even number greater than or equal to 2. For example, if the specific number is 10 and the monitored value is at or above 90% of the maximum value in at least 5 of the most recent 10 monitoring times, an alarm is triggered.
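The rule can be sketched as a small predicate over the most recent samples. The function name, the parameter names, and the choice to stay silent until a full window of samples exists are assumptions; the 90% ratio, the "at least one half" count, and the even window size follow the embodiment as stated in the claims.

```python
def should_alarm(samples, max_value, window=10, ratio=0.9):
    """Return True when at least half of the last `window` samples are >= ratio * max_value."""
    if window < 2 or window % 2 != 0:
        raise ValueError("window must be an even number of 2 or more")
    recent = list(samples)[-window:]
    if len(recent) < window:
        # Assumption: do not alarm until a full window has been collected.
        return False
    threshold = ratio * max_value
    high_count = sum(1 for value in recent if value >= threshold)
    return high_count >= window // 2


# Five of the last ten samples are at or above 90% of the maximum: alarm.
print(should_alarm([95, 10, 96, 20, 97, 30, 98, 40, 99, 50], max_value=100))
# Only four of the last ten are: keep monitoring.
print(should_alarm([95, 10, 96, 20, 97, 30, 98, 40, 45, 50], max_value=100))
```

Requiring half of a window, rather than a single reading, keeps one short spike from triggering an alarm.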
In one embodiment, the monitoring metrics for the large data cluster are collected by the client data collector at a particular frequency. The specific frequency is, for example, once in 1 minute.
According to the embodiments of the present application, the monitoring index of a big data cluster is collected through a client data collector; the monitoring index is written into a time series database; the monitoring index written into the time series database is compared with an alarm rule; an alarm is raised when the monitoring index reaches the alarm rule; and monitoring continues when it does not. The client data collector of the embodiment is implemented in an interpreted programming language and can run on the interpreter that ships with the system, so it is minimally intrusive to the target machine; collectors can be added dynamically without restarting the client data collector. The client data collector writes the collected monitoring index into the distributed time series database over a socket, and the written data include the name and value of the monitoring index, the time, the cluster name, the IP address, and the like. The distributed time series database uses a distributed columnar database cluster for background storage, so its underlying storage is scalable.
In one embodiment, monitoring is implemented with a web back-end application framework (Django) and a web front-end application framework (React), which are used to display monitoring data and to configure alarm condition rules.
In one embodiment, OpenTSDB is read in a loop once per minute and the data are judged against the alarm condition rules. This is implemented with a non-blocking web server framework (Tornado), whose asynchronous features allow the monitoring indexes of every cluster to be read in several batches at once and compared with the alarm condition rules.
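The once-per-minute concurrent check can be sketched as follows. To stay self-contained, this sketch uses Python's standard asyncio library in place of Tornado, and the database read is passed in as a coroutine (`fetch_recent`) rather than issued against a real OpenTSDB endpoint; all function and parameter names are hypothetical.

```python
import asyncio


async def check_cluster(cluster, fetch_recent, rule):
    """Fetch recent values for one cluster and evaluate the alarm rule on them."""
    values = await fetch_recent(cluster)
    return cluster, rule(values)


async def check_all(clusters, fetch_recent, rule):
    """Evaluate every cluster concurrently and return {cluster: is_alarming}."""
    results = await asyncio.gather(
        *(check_cluster(cluster, fetch_recent, rule) for cluster in clusters)
    )
    return dict(results)


async def run_forever(clusters, fetch_recent, rule, on_alarm, interval=60.0):
    """Repeat the check once per interval (60 seconds, as in the text)."""
    while True:
        for cluster, firing in (await check_all(clusters, fetch_recent, rule)).items():
            if firing:
                on_alarm(cluster)
        await asyncio.sleep(interval)


async def demo_fetch(cluster):
    # Stand-in for a time series database query; returns canned values.
    return {"hbase": [95, 96], "kafka": [10, 20]}[cluster]


print(asyncio.run(check_all(["hbase", "kafka"], demo_fetch, lambda vs: all(v >= 90 for v in vs))))
```

Issuing the reads concurrently means the time for one pass is bounded by the slowest query rather than by the sum of all queries.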
In one embodiment, a determination is made as to whether an exception host is included, and if so, no subsequent alarm determination is made.
In one embodiment, after the alarm problem is solved, a recovery notification is sent to enable operation and maintenance personnel to know the health condition of the cluster at any time.
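One way to produce a recovery notification is to remember the previous alarm state of each monitored item and emit an event only when the state changes. The sketch below is an assumption about how this could be done; the class, the event names, and the key format are hypothetical.

```python
def transition(was_alarming, is_alarming):
    """Map a state change to an event name, or None when nothing changed."""
    if is_alarming and not was_alarming:
        return "alarm"
    if was_alarming and not is_alarming:
        return "recovery"
    return None


class AlarmTracker:
    """Remember the last alarm state per key and report state changes."""

    def __init__(self):
        self._alarming = {}

    def update(self, key, is_alarming):
        # Items seen for the first time are treated as previously healthy.
        event = transition(self._alarming.get(key, False), is_alarming)
        self._alarming[key] = is_alarming
        return event


tracker = AlarmTracker()
for state in (False, True, True, False):
    print(tracker.update("hbase-prod/memory", state))
```

Emitting events only on transitions also avoids repeating the same alarm every minute while a problem persists.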
FIG. 4 schematically shows a block diagram of a big data cluster monitoring apparatus according to an embodiment of the present disclosure. The big data cluster monitoring apparatus 400 provided in the embodiment of the present disclosure may be disposed on a terminal device, may also be disposed on a server side, or may be partially disposed on a terminal device and partially disposed on a server side, for example, may be disposed on the server 105 in fig. 1, but the present disclosure is not limited thereto.
The big data cluster monitoring apparatus 400 provided by the embodiment of the present disclosure may include an acquisition module 410, a writing module 420, a comparison module 430, and an alarm module 440.
The acquisition module 410 is configured to collect the monitoring index of the big data cluster through the collector; the writing module 420 is configured to write the monitoring index into a time series database; the comparison module 430 is configured to compare the monitoring index written into the time series database with an alarm rule; and the alarm module 440 is configured to raise an alarm when the monitoring index reaches the alarm rule.
According to the embodiment of the present disclosure, the big data cluster monitoring apparatus 400 may be used to implement the big data cluster monitoring method described in the embodiment of fig. 3.
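The relationship between the four modules can be pictured as four callables wired into one pass of the method. This is only a structural sketch; the class name, the callable signatures, and the in-memory list standing in for the time series database are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MonitoringApparatus:
    """Four pluggable parts mirroring modules 410, 420, 430, and 440."""

    collect: Callable[[], float]      # acquisition module 410
    write: Callable[[float], None]    # writing module 420
    compare: Callable[[], bool]       # comparison module 430
    alarm: Callable[[], None]         # alarm module 440

    def run_once(self) -> bool:
        """Collect, write, compare, and alarm if the rule is reached."""
        self.write(self.collect())
        firing = self.compare()
        if firing:
            self.alarm()
        return firing


store = []    # stand-in for the time series database
alerts = []   # stand-in for the notification channel
apparatus = MonitoringApparatus(
    collect=lambda: 95.0,
    write=store.append,
    compare=lambda: store[-1] >= 90.0,
    alarm=lambda: alerts.append("alarm"),
)
print(apparatus.run_once(), store, alerts)
```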
FIG. 5 schematically shows a block diagram of a big data cluster monitoring apparatus 500 according to another embodiment of the present invention.
As shown in fig. 5, the big data cluster monitoring apparatus 500 further includes a display module 510 in addition to the collection module 410, the writing module 420, the comparison module 430 and the alarm module 440 described in the embodiment of fig. 4.
Specifically, the display module 510 displays the monitoring index of the alarm on the terminal after the alarm module 440 alarms.
In the big data cluster monitoring apparatus 500, the display module 510 may complete the visual display of the monitoring index of the alarm.
FIG. 6 schematically shows a block diagram of a big data cluster monitoring apparatus 600 according to another embodiment of the present disclosure.
As shown in FIG. 6, in addition to the acquisition module 410, the writing module 420, the comparison module 430, and the alarm module 440 described in the embodiment of FIG. 4, the big data cluster monitoring apparatus 600 further includes a storage module 610.
Specifically, the storage module 610 is configured to store the data of the monitoring index of the big data cluster, so that staff or a server can retrieve and consult it.
It is understood that the acquisition module 410, the writing module 420, the comparison module 430, the alarm module 440, the display module 510, and the storage module 610 may be combined into one module for implementation, or any one of the modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the acquisition module 410, the writing module 420, the comparison module 430, the alarm module 440, the display module 510, and the storage module 610 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC); in any other reasonable manner of integrating or packaging a circuit, as hardware or firmware; or as a suitable combination of software, hardware, and firmware implementations. Alternatively, at least one of the acquisition module 410, the writing module 420, the comparison module 430, the alarm module 440, the display module 510, and the storage module 610 may be at least partially implemented as a computer program module that, when executed by a computer, performs the functions of the respective module.
Because each module of the big data cluster monitoring apparatus of the example embodiments of the present disclosure may be used to implement the steps of the example embodiment of the big data cluster monitoring method described above with reference to FIG. 3, for details that are not disclosed in the apparatus embodiments, please refer to the method embodiments described above.
The specific implementation of each module, unit and subunit in the big data cluster monitoring apparatus provided in the embodiments of the present disclosure may refer to the content in the big data cluster monitoring method, and will not be described herein again.
It should be noted that although several modules, units and sub-units of the apparatus for action execution are mentioned in the above detailed description, such division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules, units and sub-units described above may be embodied in a single module, unit or sub-unit. Conversely, the features and functions of one module, unit or sub-unit described above may be further divided so as to be embodied by a plurality of modules, units and sub-units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions that enable a computing device (which may be a personal computer, a server, a touch terminal, a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A monitoring method for a big data cluster is characterized by comprising the following steps:
collecting monitoring indexes of the big data cluster through a collector;
writing the monitoring index into a time sequence database;
comparing the monitoring index written into the time sequence database with an alarm rule;
when the monitoring index reaches the alarm rule, alarming; or
when the monitoring index does not reach the alarm rule, continuing monitoring.
2. The method of claim 1, wherein the collector is a client data collector, the method further comprising:
and installing the client data collector to the target equipment of the big data cluster.
3. The method of claim 2, further comprising:
operating the client data collector through a self-contained interpreted programming language parser in the system of the target device;
wherein client data collectors can be added dynamically.
4. The method of claim 2, wherein the time series database is a distributed time series database, and wherein writing the monitoring metrics to the time series database comprises:
writing, by the client data collector via a socket, the monitoring index into the distributed time series database, whose underlying storage is scalable;
wherein the distributed time series database uses a distributed columnar database cluster for background storage.
5. The method according to claim 2, wherein the alarm rule is that, within a specific number of monitoring times, the number of times the monitoring index is greater than or equal to 90% of the maximum value is greater than or equal to one half of the specific number, and alarming when the monitoring index reaches the alarm rule comprises:
alarming when the number of times the monitoring index is greater than or equal to 90% of the maximum value is greater than or equal to one half of the specific number of monitoring times;
wherein the specific number is an even number of 2 or more.
6. The method of claim 2, wherein writing the monitoring index into the time series database comprises:
and writing the name, the numerical value, the acquisition time, the cluster name and the address of the monitoring index into the time sequence database.
7. The method of claim 2, wherein collecting the monitoring index of the big data cluster through the collector comprises:
and collecting the monitoring index of the big data cluster at a specific frequency through the client data collector.
8. A big data cluster monitoring apparatus, comprising:
an acquisition module configured to collect the monitoring index of the big data cluster through a collector;
a writing module configured to write the monitoring index into a time series database;
a comparison module configured to compare the monitoring index written into the time series database with an alarm rule; and
an alarm module configured to alarm when the monitoring index reaches the alarm rule.
9. An electronic device, comprising:
one or more processors;
a storage device configured to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
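The alarm rule recited in claim 5 can be illustrated with a short sketch. It is an interpretation under stated assumptions rather than the claimed implementation: the function name `should_alarm`, the parameter names, and the choice to stay silent until a full window of samples exists are all hypothetical.

```python
from typing import Sequence


def should_alarm(samples: Sequence[float], max_value: float, window: int) -> bool:
    """Return True when at least half of the last `window` samples
    are greater than or equal to 90% of `max_value`."""
    # Claim 5 requires the specific number to be an even number of 2 or more.
    if window < 2 or window % 2 != 0:
        raise ValueError("window must be an even number of 2 or more")
    recent = samples[-window:]
    if len(recent) < window:
        # Assumption: do not alarm before a full window has been collected.
        return False
    threshold = 0.9 * max_value
    hits = sum(1 for value in recent if value >= threshold)
    return hits >= window // 2
```

For example, with `max_value=100` and `window=4`, two or more of the last four samples at or above 90 would trigger the alarm, while a single spike would not.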
CN202010114524.9A 2020-02-25 2020-02-25 Big data cluster monitoring method and related equipment Pending CN111352800A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010114524.9A CN111352800A (en) 2020-02-25 2020-02-25 Big data cluster monitoring method and related equipment

Publications (1)

Publication Number Publication Date
CN111352800A true CN111352800A (en) 2020-06-30

Family

ID=71197158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010114524.9A Pending CN111352800A (en) 2020-02-25 2020-02-25 Big data cluster monitoring method and related equipment

Country Status (1)

Country Link
CN (1) CN111352800A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107896175A (en) * 2017-11-30 2018-04-10 北京小度信息科技有限公司 Collecting method and device
US20180306762A1 (en) * 2017-04-24 2018-10-25 International Business Machines Corporation Automatic siting for air quality monitoring stations

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930590A (en) * 2020-07-13 2020-11-13 上海森亿医疗科技有限公司 Real-time monitoring system for computer software and hardware resources
CN112131073A (en) * 2020-08-25 2020-12-25 新浪网技术(中国)有限公司 Server monitoring method and system
CN112306802A (en) * 2020-10-29 2021-02-02 平安科技(深圳)有限公司 Data acquisition method, device, medium and electronic equipment of system
WO2021190659A1 (en) * 2020-10-29 2021-09-30 平安科技(深圳)有限公司 System data acquisition method and apparatus, and medium and electronic device
CN112434063A (en) * 2020-11-03 2021-03-02 中国南方电网有限责任公司 Monitoring data processing method based on time sequence database
CN112291114A (en) * 2020-11-17 2021-01-29 恩亿科(北京)数据科技有限公司 Data source monitoring method and system, electronic equipment and storage medium
CN112416874A (en) * 2020-11-23 2021-02-26 苏州浪潮智能科技有限公司 Monitoring data acquisition method, system, equipment and storage medium
CN112732962A (en) * 2021-01-12 2021-04-30 南京大学 Online real-time junk image category prediction method based on deep learning and Flink
CN112732962B (en) * 2021-01-12 2023-10-13 南京大学 Online real-time garbage picture category prediction method based on deep learning and Flink
CN114490249A (en) * 2021-12-30 2022-05-13 广州市玄武无线科技股份有限公司 Monitoring alarm method and device, computer equipment and storage medium
CN115269308A (en) * 2022-06-29 2022-11-01 北京结慧科技有限公司 Kafka monitoring method and system, computer equipment and medium

Similar Documents

Publication Publication Date Title
CN111352800A (en) Big data cluster monitoring method and related equipment
US11755452B2 (en) Log data collection method based on log data generated by container in application container environment, log data collection device, storage medium, and log data collection system
CN110362544B (en) Log processing system, log processing method, terminal and storage medium
CN113987074A (en) Distributed service full-link monitoring method and device, electronic equipment and storage medium
CN111190888A (en) Method and device for managing graph database cluster
EP4099170A1 (en) Method and apparatus of auditing log, electronic device, and medium
CN111198859B (en) Data processing method, device, electronic equipment and computer readable storage medium
US20240126415A1 (en) Information presentation method and apparatus, and electronic device and storage medium
CN114625597A (en) Monitoring operation and maintenance system, method and device, electronic equipment and storage medium
CN113505302A (en) Method, device and system for supporting dynamic acquisition of buried point data and electronic equipment
CN111327466B (en) Alarm analysis method, system, equipment and medium
CN108062401B (en) Application recommendation method and device and storage medium
CN112954056A (en) Monitoring data processing method and device, electronic equipment and storage medium
CN112162905A (en) Log processing method and device, electronic equipment and storage medium
CN105245380B (en) Message propagation mode identification method and device
CN111274104B (en) Data processing method, device, electronic equipment and computer readable storage medium
CN114691684A (en) Data display method, device and system
US20210144048A1 (en) Method and apparatus for outputting information
CN110532304B (en) Data processing method and device, computer readable storage medium and electronic device
CN109614137B (en) Software version control method, device, equipment and medium
CN112506490A (en) Interface generation method and device, electronic equipment and storage medium
CN112308074A (en) Method and device for generating thumbnail
CN113672675B (en) Data detection method and device and electronic equipment
CN110532322B (en) Operation and maintenance interaction method, system, computer readable storage medium and equipment
CN112036821B (en) Quantization method, quantization device, quantization medium and quantization electronic equipment based on grid map planning private line

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant after: Jingdong Technology Holding Co.,Ltd.

Address before: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant before: Jingdong Digital Technology Holding Co.,Ltd.

Address after: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant after: Jingdong Digital Technology Holding Co.,Ltd.

Address before: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone, 100176

Applicant before: JINGDONG DIGITAL TECHNOLOGY HOLDINGS Co.,Ltd.

CB02 Change of applicant information