CN103188103A - Self-monitoring method of network management system - Google Patents

Self-monitoring method of network management system

Info

Publication number
CN103188103A
Authority
CN
China
Prior art keywords
thread
network management
management system
monitoring
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201110458362
Other languages
Chinese (zh)
Inventor
周关力
廖昕
杨涛
陈松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Qinzhi Digital Technology Co Ltd
Original Assignee
Chengdu Qinzhi Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Qinzhi Digital Technology Co Ltd filed Critical Chengdu Qinzhi Digital Technology Co Ltd
Priority to CN 201110458362 priority Critical patent/CN103188103A/en
Publication of CN103188103A publication Critical patent/CN103188103A/en
Pending legal-status Critical Current

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a self-monitoring method for a network management system. The method comprises the following steps: A, the network management system monitors its internal threads; B, the memory of the server hosting the network management system is monitored; C, the central processing unit (CPU) of that server is monitored; D, the network interface of that server is monitored; E, the disk on which the network management system resides is monitored; F, the database used by the network management system is monitored; and G, the network management system raises self-alarms. With this method, the network management system ensures its own normal operation while it monitors the normal operation of the other devices in the network environment.

Description

A self-monitoring method for a network management system
Technical field
The present invention relates to the field of network technology, and in particular to a self-monitoring method for a network management system.
Background
With the rapid development of communication technology, network environments have grown increasingly complex. Network management systems emerged to track the operating state of the whole network in real time, detect network problems promptly, and optimize network performance and services. Over years of industry practice, network management and operations management have built up a mature theoretical foundation and a set of best practices, but no comparable body of theory or method exists for monitoring the performance of the network management system itself. The present invention discloses a general self-monitoring method for network management systems.
Most current network management systems need to be restarted periodically, because a long-running system comes to occupy a large amount of memory or shows excessive CPU utilization. Restarting releases those resources and avoids main-program hangs caused by an excessive thread count or memory footprint, but it means the system cannot meet customer demand for continuous 7x24 operation. For lack of a complete set of monitoring methods and measures, it is also difficult to investigate and locate faults when the network management system behaves abnormally.
To solve these problems and ensure the normal operation of the network management system, the following questions need answers: How is memory allocated inside the system, and is the usage reasonable? Is the system's thread count too high, and is that count justified? Does the thread count keep rising? Which threads have been started? Which of them should have been closed but were not? Does the host server have network problems? Is the state of the network interface on which the network management system listens normal, and can it receive network management information? Does the disk holding the system have enough free space? Is the database in use healthy?
Only with answers to these questions can the cause be located quickly when the system fails: 1. a problem in the network management system itself; 2. host server performance that is insufficient for the system to run normally; 3. a memory overflow inside the network management system; 4. too little remaining space on the host server to store the information the running system produces; 5. a network problem on the host server that prevents the network management system from carrying out normal network management.
Summary of the invention
The object of the present invention is to provide a self-monitoring method for a network management system. By monitoring the various resources the network management system occupies while running, and the performance of the server that hosts it, the method gives the network management system a self-monitoring capability: its running state can be monitored promptly and accurately and fault information is reported in detail, while the monitoring load stays small and monitoring efficiency stays high. A system that adopts the method can help operations and maintenance engineers resolve problems faster and optimize the server.
The method recommends making the self-monitoring functions configurable: run comprehensive self-monitoring during the trial-operation stage and switch to partial self-monitoring once the system enters commercial operation, so that the resource cost of comprehensive self-monitoring does not degrade the normal performance of the network management system.
To achieve the above object, the method adopts the following scheme: a self-monitoring method for a network management system, comprising the following steps.
A. The network management system monitors its internal threads.
In step A, internal thread monitoring falls into three classes: main-thread monitoring, dependent-thread monitoring, and temporary-thread monitoring.
Main-thread monitoring checks whether the main thread of each module in the network management system runs continuously, without interruption, hanging, or restarting. Monitoring uses a heartbeat: if no heartbeat message is received 3 consecutive times, a medium-level fault alarm is sent; if no heartbeat message is received 15 consecutive times, a major alarm is sent. If the heartbeat is intermittent but each run of consecutive misses is shorter than 3, a medium-level fault alarm is sent once this pattern has persisted for a configured number of occurrences or a configured length of time, indicating that the thread has a performance problem; the specific count is configurable.
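The heartbeat rule above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the class name HeartbeatWatch, the alarm labels, and the flap_limit parameter (standing in for the configurable count of intermittent interruptions) are hypothetical; only the limits of 3 and 15 consecutive misses come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MEDIUM_AFTER = 3   # consecutive misses before a medium-level alarm (from the text)
MAJOR_AFTER = 15   # consecutive misses before a major alarm (from the text)


@dataclass
class HeartbeatWatch:
    """Tracks heartbeat results for one module main thread (hypothetical helper)."""
    flap_limit: int = 5   # assumed configurable number of short interruptions
    misses: int = 0       # length of the current run of missed heartbeats
    short_gaps: int = 0   # completed runs of fewer than 3 misses

    def observe(self, received: bool) -> Optional[str]:
        """Feed the result of one heartbeat period; return an alarm level or None."""
        if not received:
            self.misses += 1
            if self.misses == MAJOR_AFTER:
                return "major"
            if self.misses == MEDIUM_AFTER:
                return "medium"
            return None
        # A heartbeat arrived: close the current run of misses, if there was one.
        if 0 < self.misses < MEDIUM_AFTER:
            self.short_gaps += 1
        self.misses = 0
        if self.short_gaps >= self.flap_limit:
            self.short_gaps = 0
            return "medium"   # intermittent heartbeat suggests a performance problem
        return None
```

Each call to observe stands for one heartbeat period; a caller would pass False when the period elapsed without a message.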
Dependent-thread monitoring tracks the dependencies between threads in the network management system. For each temporary thread it records the parent thread and the state of the dependency on that parent. When a thread is closed, the monitor checks whether every child thread that was called or started by that thread and depends on it closes along with it, and whether it closes normally within the specified delay.
Temporary-thread monitoring covers every temporary thread started in the network management system: it records each one and checks whether it closes normally within its specified time-to-live.
Internal thread monitoring should record and monitor each thread's creation time, closing time, parent thread, calling method, and time-to-live, and should record the total number of threads in the network management system.
Thread types: permanent threads and temporary threads. The main thread of each module is a permanent thread; a time-to-live threshold can be set for temporary threads.
Monitored threads should be classified by system module and thread type, so that problems can be located quickly once found and so that monitoring permissions are easy to assign in practice.
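The per-thread record and the time-to-live check described above might look like the following sketch. The names ThreadRecord and ThreadRegistry and their fields are assumptions made for illustration; timestamps are passed in explicitly so the logic can be followed without real threads.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ThreadRecord:
    name: str
    module: str                    # system module, used for classification
    parent: Optional[str]          # parent thread name, None for a root thread
    created_at: float
    ttl_seconds: Optional[float]   # None marks a permanent thread
    closed_at: Optional[float] = None


class ThreadRegistry:
    """Hypothetical registry of the threads the system has started."""

    def __init__(self) -> None:
        self._records: Dict[str, ThreadRecord] = {}

    def opened(self, name: str, module: str, parent: Optional[str],
               ttl_seconds: Optional[float], now: float) -> None:
        self._records[name] = ThreadRecord(name, module, parent, now, ttl_seconds)

    def closed(self, name: str, now: float) -> None:
        self._records[name].closed_at = now

    def total_open(self) -> int:
        """Total number of threads that have not been closed yet."""
        return sum(1 for r in self._records.values() if r.closed_at is None)

    def overdue(self, now: float) -> List[str]:
        """Temporary threads still open after their time-to-live has passed."""
        return sorted(
            r.name for r in self._records.values()
            if r.ttl_seconds is not None
            and r.closed_at is None
            and now - r.created_at > r.ttl_seconds
        )
```

A monitoring loop would call overdue periodically and raise a fault alarm for each name it returns.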
B. Monitor the memory of the server hosting the network management system.
In step B, the memory utilization of the server hosting the network management system is monitored against a configured threshold. An alarm is sent when memory utilization exceeds the threshold. If utilization stays above the threshold and the alarm is still unhandled after a period of time, the alarm level is raised automatically and a new alarm is sent. Depending on the actual situation, the network management system is optimized or the device's memory is expanded.
C. Monitor the CPU of the server hosting the network management system.
In step C, the CPU utilization of the server hosting the network management system is monitored against a configured threshold. An alarm is sent when CPU utilization exceeds the threshold. If utilization stays above the threshold and the alarm is still unhandled after a period of time, the alarm level is raised automatically and a new alarm is sent. Depending on the actual situation, the network management system is optimized or moved to a host with more processing capacity.
D. Monitor the network interface of the server hosting the network management system.
In step D, the items monitored are mainly the interface's administrative status and operational status, inbound and outbound traffic, dropped-packet count, error-packet count, inbound and outbound bandwidth utilization, packet loss rate, and packet error rate. This prevents network or physical problems on the interface from disrupting the normal management communication of the network management system.
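As a rough illustration of how the interface figures in step D could be derived, the sketch below computes utilization and loss and error rates from two counter snapshots. The IfCounters fields and the definition of the rates (dropped or errored packets as a share of all packets seen in the interval) are assumptions; the source names the metrics but not their formulas. Counter wrap-around is ignored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class IfCounters:
    """One snapshot of cumulative interface counters (hypothetical layout)."""
    in_octets: int
    out_octets: int
    packets: int   # all packets seen, including dropped and errored ones
    dropped: int
    errors: int


def interface_rates(prev: IfCounters, curr: IfCounters,
                    interval_s: float, speed_bps: float) -> Dict[str, float]:
    """Return utilization and loss/error rates, all in percent."""
    capacity_bits = interval_s * speed_bps
    packets = curr.packets - prev.packets

    def percent(part: float, whole: float) -> float:
        return 100.0 * part / whole if whole > 0 else 0.0

    return {
        "in_utilization": percent(8 * (curr.in_octets - prev.in_octets), capacity_bits),
        "out_utilization": percent(8 * (curr.out_octets - prev.out_octets), capacity_bits),
        "packet_loss_rate": percent(curr.dropped - prev.dropped, packets),
        "packet_error_rate": percent(curr.errors - prev.errors, packets),
    }
```

The snapshots would come from whatever interface statistics the platform exposes; comparing the resulting rates with thresholds then feeds the self-alarm step.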
E. Monitor the disk usage of the network management system.
In step E, the disks monitored are the disk holding the system logs, the disk holding the system's running directory, and the disk holding stored data; for each, the usage and utilization, the remaining space, and the total size are monitored.
Monitoring the utilization of the disk partitions on which the system resides keeps the disk's condition known at all times and avoids situations where logs, temporary files, or data cannot be written or created because the physical partition has run short of space.
Based on the monitoring results, a self-alarm advises the user to revise the log-pruning policy, the temporary-file deletion policy, the data-aggregation policy, and so on, or to add physical disks to expand the existing equipment.
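A minimal sketch of the disk check in step E, using the standard-library call shutil.disk_usage. The function names and the report layout are assumptions; the paths passed in would be the log, running-directory, and data-storage locations of the actual deployment.

```python
import shutil
from typing import Dict, Iterable, List


def disk_report(paths: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Collect total size, free space, and utilization for each monitored path."""
    report: Dict[str, Dict[str, float]] = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        used_percent = 100.0 * usage.used / usage.total if usage.total else 0.0
        report[path] = {
            "total_bytes": usage.total,
            "free_bytes": usage.free,
            "used_percent": used_percent,
        }
    return report


def over_threshold(report: Dict[str, Dict[str, float]],
                   max_used_percent: float) -> List[str]:
    """Paths whose disk utilization exceeds the configured threshold."""
    return sorted(p for p, r in report.items()
                  if r["used_percent"] > max_used_percent)
```

Paths that over_threshold returns would trigger the self-alarm that advises revising the pruning policies or adding disk capacity.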
F. Monitor the database used by the network management system.
In step F, the items monitored include the number of database reads and writes, the number of database sessions, the number of rollbacks, the in-memory write hit rate, the number of deadlocks, the number of storage failures, the current number of connections, and the log size. Knowing the state of the database in real time prevents database anomalies from affecting the normal operation of the network management system.
G. The network management system raises self-alarms.
In step G, based on the monitoring data from steps A to F, an alarm event is issued whenever an anomaly exists or a threshold is exceeded. The network management system raises the self-alarm through its own alarm channels, such as audible and visual signals, SMS, and e-mail.
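The self-alarm step can be pictured as a small dispatcher that hands each alarm event to every configured channel. The sketch below is an assumption-laden illustration: Alarm, AlarmDispatcher, and the channel names are hypothetical, and the channels are plain callables rather than real SMS or mail gateways.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Alarm:
    source: str    # which monitoring step produced the alarm, e.g. "memory"
    level: str     # e.g. "medium" or "major"
    message: str


class AlarmDispatcher:
    """Sends each alarm through every registered channel (hypothetical helper)."""

    def __init__(self) -> None:
        self._channels: Dict[str, Callable[[Alarm], None]] = {}

    def register(self, name: str, send: Callable[[Alarm], None]) -> None:
        self._channels[name] = send

    def raise_alarm(self, alarm: Alarm) -> List[str]:
        """Return the names of the channels that accepted the alarm."""
        delivered: List[str] = []
        for name, send in self._channels.items():
            try:
                send(alarm)
                delivered.append(name)
            except Exception:
                # One failing channel should not block the remaining ones.
                continue
        return delivered
```

Isolating channel failures is a design choice of this sketch: a monitoring system that cannot send mail should still be able to sound its audible alarm.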
A network management system that adopts this method ensures its own normal operation while it monitors the normal operation of the other devices in the network environment. The network management system gains a self-monitoring capability: its running state is monitored promptly and accurately and fault information is reported in detail, with a small monitoring load and high monitoring efficiency. The method can be applied widely across network management systems.
Description of drawings
Fig. 1: overall workflow of the method of the invention.
Fig. 2: schematic diagram of the thread self-monitoring method.
Fig. 3: schematic diagram of the disk self-monitoring method.
Detailed description
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any manner, except for mutually exclusive features or steps.
Any feature disclosed in this specification (including any accompanying claims, abstract, and drawings) may, unless specifically stated otherwise, be replaced by an equivalent or alternative feature serving a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
The present invention is further described below with reference to the accompanying drawings.
As shown in Fig. 1, the basic flow of the method is as follows: while managing external devices, the network management system also monitors itself. It monitors the running state of its internal threads, its memory and CPU usage, the running state of the network interface, the running state of the database it accesses, and the free space of the disks it uses, thereby ensuring its own normal operation.
Internal thread monitoring is shown in Fig. 2:
1. Monitor the main thread of each module in the network management system. These threads live as long as the network management system itself and are monitored by heartbeat. A heartbeat is typically expected every 5 seconds. If no heartbeat is received for 3 consecutive cycles, that is, within 15 seconds, a medium-level fault alarm should be sent promptly, indicating that the module's main thread may be blocked by an abnormal condition or may have closed abnormally. If still no heartbeat has been received after 15 consecutive cycles, that is, within 75 seconds, a major alarm should be sent at once, indicating that the thread has hung on a serious internal problem or has closed abnormally. If the heartbeat is intermittent but each run of consecutive misses is shorter than 3, a fault alarm should be sent, indicating that the module has a performance problem.
2. Monitor the dependencies between temporary threads in the network management system. These threads are all temporary threads of the system, started temporarily to serve a particular piece of business. In a multithreaded environment a parent thread generally starts several child threads, forming a tree. Two kinds of dependency may exist: a. when the parent thread is told to close, its child threads should close with it; b. once all child threads have closed normally, the parent thread should close. This step monitors both kinds: when a parent thread is told to close, whether the child threads that should close with it have received the close instruction; whether those child threads all close within the specified delay after receiving the instruction; and, once all child threads have closed normally, whether the parent thread that should then close does so normally.
3. Monitor the closing of temporary threads in the network management system. Temporary threads are first graded according to the system's business functions; the time-to-live threshold for each grade is configurable, and a fault alarm is sent when a temporary thread has not closed within its threshold.
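The two dependency rules in item 2 can be illustrated with a small tree structure. This is a sketch under assumptions: ThreadTree and its method names are hypothetical, and the check for the specified delay is left to the caller, who would invoke stragglers once that delay has elapsed after the parent was closed.

```python
from typing import Dict, List, Optional, Set


class ThreadTree:
    """Records which thread started which, and which threads are still open."""

    def __init__(self) -> None:
        self.children: Dict[str, List[str]] = {}
        self.open: Set[str] = set()

    def start(self, name: str, parent: Optional[str] = None) -> None:
        self.children.setdefault(name, [])
        if parent is not None:
            self.children.setdefault(parent, []).append(name)
        self.open.add(name)

    def close(self, name: str) -> None:
        self.open.discard(name)

    def descendants(self, name: str) -> List[str]:
        """All direct and indirect child threads of the given thread."""
        found: List[str] = []
        stack = list(self.children.get(name, []))
        while stack:
            node = stack.pop()
            found.append(node)
            stack.extend(self.children.get(node, []))
        return found

    def stragglers(self, closed_parent: str) -> List[str]:
        """Rule a: descendants still open although their parent was closed."""
        return sorted(n for n in self.descendants(closed_parent) if n in self.open)

    def idle_parents(self) -> List[str]:
        """Rule b: open threads whose child threads have all closed."""
        return sorted(
            n for n in self.open
            if self.children[n] and not any(c in self.open for c in self.children[n])
        )
```

Any name returned by either method after the allowed delay would be a candidate for a fault alarm.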
Server memory and CPU monitoring works as follows:
The network management system monitors the memory and CPU utilization of its host server. An alarm threshold is set for each alarm level, and the corresponding alarm is sent when memory or CPU utilization exceeds a threshold. A sustained-peak-time threshold may also be set: when memory or CPU utilization has stayed above a threshold for longer than the configured sustained peak time, the alarm level is raised automatically and a new alarm is sent.
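The escalation behaviour described above can be sketched as a small state machine. The level names, the class name EscalatingThreshold, and the choice to escalate one level per full sustained-peak period are assumptions for illustration; the source only states that the level is raised automatically once the peak has lasted longer than the configured time.

```python
from typing import Optional

LEVELS = ["minor", "major", "critical"]  # assumed level names, lowest first


class EscalatingThreshold:
    """Raises an alarm on threshold breach and escalates while the breach lasts."""

    def __init__(self, threshold: float, sustain_seconds: float) -> None:
        self.threshold = threshold              # utilization threshold in percent
        self.sustain_seconds = sustain_seconds  # sustained peak time per level
        self.since: Optional[float] = None      # when the breach started
        self.level = -1                         # index into LEVELS, -1 means none

    def sample(self, value: float, now: float) -> Optional[str]:
        """Feed one utilization sample; return a new alarm level or None."""
        if value <= self.threshold:
            self.since, self.level = None, -1   # breach ended, reset state
            return None
        if self.since is None:
            self.since, self.level = now, 0
            return LEVELS[0]
        # Number of full sustained-peak periods spent above the threshold.
        periods = int((now - self.since) // self.sustain_seconds)
        due = min(periods, len(LEVELS) - 1)
        if due > self.level:
            self.level = due
            return LEVELS[due]
        return None
```

The same object works for memory and for CPU; one instance per monitored metric keeps the breach timers independent.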
Disk self-monitoring is shown in Fig. 3:
The network management system monitors the disks it uses, chiefly the disk holding the system's running directory, the disk holding the system logs, and the disk holding stored data. When the remaining disk space can no longer accommodate the continued growth of the data the system produces, an alarm is sent, so that operations and maintenance personnel can promptly revise the data compression and pruning policies or expand the hardware.
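One way to judge whether the remaining space can absorb continued data growth is a simple linear projection from recent free-space samples. The sketch below is an assumption: the source does not say how growth is estimated, and the function names and the horizon parameter are hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, free bytes)


def seconds_until_full(samples: List[Sample]) -> Optional[float]:
    """Project when free space reaches zero, using the first and last sample."""
    if len(samples) < 2:
        return None
    (t0, free0), (t1, free1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    consumed_per_second = (free0 - free1) / (t1 - t0)
    if consumed_per_second <= 0:
        return None  # free space is stable or growing
    return free1 / consumed_per_second


def needs_alarm(samples: List[Sample], horizon_s: float) -> bool:
    """True when the disk is projected to fill up within the given horizon."""
    remaining = seconds_until_full(samples)
    return remaining is not None and remaining < horizon_s
```

For example, a disk that lost 200 bytes of free space over 100 seconds and has 800 bytes left is projected to fill in 400 seconds, so a 500-second horizon would raise the alarm.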
For the running state of the network interface and of the accessed database, it is recommended that the network management system apply its general-purpose database monitoring and network interface monitoring schemes to itself.

Claims (9)

1. A self-monitoring method for a network management system, comprising the following steps:
A. the network management system monitors its internal threads;
B. the memory of the server hosting the network management system is monitored;
C. the CPU of the server hosting the network management system is monitored;
D. the network interface of the server hosting the network management system is monitored;
E. the disk on which the network management system resides is monitored;
F. the database used by the network management system is monitored;
G. the network management system raises self-alarms.
2. The self-monitoring method for a network management system according to claim 1, characterized in that: in step A the network management system monitors its internal threads, and internal thread monitoring falls into three classes: main-thread monitoring, dependent-thread monitoring, and temporary-thread monitoring; internal thread monitoring should record and monitor each thread's creation time, closing time, parent thread, calling method, and time-to-live, and should record the total number of threads in the network management system.
3. The self-monitoring method for a network management system according to claim 2, characterized in that: main-thread monitoring checks whether the main thread of each module in the network management system runs continuously, without interruption, hanging, or restarting; monitoring uses a heartbeat; if no heartbeat message is received 3 consecutive times, a medium-level fault alarm is sent; if no heartbeat message is received 15 consecutive times, a major alarm is sent; if the heartbeat is intermittent but each run of consecutive misses is shorter than 3, a medium-level fault alarm is sent once this pattern has persisted for a configured number of occurrences or a configured length of time, indicating that the thread has a performance problem; the specific count is configurable.
4. The self-monitoring method for a network management system according to claim 2 or 3, characterized in that: dependent-thread monitoring tracks the dependencies between threads in the network management system and records, for each temporary thread, the parent thread and the state of the dependency on that parent; when a thread is closed, it is checked whether every child thread that was called or started by that thread and depends on it closes along with it, and whether it closes normally within the specified delay.
5. The self-monitoring method for a network management system according to claim 4, characterized in that: temporary-thread monitoring covers every temporary thread started in the network management system, recording each one and checking whether it closes normally within its specified time-to-live.
6. The self-monitoring method for a network management system according to claim 1, characterized in that: in step D the network interface of the server hosting the network management system is monitored, mainly the interface's administrative status and operational status, inbound and outbound traffic, dropped-packet count, error-packet count, inbound and outbound bandwidth utilization, packet loss rate, and packet error rate.
7. The self-monitoring method for a network management system according to claim 1, characterized in that: in step E the disk on which the network management system resides is monitored, namely the usage and utilization, remaining space, and total size of the disk holding the system logs, the disk holding the system's running directory, and the disk holding stored data.
8. The self-monitoring method for a network management system according to claim 1, characterized in that: in step F the database used by the network management system is monitored, including the number of database reads and writes, the number of database sessions, the number of rollbacks, the in-memory write hit rate, the number of deadlocks, the number of storage failures, the current number of connections, and the log size.
9. The self-monitoring method for a network management system according to claim 1, characterized in that: in step G the network management system raises self-alarms through its own alarm channels, such as audible and visual signals, SMS, and e-mail.
CN 201110458362 2011-12-31 2011-12-31 Self-monitoring method of network management system Pending CN103188103A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110458362 CN103188103A (en) 2011-12-31 2011-12-31 Self-monitoring method of network management system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110458362 CN103188103A (en) 2011-12-31 2011-12-31 Self-monitoring method of network management system

Publications (1)

Publication Number Publication Date
CN103188103A true CN103188103A (en) 2013-07-03

Family

ID=48679075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110458362 Pending CN103188103A (en) 2011-12-31 2011-12-31 Self-monitoring method of network management system

Country Status (1)

Country Link
CN (1) CN103188103A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104410550A (en) * 2014-12-10 2015-03-11 深圳中兴网信科技有限公司 Web service monitoring method and web service monitoring device
CN104410550B (en) * 2014-12-10 2018-05-01 深圳中兴网信科技有限公司 Web service monitoring method and web service monitoring device
CN104991855A (en) * 2015-06-16 2015-10-21 广州华多网络科技有限公司 Processing method and device for interface lag
CN104991855B (en) * 2015-06-16 2018-09-11 广州华多网络科技有限公司 Interface interim card processing method and processing device
CN105119767A (en) * 2015-06-29 2015-12-02 北京宇航时代科技发展有限公司 Data self-check and self-cleaning software operation state monitoring method and system
CN107193642A (en) * 2016-03-14 2017-09-22 阿里巴巴集团控股有限公司 Task data compression switching method, suitable compression degree evaluation method and relevant apparatus
CN107294786A (en) * 2017-07-13 2017-10-24 郑州云海信息技术有限公司 A kind of failure information processing method and device
CN108647123A (en) * 2018-03-29 2018-10-12 浙江慧优科技有限公司 A method of improving database monitoring software data acquisition performance
CN111092996A (en) * 2019-10-31 2020-05-01 国网山东省电力公司信息通信公司 Centralized scheduling recording system and control method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C05 Deemed withdrawal (patent law before 1993)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130703