CA2284573A1 - Process management infrastructure - Google Patents

Info

Publication number
CA2284573A1
Authority
CA
Canada
Prior art keywords
processes
director
messages
database
tags
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002284573A
Other languages
French (fr)
Inventor
Richard Waclawik
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Crosskeys Systems Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CA 2200010 (CA2200010A1)
Application filed by Individual
Priority to CA002284573A (CA2284573A1)
Publication of CA2284573A1
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003 Managing SLA; Interaction between SLA and QoS
    • H04L41/5009 Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003 Managing SLA; Interaction between SLA and QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/02 Standardisation; Integration
    • H04L41/0233 Object-oriented techniques, for representation of network management data, e.g. common object request broker architecture [CORBA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)

Abstract

A process management infrastructure for use in a system for monitoring network performance comprising process tags for uniquely identifying each instance of a process in the system.

Description

Process Management Infrastructure
This invention relates to a process management infrastructure system for use in an object-oriented programming environment, and in particular for use in a system for monitoring compliance with service level agreements in a telecommunications network.
There is a need for a system to manage service level agreements (SLAs) between telecommunications service providers and their business customers. Part of the management process that relates to SLAs is the comparison of the service provider's performance against the specific guarantees that it may provide to its customers.
In packet switched networks, unlike circuit switched networks, customers are not given a dedicated circuit; their data is statistically multiplexed with data from other sources.
Each customer pays for a particular level of service, and it is therefore important to ensure that the customer is receiving the level of service he has paid for. Our co-pending application of even date describes a system for monitoring network performance relative to customer service agreements.
The availability requirements for the system are very high. Downtime must be minimized at all costs. The system collects time-based information from various network management systems and operation support systems. While the system is down, critical information can be lost. Various strategies are used within the system to minimize the possibility of losing information. One such strategy is to minimize downtime.
The system indicates to various parts of the service provider's organization the level of service they are providing to their customers. Individuals in customer support, sales, network operations and senior executives rely on this information to make decisions.
The system is also expected to evolve to a point where the service provider's customers will have access to the system. System downtime can negatively impact the customer's perception of the quality of the service provider's operation.
To meet the above requirements, it is important that during system failures, the system automatically attempt to recover. Failing this, the system should notify system operators of the failure. The system should also have minimal downtime during deployment of new system functionality.

The sheer volume of data involved makes the task of managing the data quite daunting.
Typically a system might monitor several thousand customers involving the collection of ten million rows of data per day. Detailed real time events are typically kept for 180 days, summarized daily reports may be kept for 180 days, and summarized monthly reports for 18 months.
Object oriented database programming techniques are employed to handle such large volumes of data. The data is received from the network management system in real time through objects known as event collectors, which run processes under the control of a director.
In a typical system, a director must know and store details about each instance of a process it is to run, such as executable file name, configuration and the like. This can lead to considerable downtime when a process fails or it is desired to change the system.
An object of the invention is to provide a process management infrastructure which is conducive to providing high availability of such a system with improved flexibility.
According to the present invention there is provided a process management infrastructure for use in an object-oriented programming environment, comprising means for running a plurality of processes, a director for monitoring the operation of said processes, and a database for storing a plurality of process tags, each process tag uniquely identifying each instance of a process, said means for running said processes obtaining information required to run said processes by looking up the respective process tags therefor in said database.
The basic concept underpinning this infrastructure is thus the concept of a ptag or process tag, which uniquely identifies an instance of a program. Each back-end process, such as a database monitor or an event collector (which gathers information from a network management system, such as a Newbridge Networks 46020 network management system (NMS)), will generally have its own ptag recording relevant information about the process, such as the executable name, start up indicator and arguments required to run the process.
The invention may run, for example, on a Unix-based Sun Sparc Ultra 2 workstation.
Normally dual processors should be used with a minimum of 256 Megabytes of RAM.
The invention can run, for example, over a TCP/IP network.
The invention is particularly applicable to a system for monitoring compliance with service level agreements in a telecommunications network, but also has application in other environments. It allows added flexibility and permits fine control over different elements of the system. For example, a new network manager can be added to the system on the fly simply by starting up another instance of a process without the need to copy and rename executable files.
The invention represents an important technical advance in the management of communications networks.
The invention also provides a method of managing processes in an object-oriented programming environment, wherein process tags are used to uniquely identify each instance of a process in the system.
The invention will now be described in more detail, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 is an overview of a system to which the management infrastructure may be applied; and
Figure 2 is a block diagram of a process management infrastructure in accordance with the invention.
Referring to Figure 1, the service level management system comprises a command interface 1 and a Director 2 forming part of a system controller 20. The system controller 20 communicates with a Newbridge Networks 46020 network manager 14 and a data management framework 15 through back end processes 13, for example, event collectors and database monitors. The network manager 14 manages a packet switched network, or a fast packet switched network, such as an ATM or frame relay network. The system controller 20 as well as other processes write to log files 18. The log files are used by system utilities 19.
The process management infrastructure is suitable for managing service level agreements in packet switched networks to ensure, for example, that customers are receiving the quality of service for which they have contracted. Unlike circuit switched networks, where bandwidth is dedicated to a particular customer, in such networks bandwidth is statistically shared among a number of users, and it is important to ensure that customers are receiving the quality of service that they have paid for, for example, as determined by average throughput, peak rate and the like.
The infrastructure, which is implemented in object-oriented software, for example using C++, and is designed to extract event information, for example relating to the setting up of virtual connections, from a network manager, comprises a command interface (rci) 1; a daemon process, which provides the Director 2; a simple inter process communication component 3 to permit the exchange of messages between processes; a process configuration component 4; and a software logging component 5.
The system command interface 1 is the main user command interface to the process management infrastructure. From this interface, a system operator can issue commands to:
• start the system
• shut down the system
• get status information on the processes currently running
• start a specific process
• stop a specific process
• tell a process to reread its configuration
• tell a process to change its logging level
The following table sets out the effect of the various commands issued by the rci 1.
Command    Description

status    The status command with the all option will pass an IPC (Inter Process Communication) message to the Director requesting the status. The Director will return this status message in the form of an IPC message. The message will be formatted such that rci will directly print the message out to the standard output.

The status command with a specified PTag will pass an IPC message directly to the process identified by that PTag; that process will then return an IPC message containing specific information related to that process. The message will be formatted for rci to print out to the standard output.

startup    All startup IPC messages go through the Director process.

Upon issuing the startup command with the all option, rci will check to see if the Director is running. If the Director is not running then rci will start it up, then the Director will start up all the other back-end processes specified in the Director's master process table.
If the Director is already running, then rci will not try to start up the Director.

When issuing the startup command with a specific PTag, rci will check to see if the Director is running. If the Director is not running then rci will not be able to issue the command to the Director and an error message will be logged. If the Director is running, then an IPC message is sent to the Director to start up the specified PTag.

If rci receives an error sending the message to the Director, an error message will be recorded in its log file.

shutdown    This command has message passing very similar to the startup command.

All shutdown messages will go through the Director process. If all is specified then an IPC message will be passed to the Director instructing it to shut down all the back-end processes. Then the Director will shut down as well. If a PTag is specified, an IPC message is passed to the Director instructing it to shut down only the process that corresponds to the PTag.

reconfig    If all is specified, an IPC message is sent directly to all processes. If a PTag is specified then an IPC message is sent directly to the specified process. No acknowledgment IPC message is returned.

log    This command sends an IPC message to the process specified by the PTag to change the logging level of a process.

The logging levels are:

String      Value
debug1      1 (most verbose)
debug2      2
debug3      3
info        4
warning     5
serious     6
fatal       7
operator    8 (least verbose)

Example: If a process was running at info level, and it received a message to change logging level without a specific level to change to, then it would change to debug3 logging level. Also, if the process was running at debug1 level and it received a message to change logging level without a specified level, the level will loop around and change to operator level.

help    If no specific command is specified, rci lists the built-in commands and their acceptable syntax. If a built-in command is specified after the help command word, then additional help is displayed for that built-in command. If an invalid command or an invalid command line argument is entered, the general help will be displayed automatically.

<extended command>    Allows for processing of specific (extended) commands, such as the sync command for the event collector. rci will check for invalid commands using a header file. If commands are added or removed, rci must be modified. The syntax checking will be left to the back-end process. If the command is invalid a general help message will be sent to the standard output.
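The routing rules set out in the table above (which commands pass through the Director and which go straight to a process) can be summarized in a short sketch. This is an illustration only: the function name `route`, the `DIRECTOR` label and the `running` argument are hypothetical and do not appear in the specification.

```python
DIRECTOR = "Director"  # hypothetical label for the director daemon's address


def route(command, target="all", running=()):
    """Return the ptags that should receive the IPC message for a command."""
    if command in ("startup", "shutdown"):
        # All startup and shutdown messages go through the Director.
        return [DIRECTOR]
    if command == "status":
        # 'status all' asks the Director; 'status <ptag>' asks the process itself.
        return [DIRECTOR] if target == "all" else [target]
    if command == "reconfig":
        # Sent directly to every process, or only to the one named.
        return list(running) if target == "all" else [target]
    if command == "log":
        # Sent to the process specified by the ptag.
        return [target]
    raise ValueError(f"unknown command: {command}")
```

For instance, `route("startup", "A1")` yields the Director, whereas `route("reconfig", "B2")` addresses process B2 directly.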

The main purpose of the director daemon 2 is to start up the system, restart processes that are terminated, and shut down the system. On startup, director 2 reads in a master process table 6, which contains the names of the processes that are to be started and monitored by director. The Director starts, stops and monitors back-end processes, such as event collectors and database monitors.
The master process table has the format:
PTag    Executable Name    Start Up Indicator    Process Priority    Process Args
The format of the configuration file which will specify the processes to start is outlined in the following table.
Field    Description

PTag    Process Tag. It is used to uniquely identify each instance of a process in the system. The PTag is to be used as the name of the config file (with the extension .cfg) and as the name of the FIFO (with the extension .fifo).

Executable Name    The name of the executable without path. The path name for executables is in a variable in the system setup config file (systemSetup.cfg).

Start Up Indicator    This field indicates whether a process is to be started up by director. This field contains either a Y or an N. When the command program starts up director, director reads its process table and will start up all processes that have a Y in this field. If the command program gives director a startup all command, director starts up all processes which contain a Y in this field. If director gets a specific request to start up a process which contains an N in this field, then it will start up the process; however, it will not restart the process if the process should exit with a restart code.

Process Priority    The nice priority level of the process.

Process arguments    The argument string for the process.

Note: director will put the PTag for the process into the argument string for the process with a -PTag flag. All processes may get the PTag from the config file class, which will parse the command line for process variables.
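As an illustration of how the fields above might be used, the following sketch parses a master process table and builds the command line for a process. The specification does not give the file's delimiter, so the whitespace-separated layout is an assumption, as are the names `ProcessEntry`, `parse_process_table`, `command_line` and `startup_all`.

```python
from dataclasses import dataclass


@dataclass
class ProcessEntry:
    ptag: str
    executable: str   # executable name without path
    start_up: str     # "Y" or "N"
    priority: int     # nice priority level
    args: str         # argument string for the process


def parse_process_table(text):
    """Parse a table of lines: PTag Executable StartUp Priority [Args...]."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumed layout: four whitespace-separated fields, then free-form args.
        parts = line.split(None, 4)
        ptag, exe, flag, prio = parts[:4]
        args = parts[4] if len(parts) > 4 else ""
        entries[ptag] = ProcessEntry(ptag, exe, flag.upper(), int(prio), args)
    return entries


def command_line(entry, exec_dir):
    # The director puts the ptag into the argument string with a -PTag flag.
    return [f"{exec_dir}/{entry.executable}", "-PTag", entry.ptag,
            *entry.args.split()]


def startup_all(entries):
    # A 'startup all' starts only the entries marked Y.
    return [e.ptag for e in entries.values() if e.start_up == "Y"]
```

Because each row is keyed by its ptag, two rows may name the same executable with different arguments, which is what allows a second instance to be added without copying or renaming the executable file.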

The inter process communication component 3 is a simple mechanism used to send messages to various processes in order to control the operation of the system.
This component is implemented using the UNIX FIFO mechanism. The address of a specific process is determined by the ptag associated with that process. Two components are provided, one to read messages and one to write them.
A message has the following components:
• source process - the ptag of the process that is sending the message
• destination process - the ptag of the process that is receiving the message
• message id - an integer value which indicates what the message content is to the receiver
• response requested - an integer value which indicates whether the process is expecting a reply to the message
• size - an integer value which indicates the size of the body
• body - the body of the message for the process
The process configuration component is a set of object oriented classes which are embedded into each process for the purpose of managing configuration parameters associated with that process. The process configuration component uses the ptag associated with the process to determine which configuration file in the configuration table 7 should be read.
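The message components listed above, together with the ptag-based FIFO naming, might be modelled as follows. The specification does not define a wire format, so the fixed 16-byte ptag fields, the network byte order and the `/tmp` FIFO directory are assumptions made purely for illustration; `Message`, `HEADER` and `fifo_path` are hypothetical names.

```python
import struct
from dataclasses import dataclass

# Assumed header layout: two 16-byte ptags, then message id,
# response-requested flag and body size as 32-bit integers.
HEADER = struct.Struct("!16s16siii")


@dataclass
class Message:
    source: str               # ptag of the sending process
    destination: str          # ptag of the receiving process
    message_id: int
    response_requested: int
    body: bytes = b""

    def pack(self):
        """Serialize the header followed by the body."""
        return HEADER.pack(self.source.encode(), self.destination.encode(),
                           self.message_id, self.response_requested,
                           len(self.body)) + self.body

    @classmethod
    def unpack(cls, data):
        """Rebuild a message, using the size field to delimit the body."""
        src, dst, msg_id, resp, size = HEADER.unpack_from(data)
        body = data[HEADER.size:HEADER.size + size]
        return cls(src.rstrip(b"\0").decode(), dst.rstrip(b"\0").decode(),
                   msg_id, resp, body)


def fifo_path(ptag, fifo_dir="/tmp"):
    # The ptag is used as the FIFO name with the extension .fifo.
    return f"{fifo_dir}/{ptag}.fifo"
```

A writer would open `fifo_path(destination)` and write the packed bytes; a reader would read the fixed-size header first and then `size` bytes of body.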
The software logging component 5 is a set of object oriented classes which are embedded into each process for the purpose of providing a status and error logging facility. The software logging component 5 uses the ptag associated with the process to determine which file in the log database 8 the log data should be written to.
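The level-change rule described earlier for the log command (step to the next more verbose level when no level is given, wrapping from debug1 round to operator) can be sketched as below. The function name `next_level` is hypothetical; the level names and their order are those listed for the log command.

```python
# Index 0 is the most verbose level (value 1); index 7 the least (value 8).
LEVELS = ["debug1", "debug2", "debug3", "info",
          "warning", "serious", "fatal", "operator"]


def next_level(current, requested=None):
    """Return the logging level a process should switch to."""
    if requested is not None:
        if requested not in LEVELS:
            raise ValueError(f"unknown logging level: {requested}")
        return requested
    # No level given: move one step more verbose, looping from
    # debug1 around to operator.
    return LEVELS[(LEVELS.index(current) - 1) % len(LEVELS)]
```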
A specific example will now be described with reference to Figure 2.
In a first scenario, process B2 is running and suddenly dies. The following events occur to restart process B2.
• the director is notified of the process death via Unix signals.
• the director finds the ptag associated with the dead process in the master process table 6.
• the director rereads the process information associated with this ptag from the master process file.
• the director issues a log message to inform the system operator that the process died and that it will be restarted.
• the director restarts the process.
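The restart sequence above can be sketched as a single handler. This is a simplified illustration rather than the director's actual code: the function name, the `running` map from pid to ptag and the injected `read_table`, `log` and `start` callables are all hypothetical, and the exit code of the dead process is ignored. In a real daemon the pid would typically come from a SIGCHLD handler calling `os.waitpid`.

```python
def handle_child_exit(pid, running, read_table, log, start):
    """React to the death of a child process.

    running    -- dict mapping pid to ptag for live processes
    read_table -- callable returning a fresh {ptag: entry} mapping
    log        -- callable taking a message string
    start      -- callable that launches an entry and returns its new pid
    """
    ptag = running.pop(pid, None)
    if ptag is None:
        return None  # not a process managed by the director
    # Reread the table so that edits made while running take effect.
    entry = read_table().get(ptag)
    if entry is None or entry["start_up"] != "Y":
        log(f"process {ptag} died and will not be restarted")
        return None
    log(f"process {ptag} died and will be restarted")
    new_pid = start(entry)
    running[new_pid] = ptag
    return new_pid
```

Rereading the table on every death is the step that lets an operator change a process's executable, arguments or restart flag without stopping the director.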
For the second scenario, assume that the system was configured with process B2 running.
It is desirable to add the functionality provided by process A1 without shutting down the system. The following steps are undertaken.
• An entry is added to the master process table for the particular ptag.
• Using rci, a startup command for that ptag is issued. Specifically, the command is "rci startup A1".
• The startup message is sent by the rci command to the director. The director reads the master process table to obtain the process information associated with the ptag A1.
• The director starts the process A1 and issues a log message to this effect using the software logging component.
In the third scenario, the configuration of process B2 must be modified. As an example, process B2 has a configuration parameter that specifies a specific data store.
The space used in this data store exceeds a maximum threshold and new information must be placed in a new data store. The following steps are undertaken.
• The parameter in the configuration file associated with process B2 is modified to reflect the new data store where the new data items must be stored.
• Using rci, a reread config command is issued to process B2. Specifically, the following command is issued: "rci rereadConfig B2".
• The message is sent along to process B2.
• Upon receiving this message, the process configuration component rereads the config file and issues a log message to this effect using the software logging component.
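A minimal sketch of the reread step follows. The specification states only that the config file is named after the ptag with the extension .cfg; the "key = value" line format, the `dataStore` parameter name and the class name `ProcessConfig` are assumptions for illustration.

```python
import os
import tempfile


class ProcessConfig:
    """Holds the configuration parameters of one process instance."""

    def __init__(self, ptag, config_dir):
        # The ptag selects the file: <ptag>.cfg
        self.path = os.path.join(config_dir, f"{ptag}.cfg")
        self.values = {}
        self.reread()

    def reread(self):
        """Reload every parameter from the config file (assumed key = value)."""
        values = {}
        with open(self.path) as fh:
            for line in fh:
                if "=" in line:
                    key, value = line.split("=", 1)
                    values[key.strip()] = value.strip()
        self.values = values


def demo():
    """Edit B2's config file and reread it, as in the third scenario."""
    with tempfile.TemporaryDirectory() as cfg_dir:
        path = os.path.join(cfg_dir, "B2.cfg")
        with open(path, "w") as fh:
            fh.write("dataStore = store1\n")
        cfg = ProcessConfig("B2", cfg_dir)
        before = cfg.values["dataStore"]
        with open(path, "w") as fh:
            fh.write("dataStore = store2\n")
        cfg.reread()  # what the process would do on receiving the message
        return before, cfg.values["dataStore"]
```

The process keeps running throughout; only its in-memory parameters change when the reread message arrives.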
In the fourth scenario, the functionality provided by process B2 is no longer desired. It is desirable to remove the functionality provided by process B2 without shutting down the system. The following steps are undertaken.
• The restart flag for this particular ptag in the master process table is set to 'N' for no restart.
• Using rci, a shutdown command for that ptag is issued. Specifically, the command is "rci shutdown B2".
• The shutdown message is sent by the rci command to process B2.
• Process B2 shuts down.
• The director is notified of the process death via Unix signals.
• The director finds the ptag associated with the dead process.
• The director rereads the process information associated with this ptag from the master process file and notices that the restart flag is set to no.
• The director issues a log message stating that process B2 died and will not be restarted.
• The system operator can then delete the entry associated with the process B2 from the master process table.
The described system minimizes the downtime and enhances the flexibility of the system.

Claims (10)

  1. A process management for use in an object-oriented programming environment, comprising means for running a plurality of processes, a director for monitoring the operation of said processes, and a database for storing a plurality of process tags, each process tag uniquely identifying each instance of a process, said means for running said processes obtaining information required to run said processes by looking up the respective process tags therefor in said database.
  2. A process management infrastructure as claimed in claim 1, further comprising an interprocess communication component for sending messages to various processes in order to control the operation of the system, said processes being identified by the process tag associated therewith.
  3. A process management infrastructure comprising two said interprocess communication components, one for reading messages and the other for writing them.
  4. A process management infrastructure as claimed in any one of claims 1 to 3, wherein said process tags contain at least the name of the executable file and arguments for the process.
  5. A process management infrastructure as claimed in any one of claims 1 to 3, which is implemented in a service level management system for data networks.
  6. A method of managing processes in an object-oriented programming environment, wherein process tags are used to uniquely identify each instance of a process in the system, information pertaining to the process is associated with the process tag in a database, and each instance of a process is run by extracting said information from the database.
  7. A method as claimed in claim 1, wherein messages are sent to various processes in order to control the operation of the system, said processes being identified by the process tag associated therewith.
  8. A method as claimed in claim 1, wherein said process tags are associated with at least the name of the executable file for the process and the process arguments.
  9. A method of monitoring the operation of a packet switched network with a network manager, comprising receiving messages from said network manager containing data relating to the operation of said network, running processes to store and analyze said information, characterized in that said processes are uniquely identified by process tags stored in a database along with other information about said processes.
  10. A method as claimed in claim 9, characterized in that messages are exchanged between said running processes, said messages identifying said processes by the process tag associated therewith.
CA002284573A 1997-03-14 1998-03-16 Process management infrastructure Abandoned CA2284573A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA002284573A CA2284573A1 (en) 1997-03-14 1998-03-16 Process management infrastructure

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CA2,200,010 1997-03-14
CA 2200010 CA2200010A1 (en) 1997-03-14 1997-03-14 Process management infrastructure
CA002284573A CA2284573A1 (en) 1997-03-14 1998-03-16 Process management infrastructure
PCT/CA1998/000231 WO1998042157A2 (en) 1997-03-14 1998-03-16 Process management infrastructure

Publications (1)

Publication Number Publication Date
CA2284573A1 (en) 1998-09-24

Family

ID=31716368

Country Status (1)

Country Link
CA (1) CA2284573A1 (en)

Legal Events

Date Code Title Description
FZDE Discontinued