WO2005104494A2 - Distributed computing environment and methods for managing and controlling the same - Google Patents


Info

Publication number: WO2005104494A2
Authority: WIPO (PCT)
Prior art keywords: network, management, computing environment, packet, traffic
Application number: PCT/US2005/012938
Other languages: French (fr)
Other versions: WO2005104494A3 (en)
Inventor
Thomas Patrick Bishop
Robert A. Fabbio
Ashwin Kamath
Samuel R. Locke
James Morse Mott
Jaisimha Muthegere
Timothy L. Smith
Peter Anthony Walker
Scott R. Williams
Original Assignee: Cesura, Inc
Priority claimed from US10/826,777 external-priority patent/US20050243814A1/en
Priority claimed from US10/826,719 external-priority patent/US20050232153A1/en
Priority claimed from US10/881,078 external-priority patent/US20060031561A1/en
Priority claimed from US10/885,216 external-priority patent/US20060007941A1/en
Application filed by Cesura, Inc
Publication of WO2005104494A2
Publication of WO2005104494A3


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L 67/50 - Network services
    • H04L 67/60 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L 67/61 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources taking into account QoS or priority requirements

Definitions

  • the invention relates in general to distributed computing environments, and more particularly, to distributed computing environments and methods for managing and controlling the same.
  • Parallel networks allow content traffic to be routed over one network and management traffic to be routed over a separate physical network.
  • the public telephone system is an example of such a parallel physical network.
  • the content traffic can include voice and data that most people associate with telephone calls or telephone-based Internet connections.
  • the management traffic controls network devices (e.g., computers, servers, hubs, switches, firewalls, routers, etc.) on the content traffic network, so that if a network device fails, the failed network device can be isolated, and content traffic can be re-routed to another network device without the sender or the receiver of the telephone call perceiving the event.
  • Parallel physical networks are expensive because two separate networks must be created and maintained. Parallel physical networks are typically used in situations where the content traffic must go through regardless of the state of individual network devices within the content traffic network.
  • FIG. 1 includes a typical prior art application infrastructure topology that may be used within a distributed computing environment.
  • An application infrastructure 110 may include two portions 140 and 160 that can be connected together by a router 137.
  • Application servers 134 and database servers 135 reside in the portion 140.
  • Web servers 133 and workstation 138 reside in the portion 160.
  • One network device (e.g., the workstation 138) may be responsible for managing and controlling the application infrastructure 110, including all network devices.
  • if the router 137 malfunctions, the workstation 138 may not be able to communicate with network devices (e.g., the application servers 134 and database servers 135) in the portion 140. Consequently, while the router 137 is non-functional, network devices in the portion 140 are without management and control.
  • the workstation may not effectively manage and control the distributed computing environment in a coherent manner because the workstation 138 cannot manage and control network devices within the portion 140.
  • Another problem with the application infrastructure 110 is its inability to effectively address a broadcast storm.
  • a malfunctioning component within the portion 140 may cause a broadcast storm.
  • the router 137 and its network connections have a limited bandwidth and may effectively act as a bottleneck.
  • the broadcast storm may swamp the router 137 with traffic.
  • Management traffic from the workstation 138 competes with content traffic from the broadcast storm, and therefore, the management traffic cannot correct the problem until after the broadcast storm subsides.
  • the network devices (e.g., the application servers 134 and database servers 135) within the portion 140 operate without management and control because the management traffic competes with the content traffic on the same shared network.
  • Still another problem is that the distributed computing environment may not prioritize, allocate resources, or otherwise treat different applications differently.
  • for example, a relatively less important application (e.g., streaming a relatively large multimedia file) may consume resources that are needed for a relatively more important application (e.g., operating a store-front application) and cause missed business opportunities (e.g., revenue, profit, sales, etc.) or other adverse consequences.
  • a distributed computing environment can include a network that is shared by content traffic and management traffic.
  • a management network can be overlaid on top of a content network, so that the shared network operates similar to a parallel network, but without the cost and expense of creating a physically separate parallel network.
  • network packets that are transmitted over the network can be classified as management packets (part of the management traffic) or content packets (part of the content traffic). After being classified, the network packets can be routed as management traffic or content traffic as appropriate. Because at least some of the shared network is reserved for management traffic, management traffic can reach the network devices, including a network device from which a broadcast storm originated. Therefore, the network can have the advantages of a separate parallel network but without its disadvantages, and can have the advantages of a shared network but without its disadvantages.
  • the distributed computing environment can include an application infrastructure where substantially all network devices within the distributed computing environment are directly connected to an appliance that manages and controls the distributed computing environment. Knowledge of the functional state of and the ability to manage any network device within the distributed computing environment is not dependent on the functional state of any other network device within the application infrastructure. Management packets between the appliance and the managed components within the distributed computing environment can be effectively only "one hop" away from their destination.
  • the configuration of the distributed computing environment can also allow for better visibility of the entire application infrastructure.
  • one or more network devices may not be visible if an intermediate network device (e.g., the router 137), which lies between another network device (e.g., the application servers 134 and database servers 135) and a central management component (e.g., the workstation 138), malfunctions.
  • connections between the network devices and the appliance can allow for better visibility to each of the network devices, components within the network devices, and all network traffic, including management and content traffic, within the distributed computing environment.
  • the relative importance among applications, transaction types, or combinations thereof running within a distributed computing environment can be determined.
  • One or more control settings for the distributed computing environment, or a portion thereof, can be assigned to one or more network packets within a stream or flow to better achieve the business objectives of the entity using the distributed computing environment.
  • a distributed computing environment can include at least one apparatus that controls at least a portion of the distributed computing environment, network devices, and a network lying between each of the network devices and the at least one apparatus.
  • the network is configured to allow content traffic and management traffic within the at least a portion of the distributed computing environment to travel over the same network.
  • the network is configured such that at least a portion of a connection or a bandwidth within the network is reserved for the management traffic and is not used for the content traffic.
  • the at least one apparatus, the network devices, or any combination thereof includes a classification module to classify a network packet as part of the content traffic or the management traffic.
  • a distributed computing environment can include at least one apparatus that controls the distributed computing environment, wherein the at least one apparatus includes a central management component and at least one management execution component, and wherein the at least one apparatus includes a classification module to classify a network packet as part of management traffic or content traffic.
  • the distributed computing environment can also include network devices, wherein substantially all network traffic between any two network devices passes through the at least one management execution component.
  • an apparatus can be used to control at least a portion of a distributed computing environment.
  • the apparatus can include a central management component, at least one management execution component, and a classification module to classify a network packet as part of management traffic or content traffic within the distributed computing environment.
  • a method of controlling at least a portion of a distributed computing environment can include examining a network packet, classifying the network packet as management data or content data, and routing the network packet based on the classification.
  • a method of controlling at least a portion of a distributed computing environment can include classifying a network packet associated with a stream or a flow, and setting a control for the stream, the flow, or a pipe based at least in part on the classification.
  • a method can be used to process a network packet in a distributed computing environment that includes an application infrastructure.
  • the method can include receiving a communication from the application infrastructure, wherein the communication includes the network packet, classifying the network packet, and setting a control for the network packet based at least in part on the classification.
  • a data processing system readable medium can comprise code that can include instructions for carrying out any one or more of the methods and may be used within a distributed computing environment.
  • an apparatus can be configured to carry out any part or all of any of the methods described herein, the apparatus can include any part or all of any of the data processing system readable media described herein, an apparatus can include any part or all of any of the systems described herein, an apparatus can be a part of any of the systems described herein, or any combination thereof.
  • FIG. 1 includes an illustration of a prior art application infrastructure.
  • FIG. 2 includes an illustration of a hardware configuration of an appliance for managing and controlling a distributed computing environment.
  • FIG. 3 includes an illustration of a hardware configuration of the application infrastructure management and control appliance in FIG. 2.
  • FIG. 4 includes an illustration of a hardware configuration of one of the management blades in FIG. 3.
  • FIG. 5 includes an illustration of a network connector, wherein at least one connector is reserved for management traffic and other connectors can be used for content traffic.
  • FIG. 6 includes an illustration of a bandwidth for a network, wherein at least one portion of the bandwidth is reserved for management traffic and another portion of the bandwidth can be used for content traffic.
  • FIGs. 7-13 include a flow diagram for a method of using the distributed computing environment of FIG. 2.
  • a distributed computing environment can include a network that is used for transmitting management traffic and content traffic to components within the distributed computing environment.
  • substantially all network traffic between two different network devices within the distributed computing environment can be routed through an appliance.
  • the network traffic can be classified as management traffic or content traffic.
  • a portion of connections or bandwidth of the shared network (one network shared by management and content traffic) may be reserved for management traffic, so that a component causing a broadcast storm or otherwise malfunctioning can be accessed by the appliance to address a problem, isolate a component, or perform another appropriate action.
  • different applications, transaction types, or any combination thereof can be assigned different levels of importance to better meet business objectives of an entity using the distributed computing environment. Thus, not all content traffic may be treated the same.
  • a more important application or transaction type can be assigned control settings to allow its streams or flows to be transmitted more quickly and successfully, as compared to a lesser important application or transaction type.
  • a distributed computing environment can include at least one apparatus that controls at least a portion of the distributed computing environment, network devices, and a network lying between each of the network devices and the at least one apparatus.
  • the network is configured to allow content traffic and management traffic within the at least a portion of the distributed computing environment to travel over the same network.
  • the network is configured such that at least a portion of a connection or a bandwidth within the network is reserved for the management traffic and is not used for the content traffic.
  • the at least one apparatus, the network devices, or any combination thereof includes a classification module to classify a network packet as part of the content traffic or the management traffic.
  • each of the network devices is directly connected to the at least one apparatus.
  • the distributed computing environment is configured so that substantially all network traffic to and from each of the network devices passes through the at least one apparatus.
  • the at least one apparatus includes a central management component and at least one management execution component, each of the network devices includes a software agent, and a management infrastructure includes the central management component, the at least one management execution component, the software agents, and at least a portion of the network.
  • the management execution component is operable to detect a problem on at least one network device, correct the problem on the at least one network device, isolate the at least one network device, or any combination thereof.
  • the distributed computing environment is configured so that substantially all network traffic between any two network devices passes through the at least one management execution component.
  • an application infrastructure includes the at least one management execution component, the network devices and software agents, and at least a portion of the network.
  • the central management component is not part of the application infrastructure.
  • the network devices include at least one Layer 2 network device and at least one Layer 3 network device.
  • at least one of the network devices includes a Layer 2 network device and a Layer 3 network device.
  • the at least one management execution component is configured to perform a routing function of a Layer 3 network device.
  • the at least one apparatus is configured to identify a network packet as a management packet or a content packet.
  • a distributed computing environment can include at least one apparatus that controls the distributed computing environment, wherein the at least one apparatus includes a central management component and at least one management execution component, and wherein the at least one apparatus includes a classification module to classify a network packet as part of management traffic or content traffic.
  • the distributed computing environment can also include network devices, wherein substantially all network traffic between any two network devices passes through the at least one management execution component.
  • each of the network devices is directly connected to the at least one management execution component.
  • the distributed computing environment further includes a network lying between each of the network devices and the at least one apparatus, wherein the network is configured to allow the content traffic and the management traffic within the distributed computing environment to travel over the same network.
  • each of the network devices includes a software agent
  • a management infrastructure includes the central management component, the at least one management execution component, the software agents, and at least a portion of a network between the network devices and the at least one apparatus.
  • an application infrastructure includes the at least one management execution component, the network devices and the software agents, and at least a different portion of the network.
  • the central management component is not part of the application infrastructure.
  • the management execution component is operable to detect a problem on at least one network device, correct the problem on the at least one network device, isolate the at least one network device, or any combination thereof.
  • the network devices include at least one Layer 2 network device and at least one Layer 3 network device.
  • at least one of the network devices includes a Layer 2 network device and a Layer 3 network device.
  • the at least one management execution component is configured to perform a routing function of a Layer 3 network device.
  • an apparatus can be used to control at least a portion of a distributed computing environment.
  • the apparatus can include a central management component, at least one management execution component, and a classification module to classify a network packet as part of management traffic or content traffic within the distributed computing environment.
  • the apparatus further includes ports configured to receive connections from network devices. At least one of the ports has associated connections, wherein at least a portion of the associated connections is reserved for the management traffic; an associated bandwidth, wherein at least a portion of the associated bandwidth is reserved for the management traffic; or any combination thereof.
  • the at least one management execution component includes a first management blade, wherein each network device within the distributed computing environment is connected to the first management blade.
  • the at least one management execution component includes a second management blade, wherein each of the network devices within the distributed computing environment is connected to the second management blade.
  • the apparatus further includes a port connectable to a network device, wherein, when the network device malfunctions, the apparatus is configured to perform a function of the network device. In a particular embodiment, when the network device malfunctions, the apparatus is further configured to isolate the network device from a remaining portion of the distributed computing environment.
  • the at least one management execution component is configured to perform a routing function of a Layer 3 network device.
  • the apparatus further includes a control setting module, wherein, based at least in part on the classification, the control setting module sets a control for the network packet, wherein the control corresponds to a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof.
  • the at least one management execution component is configured to receive the content traffic and the management traffic, and the central management component is configured to receive the management traffic but not the content traffic.
  • a method of controlling at least a portion of a distributed computing environment can include examining a network packet, classifying the network packet as management data or content data, and routing the network packet based on the classification.
  • classifying the network packet includes classifying the network packet based on a protocol, a source address, a destination address, a source port, a destination port, or any combination thereof. In a particular embodiment, classifying the network packet includes classifying the network packet using a stream/flow mapping table.
  • routing the network packet includes routing the network packet over a management infrastructure. In a particular embodiment, routing the network packet further includes routing the network packet to a management execution component. In a more particular embodiment, routing the network packet further includes receiving the network packet at the management execution component, after the network packet is sent from a first network device, and sending the network packet from the management execution component to a second network device different from the first network device. In another particular embodiment, routing the network packet further includes routing the network packet over an application infrastructure. In a more particular embodiment, routing the network packet further includes routing the network packet to an agent on a network device. In another more particular embodiment, the method further includes blocking other traffic in the application infrastructure.
  • the method further includes setting a control for the network packet, wherein the control corresponds to a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof, wherein setting the control is based at least in part on the classification.
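As a rough illustration of the controls named above, the following Python sketch shows one way a connection throttle and a packet throttle could act on a classified stream or flow. The fractions, function names, and queue model are assumptions for illustration, not the patent's implementation.

```python
# Hypothetical sketch of connection and packet throttles for a classified
# stream or flow. The allowed fractions and data structures are assumptions.
import random

def connection_throttle(allowed_fraction: float) -> bool:
    """Decide whether a new connection for this stream/flow may be opened."""
    return random.random() < allowed_fraction

def packet_throttle(queue: list, allowed_fraction: float) -> list:
    """Return the portion of queued packets allowed through the pipe this cycle."""
    allowed = int(len(queue) * allowed_fraction)
    return queue[:allowed]

# Example: a low-priority flow admits ~25% of new connections and transmits
# a quarter of its queued packets per cycle.
print(connection_throttle(0.25))
print(packet_throttle(list(range(8)), 0.25))   # -> [0, 1]
```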
  • a method of controlling at least a portion of a distributed computing environment can include classifying a network packet associated with a stream or a flow, and setting a control for the stream, the flow, or a pipe based at least in part on the classification.
  • the method further includes examining a parameter of the network packet.
  • the parameter includes a virtual local area network identification, a source address, a destination address, a source port, a destination port, a protocol, a connection request, a transaction type load tag, or any combination thereof.
  • the method further includes associating the network packet with one of a set of specific flows/streams at least partially based on the parameter.
  • associating the network packet includes using a classification mapping table, wherein an entry in the classification mapping table maps the network packet to a specific stream/flow.
  • each entry in the classification mapping table is mapped to an entry in a stream/flow mapping table.
  • each entry in the classification mapping table or the stream/flow mapping table includes values for settings for priority, latency, a connection throttle, a packet throttle, or any combination thereof.
  • the method further includes determining a value of the setting based at least in part on the value of the parameter.
  • setting the control is applied once to the flow or the stream, regardless of a number of pipes used for the flow or the stream.
  • the value of the setting is obtained from a flow entry and not a stream entry of a table.
  • a method can be used to process a network packet in a distributed computing environment that includes an application infrastructure. The method can include receiving a communication from the application infrastructure, wherein the communication includes the network packet, classifying the network packet, and setting a control for the network packet based at least in part on the classification.
  • classifying the network packet is based at least in part on a protocol, a source address, a destination address, a source port, a destination port, or any combination thereof.
  • classifying the network packet further includes associating the network packet with at least one of a set of application-specific streams, application-specific flows, transaction type-specific streams, transaction type-specific flows, or any combination thereof.
  • associating the network packet is accomplished using a stream/flow mapping table, wherein an entry in the stream/flow mapping table maps the network packet to a stream or a flow.
  • members within the set are classified by types of traffic.
  • the method further includes determining an action based on the stream or the flow associated with the network packet.
  • the action includes at least one of drop, meter, and inject.
  • the method further includes assigning a weighted random discard value to the network packet, based at least in part on the stream or the flow associated with the network packet.
  • assigning a weighted random discard value is based on a stream rate, a flow rate, or any combination thereof.
  • the method further includes discarding the network packet based on the weighted random early discard value.
  • the weighted random early discard value is based on a contention level for a port and a control value associated with the stream or the flow. The control value can be on a logarithmic scale.
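The weighted random early discard decision described in the last few items could be sketched as follows. The specific formula is an assumption consistent with a logarithmic control scale; the text only states that the discard value depends on the port's contention level and a per-stream/flow control value.

```python
# Hedged sketch of a weighted random early discard decision. The formula
# (contention scaled by 2**-control_value) is an assumed illustration, not
# taken from the patent text.
import random

def discard_probability(contention: float, control_value: int) -> float:
    """contention is the port's congestion level in [0, 1]; control_value is a
    log-scale importance setting (0 = drop readily, 7 = protect strongly)."""
    weight = 1.0 / (2 ** control_value)
    return min(1.0, contention * weight)

def maybe_discard(contention: float, control_value: int) -> bool:
    return random.random() < discard_probability(contention, control_value)

# A congested port drops unimportant flows far more often than important ones.
print(discard_probability(0.9, 0))   # 0.9
print(discard_probability(0.9, 7))   # ~0.007
```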
  • setting the control includes setting a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof.
  • an application is intended to mean a collection of transaction types that serve a particular purpose.
  • a web site store front can be an application
  • human resources can be an application
  • order fulfillment can be an application, etc.
  • application infrastructure ("AI") is intended to mean the hardware and software used to run and support one or more applications.
  • the hardware may include servers and other computers, data storage and other memories, networks, switches and routers, and the like.
  • the software used may include operating systems and other middleware components (e.g., database software, JAVATM engines, etc.).
  • the application infrastructure can include physical components, logical components, or a combination thereof.
  • application infrastructure component or “AI component” is intended to mean any part of an AI associated with an application.
  • AI components may be hardware, software, firmware, network, or virtual AI components. Many levels of abstraction are possible.
  • a server may be an AI component of a system
  • a CPU may be an AI component of the server
  • a register may be an AI component of the CPU, etc.
  • central management component is intended to mean a component that is capable of obtaining information from management execution component(s), software agents on managed components, or both, and providing directives to the management execution component(s), the software agents, or both.
  • a control blade is an example of a central management component.
  • the term "communication” is intended to mean a packet or a collection of packets that are sent from a source component to a destination component within a distributed computing environment and can be represented by a stream or a flow.
  • component is intended to mean a part within a distributed computing environment.
  • Components may be hardware, software, firmware, or virtual components. Many levels of abstraction are possible.
  • a server may be a component of a system
  • a CPU may be a component of the server
  • a register may be a component of the CPU, etc.
  • Each of the components may be a part of an AI, a management infrastructure, or both.
  • component and resource may be used interchangeably.
  • connection throttle is intended to mean a control for regulating a portion of connections or a portion of a bandwidth for a particular stream or a particular flow within an AI.
  • the connection throttle may exist at a beginning or end of a pipe.
  • the connection throttle may allow none, a portion, or all of the connections to be made or allow none, a portion or all of the bandwidth to be used for a particular stream or a particular flow.
  • content traffic is intended to mean the portion of the network traffic other than management traffic.
  • the content traffic includes network traffic used by application(s) running within a distributed computing environment.
  • distributed computing environment is intended to mean a collection of components comprising at least one application environment, wherein different types of components reside on different network devices connected to the same network.
  • flow is intended to mean a communication sent between two physical endpoints in a distributed computing environment.
  • a flow may be a communication that is coming from one port at one Internet protocol (IP) address and going to another port at another IP address using a particular protocol.
  • classification mapping table is intended to mean a table having one or more entries that correspond to predefined characteristics of a stream or a flow based on one or more values of parameters.
  • instrument is intended to mean a gauge or control that can monitor or control a component or other part of an AI.
  • latency is intended to mean the amount of time it takes a network packet to travel from one AI component to another AI component. Latency may include a delay time before a network packet begins traveling.
  • local is intended to mean a coupling of two components with no more than one intervening management execution component lying between those two components. For example, if two components reside on the same network device, network traffic may pass between the two components without passing through an intervening management execution component. If two components are connected to the same management blade, the network or other traffic may pass between the two components without passing through two or more intervening management execution components.
  • logical when referring to an instrument or component, is intended to mean an instrument or a component that does not necessarily correspond to a single physical component that otherwise exists or that can be added to an AI.
  • a logical instrument may be coupled to a plurality of instruments on physical components.
  • a logical component may be a collection of different physical components.
  • management execution component is intended to mean a component in the flow of network traffic that may extract management traffic from the network traffic or insert management traffic into the network traffic; send, receive, or transmit management traffic to or from any one or more of a central management component and software agents residing on the AI components; analyze information within the network traffic; modify the behavior of managed components in the AI, or generate instructions or communications regarding the management and control of any portion of the AI; or any combination thereof.
  • a management blade is an example of a management execution component.
  • management infrastructure is intended to mean any and all hardware, software, and firmware that are used to manage, control, or manage and control at least a portion of a distributed computing environment.
  • management traffic is intended to mean network traffic that is used to manage, control, or manage and control at least a portion of a distributed computing environment.
  • network device is intended to mean a Layer 2 or higher network device in accordance with the Open System Interconnection ("OSI") Model.
  • a network device is a specific type of a component and may include a plurality of components.
  • network traffic is intended to mean all traffic, including content traffic and management traffic, on a network of a distributed computing environment.
  • packet throttle is intended to mean a control for regulating the transmission of packets over at least a part of the distributed computing environment.
  • the packet throttle may exist at a queue where packets are waiting to be transmitted through a pipe.
  • the packet throttle may allow none, a portion, or all of the network packets to be transmitted through the pipe.
  • physical, when referring to an instrument or component, is intended to mean an instrument or a component that corresponds to a physical entity, including hardware, firmware, or software, that otherwise exists or that can be added to a distributed computing environment.
  • a physical instrument may be coupled to a physical component.
  • a physical component may be a server, a router, software used to operate the server or router, or the like.
  • pipe is intended to mean a physical network segment between two AI components.
  • a network packet may travel between two AI components via a pipe.
  • a pipe is a physical network segment, and by analogy, is similar to a wire within a cable.
  • priority is intended to mean the order or ranking in which packets, flows, or streams are to be transmitted over at least a portion of the distributed computing environment.
  • remote is intended to mean that at least two intervening management execution components (e.g., two management blades) lie between two specific components. If two components are connected to different management blades, the network or other traffic between the two components will pass through the different management blades.
  • the term "stream” is intended to mean an aggregate set of flows between two logical components, as opposed to flows that are between two physical components, in a distributed computing environment.
  • the term "stream/flow mapping table” is intended to mean a table having one or more entries that correspond to predefined streams or predefined flows, wherein each of the predefined streams or predefined flows has one or more predefined settings for controls.
  • each entry in a stream/flow mapping table may have one or more predefined characteristics to which actual streams or actual flows within an AI may be compared. For example, a particular flow may substantially match a particular entry in a stream/flow mapping table and, as such, inherit the predefined control settings that correspond to that entry in the stream/flow mapping table.
  • An example of a stream/flow mapping table can include a stream mapping table, a flow mapping table, or a combination stream-flow mapping table.
  • transaction type is intended to mean a type of task or transaction that an application may perform.
  • browse request and order placement are transactions having different transaction types for a store-front application.
  • the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” and any variations thereof, are intended to cover a non-exclusive inclusion.
  • a method, process, article, or appliance that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such method, process, article, or appliance.
  • “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
  • FIG. 2 includes a hardware diagram of a distributed computing environment 200.
  • the distributed computing environment 200 includes an AI.
  • the AI includes management blade(s) (not illustrated in FIG. 2) within an appliance 250 (i.e., an apparatus) and those components above and to the right of the dashed line 210 in FIG. 2. More specifically, the AI includes a router/firewall/load balancer 232, which is coupled to the Internet 231 or other network connection.
  • the AI further includes web servers 233, application servers 234, and database servers 235. Other servers may be part of the AI but are not illustrated in FIG. 2. Each of the servers may correspond to a separate computer or may correspond to a virtual engine running on one or more computers. Note that a computer may include one or more server engines.
  • the AI also includes a network 212, a storage network 236, and router/firewalls 237.
  • the management blades within the appliance 250 may be used to route communications (e.g., packets) that are used by applications, and therefore, the management blades are part of the AI.
  • other additional components may be used in place of or in addition to those components previously described. In another embodiment, fewer components than illustrated in FIG. 2 may be used.
  • Each of the network devices 232 to 237 is bi-directionally coupled in parallel to the appliance 250 via a network 212.
  • Each of the network devices 232 to 237 is a component, and any or all of those network devices 232 to 237 can include other components (e.g., system software, memories, etc.) inside of such network devices 232 to 237.
  • the inputs and outputs from the router/firewalls 237 are connected to the appliance 250. Therefore, substantially all the traffic to and from each of the network devices 232 to 237 in the AI is routed through the appliance 250.
  • Software agents may or may not be present on each of the network devices 232 to 237 and their corresponding components within such network devices, or any combination thereof.
  • FIG. 3 includes a hardware depiction of the appliance 250 and how it is connected to other parts of the distributed computing environment 200.
  • a console 380 and a disk 390 are bi-directionally coupled to a control blade 310 within the appliance 250.
  • the control blade 310 is an example of a central management component.
  • the console 380 can allow an operator to communicate with the appliance 250.
  • Disk 390 may include logic and data collected from or used by the control blade 310.
  • the control blade 310 is bi-directionally coupled to a hub 320.
  • the hub 320 is bi-directionally coupled to each management blade 330 within the appliance 250.
  • Each management blade 330 is bi-directionally coupled to the network 212 and fabric blades 340. Two or more of the fabric blades 340 may be bi-directionally coupled to one another.
  • the management infrastructure can include the appliance 250, network 212, and software agents on the network devices 232 to 237 and their corresponding components. Note that some of the components within the management infrastructure (e.g., the management blades 330, network 212, and software agents on the components) may be part of both the application and management infrastructures. In one embodiment, the control blade 310 is part of the management infrastructure, but not part of the AI.
  • connections and additional memory may be coupled to each of the components within the appliance 250.
  • the appliance 250 may include one or more management blades 330. When two or more management blades 330 are present, they may be connected to different parts of the AI. Similarly, any number of fabric blades 340 may be present.
  • the control blade 310 and hub 320 may be located outside the appliance 250, and in yet another embodiment, nearly any number of appliances 250 may be bi-directionally coupled to the hub 320 and under the control of the control blade 310.
  • FIG. 4 includes an illustration of one of the management blades 330.
  • Each of the management blades 330 is an illustrative, non-limiting example of a management execution component and has logic to act on its own or can execute on directives received from a central management component (e.g., the control blade 310).
  • a management execution component does not need to be a blade, and the management execution component could reside on the same blade as the central management component.
  • Some or all of the components within the management blade 330 may reside on one or more integrated circuits.
  • Each of the management blades 330 can include a system controller 410, a central processing unit ("CPU") 420, a field programmable gate array ("FPGA") 430, a bridge 450, and a fabric interface ("I/F") 440.
  • the system controller 410 is bi-directionally coupled to the hub 320.
  • the system controller 410, the CPU 420, and the FPGA 430 are bi-directionally coupled to one another.
  • the bridge 450 is bi-directionally coupled to a media access control (“MAC”) 460, which is bi-directionally coupled to the network 212.
  • the fabric I/F 440 is bi-directionally coupled to the system controller 410 and a fabric blade 340.
  • More than one of any or all components may be present within the management blade 330.
  • a plurality of bridges substantially identical to bridge 450 may be used and would be bi-directionally coupled to the system controller 410, and a plurality of MACs substantially identical to the MAC 460 may be used and would be bi-directionally coupled to the bridge 450.
  • memories may be coupled to any of the components within the management blade 330.
  • content addressable memory, static random access memory, cache, first-in-first-out (“FIFO”), or other memories or any combination thereof may be bi-directionally coupled to the FPGA 430.
  • the control blade 310, the management blades 330, or any combination thereof may include a central processing unit ("CPU") or controller. Therefore, the appliance 250 is an example of a data processing system.
  • other connections and memories may reside in or be coupled to any of the control blade 310, the management blade(s) 330, or any combination thereof.
  • Such memories can include content addressable memory, static random access memory, cache, FIFO, other memories, or any combination thereof.
  • the memories, including the disk 390 can include media that can be read by a controller, CPU, or both. Therefore, each of those types of memories includes a data processing system readable medium.
  • Portions of the methods described herein may be implemented in suitable software code that includes instructions for carrying out the methods.
  • the instructions may be lines of assembly code or compiled C++, Java, or other language code.
  • Part or all of the code may be executed by one or more processors or controllers within the appliance 250 (e.g., on the control blade 310, one or more of the management blades 330, or any combination thereof) or on one or more software agent(s) (not illustrated) within the network devices 232 to 237, or any combination of the appliance 250 or software agents.
  • the code may be contained on a data storage network device, such as a hard disk (e.g., disk 390), magnetic tape, floppy diskette, CD ROM, optical storage network device, storage network (e.g., storage network 236), or other suitable data processing system readable medium or storage network device, or any combination thereof.
  • the functions of the appliance 250 may be performed at least in part by another apparatus substantially identical to appliance 250 or by a computer (e.g., console 380).
  • a computer program or its software components with such code may be embodied in more than one data processing system readable medium in more than one computer. Note that the appliance 250 is not required, and its functions can be incorporated into different parts of the distributed computing environment 200.
  • Each of the network devices 232 to 237 is connected to the appliance 250 via the network 212.
  • Substantially all of the network traffic to and from each of the network devices 232 to 237 passes through the appliance 250, and more specifically, at least one of the management blades 330.
  • the appliance 250 can more closely manage and control the distributed computing environment 200 in real time or near real time.
  • the distributed computing environment 200 dynamically changes in response to (1) applications running within the distributed computing environment 200, (2) changes regarding network devices or other components within the distributed computing environment 200 (e.g., provisioning or de-provisioning a server), (3) changes in priorities of applications, transaction types, or both to more closely match the business objectives of the organization operating the distributed computing environment 200, or (4) any combination thereof.
  • the network traffic on the network 212 includes content traffic and management traffic. Therefore, the network 212 is a shared network. Separate, physical, parallel networks for content traffic and management traffic are not needed. The shared network helps to keep capital and operating expenses lower.
  • the network 212 can include one or more connections, a portion of the bandwidth within the network, or both, which may be reserved for management traffic and not be used for content traffic.
  • a network cable 540 may be attached to a connector 520 having connections 524.
  • a portion 502 of the connections 524 may be reserved for management traffic, and a portion 504 of the connections 524 may be reserved for content traffic.
  • the network traffic may include a bandwidth 600, as illustrated in FIG. 6.
  • the bandwidth 600 may include a portion 602 reserved for management traffic and a portion 604 reserved for content traffic.
  • a port on the appliance 250 can receive the connector 520 or otherwise be represented by the bandwidth 600.
  • Each network device 232 to 237 may be connected to its own port on the appliance 250.
  • FIGs. 5 and 6 are meant to illustrate and not limit the scope of the present invention.
  • the appliance 250 can address one or more network devices 232 to 237 or other AI components within any of the network devices 232 to 237 that may be causing a broadcast storm.
  • the portion 502 of the connections 524 or the portion 602 of the bandwidth 600 allows the appliance 250 to communicate to a software agent on the AI component to address the broadcast storm issue.
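A minimal sketch of the reserved-portion idea in FIGs. 5 and 6 follows, assuming a 10% management reservation on a 1 Gbps port; both figures are hypothetical, since the patent does not fix a particular split.

```python
# Sketch only: carve a management reservation out of a shared port so that
# management traffic still gets through during a broadcast storm. The link
# capacity and 10% reservation are assumed values.
LINK_CAPACITY_MBPS = 1000.0
MANAGEMENT_RESERVE = 0.10   # fraction of capacity never usable by content traffic

def admit(kind: str, offered_mbps: float, content_in_use_mbps: float) -> float:
    """Return how much of the offered traffic the port will carry."""
    if kind == "management":
        # Management traffic always has at least its reserved slice available.
        idle = LINK_CAPACITY_MBPS - content_in_use_mbps
        return min(offered_mbps, max(idle, MANAGEMENT_RESERVE * LINK_CAPACITY_MBPS))
    # Content traffic may never encroach on the management reservation.
    content_cap = (1.0 - MANAGEMENT_RESERVE) * LINK_CAPACITY_MBPS
    return min(offered_mbps, max(0.0, content_cap - content_in_use_mbps))

# A broadcast storm saturates the content share, yet management packets still pass:
print(admit("content", offered_mbps=5000, content_in_use_mbps=0))     # 900.0
print(admit("management", offered_mbps=50, content_in_use_mbps=900))  # 50.0
```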
  • a conventional shared network does not reserve connection(s) or a portion of the bandwidth solely for management traffic. Therefore, a designated managing component in a conventional shared network (e.g., workstation 138 in FIG. 1) would not be able to send a management communication to the AI component because the broadcast storm could consume all connections or bandwidth and substantially prevent any packets, including management packets, from being received by the AI component causing the broadcast storm.
  • the distributed computing environment 200 has the advantages of a separate, physical, parallel network but without its disadvantages, and with the advantages of a shared network but without its disadvantages.
  • each of the management blades 330 can extract management traffic from the network traffic or insert management traffic into the network traffic; send, receive, or transmit management traffic to or from any one or more of the appliance 250 and software agents residing on the AI components; analyze information within the network traffic; modify the behavior of managed components in the AI; or generate instructions or communications regarding the management and control of any portion of the AI; or any combination thereof.
  • these functions may be performed by the various elements within the management blades 330 (e.g., the system controller 410, CPU 420, FPGA 430, etc.).
  • the management blade 330 may perform one or more functions of one or more of the network devices connected to it. For example, if one of the firewall/routers 237 is having a problem, the management blade 330 may be able to detect, isolate, and correct a problem within such firewall/router 237. During the isolation and correction, the management blade 330 can be configured to perform the routing function of the firewall/router 237, which is an example of a Layer 3 network device in accordance with the OSI Model. This non-limiting, illustrative embodiment helps to illustrate the power of the management blades 330. In another embodiment, the management blade 330 may serve any one or more functions of other Layer 2 or higher network devices.
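The detect, isolate, and take-over behavior of a management blade described above could be organized roughly as in the sketch below. The class, method names, and the hard-coded failing device are invented for illustration; the patent does not prescribe a particular software structure.

```python
# Hypothetical sketch of a management blade detecting a failing network device,
# isolating it, and temporarily performing its function (e.g., Layer 3 routing).
class ManagementBladeSketch:
    def __init__(self):
        self.isolated = set()      # devices currently cut off from the AI
        self.assumed_roles = {}    # device -> function the blade performs for it

    def health_check(self, device: str) -> bool:
        """Placeholder probe; a real blade would query the device or its software agent."""
        return device != "router/firewall-237"   # pretend this one device is failing

    def manage(self, device: str, function: str) -> None:
        if self.health_check(device):
            return
        # Problem detected: isolate the device and take over its function.
        self.isolated.add(device)
        self.assumed_roles[device] = function
        print(f"isolated {device}; blade now performing '{function}'")

blade = ManagementBladeSketch()
blade.manage("router/firewall-237", "Layer 3 routing")  # fails -> blade takes over
blade.manage("web-server-233", "HTTP serving")          # healthy -> no action
```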
  • Another advantage of the embodiments described herein is that communications to and from a network device 232 to 237 are not dependent on another network device.
  • in a conventional distributed computing environment, such as the one illustrated in FIG. 1, the ability of the workstation 138 to communicate with any of the application servers 134 or database servers 135 depends on the state of the router 137. Therefore, the router 137 is an intermediate network device with respect to communications between the workstation 138 and the servers 134 and 135.
  • the distributed computing environment 200 as illustrated in FIG. 2 and described herein allows direct communication between the appliance 250 and each of the network devices 232 to 237 without having to depend on the state of another network device because there are no intervening network devices.
  • each of the network devices 232 to 237 may be directly connected to the network 212, which can be connected to one or more management blades 330.
  • each of the network devices 232 to 237 may be connected in parallel to different management blades 330 to account for possible failure in any one particular management blade 330.
  • the control blade 310 may detect that one of the web servers 233 is configured incorrectly. However, one of the management blades 330 may be malfunctioning. Control blade 310 may send a management communication through hub 320 and over a functional management blade 330 to the malfunctioning web server 233. Therefore, the malfunctioning management blade 330 is not used.
  • Embodiments can allow each network device within a distributed computing environment to be no more than "one hop" away from its nearest (local) management blade 330. By being only one hop away, the management infrastructure can manage and control network devices 232 to 237 and their corresponding components in real time or near real time.
  • the distributed computing environment 200 can also be configured to significantly reduce the likelihood that a single malfunctioning AI component brings down the entire distributed computing environment 200.
  • IV. Methods of Managing and Controlling a Distributed Computing Environment
  • Attention is now directed to methods for managing and controlling the distributed computing environment 200.
  • the methods may classify a network packet within a communication, and based upon the classification, properly affect the transmission of the network packet.
  • the classification may be based at least in part on one or more factors, including the application, transaction type, or a combination of application and transaction type with which the communication is associated, whether that network packet is a management packet, the source or destination of the network packets, one or more other relevant factors, or any combination thereof.
  • the classification can be based at least on the type of application, the transaction type, or an application-transaction type combination and can affect a stream or a flow.
  • the corresponding stream, corresponding flow, or corresponding stream-flow for a specific application, specific transaction type, or specific application-transaction type combination is hereinafter referred to as an "ATT-specific stream/flow."
  • the classification can be used to control one or more settings for the transmission of the network packet.
  • an examination module and a control settings module may reside within a management execution component (e.g., a management blade 330). The control settings module may be part of the examination module, or vice versa.
  • the method can include receiving a communication at the appliance 250 (block 700 in FIG. 7).
  • the communication may include one network packet or a collection of network packets that effectively make up one transmission from a source AI component to a destination AI component within the AI 210.
  • the communication between AI components on different network devices 232 to 237 within the AI 210 travels through the appliance 250.
  • this communication may be converted into other network packets (e.g., smaller network packets) by the MAC 460 of the management blade 330.
  • these packets may conform to the Open System Interconnection (OSI) seven layer standard.
  • the communication is assembled by the MAC 460 into Transmission Control Protocol/Internet Protocol ("TCP/IP") packets, which are a specific type of network packet.
  • the network packets can be part of a stream or a flow.
  • the receiving action is optional since some or all of the succeeding actions may be performed before a communication is received by the appliance 250 (e.g., on an agent at one of the network devices 232 to 237 or their corresponding components).
  • the method can also include examining and classifying the network packet (block 702).
  • the network packet may be classified by examining different OSI layers of the network packet.
  • the network packet can be classified as being a particular stream or a particular flow.
  • One or more parameters can be used in order to classify the network packet as belonging to a particular type of stream or flow.
  • the one or more parameters may include a virtual local area network identification, a source address, a destination address, a source port, a destination port, a protocol, a connection request, transaction type load tag, or any combination thereof.
  • the source and destination addresses may be IP addresses or other network addresses (e.g., Ix250srv).
  • the one or more parameters may exist within the header of each network packet.
  • one or more network packets within a communication may include a connection request that may be a simple "yes/no" parameter (i.e., whether or not the network packet represents a connection request).
  • the transaction type load tag may be used to define the transaction type related to a particular type of stream or flow. The transaction type load tag may be used to provide for more fine-grained control over ATT-specific stream/flows.
  • the classification may be based on whether the communication uses the TCP/IP or UDP/IP protocol.
  • a special IP address may be assigned to the control blade 310 to perform management functions, and therefore, all network packets that include the special IP address (as the source or destination address) may be classified as management packets.
  • the examination can focus on the protocol or the special IP address in these particular embodiments.
  • the classification of the network packet can be performed by a classification module within the management blade 330, and in one embodiment, the FPGA 430. In other embodiments, the classification module may reside in another portion of the appliance 250 or can be part of a management software agent on an AI component.
  • the classification may be aided by a tuple, which may be a combination of information from various layers of the network packet(s).
  • a tuple can associate a network packet with a particular type of stream or flow within a classification map.
  • the elements of this tuple (as may be stored by the FPGA 430 on the management blade 330) can include various fields that may be selected from the possible fields in Table 1.
  • a tuple including a particular IP source port, a particular IP destination port, and a particular protocol may be defined and associated with a particular type of stream or flow. If information that is extracted from various layers of an incoming network packet matches the information in the tuple in the classification map, the network packet may in turn be associated with that particular type of stream or flow (a classification sketch follows this list).
  • the classification module within management blade 330 can include logic to read one or more specific fields from the first 128 bytes of each network packet and record that information in memory. After reading this specification, skilled artisans will recognize that more detailed information may be added to the tuple to further qualify network packets as belonging to a particular type of stream or flow.
  • Packet processing on the management blade 330 may also include collecting dynamic traffic information corresponding to specific tuples. Traffic counts (number of bytes, number of network packets, or both) for each type of tuple may be kept and provided as gauges to analyze logic on management blade 330.
  • the method can include assigning the network packet to an ATT-specific stream/flow (block 703) using a stream/flow mapping table.
  • the stream/flow mapping table may contain a variety of entries that match a particular stream or flow with an ATT-specific stream/flow.
  • the stream/flow mapping table can contain 128 entries. Each entry maps the tuple associated with a network packet to one of 16 different ATT-specific stream/flows for distinct control, and the stream/flow mapping table can include settings for controls (e.g., priority, latency, connection throttle, packet throttle, etc.) within the distributed computing environment 200 (a mapping-table sketch follows this list).
  • a control settings module within the management blade 330 can be used for setting the controls.
  • the stream/flow mapping table may have more or fewer entries, streams, flows, or any combination thereof.
  • Use of the ATT-specific stream/flows may increase the ability to allocate different amounts of AI capacity to different applications, different transaction types, or a combination thereof by allowing the management infrastructure to distinguish between network packets (of different communications) that belong to different applications, different transaction types, or any combination thereof.
  • five basic ATT-specific stream/flows under which an incoming packet may be grouped can include: (1) web traffic (e.g., an AI stream or AI flow between the Internet and a web server); (2) application server traffic (e.g., the AI flow between a web server and an application server); (3) DB traffic (e.g., an AI stream or AI flow between an application server and a database); (4) management traffic (e.g., an AI stream or AI flow between AI components in AI 210 and the control blade 310); and (5) other traffic (e.g., all other AI streams or AI flows that cannot be grouped under the previous four categories).
  • more or fewer ATT-specific stream/flows may be present.
  • transaction type-specific network flows may be used in place of or in conjunction with the ATT-specific stream/flow. While many details have been given with respect to examining the network packet, classifying the network packet with respect to a particular type of stream or flow, assigning the network packet to the ATT-specific stream/flow, and setting controls for the network packet, many alternative embodiments are possible.
  • the classification mapping table and the stream/flow mapping table can be consolidated into one table.
  • the combination of the classification mapping table and stream/flow mapping table can be broken into more tables (e.g., classification mapping table, stream/flow mapping table without control settings, and a control settings table).
  • classification, assignment, setting controls, or any combination thereof may not be required.
  • examination may indicate that the network packet is a management packet. Classification and assignment as described above may not be used because default control settings for management packets may be used. After reading this specification, skilled artisans will appreciate that other embodiments can be used.
  • one or more actions may be assigned to a network packet based on the ATT-specific stream/flow to which the network packet is associated.
  • Actions can include one or more instructions based on the importance of the ATT-specific stream/flow.
  • An example of an action can include drop, meter, or inject.
  • a drop action may include dropping a network packet as the ATT-specific stream/flow associated with the network packet is of low importance relative to one or more other ATT-specific stream/flows.
  • a meter action may indicate that the network bandwidth, connection request rate, or both for an ATT-specific stream/flow is under analysis and the network packet is to be tracked or otherwise observed.
  • An inject action may indicate that the network packet is to be given a certain priority or placed in a certain port group.
  • the method includes determining whether the network packets are management packets (diamond 704). After the network packets are associated with an ATT-specific stream/flow, the network packets may be routed depending on whether the network packets are considered management traffic or other traffic (e.g., content traffic). If the network packet is considered to be part of management traffic, it may be redirected for special processing. Network packets that are part of the management traffic are referred to as management packets. If the network packets are management packets, they may be processed as illustrated in FIG. 8.
  • the method can include setting the highest priority for the communication (block 820), setting a value that results in the lowest latency for the communication (block 822), setting a value that results in no connection throttling for the communication (block 824), and setting a value that results in no packet throttling for the communication (block 826).
  • the control settings module can be used to set the controls.
  • the portion 502 of the connections 524 (FIG. 5) or the portion 602 of bandwidth 600 (FIG. 6) may be reserved for transmission of the management packets to allow management packets to control an AI component that is causing a broadcast storm.
  • the settings for priority can be simply based on a range of corresponding numbers, for example, from zero to seven (0 to 7), where zero (0) is the lowest priority and seven (7) is the highest priority.
  • the range for latency may be zero or one (0 or 1), where zero (0) means drop packets with normal latency and one (1) means drop packets with high latency.
  • the range for the connection throttle may be from zero to ten (0 to 10), where zero (0) means throttle zero (0) out of every ten (10) connection requests (i.e., zero throttling) and ten (10) means throttle ten (10) out of every ten (10) connection requests (i.e., complete throttling) (a throttle sketch follows this list).
  • the range for packet throttling may be substantially the same as the range for the connection throttling.
  • the above ranges are exemplary, and there may exist numerous other ranges of settings for priority, latency, connection throttling, and packet throttling. Moreover, the settings may be represented by nearly any group of alphanumeric characters.
  • the method can include transmitting the management packet (block 828). Accordingly, any management packets transmitted to a managed AI component from the management blade 330 of the appliance 250 can be afforded special treatment relative to content traffic by the distributed computing environment 200, and are transmitted expeditiously within the distributed computing environment 200. Moreover, any management packets that are received by the appliance 250 from a managed AI component are also afforded special treatment by the distributed computing environment 200 and are also expeditiously delivered through the distributed computing environment 200.
  • the method as illustrated in FIG. 8 may or may not be performed for the management packets within the appliance 250.
  • the management packets may be routed without having control settings set because no content traffic may be transmitted to or from the control blade 310 in this particular embodiment.
  • the routing of management packets can be performed in one or more different manners. Below is a specific description of how routing of management packets can be performed in accordance with a particular, non-limiting embodiment as seen from the perspective of a management execution component (e.g., a management blade 330 within the appliance 250), as depicted in FIG. 9.
  • the method can include determining whether the management packet is to be routed to an AI component (diamond 950). If the incoming management packet was received from a central management component (e.g., control blade 310 within the appliance 250), the management packet may be destined for an AI component. For this scenario ("yes" branch of diamond 950), the method can include routing the management packets to a management software agent on an AI component (block 970). If the management packet was received from an AI component ("no" branch of diamond 950), the method can include routing the management packet to a central management component (e.g., control blade 310) (block 960) (a routing sketch follows this list).
  • when the FPGA 430 determines the network packet is associated with management traffic, the network packet can be redirected by a switch for special or other processing by the CPU 420 on the management blade 330. If a determination is made that the management packet originated from an AI component (e.g., a network device 232 to 237) in the AI 210 (diamond 950), the CPU 420 may then forward this packet out through a management port on the management blade 330 to a management port on the control blade 310 (block 960).
  • the management packets can be routed to the CPU 420 on the management blade 330, and then redirected by the CPU 420 through a switch to an appropriate egress port to the network 212, which routes the management packets to an agent on an AI component coupled to that egress port.
  • the management blade 330 may be coupled to the control blade 310 (via the hub 320) and another AI, which is separate from the AI 210.
  • the management infrastructure allows management packets to be communicated between the management blade 330 and the control blade 310 without placing additional stress on the AI 210. Additionally, even if a problem exists in the AI 210, this problem does not affect communication between the control blade 310 and the management blade 330.
  • Since substantially all network traffic intended for AI components and their associated software agents in AI 210 can pass through at least one of the management blades 330, such management blade(s) 330 is (are) able to more effectively manage and control the AI components by regulating the transmission of network packets. More particularly, with regard to management traffic, when the management blade 330 determines that a management packet is destined for a management software agent local to an AI component in the AI 210, the management blade(s) 330 may hold the transmission of all network packets that are not management packets (i.e., content packets) to that particular AI component until the transmission of the management packets to the particular AI component has been completed. In this manner, management packets may be transmitted to any one or more AI components regardless of the volume and type of other traffic (i.e., content traffic) in the AI 210.
  • the management packets can alleviate one or more problems in the AI 210 by allowing AI components to be controlled and manipulated regardless of the type and volume of content traffic in the AI 210.
  • broadcast storms may prevent delivery of communications to an AI component when a conventional shared network is used.
  • the management packets may alleviate these broadcast storms in the AI 210, as transmission of content packets originating with a particular AI component may be postponed until one or more management packets, which address the problem on the particular AI component, are delivered to a management agent local to the AI component causing the broadcast storm.
  • the network packet is a content packet.
  • the method can include determining whether the content packet is to be delivered to an AI component from the management blade 330 within the appliance 250 (diamond 706). If yes, the content packet is processed as depicted in FIG. 10.
  • the method can include determining the setting for the priority of the content packet (block 1040), determining the setting for the latency of the content packet (block 1042), determining the setting for the connection throttle of the content packet (block 1044), determining the setting for the packet throttle of the content packet (block 1046), and transmitting the content packet (block 1048); in this embodiment, these are content packets that are part of the content traffic.
  • the control settings module can set controls for the content packet after the corresponding settings have been determined.
  • the control settings may be determined based in part on the identification table.
  • the content packet can be further assigned to an ATT-specific stream/flow using the stream/flow mapping table.
  • the stream/flow mapping table can be used to determine the values for the control settings.
  • the content packet can then be transmitted according to the above-determined settings.
  • content traffic can be processed as illustrated in the flow diagram in FIG. 11.
  • the method can start with determining whether the destination of the content packet is local to a management execution component (e.g., a management blade 330) where the content packet currently resides (diamond 1150). This assessment may be made by an analysis of various layers of the content packet.
  • the management blade 330 may determine the IP address of the destination of the content packet or the IP port destination of the content packet, by examining various layers of the content packet. In one embodiment, this examination is done by logic associated with a switch within the management blade 330, or more specifically, by the FPGA 430.
  • the management blade 330 may be aware of the IP addresses and ports that may be accessed through that management blade's egress ports coupled to the network 212. If a network packet has an IP destination address or an IP port destination that may be accessed through a port coupled to that management blade 330 ("yes" branch from block 1150), the destination of the network packet is local to the management blade 330. Alternatively, if the network packet contains an IP destination address or an IP port destination that cannot be accessed through any of that management blade's egress ports coupled to the network 212, the destination of the network packet is remote to that management blade 330.
  • a communication can pass from one AI component over the network 212 to the top management blade 330 (as illustrated in FIG. 3), through one or more fabric blades to the bottom management blade 330 (as illustrated in FIG. 3), and over the network 212 to another AI component.
  • the top management blade 330 is a remote management execution component with respect to the destination port or address
  • the bottom management blade 330 is a local management execution component with respect to the destination port or address.
  • a switch in the management blade 330 can make the determination.
  • the method includes determining if the management blade 330 wherein the content packet currently resides is the local management execution component (diamond 1150).
  • the determination can be made by the management blade 330 on which the content packet currently resides.
  • the fabric I/F 440 on that management blade may determine which management execution component (e.g., a management blade 330) is local to the AI component that is the destination of the content packet.
  • the method can also include assigning one or more control settings (e.g., priority, latency, etc.) to the content packet (block 1160).
  • the content packet can be routed through one or more fabric blades 340 to the local management execution component (e.g., the local management blade 330 for the particular packet) (block 1170).
  • the content packet may be forwarded to one or more fabric blades 340 for delivery to another management blade 330, which is local to the port for which the network packet is destined (block 1070).
  • the method can include assigning a latency and a priority to the network packet (block 1160) based at least in part upon the ATT-specific stream/flow with which it is associated. The method can also include forwarding the network packet to the other management execution component (block 1170), which is the local management blade with respect to the AI component that is the destination of the network packet.
  • the content packet may then be converted into a fabric packet suitable for transmission through the one or more fabric blades 340, and the fabric packet can be reconverted back to the content packet for use by the destination AI component.
  • the conversion from the content packet to the fabric packet or vice versa may be performed on a management blade 330 (e.g. the fabric I/F 440), a fabric blade 340, or both.
  • the fabric blade 340 may use virtual lanes, virtual lane arbitration tables, service levels, or any combination thereof to transmit fabric packets between the fabric blades 340 based upon one or more control settings, including latency, priority, etc.
  • Virtual lanes may be multiple independent data flows sharing the same physical link but utilizing separate buffering and flow control for each latency or priority.
  • Embedded in each fabric I/F 440 hardware port may be an arbiter that controls usage of these links based on the control settings assigned to different packets.
  • the fabric blade 340 may utilize weighted fair queuing to dynamically allocate each fabric packet a proportion of link connections or bandwidth between the fabric blades 340. These virtual lanes and weighted fair queuing can combine to improve fabric utilization, reduce the likelihood of deadlock, and provide differentiated service between packet types when transmitting packets between different management blades 330.
  • the method can include calculating a weighted random early discard (WRED) value for the network packet (block 1180).
  • the weighting can be based at least in part on the specific stream or the specific flow associated with the content packet.
  • the weighting can be based at least in part on the ATT-specific stream/flow.
  • the WRED value can be used to help the management blade 330 deal with contention for one or more ports, and corresponding transit queues that may form at those ports.
  • Random early discard can be used as a form of load shedding which is commonly known in the art, the goal of which is to preserve a minimum average queue length for the queues at ports on the management blade 330.
  • the end effect of this type of approach is to maintain some bounded latency for a network packet arriving at the management blade 330 and intended for a port on management blade 330.
  • the management blade 330 may calculate a WRED value to influence which one or more content packets are discarded based on the application, transaction type, component, or any combination thereof with which the content packet is associated. Therefore, the management blade 330 may calculate this WRED value based upon a combination of the contention level for the port for which the content packet is destined and a control value associated with the ATT-specific stream/flow with which the content packet is associated (a WRED sketch follows this list).
  • this control mechanism may be a stream rate control, a flow rate control, or a combination stream rate-flow rate control, and a value for such control.
  • Each ATT-specific stream/flow may have a distinct rate value. While the rate value may be a single number, the stream rate or the flow rate control may actually control two distinct aspects of the managed application environment.
  • a first aspect can include control of the connections (FIG. 5) or bandwidth (FIG. 6) available for specific links, including links associated with ports from the management blade 330 and links associated with the outbound fabric I/F 440 between the management blades 330 or other portions of the distributed computing environment 200.
  • This methodology, in effect, presumes that the connections or bandwidth of a specific link are a scarce resource. Thus, when contention occurs for a port, a queue of the network packets waiting to be sent through the port and down the link would normally form.
  • the rate control effectively allows determination of which content packets from which ATT-specific stream/flow get a greater or lesser percentage of the available connections or bandwidth of that port and corresponding network link.
  • a second aspect of this control mechanism can use the access to the egress port or network link as a surrogate for the remainder of the managed and controlled AI 210 that resides on the downstream side of the port. By controlling which content packet gets prioritized at the egress to the port, the rate control also affects the mix of network packets seen by a particular AI component connected to the egress port.
  • the rate control value may correspond to a number of bytes which will be transmitted out through an egress port and down a network link each second.
  • the control value may range from 0 to 19, where each increment corresponds to a specific number of bytes per second on a logarithmic scale, allowing an improved degree of control over the number of bytes actually transmitted (a rate-control sketch follows this list).
  • the correspondence may be as indicated in Table 2.
  • the method can continue with delivering the communication (block 1190).
  • the method can include determining if the content packet is being received from an AI component (diamond 708) or whether the content packet is being delivered via a virtual local area network uplink (diamond 710).
  • the "yes" branch from diamond 708 or 710 continues with substantially the same processing, as illustrated in FIG. 12.
  • the method can include determining the setting for the priority of the content packet (block 1260), and determining the setting for the latency of the content packet (block 1262).
  • the control settings module can set the control based on the determination.
  • the method can also include transmitting the content packet in accordance with those settings (block 1264).
  • the method can include determining whether the content packet is being delivered via a VLAN downlink (diamond 1264). If so, the content packet is processed, as illustrated in FIG. 13.
  • the method can include determining a setting for the connection throttle for the content packet (block 1370) and determining a setting for the packet throttle for the content packet (block 1372).
  • the control settings module can set the control based on the determination.
  • the method can also include transmitting the content packet in accordance with those settings (block 1374).
  • management packets are typically assigned to the highest level of importance for control settings as compared to content packets.
  • the distributed computing environment 200 can allow management traffic and content traffic to share the same network; however, a portion of the connections or a portion of the bandwidth of the network can be reserved for management traffic. In this manner, the appliance 250 can exert real time or near real time control over the distributed computing environment 200.
  • control settings can be set based at least in part on the application or transaction type with which the network packet is associated. In this manner, the level of importance between applications, transaction types within applications, or any combination thereof can be adjusted to better meet the business objectives of the entity using the distributed computing environment 200. For example, a store-front application can be given preference over an inventory management application. Alternatively, the level of importance can be based on transaction types.
  • an order placement transaction of a store-front application can be more important than a vendor delivery schedule for an inventory management application, which in turn may be given more importance than an email help message in the store-front application that is sent to a customer service representative of the entity running the store-front application.
  • the business objectives can be static or change.
  • the appliance 250 can adapt the distributed computing environment 200 to meet business objectives that can change hourly, daily, monthly, or at any other time.
  • the business objectives can include increasing revenue, increasing profit, reducing inventory, making a deposit into an account before a deadline, etc.
  • a pipe may be a link between a managed AI component and a management blade 330.
  • a pipe may be a VLAN uplink or VLAN downlink.
  • a pipe may be a link between the control blade 310 and a management blade 330.
  • a pipe may be a link between two management blades 330 or an appliance backplane.
  • control settings including latency, priority, connection throttling, packet throttling, other suitable controls, or any combination thereof can be implemented on the management blade 330 (e.g., through the FPGA 430 or in software operating on a switching control processor (not illustrated) within the management blade 330).
  • for control settings (e.g., latency and priority) implemented at a managed AI component, a communication mechanism can exist between the control blade 310 and a software agent at the managed AI component in order to transmit to the software agent the values that are to be used for the control settings. Further, a mechanism can exist at the software agent in order to implement those settings.
  • connection throttling can be used at the management blade 330 or at the managed AI component. Since it may be difficult to retrieve a flow or stream once it has been sent into a pipe, in one embodiment, connection throttling can be implemented at the component from which a stream or a flow originates. Further, in an exemplary, non-limiting embodiment, when a flow or stream is being delivered via a
  • the latency and priority controls can be implemented on the management blade 330.
  • the connection throttle, the packet throttle, or both can also be implemented on the management blade 330.
  • streams or flows can be defined and created for each application, transaction type, or both in the distributed computing environment 200.
  • the pipes are also defined and created.
  • the necessary pipes are created for each uplink or downlink in each VLAN.
  • the provisioning and de-provisioning of certain AI components can have an impact on the distributed computing environment 200.
  • the provisioned server can result in the creation of one or more flows. Therefore, a mechanism can be provided to scan the classification mapping table and to create new entries.
  • the provisioned server can result in the creation of a new pipe.
  • the de-provisioned server can cause one or more flows to be no longer used. Therefore, a mechanism can be provided to scan the classification mapping table and delete entries as provisioning and de-provisioning occurs. If a managed AI component is added, corresponding flows and pipes can be created.
  • if a managed AI component is removed, the corresponding flows and pipes can be deleted. This also includes the management flows to and from the management blade 330 within the appliance 250.
  • if an uplink is added for a VLAN, the corresponding pipes can be created.
  • if an uplink is removed, the corresponding pipes can be deleted.
  • the classification mapping table can be considered dynamic during operation (i.e., entries are created and removed as AI components are provisioned and de-provisioned and as managed AI components are added and removed).
  • a number of flows within the distributed computing environment 200 may cross network devices that are upstream of a management blade 330.
  • the priority and latency settings that are established during the execution of the above-described method can have an influence on the latency and priority of those affected packets as they cross any upstream network devices.
  • the hierarchy established for priority can be based on a recognized standard (e.g., the IEEE 802.1p/802.1q standards).
  • the requestor may employ an exponential back-off mechanism before retrying the connection request (a back-off sketch follows this list).
  • the connection throttle can throttle connection requests in whatever manner is required to invoke the standard request back-off mechanism.
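To make the tuple-based classification described in the bullets above more concrete, the following is a minimal sketch in Python. It is illustrative only: the field names, the wildcard matching, and the example map entries are assumptions and are not the classification map or the FPGA logic of the embodiments described above.

```python
# Minimal sketch (assumed fields and map entries): classify a network packet by
# extracting header fields into a tuple and matching it against a classification map.
from typing import NamedTuple, Optional

class ClassTuple(NamedTuple):
    vlan_id: Optional[int]
    src_addr: Optional[str]
    dst_addr: Optional[str]
    src_port: Optional[int]
    dst_port: Optional[int]
    protocol: Optional[str]   # e.g., "TCP" or "UDP"

# Hypothetical classification map: tuple pattern -> stream/flow label.
# A None field acts as a wildcard so one pattern can match many packets.
CLASSIFICATION_MAP = {
    ClassTuple(None, None, None, None, 80,   "TCP"): "web-traffic",
    ClassTuple(None, None, None, None, 1521, "TCP"): "db-traffic",
    ClassTuple(None, None, "10.0.0.1", None, None, None): "management-traffic",
}

def classify(packet_fields: dict) -> str:
    """Return the stream/flow label whose pattern matches the packet fields."""
    for pattern, label in CLASSIFICATION_MAP.items():
        if all(p is None or p == packet_fields.get(f)
               for f, p in zip(ClassTuple._fields, pattern)):
            return label
    return "other-traffic"   # traffic that matches no pattern

print(classify({"dst_port": 80, "protocol": "TCP"}))          # web-traffic
print(classify({"dst_addr": "10.0.0.1", "protocol": "UDP"}))  # management-traffic
```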
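The stream/flow mapping table described above associates a classified packet with an ATT-specific stream/flow whose entry carries control values. A minimal sketch follows; the table layout, labels, and numeric settings are assumed for illustration and are not the 128-entry table of the embodiment.

```python
# Minimal sketch (assumed structure): map a classification label to control
# settings such as priority, latency, connection throttle, and packet throttle.
from dataclasses import dataclass

@dataclass
class ControlSettings:
    priority: int             # 0 (lowest) .. 7 (highest)
    latency: int              # 0 = drop packets with normal latency, 1 = high latency
    connection_throttle: int  # 0 .. 10 connection requests throttled out of every 10
    packet_throttle: int      # 0 .. 10 packets throttled out of every 10

# Hypothetical stream/flow mapping table keyed by the classification label.
STREAM_FLOW_TABLE = {
    "management-traffic": ControlSettings(7, 0, 0, 0),
    "web-traffic":        ControlSettings(5, 0, 2, 1),
    "db-traffic":         ControlSettings(6, 0, 0, 0),
    "other-traffic":      ControlSettings(1, 1, 5, 5),
}

def controls_for(label: str) -> ControlSettings:
    """Look up the control settings for an ATT-specific stream/flow."""
    return STREAM_FLOW_TABLE.get(label, STREAM_FLOW_TABLE["other-traffic"])

print(controls_for("management-traffic"))  # highest priority, no throttling
```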
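The connection and packet throttle ranges described above (zero to ten, meaning n of every ten requests throttled) can be realized with a simple rotating counter. The sketch below assumes that interpretation and is not taken from the embodiments.

```python
# Minimal sketch (assumed interpretation of the 0-to-10 throttle range): a
# throttle level of n throttles n out of every ten connection requests or packets.
class Throttle:
    def __init__(self, level: int):
        if not 0 <= level <= 10:
            raise ValueError("throttle level must be between 0 and 10")
        self.level = level
        self._position = 0  # position within the current group of ten requests

    def allow(self) -> bool:
        """Return True if this request passes, False if it is throttled."""
        position = self._position
        self._position = (self._position + 1) % 10
        # Throttle the first `level` requests out of every group of ten.
        return position >= self.level

throttle = Throttle(3)
decisions = [throttle.allow() for _ in range(10)]
print(decisions.count(False), "of 10 connection requests throttled")  # 3 of 10
```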
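The routing decision of FIG. 9, as described in the bullets above, reduces to a single branch on where the management packet was received from. The sketch below uses hypothetical component labels ("central-management", "ai-component") purely for illustration.

```python
# Minimal sketch (assumed labels): route a management packet either to the
# software agent on an AI component or to the central management component
# (control blade), depending on where it was received from.
def route_management_packet(packet: dict) -> str:
    """Return a description of where the management packet should be sent."""
    if packet.get("received_from") == "central-management":
        # Came from the control blade: destined for an AI component's agent.
        return f"forward to management agent on {packet['destination']}"
    # Came from an AI component: send it up to the central management component.
    return "forward to central management component (control blade)"

print(route_management_packet({"received_from": "central-management",
                               "destination": "network-device-233"}))
print(route_management_packet({"received_from": "ai-component"}))
```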
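The WRED calculation described above combines port contention with a control value associated with the packet's ATT-specific stream/flow. The sketch below assumes a simple product of queue occupancy and a per-flow weight; the actual weighting formula and thresholds are not specified by the description above.

```python
# Minimal sketch (assumed formula): a weighted random early discard probability
# derived from queue occupancy at the destination port and a per-stream/flow weight.
import random

def wred_discard_probability(queue_len: int, max_queue: int, flow_weight: float) -> float:
    """Contention (queue occupancy) scaled by a per-stream/flow weight, clamped to [0, 1]."""
    contention = queue_len / max_queue
    return min(1.0, contention * flow_weight)

def should_discard(queue_len: int, max_queue: int, flow_weight: float) -> bool:
    return random.random() < wred_discard_probability(queue_len, max_queue, flow_weight)

# A low-importance flow (weight 1.5) is discarded more aggressively than a
# high-importance flow (weight 0.2) at the same contention level.
print(wred_discard_probability(80, 100, 1.5))  # 1.0  -> always discard
print(wred_discard_probability(80, 100, 0.2))  # 0.16 -> rarely discard
```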
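The rate control values described above range from 0 to 19 on a logarithmic scale of bytes per second; the actual correspondence is given in Table 2, which is not reproduced here. The sketch below assumes an illustrative base rate and step factor, not the values of Table 2.

```python
# Minimal sketch (assumed base rate and step factor, not Table 2): each rate
# control value from 0 to 19 maps to a bytes-per-second rate, with each step
# multiplying the previous rate so that low values give fine-grained control.
BASE_BYTES_PER_SECOND = 1_000   # assumed starting rate for control value 0
STEP_FACTOR = 2.0               # assumed multiplier per increment

def rate_for(control_value: int) -> int:
    if not 0 <= control_value <= 19:
        raise ValueError("rate control value must be between 0 and 19")
    return int(BASE_BYTES_PER_SECOND * (STEP_FACTOR ** control_value))

for value in (0, 5, 10, 19):
    print(value, rate_for(value), "bytes/second")
```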
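The exponential back-off behavior invoked by connection throttling, noted in the bullets above, can be sketched as follows; the base delay, cap, and jitter strategy are assumptions chosen for illustration.

```python
# Minimal sketch (assumed parameters): a requestor whose connection request is
# throttled waits an exponentially increasing, jittered interval before retrying.
import random

def backoff_delay(attempt: int, base: float = 0.1, cap: float = 30.0) -> float:
    """Delay in seconds before retry number `attempt` (1, 2, 3, ...)."""
    delay = min(cap, base * (2 ** (attempt - 1)))
    return random.uniform(0, delay)   # full jitter to avoid synchronized retries

for attempt in range(1, 6):
    ceiling = min(30.0, 0.1 * 2 ** (attempt - 1))
    print(f"attempt {attempt}: wait up to {ceiling:.1f} s")
```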

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A distributed computing environment can include a network that is used for management traffic and content traffic. In one embodiment, a portion of connections or bandwidth may be reserved for management traffic, so that a component causing a broadcast storm or otherwise malfunctioning can be accessed by a management execution component. In another embodiment, network traffic may be classified as management traffic or content traffic. Management traffic can be treated with more importance than the content traffic and be assigned control settings accordingly. In a further embodiment, different applications, transaction types, or any combination thereof can be assigned different levels of importance to better meet business objectives of an entity using the distributed computing environment. A more important application or transaction type can be assigned control settings to allow its streams or flows to be transmitted more quickly and successfully compared to a less important application or transaction type.

Description

DISTRIBUTED COMPUTING ENVIRONMENT AND METHODS FOR MANAGING AND CONTROLLING THE SAME
FIELD OF THE DISCLOSURE The invention relates in general to distributed computing environments, and more particularly, distributed computing environments and methods for managing and controlling the same.
DESCRIPTION OF THE RELATED ART Distributed computing environments are extensively used in computing applications. The distributed computing environments are growing more complex. In order to manage and control the distributed computing environment, two approaches are typically taken: parallel networks and software-based management tools.
Parallel networks allow content traffic to be routed over one network and management traffic to be routed over a separate physical network. The public telephone system is an example of such a parallel physical network. The content traffic can include voice and data that most people associate with telephone calls or telephone-based Internet connections. The management traffic controls network devices (e.g., computers, servers, hubs, switches, firewalls, routers, etc.) on the content traffic network, so that if a network device fails, the failed network device can be isolated, and content traffic can be re-routed to another network device without the sender or the receiver of the telephone call perceiving the event. Parallel physical networks are expensive because two separate networks must be created and maintained. Parallel physical networks are typically used in situations where the content traffic must go through regardless of the state of individual network devices within the content traffic network.
Software-based management applications work poorly because of their inherent limitations, in that the content traffic and the management traffic share the same network. FIG. 1 includes a typical prior art application infrastructure topology that may be used within a distributed computing environment. An application infrastructure 110 may include two portions 140 and 160 that can be connected together by a router 137. Application servers 134 and database servers 135 reside in the portion 140. Web servers 133 and workstation 138 reside in the portion 160. In order for any one of network devices within the portion 140 to communicate with any one of the network devices within the portion 160, the communication must pass through the router 137. One network device (e.g., workstation 138) may be designated as a management component for the distributed computing environment. The workstation 138 may be responsible for managing and controlling the application infrastructure 110, including all network devices. However, if router 137 is malfunctioning, workstation 138 may not be able to communicate with network devices (e.g., the application servers 134 and database servers 135) in the portion 140. Consequently, while the router 137 is non-functional, network devices in the portion 140 are without management and control. The workstation may not effectively manage and control the distributed computing environment in a coherent manner because the workstation 138 cannot manage and control network devices within the portion 140.
Another problem with the application infrastructure 110 is its inability to effectively address a broadcast storm. For example, a malfunctioning component (hardware, software, or firmware) within the portion 140 may cause a broadcast storm. The router 137 and its network connections have a limited bandwidth and may effectively act as a bottleneck. The broadcast storm may swamp the router 137 with traffic. By the time the workstation 138 detects the broadcast storm, it may be too late to address the broadcast storm. Management traffic from the workstation 138 competes with content traffic from the broadcast storm, and therefore, the management traffic cannot correct the problem until after the broadcast storm subsides. During the broadcast storm, the network devices (e.g., the application servers 134 and database servers 135) within the portion 140 operate without management and control because the management traffic competes with the content traffic on the same shared network.
Still another problem is that the distributed computing environment may not prioritize, allocate resources, or otherwise treat different applications differently. A relatively less important application (e.g., streaming a relatively large multimedia file) may compete with a relatively more important application (e.g., operating store-front application). Because the distributed computing environment does not discern differences in importance for different applications, a less important application may consume resources that are needed for the more important application and cause missed business opportunities (e.g., revenue, profit, sales, etc.) or other adverse consequences.
SUMMARY A distributed computing environment can include a network that is shared by content traffic and management traffic. Effectively, a management network can be overlaid on top of a content network, so that the shared network operates similar to a parallel network, but without the cost and expense of creating a physically separate parallel network. In one embodiment, network packets that are transmitted over the network can be classified as management packets (part of the management traffic) or content packets (part of the content traffic). After being classified, the network packets can be routed as management traffic or content traffic as appropriate. Because at least some of the shared network is reserved for management traffic, management traffic can reach the network devices, including a network device from which a broadcast storm originated. Therefore, the network can have the advantages of a separate parallel network but without its disadvantages, and can have the advantages of a shared network but without its disadvantages.
In another embodiment, the distributed computing environment can include an application infrastructure where substantially all network devices within the distributed computing environment are directly connected to an appliance that manages and controls the distributed computing environment. Knowledge of the functional state of and the ability to manage any network device within the distributed computing environment is not dependent on the functional state of any other network device within the application infrastructure. Management packets between the appliance and the managed components within the distributed computing environment can be effectively only "one hop" away from their destination.
The configuration of the distributed computing environment, in accordance with this embodiment, can also allow for better visibility of the entire application infrastructure. In a conventional shared-network distributed computing environment, one or more network devices may not be visible if an intermediate network device (e.g., the router 137), which lies between another network device (e.g., the application servers 134 and database servers 135) and a central management component (e.g., the workstation 138), malfunctions. Unlike the conventional system, connections between the network devices and the appliance can allow for better visibility to each of the network devices, components within the network devices, and all network traffic, including management and content traffic, within the distributed computing environment.
In another embodiment, the relative importance among applications, transaction types, or combinations thereof running within a distributed computing environment can be determined. One or more control settings for the distributed computing environment, or a portion thereof, can be assigned to one or more network packets within a stream or flow to better achieve the business objectives of the entity using the distributed computing environment.
In one set of embodiments, a distributed computing environment can include at least one apparatus that controls at least a portion of the distributed computing environment, network devices, and a network lying between each of the network devices and the at least one apparatus. The network is configured to allow content traffic and management traffic within the at least a portion of the distributed computing environment to travel over the same network. The network is configured such that at least a portion of a connection or a bandwidth within the network is reserved for the management traffic and is not used for the content traffic. The at least one apparatus, the network devices, or any combination thereof includes a classification module to classify a network packet as part of the content traffic or the management traffic.
In another set of embodiments, a distributed computing environment can include at least one apparatus that controls the distributed computing environment, wherein the at least one apparatus includes a central management component and at least one management execution component, and wherein the at least one apparatus includes a classification module to classify a network packet as part of management traffic or content traffic. The distributed computing environment can also include network devices, wherein substantially all network traffic between any two network devices passes through the at least one management network component.
In still another set of embodiments, an apparatus can be used to control at least a portion of a distributed computing environment. The apparatus can include a central management component, and at least one management execution component, and a classification module to classify a network packet as part of management traffic or content traffic within the distributed computing environment. In a further set of embodiments, a method of controlling at least a portion of a distributed computing environment can include examining a network packet, classifying the network packet as management data or content data, and routing the network packet based on the classification.
In still a further set of embodiments, a method of controlling at least a portion of a distributed computing environment can include classifying a network packet associated with a stream or a flow, and setting a control for the stream, the flow, or a pipe based at least in part on the classification.
In yet a further set of embodiments, a method can be used to process a network packet in a distributed computing environment that includes an application infrastructure. The method can include receiving a communication from the application infrastructure, wherein the communication includes the network packet, classifying the network packet, and setting a control for the network packet based at least in part on the classification.
In other sets of embodiments, a data processing system readable medium can comprise code that can include instructions for carrying out any one or more of the methods and may be used within a distributed computing environment. In further sets of embodiments, an apparatus can be configured to carry out any part or all of any of the methods described herein, the apparatus can include any part or all of any of the data processing system readable media described herein, an apparatus can include any part or all of any of the systems described herein, an apparatus can be a part of any of the systems described herein, or any combination thereof.
The foregoing general description and the following detailed description are illustrative and explanatory only and are not restrictive of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS The present invention is illustrated by way of example and not limitation in the accompanying figures, in which the same reference number indicates similar elements in the different figures.
FIG. 1 includes an illustration of a prior art application infrastructure. FIG. 2 includes an illustration of a hardware configuration of an appliance for managing and controlling a distributed computing environment.
FIG. 3 includes an illustration of a hardware configuration of the application infrastructure management and control appliance in FIG. 2.
FIG. 4 includes an illustration of a hardware configuration of one of the management blades in FIG.
FIG. 5 includes an illustration of a network connector, wherein at least one connector is reserved for management traffic and other connectors can be used for content traffic. FIG. 6 includes an illustration of a bandwidth for a network, wherein at least one portion of the bandwidth is reserved for management traffic and another portion of the bandwidth can be used for content traffic.
FIGs. 7-13 include a flow diagram for a method of using the distributed computing environment of FIG. 2.
Skilled artisans appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of embodiments of the present invention.
DETAILED DESCRIPTION A distributed computing environment can include a network that is used for transmitting management traffic and content traffic to components within the distributed computing environment. In one embodiment, substantially all network traffic between two different network devices within the distributed computing environment can be routed through an appliance. In another embodiment, the network traffic can be classified as management traffic or content traffic.
A portion of connections or bandwidth of the network may be reserved for management traffic, so that a component causing a broadcast storm or otherwise malfunctioning can be accessed by the appliance to address a problem, isolate a component, or perform another appropriate action. By reserving a portion of the connections or bandwidth, a shared network (one network shared by management and content traffic) can be configured to act more like a parallel network where management traffic and content traffic are routed over different portions of connections or bandwidth.
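As a rough illustration of reserving a portion of a shared network's connections for management traffic, the following sketch models a connection pool in which content traffic may use only the unreserved share while management traffic can still obtain a connection once content traffic has consumed every unreserved slot. The pool sizes and the acquire interface are assumptions made for illustration, not part of the described appliance.

```python
# Minimal sketch (assumed pool sizes): a shared pool of connections with a
# fixed portion reserved for management traffic.
class ConnectionPool:
    def __init__(self, total: int, reserved_for_management: int):
        self.total = total
        self.reserved = reserved_for_management
        self.in_use_content = 0
        self.in_use_management = 0

    def acquire(self, is_management: bool) -> bool:
        """Try to take one connection; return True on success."""
        if is_management:
            # Management traffic may use any free connection, including the reserve.
            if self.in_use_management + self.in_use_content < self.total:
                self.in_use_management += 1
                return True
            return False
        # Content traffic may only use the unreserved share of the pool.
        if self.in_use_content < self.total - self.reserved:
            self.in_use_content += 1
            return True
        return False

pool = ConnectionPool(total=8, reserved_for_management=2)
print([pool.acquire(is_management=False) for _ in range(8)])  # only 6 succeed
print(pool.acquire(is_management=True))                       # reserve still available
```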
In a further embodiment, different applications, transaction types, or any combination thereof can be assigned different levels of importance to better meet business objectives of an entity using the distributed computing environment. Thus, not all content traffic may be treated the same. A more important application or transaction type can be assigned control settings to allow its streams or flows to be transmitted more quickly and successfully, as compared to a less important application or transaction type.
Portions of the detailed description have been placed into sections to allow readers to more quickly locate subject matter of particular interest to them. The sections include Brief Description of Exemplary Aspects and Embodiments, Definition and Clarification of Terms, Exemplary Hardware Architecture, and Methods of Managing and Controlling a Distributed Computing Environment.
I. Brief Description of Exemplary Aspects and Embodiments. Many different aspects and embodiments are described herein that are related to methods, apparatuses, and systems for controlling a portion or all of a distributed computing environment. After reading this specification, skilled artisans will appreciate that the aspects and embodiments may or may not be implemented independently of one another. More specifically, after network packets are classified as management packets, such network packets may not require further mapping to a specific stream/flow to determine one or more control settings that are to be used as described herein. Skilled artisans appreciate that they may selectively choose to implement any portion or combination of portions of the embodiments, as described herein, to meet the needs or desires for their own applications.
In a first aspect, a distributed computing environment can include at least one apparatus that controls at least a portion of the distributed computing environment, network devices, and a network lying between each of the network devices and the at least one apparatus. The network is configured to allow content traffic and management traffic within the at least a portion of the distributed computing environment to travel over the same network. The network is configured such that at least a portion of a connection or a bandwidth within the network is reserved for the management traffic and is not used for the content traffic. The at least one apparatus, the network devices, or any combination thereof includes a classification module to classify a network packet as part of the content traffic or the management traffic.
In one embodiment of the first aspect, each of the network devices is directly connected to the at least one apparatus. In another embodiment, the distributed computing environment is configured so that substantially all network traffic to and from each of the network devices passes through the at least one apparatus.
In still another embodiment of the first aspect, the at least one apparatus includes a central management component and at least one management execution component, each of the network devices includes a software agent, and a management infrastructure includes the central management component, the at least one management execution component, the software agents, and at least a portion of the network. In a particular embodiment, the management execution component is operable to detect a problem on at least one network device, correct the problem on the at least one network device, isolate the at least one network device, or any combination thereof. In another particular embodiment, the distributed computing environment is configured so that substantially all network traffic between any two network devices passes through the at least one management execution component. In still another particular embodiment, an application infrastructure includes the at least one management execution component, the network devices and software agents, and at least a portion of the network. In a more particular embodiment, the central management component is not part of the application infrastructure. In a further embodiment of the first aspect, the network devices include at least one Layer 2 network device and at least one Layer 3 network device. In still a further embodiment, at least one of the network devices includes a Layer 2 network device and a Layer 3 network device. In yet a further embodiment, the at least one management execution component is configured to perform a routing function of a Layer 3 network device. In another embodiment, the at least one apparatus is configured to identify a network packet as a management packet or a content packet.
In a second aspect, a distributed computing environment can include at least one apparatus that controls the distributed computing environment, wherein the at least one apparatus includes a central management component and at least one management execution component, and wherein the at least one apparatus includes a classification module to classify a network packet as part of management traffic or content traffic. The distributed computing environment can also include network devices, wherein substantially all network traffic between any two network devices passes through the at least one management network component.
In one embodiment of the second aspect, each of the network devices is directly connected to the at least one management execution component. In another embodiment, the distributed computing environment further includes a network lying between each of the network devices and the at least one apparatus, wherein the network is configured to allow the content traffic and the management traffic within the distributed computing environment to travel over the same network. In still another embodiment, each of the network devices includes a software agent, and a management infrastructure includes the central management component, the at least one management execution component, the software agents, and at least a portion of a network between the network devices and the at least one apparatus. In a particular embodiment, an application infrastructure includes the at least one management execution component, the network devices and the software agents, and at least a different portion of the network. In a more particular embodiment, the central management component is not part of the application infrastructure.
In a further embodiment of the second aspect, the management execution component is operable to detect a problem on at least one network device, correct the problem on the at least one network device, isolate the at least one network device, or any combination thereof. In still a further embodiment, the network devices include at least one Layer 2 network device and at least one Layer 3 network device. In yet a further embodiment, at least one of the network devices includes a Layer 2 network device and a Layer 3 network device. In another embodiment, the at least one management execution component is configured to perform a routing function of a Layer 3 network device.
In a third aspect, an apparatus can be used to control at least a portion of a distributed computing environment. The apparatus can include a central management component, and at least one management execution component, and a classification module to classify a network packet as part of management traffic or content traffic within the distributed computing environment.
In one embodiment of the third aspect, the apparatus further includes ports configured to receive connections from network devices. At least one of the ports has associated connections, wherein at least a portion of the associated connections is reserved for management traffic, an associated bandwidth, wherein at least a portion of the associated bandwidth is reserved for the management traffic, or any combination thereof. In a particular embodiment, the at least one management execution component includes a first management blade, wherein each network device within the distributed computing environment is connected to the first management blade. In a more particular embodiment, the at least one management execution component includes a second management blade, wherein each of the network devices within the distributed computing environment is connected to the second management blade. In still another embodiment of the third aspect, the apparatus further includes a port connectable to a network device, wherein, when the network device malfunctions, the apparatus is configured to perform a function of the network device. In a particular embodiment, when the network device malfunctions, the apparatus is further configured to isolate the network device from a remaining portion of the distributed computing environment.
In a further embodiment of the third aspect, the at least one management execution component is configured to perform a routing function of a Layer 3 network device. In still a further embodiment, the apparatus further includes a control setting module, wherein, based at least in part on the classification, the control setting module sets a control for the network packet, wherein the control corresponds to a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof. In yet a further embodiment, the at least one management execution component is configured to receive the content traffic and the management traffic, and the central management component is configured to receive the management traffic but not the content traffic.
In a fourth aspect, a method of controlling at least a portion of a distributed computing environment can include examining a network packet, classifying the network packet as management data or content data, and routing the network packet based on the classification.
In one embodiment of the fourth aspect, classifying the network packet includes classifying the network packet based on a protocol, a source address, a destination address, a source port, a destination port, or any combination thereof. In a particular embodiment, classifying the network packet includes classifying the network packet using a stream/flow mapping table.
In another embodiment of the fourth aspect, routing the network packet includes routing the network packet over a management infrastructure. In a particular embodiment, routing the network packet further includes routing the network packet to a management execution component. In a more particular embodiment, routing the network packet further includes receiving the network packet at the management execution component, after the network packet is sent from a first network device, and sending the network packet from the management execution component to a second network device different from the first network device. In another particular embodiment, routing the network packet further includes routing the network packet over an application infrastructure. In a more particular embodiment, routing the network packet further includes routing the network packet to an agent on a network device. In another more particular embodiment, the method further includes blocking other traffic in the application infrastructure.
In still another embodiment of the fourth aspect, the method further includes setting a control for the network packet, wherein the control corresponds to a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof, wherein, setting the control is based at least in part on the classification. In a fifth aspect, a method of controlling at least a portion of a distributed computing environment can include classifying a network packet associated with a stream or a flow, and setting a control for the stream, the flow, or a pipe based at least in part on the classification.
In one embodiment of the fifth aspect, the method further includes examining a parameter of the network packet. In a particular embodiment, the parameter includes a virtual local area network identification, a source address, a destination address, a source port, a destination port, a protocol, a connection request, a transaction type load tag, or any combination thereof. In another particular embodiment, the method further includes associating the network packet with one of a set of specific flows/streams at least partially based on the parameter. In a more particular embodiment, associating the network packet includes using a classification mapping table, wherein an entry in the classification mapping table maps the network packet to a specific stream/flow. In still a more particular embodiment, each entry in the classification mapping table is mapped to an entry in a stream/flow mapping table. In a very particular embodiment, each entry in the classification mapping table or the stream/flow mapping table includes values for settings for priority, latency, a connection throttle, a packet throttle, or any combination thereof. In another embodiment of the fifth aspect, the method further includes determining a value of the setting based at least in part on the value of the parameter. In a particular embodiment, setting the control is applied once to the flow or the stream, regardless of a number of pipes used for the flow or the stream. In another particular embodiment, the value of the setting is obtained from a flow entry and not a stream entry of a table. In a sixth aspect, a method can be used to process a network packet in a distributed computing environment that includes an application infrastructure. The method can include receiving a communication from the application infrastructure, wherein the communication includes the network packet, classifying the network packet, and setting a control for the network packet based at least in part on the classification.
In one embodiment of the sixth aspect, classifying the network packet is based at least in part on a protocol, a source address, a destination address, a source port, a destination port, or any combination thereof. In a particular embodiment, classifying the network packet further includes associating the network packet with at least one of a set of application-specific streams, application-specific flows, transaction type-specific streams, transaction type-specific flows, or any combination thereof. In a more particular embodiment, associating the network packet is accomplished using a stream/flow mapping table, wherein an entry in the stream/flow mapping table maps the network packet to a stream or a flow. In an even more particular embodiment, members within the set are classified by types of traffic. In still an even more particular embodiment, the method further includes determining an action based on the stream or the flow associated with the network packet. In a very particular embodiment, the action includes at least one of drop, meter, and inject. In another more particular embodiment of the sixth aspect, the method further includes assigning a weighted random early discard value to the network packet, based at least in part on the stream or the flow associated with the network packet. In an even more particular embodiment, assigning the weighted random early discard value is based on a stream rate, a flow rate, or any combination thereof. In still an even more particular embodiment, the method further includes discarding the network packet based on the weighted random early discard value. In a very specific particular embodiment, the weighted random early discard value is based on a contention level for a port and a control value associated with the stream or the flow. The control value can be on a logarithmic scale.
In still another embodiment of the sixth aspect, setting the control includes setting a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof.
II. Definition and Clarification of Terms
A few terms are defined or clarified to aid in understanding the terms as used throughout this specification. The term "application" is intended to mean a collection of transaction types that serve a particular purpose. For example, a web site store front can be an application, human resources can be an application, order fulfillment can be an application, etc.
The term "application infrastructure" or "AI" is intended to mean any and all hardware, software, and firmware within a distributed computing environment used by a particular application. The hardware may include servers and other computers, data storage and other memories, networks, switches and routers, and the like. The software used may include operating systems and other middleware components (e.g., database software, JAVA™ engines, etc.). The application infrastructure can include physical components, logical components, or a combination thereof. The term "application infrastructure component" or "AI component" is intended to mean any part of an AI associated with an application. AI components may be hardware, software, firmware, network, or virtual AI components. Many levels of abstraction are possible. For example, a server may be an AI component of a system, a CPU may be an AI component of the server, a register may be an AI component of the CPU, etc. For the purposes of this specification, AI component and resource are used interchangeably. The term "central management component" is intended to mean a component that is capable of obtaining information from management execution component(s), software agents on managed components, or both, and providing directives to the management execution component(s), the software agents, or both. A control blade is an example of a central management component.
The term "communication" is intended to mean a packet or a collection of packets that are sent from a source component to a destination component within a distributed computing environment and can be represented by a stream or a flow.
The term "component" is intended to mean a part within a distributed computing environment. Components may be hardware, software, firmware, or virtual components. Many levels of abstraction are possible. For example, a server may be a component of a system, a CPU may be a component of the server, a register may be a component of the CPU, etc. Each of the components may be a part of an AI, a management infrastructure, or both. For the purposes of this specification, component and resource may be used interchangeably.
The term "connection throttle" is intended to mean a control for regulating a portion of connections or a portion of a bandwidth for a particular stream or a particular flow within an AI. For example, the connection throttle may exist at a beginning or end of a pipe. Moreover, the connection throttle may allow none, a portion, or all of the connections to be made or allow none, a portion or all of the bandwidth to be used for a particular stream or a particular flow.
The term "content traffic" is intended to mean the portion of the network traffic other than management traffic. In one embodiment, the content traffic includes network traffic used by application(s) running within a distributed computing environment.
The term "distributed computing environment" is intended to mean a collection of components comprising at least one application environment, wherein different types of components reside on different network devices connected to the same network.
The term "flow" is intended to mean a communication sent between two physical endpoints in a distributed computing environment. For example, a flow may be a communication that is coming from one port at one Internet protocol (IP) address and going to another port at another IP address using a particular protocol.
The term "classification mapping table" is intended to mean a table having one or more entries that correspond to predefined characteristics of a stream or a flow based on one or more values of parameters. The term "instrument" is intended to mean a gauge or control that can monitor or control a component or other part of an AI.
The term "latency" is intended to mean the amount of time it takes a network packet to travel from one AI component to another AI component. Latency may include a delay time before a network packet begins traveling. The term "local" is intended to mean a coupling of two components with no more than one intervening management execution component lying between those two components. For example, if two components reside on the same network device, network traffic may pass between the two components without passing through an intervening management execution component. If two components are connected to the same management blade, the network or other traffic may pass between the two components without passing through two or more intervening management execution components.
The term "logical," when referring to an instrument or component, is intended to mean an instrument or a component that does not necessarily correspond to a single physical component that otherwise exists or that can be added to an AI. For example, a logical instrument may be coupled to a plurality of instruments on physical components. Similarly, a logical component may be a collection of different physical components. The term "management execution component" is intended to mean a component in the flow of network traffic that may extract management traffic from the network traffic or insert management traffic into the network traffic; send, receive, or transmit management traffic to or from any one or more of a central management component and software agents residing on the AI components; analyze information within the network traffic; modify the behavior of managed components in the AI, or generate instructions or communications regarding the management and control of any portion of the AI; or any combination thereof. A management blade is an example of a management execution component.
The term "management infrastructure" is intended to mean any and all hardware, software, and firmware that are used to manage, control, or manage and control at least a portion of a distributed computing environment.
The term "management traffic" is intended to mean network traffic that is used to manage, control, or manage and control at least a portion of a distributed computing environment.
The term "network device" is intended to mean a Layer 2 or higher network device in accordance with the Open System Interconnection ("OSI") Model. A network device is a specific type of a component and may include a plurality of components.
The term "network traffic" is intended to mean all traffic, including content traffic and management traffic, on a network of a distributed computing environment.
The term "packet throttle" is intended to mean a control for regulating the transmission of packets over at least a part of the distributed computing environment. For example, the packet throttle may exist at a queue where packets are waiting to be transmitted through a pipe. Moreover, the packet throttle may allow none, a portion, or all of the network packets to be transmitted through the pipe.
The term "physical," when referring to an instrument or component, is intended to mean an instrument or a component that corresponds to a physical entity, including hardware, firmware, or software, that otherwise exists or that can be added to a distributed computing environment. For example, a physical instrument may be coupled to a physical component. Similarly, a physical component may be a server, a router, software used to operate the server or router, or the like.
The term "pipe" is intended to mean a physical network segment between two AI components. For example, a network packet may travel between two AI components via a pipe. A pipe is a physical network segment, and by analogy, is similar to a wire within a cable. The term "priority" is intended to mean the order or ranking in which packets, flows, or streams are to be transmitted over at least a portion of the distributed computing environment.
The term "remote" is intended to mean that at least two intervening management execution components (e.g., two management blades) lie between two specific components. If two components are connected to different management blades, the network or other traffic between the two components will pass through the different management blades.
The term "stream" is intended to mean an aggregate set of flows between two logical components, as opposed to flows that are between two physical components, in a distributed computing environment. The term "stream/flow mapping table" is intended to mean a table having one or more entries that correspond to predefined streams or predefined flows, wherein each of the predefined streams or predefined flows has one or more predefined settings for controls. In one embodiment, each entry in a stream/flow mapping table may have one or more predefined characteristics to which actual streams or actual flows within an AI may be compared. For example, a particular flow may substantially match a particular entry in a stream/flow mapping table and, as such, inherit the predefined control settings that correspond to that entry in the stream/flow mapping table. An example of a stream/flow mapping table can include a stream mapping table, a flow mapping table, or a combination stream-flow mapping table.
The term "transaction type" is intended to mean a type of task or transaction that an application may perform. For example, browse request and order placement are transactions having different transaction types for a store-front application.
As used herein, the terms "comprises," "comprising," "includes," "including," "has," "having" and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a method, process, article, or appliance that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such method, process, article, or appliance. Further, unless expressly stated to the contrary, "or" refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
Also, use of the "a" or "an" are employed to describe elements and components of the invention. This is done merely for convenience and to give a general sense of the invention. This description should be read to include one or at least one and the singular also includes the plural unless it is obvious that it is meant otherwise.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods, hardware, software, and firmware similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods, hardware, software, and firmware are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the methods, hardware, software, and firmware and examples are illustrative only and not intended to be limiting. Unless stated otherwise, components may be bi-directionally or uni-directionally coupled to each other. Coupling should be construed to include direct electrical connections and any one or more of intervening switches, resistors, capacitors, inductors, and the like between any two or more components.
To the extent not described herein, many details regarding specific network, hardware, software, firmware components and acts are conventional and may be found in textbooks and other sources within the computer, information technology, and networking arts.
III. Exemplary Hardware Architecture
Before discussing details of the embodiments, a non-limiting, illustrative hardware architecture is described. After reading this specification, skilled artisans will appreciate that many other hardware architectures can be used, and to list every one would be nearly impossible.
FIG. 2 includes a hardware diagram of a distributed computing environment 200. The distributed computing environment 200 includes an AI. The AI includes management blade(s) (not illustrated in FIG. 2) within an appliance 250 (i.e., an apparatus) and those components above and to the right of the dashed line 210 in FIG. 2. More specifically, the AI includes a router/firewall/load balancer 232, which is coupled to the Internet 231 or other network connection. The AI further includes web servers 233, application servers 234, and database servers 235. Other servers may be part of the AI but are not illustrated in FIG. 2. Each of the servers may correspond to a separate computer or may correspond to a virtual engine running on one or more computers. Note that a computer may include one or more server engines. The AI also includes a network 212, a storage network 236, and router/firewalls 237. The management blades within the appliance 250 may be used to route communications (e.g., packets) that are used by applications, and therefore, the management blades are part of the AI. Although not illustrated, other additional components may be used in place of or in addition to those components previously described. In another embodiment, fewer components than illustrated in FIG. 2 may be used.
Each of the network devices 232 to 237 is bi-directionally coupled in parallel to the appliance 250 via a network 212. Each of the network devices 232 to 237 is a component, and any or all of those network devices 232 to 237 can include other components (e.g., system software, memories, etc.) inside of such network devices 232 to 237. In the case of the router/firewalls 237, the inputs and outputs from the router/firewalls 237 are connected to the appliance 250. Therefore, substantially all the traffic to and from each of the network devices 232 to 237 in the AI is routed through the appliance 250. Software agents may or may not be present on each of the network devices 232 to 237 and their corresponding components within such network devices, or any combination thereof. The software agents can allow the appliance 250 to monitor and control at least a part of any one or more of the network devices 232 to 237 and components within such network devices, or any combination thereof. Note that in other embodiments, software agents on components may not be required in order for the appliance 250 to monitor and control the components.
FIG. 3 includes a hardware depiction of the appliance 250 and how it is connected to other parts of the distributed computing environment 200. A console 380 and a disk 390 are bi-directionally coupled to a control blade 310 within the appliance 250. The control blade 310 is an example of a central management component. The console 380 can allow an operator to communicate with the appliance 250. Disk 390 may include logic and data collected from or used by the control blade 310. The control blade 310 is bi-directionally coupled to a hub 320. The hub 320 is bi-directionally coupled to each management blade 330 within the appliance 250. Each management blade 330 is bi-directionally coupled to the network 212 and fabric blades 340. Two or more of the fabric blades 340 may be bi-directionally coupled to one another.
The management infrastructure can include the appliance 250, network 212, and software agents on the network devices 232 to 237 and their corresponding components. Note that some of the components within the management infrastructure (e.g., the management blades 330, network 212, and software agents on the components) may be part of both the application and management infrastructures. In one embodiment, the control blade 310 is part of the management infrastructure, but not part of the AI.
Although not illustrated, other connections and additional memory may be coupled to each of the components within the appliance 250. Further, nearly any number of management blades 330 may be present. For example, the appliance 250 may include one or four management blades 330. When two or more management blades 330 are present, they may be connected to different parts of the AI. Similarly, any number of fabric blades 340 may be present. In still another embodiment, the control blade 310 and hub 320 may be located outside the appliance 250, and in yet another embodiment, nearly any number of appliances 250 may be bi-directionally coupled to the hub 320 and under the control of the control blade 310.
FIG. 4 includes an illustration of one of the management blades 330. Each of the management blades 330 is an illustrative, non-limiting example of a management execution component and has logic to act on its own or can execute on directives received from a central management component (e.g., the control blade 310). In other embodiments, a management execution component does not need to be a blade, and the management execution component could reside on the same blade as the central management component. Some or all of the components within the management blade 330 may reside on one or more integrated circuits. Each of the management blades 330 can include a system controller 410, a central processing unit
("CPU") 420, a field programmable gate array ("FPGA") 430, a bridge 450, and a fabric interface ("I/F") 440, which, in one embodiment, includes a bridge. The system controller 410 is bi-directionally coupled to the hub 320. The system controller 410, the CPU 420, and the FPGA 430 are bi-directionally coupled to one another. The bridge 450 is bi-directionally coupled to a media access control ("MAC") 460, which is bi-directionally coupled to the network 212. The fabric I/F 440 is bi-directionally coupled to the system controller 410 and a fabric blade 340.
More than one of any or all components may be present within the management blade 330. For example, a plurality of bridges substantially identical to bridge 450 may be used and would be bi-directionally coupled to the system controller 410, and a plurality of MACs substantially identical to the MAC 460 may be used and would be bi-directionally coupled to the bridge 450. Again, other connections may be made and memories (not illustrated) may be coupled to any of the components within the management blade 330. For example, content addressable memory, static random access memory, cache, first-in-first-out ("FIFO"), or other memories or any combination thereof may be bi-directionally coupled to the FPGA 430.
The control blade 310, the management blades 330, or any combination thereof may include a central processing unit ("CPU") or controller. Therefore, the appliance 250 is an example of a data processing system. Although not illustrated, other connections and memories (not illustrated) may reside in or be coupled to any of the control blade 310, the management blade(s) 330, or any combination thereof. Such memories can include content addressable memory, static random access memory, cache, FIFO, other memories, or any combination thereof. The memories, including the disk 390 can include media that can be read by a controller, CPU, or both. Therefore, each of those types of memories includes a data processing system readable medium.
Portions of the methods described herein may be implemented in suitable software code that includes instructions for carrying out the methods. In one embodiment, the instructions may be lines of assembly code or compiled C++, Java, or other language code. Part or all of the code may be executed by one or more processors or controllers within the appliance 250 (e.g., on the control blade 310, one or more of the management blades 330, or any combination thereof) or on one or more software agent(s) (not illustrated) within the network devices 232 to 237, or any combination of the appliance 250 or software agents. In another embodiment, the code may be contained on a data storage network device, such as a hard disk (e.g., disk 390), magnetic tape, floppy diskette, CD ROM, optical storage network device, storage network (e.g., storage network 236), or other suitable data processing system readable medium or storage network device, or any combination thereof.
Other architectures may be used. For example, the functions of the appliance 250 may be performed at least in part by another apparatus substantially identical to appliance 250 or by a computer (e.g., console 380). Additionally, a computer program or its software components with such code may be embodied in more than one data processing system readable medium in more than one computer. Note that the appliance 250 is not required, and its functions can be incorporated into different parts of the distributed computing environment 200.
Attention is now directed to specific aspects of the distributed computing environment, how it is controlled by its management infrastructure, and how problems with conventional approaches to managing distributed computing environments can be overcome. Each of the network devices 232 to 237 is connected to the appliance 250 via the network 212.
Substantially all of the network traffic to and from each of the network devices 232 to 237 passes through the appliance 250, and more specifically, at least one of the management blades 330. By routing substantially all of the network traffic to and from the network devices 232 to 237, the appliance 250 can more closely manage and control the distributed computing environment 200 in real time or near real time. The distributed computing environment 200 dynamically changes in response to (1) applications running within the distributed computing environment 200, (2) changes regarding network devices or other components within the distributed computing environment 200 (e.g., provisioning or de-provisioning a server), (3) changes in priorities of applications, transaction types, or both to more closely match the business objectives of the organization operating the distributed computing environment 200, or (4) any combination thereof.
The network traffic on the network 212 includes content traffic and management traffic. Therefore, the network 212 is a shared network. Separate, physical, parallel networks for content traffic and management traffic are not needed. The shared network helps to keep capital and operating expenses lower.
In one embodiment, the network 212 can include one or more connections, a portion of the bandwidth within the network, or both, which may be reserved for management traffic and not be used for content traffic. Referring to FIG. 5, a network cable 540 may be attached to a connector 520 having connections 524. A portion 502 of the connections 524 may be reserved for management traffic, and a portion 504 of the connections 524 may be reserved for content traffic. In another embodiment, the network traffic may include a bandwidth 600, as illustrated in FIG. 6. The bandwidth 600 may include a portion 602 reserved for management traffic and a portion 604 reserved for content traffic. A port on the appliance 250 can receive the connector 520 or otherwise be represented by the bandwidth 600. Each network device 232 to 237 may be connected to its own port on the appliance 250. FIGs. 5 and 6 are meant to illustrate and not limit the scope of the present invention.
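As a rough illustration of how a shared link might be partitioned in the manner just described, the sketch below models a port whose capacity is split into a management portion and a content portion, so that management traffic retains capacity even when content traffic saturates its share. This is a minimal sketch; the class and names (PortBandwidth, management_share, admit) are assumptions for illustration and do not describe the appliance itself.

# Minimal sketch, assuming simple per-class accounting of a port's bandwidth.
# Names (PortBandwidth, management_share) are hypothetical, not from the specification.

class PortBandwidth:
    def __init__(self, total_bps: int, management_share: float = 0.1):
        # Reserve a fixed fraction of the port's bandwidth for management traffic.
        self.management_bps = int(total_bps * management_share)
        self.content_bps = total_bps - self.management_bps
        self.management_used = 0
        self.content_used = 0

    def admit(self, size_bits: int, is_management: bool) -> bool:
        """Admit a packet only if its traffic class still has reserved capacity."""
        if is_management:
            if self.management_used + size_bits <= self.management_bps:
                self.management_used += size_bits
                return True
            return False
        if self.content_used + size_bits <= self.content_bps:
            self.content_used += size_bits
            return True
        return False

port = PortBandwidth(total_bps=1_000_000_000)    # 1 Gb/s port, 10% reserved
print(port.admit(1500 * 8, is_management=True))  # management packet admitted

In such a scheme, a broadcast storm that exhausts the content portion would still leave the management portion available for management packets, which is the behavior described above.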
In this manner, the appliance 250 can address one or more network devices 232 to 237 or other AI components within any of the network devices 232 to 237 that may be causing a broadcast storm. The portion 502 of the connections 524 or the portion 602 of the bandwidth 600 allows the appliance 250 to communicate to a software agent on the AI component to address the broadcast storm issue. A conventional shared network does not reserve connection(s) or a portion of the bandwidth solely for management traffic. Therefore, a designated managing component in a conventional shared network (e.g., workstation 138 in FIG. 1) would not be able to send a management communication to the AI component because the broadcast storm could consume all connections or bandwidth and substantially prevent any packets, including management packets, from being received by the AI component causing the broadcast storm. After reading this specification, skilled artisans will appreciate that the distributed computing environment 200 has the advantages of a separate, physical, parallel network but without its disadvantages, and with the advantages of a shared network but without its disadvantages.
In another embodiment, each of the management blades 330 can extract management traffic from the network traffic or insert management traffic into the network traffic; send, receive, or transmit management traffic to or from any one or more of the appliance 250 and software agents residing on the AI components; analyze information within the network traffic; modify the behavior of managed components in the AI; or generate instructions or communications regarding the management and control of any portion of the AI; or any combination thereof. The various elements within the management blades 330 (e.g., system controller 410, CPU 420, FPGA 430, etc.) provide sufficient logic, resources, or any combination thereof to carry out the mission of a management execution component. Also, those elements can allow the management blades 330 to respond very quickly to provide real time or near real time changes to the distributed computing environment 200 as conditions within the distributed computing environment 200 change. In one specific embodiment, the management blade 330 may perform one or more functions of one or more of the network devices connected to it. For example, if one of the firewall/routers 237 is having a problem, the management blade 330 may be able to detect, isolate, and correct a problem within such firewall/router 237. During the isolation and correction, the management blade 330 can be configured to perform the routing function of the firewall/router 237, which is an example of a Layer 3 network device in accordance with the OSI Model. This non-limiting, illustrative embodiment helps to illustrate the power of the management blades 330. In another embodiment, the management blade 330 may serve any one or more functions of other Layer 2 or higher network devices.
Another advantage of the embodiments described herein is that communications to and from a network device 232 to 237 are not dependent on another network device. In a conventional distributed computing environment, such as the one illustrated in FIG. 1, the ability of the workstation 138 to communicate to any of the application servers 134 or database servers 135 depends on the state of the router 137. Therefore, the router 137 is an intermediate network device with respect to communications between the workstation 138 and the servers 134 and 135. Unlike the conventional distributed computing environment, as illustrated in FIG. 1, the distributed computing environment 200 as illustrated in FIG. 2 and described herein allows direct communication between the appliance 250 and each of the network devices 232 to 237 without having to depend on the state of another network device because there are no intervening network devices.
In one particular embodiment, each of the network devices 232 to 237 may be directly connected to the network 212 that can be connected to one or more management blades 330. In effect, each of the network devices 232 to 237 may be connected in parallel to different management blades 330 to account for possible failure in any one particular management blade 330. For example, the control blade 310 may detect that one of the web servers 233 is configured incorrectly. However, one of the management blades 330 may be malfunctioning. Control blade 310 may send a management communication through hub 320 and over a functional management blade 330 to the malfunctioning web server 233. Therefore, the malfunctioning management blade 330 is not used. By connecting network devices 232 to 237 to network ports on different management blades 330, failures in a specific management blade 330, a specific network link 212, or a specific network port on network devices 232 to 237 may be circumvented. Such redundancy may be desired for enterprises that require operations to be continuous around the clock (e.g., automated teller machines, store-front applications for web sites, etc.). Embodiments can allow for each network device within a distributed computing environment to be no more than "one hop" away from its nearest (local) management blade 330. By being only one hop away, the management infrastructure can manage and control network devices 232 to 237 and their corresponding components in real time or near real time. The distributed computing environment 200 can also be configured to significantly reduce the likelihood that a single malfunctioning AI component brings down the entire distributed computing environment 200.
IV. Methods of Managing and Controlling a Distributed Computing Environment
Attention is now directed to methods for managing and controlling the distributed computing environment 200. The methods may classify a network packet within a communication, and based upon the classification, properly affect the transmission of the network packet. The classification may be based at least in part on one or more factors, including the application, transaction type, or a combination of application and transaction type with which the communication is associated, whether that network packet is a management packet, the source or destination of the network packets, one or more other relevant factors, or any combination thereof. In a particular embodiment, the classification can be based at least on the type of application, the transaction type, or an application-transaction type combination and can affect a stream or a flow. The corresponding stream, corresponding flow, or corresponding stream-flow for a specific application, specific transaction type, or specific application-transaction type combination is hereinafter referred to as an "ATT-specific stream/flow." The classification can be used to control one or more settings for the transmission of the network packet. In a particular embodiment, a management execution component (e.g., a management blade 330) can include a classification module configured to perform the classification, a control settings module configured to set one or more controls used for the transmission of the packet, or both. In a more particular embodiment, the control settings module may be part of the examination module, or vice versa.
A software architecture for controlling at least a portion of a distributed computing environment 200 is described herein. The method can include receiving a communication at the appliance 250 (block 700 in FIG. 7). The communication may include one network packet or a collection of network packets that effectively make up one transmission from a source AI component to a destination AI component within the AI 210. The communication between AI components on different network devices 232 to 237 within the AI 210 travels through the appliance 250. After the communication arrives from an AI component in AI 210 at the management blade 330, this communication may be converted into other network packets (e.g., smaller network packets) by the MAC 460 of the management blade 330. In certain embodiments, these packets may conform to the Open System Interconnection (OSI) seven layer standard. In one particular embodiment, the communication is assembled by the MAC 460 into Transmission Control Protocol/Internet Protocol ("TCP/IP") packets, which are a specific type of network packet. The network packets can be part of a stream or a flow. As indicated in FIG. 7, the receiving action is optional since some or all of the succeeding actions may be performed before a communication is received by the appliance 250 (e.g., on an agent at one of the network devices 232 to 237 or their corresponding components).
The method can also include examining and classifying the network packet (block 702). The network packet may be classified by examining different OSI layers of the network packet. The network packet can be classified as being a particular stream or a particular flow. One or more parameters can be used in order to classify the network packet as belonging to a particular type of stream or flow. The one or more parameters may include a virtual local area network identification, a source address, a destination address, a source port, a destination port, a protocol, a connection request, transaction type load tag, or any combination thereof. The source and destination addresses may be IP addresses or other network addresses (e.g., Ix250srv). The one or more parameters may exist within the header of each network packet. Moreover, one or more network packets within a communication may include a connection request that may be a simple "yes/no" parameter (i.e., whether or not the network packet represents a connection request). Also, the transaction type load tag may be used to define the transaction type related to a particular type of stream or flow. The transaction type load tag may be used to provide for more fine-grained control over ATT-specific stream/flows.
In another embodiment, the classification may be based on whether the communication is in a TCP or UDP IP protocol. In still another embodiment, a special IP address may be assigned to the control blade 310 to perform management functions, and therefore, all network packets that include the special IP address (as the source or destination address) may be classified as management packets. Thus, the examination can focus on the protocol or the special IP address in these particular embodiments.
In one particular implementation, the classification of the network packet can be performed by a classification module within the management blade 330, and in one embodiment, the FPGA 430. In other embodiments, the classification module may reside in another portion of the appliance 250 or can be part of a management software agent on an AI component. The classification may be aided by a tuple, which may be a combination of information from various layers of the network packet(s). In one particular embodiment, a tuple associates a network packet with a particular type of stream or flow within a classification map. The elements of this tuple (as may be stored by the FPGA 430 on the management blade 330) can include various fields that may be selected from the possible fields in Table 1.
[Table 1 — candidate tuple fields (presented as an image in the original document)]
For example, a tuple including a particular IP source port, a particular IP destination port, and a particular protocol may be defined and associated with a particular type of stream or flow. If information that is extracted from various layers of an incoming network packet matches the information in the tuple in the classification map, the network packet may in turn be associated with that particular type of stream or flow. In one particular embodiment, the classification module within the management blade 330 can include logic to read one or more specific fields from the first 128 bytes of each network packet and record that information in memory. After reading this specification, skilled artisans will recognize that more detailed information may be added to the tuple to further qualify network packets as belonging to a particular type of stream or flow. Packet processing on the management blade 330 may also include collecting dynamic traffic information corresponding to specific tuples. Traffic counts (number of bytes, number of network packets, or both) for each type of tuple may be kept and provided as gauges to analysis logic on the management blade 330.
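A minimal sketch of this tuple-matching idea follows, assuming the tuple carries the source and destination addresses and ports and the protocol, with None acting as a wildcard. The field names, the example map entries, and the classify function are assumptions for illustration; an actual implementation would run in hardware (e.g., the FPGA) against the first bytes of each packet.

# Hedged sketch: classify a packet by matching header fields against tuples
# in a classification map. Field and table names are assumptions for illustration.
from typing import NamedTuple, Optional

class Tuple5(NamedTuple):
    src_addr: Optional[str]
    dst_addr: Optional[str]
    src_port: Optional[int]
    dst_port: Optional[int]
    protocol: Optional[str]

# Each entry maps a (possibly wildcarded) tuple to a stream/flow identifier.
CLASSIFICATION_MAP = {
    Tuple5(None, None, None, 80, "TCP"): "web-flow",
    Tuple5(None, "10.0.0.5", None, 1521, "TCP"): "db-flow",
    Tuple5(None, "10.0.0.250", None, None, "TCP"): "management-flow",
}

def classify(packet: dict) -> str:
    """Return the stream/flow id whose tuple matches the packet, else 'other-flow'."""
    fields = ("src_addr", "dst_addr", "src_port", "dst_port", "protocol")
    for entry, flow_id in CLASSIFICATION_MAP.items():
        if all(v is None or packet.get(f) == v for f, v in zip(fields, entry)):
            return flow_id
    return "other-flow"

print(classify({"src_addr": "192.0.2.7", "dst_addr": "10.0.0.9",
                "src_port": 33211, "dst_port": 80, "protocol": "TCP"}))  # web-flow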
After the network packet within the communication is classified, the method can include assigning the network packet to an ATT-specific stream/flow (block 703) using a stream/flow mapping table. The stream/flow mapping table may contain a variety of entries that match a particular stream or flow with an ATT-specific stream/flow. In one particular embodiment, the stream/flow mapping table can contain 128 entries. Each entry maps the tuple associated with a network packet to one of 16 different ATT-specific stream/flows for distinct control when the stream/flow mapping table includes settings for controls (e.g., priority, latency, connection throttle, packet throttle, etc.) within the distributed computing environment 200. A control settings module within the management blade 330 can be used for setting the controls. In another embodiment, the stream/flow mapping table may have more or fewer entries, streams, flows, or any combination thereof.
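The sketch below illustrates, in hedged form, how a stream/flow mapping table entry might carry the control settings named above (priority, latency, connection throttle, packet throttle) and how a classified packet could inherit them. The entry names and example values are invented for the sketch; only the general shape (classified flow id mapped to an ATT-specific stream/flow plus control settings) follows the description above.

# Sketch of a stream/flow mapping table whose entries carry control settings.
# The dataclass, table contents, and example values are assumptions, not the claimed format.
from dataclasses import dataclass

@dataclass
class ControlSettings:
    priority: int             # e.g., 0 (lowest) .. 7 (highest)
    latency: int              # e.g., 0 = normal-latency handling, 1 = high-latency handling
    connection_throttle: int  # e.g., throttle N of every 10 connection requests
    packet_throttle: int      # e.g., throttle N of every 10 packets

# Maps a classified stream/flow id to an ATT-specific stream/flow and its controls.
STREAM_FLOW_MAPPING_TABLE = {
    "web-flow":        ("att-web",        ControlSettings(3, 0, 0, 0)),
    "db-flow":         ("att-db",         ControlSettings(5, 0, 0, 0)),
    "management-flow": ("att-management", ControlSettings(7, 0, 0, 0)),
    "other-flow":      ("att-other",      ControlSettings(1, 1, 2, 2)),
}

def assign(flow_id: str):
    """Look up the ATT-specific stream/flow and its control settings."""
    return STREAM_FLOW_MAPPING_TABLE.get(flow_id, STREAM_FLOW_MAPPING_TABLE["other-flow"])

att_flow, controls = assign("db-flow")
print(att_flow, controls.priority)  # att-db 5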
Use of the ATT-specific stream/flows may increase the ability to allocate different amounts of AI capacity to different applications, different transaction types, or a combination thereof by allowing the management infrastructure to distinguish between network packets (of different communications) that belong to different applications, different transaction types, or any combination thereof.
In one embodiment, five basic ATT-specific stream/flows under which an incoming packet may be grouped can include: (1) web traffic (e.g., an AI stream or AI flow between the Internet and a web server); (2) application server traffic (e.g., the AI flow between a web server and an application server); (3) DB traffic (e.g., an AI stream or AI flow between an application server and a database); (4) management traffic (e.g., an AI stream or AI flow between AI components in AI 210 and the control blade 310); and (5) other traffic (e.g., all other AI streams or AI flows that cannot be grouped under the previous four categories). In another embodiment, more or fewer ATT-specific stream/flows may be present. In still another embodiment, transaction type-specific network flows may be used in place of or in conjunction with the ATT-specific stream/flow. While many details have been given with respect to examining the network packet, classifying the network packet with respect to a particular type of stream or flow, assigning the network packet to the ATT-specific stream/flow, and setting controls for the network packet, many alternative embodiments are possible. For example, the classification mapping table and the stream/flow mapping table can be consolidated into one table. Alternatively, the combination of the classification mapping table and stream/flow mapping table can be broken into more tables (e.g., classification mapping table, stream/flow mapping table without control settings, and a control settings table). In still another embodiment, classification, assignment, setting controls, or any combination thereof may not be required. For example, examination may indicate that the network packet is a management packet. Classification and assignment as described above may not be used because default control settings for management packets may be used. After reading this specification, skilled artisans will appreciate that other embodiments can be used.
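As a hedged illustration of the five basic groupings listed at the start of the preceding paragraph, the sketch below assigns a packet to one of the categories based on assumed roles of its source and destination components; the role table and component names are invented for the example.

# Illustrative grouping of traffic into the five basic ATT-specific stream/flows.
# The ROLE table and category strings are assumptions for the sketch.
ROLE = {"internet": "internet", "web1": "web", "app1": "app", "db1": "db", "ctrl": "control"}

def basic_group(src: str, dst: str) -> str:
    roles = {ROLE.get(src, "?"), ROLE.get(dst, "?")}
    if roles == {"internet", "web"}:
        return "web traffic"
    if roles == {"web", "app"}:
        return "application server traffic"
    if roles == {"app", "db"}:
        return "DB traffic"
    if "control" in roles:
        return "management traffic"
    return "other traffic"

print(basic_group("web1", "app1"))  # application server traffic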
In one particular embodiment, one or more actions may be assigned to a network packet based on the ATT-specific stream/flow with which the network packet is associated. Actions can include one or more instructions based on the importance of the ATT-specific stream/flow. An example of an action can include drop, meter, or inject. A drop action may include dropping a network packet when the ATT-specific stream/flow associated with the network packet is of low importance relative to one or more other ATT-specific stream/flows. A meter action may indicate that the network bandwidth, connection request rate, or both for an ATT-specific stream/flow is under analysis and the network packet is to be tracked or otherwise observed. An inject action may indicate that the network packet is to be given a certain priority or placed in a certain port group. After reading this specification, skilled artisans will appreciate that each of the actions described herein can be modified, and that other actions can be used in place of or in conjunction with such actions.
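A minimal sketch of how the drop, meter, and inject actions described above might be dispatched per ATT-specific stream/flow follows. The action assignments, priority values, and function names are illustrative assumptions rather than prescribed behavior.

# Hedged sketch: dispatch a per-stream/flow action for a packet.
# Action assignments below are examples, not values from the specification.
ACTIONS = {
    "att-other": "drop",         # low relative importance
    "att-web": "meter",          # bandwidth / connection rate under analysis
    "att-management": "inject",  # give the packet a certain priority / port group
}

def apply_action(att_flow: str, packet: dict, stats: dict) -> bool:
    """Return True if the packet continues on; False if it is dropped."""
    action = ACTIONS.get(att_flow, "inject")
    if action == "drop":
        return False
    if action == "meter":
        stats[att_flow] = stats.get(att_flow, 0) + len(packet.get("payload", b""))
        return True
    # "inject": tag the packet with a priority before forwarding (illustrative).
    packet["priority"] = 7 if att_flow == "att-management" else 3
    return True

stats = {}
print(apply_action("att-web", {"payload": b"x" * 512}, stats), stats)  # True {'att-web': 512}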
In one embodiment, the method includes determining whether the network packets are management packets (diamond 704). After the network packets are associated with an ATT-specific stream/flow, the network packets may be routed depending on whether the network packets are considered management traffic or other traffic (e.g., content traffic). If the network packet is considered to be part of management traffic, it may be redirected for special processing. Network packets that are part of the management traffic are referred to as management packets. If the network packets are management packets, they may be processed as illustrated in FIG. 8. In one embodiment, the method can include setting the highest priority for the communication (block 820), setting a value that results in the lowest latency for the communication (block 822), setting a value that results in no connection throttling for the communication (block 824), and setting a value that results in no packet throttling for the communication (block 826). The control settings module can be used to set the controls. Regarding the connection throttling, the portion 502 of the connections 524 (FIG. 5) or the portion 602 of the bandwidth 600 (FIG. 6) may be reserved for transmission of the management packets to allow management packets to control an AI component that is causing a broadcast storm.
In an exemplary, non-limiting embodiment, the settings for priority can be simply based on a range of corresponding numbers, for example, from zero to seven (0 to 7), where zero (0) is the lowest priority and seven (7) is the highest priority. Further, the range for latency may be zero or one (0 or 1), where zero (0) means drop packets with normal latency and one (1) means drop packets with high latency. Also, the range for the connection throttle may be from zero to ten (0 to 10), where zero (0) means throttle zero (0) out of ten (10) connection requests (i.e., zero throttling) and ten (10) means throttle ten (10) out of every ten (10) connection requests (i.e., complete throttling). The range for packet throttling may be substantially the same as the range for the connection throttling. The above ranges are exemplary, and there may exist numerous other ranges of settings for priority, latency, connection throttling, and packet throttling. Moreover, the settings may be represented by nearly any group of alphanumeric characters. The method can include transmitting the management packet (block 828). Accordingly, any management packets transmitted to a managed AI component from the management blade 330 of the appliance 250 can be afforded special treatment relative to content traffic by the distributed computing environment 200, and are transmitted expeditiously within the distributed computing environment 200. Moreover, any management packets that are received by the appliance 250 from a managed AI component are also afforded special treatment by the distributed computing environment 200 and are also expeditiously delivered through the distributed computing environment 200.
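Under the exemplary ranges just given, a management packet would receive the highest priority (7), the lowest-latency handling, and zero connection and packet throttling. The sketch below shows one hedged interpretation of those settings; the function names and the "N of every 10" throttle reading are assumptions for illustration.

# Hedged sketch of the control settings applied to management packets,
# using the exemplary ranges above (priority 0-7, latency 0/1, throttles 0-10 in tenths).
def management_controls() -> dict:
    return {
        "priority": 7,             # highest priority
        "latency": 0,              # value giving the lowest-latency handling
        "connection_throttle": 0,  # throttle 0 of every 10 connection requests
        "packet_throttle": 0,      # throttle 0 of every 10 packets
    }

def throttle_allows(counter: int, throttle: int) -> bool:
    """With throttle N, block N of every 10 items (illustrative interpretation)."""
    return (counter % 10) >= throttle

controls = management_controls()
print(all(throttle_allows(i, controls["packet_throttle"]) for i in range(10)))  # True: nothing blocked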
In another embodiment, the method as illustrated in FIG. 8 may or may not be performed for the management packets within the appliance 250. For example, if the management packets are routed to or from the control blade 310, the management packets may be routed without having control settings set because no content traffic may be transmitted to or from the control blade 310 in this particular embodiment.
The routing of management packets can be performed in one or more different manners. Below is a specific description of how routing of management packets can be performed in accordance with a particular, non-limiting embodiment as seen from the perspective of a management execution component (e.g., a management blade 330 within the appliance 250), as depicted in FIG. 9. The method can include determining whether the management packet is to be routed to an AI component (diamond 950). If the incoming management packet was received from a central management component (e.g., control blade 310 within the appliance 250), the management packet may be destined for an AI component. For this scenario ("yes" branch of diamond 950), the method can include routing the management packets to a management software agent on an AI component (block 970). If the management packet was received from an AI component ("no" branch of diamond 950), the method can include routing the management packet to a central management component (e.g., the control blade 310) (block 960).
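The routing decision in FIG. 9, as seen from a management execution component, can be summarized in the hedged sketch below: packets arriving from the central management component go out to the software agent on the destination AI component, and packets arriving from an AI component go up to the central management component. The parameter and return names are assumptions for the sketch.

# Hedged sketch of the FIG. 9 routing decision on a management execution component.
# 'came_from_central' and the returned route strings are illustrative names.
def route_management_packet(packet: dict, came_from_central: bool) -> str:
    if came_from_central:
        # Destined for an AI component: forward to the agent on that component.
        return f"agent@{packet['dst_addr']}"
    # Originated at an AI component: forward to the central management component.
    return "central-management-component"

print(route_management_packet({"dst_addr": "10.0.0.9"}, came_from_central=True))   # agent@10.0.0.9
print(route_management_packet({"src_addr": "10.0.0.9"}, came_from_central=False))  # central-management-component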
In one particular embodiment, when the FPGA 430 determines the network packet is associated with management traffic, the network packet can be redirected by a switch for special or other processing by the CPU 420 on the management blade 330. If a determination is made that the management packet originated from an AI component (e.g., a network device 232 to 237) in the AI 210 (diamond 950), the CPU 420 may then forward this packet out through a management port on the management blade 330 to a management port on the control blade 310 (block 960).
Similarly, when a management packet arrives at an internal management port on the management blade 330 from the control blade 310 (block 950), the management packets can be routed to the CPU 420 on the management blade 330, and then redirected by the CPU 420 through a switch to an appropriate egress port to the network 212, which routes the management packets to an agent on an AI component coupled to that egress port.
In one specific embodiment, the management blade 330 may be coupled to the control blade 310 (via the hub 320) and another AI, which is separate from the AI 210. The management infrastructure allows management packets to be communicated between the management blade 330 and the control blade 310 without placing additional stress on the AI 210. Additionally, even if a problem exists in the AI 210, this problem does not affect communication between the control blade 310 and the management blade 330.
Since substantially all network traffic intended for AI components and their associated software agents in AI 210 can pass through at least one of the management blades 330, such management blade(s) 330 is (are) able to more effectively manage and control the AI components by regulating the transmission of network packets. More particularly, with regard to management traffic, when the management blade 330 determines that a management packet is destined for a management software agent local to an AI component in the AI 210, the management blade(s) 330 may hold the transmission of all network packets that are not management packets (i.e., content packets) to that particular AI component until the transmission of the management packets to the particular AI component has been completed. In this manner, management packets may be transmitted to any one or more AI components regardless of the volume and type of other traffic (i.e., content traffic) in the AI 210.
The management packets can alleviate one or more problems in the AI 210 by allowing AI components to be controlled and manipulated regardless of the type and volume of content traffic in the AI 210. For example, as mentioned above, broadcast storms may prevent delivery of communications to an AI component when a conventional shared network is used. Unlike the conventional shared network, the management packets may alleviate these broadcast storms in the AI 210, as transmission of content packets originating with a particular AI component may be postponed until one or more management packets, which address the problem on the particular AI component, are delivered to a management agent local to the AI component causing the broadcast storm.
If the network packet is not a management packet ("no" branch of diamond 704), then the network packet is a content packet. The method can include determining whether the content packet is to be delivered to an AI component from the management blade 330 within the appliance 250 (diamond 706). If yes, the content packet is processed as depicted in FIG. 10. The method can include determining the setting for the priority of the content packet (block 1040), determining the setting for the latency of the content packet (block 1042), determining the setting for the connection throttle of the content packet (block 1044), determining the setting for the packet throttle of the content packet (block 1046), and transmitting the content packet (block 1048), which in this embodiment are content packets that are part of the content traffic. The control settings module can set controls for the content packet after the corresponding settings have been determined. In one particular embodiment, after the network packet has been classified as a content packet belonging to a particular type of stream or flow, the control settings may be determined based in part on the identification table. Alternatively, the content packet can be further assigned to an ATT-specific stream/flow using the stream/flow mapping table. The stream/flow mapping table can be used to determine the values for the control settings. The content packet can then be transmitted according to the above-determined settings. In another alternative embodiment, content traffic can be processed as illustrated in the flow diagram in FIG. 11. The method can start with determining whether the destination of the content packet is local to a management execution component (e.g., a management blade 330) where the content packet currently resides (diamond 1150). This assessment may be made by an analysis of various layers of the content packet. The management blade 330 may determine the IP address of the destination of the content packet or the IP port destination of the content packet, by examining various layers of the content packet. In one embodiment, this examination is done by logic associated with a switch within the management blade 330, or more specifically, by the FPGA 430.
The management blade 330 may be aware of the IP addresses and ports that may be accessed through that management blade's egress ports coupled to the network 212. If a network packet has an IP destination address or an IP port destination that may be accessed through a port coupled to that management blade 330 ("yes" branch from block 1150), the destination of the network packet is local to the management blade 330. Alternatively, if the network packet contains an IP destination address or an IP port destination that cannot be accessed through any of that management blade's egress ports coupled to the network 212, the destination of the network packet is remote to that management blade 330.
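The local/remote test just described amounts to checking whether the destination address or port is reachable through one of the blade's own egress ports. The sketch below illustrates that check; the egress map and its addresses are assumptions for the example.

# Hedged sketch: decide whether a content packet's destination is local or remote.
# LOCAL_EGRESS maps each egress port of this management blade to the addresses
# reachable through it (illustrative data only).
LOCAL_EGRESS = {
    "eth0": {"10.0.0.9", "10.0.0.10"},
    "eth1": {"10.0.0.20"},
}

def local_egress_port(dst_addr: str):
    """Return the egress port if the destination is local, else None (remote)."""
    for port, addrs in LOCAL_EGRESS.items():
        if dst_addr in addrs:
            return port
    return None

print(local_egress_port("10.0.0.20"))  # eth1  -> deliver locally
print(local_egress_port("10.0.1.99"))  # None  -> forward via the fabric to the remote blade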
For example, a communication can pass from one AI component over the network 212 to the top management blade 330 (as illustrated in FIG. 3), through one or more fabric blades to the bottom management blade 330 (as illustrated in FIG. 3), and over the network 212 to another AI component. In this example, the top management blade 330 is a remote management execution component with respect to the destination port or address, and the bottom management blade 330 is a local management execution component with respect to the destination port or address. In a particular embodiment, a switch in the management blade 330 can make the determination. In one particular embodiment, the method includes determining whether the management blade 330 on which the content packet currently resides is the local management execution component (diamond 1150). The determination can be made by the management blade 330 on which the content packet currently resides. The fabric I/F 440 on that management blade may determine which management execution component (e.g., a management blade 330) is local to the AI component that is the destination of the content packet. The method can also include assigning one or more control settings (e.g., priority, latency, etc.) to the content packet (block 1160). The content packet can be routed through one or more fabric blades 340 to the local management execution component (e.g., the local management blade 330 for the particular packet) (block 1170).
If the content packet is destined for a port on another management blade 330, the content packet may be forwarded to one or more fabric blades 340 for delivery to another management blade 330, which is local to the port for which the network packet is destined (block 1070). In one embodiment, if the network packet is destined for a remote management blade 330 ("no" branch from block 1150), the method can include assigning a latency and a priority to the network packet (block 1160) based at least in part upon the ATT-specific stream/flow with which it is associated. The method can also include forwarding the network packet to the other management execution component (block 1170), which is the local management blade with respect to the AI component that is the destination of the network packet. In a particular embodiment, the content packet may then be converted into a fabric packet suitable for transmission through the one or more fabric blades 340, and the fabric packet can be reconverted back to the content packet for use by the destination AI component. The conversion from the content packet to the fabric packet, or vice versa, may be performed on a management blade 330 (e.g., the fabric I/F 440), a fabric blade 340, or both.
In a particular embodiment, the fabric blade 340 may use virtual lanes, virtual lane arbitration tables, service levels, or any combination thereof to transmit fabric packets between the fabric blades 340 based upon one or more control settings, including latency, priority, etc. Virtual lanes may be multiple independent data flows sharing the same physical link but utilizing separate buffering and flow control for each latency or priority. Embedded in each fabric I/F 440 hardware port may be an arbiter that controls usage of these links based on the control settings assigned to different packets. The fabric blade 340 may utilize weighted fair queuing to dynamically allocate each fabric packet a proportion of link connections or bandwidth between the fabric blades 340. These virtual lanes and weighted fair queuing can combine to improve fabric utilization, reduce the likelihood of deadlock, and provide differentiated service between packet types when transmitting packets between different management blades 330.
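As a rough illustration of the weighted fair queuing idea, the sketch below approximates it with weighted round-robin service of per-lane queues; the lane names and weights are illustrative assumptions and are not taken from the specification.

```python
# Minimal sketch: weighted round-robin service of per-virtual-lane queues, a simple
# approximation of weighted fair queuing across lanes sharing one physical link.
from collections import deque

class WeightedFairLink:
    def __init__(self, lane_weights: dict):
        # One FIFO per virtual lane; the weight is that lane's share of transmit opportunities.
        self.queues = {lane: deque() for lane in lane_weights}
        self.weights = dict(lane_weights)

    def enqueue(self, lane: str, packet) -> None:
        self.queues[lane].append(packet)

    def transmit_round(self) -> list:
        """Send up to `weight` packets from each lane per round."""
        sent = []
        for lane, weight in self.weights.items():
            for _ in range(weight):
                if self.queues[lane]:
                    sent.append(self.queues[lane].popleft())
        return sent

link = WeightedFairLink({"high-priority": 3, "best-effort": 1})
for i in range(4):
    link.enqueue("high-priority", f"hp-{i}")
    link.enqueue("best-effort", f"be-{i}")
print(link.transmit_round())  # roughly three high-priority packets for every best-effort packet
```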
Once the content packet is at the other management blade 330 (i.e., the local management execution component with respect to the destination address or port), the method can include calculating a weighted random early discard (WRED) value for the network packet (block 1180). In a first embodiment, the weighting can be based at least in part on the specific stream or the specific flow associated with the content packet. In a second embodiment, the weighting can be based at least in part on the ATT-specific stream/flow. The WRED value can be used to help the management blade 330 deal with contention for one or more ports and the corresponding transit queues that may form at those ports. Random early discard can be used as a form of load shedding, commonly known in the art, the goal of which is to preserve a minimum average queue length for the queues at ports on the management blade 330. The end effect of this type of approach is to maintain some bounded latency for a network packet arriving at the management blade 330 and intended for a port on the management blade 330.
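The following sketch illustrates one common way a weighted random early discard decision can be made, combining the average queue depth at a contended port with a per-stream/flow weight; the thresholds and weights are illustrative assumptions, not values from the specification.

```python
# Minimal sketch: WRED drop decision driven by average queue depth and a per-flow weight.
import random

def wred_drop(avg_queue_len: float, min_th: float, max_th: float,
              flow_weight: float) -> bool:
    """Return True if the packet should be discarded.

    Below min_th nothing is dropped; above max_th everything is dropped; in between,
    the drop probability rises linearly and is scaled by the flow's weight, so
    lower-importance flows are discarded more readily.
    """
    if avg_queue_len <= min_th:
        return False
    if avg_queue_len >= max_th:
        return True
    base_prob = (avg_queue_len - min_th) / (max_th - min_th)
    return random.random() < base_prob * flow_weight

# A low-importance flow (weight 1.0) is more likely to be dropped than a
# high-importance flow (weight 0.2) at the same queue depth.
print(wred_drop(avg_queue_len=40, min_th=20, max_th=60, flow_weight=1.0))
print(wred_drop(avg_queue_len=40, min_th=20, max_th=60, flow_weight=0.2))
```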
In one particular embodiment, the management blade 330 may calculate a WRED value to influence which one or more content packets are discarded based on the application, transaction type, component, or any combination thereof with which the content packet is associated. Therefore, the management blade 330 may calculate this WRED value based upon a combination of the contention level for the port for which the content packet is destined and a control value associated with the ATT-specific stream/flow with which the content packet is associated. In one embodiment, this control mechanism may be a stream rate control, a flow rate control, or a combination stream rate-flow rate control, and a value for such control. Each ATT-specific stream/flow may have a distinct rate value. While the rate value may be a single number, the stream rate or the flow rate control may actually control two distinct aspects of the managed application environment.
A first aspect can include control of the connections (FIG. 5) or bandwidth (FIG. 6) available for specific links, including links associated with ports from the management blade 330 and links associated with the outbound fabric I/F 440 between the management blades 330 or other portions of the distributed computing environment 200. This methodology, in effect, presumes the connections or bandwidth of a specific link is a scarce resource. Thus, when contention occurs for a port, a queue of the network packets waiting to be sent through the port and down the link would normally form. The rate control effectively allows determination of which content packets from which ATT-specific stream/flow get a greater or lesser percentage of the available connections or bandwidth of that port and corresponding network link. Higher priority streams or higher priority flows get a greater percentage, and lower priority streams or lower priority flows get a lesser percentage. Network links, especially those connected to managed AI components, are often not congested when the application load is transaction-based (such as an e-commerce application) rather than stream-based (such as for streaming video or voice-over-IP applications). Therefore, the specific benefit of this control will vary with application type, transaction type, load, or any combination thereof. A second aspect of this control mechanism can use the access to the egress port or network link as a surrogate for the remainder of the managed and controlled AI 110 that resides on the downstream side of the port. By controlling which content packet gets prioritized at the egress to the port, the rate control also affects the mix of network packets seen by a particular AI component connected to the egress port.
In one specific embodiment, the rate control value may correspond to a number of bytes which will be transmitted out through an egress port and down a network link each second. The control value may range from 0 to 19, where each increment increases the specific number of bytes per second transmitted on a logarithmic scale, allowing an improved degree of control over the number of bytes actually transmitted. In this particular embodiment, the correspondence may be as indicated in Table 2.
TABLE 2
[Table 2, showing the correspondence between rate control values 0 to 19 and the number of bytes per second transmitted, appears as an image in the original document and is not reproduced here.]
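Because Table 2 is reproduced only as an image, the specific byte-per-second values are not available here; the sketch below merely illustrates how a control value in the range 0 to 19 could index a logarithmic (geometric) rate scale. The base rate and step factor are illustrative assumptions.

```python
# Illustrative sketch only: map a 0-19 control value onto a logarithmic rate scale.
# The actual values in Table 2 are not reproduced in this text.
def rate_for_control_value(value: int,
                           base_rate: float = 1_000.0,
                           step: float = 1.5) -> float:
    """Map a control value (0..19) to a transmit rate in bytes per second on a
    geometric (logarithmic) scale."""
    if not 0 <= value <= 19:
        raise ValueError("control value must be between 0 and 19")
    return base_rate * (step ** value)

for v in (0, 10, 19):
    print(v, round(rate_for_control_value(v)))
```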
If the content packet is not dropped, the method can continue with delivering the communication (block 1190).
Returning to the flow diagram in FIG. 7, the method can include determining if the content packet is being received from an AI component (diamond 708) or whether the content packet is being delivered via a virtual local area network (VLAN) uplink (diamond 710). The "yes" branch from diamond 708 or 710 continues with substantially the same processing, as illustrated in FIG. 12. The method can include determining the setting for the priority of the content packet (block 1260) and determining the setting for the latency of the content packet (block 1262). The control settings module can set the control based on the determination. The method can also include transmitting the content packet in accordance with those settings (block 1264). If the communication is not being received from an AI component and is not delivered via a VLAN uplink ("no" branches of diamonds 708 and 710), the method can include determining whether the content packet is being delivered via a VLAN downlink (diamond 712). If so, the content packet is processed as illustrated in FIG. 13. The method can include determining a setting for the connection throttle for the content packet (block 1370) and determining a setting for the packet throttle for the content packet (block 1372). The control settings module can set the control based on the determination. The method can also include transmitting the content packet in accordance with those settings (block 1374).
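A minimal sketch of this dispatch, assuming each content packet is tagged with how it arrived (from an AI component, via a VLAN uplink, or via a VLAN downlink); the tag names are illustrative assumptions.

```python
# Minimal sketch: which control settings are applied depends on how the packet arrived.
def controls_to_apply(arrival: str) -> tuple:
    """Return the names of the control settings applied before transmission."""
    if arrival in ("ai_component", "vlan_uplink"):
        return ("priority", "latency")
    if arrival == "vlan_downlink":
        return ("connection_throttle", "packet_throttle")
    return ()  # no further processing; the method ends

print(controls_to_apply("vlan_uplink"))    # ('priority', 'latency')
print(controls_to_apply("vlan_downlink"))  # ('connection_throttle', 'packet_throttle')
```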
If the content packet is not being delivered via a VLAN downlink ("no" branch of diamond 712 in FIG. 7), the method ends.
Although much of the discussion above has focused on one management packet or one content packet, the concepts described here can be applied to a communication of any size, from a single network packet to an entire stream or an entire flow, which can include thousands or even more network packets. Also, the description of the processing of content packets can be used for management packets. Management packets are typically assigned the highest level of importance for control settings as compared to content packets. The distributed computing environment 200 can allow management traffic and content traffic to share the same network; however, a portion of the connections or a portion of the bandwidth of the network can be reserved for management traffic. In this manner, the appliance 250 can exert real time or near real time control over the distributed computing environment 200. The likelihood of a broadcast storm affecting more than one of the network devices 232 to 237 can be reduced or substantially eliminated. In addition, control settings can be set based at least in part on the application or transaction type with which the network packet is associated. In this manner, the level of importance between applications, transaction types within applications, or any combination thereof can be adjusted to better meet the business objectives of the entity using the distributed computing environment 200. For example, a store-front application can be given preference over an inventory management application. Alternatively, the level of importance can be based on transaction types. For example, an order placement transaction of a store-front application can be more important than a vendor delivery schedule for an inventory management application, which in turn may be given more importance than an email help message in the store-front application that is sent to a customer service representative of the entity running the store-front application.
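As a simple illustration of reserving part of a shared link for management traffic, the sketch below admits content packets only while they leave a reserved management share untouched; the ten percent share and the rate figures are illustrative assumptions, not values from the specification.

```python
# Minimal sketch: management and content traffic share one link, but a fixed share of
# the link is reserved for management traffic.
def admit_packet(kind: str, used_content_bps: float, link_bps: float,
                 management_share: float = 0.10) -> bool:
    """Admit management packets unconditionally; admit content packets only while
    they leave the reserved management share untouched."""
    if kind == "management":
        return True
    content_budget = link_bps * (1.0 - management_share)
    return used_content_bps < content_budget

print(admit_packet("content", used_content_bps=850_000_000, link_bps=1_000_000_000))     # True
print(admit_packet("content", used_content_bps=950_000_000, link_bps=1_000_000_000))     # False
print(admit_packet("management", used_content_bps=950_000_000, link_bps=1_000_000_000))  # True
```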
The business objectives can be static or can change. The appliance 250 can adapt the distributed computing environment 200 to meet business objectives that can change hourly, daily, monthly, or at any other time. The business objectives can include increasing revenue, increasing profit, reducing inventory, making a deposit into an account before a deadline, etc.
In the above-described methods, the controls that are provided (e.g., priority, latency, connection throttle, packet throttle, etc.) can be used to control the AI components (e.g., network devices 232 to 237) coupled to each other by one or more pipes. In an exemplary, non-limiting embodiment, a pipe may be a link between a managed AI component and a management blade 330. Further, a pipe may be a VLAN uplink or VLAN downlink. A pipe may be a link between the control blade 310 and a management blade 330. Moreover, a pipe may be a link between two management blades 330 or an appliance backplane.
It can be appreciated that, in the above-described method, some or all of the actions may be undertaken at different locations within the distributed computing environment 200 in order to provide controls on the pipes. For example, when a stream or a flow is to be transmitted to a managed AI component from a management blade 330, one or more control settings, including latency, priority, connection throttling, packet throttling, other suitable control, or any combination thereof can be implemented on the management blade 330 (e.g., through the FPGA 430 or in software operating on a switching control processor (not illustrated) within the management blade 330). Alternatively, when a stream or a flow is transmitted to a management blade 330 from a managed AI component, one or more of the same or different control settings (e.g., latency and priority) can be implemented by a software agent on the managed AI component. In an exemplary, non-limiting embodiment, a communication mechanism can exist between the control blade 310 and a software agent at the managed AI component in order to transmit to the software agent the values that are to be used for the control settings. Further, a mechanism can exist at the software agent in order to implement those settings. Depending upon which direction a stream or a flow is traveling (e.g., to or from a managed AI component), connection throttling, packet throttling, or both can be used at the management blade 330 or at the managed AI component. Since it may be difficult to retrieve a flow or stream once it has been sent into a pipe, in one embodiment, connection throttling can be implemented at the component from which a stream or a flow originates. Further, in an exemplary, non-limiting embodiment, when a flow or stream is being delivered via a
VLAN uplink, the latency and priority controls can be implemented on the management blade 330. Also, in an exemplary, non-limiting embodiment, when a flow or stream is being delivered via a VLAN downlink, the connection throttle, the packet throttle, or both can also be implemented on the management blade 330.
During configuration of the distributed computing environment 200, streams or flows can be defined and created for each application, transaction type, or both in the distributed computing environment 200. For each managed AI component, the pipes are also defined and created. Moreover, for each uplink or downlink in each VLAN, the necessary pipes are created.
During operation, the provisioning and de-provisioning of certain AI components (e.g., servers) can have an impact on the distributed computing environment 200. For example, when a server is provisioned, the provisioned server can result in the creation of one or more flows. Therefore, a mechanism can be provided to scan the classification mapping table and to create new entries. In addition, the provisioned server can result in the creation of a new pipe. When a server is de-provisioned, the de-provisioned server can cause one or more flows to be no longer used. Therefore, a mechanism can be provided to scan the classification mapping table and delete entries as provisioning and de-provisioning occurs. If a managed AI component is added, corresponding flows and pipes can be created. This can include management flows to and from the management blade 330. Alternatively, if a managed AI component is removed, the corresponding flows and pipes can be deleted. This also includes the management flows to and from the management blade 330 within the appliance 250. Further, if an uplink is added for a VLAN, the corresponding pipes can be created. On the other hand, if an uplink is removed for a VLAN, the corresponding pipes can be deleted. With the provisioning and de-provisioning of AI components and the addition and removal of managed AI components, the classification mapping table can be considered dynamic during operation (i.e., entries are created and removed as AI components are provisioned and de-provisioned and as managed AI components are added and removed).
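A minimal sketch of keeping the classification mapping table dynamic as servers are provisioned and de-provisioned; the keys, flow names, and addresses are illustrative assumptions.

```python
# Minimal sketch: create and delete classification mapping entries as servers come and go.
classification_mapping = {}

def provision_server(ip: str, port: int, protocol: str, flow_id: str) -> None:
    """Create the flow entry (and, by implication, the associated pipe) for a newly provisioned server."""
    classification_mapping[(ip, port, protocol)] = flow_id

def deprovision_server(ip: str) -> None:
    """Delete every flow entry that referred to the de-provisioned server."""
    stale = [key for key in classification_mapping if key[0] == ip]
    for key in stale:
        del classification_mapping[key]

provision_server("10.0.0.21", 443, "tcp", "storefront-checkout")
deprovision_server("10.0.0.21")
print(classification_mapping)  # {}
```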
In one exemplary, non-limiting embodiment, a number of flows within the distributed computing environment 200 may cross network devices that are upstream of a management blade 330. Further, the priority and latency settings that are established during the execution of the above-described method can have an influence on the latency and priority of those affected packets as they cross any upstream network devices. As such, the hierarchy established for priority can be based on a recognized standard (e.g., the IEEE 802.1p/802.1q standards). Additionally, when connection requests are refused or lost, the requestor may employ an exponential back-off mechanism before re-trying the connection request. Thus, in an exemplary, non-limiting embodiment, the connection throttle can throttle connection requests in whatever manner is required to invoke the standard request back-off mechanism.
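The standard exponential back-off behavior referred to above might look like the following sketch, in which a requestor doubles a jittered wait after each refused or lost connection request; the delays and retry count are illustrative assumptions.

```python
# Minimal sketch: exponential back-off with jitter before retrying a refused connection.
import random
import time

def connect_with_backoff(try_connect, max_attempts: int = 5,
                         base_delay: float = 0.1, max_delay: float = 5.0) -> bool:
    """Retry a connection, doubling the (jittered) wait after each refusal."""
    for attempt in range(max_attempts):
        if try_connect():
            return True
        delay = min(max_delay, base_delay * (2 ** attempt))
        time.sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids synchronized retries
    return False

# Example: a flaky endpoint that succeeds on the third attempt.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    return attempts["n"] >= 3

print(connect_with_backoff(flaky))  # True
```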
The above-described method can be used to control the delivery of flows, streams, or both along pipes to and from managed AI components within a distributed computing environment. Depending on the direction of travel of a particular stream or a particular flow, some or all of the controls can be implemented at the beginning or end of each pipe. Further, by controlling a distributed computing environment 200 using the method described above, the efficiency and quality of service of the application using the distributed computing environment 200 can be increased.
Note that not all of the activities described herein are necessary, that a portion of a specific activity may not be required, and that further activities may be performed in addition to those illustrated. Additionally, the order in which each of the activities is listed is not necessarily the order in which they are performed. After reading this specification, a person of ordinary skill in the art will be capable of determining which activities and orderings best suit any particular objective.
In the foregoing specification, the invention has been described with reference to specific embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the invention.
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any component(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or component of any or all the claims.

Claims

WHAT IS CLAIMED IS:
1. A distributed computing environment comprising: at least one apparatus that controls at least a portion of the distributed computing environment; network devices; and a network lying between each of the network devices and the at least one apparatus, wherein: the network is configured to allow content traffic and management traffic within the at least a portion of the distributed computing environment to travel over the same network; and the network is configured such that at least a portion of a connection or a bandwidth within the network is reserved for the management traffic and is not used for the content traffic, wherein the at least one apparatus, the network devices, or any combination thereof comprises a classification module to classify a network packet as part of the content traffic or the management traffic.
2. The distributed computing environment of claim 1, wherein each of the network devices is directly connected to the at least one apparatus.
3. The distributed computing environment of claim 1, wherein the distributed computing environment is configured so that substantially all network traffic to and from each of the network devices passes through the at least one apparatus.
4. The distributed computing environment of claim 1, wherein: the at least one apparatus comprises a central management component and at least one management execution component; each of the network devices comprises a software agent; and a management infrastructure comprises the central management component, the at least one management execution component, the software agents, and at least a portion of the network.
5. The distributed computing environment of claim 4, wherein the management execution component is operable to detect a problem on at least one network device, correct the problem on the at least one network device, isolate the at least one network device, or any combination thereof.
6. The distributed computing environment of claim 4, wherein the distributed computing environment is configured so that substantially all network traffic between any two network devices passes through the at least one management execution component.
7. The distributed computing environment of claim 4, wherein an application infrastructure comprises the at least one management execution component, the network devices and software agents, and at least a portion of the network.
8. The distributed computing environment of claim 7, wherein the central management component is not part of the application infrastructure.
9. The distributed computing environment of claim 1, wherein the network devices comprise at least one Layer 2 network device and at least one Layer 3 network device.
10. The distributed computing environment of claim 1, wherein at least one of the network devices comprises a Layer 2 network device and a Layer 3 network device.
11. The distributed computing environment of claim 1, wherein the at least one management execution component is configured to perform a routing function of a Layer 3 network device.
12. The distributed computing environment of claim 1, the at least one apparatus configured to identify a network packet as a management packet or a content packet.
13. A distributed computing environment comprising: at least one apparatus that controls the distributed computing environment, wherein the at least one apparatus comprises a central management component and at least one management execution component, and wherein the at least one apparatus comprises a classification module to classify a network packet as part of management traffic or content traffic; and network devices, wherein substantially all network traffic between any two network devices passes through the at least one management execution component.
14. The distributed computing environment of claim 13, wherein each of the network devices is directly connected to the at least one management execution component.
15. The distributed computing environment of claim 13, further comprising a network lying between each of the network devices and the at least one apparatus, wherein the network is configured to allow the content traffic and the management traffic within the distributed computing environment to travel over the same network.
16. The distributed computing environment of claim 13, wherein: each of the network devices comprises a software agent; and a management infrastructure comprises the central management component, the at least one management execution component, the software agents, and at least a portion of a network between the network devices and the at least one apparatus.
17. The distributed computing environment of claim 16, wherein an application infrastructure comprises the at least one management execution component, the network devices and the software agents, and at least a different portion of the network.
18. The distributed computing environment of claim 17, wherein the central management component is not part of the application infrastructure.
19. The distributed computing environment of claim 13, wherein the management execution component is operable to detect a problem on at least one network device, correct the problem on the at least one network device, isolate the at least one network device, or any combination thereof.
20. The distributed computing environment of claim 13, wherein the network devices comprise at least one Layer 2 network device and at least one Layer 3 network device.
21. The distributed computing environment of claim 13, wherein at least one of the network devices comprises a Layer 2 network device and a Layer 3 network device.
22. The distributed computing environment of claim 13, wherein the at least one management execution component is configured to perform a routing function of a Layer 3 network device.
23. An apparatus for controlling at least a portion of a distributed computing environment, the apparatus comprising: a central management component; at least one management execution component; and a classification module to classify a network packet as part of management traffic or content traffic within the distributed computing environment.
24. The apparatus of claim 23, further comprising ports configured to receive connections from network devices, wherein at least one of the ports has: associated connections, wherein at least a portion of the associated connections is reserved for management traffic; an associated bandwidth, wherein at least a portion of the associated bandwidth is reserved for the management traffic; or any combination thereof.
25. The apparatus of claim 24, wherein the at least one management execution component includes a first management blade, wherein each network device within the distributed computing environment is connected to the first management blade.
26. The apparatus of claim 25, wherein the at least one management execution component includes a second management blade, wherein each network device within the distributed computing environment is connected to the second management blade.
27. The apparatus of claim 23, further comprising a port connectable to a network device, wherein, when the network device malfunctions, the apparatus is configured to perform a function of the network device.
28. The apparatus of claim 27, wherein, when the network device malfunctions, the apparatus is further configured to isolate the network device from a remaining portion of the distributed computing environment.
29. The apparatus of claim 23, wherein the at least one management execution component is configured to perform a routing function of a Layer 3 network device.
30. The apparatus of claim 23, further comprising a control setting module, wherein, based at least in part on the classification, the control setting module sets a control for the network packet, wherein the control corresponds to a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof.
31. The apparatus of claim 23, wherein: the at least one management execution component is configured to receive the content traffic and the management traffic; and the central management component is configured to receive the management traffic but not the content traffic.
32. A method of controlling at least a portion of a distributed computing environment comprising: examining a network packet; classifying the network packet as management data or content data; and routing the network packet based on the classification.
33. The method of claim 32, wherein classifying the network packet comprises classifying the network packet based on a protocol, a source address, a destination address, a source port, a destination port, or any combination thereof.
34. The method of claim 33, wherein classifying the network packet comprises classifying the network packet using a stream/flow mapping table.
35. The method of claim 32, wherein routing the network packet comprises routing the network packet over a management infrastructure.
36. The method of claim 35, wherein routing the network packet further comprises routing the network packet to a management execution component.
37. The method of claim 36, wherein routing the network packet further comprises: receiving the network packet at the management execution component, after the network packet is sent from a first network device; and sending the network packet from the management execution component to a second network device different from the first network device.
38. The method of claim 35, wherein routing the network packet further comprises routing the network packet over an application infrastructure.
39. The method of claim 38, wherein routing the network packet further comprises routing the network packet to an agent on a network device.
40. The method of claim 38, further comprising blocking other traffic in the application infrastructure.
41. The method of claim 32, further comprising setting a control for the network packet, wherein the control corresponds to a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof, wherein setting the control is based at least in part on the classification.
42. A method of controlling at least a portion of a distributed computing environment comprising: classifying a network packet associated with a stream or a flow; and setting a control for the stream, the flow, or a pipe based at least in part on the classification.
43. The method of claim 42, further comprising examining a parameter of the network packet.
44. The method of claim 43, wherein the parameter comprises a virtual local area network identification, a source address, a destination address, a source port, a destination port, a protocol, a connection request, a transaction type load tag, or any combination thereof.
45. The method of claim 43, further comprising associating the network packet with one of a set of specific flows/streams at least partially based on the parameter.
46. The method of claim 45, wherein associating the network packet comprises using a classification mapping table, wherein an entry in the classification mapping table maps the network packet to a specific stream/flow.
47. The method of claim 46, wherein each entry in the classification mapping table is mapped to an entry in a stream/flow mapping table.
48. The method of claim 47, wherein each entry in the classification mapping table or the stream/flow mapping table includes values for settings for priority, latency, a connection throttle, a packet throttle, and a combination thereof.
49. The method of claim 43, further comprising determining a value of the setting based at least in part on the value of the parameter.
50. The method of claim 49, wherein setting the control is applied once to the flow or the stream, regardless of a number of pipes used for the flow or the stream.
51. The method of claim 49, wherein the value of the setting is obtained from a flow entry and not a stream entry of a table.
52. A method of processing a network packet in a distributed computing environment including an application infrastructure, the method comprising: receiving a communication from the application infrastructure, wherein the communication includes the network packet; classifying the network packet; and setting a control for the network packet based at least in part on the classification.
53. The method of claim 52, wherein classifying the network packet is based at least in part on a protocol, a source address, a destination address, a source port, a destination port, or any combination thereof.
54. The method of claim 53, wherein classifying the network packet further comprises associating the network packet with at least one of a set of: application-specific streams; application-specific flows; transaction type-specific streams; transaction type-specific flows; or any combination thereof.
55. The method of claim 54, wherein associating the network packet is accomplished using a stream/flow mapping table, wherein an entry in the stream/flow mapping table maps the network packet to a stream or a flow.
56. The method of claim 55, wherein members within the set are classified by types of traffic.
57. The method of claim 56, further comprising determining an action based on the stream or the flow associated with the network packet.
58. The method of claim 57, wherein the action includes at least one of drop, meter, and inject.
59. The method of claim 54, further comprising assigning a weighted random discard value to the network packet, based at least in part on the stream or the flow associated with the network packet.
60. The method of claim 59, wherein assigning a weighted random discard value is based on a stream rate, a flow rate, or any combination thereof.
61. The method of claim 60, further comprising discarding the network packet based on the weighted random early discard value.
62. The method of claim 61, wherein the weighted random early discard value is based on a control value.
63. The method of claim 62, wherein the control value is on a logarithmic scale.
64. The method of claim 52, wherein setting the control comprises setting a priority for the network packet, a latency for the network packet, a connection throttle for the network packet, a packet throttle for the network packet, or any combination thereof.
65. A data processing system readable medium having code for carrying out the method as in any of claims 32 to 64, wherein the code is embodied within the data processing system readable medium, the code comprising instructions corresponding to actions as recited within the method.
66. An apparatus configured to carry out the method as in any of claims 32 to 64.
PCT/US2005/012938 2004-04-16 2005-04-14 Distributed computing environment and methods for managing and controlling the same WO2005104494A2 (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US10/826,719 2004-04-16
US10/826,777 US20050243814A1 (en) 2004-04-16 2004-04-16 Method and system for an overlay management system
US10/826,777 2004-04-16
US10/826,719 US20050232153A1 (en) 2004-04-16 2004-04-16 Method and system for application-aware network quality of service
US10/881,078 2004-06-30
US10/881,078 US20060031561A1 (en) 2004-06-30 2004-06-30 Methods for controlling a distributed computing environment and data processing system readable media for carrying out the methods
US10/885,216 US20060007941A1 (en) 2004-07-06 2004-07-06 Distributed computing environment controlled by an appliance
US10/885,216 2004-07-06

Publications (2)

Publication Number Publication Date
WO2005104494A2 true WO2005104494A2 (en) 2005-11-03
WO2005104494A3 WO2005104494A3 (en) 2006-01-12

Family

ID=34968386

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/012938 WO2005104494A2 (en) 2004-04-16 2005-04-14 Distributed computing environment and methods for managing and controlling the same

Country Status (1)

Country Link
WO (1) WO2005104494A2 (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5130974A (en) * 1989-03-30 1992-07-14 Nec Corporation Multidrop control network commonly used for carrying network management signals and topology reconfiguration signals
EP0840533A2 (en) * 1996-10-31 1998-05-06 Lucent Technologies Inc. A method and system for communicating with remote units in a communication system
US20030188003A1 (en) * 2001-05-04 2003-10-02 Mikael Sylvest Method and apparatus for the provision of unified systems and network management of aggregates of separate systems

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7739686B2 (en) * 2004-12-21 2010-06-15 Sap Ag Grid managed application branching based on priority data representing a history of executing a task with secondary applications
CN103905325A (en) * 2012-12-26 2014-07-02 中兴通讯股份有限公司 Two-layer network data transmission method and network node
CN103905325B (en) * 2012-12-26 2018-12-11 南京中兴软件有限责任公司 Double layer network data transferring method and network node

Also Published As

Publication number Publication date
WO2005104494A3 (en) 2006-01-12

Similar Documents

Publication Publication Date Title
US8705363B2 (en) Packet scheduling method and apparatus
CN104798356B (en) Method and apparatus for the utilization rate in controlled level expanding software application
US6625650B2 (en) System for multi-layer broadband provisioning in computer networks
US6006264A (en) Method and system for directing a flow between a client and a server
US20020188732A1 (en) System and method for allocating bandwidth across a network
US7200144B2 (en) Router and methods using network addresses for virtualization
US20050232153A1 (en) Method and system for application-aware network quality of service
US20080008202A1 (en) Router with routing processors and methods for virtualization
US20060193318A1 (en) Method and apparatus for processing inbound and outbound quanta of data
US20030033421A1 (en) Method for ascertaining network bandwidth allocation policy associated with application port numbers
US9166927B2 (en) Network switch fabric dispersion
CA2750345A1 (en) Method of allocating bandwidth between zones according to user load and bandwidth management system thereof
US8630296B2 (en) Shared and separate network stack instances
JP2009231890A (en) Packet relay device and traffic monitoring system
JP6389564B2 (en) Improved network utilization in policy-based networks
CN113422699A (en) Data stream processing method and device, computer readable storage medium and electronic equipment
WO2005104494A2 (en) Distributed computing environment and methods for managing and controlling the same
JP2016122960A (en) Management system, network management method, network system
JP5194025B2 (en) How to optimize the sharing of multiple network resources between multiple application flows
WO2021052382A1 (en) Cloud service bandwidth management and configuration methods and related device
US20050243814A1 (en) Method and system for an overlay management system
CN108075955A (en) The data processing method and device of backbone network
Meitinger et al. A hardware packet re-sequencer unit for network processors
US20060031561A1 (en) Methods for controlling a distributed computing environment and data processing system readable media for carrying out the methods
JP3581056B2 (en) Traffic observing device, traffic monitoring device, datagram transfer device, and datagram transfer system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC

122 Ep: pct application non-entry in european phase

Ref document number: 05737529

Country of ref document: EP

Kind code of ref document: A2