US20150278146A1 - Method And Apparatus For Providing Traffic Re-Aware Slot Placement - Google Patents

Method And Apparatus For Providing Traffic Re-Aware Slot Placement Download PDF

Info

Publication number
US20150278146A1
US20150278146A1 US14/731,980 US201514731980A US2015278146A1 US 20150278146 A1 US20150278146 A1 US 20150278146A1 US 201514731980 A US201514731980 A US 201514731980A US 2015278146 A1 US2015278146 A1 US 2015278146A1
Authority
US
United States
Prior art keywords
slot
slots
placement
redundancy
bandwidth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/731,980
Inventor
Krishna P. Puttaswamy Naga
Ashok Anand
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alcatel Lucent SAS
Original Assignee
Alcatel Lucent SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alcatel Lucent SAS filed Critical Alcatel Lucent SAS
Priority to US14/731,980 priority Critical patent/US20150278146A1/en
Publication of US20150278146A1 publication Critical patent/US20150278146A1/en
Abandoned legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4063Device-to-bus coupling
    • G06F13/4068Electrical coupling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/42Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4204Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus
    • G06F13/4221Bus transfer protocol, e.g. handshake; Synchronisation on a parallel bus being an input/output bus, e.g. ISA bus, EISA bus, PCI bus, SCSI bus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44505Configuring for program initiating, e.g. using registry, configuration files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources

Definitions

  • the inventions relate generally to methods and apparatus for providing slot placement.
  • virtual machines are placed in data center racks based on minimizing the bandwidth requirements of data transfers across racks.
  • Various embodiments provide a method and apparatus of providing an RE-aware technique for placing slots based on redundancy across and within slot communication pairs.
  • an apparatus for providing slot placement.
  • the apparatus includes a data storage and a processor communicatively connected to the data storage.
  • the processor is prgrammed to: determine a plurality of slots to be placed within a plurality of containers; determine a plurality of redundancy parameters; and determine a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters.
  • a plurality of the plurality of slots are virtual machines.
  • a plurality of the plurality of containers are racks in a data center.
  • the redundancy parameters are based on intra-comm-redundancy and inter-comm-redundancy of at least a portion of the plurality of slots.
  • the redundancy parameters are based on communication patterns of at least a portion of the plurality of slots.
  • the determination of the placement of the plurality of slots comprises programming the processor to: determine a plurality of clusters; assign each of at least a portion of the plurality of slots to at least one of the plurality of clusters; and assign each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
  • the assignment of the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
  • the assignment of the portion of the plurality of slots is further based on one or more slot placement constraints.
  • the assignment of the portion of the plurality of clusters is based on an effective-inter-container-bandwidth-usage.
  • the determination of the placement of the plurality of slots comprises further programming the processor to exchange a first placed slot and a second placed slot based on a determination that inter-container-bandwidth-usage will be improved and that the exchange meets one or more inter-container bandwidth constraints.
  • a switching system fof providing slot placement.
  • the system including: a placement controller and a plurality of agents communicatively connected to the placement controller.
  • the plurality of agents are programmed to: collect a plurality of redundancy measurements and send the plurality of redundancy measurements to the placement controller.
  • the placement controller is prgrammed to: receive the plurality of redundancy measurements, determine a plurality of slots to be placed within a plurality of containers, determine a plurality of redundancy parameters based on the plurality of redundancy measurements, and determine a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters.
  • the determination of the placement of the plurality of slots comprises programming the placement controller to: determine a plurality of clusters, assign each of at least a portion of the plurality of slots to at least one of the plurality of clusters, and assign each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
  • the assignment of the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
  • the assignment of the portion of the plurality of slots is further based on one or more slot placement constraints.
  • the assignment of the portion of the plurality of clusters is based on an effective-inter-container-bandwidth-usage.
  • the determination of the placement of the plurality of slots comprises further programming the placement controller to: exchange a first placed slot and a second placed slot based on a determination that inter-container-bandwidth-usage will be improved and that the exchange meets one or more inter-container bandwidth constraints.
  • a method for providing slot placement. The method including: determining a plurality of slots to be placed within a plurality of containers, determining a plurality of redundancy parameters, and determining a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters.
  • the method further includes exchanging a first placed slot and a second placed slot based on a determination that inter-container-bandwidth-usage will be improved and that the exchange meets one or more inter-container bandwidth constraints.
  • the step of determining the placement of the plurality of slots includes: determining a plurality of clusters, assigning each of at least a portion of the plurality of slots to at least one of the plurality of clusters, and assigning each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
  • the step of assigning the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
  • FIG. 1 illustrates an embodiment of an RE-aware data center 100 serving clients 190 - 1 and 190 - 2 via network 180 ;
  • FIG. 2 depicts a flow chart illustrating an embodiment of a method 200 for an RE-aware system (e.g., RE-aware data center 100 of FIG. 1 ) to perform RE-aware placements of slots;
  • an RE-aware system e.g., RE-aware data center 100 of FIG. 1
  • FIG. 3 illustrates a functional block diagram 300 of redundancy in an RE-aware data center 100 ;
  • FIG. 4 depicts a flow chart illustrating an embodiment of a method 400 for a placement controller (e.g., placement controller 170 of FIG. 1 ) to place slots as illustrated in step 270 of FIG. 2 ;
  • a placement controller e.g., placement controller 170 of FIG. 1
  • FIG. 5 depicts a flow chart illustrating an embodiment of a method 500 for placing slots as illustrated in step 480 of FIG. 4 ;
  • FIG. 6 depicts a flow chart illustrating an embodiment of a method 600 for placing slots as illustrated in step 480 of FIG. 4 and steps 520 , 540 and 560 of FIG. 5 ;
  • FIG. 7 depicts pseudo code illustrating an embodiment of steps 610 - 640 of FIG. 6 ;
  • FIG. 8 depicts pseudo code illustrating an embodiment of step 650 of FIG. 6 ;
  • FIG. 9 depicts pseudo code illustrating an embodiment of step 670 of FIG. 6 .
  • FIG. 10 schematically illustrates an embodiment of various apparatus 1000 such as one one of end hosts 120 , one of ToR switches 130 , one of aggregation switches 150 , or placement controller 170 of FIG. 1 .
  • Various embodiments provide a method and apparatus of providing an RE-aware technique for placing slots based on redundancy across and within slot communication pairs.
  • such placement may reduce overall bandwidth usage of inter-container (e.g., racks or data center) links.
  • FIG. 1 illustrates an embodiment of an RE-aware data center 100 serving clients 190 - 1 and 190 - 2 via network 180 .
  • the RE-aware data center 100 includes racks 110 - 1 and 110 - 2 (collectively, racks 110 ), top of the rack switches 130 - 1 and 130 - 2 (collectively, ToR switches 130 ), redundancy elimination (RE) boxes 140 - 1 and 140 - 2 (collectively, RE boxes 140 ), aggregation switches 150 - 1 - 150 - 2 (collectively, aggregation switches 150 ) and placement controller 170 .
  • racks 110 - 1 and 110 - 2 collectively, racks 110
  • top of the rack switches 130 - 1 and 130 - 2 collectively, ToR switches 130
  • RE redundancy elimination
  • the RE-aware data center 100 also includes links 135 - 1 , 145 - 1 , 155 - 1 , 175 - 1 , and 185 - 1 (collectively, RE-aware data center links) (remaining link labels have been omitted for the purpose of clarity).
  • Racks 110 are communicatively connected with network 180 via an appropriate one of the ToR switches 130 , RE boxes 140 aggregation switches 150 , and appropriate RE-aware data center links. It should be appreciated that RE-aware data center 100 may be architected in any suitable configuration and that RE-aware data center 100 is just one exemplary architecture being used for illustrative purposes.
  • Placement controller 170 controls the allocation of resources (e.g., processing, memory, storage or networking) in racks 110 within RE-aware data center 100 .
  • Racks 110 include end hosts 120 - 1 -EH 1 - 120 - 1 -EH 5 and 120 - 2 -EH 1 - 120 - 2 -EH 5 (collectively, end hosts 120 ). It should be appreciated that while 2 racks are illustrated here, system 100 may include more of less racks. It should be further appreciated that not all of the racks of system 100 may include end hosts.
  • End host 120 - 1 -EH 1 include slots 122 - 1 - 122 - 3 (collectively, slots 122 ) (illustrated as hashed boxes) and agent 160 - 1 -A 1 (illustrated as checkered box). It should be appreciated that while 3 slots are illustrated in end host 120 - 1 -EH 1 , each of end hosts 120 may include fewer or more slots (remaining slots have been omitted for the purpose of clarity), including some end hosts having no slots.
  • each of end hosts 120 may include fewer of more agents, including some end hosts having no agents.
  • Slots 122 are a collection of allocated resources configured to service application instances serving clients 190 - 1 and 190 - 2 (collectively, clients 190 ).
  • the term “slot” as user herein may be any suitable collection of allocated resources such as a virtual machine comprising processing/compute, memory or networking resources or virtualized storage.
  • ToR switches 130 switch data between end hosts in an associated rack and appropriate aggregate switch(es). For example, ToR switch 130 - 1 switches data from end hosts in rack 110 - 1 to end hosts in other racks (e.g., rack 110 - 2 ) or network 180 via aggregate switch 150 - 1 or 150 - 2 .
  • RE boxes 140 perform redundancy elimination.
  • RE boxes 140 cache received data and suppress duplicate traffic. It should be appreciated that while two (2) RE boxes are illustrated system 100 may include fewer of more RE boxes. It should be further appreciated that though illustrated as being placed after ToR switches in system 100 , RE boxes may be placed in any suitable portion of system 100 to perform redundancy elimination. Moreover, though illustrated as a separate component, RE boxes 140 may include any configuration such as: (a) a separate redundancy elimination component; (b) a software module residing on a component such as one or more of the end hosts 120 or one or more of the ToR switches 130 ; (c) or any other configuration combination.
  • a source RE box may create an identifier associated with received data from an end host (e.g., end host 120 - 1 -EH 1 ) and transmits the created identifier to a target RE box (e.g., RE box 140 - 2 ).
  • a sourcing RX box e.g., RE box 140 - 1
  • the receiving RE box uses the created identifier to retrieve the complete stored byte sequence and forwards the retrieved byte sequence to it destination (e.g., end host 120 - 2 -EH 1 ).
  • Aggregation switches 150 switch data between an associated ToR switch and network 180 .
  • ToR switch 130 - 1 - 1 switches data from resources in rack 110 - 1 to network 180 via an appropriate aggregation switch (e.g., aggregation switch 150 - 1 or 150 - 2 ).
  • Agents 160 - 1 -A 1 - 160 - 1 -A 5 , 160 - 2 -A 1 - 160 - 2 -A 5 , 160 - 3 -A 1 - 160 - 3 -A 2 , and 160 - 4 -A 1 - 160 - 4 -A 2 (collectively, agents 160 ) measure the redundancy in communications (e.g., slot-to-slot) and report these measurements to placement controller 170 via one or more appropriate RE-aware data center links.
  • the term “redundancy” as used herein means any content sequence capable of being de-duplicated by RE boxes 140 and should be understood broadly as including any repeated content sequence.
  • system 100 may include any configuration of agents.
  • agents may be deployed: (a) on a portion or all of end hosts 120 ; (b) on a portion or all of end hosts 120 , and a portion or all of aggregation switches 150 ; (c) or any other configuration combination.
  • an agent may monitor slot-to-slot communications for slots residing on the end hosts.
  • an agent may monitor: (a) slot-to-slot communications from slots on different end hosts within a rack; (b) communications among end hosts within the rack; (c) communications among an end host within the rack and an end host on other racks; or (d) the like.
  • agents on aggregation switches 150 may monitor the outgoing communications to other data centers.
  • Placement controller 170 includes assigns slots (e.g., one or more of slots 122 ) to containers (e.g., one of racks 110 or a data center in a distributed data center). It should be appreciated that while one (1) placement controller is illustrated here, system 100 may include more placement controllers or that there could be one placement controller for a set of data centers. In some of these embodiments, though illustrated within system 100 , placement controller 170 may be positioned outside of system 100 and connected to system 100 via network 180 .
  • RE-aware data center links and links 195 - 1 and 195 - 2 support communicating over one or more communication channels such as: wireless communications (e.g., LTE, GSM, CDMA, bluetooth); femtocell communications (e.g., WiFi); packet network communications (e.g., IP); broadband communications (e.g., DOCSIS and DSL); storage communications (e.g., Fibre Channel, iSCSI) and the like.
  • wireless communications e.g., LTE, GSM, CDMA, bluetooth
  • femtocell communications e.g., WiFi
  • packet network communications e.g., IP
  • broadband communications e.g., DOCSIS and DSL
  • storage communications e.g., Fibre Channel, iSCSI
  • the network 180 includes any number of access and edge nodes and network devices and any number and configuration of links (not shown for purposes of clarity). Moreover, it should be appreciated that network 180 may include any combination and any number of wireless, or wire line networks including: LTE, GSM, CDMA, Local Area Network(s) (LAN), Wireless Local Area Network(s) (WLAN), Wide Area Network (WAN), Metropolitan Area Network (MAN), or the like.
  • LTE Long Term Evolution
  • GSM Global System for Mobile communications
  • CDMA Code Division Multiple Access
  • LAN Local Area Network
  • WLAN Wireless Local Area Network
  • WAN Wide Area Network
  • MAN Metropolitan Area Network
  • Clients 190 - 1 and 190 - 2 may include any type of communication device(s) capable of sending or receiving information over network 180 via an appropriate one of links 195 - 1 or 195 - 2 .
  • a communication device may be a thin client, a smart phone (e.g., client 190 - 2 ), a personal or laptop computer (e.g. 190 - 1 ), server, network device, tablet, e-reader or the like.
  • Communication devices may rely on other resources within exemplary system to perform a portion of tasks, such as processing or storage, or may be capable of independently performing tasks. It should be appreciated that while two clients are illustrated here, system 100 may include fewer or more clients. Moreover, the number of clients at any one time may be dynamic as clients may be added or subtracted from the system at various times during operation.
  • some or a portion of slots 122 are virtual machines.
  • some or a portion of slots 122 are virtualized storage.
  • ToR switches 130 are Ethernet switches.
  • aggregation switches 150 are layer 2 Ethernet switches.
  • RE-aware data center 100 is a distributed data center in a system of distributed data centers.
  • RE boxes 140 may be placed after aggregation switches 150 (e.g., between the aggregation switches 150 and network 180 ) in place of or in addition to their illustrated placement after ToR switches 130 .
  • the RE-aware data center 100 may be used to reduce the inter-data center traffic in order to more efficiently utilize the bandwidth between these distributed data centers.
  • FIG. 2 depicts a flow chart illustrating an embodiment of a method 200 for an RE-aware system (e.g., RE-aware data center 100 of FIG. 1 ) to perform RE-aware placements of slots.
  • the method includes one or more agents 160 - 1 - 160 -N (e.g., one or more of agents 160 of FIG. 1 ) collecting redundancy measurements (steps 210 - 1 - 210 -N) and reporting the collected measurements (steps 230 - 1 - 230 -N) to the placement controller 170 (e.g., placement controller 170 of FIG. 1 ) based on a determination that a reporting trigger has occurred (steps 220 - 1 - 220 -N).
  • the placement controller 170 e.g., placement controller 170 of FIG. 1
  • the method further includes the placement control 170 receiving the measurements (step 250 ) from agents 160 - 1 - 160 -N and performing slot placement based on the received measurements (step 270 ) based on a determination that a placement trigger has occurred (step 260 ).
  • the steps 210 - 1 - 210 -N include collecting redundancy measurements.
  • measurements include slot communication patterns and the determined redundancy associated with slot pair(s).
  • Communication patterns may be a traffic matrix of the collected measurements between slots and the determined redundancy may include the redundant traffic identified between slot-to-slot communications in the traffic matrix.
  • the steps 220 - 1 - 220 -N include determining that a reporting trigger has occurred. Based on the trigger determination, the method either proceeds to one of steps 230 - 1 - 230 -N respectively or returns to one of steps 220 - 1 - 220 -N respectively.
  • the trigger may be any suitable event signaling that collected measurements should be sent to placement controller 170 .
  • the trigger event may be: (a) periodically triggered at threshold intervals; (b) based on the measurements collected in one of steps 210 - 1 - 210 -N respectively; (c) based on receipt of a request from placement controller 170 ; or (d) the like. It should be appreciated that multiple trigger events may occur at the same time.
  • the steps 230 - 1 - 230 -N include reporting the collected measurements to placement controller 170 .
  • collected measurements may be reported in any suitable form such as: (a) the raw collected measurements; (b) compressed or modified information based on the collected measurements; (c) an aggregate of the collected measurements; or (d) the like.
  • the step 250 includes receiving measurements from one or more of agents 160 - 1 - 160 -N.
  • the measurements may be pushed or pulled to the apparatus performing the method.
  • the step 260 includes determining that a placement trigger has occurred. Based on the trigger determination, the method either proceeds to steps 270 or returns to step 260 .
  • the trigger may be any suitable event signaling that a placement should be performed.
  • the trigger event may be: (a) periodically triggered at threshold intervals; (b) based on the agent measurements received in step 250 ; or (c) the like. It should be appreciated that multiple trigger events may occur at the same time.
  • a quality threshold may be any suitable threshold such as the current effective inter-container bandwidth being greater than a threshold limit as compared to an estimate optimal effective inter-container bandwidth.
  • the step 270 includes placing the slots within in the end hosts (e.g., end hosts 120 in FIG. 1 ).
  • the slots to be placed in end hosts are determined and are placed in end hosts based on the agent measurements received in step 250 .
  • redundancy measurements include measurements of redundancy in intra-communication-redundancy or inter-communication-redundancy in slot-to-slot communications.
  • intra-comm-redundancy means the redundancy within slot-to-slot communications and should be understood broadly as including the redundancy between any slot-to-slot communication. For example, the redundancy between: (a) slot 122 - 1 and 122 - 3 of FIG. 1 ; (b) 122 - 1 and a slot (not shown) in end host 120 - 2 -EH 1 of FIG. 1 ; (c); 322 -SLOT 1 and 322 -SLOT 2 of FIG. 3 or (d) 322 -SLOT 1 and 322 -SLOT 3 of FIG. 3 .
  • inter-comm-redundancy means the redundancy between two slot-to-slot communications and should be understood broadly as including the redundancy between any two slot-to-slot communications.
  • an API may be provided for communication between a placement controller and agent (e.g., placement controller 170 and one or more of agents 160 of FIG. 1 ).
  • a placement controller may request all of a portion of collected redundancy measurements from one or more agents.
  • a placement trigger in step 260 may involve determining whether the current effective inter-container bandwidth meets the estimated optimal effective inter-container bandwidth.
  • determining the estimated optimal effective inter-container bandwidth may require a determination of the optimal placement of virtual machines. As such, the same determined optimal placement of virtual machines may be used in both steps 260 and 270 .
  • FIG. 3 illustrates a functional block diagram 300 of redundancy in a RE-aware data center 100 .
  • Diagram 300 includes rack 310 - 1 and 310 - 2 (e.g., two of racks 110 in FIG. 1 ) and RE boxes 340 - 1 and 340 - 2 .
  • Rack 310 - 1 includes slots 322 -SLOT 1 and 322 -SLOT 2 and rack 310 - 2 includes slots 322 -SLOT 3 and 322 -SLOT 4 .
  • Data transfer paths 390 - 1 - 390 - 7 illustrate data transfer paths between components.
  • RE boxes are positioned to de-duplicate data over the network links illustrated by data transfer 390 - 4 .
  • rack 310 - 1 and 310 - 2 may be data centers.
  • RE boxes 340 - 1 or 340 - 2 may include more than one RE box.
  • separate RE boxes may be software modules executing on one or more of the end hosts.
  • the four virtual machines are assigned to slots using a bandwidth-based placement strategy that places the VM pairs having higher bandwidth traffic within the same rack.
  • VM 1 and VM 3 are placed in the same rack and VM 2 and VM 4 are placed in the same rack.
  • the placements may be: VM 1 assigned to 322 -SLOT 1 , VM 3 assigned to 322 -SLOT 2 , VM 2 assigned to 322 -SLOT 3 and VM 4 assigned to 322 -SLOT 4 .
  • the bandwidth requirements across data transfer links would be:
  • the inter-rack bandwidth requirement (i.e., the bandwidth over data transfer 390 - 4 ) is 15V as shown below.
  • the four virtual machines are assigned to slots using an RE-aware placement strategy as described herein.
  • VM 1 and VM 2 are placed in the same rack and VM 3 and VM 4 are placed in the same rack.
  • the placements may be: VM 1 assigned to 322 -SLOT 1 , VM 2 assigned to 322 -SLOT 2 , VM 3 assigned to 322 -SLOT 3 and VM 4 assigned to 322 -SLOT 4 .
  • the bandwidth requirements across data transfer links would be:
  • the inter-rack bandwidth requirement i.e., the bandwidth over data transfer 390 - 4
  • the inter-comm-redundancy i.e., similarity between data transfers 322 -SLOT 1 ⁇ 322 -SLOT 3 and 322 -SLOT 2 ⁇ 322 -SLOT 4
  • the inter-comm-redundancy slot pairs traverse the RE box pair 340 - 1 and 340 - 2 allowing de-duplication of data.
  • the RE-aware placement strategy significantly reduced the bandwidth usage of inter-rack (or inter-data center in a distributed data center) link 390 - 4 by 33% over a bandwidth-based placement strategy (i.e., 10V compared with 15V).
  • FIG. 4 depicts a flow chart illustrating an embodiment of a method 400 for a placement controller (e.g., placement controller 160 of FIG. 1 ) to place slots in the end hosts as illustrated in step 270 of FIG. 2 .
  • the method includes placing the slots (step 480 ) based on a determined set of slots to be placed (step 420 ), the determined communication patterns (step 440 ) and the determined slot communication redundancy (step 460 ).
  • the step 420 includes determining the number and types of slots to place.
  • the apparatus performing the method determines the number and type of slots required to support the applications instances serving clients (e.g., clients 190 in FIG. 1 ).
  • the step 440 includes determining communication patterns.
  • communication patterns are derived from the communication patterns received from one or more agents (e.g., step 250 in FIG. 2 ).
  • the communication patterns may be derived using any suitable mechanism such as: (a) using the received communication patterns; (b) modifying the received communication patterns; (c) aggregating the received communication patterns; or (d) the like.
  • the step 460 includes determining communication redundancy.
  • communication redundancies are derived from the communication redundancies received from one or more agents (e.g., step 250 in FIG. 2 ).
  • the communication redundancies may be derived using any suitable mechanism such as: (a) using the received communication redundancies; (b) modifying the received communication redundancies; (c) aggregating the received communication redundancies; or (d) the like.
  • the step 480 includes performing an RE-aware placement strategy based on the determined required slots, the determined communication patterns, and the determined slot redundancies.
  • the RE-aware placement strategy takes into account the redundancy in data transfers for slot-to-slot communications and place slots to exploit redundancy in data transfers while minimizing the overall inter-rack (or inter-data center) bandwidth usage.
  • the step 480 includes basing the placement of slots on the intra-comm-redundancy of slot-to-slot data transfers across racks, and the inter-comm-redundancy of two slot-to-slot transfers across racks.
  • the intra-comm-redundancy and inter-comm-redundancy are determined across data centers (e.g., data centers in distributed settings) instead of or in addition to as across racks.
  • the step 480 includes using conventional integer linear program or classical optimization techniques to determine slot placement.
  • Conventional optimization techniques involve determining the action that best achieves a desired goal or objective.
  • An action that best achieves a goal or objective may be determined by maximizing or minimizing the value of an objective function.
  • the goal or metric of the objective function may be to minimize the overall inter-container (e.g., inter-rack or inter-data center) bandwidth usage
  • u a,b represents the bandwidth used between container C a and C b ).
  • the problem may be represented as:
  • equation E.1 is the objective function and equation E.2 constitutes the set of constraints imposed on the solution.
  • constraints may be one or more of the following:
  • FIG. 5 depicts a flow chart illustrating an embodiment of a method 500 for placing slots as illustrated in step 480 of FIG. 4 .
  • the method includes determining a number of clusters (step 520 ), assign slots to available clusters (step 540 ), and assign clusters to containers (step 560 ).
  • the step 520 includes determining a number of clusters.
  • the determined number of slots to be placed e.g., step 420 in FIG. 4
  • K partitioned into “K” clusters, such that each partitioned cluster may be allocated in at least one container.
  • the term “container” as used herein may be any suitable component capable of aggregating slots such as a rack within a data center setting or a data center itself within a distributed data center setting.
  • a container is a rack.
  • a container is a data center.
  • FIG. 6 depicts a flow chart illustrating an embodiment of a method 600 for placing slots as illustrated in step 480 of FIG. 4 and steps 520 , 540 and 560 of FIG. 5 .
  • the step 610 includes selecting a number of clusters “K” as described herein. Where K is the number of clusters to which the determined slots are distributed.
  • the step 620 includes assigning each of a number of random slots to an available container.
  • the number of random slots assigned is “K”.
  • the steps 630 and 640 include for each remaining slot “S” to be allocated (step 640 ), placing slot “S” in an available cluster based on computed distances to each available cluster (step 630 ).
  • the step 650 includes assigning clusters to containers based on an effective-inter-container-bandwidth-usage metric.
  • Step 660 includes determining whether the effective-inter-container-bandwidth-usage determined in step 650 meets a calculated optimal and whether the determined placement satisfies the constraint that each inter-container traffic (R 1 ,R 2 ) should be less than BW(R 1 ,R 2 ). Where BW is the available bandwidth between container R 1 and R 2 . Determining a calculated optimal effective-inter-container-bandwidth-usage may be done using any suitable method such as: (a) setting an iterative count and selecting the best value; (b) comparing the determined effective-inter-container-bandwidth-usage from step 650 against a predetermined threshold; or (c) the like.
  • the step 670 includes optimizing the determined slot placement.
  • optimization of slot placement includes determining whether moving container pairs would improve the effective-inter-container-bandwidth-usage. For example, exchanging S 1 in C 1 with S 2 in C 2 may be based on a determination that moving S 1 in container C 1 to container C 2 , and moving S 2 in container C 2 to C 1 improves the effective-inter-container-bandwidth-usage.
  • K is selected based on the number of available slots in each container.
  • K is selected randomly.
  • step 630 includes computing the distances to each available cluster and placing slot S in an available cluster based on the computed distances to each available cluster. Where an available cluster is a cluster that is not filled (i.e., has assigned slots than capacity).
  • computing distances to each candidate cluster is based on an effective-bandwidth metric (M).
  • M an effective-bandwidth metric
  • the effective-bandwidth metric is the bandwidth required after redundancy elimination within the data transfer from slot S to any of the other slots in the cluster (intra-comm-redundancy) and across data transfers slot S to any slot i, and slot S to any slot j in the cluster (inter-comm-redundancy).
  • the effective-bandwidth metric is computed as described in the pseudo code lines (1)-(3).
  • the function Demand (slot-i,slot-j) returns the bandwidth demand between slots “i” and “j”; the function IntraRedundancy (slot i,slot j) returns the intra-comm-redundancy for data transfers from slot i to slot j; and the function InterRedundancy (slot i, slot j, slot k, slot l) returns the inter-comm-redundancy between data transfers (slot i, slot j) and data transfers (slot k, slot l).
  • the cluster placement selection is based on a calculated minimum average distance to cluster C (Min(C)).
  • Min(C) is computed as shown in pseudo code lines (4)-(6).
  • the steps 610 - 640 are as described in FIG. 7 .
  • VM virtual machines (e.g., a type of slot);
  • Mindist u is the minimum average distance to cluster C;
  • MinVM and MinC are the respective VM and containers associated with the Mindist u ;
  • Uni u,v is the amount of unique traffic between VMs u and v;
  • Inter j,k,l,m is the inter-VM redundant traffic demands for redundancy between data transfers from VM j ⁇ VM k and VM l ⁇ VM m .
  • step 650 effective inter-container-bandwidth-usage is computed as the sum of all inter-container bandwidth demands minus inter-container redundant traffic (including both intra-comm-redundant traffic and inter-comm-redundant traffic).
  • the effective inter-container-bandwidth-usage is computed as shown in pseudo code is lines (7)-(10).
  • the step 650 is as described in FIG. 8 .
  • VM are virtual machines (e.g., a type of slot);
  • Uni u,v is the amount of unique traffic between VMs u and v;
  • Inter j,k,l,m is the inter-VM redundant traffic demands for redundancy between data transfers from VM j ⁇ VM k and VM l ⁇ VM m .
  • step 670 the optimization determination is computed as shown in pseudo code lines (11)-(16).
  • the step 670 is as described in FIG. 9 .
  • method 600 includes slot placement constraint(s).
  • slot placement constraint(s) In particular, for some slots, only a subset of containers is available for slot placement. In some of these embodiments, the selection of the subset of available containers is based on a quality of service metric such as latency.
  • a number of containers (“K”) is selected and a set of randomly chosen K slots is assignment to the K containers using a one-to-one correspondence and ensuring that the slot placement constraint(s) are met.
  • K a number of containers
  • a set of randomly chosen K slots is assignment to the K containers using a one-to-one correspondence and ensuring that the slot placement constraint(s) are met.
  • step 630 in computing the distance of a slot to the containers, only containers meeting the slot placement constraint(s) are considered.
  • step 670 only exchanges meeting slot placement constraints are performed.
  • steps shown in methods 200 , 400 , 500 and 600 may be performed in any suitable sequence. Moreover, the steps identified by one step may also be performed in one or more other steps in the sequence or common actions of more than one step may be performed only once.
  • steps of various above-described methods can be performed by programmed computers.
  • program storage devices e.g., data storage media, which are machine or computer readable and encode machine-executable or computer-executable programs of instructions, wherein said instructions perform some or all of the steps of said above-described methods.
  • the program storage devices may be, e.g., digital memories, magnetic storage media such as a magnetic disks and magnetic tapes, hard drives, or optically readable data storage media.
  • the embodiments are also intended to cover computers programmed to perform said steps of the above-described methods.
  • FIG. 10 schematically illustrates an embodiment of various apparatus 1000 such as one one of end hosts 120 , one of ToR switches 130 , one of aggregation switches 150 or placement controller 170 of FIG. 1 .
  • the apparatus 1000 includes a processor 1010 , a data storage 1011 , and optionally an I/O interface 1030 .
  • the processor 1010 controls the operation of the apparatus 1000 .
  • the processor 1010 cooperates with the data storage 1011 .
  • the data storage 1011 may store program data such as redundancy measurements, communication patterns or the like as appropriate.
  • the data storage 1011 also stores programs 1020 executable by the processor 1010 .
  • the processor-executable programs 1020 may include an agent collection program 1021 , a placement controller program 1023 , a placement controller placement program 1025 , or an I/O interface program 1027 as appropriate.
  • Processor 1010 cooperates with processor-executable programs 1020 .
  • the I/O interface 1030 cooperates with processor 1010 and I/O interface program 1027 to support communications over any appropriate communication channel(s) such as links of FIG. 1 .
  • the agent collection 1021 performs the steps 210 - 1 , 220 - 1 and 230 - 1 of method 200 of FIG. 2 as described above.
  • the placement controller program 1023 performs the steps 250 , 260 , and 270 of method 200 of 2IG. 5 as described above.
  • the placement controller placement program 1025 performs the steps of method 270 of FIG. 2 , the steps of FIG. 4 , the steps of FIG. 5 or the steps of FIG. 6 as described above.
  • the processor 1010 may include resources such as processors/CPU cores, the I/O interface 1030 may include any suitable network interfaces, or the data storage 1011 may include memory or storage devices.
  • the apparatus 1000 may be any suitable physical hardware configuration such as: one or more server(s), blades consisting of components such as processor, memory, network interfaces or storage devices. In some of these embodiments, the apparatus 1000 may include cloud network resources that are remote from each other.
  • the apparatus 1000 may be virtual machine.
  • the virtual machine may include components from different machines or be geographically dispersed.
  • the data storage 1011 and the processor 1010 may be in two different physical machines.
  • the apparatus 1000 may be a general purpose computer programmed to perform the methods 200 , 400 , 500 or 600 .
  • processor-executable programs 1020 When processor-executable programs 1020 are implemented on a processor 1010 , the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
  • processors may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software.
  • the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
  • explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non volatile storage.
  • DSP digital signal processor
  • ASIC application specific integrated circuit
  • FPGA field programmable gate array
  • ROM read only memory
  • RAM random access memory
  • any switches shown in the FIGS. are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
  • any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention.
  • any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Various embodiments provide a method and apparatus of providing an RE-aware technique for placing slots based on redundancy across and within slot communication pairs.

Description

    CROSS REFERENCE TO RELATED APPLICATION
  • This application is a Continuation of U.S. application Ser. No. 13/584,859, filed on Aug. 14, 2014, the disclosure of which is incorporated herein.
  • TECHNICAL FIELD
  • The inventions relate generally to methods and apparatus for providing slot placement.
  • BACKGROUND
  • This section introduces aspects that may be helpful in facilitating a better understanding of the inventions. Accordingly, the statements of this section are to be read in this light and are not to be understood as admissions about what is in the prior art or what is not in the prior art.
  • In some known virtual machine placement strategies, virtual machines are placed in data center racks based on minimizing the bandwidth requirements of data transfers across racks.
  • SUMMARY OF SOME EMBODIMENTS
  • Various embodiments provide a method and apparatus of providing an RE-aware technique for placing slots based on redundancy across and within slot communication pairs.
  • In one embodiment, an apparatus is provided for providing slot placement. The apparatus includes a data storage and a processor communicatively connected to the data storage. The processor is prgrammed to: determine a plurality of slots to be placed within a plurality of containers; determine a plurality of redundancy parameters; and determine a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters.
  • In some of the embodiments, a plurality of the plurality of slots are virtual machines.
  • In some of the embodiments, a plurality of the plurality of containers are racks in a data center.
  • In some of the embodiments, the redundancy parameters are based on intra-comm-redundancy and inter-comm-redundancy of at least a portion of the plurality of slots.
  • In some of the embodiments, the redundancy parameters are based on communication patterns of at least a portion of the plurality of slots.
  • In some of the embodiments, the determination of the placement of the plurality of slots comprises programming the processor to: determine a plurality of clusters; assign each of at least a portion of the plurality of slots to at least one of the plurality of clusters; and assign each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
  • In some of the embodiments, the assignment of the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
  • In some of the embodiments, the assignment of the portion of the plurality of slots is further based on one or more slot placement constraints.
  • In some of the embodiments, the assignment of the portion of the plurality of clusters is based on an effective-inter-container-bandwidth-usage.
  • In some of the embodiments, the determination of the placement of the plurality of slots comprises further programming the processor to exchange a first placed slot and a second placed slot based on a determination that inter-container-bandwidth-usage will be improved and that the exchange meets one or more inter-container bandwidth constraints.
  • In a second embodiment, a switching system fof providing slot placement. The system including: a placement controller and a plurality of agents communicatively connected to the placement controller. The plurality of agents are programmed to: collect a plurality of redundancy measurements and send the plurality of redundancy measurements to the placement controller. The placement controller is prgrammed to: receive the plurality of redundancy measurements, determine a plurality of slots to be placed within a plurality of containers, determine a plurality of redundancy parameters based on the plurality of redundancy measurements, and determine a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters.
  • In some of the embodiments, the determination of the placement of the plurality of slots comprises programming the placement controller to: determine a plurality of clusters, assign each of at least a portion of the plurality of slots to at least one of the plurality of clusters, and assign each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
  • In some of the embodiments, the assignment of the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
  • In some of the embodiments, the assignment of the portion of the plurality of slots is further based on one or more slot placement constraints.
  • In some of the embodiments, the assignment of the portion of the plurality of clusters is based on an effective-inter-container-bandwidth-usage.
  • In some of the embodiments, the determination of the placement of the plurality of slots comprises further programming the placement controller to: exchange a first placed slot and a second placed slot based on a determination that inter-container-bandwidth-usage will be improved and that the exchange meets one or more inter-container bandwidth constraints.
  • In a third embodiment, a method is provided for providing slot placement. The method including: determining a plurality of slots to be placed within a plurality of containers, determining a plurality of redundancy parameters, and determining a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters.
  • In some of the embodiments, the method further includes exchanging a first placed slot and a second placed slot based on a determination that inter-container-bandwidth-usage will be improved and that the exchange meets one or more inter-container bandwidth constraints.
  • In some of the embodiments, the step of determining the placement of the plurality of slots includes: determining a plurality of clusters, assigning each of at least a portion of the plurality of slots to at least one of the plurality of clusters, and assigning each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
  • In some of the embodiments, the step of assigning the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Various embodiments are illustrated in the accompanying drawings, in which:
  • FIG. 1 illustrates an embodiment of an RE-aware data center 100 serving clients 190-1 and 190-2 via network 180;
  • FIG. 2 depicts a flow chart illustrating an embodiment of a method 200 for an RE-aware system (e.g., RE-aware data center 100 of FIG. 1) to perform RE-aware placements of slots;
  • FIG. 3 illustrates a functional block diagram 300 of redundancy in an RE-aware data center 100;
  • FIG. 4 depicts a flow chart illustrating an embodiment of a method 400 for a placement controller (e.g., placement controller 170 of FIG. 1) to place slots as illustrated in step 270 of FIG. 2;
  • FIG. 5 depicts a flow chart illustrating an embodiment of a method 500 for placing slots as illustrated in step 480 of FIG. 4;
  • FIG. 6 depicts a flow chart illustrating an embodiment of a method 600 for placing slots as illustrated in step 480 of FIG. 4 and steps 520, 540 and 560 of FIG. 5;
  • FIG. 7 depicts pseudo code illustrating an embodiment of steps 610-640 of FIG. 6;
  • FIG. 8 depicts pseudo code illustrating an embodiment of step 650 of FIG. 6;
  • FIG. 9 depicts pseudo code illustrating an embodiment of step 670 of FIG. 6; and
  • FIG. 10 schematically illustrates an embodiment of various apparatus 1000 such as one one of end hosts 120, one of ToR switches 130, one of aggregation switches 150, or placement controller 170 of FIG. 1.
  • To facilitate understanding, identical reference numerals have been used to designate elements having substantially the same or similar structure or substantially the same or similar function.
  • DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
  • The description and drawings merely illustrate the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its scope. Furthermore, all examples recited herein are principally intended expressly to be only for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Additionally, the term, “or,” as used herein, refers to a non-exclusive or, unless otherwise indicated (e.g., “or else” or “or in the alternative”). Also, the various embodiments described herein are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments.
  • Various embodiments provide a method and apparatus of providing an RE-aware technique for placing slots based on redundancy across and within slot communication pairs. Advantageously, such placement may reduce overall bandwidth usage of inter-container (e.g., racks or data center) links.
  • FIG. 1 illustrates an embodiment of an RE-aware data center 100 serving clients 190-1 and 190-2 via network 180. The RE-aware data center 100 includes racks 110-1 and 110-2 (collectively, racks 110), top of the rack switches 130-1 and 130-2 (collectively, ToR switches 130), redundancy elimination (RE) boxes 140-1 and 140-2 (collectively, RE boxes 140), aggregation switches 150-1-150-2 (collectively, aggregation switches 150) and placement controller 170. The RE-aware data center 100 also includes links 135-1, 145-1, 155-1, 175-1, and 185-1 (collectively, RE-aware data center links) (remaining link labels have been omitted for the purpose of clarity). Racks 110 are communicatively connected with network 180 via an appropriate one of the ToR switches 130, RE boxes 140 aggregation switches 150, and appropriate RE-aware data center links. It should be appreciated that RE-aware data center 100 may be architected in any suitable configuration and that RE-aware data center 100 is just one exemplary architecture being used for illustrative purposes. Placement controller 170 controls the allocation of resources (e.g., processing, memory, storage or networking) in racks 110 within RE-aware data center 100.
  • Racks 110 include end hosts 120-1-EH1-120-1-EH5 and 120-2-EH1-120-2-EH5 (collectively, end hosts 120). It should be appreciated that while 2 racks are illustrated here, system 100 may include more of less racks. It should be further appreciated that not all of the racks of system 100 may include end hosts.
  • End host 120-1-EH1 include slots 122-1-122-3 (collectively, slots 122) (illustrated as hashed boxes) and agent 160-1-A1 (illustrated as checkered box). It should be appreciated that while 3 slots are illustrated in end host 120-1-EH1, each of end hosts 120 may include fewer or more slots (remaining slots have been omitted for the purpose of clarity), including some end hosts having no slots. It should be further appreciated that while one (1) agent is illustrated in each of end hosts 120 (e.g., agents 160-1-A1-160-1-A5 and 160-2-A1-160-2-A5), each of end hosts 120 may include fewer of more agents, including some end hosts having no agents.
  • Slots 122 are a collection of allocated resources configured to service application instances serving clients 190-1 and 190-2 (collectively, clients 190). The term “slot” as user herein may be any suitable collection of allocated resources such as a virtual machine comprising processing/compute, memory or networking resources or virtualized storage.
  • ToR switches 130 switch data between end hosts in an associated rack and appropriate aggregate switch(es). For example, ToR switch 130-1 switches data from end hosts in rack 110-1 to end hosts in other racks (e.g., rack 110-2) or network 180 via aggregate switch 150-1 or 150-2.
  • RE boxes 140 perform redundancy elimination. In particular, RE boxes 140 cache received data and suppress duplicate traffic. It should be appreciated that while two (2) RE boxes are illustrated system 100 may include fewer of more RE boxes. It should be further appreciated that though illustrated as being placed after ToR switches in system 100, RE boxes may be placed in any suitable portion of system 100 to perform redundancy elimination. Moreover, though illustrated as a separate component, RE boxes 140 may include any configuration such as: (a) a separate redundancy elimination component; (b) a software module residing on a component such as one or more of the end hosts 120 or one or more of the ToR switches 130; (c) or any other configuration combination.
  • For example, a source RE box (e.g., RE box 140-1) may create an identifier associated with received data from an end host (e.g., end host 120-1-EH1) and transmits the created identifier to a target RE box (e.g., RE box 140-2). In subsequent transmissions between the source and target RE box (in either direction), a sourcing RX box (e.g., RE box 140-1) may perform data de-duplication by replacing subsequent repeated byte sequences of newly received data with the created identifier to reduce application latency and conserve bandwidth. In this example, the receiving RE box (e.g., RE box 140-2) uses the created identifier to retrieve the complete stored byte sequence and forwards the retrieved byte sequence to it destination (e.g., end host 120-2-EH1).
  • Aggregation switches 150 switch data between an associated ToR switch and network 180. For example, ToR switch 130-1-1 switches data from resources in rack 110-1 to network 180 via an appropriate aggregation switch (e.g., aggregation switch 150-1 or 150-2).
  • Agents 160-1-A1-160-1-A5, 160-2-A1-160-2-A5, 160-3-A1-160-3-A2, and 160-4-A1-160-4-A2 (collectively, agents 160) measure the redundancy in communications (e.g., slot-to-slot) and report these measurements to placement controller 170 via one or more appropriate RE-aware data center links. The term “redundancy” as used herein means any content sequence capable of being de-duplicated by RE boxes 140 and should be understood broadly as including any repeated content sequence. It should be appreciated that though illustrated as residing in each of end hosts 120, ToR switches 130 and aggregation switches 150, system 100 may include any configuration of agents. For example, agents may be deployed: (a) on a portion or all of end hosts 120; (b) on a portion or all of end hosts 120, and a portion or all of aggregation switches 150; (c) or any other configuration combination.
  • On end hosts 120, an agent may monitor slot-to-slot communications for slots residing on the end hosts. On ToR switches 130, an agent may monitor: (a) slot-to-slot communications from slots on different end hosts within a rack; (b) communications among end hosts within the rack; (c) communications among an end host within the rack and an end host on other racks; or (d) the like. In an embodiment including a distributed data center setting, agents on aggregation switches 150 may monitor the outgoing communications to other data centers.
  • Placement controller 170 includes assigns slots (e.g., one or more of slots 122) to containers (e.g., one of racks 110 or a data center in a distributed data center). It should be appreciated that while one (1) placement controller is illustrated here, system 100 may include more placement controllers or that there could be one placement controller for a set of data centers. In some of these embodiments, though illustrated within system 100, placement controller 170 may be positioned outside of system 100 and connected to system 100 via network 180.
  • RE-aware data center links and links 195-1 and 195-2 support communicating over one or more communication channels such as: wireless communications (e.g., LTE, GSM, CDMA, bluetooth); femtocell communications (e.g., WiFi); packet network communications (e.g., IP); broadband communications (e.g., DOCSIS and DSL); storage communications (e.g., Fibre Channel, iSCSI) and the like. It should be appreciated that though depicted as a single connection, links may be any number or combinations of communication channels.
  • The network 180 includes any number of access and edge nodes and network devices and any number and configuration of links (not shown for purposes of clarity). Moreover, it should be appreciated that network 180 may include any combination and any number of wireless, or wire line networks including: LTE, GSM, CDMA, Local Area Network(s) (LAN), Wireless Local Area Network(s) (WLAN), Wide Area Network (WAN), Metropolitan Area Network (MAN), or the like.
  • Clients 190-1 and 190-2 may include any type of communication device(s) capable of sending or receiving information over network 180 via an appropriate one of links 195-1 or 195-2. For example, a communication device may be a thin client, a smart phone (e.g., client 190-2), a personal or laptop computer (e.g. 190-1), server, network device, tablet, e-reader or the like. Communication devices may rely on other resources within exemplary system to perform a portion of tasks, such as processing or storage, or may be capable of independently performing tasks. It should be appreciated that while two clients are illustrated here, system 100 may include fewer or more clients. Moreover, the number of clients at any one time may be dynamic as clients may be added or subtracted from the system at various times during operation.
  • In some embodiments, some or a portion of slots 122 are virtual machines.
  • In some embodiments, some or a portion of slots 122 are virtualized storage.
  • In some embodiments, ToR switches 130 are Ethernet switches.
  • In some embodiments, aggregation switches 150 are layer 2 Ethernet switches.
  • In some embodiments of RE-aware data center 100, RE-aware data center 100 is a distributed data center in a system of distributed data centers. In some of these embodiments, RE boxes 140 may be placed after aggregation switches 150 (e.g., between the aggregation switches 150 and network 180) in place of or in addition to their illustrated placement after ToR switches 130. Advantageously, if RE-aware data center 100 does not accommodate all slots of an application instance (e.g., some slots are allocated in other data centers), the RE-aware data center 100 may be used to reduce the inter-data center traffic in order to more efficiently utilize the bandwidth between these distributed data centers.
  • FIG. 2 depicts a flow chart illustrating an embodiment of a method 200 for an RE-aware system (e.g., RE-aware data center 100 of FIG. 1) to perform RE-aware placements of slots. The method includes one or more agents 160-1-160-N (e.g., one or more of agents 160 of FIG. 1) collecting redundancy measurements (steps 210-1-210-N) and reporting the collected measurements (steps 230-1-230-N) to the placement controller 170 (e.g., placement controller 170 of FIG. 1) based on a determination that a reporting trigger has occurred (steps 220-1-220-N). The method further includes the placement control 170 receiving the measurements (step 250) from agents 160-1-160-N and performing slot placement based on the received measurements (step 270) based on a determination that a placement trigger has occurred (step 260).
  • In the method 200, the steps 210-1-210-N include collecting redundancy measurements. In particular, measurements include slot communication patterns and the determined redundancy associated with slot pair(s). Communication patterns may be a traffic matrix of the collected measurements between slots and the determined redundancy may include the redundant traffic identified between slot-to-slot communications in the traffic matrix.
  • In the method 200, the steps 220-1-220-N include determining that a reporting trigger has occurred. Based on the trigger determination, the method either proceeds to one of steps 230-1-230-N respectively or returns to one of steps 220-1-220-N respectively. The trigger may be any suitable event signaling that collected measurements should be sent to placement controller 170. For example, the trigger event may be: (a) periodically triggered at threshold intervals; (b) based on the measurements collected in one of steps 210-1-210-N respectively; (c) based on receipt of a request from placement controller 170; or (d) the like. It should be appreciated that multiple trigger events may occur at the same time.
  • In the method 200, the steps 230-1-230-N include reporting the collected measurements to placement controller 170. In particular, collected measurements may be reported in any suitable form such as: (a) the raw collected measurements; (b) compressed or modified information based on the collected measurements; (c) an aggregate of the collected measurements; or (d) the like.
  • In the method 200, the step 250 includes receiving measurements from one or more of agents 160-1-160-N.
  • It should be appreciated that the measurements may be pushed or pulled to the apparatus performing the method.
  • In the method 200, the step 260 includes determining that a placement trigger has occurred. Based on the trigger determination, the method either proceeds to steps 270 or returns to step 260. The trigger may be any suitable event signaling that a placement should be performed. For example, the trigger event may be: (a) periodically triggered at threshold intervals; (b) based on the agent measurements received in step 250; or (c) the like. It should be appreciated that multiple trigger events may occur at the same time. A quality threshold may be any suitable threshold such as the current effective inter-container bandwidth being greater than a threshold limit as compared to an estimate optimal effective inter-container bandwidth.
  • In the method 200, the step 270 includes placing the slots within in the end hosts (e.g., end hosts 120 in FIG. 1). In particular, the slots to be placed in end hosts are determined and are placed in end hosts based on the agent measurements received in step 250.
  • In some embodiments of one or more of steps 210-1-210-N, redundancy measurements include measurements of redundancy in intra-communication-redundancy or inter-communication-redundancy in slot-to-slot communications.
  • The term “intra-comm-redundancy” as used herein means the redundancy within slot-to-slot communications and should be understood broadly as including the redundancy between any slot-to-slot communication. For example, the redundancy between: (a) slot 122-1 and 122-3 of FIG. 1; (b) 122-1 and a slot (not shown) in end host 120-2-EH1 of FIG. 1; (c); 322-SLOT1 and 322-SLOT2 of FIG. 3 or (d) 322-SLOT1 and 322-SLOT3 of FIG. 3.
  • The term “inter-comm-redundancy” as used herein means the redundancy between two slot-to-slot communications and should be understood broadly as including the redundancy between any two slot-to-slot communications. For example, the redundancy between: (a) slot-to-slot pair slot 122-1/122-3 and slot-to-slot pair 122-1 and a slot (not shown) in end host 120-2-EH1 of FIG. 1; or (b); slot-to-slot pair 322-SLOT1/322-SLOT3 of and slot-to-slot pair 322-SLOT2/322-SLOT-4 of FIG. 3 is inter-comm-redundancy.
  • In some embodiments of the step 250, an API may be provided for communication between a placement controller and agent (e.g., placement controller 170 and one or more of agents 160 of FIG. 1). In some of these embodiments, a placement controller may request all of a portion of collected redundancy measurements from one or more agents.
  • In some embodiments, portions of steps 260 and 270 may be performed concurrently. For example, a placement trigger in step 260 may involve determining whether the current effective inter-container bandwidth meets the estimated optimal effective inter-container bandwidth. As should be appreciated, determining the estimated optimal effective inter-container bandwidth may require a determination of the optimal placement of virtual machines. As such, the same determined optimal placement of virtual machines may be used in both steps 260 and 270.
  • FIG. 3 illustrates a functional block diagram 300 of redundancy in a RE-aware data center 100. Diagram 300 includes rack 310-1 and 310-2 (e.g., two of racks 110 in FIG. 1) and RE boxes 340-1 and 340-2. Rack 310-1 includes slots 322-SLOT1 and 322-SLOT2 and rack 310-2 includes slots 322-SLOT3 and 322-SLOT4. Data transfer paths 390-1-390-7 illustrate data transfer paths between components. RE boxes are positioned to de-duplicate data over the network links illustrated by data transfer 390-4.
  • In some embodiments such as distributed data center embodiments, rack 310-1 and 310-2 may be data centers.
  • In some embodiments, RE boxes 340-1 or 340-2 may include more than one RE box. For example, separate RE boxes may be software modules executing on one or more of the end hosts.
  • In the examples below, assume the following:
      • 1) each of slots 322-SLOT1-322-SLOT4 may be allocated one of four virtual machines: VM1, VM2, VM3 and VM4;
      • 2) there is inter-comm-redundancy between VM1→VM3 and VM2→VM4 (i.e., there is a high similarity between data transfers that may be de-duplicated by RE boxes 340-1 and 340-2);
      • 3) the bandwidth requirements of the data traffic between VM1 and VM2 is 8V;
      • 4) the bandwidth requirements of the data traffic between VM1 and VM3 is 10V;
      • 5) the bandwidth requirements of the data traffic between VM3 and VM4 is 8V; and
      • 6) the bandwidth requirements of the data traffic between VM2 and VM4 is 10V.
  • In a first example, the four virtual machines are assigned to slots using a bandwidth-based placement strategy that places the VM pairs having higher bandwidth traffic within the same rack. As such, VM1 and VM3 are placed in the same rack and VM2 and VM4 are placed in the same rack. For example the placements may be: VM1 assigned to 322-SLOT1, VM3 assigned to 322-SLOT2, VM2 assigned to 322-SLOT3 and VM4 assigned to 322-SLOT4. As such the bandwidth requirements across data transfer links would be:
      • 7) 10V between 322-SLOT1 and 322-SLOT2 via 390-1;
      • 8) 8V between 322-SLOT1 and 322-SLOT3 via 390-2→RE box 340-1390-4→RE box 340-2 and 390-5;
      • 9) 7V between 322-SLOT2 and 322-SLOT4 via 390-3→RE box 340-1390-4→RE box 340-2 and 390-6; and
      • 10) 10V between 322-SLOT3 and 322-SLOT4 via 390-7.
  • In this example, the inter-rack bandwidth requirement (i.e., the bandwidth over data transfer 390-4) is 15V as shown below.
      • 8V for traffic between slot 322-SLOT1 and slot 322-SLOT3
      • +7V for traffic between slot 322-SLOT2 and slot 322-SLOT4.
  • In a second example, the four virtual machines are assigned to slots using an RE-aware placement strategy as described herein. As such, VM1 and VM2 are placed in the same rack and VM3 and VM4 are placed in the same rack. For example the placements may be: VM1 assigned to 322-SLOT1, VM2 assigned to 322-SLOT2, VM3 assigned to 322-SLOT3 and VM4 assigned to 322-SLOT4. As such the bandwidth requirements across data transfer links would be:
      • 11) 8V between 322-SLOT1 and 322-SLOT2 via 390-1;
      • 12) 10V between 322-SLOT1 and 322-SLOT3 via 390-2→RE box 340-1390-4→RE box 340-2 and 390-5;
      • 13) 10V between 322-SLOT2 and 322-SLOT4 via 390-3→RE box 340-1390-4→RE box 340-2 and 390-6; and
      • 14) 7V between 322-SLOT3 and 322-SLOT4 via 390-7.
  • In this example, the inter-rack bandwidth requirement (i.e., the bandwidth over data transfer 390-4) is 10V as shown below. It should be appreciated that the inter-comm-redundancy (i.e., similarity between data transfers 322-SLOT1322-SLOT3 and 322-SLOT2322-SLOT4) is exploited in this example as the inter-comm-redundancy slot pairs traverse the RE box pair 340-1 and 340-2 allowing de-duplication of data.
      • 10V for traffic between slot 322-SLOT1 and slot 322-SLOT3
      • +10V for traffic between slot 322-SLOT2 and slot 322-SLOT4.
      • −10V for data de-duplicated by RE box pair 340-1 and 340-2.
  • Thus, in this example, by exploiting the high similarity between data transfers, the RE-aware placement strategy significantly reduced the bandwidth usage of inter-rack (or inter-data center in a distributed data center) link 390-4 by 33% over a bandwidth-based placement strategy (i.e., 10V compared with 15V).
  • FIG. 4 depicts a flow chart illustrating an embodiment of a method 400 for a placement controller (e.g., placement controller 160 of FIG. 1) to place slots in the end hosts as illustrated in step 270 of FIG. 2. The method includes placing the slots (step 480) based on a determined set of slots to be placed (step 420), the determined communication patterns (step 440) and the determined slot communication redundancy (step 460).
  • In the method 400, the step 420 includes determining the number and types of slots to place. In particular, the apparatus performing the method determines the number and type of slots required to support the applications instances serving clients (e.g., clients 190 in FIG. 1).
  • In the method 400, the step 440 includes determining communication patterns. In particular, communication patterns are derived from the communication patterns received from one or more agents (e.g., step 250 in FIG. 2). The communication patterns may be derived using any suitable mechanism such as: (a) using the received communication patterns; (b) modifying the received communication patterns; (c) aggregating the received communication patterns; or (d) the like.
  • In the method 400, the step 460 includes determining communication redundancy. In particular, communication redundancies are derived from the communication redundancies received from one or more agents (e.g., step 250 in FIG. 2). The communication redundancies may be derived using any suitable mechanism such as: (a) using the received communication redundancies; (b) modifying the received communication redundancies; (c) aggregating the received communication redundancies; or (d) the like.
  • In the method 400, the step 480 includes performing an RE-aware placement strategy based on the determined required slots, the determined communication patterns, and the determined slot redundancies. In particular, the RE-aware placement strategy takes into account the redundancy in data transfers for slot-to-slot communications and place slots to exploit redundancy in data transfers while minimizing the overall inter-rack (or inter-data center) bandwidth usage.
  • In some embodiments, the step 480 includes basing the placement of slots on the intra-comm-redundancy of slot-to-slot data transfers across racks, and the inter-comm-redundancy of two slot-to-slot transfers across racks. In some embodiments, the intra-comm-redundancy and inter-comm-redundancy are determined across data centers (e.g., data centers in distributed settings) instead of or in addition to as across racks.
  • In some embodiments, the step 480 includes using conventional integer linear program or classical optimization techniques to determine slot placement. Conventional optimization techniques involve determining the action that best achieves a desired goal or objective. An action that best achieves a goal or objective may be determined by maximizing or minimizing the value of an objective function. In some embodiments, the goal or metric of the objective function may be to minimize the overall inter-container (e.g., inter-rack or inter-data center) bandwidth usage
  • ( e . g . , min a b u a , b
  • where ua,b represents the bandwidth used between container Ca and Cb).
  • The problem may be represented as:
      • Optimizing:

  • y=f(x 1 ,x 2 , . . . ,x n)  [E.1]
      • Subject to:
  • G j ( x 1 , x 2 , , x n ) { = } b j j = 1 , 2 , m [ E .2 ]
  • Where the equation E.1 is the objective function and equation E.2 constitutes the set of constraints imposed on the solution. The xi variables, x1, x2, . . . , xn, represent the set of decision variables and y=f(x1, x2, . . . , xn) is the objective function expressed in terms of these decision variables. It should be appreciated that the objective function may be maximized or minimized.
  • In some of any of these embodiments, constraints may be one or more of the following:
      • 1.
  • j x i , j = 1 ;
      • where each slot is placed in one container and xi,j is a variable indicating the placement of slot (i) in container (j).
      • 2.
  • i x i , j Maxj ;
      • where each container (j) has a limit Maxj on the maximum number of slots that the container may accommodate (e.g., based on capacity or provisioning parameters).
      • 3. yj,a,k,b≧xj,a+Xk,b−1; where v represents a variable that is 1 if slot (i) is placed in container (a) and slot (k) is placed in container (b) or otherwise zero.
      • 4. Bandwidth towards usage across containers is based on contributions of each type of traffic (i.e., unique, intra-redundant and inter-redundant).
        • a. The contribution of unique data transfers towards bandwidth usage between container Ca and Cb is Σj,k,j≠kUnij,k×yj,a,k,b. Where Unij,k is the amount of unique traffic between slots j and k. If slot (j) is placed in container Ca and slot (k) is placed in container Cb, then it will contribute to the bandwidth usage between container Ca→Cb.
        • b. The contribution of intra-redundant data transfers towards bandwidth usage between container Ca and Cb is zero. It should be appreciated that intra-redundant data will be suppressed by the RE mechanism.
        • c. The contribution of inter-redundant data transfer toward bandwidth usage is computed as follows: Σj,k,l,m,j≠lork≠mInterj,k,l,m×zj,l,a,k,m,b. Where Interj,k,l,m represents the inter-slot-redundancy, and zj,l,a,k,m,b is a variable that represents when inter-slot redundancy will be suppressed. For example, when slot (j) and slot (k) are placed in containers Ca and Cb respectively (i.e., yj,a,k,b=1), and slot (l) and slot (m) are also placed in containers Ca and Cb respectively (i.e., yl,a,m,b=1), the inter-slot-redundant data will be suppressed by the RE mechanism and thusly, zj,l,a,k,m,b=0. However, if yj,a,k,b=1 and yl,a,m,b=0, inter-slot-redundancy will not be suppressed and thusly, zj,l,a,k,m,b=1.
      • 5. The bandwidth between container Ca and Cb is less than available bandwidth BWa,b: ua,b≦BWa,b.
      • 6. One or more slots are constrained to one or more identified containers. (e.g., xj,a=0).
  • FIG. 5 depicts a flow chart illustrating an embodiment of a method 500 for placing slots as illustrated in step 480 of FIG. 4. The method includes determining a number of clusters (step 520), assign slots to available clusters (step 540), and assign clusters to containers (step 560).
  • In the method 500, the step 520 includes determining a number of clusters. In particular, the determined number of slots to be placed (e.g., step 420 in FIG. 4) are partitioned into “K” clusters, such that each partitioned cluster may be allocated in at least one container. The term “container” as used herein may be any suitable component capable of aggregating slots such as a rack within a data center setting or a data center itself within a distributed data center setting.
  • In some embodiments, a container is a rack.
  • In some embodiments, a container is a data center.
  • FIG. 6 depicts a flow chart illustrating an embodiment of a method 600 for placing slots as illustrated in step 480 of FIG. 4 and steps 520, 540 and 560 of FIG. 5.
  • In the method 600, the step 610 includes selecting a number of clusters “K” as described herein. Where K is the number of clusters to which the determined slots are distributed.
  • In the method 600, the step 620 includes assigning each of a number of random slots to an available container. In some of these embodiments, the number of random slots assigned is “K”.
  • In the method 600, the steps 630 and 640 include for each remaining slot “S” to be allocated (step 640), placing slot “S” in an available cluster based on computed distances to each available cluster (step 630).
  • In the method 600, the step 650 includes assigning clusters to containers based on an effective-inter-container-bandwidth-usage metric.
  • The method 600 optionally includes step 660. Step 660 includes determining whether the effective-inter-container-bandwidth-usage determined in step 650 meets a calculated optimal and whether the determined placement satisfies the constraint that each inter-container traffic (R1,R2) should be less than BW(R1,R2). Where BW is the available bandwidth between container R1 and R2. Determining a calculated optimal effective-inter-container-bandwidth-usage may be done using any suitable method such as: (a) setting an iterative count and selecting the best value; (b) comparing the determined effective-inter-container-bandwidth-usage from step 650 against a predetermined threshold; or (c) the like.
  • In the method 600, the step 670 includes optimizing the determined slot placement. In particular, optimization of slot placement includes determining whether moving container pairs would improve the effective-inter-container-bandwidth-usage. For example, exchanging S1 in C1 with S2 in C2 may be based on a determination that moving S1 in container C1 to container C2, and moving S2 in container C2 to C1 improves the effective-inter-container-bandwidth-usage.
  • In some embodiments of step 610, K is selected based on the number of available slots in each container.
  • In some embodiments of step 610, K is selected randomly.
  • In some embodiments, step 630 includes computing the distances to each available cluster and placing slot S in an available cluster based on the computed distances to each available cluster. Where an available cluster is a cluster that is not filled (i.e., has assigned slots than capacity).
  • In some of these embodiments of step 630, computing distances to each candidate cluster (e.g., cluster C) is based on an effective-bandwidth metric (M). In particular, the effective-bandwidth metric is the bandwidth required after redundancy elimination within the data transfer from slot S to any of the other slots in the cluster (intra-comm-redundancy) and across data transfers slot S to any slot i, and slot S to any slot j in the cluster (inter-comm-redundancy).
  • In some of these embodiments the effective-bandwidth metric, is computed as described in the pseudo code lines (1)-(3).
  • ( 1 ) M_from ( S ) = slot - i C ( Demand ( slot - S , slot - i ) - IntraRedundancy ( slot - S , slot - i ) ) - slot - i , slot - j C ( InterRedundancy ( slot - S , slot - i , slot - S , slot - j ) ) ( 2 ) M_to ( S ) = slot - i C ( Demand ( slot - i , slot - S ) - IntraRedundancy ( slot - i , slot - S ) ) - slot - i , slot - j C ( InterRedundancy ( slot - i , slot - S , slot - j , slot - S ) )
      • (3) M(S)=M_from(S)+M_to(S)
  • Where: the function Demand (slot-i,slot-j) returns the bandwidth demand between slots “i” and “j”; the function IntraRedundancy (slot i,slot j) returns the intra-comm-redundancy for data transfers from slot i to slot j; and the function InterRedundancy (slot i, slot j, slot k, slot l) returns the inter-comm-redundancy between data transfers (slot i, slot j) and data transfers (slot k, slot l).
  • In some embodiments of step 630, the cluster placement selection is based on a calculated minimum average distance to cluster C (Min(C)). In some of these embodiments, the Min(C) is computed as shown in pseudo code lines (4)-(6).
      • (4) for each cluster C
      • (5) compute AvgDistance(C)=(1/N*1/M(S));
        • // where N is the number of slots in the cluster C
      • (6) Min(C)=Min|AvgDistance(C)|
  • It should be appreciated that by taking the inverse of the effective bandwidth metric (e.g., M(S)), and choosing the minimum (slot, container) pair, slots with high bandwidth demands to a container will be placed in that container, potentially reducing the traffic that would traverse across containers (e.g., reduce the effective inter-container-bandwidth-usage).
  • In some embodiments, the steps 610-640 are as described in FIG. 7. Where VM are virtual machines (e.g., a type of slot); Mindistu is the minimum average distance to cluster C; MinVM and MinC are the respective VM and containers associated with the Mindistu; Uniu,v is the amount of unique traffic between VMs u and v; and Interj,k,l,m is the inter-VM redundant traffic demands for redundancy between data transfers from VMj→VMk and VMl→VMm.
  • In some embodiments of step 650, effective inter-container-bandwidth-usage is computed as the sum of all inter-container bandwidth demands minus inter-container redundant traffic (including both intra-comm-redundant traffic and inter-comm-redundant traffic). In some of these embodiments, the effective inter-container-bandwidth-usage is computed as shown in pseudo code is lines (7)-(10).
      • (7) Effective-intercontainer-bandwidth-usage=0
      • (8) For each container Ci, Cj in pair
      • (9) Find the list of (slot i, slot j) communications such that slot i and slot j are in different containers;
  • ( 10 ) Effective - intercontainer - bandwidth - usage += slot - i , slot - j ( Demand ( slot - i , slot - j ) ) - slot - i , slot - j ( IntraRedundancy ( slot - i , slot - j ) ) - slot - i , slot - j , slot - k , slot - l ( InterRedundancy ( slot - i , slot - j , slot - k , slot - l ) ) ;
        • // where slot i and slot k are in one container and
        • // slot j and slot k are in another container
  • In some embodiments, the step 650 is as described in FIG. 8. Where VM are virtual machines (e.g., a type of slot); Uniu,v is the amount of unique traffic between VMs u and v; and Interj,k,l,m is the inter-VM redundant traffic demands for redundancy between data transfers from VMj→VMk and VMl→VMm.
  • In some embodiments of step 670, the optimization determination is computed as shown in pseudo code lines (11)-(16).
      • (11) for each container Ci, Cj pair;
      • (12) for each slot k in Ci;
      • (13) for each slot l in Cj, determine whether exchanging slot k with slot l improves the effective inter-container bandwidth usage;
      • (14) select the slot lmax
        • // where lmax is the slot l with the best calculated
        • // effective inter-container-bandwidth usage metric;
      • (15) if slot k≠slot lmax &&
        • the exchange does not violate the inter-container bandwidth constraints
      • (16) exchange slot k with slot lmax;
  • In some embodiments, the step 670 is as described in FIG. 9.
  • In some embodiments, method 600 includes slot placement constraint(s). In particular, for some slots, only a subset of containers is available for slot placement. In some of these embodiments, the selection of the subset of available containers is based on a quality of service metric such as latency.
  • In some of these embodiments, a number of containers (“K”) is selected and a set of randomly chosen K slots is assignment to the K containers using a one-to-one correspondence and ensuring that the slot placement constraint(s) are met. In step 630, in computing the distance of a slot to the containers, only containers meeting the slot placement constraint(s) are considered. In step 670, only exchanges meeting slot placement constraints are performed.
  • Although primarily depicted and described in a particular sequence, it should be appreciated that the steps shown in methods 200, 400, 500 and 600 may be performed in any suitable sequence. Moreover, the steps identified by one step may also be performed in one or more other steps in the sequence or common actions of more than one step may be performed only once.
  • It should be appreciated that steps of various above-described methods can be performed by programmed computers. Herein, some embodiments are also intended to cover program storage devices, e.g., data storage media, which are machine or computer readable and encode machine-executable or computer-executable programs of instructions, wherein said instructions perform some or all of the steps of said above-described methods. The program storage devices may be, e.g., digital memories, magnetic storage media such as a magnetic disks and magnetic tapes, hard drives, or optically readable data storage media. The embodiments are also intended to cover computers programmed to perform said steps of the above-described methods.
  • FIG. 10 schematically illustrates an embodiment of various apparatus 1000 such as one one of end hosts 120, one of ToR switches 130, one of aggregation switches 150 or placement controller 170 of FIG. 1. The apparatus 1000 includes a processor 1010, a data storage 1011, and optionally an I/O interface 1030.
  • The processor 1010 controls the operation of the apparatus 1000. The processor 1010 cooperates with the data storage 1011.
  • The data storage 1011 may store program data such as redundancy measurements, communication patterns or the like as appropriate. The data storage 1011 also stores programs 1020 executable by the processor 1010.
  • The processor-executable programs 1020 may include an agent collection program 1021, a placement controller program 1023, a placement controller placement program 1025, or an I/O interface program 1027 as appropriate. Processor 1010 cooperates with processor-executable programs 1020.
  • The I/O interface 1030 cooperates with processor 1010 and I/O interface program 1027 to support communications over any appropriate communication channel(s) such as links of FIG. 1.
  • The agent collection 1021 performs the steps 210-1, 220-1 and 230-1 of method 200 of FIG. 2 as described above.
  • The placement controller program 1023 performs the steps 250, 260, and 270 of method 200 of 2IG. 5 as described above.
  • The placement controller placement program 1025 performs the steps of method 270 of FIG. 2, the steps of FIG. 4, the steps of FIG. 5 or the steps of FIG. 6 as described above.
  • In some embodiments, the processor 1010 may include resources such as processors/CPU cores, the I/O interface 1030 may include any suitable network interfaces, or the data storage 1011 may include memory or storage devices. Moreover the apparatus 1000 may be any suitable physical hardware configuration such as: one or more server(s), blades consisting of components such as processor, memory, network interfaces or storage devices. In some of these embodiments, the apparatus 1000 may include cloud network resources that are remote from each other.
  • In some embodiments, the apparatus 1000 may be virtual machine. In some of these embodiments, the virtual machine may include components from different machines or be geographically dispersed. For example, the data storage 1011 and the processor 1010 may be in two different physical machines.
  • In some embodiments, the apparatus 1000 may be a general purpose computer programmed to perform the methods 200, 400, 500 or 600.
  • When processor-executable programs 1020 are implemented on a processor 1010, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
  • Although depicted and described herein with respect to embodiments in which, for example, programs and logic are stored within the data storage and the memory is communicatively connected to the processor, it should be appreciated that such information may be stored in any other suitable manner (e.g., using any suitable number of memories, storages or databases); using any suitable arrangement of memories, storages or databases communicatively connected to any suitable arrangement of devices; storing information in any suitable combination of memory(s), storage(s) or internal or external database(s); or using any suitable number of accessible external memories, storages or databases. As such, the term data storage referred to herein is meant to encompass all suitable combinations of memory(s), storage(s), and database(s).
  • The description and drawings merely illustrate the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples recited herein are principally intended expressly to be only for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass equivalents thereof.
  • The functions of the various elements shown in the FIGS., including any functional blocks labeled as “processors”, may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non volatile storage. Other hardware, conventional or custom, may also be included. Similarly, any switches shown in the FIGS. are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
  • It should be appreciated that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it should be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

Claims (17)

What is claimed is:
1. An apparatus for providing slot placement, the apparatus comprising:
a data storage; and
a processor communicatively connected to the data storage, the processor being configured to:
determine a plurality of slots to be placed within a plurality of containers;
determine a plurality of redundancy parameters; and
determine a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters and a determination that inter-container-bandwidth-usage will be improved.
2. The apparatus of claim 1, wherein a plurality of the plurality of slots are virtual machines.
3. The apparatus of claim 1, wherein a plurality of the plurality of containers are racks in a data center.
4. The apparatus of claim 1, wherein the redundancy parameters are based on intra-comm-redundancy and inter-comm-redundancy of at least a portion of the plurality of slots.
5. The apparatus of claim 4, wherein the redundancy parameters are based on communication patterns of at least a portion of the plurality of slots.
6. The apparatus of claim 1, wherein the determination of the placement of the plurality of slots comprises configuring the processor to:
determine a plurality of clusters;
assign each of at least a portion of the plurality of slots to at least one of the plurality of clusters; and
assign each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
7. The apparatus of claim 6, wherein the assignment of the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
8. The apparatus of claim 7, wherein the assignment of the portion of the plurality of slots is further based on one or more slot placement constraints.
9. The apparatus of claim 6, wherein the assignment of the portion of the plurality of clusters is based on an effective-inter-container-bandwidth-usage.
10. A system for providing slot placement, the system comprising:
a placement controller;
a plurality of agents communicatively connected to the placement controller, each of the plurality of agents configured to:
collect a plurality of redundancy measurements; and
send the plurality of redundancy measurements to the placement controller; and
the placement controller configured to:
receive the plurality of redundancy measurements;
determine a plurality of slots to be placed within a plurality of containers;
determine a plurality of redundancy parameters based on the plurality of redundancy measurements; and
determine a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters and a determination that inter-container-bandwidth-usage will be improved.
11. The system of claim 10, wherein the determination of the placement of the plurality of slots comprises configuring the placement controller to:
determine a plurality of clusters;
assign each of at least a portion of the plurality of slots to at least one of the plurality of clusters; and
assign each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
12. The system of claim 11, wherein the assignment of the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
13. The system of claim 12, wherein the assignment of the portion of the plurality of slots is further based on one or more slot placement constraints.
14. The system of claim 11, wherein the assignment of the portion of the plurality of clusters is based on an effective-inter-container-bandwidth-usage.
15. A method for providing slot placement, the method comprising:
at a processor communicatively connected to a data storage, determining a plurality of slots to be placed within a plurality of containers;
determining, by the processor in cooperation with the data storage, a plurality of redundancy parameters; and
determining, by the processor in cooperation with the data storage, a placement of the plurality of slots in the plurality of containers based on the plurality of redundancy parameters and a determination that inter-container-bandwidth-usage will be improved.
16. The method of claim 15, wherein the step of determining the placement of the plurality of slots comprises:
determining, by the processor in cooperation with the data storage, a plurality of clusters;
assigning, by the processor in cooperation with the data storage, each of at least a portion of the plurality of slots to at least one of the plurality of clusters; and
assigning, by the processor in cooperation with the data storage, each of at least a portion of the plurality of clusters to the at least one of the plurality of containers.
17. The method of claim 16, wherein the step of assigning the portion of the plurality of slots is based on a required slot bandwidth after redundancy elimination.
US14/731,980 2012-08-14 2015-06-05 Method And Apparatus For Providing Traffic Re-Aware Slot Placement Abandoned US20150278146A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/731,980 US20150278146A1 (en) 2012-08-14 2015-06-05 Method And Apparatus For Providing Traffic Re-Aware Slot Placement

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/584,859 US9104462B2 (en) 2012-08-14 2012-08-14 Method and apparatus for providing traffic re-aware slot placement
US14/731,980 US20150278146A1 (en) 2012-08-14 2015-06-05 Method And Apparatus For Providing Traffic Re-Aware Slot Placement

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/584,859 Continuation US9104462B2 (en) 2012-08-14 2012-08-14 Method and apparatus for providing traffic re-aware slot placement

Publications (1)

Publication Number Publication Date
US20150278146A1 true US20150278146A1 (en) 2015-10-01

Family

ID=49213062

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/584,859 Active 2033-02-12 US9104462B2 (en) 2012-08-14 2012-08-14 Method and apparatus for providing traffic re-aware slot placement
US14/731,980 Abandoned US20150278146A1 (en) 2012-08-14 2015-06-05 Method And Apparatus For Providing Traffic Re-Aware Slot Placement

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US13/584,859 Active 2033-02-12 US9104462B2 (en) 2012-08-14 2012-08-14 Method and apparatus for providing traffic re-aware slot placement

Country Status (7)

Country Link
US (2) US9104462B2 (en)
EP (1) EP2885706B1 (en)
JP (1) JP6385932B2 (en)
KR (1) KR20150034265A (en)
CN (1) CN104583954B (en)
IN (1) IN2015DN00456A (en)
WO (1) WO2014028175A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9678800B2 (en) * 2014-01-30 2017-06-13 International Business Machines Corporation Optimum design method for configuration of servers in a data center environment
US20150304450A1 (en) * 2014-04-17 2015-10-22 Alcatel Lucent Canada,Inc. Method and apparatus for network function chaining
WO2017108119A1 (en) * 2015-12-23 2017-06-29 Huawei Technologies Co., Ltd. Rack awareness
JP7571414B2 (en) 2020-08-06 2024-10-23 日本電気株式会社 CONTAINER EXECUTION SYSTEM, TERMINAL DEVICE, SERVER DEVICE, CONTAINER EXECUTION ENVIRONMENT DECISION METHOD, AND PROGRAM

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100254377A1 (en) * 2009-04-03 2010-10-07 Srinivasa Aditya Akella Network Routing System Providing Increased Network Bandwidth
US20110225277A1 (en) * 2010-03-11 2011-09-15 International Business Machines Corporation Placement of virtual machines based on server cost and network cost
US20110283062A1 (en) * 2010-05-14 2011-11-17 Hitachi, Ltd. Storage apparatus and data retaining method for storage apparatus
US20120084445A1 (en) * 2010-10-05 2012-04-05 Brock Scott L Automatic replication and migration of live virtual machines
US8681649B2 (en) * 2011-08-01 2014-03-25 Cisco Technology, Inc. System and method for adaptive optimization of resource utilization for redundancy elimination

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7120904B1 (en) * 2000-04-19 2006-10-10 Intel Corporation Data-flow method for optimizing exception-handling instructions in programs
US20080172526A1 (en) * 2007-01-11 2008-07-17 Akshat Verma Method and System for Placement of Logical Data Stores to Minimize Request Response Time
JP2008299791A (en) * 2007-06-04 2008-12-11 Hitachi Ltd Virtual computer system
JP5131915B2 (en) * 2008-03-31 2013-01-30 国立大学法人 東京大学 Packet encoding method and apparatus, and decoding method and apparatus
US8102781B2 (en) * 2008-07-31 2012-01-24 Cisco Technology, Inc. Dynamic distribution of virtual machines in a communication network
JP2010108409A (en) * 2008-10-31 2010-05-13 Hitachi Ltd Storage management method and management server
JP2011070464A (en) * 2009-09-28 2011-04-07 Hitachi Ltd Computer system, and method of managing performance of the same
US8589921B2 (en) * 2009-11-30 2013-11-19 Red Hat Israel, Ltd. Method and system for target host optimization based on resource sharing in a load balancing host and virtual machine adjustable selection algorithm
US9083708B2 (en) * 2010-05-17 2015-07-14 Microsoft Technology Licensing, Llc Asymmetric end host redundancy elimination for networks
US8667171B2 (en) * 2010-05-28 2014-03-04 Microsoft Corporation Virtual data center allocation with bandwidth guarantees
US9281865B2 (en) * 2011-10-13 2016-03-08 Cisco Technology, Inc. Efficient network discovery in frequency hopping networks
US9201697B2 (en) * 2012-01-23 2015-12-01 International Business Machines Corporation System and method to reduce memory usage by optimally placing VMS in a virtualized data center
US20130268672A1 (en) * 2012-04-05 2013-10-10 Valerie D. Justafort Multi-Objective Virtual Machine Placement Method and Apparatus

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100254377A1 (en) * 2009-04-03 2010-10-07 Srinivasa Aditya Akella Network Routing System Providing Increased Network Bandwidth
US20110225277A1 (en) * 2010-03-11 2011-09-15 International Business Machines Corporation Placement of virtual machines based on server cost and network cost
US20110283062A1 (en) * 2010-05-14 2011-11-17 Hitachi, Ltd. Storage apparatus and data retaining method for storage apparatus
US20120084445A1 (en) * 2010-10-05 2012-04-05 Brock Scott L Automatic replication and migration of live virtual machines
US8681649B2 (en) * 2011-08-01 2014-03-25 Cisco Technology, Inc. System and method for adaptive optimization of resource utilization for redundancy elimination

Also Published As

Publication number Publication date
JP6385932B2 (en) 2018-09-05
JP2015534663A (en) 2015-12-03
IN2015DN00456A (en) 2015-06-26
US9104462B2 (en) 2015-08-11
EP2885706A1 (en) 2015-06-24
EP2885706B1 (en) 2020-02-26
US20140052973A1 (en) 2014-02-20
CN104583954B (en) 2017-08-25
KR20150034265A (en) 2015-04-02
CN104583954A (en) 2015-04-29
WO2014028175A1 (en) 2014-02-20

Similar Documents

Publication Publication Date Title
US9430321B2 (en) Reconstructing data stored across archival data storage devices
US9548884B2 (en) Method and apparatus for providing a unified resource view of multiple virtual machines
US10191808B2 (en) Systems and methods for storing, maintaining, and accessing objects in storage system clusters
US20150278146A1 (en) Method And Apparatus For Providing Traffic Re-Aware Slot Placement
US10740198B2 (en) Parallel partial repair of storage
US9424156B2 (en) Identifying a potential failure event for a data storage device
CN106909310B (en) Method and apparatus for path selection for storage systems
JP2015522876A (en) Method and apparatus for eliminating single points of failure in cloud-based applications
US9557938B2 (en) Data retrieval based on storage device activation schedules
US20150331632A1 (en) Managing archival storage
US11979335B2 (en) Network controller
US20200042224A1 (en) Managing wear balancing in mapped raid storage systems
JP6256167B2 (en) Risk reduction in data center networks
US9430149B2 (en) Pipeline planning for low latency storage system
US10254973B2 (en) Data management system and method for processing distributed data
US20150331465A1 (en) Cascading startup power draws of enclosures across a network
JP2015534663A5 (en)
US9430152B2 (en) Data device grouping across data storage device enclosures for synchronized data maintenance
US20230089663A1 (en) Maintenance mode for storage nodes
Xiang et al. Taming latency in data center networking with erasure coded files
US20190245807A1 (en) Communication path setting apparatus, communication path setting method and communication path setting program
CN117093357A (en) Resource scheduling method, device and system of elastic search cluster
CN114007227A (en) Base station virtualization method, system and device
KR20180110494A (en) Virtual network management apparatus considering network resources and method therefor

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION