US20200374341A1 - Cross-cluster direct server return with anycast rendezvous in a content delivery network (cdn) - Google Patents
- Publication number
- US20200374341A1 (application US16/991,545)
- Authority
- US
- United States
- Prior art keywords
- server
- client
- delivery
- contact
- request
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1023—Server selection for load balancing based on a hash applied to IP addresses or costs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- H04L61/1535—
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/45—Network directories; Name-to-address mapping
- H04L61/4535—Network directories; Name-to-address mapping using an address exchange platform which sets up a session between two nodes, e.g. rendezvous servers, session initiation protocols [SIP] registrars or H.323 gatekeepers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/14—Session management
- H04L67/148—Migration or transfer of sessions
-
- H04L67/2842—
-
- H04L61/1511—
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/45—Network directories; Name-to-address mapping
- H04L61/4505—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
- H04L61/4511—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1014—Server selection for load balancing based on the content of a request
Definitions
- This invention relates to content delivery and content delivery networks. More specifically, this invention relates to direct server return with anycast rendezvous in content delivery networks.
- FIG. 1 depicts aspects of a content delivery network (CDN) according to exemplary embodiments hereof;
- FIGS. 2A, 2B, and 3 depict aspects of clusters of service endpoints and clustering in an exemplary CDN in accordance with exemplary embodiments hereof;
- FIGS. 4, 5A-5C, 6, and 7 depict aspects of Direct Server Return in a CDN according to exemplary embodiments hereof;
- FIG. 8 is a flowchart depicting aspects of exemplary embodiments hereof.
- FIG. 9 depicts aspects of computing according to exemplary embodiments hereof.
- AS means autonomous system;
- BGP means border gateway protocol;
- CD means content delivery;
- CDN means content delivery network;
- DNS means domain name system;
- DSR means direct server return;
- HTTP means Hyper Text Transfer Protocol;
- HTML means Hypertext Markup Language;
- HTTPS means HTTP Secure;
- IP means Internet Protocol;
- IPv4 means Internet Protocol Version 4;
- IPv6 means Internet Protocol Version 6;
- IP address means an address used in the Internet Protocol, including both IPv4 and IPv6, to identify electronic devices such as servers and the like;
- OSI model refers to the Open Systems Interconnection model;
- SSL means Secure Sockets Layer;
- URI means Uniform Resource Identifier;
- URL means Uniform Resource Locator.
- a “mechanism” refers to any device(s), process(es), routine(s), service(s), module(s), or combination thereof.
- a mechanism may be implemented in hardware, software, firmware, using a special-purpose device, or any combination thereof.
- a mechanism may be integrated into a single device or it may be distributed over multiple devices. The various components of a mechanism may be co-located or distributed. The mechanism may be formed from other mechanisms.
- the term “mechanism” may thus be considered shorthand for the term device(s) and/or process(es) and/or service(s).
- a content delivery network distributes content (e.g., resources) efficiently to clients on behalf of one or more content providers, preferably via a public Internet.
- Content providers provide their content (e.g., resources) via origin sources (origin servers or origins).
- a CDN can also provide an over-the-top transport mechanism for efficiently sending content in the reverse direction—from a client to an origin server.
- clients end-users
- content providers benefit from using a CDN.
- a content provider is able to take pressure off (and thereby reduce the load on) its own servers (e.g., its origin servers). Clients benefit by being able to obtain content with fewer delays.
- FIG. 1 shows aspects of an exemplary CDN in which one or more content providers 102 provide content via one or more origin sources 104 and delivery services (servers) 106 to clients 108 via one or more networks 110 .
- the delivery services (servers) 106 may form a delivery network from which clients 108 may obtain content.
- the delivery services 106 may be logically and/or physically organized hierarchically and may include edge caches.
- the delivery services 106 may be logically and/or physically organized as clusters and super-clusters, as described below.
- components of a CDN may use the CDN to deliver content to other CDN components.
- a CDN component may itself be a client of the CDN.
- the CDN may use its own infrastructure to deliver CDN content (e.g., CDN control and configuration information) to CDN components.
- Client requests may be associated with delivery server(s) 106 by a rendezvous system 112 comprising rendezvous mechanism(s) 114 , possibly in the form of one or more rendezvous networks.
- the rendezvous mechanism(s) 114 may be implemented, at least in part, using or as part of a DNS system, and the association of a particular client request (e.g., for content) with one or more delivery servers may be done as part of DNS processing associated with that particular client request (e.g., of a domain name associated with the particular client request).
- multiple delivery servers 106 in the CDN can process or handle any particular client request for content (e.g., for one or more resources).
- the rendezvous system 112 associates a particular client request with one or more “best” or “optimal” (or “least worst”) delivery servers 106 to deal with that particular request.
- the “best” or “optimal” delivery server(s) 106 may be one(s) that is (are) close to the client (by some measure of network cost) and that is (are) not overloaded.
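
The selection criteria above (close to the client by some measure of network cost, and not overloaded) can be illustrated with a minimal sketch. The field names `network_cost` and `load`, the `max_load` threshold, and the fallback rule are assumptions made for illustration and do not come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DeliveryServer:
    name: str            # hypothetical identifier
    network_cost: float  # assumed cost from client to server (lower is closer)
    load: float          # assumed utilization in the range [0, 1]


def choose_delivery_server(servers: List[DeliveryServer],
                           max_load: float = 0.8) -> Optional[DeliveryServer]:
    """Pick the closest server that is not overloaded.

    If every server is overloaded, fall back to the least-loaded one
    (a "least worst" choice). Returns None when no servers are given.
    """
    if not servers:
        return None
    usable = [s for s in servers if s.load <= max_load]
    if usable:
        return min(usable, key=lambda s: s.network_cost)
    return min(servers, key=lambda s: s.load)
```

A real rendezvous system would likely weigh many more signals; the sketch only shows how closeness and load can combine into one choice.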
- the chosen delivery server(s) 106 (i.e., the delivery server(s) chosen by the rendezvous system 112 for a client request) need not have the requested content at the time the request is made, even if the chosen delivery server(s) 106 eventually serve the requested content to the requesting client.
- When a client 108 makes a request for content, the client may be referred to as the requesting client, and the delivery server 106 that the rendezvous system 112 associates with that client request (and that the client first contacts to make the request) may be referred to as the “contact” server or just the contact.
- CDNs are described in U.S. Pat. Nos. 8,060,613 and 8,925,930, the entire contents of both of which have been fully incorporated herein by reference for all purposes.
- a CDN generally provides a redundant set of service endpoints running on distinct hardware in different locations. These distinctly addressed but functionally equivalent service endpoints provide options to the rendezvous system 112 . Each distinct endpoint is preferably, but not necessarily, uniquely addressable within the system, preferably using an addressing scheme that may be used to establish a connection with the endpoint.
- the address(es) of an endpoint may be real or virtual. In some implementations, e.g., where service endpoints (preferably functionally equivalent service endpoints) are bound to the same cluster and share a virtual address, the virtual address may be used.
- each distinct endpoint may be defined by at least one unique IP address and port number combination.
- each distinct endpoint may be defined by at least one unique combination of the IP address and port number.
- service endpoints that are logically bound to the same cluster may share a so-called VIP (virtual IP address), in which cases each distinct endpoint may be defined by at least one unique combination of the VIP and a port number. In the latter case, each distinct endpoint may be bound to exactly one physical cluster in the CDN.
- the endpoint may be defined in terms of a real address rather than a virtual address (e.g., an IP address rather than a VIP).
- a virtual address may, in some cases, correspond to or be a physical address.
- a VIP may be (or correspond to) a physical address (e.g., for a single machine cluster).
- VIP is used in this description as an example of a virtual address (for an IP-based system). In general any kind of virtual addressing scheme may be used and is contemplated herein. Unless specifically stated otherwise, the term VIP is intended as an example of a virtual address, and the system is not limited to or by IP-based systems or systems with IP addresses and/or VIPs.
- service endpoints SEP 1 , SEP 2 . . . SEP n are logically bound to the same cluster 200 and share an address.
- the shared address may be a virtual address (e.g., a VIP).
- a physical cluster of service endpoints may have one or more logical clusters of service endpoints.
- a physical cluster 202 includes two logical clusters (Logical Cluster 1 and Logical Cluster 2 ).
- Logical Cluster 1 consists of two machines (M 0 , M 1 ), and Logical Cluster 2 consists of three machines (M 2 , M 3 , M 4 ).
- the machines in each logical cluster may share a heartbeat signal (HB) with other machines in the same logical cluster.
- the first logical cluster may be addressable by a first unique virtual address (address #1, e.g., a first VIP/port combination), whereas the second logical cluster may be addressable by a second unique virtual address (address #2, e.g., a second VIP/port combination).
- a machine may be part of only a single logical cluster, although it should be appreciated that this is not a requirement.
- the machines that share a heartbeat signal may be said to be on a heartbeat ring.
- machines M 0 and M 1 are on the same heartbeat ring
- machines M 2 , M 3 , and M 4 are on the same heartbeat ring.
- When a service endpoint is bound to a cluster, it means that a bank of equivalent services is running on all the machines in the cluster and listening for service requests addressed to that cluster endpoint address.
- a local mechanism (e.g., a load-balancing mechanism) ensures that exactly one service instance (e.g., machine) in the cluster will respond to each unique service request. This may be accomplished, e.g., by consistently hashing attributes of each request to exactly one of the available machines (and of course it is impossible to have more than one service instance listening per machine on the same endpoint).
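
One way to hash request attributes consistently to exactly one machine is rendezvous (highest-random-weight) hashing, sketched below. This is an illustrative technique chosen for the example, not necessarily the approach of U.S. Pat. No. 8,015,298; the function name and the use of the client IP address and port as the hashed attributes are assumptions.

```python
import hashlib
from typing import Iterable


def pick_machine(client_ip: str, client_port: int, machines: Iterable[str]) -> str:
    """Map a request to exactly one machine using rendezvous hashing.

    Every machine can run this independently and reach the same answer,
    so exactly one of them decides to respond. Raises ValueError if
    `machines` is empty.
    """
    def score(machine: str) -> int:
        key = f"{client_ip}:{client_port}|{machine}".encode("utf-8")
        # Use the first 8 bytes of the digest as an unsigned integer score.
        return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

    return max(machines, key=score)
```

Because each machine's score depends only on the request and that machine, removing a machine that was not chosen leaves the chosen machine unchanged.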
- Each service instance running on machines in the cluster can be listening to any number of other endpoint addresses, each of which will have corresponding service instances running on all other machines in the cluster.
- each machine is installed in a physical cluster of machines behind a single shared switch.
- One physical cluster may be divided up into multiple logical clusters, where each logical cluster consists of those machines on the same physical cluster that are part of the same HB ring. That is, each machine runs an HB process with knowledge of the other machines in the same logical cluster, monitoring all virtual addresses (e.g., VIPs) and updating the local firewall and NIC (network interface card/controller) configurations in order to implement local load balancing across the cluster.
- Because each machine may be considered to be a peer of all other machines in the cluster, there is no need for any other active entity specific to the cluster.
- a subcluster is a group of one or more (preferably homogenous) machines sharing an internal, local area network (LAN) address space, possibly load-balanced, each running a group of one or more collaborating service instances.
- Service instances within the subcluster's internal LAN address space can preferably address each other with internal or external LAN addresses, and may also have the ability to transfer connections from one machine to another in the midst of a single session with an external client, without the knowledge or participation of the client.
- a supercluster is a group of one or more (preferably homogenous) subclusters, each consisting of a group of one or more collaborating but distinctly addressed service images.
- Different service images in the same supercluster may or may not share a common internal LAN (although it should be appreciated that they still have to be able to communicate, directly or indirectly, with each other over some network).
- Those connected to the same internal LAN may use internal LAN addresses or external LAN addresses, whereas others must use external network addresses to communicate with machines in other subclusters.
- Clusters may be interconnected in arbitrary topologies to form subnetworks.
- the set of subnetworks a service participates in, and the topology of those networks, may be dynamic, constrained by dynamically changing control policies based on dynamically changing information collected from the network itself, and measured by the set of currently active communication links between services.
- An example showing the distinction between physical clusters, logical subclusters, and logical superclusters is shown in FIG. 3.
- the machines of two physical clusters A and B are subdivided into groups forming logical subclusters R, S, and T (from the machines of physical cluster A) and logical subclusters X, Y, and Z (from the machines of physical cluster B). These subclusters are then logically recombined to form logical superclusters I (from subclusters R and S), J (from subclusters T and X), and K (from subclusters Y and Z).
- the number of machines that may be combined into one subcluster is limited by the number of machines in a physical cluster, but theoretically any number of logical subclusters may be grouped into one supercluster that may span multiple physical clusters or be contained within one.
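
The FIG. 3 grouping described above can be written down as two small mappings. The membership data comes from the text; the variable and function names are hypothetical.

```python
from typing import Dict, List, Set

# Subclusters carved out of each physical cluster (from the FIG. 3 example).
SUBCLUSTERS_BY_PHYSICAL: Dict[str, List[str]] = {
    "A": ["R", "S", "T"],
    "B": ["X", "Y", "Z"],
}

# Subclusters recombined into logical superclusters (from the FIG. 3 example).
SUBCLUSTERS_BY_SUPER: Dict[str, List[str]] = {
    "I": ["R", "S"],
    "J": ["T", "X"],
    "K": ["Y", "Z"],
}


def physical_clusters_spanned(supercluster: str) -> Set[str]:
    """Return the physical clusters that contribute subclusters to a supercluster."""
    members = set(SUBCLUSTERS_BY_SUPER[supercluster])
    return {
        physical
        for physical, subclusters in SUBCLUSTERS_BY_PHYSICAL.items()
        if members & set(subclusters)
    }
```

In this example, supercluster J spans both physical clusters, while I and K are each contained within one.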
- a two-level cluster architecture is assumed, where machines behind a common switch are grouped into logical sub-clusters, and sub-clusters (whether behind the same switch or on different racks/switches) are grouped into super-clusters.
- a single switch may govern multiple sub-clusters and these sub-clusters need not be in the same super-cluster. It is logically possible to have any number of machines in one sub-cluster, and any number of sub-clusters in a super-cluster, though those of ordinary skill in the art will realize and understand that physical and practical realities will dictate otherwise.
- U.S. Pat. No. 8,015,298 describes various approaches to ensure that exactly one service instance in a cluster will respond to each unique service request. These may be referred to as the first allocation approach and the second allocation approach.
- service endpoints on the same HB ring select from among themselves to process service requests.
- the selected service endpoint may select another service endpoint (preferably from service endpoints on the same HB ring) to actually process the service request. This handoff may be made based on, e.g., the type of request or actual content requested.
- an additional level of heartbeat-like functionality exists at the level of virtual addresses (e.g., VIPs) in a super-cluster, detecting virtual addresses that are down and configuring them on machines that are up.
- This super-HB allows the system to avoid relying solely on DNS-based rendezvous for fault-tolerance and to deal with the DNS-TTL phenomenon that would cause clients with stale IP addresses to continue to contact VIPs that are known to be down.
- a super-HB system may have to interact with the underlying network routing mechanism (simply bringing a VIP “up” does not mean that requests will be routed to it properly).
- the routing infrastructure is preferably informed that the VIP has moved to a different switch.
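
A minimal sketch of the super-HB idea of configuring down virtual addresses on machines that are up. The function name, the map from VIP to owning machine, and the rule of placing each orphaned VIP on the up machine with the fewest VIPs are assumptions; informing the routing infrastructure is not modeled here.

```python
from typing import Dict, List, Set


def reassign_down_vips(vip_owner: Dict[str, str], up_machines: Set[str]) -> Dict[str, str]:
    """Return a new VIP-to-machine map with no VIP left on a down machine.

    VIPs whose owner is still up stay where they are. Each orphaned VIP is
    placed on the up machine that currently holds the fewest VIPs
    (ties broken by machine name).
    """
    if not up_machines:
        raise ValueError("no machines are up")
    counts = {machine: 0 for machine in up_machines}
    new_owner: Dict[str, str] = {}
    orphaned: List[str] = []
    for vip, machine in sorted(vip_owner.items()):
        if machine in up_machines:
            new_owner[vip] = machine
            counts[machine] += 1
        else:
            orphaned.append(vip)
    for vip in orphaned:
        target = min(sorted(counts), key=lambda m: counts[m])
        new_owner[vip] = target
        counts[target] += 1
    return new_owner
```

In a deployment, each reassignment would also have to be announced to the routing layer, since bringing a VIP up on a machine does not by itself route requests to it.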
- Although VIPs are described, it should be appreciated that the system is not limited to an IP-based scheme, and any type of addressing and/or virtual addressing may be used.
- Heartbeat(s) provide a way for machines (or service endpoints) in the same cluster (logical and/or physical and/or super) to know the state of other machines (or service endpoints) in the cluster, and heartbeat(s) provide information to the various allocation techniques.
- a heartbeat and super-heartbeat may be implemented, e.g., using the reducer/collector systems such as described in U.S. Pat. No. 8,925,930.
- a local heartbeat in a physical cluster is preferably implemented locally and with a fine granularity.
- a super-heartbeat may not have (or need) the granularity of a local heartbeat.
- the first allocation approach system described in U.S. Pat. No. 8,015,298 provides the most responsive failover at the cost of higher communication overhead. This overhead determines an effective maximum number of machines and VIPs in a single logical sub-cluster based on the limitations of the heartbeat protocol.
- the first allocation approach mechanisms described in U.S. Pat. No. 8,015,298 also impose additional overhead beyond that of the heartbeat due to the need to broadcast and filter request traffic.
- a VIP-level failover mechanism that spans the super-cluster would impose similar heartbeat overhead but would not require any request traffic broadcasting or filtering.
- Detection of down VIPs in the cluster may potentially be handled without a heartbeat, using a reduction of log events received outside the cluster.
- a feedback control mechanism could detect inactive VIPs and reallocate them across the cluster by causing new VIP configurations to be generated as local control resources.
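
Detecting inactive VIPs from a reduction of log events, without a heartbeat, might look like the following sketch. The event format (pairs of VIP and timestamp), the function name, and the fixed inactivity window are assumptions for illustration only.

```python
from typing import Dict, Iterable, Set, Tuple


def inactive_vips(configured_vips: Iterable[str],
                  log_events: Iterable[Tuple[str, float]],
                  now: float,
                  window: float) -> Set[str]:
    """Return the configured VIPs with no log event in the last `window` seconds.

    `log_events` is a stream of (vip, timestamp) pairs; the reduction keeps
    only the most recent timestamp per VIP.
    """
    last_seen: Dict[str, float] = {}
    for vip, timestamp in log_events:
        if timestamp > last_seen.get(vip, float("-inf")):
            last_seen[vip] = timestamp
    return {
        vip for vip in configured_vips
        if now - last_seen.get(vip, float("-inf")) > window
    }
```

A feedback controller could then feed the returned set into whatever mechanism generates new VIP configurations.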
- a particular client request for content uses the rendezvous system 112 to determine an appropriate delivery server 106 to handle the request. That appropriate delivery server 106 effectively becomes the contact server for that request.
- certain delivery servers 106 may act primarily (or even solely) as contact servers 116 .
- the contact servers 116 of exemplary embodiments hereof may thus act only as contacts and do not also serve content.
- the contact server selects another delivery server 106 to handle the request.
- the request may then be handled by the “better” server using direct server return (DSR), e.g., as described in U.S. application Ser. No. 15/364,036.
- the contact servers 116 may form a network 118 of contact servers. Although the contact servers 116 are shown in FIG. 4 as a logical subset of the delivery servers 106, they may be considered a separate set of servers because, in preferred embodiments, they do not also serve content. However, as noted, when DSR is used to serve a client, the client is unaware that the contact server 116 is not actually serving the content to the client. As far as the client is concerned, the contact server 116 is a delivery server.
- one or more groups of contact servers 116 within the same autonomous system have the same IP address.
- the contact servers 116 - 1 , . . . in autonomous system AS- 1 have the same IP address, namely IP 1
- the contact servers 116 - 2 , . . . in autonomous system AS- 2 have the same IP address, namely IP 2 , and so on.
- a client's request for content is directed, by the rendezvous system 112 , to a first or initial contact server to handle the request.
- the client 208 is provided with an IP address (e.g., IP 1 ) for the initial contact server, e.g., by the rendezvous system 112 .
- a group of multiple contact servers in the same AS may have the same IP address.
- contact servers 116 -A and 116 -C are in the same AS and have the same IP address (IP 1 ).
- the contact server 116 -A to which the client 208 initially connects selects a delivery server (e.g., 106 -B) from the delivery servers 106 , and then uses direct server return (DSR) (e.g., as described in U.S. application Ser. No. 15/364,036) to process the client request.
- the contact servers may use state information 500 to maintain and/or look up information about the connection between the client 208 and the request being handled by the contact server 116-A.
- the state information 500 may include, e.g., a reference table 502 ( FIG. 5B ) that maps client IP addresses and port numbers to corresponding request information.
- the request information may include the IP address of the delivery server that is serving the client, the requested URL (or some identification of or information about the requested resource), and other miscellaneous information (e.g., connection information and the like).
- the first contact server registers the association between the client and the delivery server that is actually serving the client.
- the state information is preferably accessible to other contact servers, at least in the same AS (or AS group) as the contact server 116 -A.
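
The reference table 502 described above can be sketched as a mapping keyed by client IP address and port. The class and field names are hypothetical, and an in-memory dictionary stands in for whatever shared store (e.g., a collector and/or reducer network) an actual CDN would use.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

ClientKey = Tuple[str, int]  # (client IP address, client port)


@dataclass(frozen=True)
class RequestInfo:
    delivery_server_ip: str  # address of the delivery server serving the client
    url: str                 # requested URL, or another identifier of the resource


class ReferenceTable:
    """Hypothetical stand-in for reference table 502."""

    def __init__(self) -> None:
        self._entries: Dict[ClientKey, RequestInfo] = {}

    def register(self, client_ip: str, client_port: int, info: RequestInfo) -> None:
        """Record which delivery server is serving this client connection."""
        self._entries[(client_ip, client_port)] = info

    def lookup(self, client_ip: str, client_port: int) -> Optional[RequestInfo]:
        """Return the request information for a client, or None if unknown."""
        return self._entries.get((client_ip, client_port))
```

Keying on both address and port lets one client hold several connections that are served by different delivery servers.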
- State information 500 may be provided using, e.g., a collector and/or reducer network of the CDN (e.g., as described in U.S. Pat. Nos. 8,060,613 and 8,925,930, the entire contents of both of which have been fully incorporated herein by reference for all purposes).
- connection between the client 208 and the contact server 116 -A is made by or based on BGP tables.
- In other words, although multiple contact servers in the same AS (or AS group) have the same IP address, only one such contact server will be used, based, e.g., on network load, traffic, or BGP tables.
- the client 208 may be directed, during the processing of the request, to a different contact server with the same IP address (e.g., to contact server 116 -C).
- the second (or subsequent) contact server 116 -C needs to continue processing the request with the client 208 and the delivery server 106 -B where the first (or previous) contact server 116 -A/delivery server 106 -B left off.
- the second (or subsequent) contact server 116 -C needs to determine which delivery server 106 is serving the client 208 .
- the contact server 116 -C may, e.g., query the reference table 502 in the state information 500 to determine the identity of server 106 -B. The identity may be recorded in the state information 500 as any unique identifier.
- the IP address (IP-B) of server 106 -B may be used to uniquely identify the server in the state information.
- the reference table 502 maps the client's IP address (and port number), inter alia, to the address of the delivery server 106 that is handling the client request.
- the contact server 116-C may use the IP address (IP-C) of the client 208 to look up information in the table 502 in the state information 500.
- the client request may then be processed using DSR with contact server 116-C, delivery server 106-B, and client 208. Since the contact servers 116-A and 116-C have the same IP address (IP 1), the client is unaware of any change. However, instead of the DSR having contact server 116-A act as a pass-through proxy, that role is taken by contact server 116-C. Similarly, the delivery server 106-B may be unaware of the change in contact servers.
- the state information 500 should be updated to remove the information about the request. This can be done by having the entry for the request removed by the last contact server 116 handling the request. Effectively, the last contact server handling the request un-registers the association between the client and the delivery server.
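
The lifecycle described above (the first contact server registers the association, a later contact server with the same address resumes from it, and the last contact server un-registers it) can be sketched as follows. The `ContactServer` class, its method names, and the shared dictionary standing in for state information 500 are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

ClientKey = Tuple[str, int]  # (client IP address, client port)


class ContactServer:
    """Hypothetical contact server that shares one state table with its peers."""

    def __init__(self, name: str, shared_state: Dict[ClientKey, str]) -> None:
        self.name = name
        self.state = shared_state  # stands in for state information 500

    def register(self, client: ClientKey, delivery_ip: str) -> None:
        """Called by the first contact server after it selects a delivery server."""
        self.state[client] = delivery_ip

    def resume(self, client: ClientKey) -> Optional[str]:
        """Called by a later contact server to find which delivery server to use."""
        return self.state.get(client)

    def unregister(self, client: ClientKey) -> None:
        """Called by the last contact server when the request is finished."""
        self.state.pop(client, None)


# Two contact servers with the same anycast address share the state.
shared: Dict[ClientKey, str] = {}
contact_a = ContactServer("116-A", shared)
contact_c = ContactServer("116-C", shared)
client = ("198.51.100.9", 40000)

contact_a.register(client, "IP-B")          # 116-A picks delivery server 106-B
delivery_for_c = contact_c.resume(client)   # traffic shifts to 116-C mid-request
contact_c.unregister(client)                # 116-C is the last contact server
```

Any contact server can run the cleanup step, because the entry lives in the shared state rather than in the server that created it.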
- the IP address IP-B may correspond to a multi-machine cluster, in which case, the DSR migrated request (from contact server) may be handled by server 106 -B or by any machine in cluster 120 , in accordance with that cluster's request processing policies and protocols.
- the network address that the contact server uses for server 106 -B may be a VIP for the cluster 120 or a VIP for server 106 -B or an IP address of server 106 -B.
- the cluster may choose delivery server 106 to handle the request.
- elements of the request may be used to determine which server within that cluster will process the request. Note, however, that such determining needs to use the elements of the actual client request (rather than, e.g., the LAN address of the initial contact (IC) machine(s)).
- After the initial contact server hands off the request to server 106-B using direct server return (DSR), the contact server essentially acts as a router for that request. While the handoff (from contact server to delivery server 106-B) is transparent to the client 208, in TCP/IP communication with the delivery server 106-B the client must see the same IP address as that of the initial contact server (IP 1). Therefore the delivery server 106-B must spoof the IP address of the contact server on a per-connection basis (unless the delivery server 106-B has the same public IP address as the contact server, e.g., in an anycast system).
- the Open Systems Interconnection model is a conceptual model that characterizes and standardizes the communication functions of a telecommunication or computing system without regard to their underlying internal structure and technology.
- the OSI model partitions a communication system into abstraction layers. The original version of the model defined seven layers, including:
- After the initial client request to the contact server 116 -A (at Layer 5, the HTTP level), the contact server becomes a Layer 3/4 pass-through router in only one direction (from the client to the contact server to the delivery server) for that client request.
- the contact server changes from a Layer 5 session/application layer (e.g. HTTP) server and becomes a Layer 3/4 router.
- the initial contact server is thereby converted into a routing device for that particular client connection.
- the contact server/delivery server may not be able to communicate sufficient state to have the SSL handshake performed by the contact server (so that the request could be inspected by the contact server) and then have the delivery server continue the encryption of the responses.
- the contact server may perform a delivery server selection based on just load and/or client location and then forward the connection as soon as the connection has been established. That is, in such cases, the contact server may function as a Layer 3/4 pass-through immediately on connection establishment.
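A selection based only on load and client location, as described above, might be sketched as follows. This is a minimal illustration under stated assumptions: the `Candidate` type, the `load` and `network_cost` metrics, and the `max_load` threshold are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    load: float          # hypothetical metric: fraction of capacity in use (0.0 to 1.0)
    network_cost: float  # hypothetical metric: cost from client to server, lower is closer


def select_delivery_server(candidates, max_load=0.8):
    """Return the closest candidate whose load is below max_load.

    If every candidate is overloaded, fall back to the least-loaded one.
    """
    if not candidates:
        raise ValueError("no candidate delivery servers")
    usable = [c for c in candidates if c.load < max_load]
    if usable:
        # Prefer low network cost; break ties on lower load.
        return min(usable, key=lambda c: (c.network_cost, c.load))
    return min(candidates, key=lambda c: c.load)
```

Because the choice uses nothing from the request body, it can be made as soon as the connection is established, which is the case in which the contact server acts as a pass-through from the start.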
- the client 208 establishes a connection (e.g., a TCP/IP connection) with the contact server 116 -A and makes a request (e.g., an HTTP request) to the contact server 116 -A.
- the contact server 116 -A migrates the TCP connection to the delivery server 106 -B.
- the contact server 116 -A freezes the connection with the client and determines the required TCP state information (e.g., sequence numbers, etc.), and conveys that information to the delivery server 106 -B over some protocol (e.g., TCP), preferably over a side-channel, possibly using tunneling.
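The state conveyed at handoff could be sketched as a small fixed-size record. The field set and the wire layout below are assumptions made for illustration (the source says only "sequence numbers, etc."), and the record covers IPv4 addresses only.

```python
import socket
import struct
from dataclasses import dataclass

# Hypothetical wire layout in network byte order: two IPv4 addresses,
# two ports, two 32-bit sequence numbers, a 16-bit window, a 16-bit MSS.
_FORMAT = "!4s4sHHIIHH"


@dataclass(frozen=True)
class TcpHandoffState:
    client_ip: str
    contact_ip: str     # the address the client believes it is talking to
    client_port: int
    contact_port: int
    snd_nxt: int        # next sequence number to send to the client
    rcv_nxt: int        # next sequence number expected from the client
    window: int
    mss: int

    def pack(self) -> bytes:
        """Serialize the frozen connection state for the side channel."""
        return struct.pack(
            _FORMAT,
            socket.inet_aton(self.client_ip),
            socket.inet_aton(self.contact_ip),
            self.client_port,
            self.contact_port,
            self.snd_nxt,
            self.rcv_nxt,
            self.window,
            self.mss,
        )

    @classmethod
    def unpack(cls, data: bytes) -> "TcpHandoffState":
        """Rebuild the state on the delivery server side."""
        c_ip, s_ip, c_port, s_port, snd, rcv, win, mss = struct.unpack(_FORMAT, data)
        return cls(socket.inet_ntoa(c_ip), socket.inet_ntoa(s_ip),
                   c_port, s_port, snd, rcv, win, mss)
```

A real migration would also need TCP options and any buffered request bytes; those are omitted here.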
- When the client 208 sends an ACK (for the pieces of the TCP packet stream that it receives from the delivery server 106 -B), that ACK still arrives at the contact server 116 -A.
- the contact server 116 -A then provides those ACKs to the delivery server 106 -B.
- contact server 116 -A starts at layer 5 (HTTP) with its connections with the client.
- contact server 116 -A effectively becomes a layer 3/4 (router) and forwards layer 3/4 information (e.g., ACKs) from the client to the delivery server 106 -B.
- the contact server 116 -A will still receive the layer 3/4 and layer 5 information (e.g., HTTP) from the client 208 , but this information is forwarded to the delivery server 106 -B.
- the contact server 116 -A may examine layer 3/4 and layer 5 information, e.g., for tracking purposes or the like, but is not required to do so.
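The change of role described above can be pictured as a per-connection mode switch. The sketch below is illustrative only; the class, method, and mode names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    APPLICATION = "layer 5"     # contact server terminates HTTP itself
    PASS_THROUGH = "layer 3/4"  # contact server only forwards client packets


class ContactConnection:
    """Tracks how a contact server treats one client connection."""

    def __init__(self):
        self.mode = Mode.APPLICATION
        self.delivery_server = None

    def hand_off(self, delivery_server):
        # After the handoff, the contact server stops processing the
        # request and acts as a one-way router for this connection.
        self.mode = Mode.PASS_THROUGH
        self.delivery_server = delivery_server

    def on_client_packet(self, packet):
        """Return the action taken for a packet arriving from the client."""
        if self.mode is Mode.APPLICATION:
            return ("process", None)
        return ("forward", self.delivery_server)
```

Only the client-to-server direction passes through the contact server; responses go from the delivery server directly to the client.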
- the first request(s) to the first contact server from the client are handled by the first contact server at the application (HTTP) layer, whereas after the handoff to delivery server, subsequent requests are preferably handled by the first contact server (and subsequent contact servers) at the TCP layer.
- Using an initial contact server (IC) and then DSR with a delivery server may introduce delays compared to a hypothetical direct TCP/IP connection between the delivery server and the client.
- the DSR migration to the delivery server may reduce the overall throughput of the session because the path (and hence the TCP round trip time) is potentially being lengthened.
- the handoff has potential for making some aspects of the response to the client worse than if the response had been served directly from contact server.
- the contact server passes TCP packets from the client 208 to the delivery server 106 -B. These packets are transferred at the TCP (layer 4) level, and the contact server 116 need not examine them.
- the delivery server 106 -B obtains the TCP packets from the client (via the contact server 116 ) and processes the client request. From the client's perspective, it has a TCP connection with the contact server 116 -A.
- the chosen delivery server (or the chosen delivery server cluster) handles the request and does not, itself, pass on the request to yet another “better” server. While such processing is possible and contemplated herein, it is likely to introduce unacceptable delays.
- a contact server 116 may, in some cases, be capable of serving the requested content and may sometimes serve requested content to a client.
- the delivery server may, itself, be an initial contact for some client requests and may include the same DSR migration capabilities as contact server.
- contact server may be a “better” server for some other initial contact and may have a client connection DSR migrated to it.
- the contact server 116 that first receives a client request must select a delivery server 106 to handle the client request.
- the contact server may make this determination based on information associated with the request, at least some of which may be information that was not (or may not have been) known to (or knowable by) the rendezvous system 112 at the time that contact server was selected by the rendezvous system. This information may include one or more of:
- a heartbeat e.g., a cross-cluster heartbeat
- reducer/collector systems such as described in U.S. Pat. No. 8,925,930.
- the DSR migration is transparent to the client, and so the client must see the requested content coming from the same address as the contact server (which is where the client thinks it is coming from).
- the delivery server must spoof the IP address of the contact server on a per connection basis unless the delivery server has the same IP address as the contact server, e.g., in an anycast system in which all potential contact servers and delivery servers may have the same IP address.
- the contact servers may be dedicated appliances that do not serve content and essentially act as a second level HTTP-level rendezvous mechanism.
- subsequent contact servers 116 should use the same delivery server 106 -B for any DSR sessions from the same client 208 for the same resource.
- subsequent contact servers 116 may use state information 500 that the first contact server registered about client and delivery server (e.g., in table 502 ).
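The registration, lookup, and removal of this state can be sketched with a simple table keyed by client and resource. This is an assumption-laden illustration: the class name and key shape are hypothetical, and an in-memory dictionary stands in for whatever shared store the contact servers actually use.

```python
class DsrStateTable:
    """Maps (client, resource) to the delivery server(s) handling it."""

    def __init__(self):
        self._entries = {}

    def register(self, client, resource, delivery_servers):
        # Called by the first contact server after it selects delivery servers.
        self._entries[(client, resource)] = tuple(delivery_servers)

    def lookup(self, client, resource):
        # Called by any later contact server; None means no DSR session is known.
        return self._entries.get((client, resource))

    def unregister(self, client, resource):
        # Called by the last contact server handling the request.
        self._entries.pop((client, resource), None)
```

For example, after `register(("198.51.100.9", 40312), "/video.mp4", ["106-B"])`, a later contact server calling `lookup` with the same client and resource obtains `("106-B",)` and forwards to that server.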
- the contact server must decide whether or not to migrate the request to a “better” delivery server. Such decisions may be made, e.g., as described in U.S. patent application Ser. No. 15/364,036, which has been fully incorporated herein by reference for all purposes.
- the contact server may choose more than one delivery server to handle the request.
- the contact server establishes multiple DSR connections with multiple delivery servers. For example, as shown in FIG. 7 , the contact server 116 -A is selected by the client 208 to handle a request. The contact server selects delivery servers 106 -X and 106 -Y (e.g., using the rendezvous system). The contact server 116 -A then establishes two separate DSR connections with the client and each of the two delivery servers 106 -X and 106 -Y. Although only two delivery servers are shown in the example in FIG. 7 , those of ordinary skill in the art will appreciate and understand, upon reading this description, that more than two delivery servers may be used, along with a corresponding number of DSR connections.
- the contact server 116 -A may be a contact server that can serve content, or it may be a server that acts solely as a contact server.
- each selected delivery server (delivery servers 106 -X and 106 -Y) has identical (and possibly correct) copies of the requested content.
- the approach requires that the packets sent by each delivery server be identical so that the client 208 can accept and use whichever packet(s) it receives first.
- the MSS Maximum TCP segment Size
- Using multiple delivery servers provides redundancy and may provide improved delivery times (over a single delivery server or fewer delivery servers). However, as should be appreciated, there are costs (including overhead) of using multiple delivery servers. Accordingly, the decision as to how many delivery servers to use may be made based on policies (e.g., quality of service guarantees, etc.).
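Why the delivery servers must send identical packets can be seen from how a receiver treats duplicates. The sketch below is a simplified model, not a TCP implementation: it assumes every delivery server segments the content at the same byte offsets (for example, by using the same MSS), and the function name is hypothetical.

```python
def reassemble(segments):
    """Merge (seq, payload) pairs that arrive from several delivery servers.

    The first copy of each segment wins; later copies are dropped. A copy
    that differs from the one already accepted is treated as an error,
    since the redundant streams would then be inconsistent.
    """
    received = {}
    for seq, payload in segments:
        if seq in received:
            if received[seq] != payload:
                raise ValueError(f"conflicting data for sequence {seq}")
            continue
        received[seq] = payload

    out = bytearray()
    expected = 0
    for seq in sorted(received):
        if seq != expected:
            raise ValueError(f"gap before sequence {seq}")
        out += received[seq]
        expected = seq + len(received[seq])
    return bytes(out)
```

With arrivals `[(0, b"he"), (0, b"he"), (4, b"o!"), (2, b"ll"), (2, b"ll")]` the result is `b"hello!"`, regardless of which server each copy came from.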
- a client request is received (at 802 ) by a first contact server (e.g., contact server 116 -A in a contact server group or network 118 — FIG. 4 ).
- the client request was directed to the first contact server ( 116 -A) using an IP address (e.g., IP 1 ) associated with that contact server and using Anycast routing mechanisms.
- the first contact server ( 116 -A) then chooses (at 804 ) one or more delivery servers ( 106 ) and migrates the request (at 806 ) to each of the selected delivery servers.
- the first contact server updates the state information ( 500 in FIG. 5A, 502 in FIG. 5B ) to reflect the selected delivery server(s) 106 and other request information (e.g., a URL associated with the client request). Then, for each delivery server, (at 808 ) the first contact server ( 116 -A) passes TCP data packets (ACKs) from the client to the delivery server(s).
- the second contact server 116 -C determines the delivery server(s) 106 (e.g. using the state/registration information 500 ).
- the second contact server 116 -C then (at 812 ) passes TCP data packets (ACKs) from the client to the delivery server(s).
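The flow across two contact servers can be sketched as below. This is a hedged illustration: the class, the `choose` callback, and the dictionary standing in for the shared state information are hypothetical, and the numbers in the comments refer to the flowchart steps cited in the text.

```python
class ContactServer:
    def __init__(self, name, shared_state, choose):
        self.name = name
        self.state = shared_state  # stands in for state information 500
        self.choose = choose       # hypothetical delivery server selection

    def on_request(self, client, url):
        # 802: request received; 804: choose delivery server(s);
        # 806: migrate the request to each of them and record the choice.
        servers = list(self.choose(client, url))
        self.state[client] = {"url": url, "servers": servers}
        return [("migrate", s) for s in servers]

    def on_packet(self, client, packet):
        # 808 (first contact server) or 812 (a later one): look up the
        # registered delivery server(s) and pass the client's packet on.
        entry = self.state.get(client)
        if entry is None:
            return []  # no registered session for this client
        return [("forward", s, packet) for s in entry["servers"]]
```

If Anycast routing later delivers the client's packets to a different contact server that shares the same state, that server forwards them to the delivery server chosen by the first one, so the connection is not broken.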
- Anycast may be used with or for or as part of a rendezvous system to select a delivery server to handle a client request.
- an Anycast-based system can switch servers mid-stream, which can break long running (TCP-like) connections.
- Programs that implement such methods may be stored and transmitted using a variety of media (e.g., computer readable media) in a number of manners.
- Hard-wired circuitry or custom hardware may be used in place of, or in combination with, some or all of the software instructions that can implement the processes of various embodiments.
- various combinations of hardware and software may be used instead of software only.
- FIG. 9 is a schematic diagram of a computer system 900 upon which embodiments of the present disclosure may be implemented and carried out.
- the computer system 900 includes a bus 902 (i.e., interconnect), one or more processors 904 , a main memory 906 , read-only memory 908 , removable storage media 910 , mass storage 912 , and one or more communications ports 914 .
- Communication port 914 may be connected to one or more networks by way of which the computer system 900 may receive and/or transmit data.
- a “processor” means one or more microprocessors, central processing units (CPUs), computing devices, microcontrollers, digital signal processors, or like devices or any combination thereof, regardless of their architecture.
- An apparatus that performs a process can include, e.g., a processor and those devices such as input devices and output devices that are appropriate to perform the process.
- Processor(s) 904 can be any known processor, such as, but not limited to, an Intel® Itanium® or Itanium 2® processor(s), AMD® Opteron® or Athlon MP® processor(s), or Motorola® lines of processors, and the like.
- Communications port(s) 914 can be any of an RS-232 port for use with a modem based dial-up connection, a 10/100 Ethernet port, a Gigabit port using copper or fiber, or a USB port, and the like. Communications port(s) 914 may be chosen depending on a network such as a Local Area Network (LAN), a Wide Area Network (WAN), a CDN, or any network to which the computer system 900 connects.
- the computer system 900 may be in communication with peripheral devices (e.g., display screen 916 , input device(s) 918 ) via Input/Output (I/O) port 920 .
- Main memory 906 can be Random Access Memory (RAM), or any other dynamic storage device(s) commonly known in the art.
- Read-only memory 908 can be any static storage device(s) such as Programmable Read-Only Memory (PROM) chips for storing static information such as instructions for processor 904 .
- Mass storage 912 can be used to store information and instructions.
- hard disks such as the Adaptec® family of Small Computer System Interface (SCSI) drives, an optical disc, an array of disks such as Redundant Array of Independent Disks (RAID), such as the Adaptec® family of RAID drives, or any other mass storage devices may be used.
- Bus 902 communicatively couples processor(s) 904 with the other memory, storage, and communications blocks.
- Bus 902 can be a PCI/PCI-X, SCSI, a Universal Serial Bus (USB) based system bus (or other) depending on the storage devices used, and the like.
- Removable storage media 910 can be any kind of external hard-drives, floppy drives, IOMEGA® Zip Drives, Compact Disc—Read Only Memory (CD-ROM), Compact Disc—Re-Writable (CD-RW), Digital Versatile Disk—Read Only Memory (DVD-ROM), etc.
- Embodiments herein may be provided as one or more computer program products, which may include a machine-readable medium having stored thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process.
- machine-readable medium refers to any medium, a plurality of the same, or a combination of different media, which participate in providing data (e.g., instructions, data structures) which may be read by a computer, a processor or a like device.
- Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media.
- Non-volatile media include, for example, optical or magnetic disks and other persistent memory.
- Volatile media include dynamic random access memory, which typically constitutes the main memory of the computer.
- Transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a system bus coupled to the processor. Transmission media may include or convey acoustic waves, light waves and electromagnetic emissions, such as those generated during radio frequency (RF) and infrared (IR) data communications.
- the machine-readable medium may include, but is not limited to, floppy diskettes, optical discs, CD-ROMs, magneto-optical disks, ROMs, RAMs, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing electronic instructions.
- embodiments herein may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., modem or network connection).
- data may be (i) delivered from RAM to a processor; (ii) carried over a wireless transmission medium; (iii) formatted and/or transmitted according to numerous formats, standards or protocols; and/or (iv) encrypted in any of a variety of ways well known in the art.
- a computer-readable medium can store (in any appropriate format) those program elements that are appropriate to perform the methods.
- main memory 906 is encoded with application(s) 922 that supports the functionality discussed herein (the application 922 may be an application that provides some or all of the functionality of the CD services described herein, including the client application and the optimization support mechanism 112 ).
- Application(s) 922 (and/or other resources as described herein) can be embodied as software code such as data and/or logic instructions (e.g., code stored in the memory or on another computer readable medium such as a disk) that supports processing functionality according to different embodiments described herein.
- processor(s) 904 accesses main memory 906 via the use of bus 902 in order to launch, run, execute, interpret or otherwise perform the logic instructions of the application(s) 922 .
- Execution of application(s) 922 produces processing functionality of the service related to the application(s).
- the process(es) 924 represent one or more portions of the application(s) 922 performing within or upon the processor(s) 904 in the computer system 900 .
- the application 922 itself (i.e., the un-executed or non-performing logic instructions and/or data).
- the application 922 may be stored on a computer readable medium (e.g., a repository) such as a disk or in an optical medium.
- the application 922 can also be stored in a memory type system such as in firmware, read only memory (ROM), or, as in this example, as executable code within the main memory 906 (e.g., within Random Access Memory or RAM).
- application 922 may also be stored in removable storage media 910 , read-only memory 908 and/or mass storage device 912 .
- the computer system 900 can include other processes and/or software and hardware components, such as an operating system that controls allocation and use of hardware resources.
- embodiments of the present invention include various steps or operations. A variety of these steps may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the operations. Alternatively, the steps may be performed by a combination of hardware, software, and/or firmware.
- the term “module” refers to a self-contained functional component, which can include hardware, software, firmware or any combination thereof.
- an apparatus may include a computer/computing device operable to perform some (but not necessarily all) of the described process.
- Embodiments of a computer-readable medium storing a program or data structure include a computer-readable medium storing a program that, when executed, can cause a processor to perform some (but not necessarily all) of the described process.
- process may operate without any user intervention.
- process includes some human intervention (e.g., a step is performed by or with the assistance of a human).
- the phrase “at least some” means “one or more,” and includes the case of only one.
- the phrase “at least some services” means “one or more services”, and includes the case of one service.
- the phrase “based on” means “based in part on” or “based, at least in part, on,” and is not exclusive.
- the phrase “based on factor X” means “based in part on factor X” or “based, at least in part, on factor X.” Unless specifically stated by use of the word “only”, the phrase “based on X” does not mean “based only on X.”
- the phrase “using” means “using at least,” and is not exclusive. Thus, e.g., the phrase “using X” means “using at least X.” Unless specifically stated by use of the word “only”, the phrase “using X” does not mean “using only X.”
- the phrase “distinct” means “at least partially distinct.” Unless specifically stated, distinct does not mean fully distinct. Thus, e.g., the phrase, “X is distinct from Y” means that “X is at least partially distinct from Y,” and does not mean that “X is fully distinct from Y.” Thus, as used herein, including in the claims, the phrase “X is distinct from Y” means that X differs from Y in at least some way.
- a list may include only one item, and, unless otherwise stated, a list of multiple items need not be ordered in any particular manner.
- a list may include duplicate items.
- the phrase “a list of CDN services” may include one or more CDN services.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
Abstract
Description
- This patent document contains material subject to copyright protection. The copyright owner has no objection to the reproduction of this patent document or any related materials in the files of the United States Patent and Trademark Office, but otherwise reserves all copyrights whatsoever.
- This invention relates to content delivery and content delivery networks. More specifically, this invention relates to direct server return with anycast rendezvous in content delivery networks.
- Other objects, features, and characteristics of the present invention as well as the methods of operation and functions of the related elements of structure, and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification.
- FIG. 1 depicts aspects of a content delivery network (CDN) according to exemplary embodiments hereof;
- FIGS. 2A, 2B, and 3 depict aspects of clusters of service endpoints and clustering in an exemplary CDN in accordance with exemplary embodiments hereof;
- FIGS. 4, 5A-5C, 6, and 7 depict aspects of Direct Server Return in a CDN according to exemplary embodiments hereof;
- FIG. 8 is a flowchart depicting aspects of exemplary embodiments hereof; and
- FIG. 9 depicts aspects of computing according to exemplary embodiments hereof.
- As used herein, unless used otherwise, the following terms or abbreviations have the following meanings:
- AS means autonomous system;
- BGP means border gateway protocol;
- CD means content delivery;
- CDN means content delivery network;
- DNS means domain name system;
- DSR means direct server return;
- HTTP means Hyper Text Transfer Protocol;
- HTML means Hypertext Markup Language;
- HTTPS means HTTP Secure;
- IP means Internet Protocol;
- IPv4 means Internet Protocol Version 4;
- IPv6 means Internet Protocol Version 6;
- IP address means an address used in the Internet Protocol, including both IPv4 and IPv6, to identify electronic devices such as servers and the like;
- OSI model refers to the Open Systems Interconnection model;
- SSL means Secure Sockets Layer;
- URI means Uniform Resource Identifier; and
- URL means Uniform Resource Locator.
- A “mechanism” refers to any device(s), process(es), routine(s), service(s), module(s), or combination thereof. A mechanism may be implemented in hardware, software, firmware, using a special-purpose device, or any combination thereof. A mechanism may be integrated into a single device or it may be distributed over multiple devices. The various components of a mechanism may be co-located or distributed. The mechanism may be formed from other mechanisms. In general, as used herein, the term “mechanism” may thus be considered shorthand for the term device(s) and/or process(es) and/or service(s).
- A content delivery network (CDN) distributes content (e.g., resources) efficiently to clients on behalf of one or more content providers, preferably via a public Internet. Content providers provide their content (e.g., resources) via origin sources (origin servers or origins). A CDN can also provide an over-the-top transport mechanism for efficiently sending content in the reverse direction—from a client to an origin server. Both end-users (clients) and content providers benefit from using a CDN. Using a CDN, a content provider is able to take pressure off (and thereby reduce the load on) its own servers (e.g., its origin servers). Clients benefit by being able to obtain content with fewer delays.
- FIG. 1 shows aspects of an exemplary CDN in which one or more content providers 102 provide content via one or more origin sources 104 and delivery services (servers) 106 to clients 108 via one or more networks 110. The delivery services (servers) 106 may form a delivery network from which clients 108 may obtain content. The delivery services 106 may be logically and/or physically organized hierarchically and may include edge caches. The delivery services 106 may be logically and/or physically organized as clusters and super-clusters, as described below.
- As should be appreciated, components of a CDN (e.g., delivery servers or the like) may use the CDN to deliver content to other CDN components. Thus a CDN component may itself be a client of the CDN. For example, the CDN may use its own infrastructure to deliver CDN content (e.g., CDN control and configuration information) to CDN components.
- Client requests (e.g., for content) may be associated with delivery server(s) 106 by a rendezvous system 112 comprising rendezvous mechanism(s) 114, possibly in the form of one or more rendezvous networks. The rendezvous mechanism(s) 114 may be implemented, at least in part, using or as part of a DNS system, and the association of a particular client request (e.g., for content) with one or more delivery servers may be done as part of DNS processing associated with that particular client request (e.g., of a domain name associated with the particular client request).
- Typically, multiple delivery servers 106 in the CDN can process or handle any particular client request for content (e.g., for one or more resources). Preferably the rendezvous system 112 associates a particular client request with one or more “best” or “optimal” (or “least worst”) delivery servers 106 to deal with that particular request. The “best” or “optimal” delivery server(s) 106 may be one(s) that is (are) close to the client (by some measure of network cost) and that is (are) not overloaded. Preferably the chosen delivery server(s) 106 (i.e., the delivery server(s) chosen by the rendezvous system 112 for a client request) can deliver the requested content to the client or can direct the client, somehow and in some manner, to somewhere where the client can try to obtain the requested content. A chosen delivery server 106 need not have the requested content at the time the request is made, even if that chosen delivery server 106 eventually serves the requested content to the requesting client.
- When a client 108 makes a request for content, the client may be referred to as the requesting client, and the delivery server 106 that the rendezvous system 112 associates with that client request (and that the client first contacts to make the request) may be referred to as the “contact” server or just the contact.
- Exemplary CDNs are described in U.S. Pat. Nos. 8,060,613 and 8,925,930, the entire contents of both of which have been fully incorporated herein by reference for all purposes.
- As designated intermediaries for given origin service, a CDN generally provides a redundant set of service endpoints running on distinct hardware in different locations. These distinctly addressed but functionally equivalent service endpoints provide options to the rendezvous system 112. Each distinct endpoint is preferably, but not necessarily, uniquely addressable within the system, preferably using an addressing scheme that may be used to establish a connection with the endpoint. The address(es) of an endpoint may be real or virtual. In some implementations, e.g., where service endpoints (preferably functionally equivalent service endpoints) are bound to the same cluster and share a virtual address, the virtual address may be used.
- In the case of an IP-based system, each distinct endpoint may be defined by at least one unique IP address and port number combination. In an IP-based system where service endpoints are logically bound to the same cluster and share an IP address, each distinct endpoint may be defined by at least one unique combination of the IP address and port number. In some cases, service endpoints that are logically bound to the same cluster may share a so-called VIP (virtual IP address), in which cases each distinct endpoint may be defined by at least one unique combination of the VIP and a port number. In the latter case, each distinct endpoint may be bound to exactly one physical cluster in the CDN.
- It should be appreciated that not all service types will require or have multi-agent logical clusters. In such cases, the endpoint may be defined in terms of a real address rather than a virtual address (e.g., an IP address rather than a VIP). A virtual address may, in some cases, correspond to or be a physical address. For example, a VIP may be (or correspond to) a physical address (e.g., for a single machine cluster).
- The term VIP is used in this description as an example of a virtual address (for an IP-based system). In general any kind of virtual addressing scheme may be used and is contemplated herein. Unless specifically stated otherwise, the term VIP is intended as an example of a virtual address, and the system is not limited to or by IP-based systems or systems with IP addresses and/or VIPs.
- It should be appreciated that, as used herein, e.g., to describe endpoints in a cluster, the term “functionally equivalent” does not require identical service endpoints. For example, two caching endpoint services may have different capabilities yet may be considered to be functionally equivalent.
- As shown, e.g., in FIG. 2A, service endpoints SEP 1, SEP 2 . . . SEP n are logically bound to the same cluster 200 and share an address. When a logical cluster is within a physical cluster (e.g., when the services are on machines behind a switch), the shared address may be a virtual address (e.g., a VIP).
- A physical cluster of service endpoints may have one or more logical clusters of service endpoints. For example, as shown in FIG. 2B, a physical cluster 202 includes two logical clusters (Logical Cluster 1 and Logical Cluster 2). Logical Cluster 1 consists of two machines (M0, M1), and Logical Cluster 2 consists of three machines (M2, M3, M4). The machines in each logical cluster may share a heartbeat signal (HB) with other machines in the same logical cluster. In this example, the first logical cluster may be addressable by a first unique virtual address (address # 1, e.g., a first VIP/port combination), whereas the second logical cluster may be addressable by a second unique virtual address (address # 2, e.g., a second VIP/port combination).
- In a typical case, a machine may only be part of a single logical cluster; although it should be appreciated that this is not a requirement.
- The machines that share a heartbeat signal may be said to be on a heartbeat ring. In the example cluster shown in FIG. 2B, machines M0 and M1 are on the same heartbeat ring, and machines M2, M3, and M4 are on the same heartbeat ring.
- When a service endpoint is bound to a cluster, it means that a bank of equivalent services is running on all the machines in the cluster and listening for service requests addressed to that cluster endpoint address. Preferably a local mechanism (e.g., a load-balancing mechanism) ensures that exactly one service instance (e.g., machine) in the cluster will respond to each unique service request. This may be accomplished, e.g., by consistently hashing attributes of each request to exactly one of the available machines (and of course it is impossible to have more than one service instance listening per machine on the same endpoint). Each service instance running on machines in the cluster can be listening to any number of other endpoint addresses, each of which will have corresponding service instances running on all other machines in the cluster. Those of ordinary skill in the art will realize and understand, upon reading this description, that various mechanisms may be used to allocate/distribute service requests to service instances in a cluster. It should be appreciated that not all types of services need use the same allocation/distribution mechanisms, and that not all clusters of the same kind of service need use the same allocation/distribution mechanisms.
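One way to hash request attributes consistently to exactly one machine is rendezvous (highest random weight) hashing, sketched below. This technique is chosen here for illustration and is not stated in the source to be the mechanism used; the function name and the form of the request key are assumptions.

```python
import hashlib


def pick_machine(machines, request_key):
    """Map a request to exactly one machine, independent of list order.

    Each machine scores the request with a hash of (machine, request);
    the machine with the highest score handles it. Every machine can
    compute the same answer locally, with no coordinator.
    """
    if not machines:
        raise ValueError("no machines available")

    def score(machine):
        digest = hashlib.sha256(f"{machine}|{request_key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(machines, key=score)
```

With this scheme, removing a machine that was not chosen for a request leaves that request's assignment unchanged. In a DSR setting, the request key would be built from elements of the actual client request rather than from the address of the forwarding contact server.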
- In some preferred implementations, each machine is installed in a physical cluster of machines behind a single shared switch. One physical cluster may be divided up into multiple logical clusters, where each logical cluster consists of those machines on the same physical cluster that are part of the same HB ring. That is, each machine runs an HB process with knowledge of the other machines in the same logical cluster, monitoring all virtual addresses (e.g., VIPs) and updating the local firewall and NIC (network interface card/controller) configurations in order to implement local load balancing across the cluster.
- U.S. Pat. No. 8,015,298 titled “Load-Balancing Cluster,” (the entire contents of which are fully incorporated herein by reference for all purposes) describes various approaches to ensure that exactly one service instance in a cluster will respond to each unique service request. In a first allocation approach, service endpoints on the same HB ring select from among themselves to process service requests. In a second allocation approach, also for service endpoints on the same HB ring, having selected a service endpoint from among themselves to process service requests, the selected service endpoint may select another service endpoint (preferably from service endpoints on the same HB ring) to actually process the service request. This handoff may be made based on, e.g., the type of request or actual content requested.
- Since, in some cases, each machine may be considered to be a peer of all other machines in the cluster, there is no need for any other active entity specific to the cluster.
- A subcluster is a group of one or more (preferably homogenous) machines sharing an internal, local area network (LAN) address space, possibly load-balanced, each running a group of one or more collaborating service instances. To external clients, i.e., those not connected to the internal LAN of the subcluster, the collection of service instances is addressed as a single service image, meaning that individual externally visible physical addresses can be used to communicate with all machines in the subcluster, though usually one at a time.
- Service instances within the subcluster's internal LAN address space can preferably address each other with internal or external LAN addresses, and may also have the ability to transfer connections from one machine to another in the midst of a single session with an external client, without the knowledge or participation of the client.
- A supercluster is a group of one or more (preferably homogenous) subclusters, each consisting of a group of one or more collaborating but distinctly addressed service images. Different service images in the same supercluster may or may not share a common internal LAN (although it should be appreciated that they still have to be able to communicate, directly or indirectly, with each other over some network). Those connected to the same internal LAN may use internal LAN addresses or external LAN addresses, whereas others must use external network addresses to communicate with machines in other subclusters.
- Clusters may be interconnected in arbitrary topologies to form subnetworks. The set of subnetworks a service participates in, and the topology of those networks, may be dynamic, constrained by dynamically changing control policies based on dynamically changing information collected from the network itself, and measured by the set of currently active communication links between services.
- An example showing the distinction between physical clusters, logical subclusters, and logical superclusters is shown in
FIG. 3 . In this example, the machines of two physical clusters A and B are subdivided into groups forming logical subclusters R, S, and T (from the machines of physical cluster A) and logical subclusters X, Y, and Z (from the machines of physical cluster B). These subclusters are then logically recombined to form logical superclusters I (from subclusters R and S), J (from subclusters T and X), and K (from subclusters Y and Z). The number of machines that may be combined into one subcluster is limited by the number of machines in a physical cluster, but theoretically any number of logical subclusters may be grouped into one supercluster that may span multiple physical clusters or be contained within one. - For some preferred implementations, a two-level cluster architecture is assumed, where machines behind a common switch are grouped into logical sub-clusters, and sub-clusters (whether behind the same switch or on different racks/switches) are grouped into super-clusters. In some preferred implementations, using, e.g., the systems described in U.S. Pat. No. 8,015,298 titled “Load-Balancing Cluster,” all machines in a logical sub-cluster are homogeneous with respect to the virtual address (e.g., VIPs) they serve (each machine serves the same virtual addresses—VIPs—as all other machines in the sub-cluster), and machines in distinct logical clusters will necessarily serve distinct (non-overlapping) sets of virtual addresses—VIPs.
- A single switch may govern multiple sub-clusters and these sub-clusters need not be in the same super-cluster. It is logically possible to have any number of machines in one sub-cluster, and any number of sub-clusters in a super-cluster, though those of ordinary skill in the art will realize and understand that physical and practical realities will dictate otherwise.
- Other features described in U.S. Pat. No. 8,015,298 could be made available as an optional feature of sub-clusters, enabling the transfer of connections from one machine to another in the same sub-cluster.
- U.S. Pat. No. 8,015,298 describes various approaches to ensure that exactly one service instance in a cluster will respond to each unique service request. These may be referred to as the first allocation approach and the second allocation approach. In the first allocation approach, service endpoints on the same HB ring select from among themselves to process service requests. In the second allocation approach, also for service endpoints on the same HB ring, having selected a service endpoint from among themselves to process service requests, the selected service endpoint may select another service endpoint (preferably from service endpoints on the same HB ring) to actually process the service request. This handoff may be made based on, e.g., the type of request or actual content requested.
- It is assumed here that for some implementations an additional level of heartbeat-like functionality (referred to herein as super-HB) exists at the level of virtual addresses (e.g., VIPs) in a super-cluster, detecting virtual addresses that are down and configuring them on machines that are up. This super-HB allows the system to avoid relying solely on DNS-based rendezvous for fault-tolerance and to deal with the DNS-TTL phenomenon that would cause clients with stale IP addresses to continue to contact VIPs that are known to be down. It should be appreciated that a super-HB system may have to interact with the underlying network routing mechanism (simply bringing a VIP “up” does not mean that requests will be routed to it properly). For example, if a sub-cluster is to take over another sub-cluster's VIP because the second sub-cluster is completely down or has lost enough capacity that the system will consider it to be down, the routing infrastructure is preferably informed that the VIP has moved to a different switch. As noted earlier, while this discussion is made with reference to VIPs, it should be appreciated that the system is not limited to an IP-based scheme, and any type of addressing and/or virtual addressing may be used.
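- The super-HB behavior described above (detecting virtual addresses that are down and configuring them on machines that are up) can be sketched as follows. This is a simplified illustration under stated assumptions: the least-loaded placement rule, the function name `reassign_down_vips`, and the sub-cluster labels are hypothetical, and the notification of the routing infrastructure is represented only by a comment.

```python
def reassign_down_vips(vip_owner: dict[str, str], alive: set[str]) -> dict[str, str]:
    """Return a new VIP -> sub-cluster map in which every VIP owned by a
    sub-cluster that is not alive is moved to a live sub-cluster.

    Assumption: the live sub-cluster currently holding the fewest VIPs
    takes over; ties are broken by name so the result is deterministic.
    """
    if not alive:
        raise RuntimeError("no live sub-cluster can take over")

    new_owner = dict(vip_owner)
    load = {sc: 0 for sc in alive}
    for sc in vip_owner.values():
        if sc in alive:
            load[sc] += 1

    for vip in sorted(vip_owner):
        if vip_owner[vip] not in alive:
            target = min(sorted(load), key=load.get)
            new_owner[vip] = target
            load[target] += 1
            # A real system would also inform the routing infrastructure
            # here that the VIP has moved to a different switch.
    return new_owner


owners = {"VIP1": "R", "VIP2": "S", "VIP3": "S", "VIP4": "T"}
after = reassign_down_vips(owners, alive={"R", "T"})
# after == {"VIP1": "R", "VIP2": "R", "VIP3": "T", "VIP4": "T"}
```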
- Heartbeat(s) provide a way for machines (or service endpoints) in the same cluster (logical and/or physical and/or super) to know the state of other machines (or service endpoints) in the cluster, and heartbeat(s) provide information to the various allocation techniques. A heartbeat and super-heartbeat may be implemented, e.g., using the reducer/collector systems such as described in U.S. Pat. No. 8,925,930. However, those of ordinary skill in the art will realize and understand, upon reading this description, that a local heartbeat in a physical cluster is preferably implemented locally and with a fine granularity. A super-heartbeat may not have (or need) the granularity of a local heartbeat.
- This leads to two extreme approaches to configuring a super-cluster, one relying on the first allocation approach described above (with reference to U.S. Pat. No. 8,015,298), with optional super-HB, the other with super-HB and optional first allocation approach:
- A super-cluster containing N>1 sub-clusters with >1 machines: first allocation approach required, second allocation approach optional. A super-HB is unnecessary.
- A super-cluster containing N>1 sub-clusters with 1 machine each: first allocation approach not required, second allocation approach not supported. This requires a super-HB.
- Depending on the overhead of the first allocation approach and the fail-over properties of virtual address (e.g., VIP) reconfiguration and rendezvous, it may be advantageous to actually configure a super-cluster somewhere in between these two extremes. On the one hand, the first allocation approach system described in U.S. Pat. No. 8,015,298 provides the most responsive failover at the cost of higher communication overhead. This overhead determines an effective maximum number of machines and VIPs in a single logical sub-cluster based on the limitations of the heartbeat protocol. The first allocation approach mechanisms described in U.S. Pat. No. 8,015,298 also impose additional overhead beyond that of heartbeat due to the need to broadcast and filter request traffic. On the other hand, a VIP-level failover mechanism that spans the super-cluster would impose similar heartbeat overhead but would not require any request traffic broadcasting or filtering.
- Detection of down VIPs in the cluster may potentially be handled without a heartbeat, using a reduction of log events received outside the cluster. A feedback control mechanism could detect inactive VIPs and reallocate them across the cluster by causing new VIP configurations to be generated as local control resources.
- Request-Response Processing
- As described, a particular client request for content (e.g., for a resource) uses the
rendezvous system 112 to determine an appropriate delivery server 106 to handle the request. That appropriate delivery server 106 effectively becomes the contact server for that request. - U.S. patent application Ser. No. 15/364,036, filed Nov. 29, 2016, describes cross-cluster direct server return in a content delivery network (CDN). As described in patent application Ser. No. 15/364,036, a so-called initial contact (IC) server may serve requested content or may transfer a request to a “better” server using direct server return (DSR). The entire contents of patent application Ser. No. 15/364,036 are fully incorporated herein by reference for all purposes.
- With reference now to
FIG. 4, in embodiments hereof, certain delivery servers 106 may act primarily (or even solely) as contact servers 116. The contact servers 116 of exemplary embodiments hereof may thus act only as contacts and do not also serve content. Thus, when a contact server 116 is contacted by a client and receives a client request, the contact server selects another delivery server 106 to handle the request. The request may then be handled by the “better” server using direct server return (DSR), e.g., as described in U.S. application Ser. No. 15/364,036. - The
contact servers 116 may form a network 118 of contact servers. Although shown in FIG. 4 as a logical subset of the delivery servers 106, since, in preferred embodiments, the contact servers 116 do not also serve content, they may be considered a separate set of servers. However, as noted, when DSR is used to serve a client, the client is unaware that the contact server 116 is not actually serving the content to the client. As far as the client is concerned, the contact server 116 is a delivery server. - In preferred embodiments hereof, one or more groups of
contact servers 116 within the same autonomous system (AS) have the same IP address. E.g., as shown in FIG. 4, the contact servers 116-1, . . . in autonomous system AS-1 have the same IP address, namely IP1, the contact servers 116-2, . . . in autonomous system AS-2 have the same IP address, namely IP2, and so on. There may be multiple groups of contact servers in a particular AS. Note that not all contact servers in an AS need have the same IP address, and contact servers in other autonomous systems may have the same IP address. - With reference to
FIG. 5A, according to exemplary embodiments hereof, a client's request for content is directed, by the rendezvous system 112, to a first or initial contact server to handle the request. The client 208 is provided with an IP address (e.g., IP1) for the initial contact server, e.g., by the rendezvous system 112. Recall that a group of multiple contact servers in the same AS may have the same IP address. For the purposes of this explanation, assume that contact servers 116-A and 116-C are in the same AS and have the same IP address (IP1). Using anycast, the client 208 initially connects to one of the contact servers 116 with the IP address IP1. For the sake of this description, and without loss of generality, assume that the client initially connects to contact server 116-A. The contact server 116-A to which the client 208 initially connects then selects a delivery server (e.g., 106-B) from the delivery servers 106, and then uses direct server return (DSR) (e.g., as described in U.S. application Ser. No. 15/364,036) to process the client request. - The contact servers, including initial contact server 116-A, may use
state information 500 to maintain and/or look up information about the connection between the client 208 and the request being handled by the contact server 116-A. For example, the state information 500 may include a reference table 502 (FIG. 5B) that maps client IP addresses and port numbers to corresponding request information. As shown in FIG. 5C, the request information may include the IP address of the delivery server that is serving the client, the requested URL (or some identification of or information about the requested resource), and other miscellaneous information (e.g., connection information and the like). Effectively, the first contact server registers the association between the client and the delivery server that is actually serving the client. - The state information is preferably accessible to other contact servers, at least in the same AS (or AS group) as the contact server 116-A.
State information 500 may be provided using, e.g., a collector and/or reducer network of the CDN (e.g., as described in U.S. Pat. Nos. 8,060,613 and 8,925,930, the entire contents of both of which have been fully incorporated herein by reference for all purposes). - The connection between the
client 208 and the contact server 116-A (as opposed to another contact server with the same IP address and in the same autonomous system (AS) as contact server 116-A) is made by or based on BGP tables. In other words, although multiple contact servers in the same AS (or AS group) have the same IP address, only one such contact server will be used, based, e.g., on network load, traffic, BGP tables, etc. As such, if network conditions change during the connection (as reflected by changes to the BGP tables), then the client 208 may be directed, during the processing of the request, to a different contact server with the same IP address (e.g., to contact server 116-C). The second (or subsequent) contact server 116-C needs to continue processing the request with the client 208 and the delivery server 106-B where the first (or previous) contact server 116-A/delivery server 106-B left off. - In order to continue processing the client request, the second (or subsequent) contact server 116-C needs to determine which
delivery server 106 is serving the client 208. To do this, the contact server 116-C may, e.g., query the reference table 502 in the state information 500 to determine the identity of server 106-B. The identity may be recorded in the state information 500 as any unique identifier. In some embodiments, the IP address (IP-B) of server 106-B may be used to uniquely identify the server in the state information. Recall that the reference table 502 maps the client's IP address (and port number), inter alia, to the address of the delivery server 106 that is handling the client request. The contact server 116-C may use the IP address (IP-C) of the client 208 to look up information in the table 502 in the state information 500. - The client request may then be processed using DSR with contact server 116-C, delivery server 106-B, and
client 208. Since the contact servers 116-A and 116-C have the same IP address (IP1), the client is unaware of any change. However, instead of the DSR having contact server 116-A act as a pass-through proxy, that role is taken by contact server 116-C. Similarly, the delivery server 106-B may be unaware of the change in contact servers. - When the request processing is complete, the
state information 500 should be updated to remove the information about the request. This can be done by having the entry for the request removed by the last contact server 116 handling the request. Effectively, the last contact server handling the request un-registers the association between the client and the delivery server. - The IP address IP-B may correspond to a multi-machine cluster, in which case, the DSR migrated request (from the contact server) may be handled by server 106-B or by any machine in
cluster 120, in accordance with that cluster's request processing policies and protocols. The network address that the contact server uses for server 106-B may be a VIP for the cluster 120 or a VIP for server 106-B or an IP address of server 106-B. When the address is a VIP for the cluster 120, then the cluster may choose a delivery server 106 to handle the request. In the case of the IP-B being a cluster, elements of the request may be used to determine which server within that cluster will process the request. Note, however, that such determining needs to use the elements of the actual client request (rather than, e.g., the LAN address of the IC machine(s)). - After the initial contact server hands off the request to server 106-B, using direct server return (DSR) the contact server essentially acts as a router for that request. While the handoff (from contact server to delivery server 106-B) is transparent to the
client 208, in TCP/IP communication with the delivery server 106-B, the client must get the same IP address as the initial contact server (IP1). Therefore the delivery server 106-B must spoof the IP address of the contact server on a per connection basis (unless the delivery server 106-B has the same public IP address as the contact server, e.g., in an anycast system). - The Open Systems Interconnection model (OSI model) is a conceptual model that characterizes and standardizes the communication functions of a telecommunication or computing system without regard to their underlying internal structure and technology. The OSI model partitions a communication system into abstraction layers. The original version of the model defined seven layers, including:
-
- Layer 3 (Network layer—packets) Structuring and managing a multi-node network, including addressing, routing and traffic control (e.g., AppleTalk, ICMP, IPsec, IPv4, IPv6)
- Layer 4 (Transport layer) Segments (e.g. TCP)/Datagrams (e.g., UDP)
- Layer 5 (Session layer—Data): Managing communication sessions, i.e. continuous exchange of information in the form of multiple back-and-forth transmissions between two nodes (e.g., HTTP, HTTPS)
- After the initial client request to the contact server 116-A (at
Layer 5, the HTTP level), the contact server becomes a Layer 3/4 pass-through router in only one direction (from the client to the contact server to the delivery server) for that client request. Thus the contact server changes from a Layer 5 session/application layer (e.g. HTTP) server and becomes a Layer 3/4 router. The initial contact server is thereby converted into a routing device for that particular client connection. In the case of an HTTPS request/connection, the contact server/delivery server may not be able to communicate sufficient state to have the SSL handshake performed by the contact server (so that the request could be inspected by the contact server) and then have the delivery server continue the encryption of the responses. In such cases, the contact server may perform a delivery server selection based on just load and/or client location and then forward the connection as soon as the connection has been established. That is, in such cases, the contact server may function as a Layer 3/4 pass-through immediately on connection establishment. - The
client 208 establishes a connection (e.g., a TCP/IP connection) with the contact server 116-A and makes a request (e.g., an HTTP request) to the contact server 116-A. The contact server 116-A migrates the TCP connection to the delivery server 106-B. The contact server 116-A freezes the connection with the client and determines the required TCP state information (e.g., sequence numbers, etc.), and conveys that information to the delivery server 106-B over some protocol (e.g., TCP), preferably over a side-channel, possibly using tunneling. The delivery server 106-B then constructs the socket and starts sending the packets back (to the client 208). - Every time the
client 208 sends an ACK (for the pieces of the TCP packet stream that it receives from the delivery server 106-B), that ACK is still going to come back to the contact server 116-A. The contact server 116-A then provides those ACKs to the delivery server 106-B. - Thus, contact server 116-A starts at layer 5 (HTTP) with its connections with the client. Once the handoff is made to delivery server 106-B, contact server 116-A effectively becomes a
layer 3/4 device (router) and forwards layer 3/4 information (e.g., ACKs) from the client to the delivery server 106-B. The contact server 116-A will still receive the layer 3/4 and layer 5 information (e.g., HTTP) from the client 208, but this information is forwarded to the delivery server 106-B. Note that the contact server 116-A may examine layer 3/4 and layer 5 information, e.g., for tracking purposes or the like, but is not required to do so. - As shown in
FIG. 6 , the first request(s) to the first contact server from the client are handled by the first contact server at the application (HTTP) layer, whereas after the handoff to delivery server, subsequent requests are preferably handled by the first contact server (and subsequent contact servers) at the TCP layer. - As will be appreciated, the use of an initial contact server (IC) and then DSR with a delivery server may introduce delays compared to a hypothetical direct TCP/IP connection between delivery server and the client. There may, e.g., be a delay added by the extra time T1 from the client to contact server and T2 from contact server to delivery server. The DSR migration to delivery server may potentially impact the performance of the overall throughput of the session because the path (for the TCP round trip time) is potentially being lengthened. There is also a delay in serving the initial response from delivery server as opposed to serving the response directly from contact server. The handoff has potential for making some aspects of the response to the client worse than if the response had been served directly from contact server. These potential delays, etc. can be taken into account when selecting a delivery server.
- Thus, once responsibility for the request has been transferred from a contact server to the delivery server 106-B, the contact server passes TCP packets from the
client 208 to the delivery server 106-B. These packets are transferred at the TCP (layer 4) level, and the contact server 116 need not examine them. The delivery server 106-B obtains the TCP packets from the client (via the contact server 116) and processes the client request. From the client's perspective it has a TCP connection with the contact server 116-A. - Preferably the chosen delivery server (or the chosen delivery server cluster) handles the request and does not, itself, pass on the request to yet another “better” server. While such processing is possible and contemplated herein, it is likely to introduce unacceptable delays.
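- The handoff described above, in which the contact server conveys the required TCP state to the delivery server and then acts only as a layer 3/4 pass-through, can be sketched as follows. This is an illustrative model, not a working TCP implementation: the field names, the JSON encoding of the side-channel message, and the class `PassThrough` are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TcpHandoffState:
    """State the delivery server would need to continue the connection.
    All field names are hypothetical."""
    client_ip: str
    client_port: int
    contact_ip: str          # address the client believes it is talking to (e.g., IP1)
    contact_port: int
    next_seq_to_send: int    # next sequence number to use toward the client
    next_seq_expected: int   # next sequence number expected from the client
    mss: int
    request: str             # the layer 5 request already read by the contact server


def encode_handoff(state: TcpHandoffState) -> bytes:
    """Serialize the state for the side channel (assumed here to carry JSON)."""
    return json.dumps(asdict(state)).encode()


def decode_handoff(payload: bytes) -> TcpHandoffState:
    return TcpHandoffState(**json.loads(payload))


class PassThrough:
    """Contact-server side: layer 5 until handoff, layer 3/4 forwarding after."""

    def __init__(self) -> None:
        self.forward_to: dict[tuple[str, int], str] = {}

    def hand_off(self, state: TcpHandoffState, delivery_ip: str) -> bytes:
        self.forward_to[(state.client_ip, state.client_port)] = delivery_ip
        return encode_handoff(state)

    def on_client_packet(self, client_ip: str, client_port: int, segment: bytes):
        """Return (action, destination, segment) for a packet from the client."""
        delivery_ip = self.forward_to.get((client_ip, client_port))
        if delivery_ip is None:
            return ("handle_locally", None, segment)
        return ("forward", delivery_ip, segment)


state = TcpHandoffState("198.51.100.9", 40000, "192.0.2.1", 80,
                        next_seq_to_send=5001, next_seq_expected=901,
                        mss=1460, request="GET /video.mp4 HTTP/1.1")
contact = PassThrough()
before = contact.on_client_packet("198.51.100.9", 40000, b"GET ...")
wire = contact.hand_off(state, delivery_ip="203.0.113.20")
after_handoff = contact.on_client_packet("198.51.100.9", 40000, b"ACK")
```

On the delivery-server side, `decode_handoff(wire)` would yield the values needed to construct the socket and to send packets whose source address is the contact server's address.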
- A
contact server 116 may, in some cases, be capable of serving the requested content and may sometimes serve requested content to a client. Those of ordinary skill in the art will realize and appreciate, upon reading this description, that embodiments of the system are preferably symmetric, in that the delivery server may, itself, be an initial contact for some client requests and may include the same DSR migration capabilities as contact server. Similarly, contact server may be a “better” server for some other initial contact and may have a client connection DSR migrated to it. - Picking a Delivery Server
- Assuming that the contact servers do not serve content, the
contact server 116 that first receives a client request must select a delivery server 106 to handle the client request. The contact server may make this determination based on information associated with the request, at least some of which may be information that was not (or may not have been) known to (or knowable by) the rendezvous system 112 at the time that the contact server was selected by the rendezvous system. This information may include one or more of: -
- (1) the requesting client's network (IP) address,
- (2) customer information (e.g., the CDN customer with which the requested content is associated, e.g., based on property information);
- (3) size of the requested content;
- (4) kind of the requested content;
- (5) serving policy associated with the requested content (e.g., based on property information);
- (6) media player needed or used for the requested content;
- (7) type of client's device; and
- (8) load at the contact server (if contact servers serve content).
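- A selection based on some of the request information listed above can be sketched as follows. This is a hypothetical scoring rule for illustration only: the cost function, its weights, the 10 MB threshold, and the notion of a client "region" derived from the client's IP address are assumptions and are not specified in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryServer:
    name: str
    region: str
    load: float                    # 0.0 (idle) to 1.0 (saturated)
    content_kinds: frozenset       # kinds of content this server can serve


@dataclass(frozen=True)
class RequestInfo:
    client_region: str             # assumed to be derived from the client IP address
    content_kind: str
    size_bytes: int


def choose_delivery_server(req: RequestInfo, servers: list) -> DeliveryServer:
    """Pick the eligible server with the lowest (hypothetical) cost."""
    eligible = [s for s in servers
                if req.content_kind in s.content_kinds and s.load < 1.0]
    if not eligible:
        raise LookupError("no delivery server can handle this request")

    def cost(s: DeliveryServer) -> float:
        distance_penalty = 0.0 if s.region == req.client_region else 1.0
        # Assumption: load matters more for large objects.
        load_weight = 2.0 if req.size_bytes >= 10_000_000 else 1.0
        return distance_penalty + load_weight * s.load

    return min(eligible, key=cost)


servers = [
    DeliveryServer("106-B", "eu", 0.9, frozenset({"video"})),
    DeliveryServer("106-X", "eu", 0.2, frozenset({"video"})),
    DeliveryServer("106-Y", "us", 0.1, frozenset({"video", "image"})),
]
video_choice = choose_delivery_server(RequestInfo("eu", "video", 50_000_000), servers)
image_choice = choose_delivery_server(RequestInfo("eu", "image", 1_000), servers)
```

In this example the video request goes to 106-X (same region, lightly loaded), while the image request goes to 106-Y because it is the only server that serves that kind of content.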
- For some of the server selection approaches it is preferable to have an equivalent of the intra-cluster heartbeat process in order to know which machines are online. This may be implemented, at least in part, using the super-HB described above. As noted above, a heartbeat (e.g., a cross-cluster heartbeat) may be implemented, e.g., using the reducer/collector systems such as described in U.S. Pat. No. 8,925,930.
- As noted above, the DSR migration is transparent to the client, and so the client must see the requested content coming from the same address as the contact server (which is where the client thinks it is coming from). To this end, the delivery server must spoof the IP address of the contact server on a per connection basis unless the delivery server has the same IP address as the contact server, e.g., in an anycast system in which all potential contact servers and delivery servers may have the same IP address.
- Those of ordinary skill in the art will realize and appreciate, upon reading this description, that the contact servers and delivery servers should be in the same autonomous system (AS) in order for the DSR migration to function; otherwise source-filter routing may filter out packets.
- In some cases, the contact servers may be dedicated appliances that do not serve content and essentially act as a second level HTTP-level rendezvous mechanism.
- As explained above, after the initial contact (IC) server establishes the connection and the DSR connections with the delivery server 106-B and the
client 208, subsequent contact servers 116 should use the same delivery server 106-B for any DSR sessions from the same client 208 for the same resource. As noted, subsequent contact servers 116 may use state information 500 that the first contact server registered about the client and delivery server (e.g., in table 502). - If the contact server can, itself, also deliver content, then the contact server must decide whether or not to migrate the request to a “better” delivery server. Such decisions may be made, e.g., as described in U.S. patent application Ser. No. 15/364,036, which has been fully incorporated herein by reference for all purposes.
- In some embodiments the contact server may choose more than one delivery server to handle the request. In such cases the contact server establishes multiple DSR connections with multiple delivery servers. For example, as shown in
FIG. 7, the contact server 116-A is selected by the client 208 to handle a request. The contact server selects delivery servers 106-X and 106-Y (e.g., using the rendezvous system). The contact server 116-A then establishes two separate DSR connections with the client and each of the two delivery servers 106-X and 106-Y. Although only two delivery servers are shown in the example in FIG. 7, those of ordinary skill in the art will appreciate and understand, upon reading this description, that more than two delivery servers may be used, along with a corresponding number of DSR connections. - Further, the contact server 116-A may be a contact server that can serve content, or it may be a server that acts solely as a contact server.
- This approach assumes that each selected delivery server (delivery servers 106-X and 106-Y) has identical (and possibly correct) copies of the requested content. The approach requires that the packets sent by each delivery server be identical so that the
client 208 can accept and use whichever packet(s) it receives first. In this regard, in particular, the MSS (TCP Maximum Segment Size) for each delivery server must be the same. - Using multiple delivery servers provides redundancy and may provide improved delivery times (over a single delivery server or fewer delivery servers). However, as should be appreciated, there are costs (including overhead) of using multiple delivery servers. Accordingly, the decision as to how many delivery servers to use may be made based on policies (e.g., quality of service guarantees, etc.).
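- The requirement that each delivery server send identical packets, and in particular use the same MSS, can be illustrated with a small model. This sketch is a simplification under stated assumptions: segments are modeled as (sequence number, payload) pairs, sequence-number wraparound is ignored, and the function names are hypothetical.

```python
def segment(payload: bytes, initial_seq: int, mss: int) -> list:
    """Split payload into (sequence number, data) pairs of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [(initial_seq + off, payload[off:off + mss])
            for off in range(0, len(payload), mss)]


def reassemble(arrivals: list) -> bytes:
    """Client side: keep whichever copy of each segment arrives first."""
    first_seen = {}
    for seq, data in arrivals:
        first_seen.setdefault(seq, data)
    return b"".join(first_seen[seq] for seq in sorted(first_seen))


content = bytes(range(256)) * 20            # 5120 bytes of identical content
from_x = segment(content, 1000, mss=1460)   # delivery server 106-X
from_y = segment(content, 1000, mss=1460)   # delivery server 106-Y, same MSS
from_z = segment(content, 1000, mss=536)    # a server with a different MSS

# Same content, same starting sequence number, same MSS: identical segments,
# so copies from either server are interchangeable at the client.
interleaved = from_y[:2] + from_x + from_y[2:]
received = reassemble(interleaved)
```

Here `from_x == from_y`, so the interleaved arrivals reassemble to the original content, whereas the segments in `from_z` have different boundaries and could not simply be substituted for them.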
- With reference to the flowchart in
FIG. 8, in exemplary embodiments hereof, a client request is received (at 802) by a first contact server (e.g., contact server 116-A in a contact server group or network 118, FIG. 4). The client request was directed to the first contact server (116-A) using an IP address (e.g., IP1) associated with that contact server and using Anycast routing mechanisms. For this embodiment, assume that the contact server does not also serve content and so must migrate the request. The first contact server (116-A) then chooses (at 804) one or more delivery servers (106) and migrates the request (at 806) to each of the selected delivery servers. As part of the request migration (at 806), the first contact server updates the state information (500 in FIG. 5A, 502 in FIG. 5B) to reflect the selected delivery server(s) 106 and other request information (e.g., a URL associated with the client request). Then, for each delivery server, (at 808) the first contact server (116-A) passes TCP data packets (ACKs) from the client to the delivery server(s). - If the Anycast routing mechanisms cause the client request to switch from the first contact server 116-A to a different contact server 116-C, then (at 810) the second contact server 116-C determines the delivery server(s) 106 (e.g. using the state/registration information 500). The second contact server 116-C then (at 812) passes TCP data packets (ACKs) from the client to the delivery server(s).
- Although only one switch is shown (from contact server 116-A to the second contact server 116-C), those of ordinary skill in the art will appreciate and understand, upon reading this description that multiple contact server switches may occur.
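- The flow of FIG. 8, including a switch between contact servers that share an anycast address, can be sketched as follows. This is a minimal simulation under stated assumptions: a plain in-memory dictionary stands in for the shared state information 500 (which the description says may be provided by a collector and/or reducer network), and the class, method, and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RequestRecord:
    delivery_ip: str   # address of the delivery server handling the request
    url: str           # requested URL


class ContactServer:
    def __init__(self, name: str, shared_state: dict,
                 choose: Callable[[str], str]) -> None:
        self.name = name
        self.state = shared_state   # stand-in for reference table 502
        self.choose = choose        # delivery server selection policy

    def on_request(self, client: tuple, url: str) -> str:
        """Steps 802-806: receive the request, choose a server, register it."""
        delivery_ip = self.choose(url)
        self.state[client] = RequestRecord(delivery_ip, url)
        return delivery_ip

    def on_packet(self, client: tuple, packet: bytes) -> tuple:
        """Steps 808 / 810-812: look up the delivery server and forward."""
        record = self.state.get(client)
        if record is None:
            raise KeyError(f"no registered request for {client}")
        return (record.delivery_ip, packet)

    def on_complete(self, client: tuple) -> None:
        """Un-register the client/delivery-server association."""
        self.state.pop(client, None)


shared = {}
server_a = ContactServer("116-A", shared, choose=lambda url: "IP-B")
server_c = ContactServer("116-C", shared, choose=lambda url: "IP-B")
client = ("198.51.100.9", 40000)     # client IP address and port

chosen = server_a.on_request(client, "/video.mp4")
via_a = server_a.on_packet(client, b"ACK1")
# Anycast routing now delivers the client's packets to 116-C instead.
via_c = server_c.on_packet(client, b"ACK2")
server_c.on_complete(client)
```

Both `via_a` and `via_c` name the same delivery server, which is what lets the switch go unnoticed by the client and by the delivery server; the last contact server handling the request removes the entry.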
- Anycast may be used with, for, or as part of a rendezvous system to select a delivery server to handle a client request. However, an Anycast-based system can switch servers mid-stream, which can break long-running (TCP-like) connections.
- As described here, these problems with Anycast are removed or ameliorated, avoiding interrupted service.
- The services, mechanisms, operations and acts shown and described above are implemented, at least in part, by software running on one or more computers of a CDN.
- Programs that implement such methods (as well as other types of data) may be stored and transmitted using a variety of media (e.g., computer readable media) in a number of manners. Hard-wired circuitry or custom hardware may be used in place of, or in combination with, some or all of the software instructions that can implement the processes of various embodiments. Thus, various combinations of hardware and software may be used instead of software only.
- One of ordinary skill in the art will readily appreciate and understand, upon reading this description, that the various processes described herein may be implemented by, e.g., appropriately programmed general purpose computers, special purpose computers and computing devices. One or more such computers or computing devices may be referred to as a computer system.
- FIG. 9 is a schematic diagram of a computer system 900 upon which embodiments of the present disclosure may be implemented and carried out.
- According to the present example, the computer system 900 includes a bus 902 (i.e., interconnect), one or more processors 904, a main memory 906, read-only memory 908, removable storage media 910, mass storage 912, and one or more communications ports 914. Communication port 914 may be connected to one or more networks by way of which the computer system 900 may receive and/or transmit data.
- As used herein, a “processor” means one or more microprocessors, central processing units (CPUs), computing devices, microcontrollers, digital signal processors, or like devices or any combination thereof, regardless of their architecture. An apparatus that performs a process can include, e.g., a processor and those devices such as input devices and output devices that are appropriate to perform the process.
- Processor(s) 904 can be any known processor, such as, but not limited to, an Intel® Itanium® or Itanium 2® processor(s), AMD® Opteron® or Athlon MP® processor(s), or Motorola® lines of processors, and the like. Communications port(s) 914 can be any of an RS-232 port for use with a modem-based dial-up connection, a 10/100 Ethernet port, a Gigabit port using copper or fiber, or a USB port, and the like. Communications port(s) 914 may be chosen depending on a network such as a Local Area Network (LAN), a Wide Area Network (WAN), a CDN, or any network to which the computer system 900 connects. The computer system 900 may be in communication with peripheral devices (e.g., display screen 916, input device(s) 918) via Input/Output (I/O) port 920.
- Main memory 906 can be Random Access Memory (RAM), or any other dynamic storage device(s) commonly known in the art. Read-only memory 908 can be any static storage device(s) such as Programmable Read-Only Memory (PROM) chips for storing static information such as instructions for processor 904. Mass storage 912 can be used to store information and instructions. For example, hard disks such as the Adaptec® family of Small Computer System Interface (SCSI) drives, an optical disc, an array of disks such as Redundant Array of Independent Disks (RAID), such as the Adaptec® family of RAID drives, or any other mass storage devices may be used.
- Bus 902 communicatively couples processor(s) 904 with the other memory, storage, and communications blocks. Bus 902 can be a PCI/PCI-X, SCSI, a Universal Serial Bus (USB) based system bus (or other) depending on the storage devices used, and the like. Removable storage media 910 can be any kind of external hard-drives, floppy drives, IOMEGA® Zip Drives, Compact Disc Read-Only Memory (CD-ROM), Compact Disc Re-Writable (CD-RW), Digital Versatile Disc Read-Only Memory (DVD-ROM), etc.
- Embodiments herein may be provided as one or more computer program products, which may include a machine-readable medium having stored thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process. As used herein, the term “machine-readable medium” refers to any medium, a plurality of the same, or a combination of different media, which participate in providing data (e.g., instructions, data structures) which may be read by a computer, a processor or a like device. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks and other persistent memory. Volatile media include dynamic random access memory, which typically constitutes the main memory of the computer. Transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a system bus coupled to the processor. Transmission media may include or convey acoustic waves, light waves and electromagnetic emissions, such as those generated during radio frequency (RF) and infrared (IR) data communications.
- The machine-readable medium may include, but is not limited to, floppy diskettes, optical discs, CD-ROMs, magneto-optical disks, ROMs, RAMs, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing electronic instructions. Moreover, embodiments herein may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., modem or network connection).
- Various forms of computer readable media may be involved in carrying data (e.g. sequences of instructions) to a processor. For example, data may be (i) delivered from RAM to a processor; (ii) carried over a wireless transmission medium; (iii) formatted and/or transmitted according to numerous formats, standards or protocols; and/or (iv) encrypted in any of a variety of ways well known in the art.
- A computer-readable medium can store (in any appropriate format) those program elements that are appropriate to perform the methods.
- As shown, main memory 906 is encoded with application(s) 922 that support the functionality discussed herein (the application 922 may be an application that provides some or all of the functionality of the CD services described herein, including the client application and the optimization support mechanism 112). Application(s) 922 (and/or other resources as described herein) can be embodied as software code such as data and/or logic instructions (e.g., code stored in the memory or on another computer readable medium such as a disk) that supports processing functionality according to different embodiments described herein.
- During operation of one embodiment, processor(s) 904 accesses main memory 906 via the use of bus 902 in order to launch, run, execute, interpret or otherwise perform the logic instructions of the application(s) 922. Execution of application(s) 922 produces processing functionality of the service related to the application(s). In other words, the process(es) 924 represent one or more portions of the application(s) 922 performing within or upon the processor(s) 904 in the computer system 900.
- It should be noted that, in addition to the process(es) 924 that carries (carry) out operations as discussed herein, other embodiments herein include the application 922 itself (i.e., the un-executed or non-performing logic instructions and/or data). The application 922 may be stored on a computer readable medium (e.g., a repository) such as a disk or in an optical medium. According to other embodiments, the application 922 can also be stored in a memory type system such as in firmware, read only memory (ROM), or, as in this example, as executable code within the main memory 906 (e.g., within Random Access Memory or RAM). For example, application 922 may also be stored in removable storage media 910, read-only memory 908 and/or mass storage device 912.
- Those skilled in the art will understand that the computer system 900 can include other processes and/or software and hardware components, such as an operating system that controls allocation and use of hardware resources.
- As discussed herein, embodiments of the present invention include various steps or operations. A variety of these steps may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the operations. Alternatively, the steps may be performed by a combination of hardware, software, and/or firmware. The term “module” refers to a self-contained functional component, which can include hardware, software, firmware or any combination thereof.
- One of ordinary skill in the art will readily appreciate and understand, upon reading this description, that embodiments of an apparatus may include a computer/computing device operable to perform some (but not necessarily all) of the described process.
- Embodiments of a computer-readable medium storing a program or data structure include a computer-readable medium storing a program that, when executed, can cause a processor to perform some (but not necessarily all) of the described process.
- Where a process is described herein, those of ordinary skill in the art will appreciate that the process may operate without any user intervention. In another embodiment, the process includes some human intervention (e.g., a step is performed by or with the assistance of a human).
- As used herein, including in the claims, the phrase “at least some” means “one or more,” and includes the case of only one. Thus, e.g., the phrase “at least some services” means “one or more services”, and includes the case of one service.
- As used herein, including in the claims, the phrase “based on” means “based in part on” or “based, at least in part, on,” and is not exclusive. Thus, e.g., the phrase “based on factor X” means “based in part on factor X” or “based, at least in part, on factor X.” Unless specifically stated by use of the word “only”, the phrase “based on X” does not mean “based only on X.”
- As used herein, including in the claims, the phrase “using” means “using at least,” and is not exclusive. Thus, e.g., the phrase “using X” means “using at least X.” Unless specifically stated by use of the word “only”, the phrase “using X” does not mean “using only X.”
- In general, as used herein, including in the claims, unless the word “only” is specifically used in a phrase, it should not be read into that phrase.
- As used herein, including in the claims, the phrase “distinct” means “at least partially distinct.” Unless specifically stated, distinct does not mean fully distinct. Thus, e.g., the phrase, “X is distinct from Y” means that “X is at least partially distinct from Y,” and does not mean that “X is fully distinct from Y.” Thus, as used herein, including in the claims, the phrase “X is distinct from Y” means that X differs from Y in at least some way.
- As used herein, including in the claims, a list may include only one item, and, unless otherwise stated, a list of multiple items need not be ordered in any particular manner. A list may include duplicate items. For example, as used herein, the phrase “a list of CDN services” may include one or more CDN services.
- It should be appreciated that the words “first” and “second” in the description and claims are used to distinguish or identify, and not to show a serial or numerical limitation. Similarly, letter or numerical labels (such as “(a)”, “(b)”, and the like) are used to help distinguish and/or identify, and not to show any serial or numerical limitation or ordering.
- No ordering is implied by any of the labeled boxes in any of the flow diagrams unless specifically shown and stated. When disconnected boxes are shown in a diagram, the activities associated with those boxes may be performed in any order, including fully or partially in parallel.
- While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (26)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/991,545 US20200374341A1 (en) | 2017-10-09 | 2020-08-12 | Cross-cluster direct server return with anycast rendezvous in a content delivery network (cdn) |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/727,765 US10749945B2 (en) | 2017-10-09 | 2017-10-09 | Cross-cluster direct server return with anycast rendezvous in a content delivery network (CDN) |
US16/991,545 US20200374341A1 (en) | 2017-10-09 | 2020-08-12 | Cross-cluster direct server return with anycast rendezvous in a content delivery network (cdn) |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/727,765 Continuation US10749945B2 (en) | 2017-10-09 | 2017-10-09 | Cross-cluster direct server return with anycast rendezvous in a content delivery network (CDN) |
Publications (1)
Publication Number | Publication Date |
---|---|
US20200374341A1 (en) | 2020-11-26 |
Family ID: 65994132
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/727,765 Active 2037-11-02 US10749945B2 (en) | 2017-10-09 | 2017-10-09 | Cross-cluster direct server return with anycast rendezvous in a content delivery network (CDN) |
US16/991,545 Abandoned US20200374341A1 (en) | 2017-10-09 | 2020-08-12 | Cross-cluster direct server return with anycast rendezvous in a content delivery network (cdn) |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/727,765 Active 2037-11-02 US10749945B2 (en) | 2017-10-09 | 2017-10-09 | Cross-cluster direct server return with anycast rendezvous in a content delivery network (CDN) |
Country Status (2)
Country | Link |
---|---|
US (2) | US10749945B2 (en) |
WO (1) | WO2019074552A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110324163B (en) * | 2018-03-29 | 2020-11-17 | 华为技术有限公司 | Data transmission method and related device |
US10797734B2 (en) * | 2018-08-01 | 2020-10-06 | Commscope Technologies Llc | System with multiple virtual radio units in a radio unit that is remote from at least one baseband controller |
US10848427B2 (en) * | 2018-11-21 | 2020-11-24 | Amazon Technologies, Inc. | Load balanced access to distributed endpoints using global network addresses and connection-oriented communication session handoff |
US10855580B2 (en) | 2019-03-27 | 2020-12-01 | Amazon Technologies, Inc. | Consistent route announcements among redundant controllers in global network access point |
US10972554B1 (en) | 2019-09-27 | 2021-04-06 | Amazon Technologies, Inc. | Management of distributed endpoints |
US11552898B2 (en) * | 2019-09-27 | 2023-01-10 | Amazon Technologies, Inc. | Managing data throughput in a distributed endpoint network |
US11425042B2 (en) | 2019-09-27 | 2022-08-23 | Amazon Technologies, Inc. | Managing data throughput in a distributed endpoint network |
US11394636B1 (en) | 2020-12-10 | 2022-07-19 | Amazon Technologies, Inc. | Network connection path obfuscation using global access points |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080091812A1 (en) * | 2006-10-12 | 2008-04-17 | Etai Lev-Ran | Automatic proxy registration and discovery in a multi-proxy communication system |
US20100332595A1 (en) * | 2008-04-04 | 2010-12-30 | David Fullagar | Handling long-tail content in a content delivery network (cdn) |
US20120184258A1 (en) * | 2010-07-15 | 2012-07-19 | Movik Networks | Hierarchical Device type Recognition, Caching Control & Enhanced CDN communication in a Wireless Mobile Network |
US20120203825A1 (en) * | 2011-02-09 | 2012-08-09 | Akshat Choudhary | Systems and methods for ntier cache redirection |
US8291046B2 (en) * | 1998-02-10 | 2012-10-16 | Level 3 Communications, Llc | Shared content delivery infrastructure with rendezvous based on load balancing and network conditions |
US20130238759A1 (en) * | 2012-03-06 | 2013-09-12 | Cisco Technology, Inc. | Spoofing technique for transparent proxy caching |
US8688808B1 (en) * | 2009-04-06 | 2014-04-01 | Sprint Communications Company L.P. | Assignment of domain name system (DNS) servers |
US20140164584A1 (en) * | 2012-12-07 | 2014-06-12 | Verizon Patent And Licensing Inc. | Selecting a content delivery network |
US20150222732A1 (en) * | 2012-08-08 | 2015-08-06 | Sagemcom Broadband Sas | Device and method for providing services in a communication network |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6745243B2 (en) | 1998-06-30 | 2004-06-01 | Nortel Networks Limited | Method and apparatus for network caching and load balancing |
US7272640B1 (en) * | 2000-12-08 | 2007-09-18 | Sun Microsystems, Inc. | Dynamic network session redirector |
US7587500B2 (en) | 2001-01-10 | 2009-09-08 | Xcelera | Distributed selection of a content server |
CA2411806A1 (en) * | 2001-11-16 | 2003-05-16 | Telecommunications Research Laboratory | Wide-area content-based routing architecture |
US9167036B2 (en) | 2002-02-14 | 2015-10-20 | Level 3 Communications, Llc | Managed object replication and delivery |
US8838830B2 (en) | 2010-10-12 | 2014-09-16 | Sap Portals Israel Ltd | Optimizing distributed computer networks |
US20130173806A1 (en) | 2011-12-31 | 2013-07-04 | Level 3 Communications, Llc | Load-balancing cluster |
US9705754B2 (en) * | 2012-12-13 | 2017-07-11 | Level 3 Communications, Llc | Devices and methods supporting content delivery with rendezvous services |
US10212238B2 (en) | 2013-05-15 | 2019-02-19 | Level 3 Communications, Llc | Selecting a content providing server in a content delivery network |
EP3213222B1 (en) * | 2014-10-27 | 2021-03-24 | Level 3 Communications, LLC | Content delivery systems and methods |
WO2016194973A1 (en) * | 2015-06-04 | 2016-12-08 | 日本電気株式会社 | Local congestion determination method and congestion control device for mobile body communication |
WO2017147250A1 (en) | 2016-02-23 | 2017-08-31 | Level 3 Communications, Llc | Systems and methods for content server rendezvous in a dual stack protocol network |
- 2017-10-09: US application US15/727,765, published as US10749945B2 (Active)
- 2018-05-31: WO application PCT/US2018/035397, published as WO2019074552A1 (Application Filing)
- 2020-08-12: US application US16/991,545, published as US20200374341A1 (Abandoned)
Also Published As
Publication number | Publication date |
---|---|
US20190109899A1 (en) | 2019-04-11 |
US10749945B2 (en) | 2020-08-18 |
WO2019074552A1 (en) | 2019-04-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20200374341A1 (en) | Cross-cluster direct server return with anycast rendezvous in a content delivery network (cdn) | |
US11349912B2 (en) | Cross-cluster direct server return in a content delivery network (CDN) | |
US10855580B2 (en) | Consistent route announcements among redundant controllers in global network access point | |
US10848427B2 (en) | Load balanced access to distributed endpoints using global network addresses and connection-oriented communication session handoff | |
EP2798513B1 (en) | Load-balancing cluster | |
US10972554B1 (en) | Management of distributed endpoints | |
US9942153B2 (en) | Multiple persistant load balancer system | |
US11451477B2 (en) | Load balanced access to distributed endpoints | |
CN116489157B (en) | Management of distributed endpoints | |
US11425042B2 (en) | Managing data throughput in a distributed endpoint network | |
EP3032803B1 (en) | Providing requested content in an overlay information centric networking (o-icn) architecture | |
US11552898B2 (en) | Managing data throughput in a distributed endpoint network | |
US11431577B1 (en) | Systems and methods for avoiding duplicate endpoint distribution |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: LEVEL 3 COMMUNICATIONS, LLC, COLORADO; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: NEWTON, CHRISTOPHER; REEL/FRAME: 053502/0288; Effective date: 20171107 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
AS | Assignment | Owner name: SANDPIPER CDN, LLC, DELAWARE; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: LEVEL 3 COMMUNICATIONS, LLC; REEL/FRAME: 067772/0171; Effective date: 20240531 |
AS | Assignment | Owner name: SANDPIPER CDN, LLC, DELAWARE; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: LEVEL 3 COMMUNICATIONS, LLC; REEL/FRAME: 068256/0091; Effective date: 20240531 |