EP1311957A2 - Bandwidth savings and quality-of-service improvement for WWW sites by caching static and dynamic content in a distributed network of caches

Bandwidth savings and quality-of-service improvement for WWW sites by caching static and dynamic content in a distributed network of caches

Info

Publication number
EP1311957A2
EP1311957A2 EP01956739A EP01956739A EP1311957A2 EP 1311957 A2 EP1311957 A2 EP 1311957A2 EP 01956739 A EP01956739 A EP 01956739A EP 01956739 A EP01956739 A EP 01956739A EP 1311957 A2 EP1311957 A2 EP 1311957A2
Authority
EP
European Patent Office
Prior art keywords
cache
ttl
content
page
internet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01956739A
Other languages
German (de)
English (en)
Inventor
Shmuel Melamed
Yves Bigio
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eplication Networks Ltd
Original Assignee
Eplication Networks Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/957: Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574: Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Definitions

  • the invention relates to the storage and transport of information on an internet and, more particularly, to caching Web-page content on the Internet.
  • the "Internet" (upper case "I") is a global internet, which is a plurality of inter-connected networks, all using a common protocol known as TCP/IP (Transmission Control Protocol/Internet Protocol).
  • TCP/IP Transmission Control Protocol/Internet Protocol
  • the World Wide Web (“WWW”, or simply “Web”) is a set of standards and protocols that enable the exchange of information between computers, more particularly hypertext (HTTP) servers, on the Internet, tying them together into a vast collection of interactive multimedia resources. Traffic across the Internet passes through multiple "backbone” carriers, such as UUNet or Digex, which forward the signals to one another under a system called peering.
  • QoS Quality of Service
  • a content source typically a server, or an Internet Service Provider, ISP
  • ISP Internet Service Provider
  • Bandwidth is a measurement of the volume of information that can be transmitted over a network (or a given portion thereof) at a given time. The higher the bandwidth, the more data can be transported. Latency is a measure of the time that it takes for a packet of data to traverse the Internet, from source to user.
  • Caching is a means of storing content objects from a Web server closer to the user, where they can be retrieved more quickly.
  • the storehouse of objects, including text pages, images, and other content, is called a Web cache (pronounced cash)." (Goulde, page 2)
  • Caching is an approach that improves QoS for users while improving the economics for network service providers. Caching entails storing frequently accessed Web content closer to users, thereby reducing the number of hops that must be traversed in order to retrieve content that resides on a remote site.
  • requests for content originate from a user's browser and are first directed to the caching server. If the content is currently contained in the cache and is up to date, the content is sent to the user without the browser's request having to be sent on to the originating server. This can reduce the latency for the transfer of content and reduce the amount of time it takes to transfer the content to the user. It also reduces the amount of data the service provider has to retrieve upstream from the Internet to fulfill requests. (Goulde, page 8)
  • Caching is a concept that is well understood in computer hardware design. Modern microprocessors employ on-chip caches of high-speed memory to store frequently used instructions and data locally instead of having to read them from slower memory. Computers implement several additional levels of caching, including RAM cache and even on-disk cache, all designed to reduce the latency for reading instructions and data and speeding up the transfer of data within the system.
  • the basic principle behind caching is to store frequently accessed data in a location that can be accessed more quickly than it could from the data's more permanent location. In the case of a system's CPU, the permanent location is the disk. On the Web, the permanent location is the origin server somewhere out on the Internet. (Goulde, page 8-9)
  • caching provides direct benefits to the end user in terms of reduced latency of Web page download (the time you have to wait before anything starts to happen) and faster download speeds (the time it takes for all the downloading to finish). But caching also provides benefits to network service providers. By storing, or caching, Web content that users request from remote servers locally, fewer of these requests have to be sent out over the Internet to be fulfilled. As a result, the access provider maintaining a cache has its upstream bandwidth requirements reduced. This reduces the bandwidth the service provider has to purchase in order to provide its customers with a satisfactory
  • Caching also provides several benefits in a Web hosting environment in which an access provider hosts Web sites for many customers. It can reduce the impact of traffic spikes caused by high-interest content and also serve as the basis for a variety of value-added services, such as providing additional capacity, guaranteed service levels, and replication services. (Goulde, page 3)
  • [Caching] network data makes it possible to provide users with an optimal experience and predictable response times. With typical cache hit rates, the user experience has a higher quality of service (QoS). Improved QoS provides significant benefits to ISPs, content providers, and corporate sites. Better QoS results in higher customer loyalty and retention. It helps create a stronger brand equity, both for the access provider and for content providers. Content providers who ensure that their content is amply cached throughout the network, close to users, will ultimately see more visitors accessing their content. (Goulde, page 5)
  • Web browsers also implement a form of caching, storing recently accessed Web content on the user's hard disk and reading from that local cache of files instead of accessing the Internet. This works well when the user hits the "Back" and "Forward" buttons in their browser during a session, but it does nothing if the user is browsing a site for the first time. (Goulde, page 9) "Neither the browser's cache nor a Web server's cache can address network performance issues. By placing a cache of Web content on the network between the user and the originating Web sites, the distance that commonly accessed content has to travel over the Internet is reduced, and users experience quicker response and faster performance. Network caching takes advantage of the fact that some content is accessed more frequently than other content.
  • the content may also be a dynamically created page that is generated from a search engine, a database query, or a Web application.
  • the HTTP server returns the requested content to the Web browser one file at a time.
  • Even a dynamically created page often has static components that are combined with the dynamic content to create the final page. (Goulde, page 13) "When caching is used, frequently accessed content is stored close to the user.
  • NAP Network Access Point
  • POP Point of Presence
  • Network caching can be applied to content delivered over many different protocols. These include HTTP, NNTP, FTP, RTSP, and others. All are characterized by having some proportion of static content and high utilization. Cache server support for each protocol is, of course, required.” (page 14)
  • The Goulde article discusses the effective use of caches, load balancing, where to locate caches in the infrastructure, and designing cache-friendly web content. There is also mention of protocols which have been developed, for example, Web Cache Control Protocol (WCCP) (page 18). There is also discussion of appropriate use of the "expire" and "max-age" headers in the HTTP protocol (Goulde, page 27). And, as expressly stated by Goulde,
  • Cache Server A highly optimized application that stores frequently accessed content at strategic aggregation points close to the users requesting that content in order to reduce the impact of delays and network bottlenecks.
  • CARP Cache Array Routing Protocol A protocol for synchronizing multiple cache servers.
  • CARP maintains a shared namespace that maps any Web object's address (URL) to only one node in the array. Requests are routed to that node.
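The mapping of each URL to exactly one node in the array can be illustrated with a short sketch. This is not the hash function defined by the CARP specification; it is a minimal highest-score-wins scheme using SHA-256 as a stand-in, and the node names and the `route` helper are hypothetical.

```python
import hashlib

def route(url: str, nodes: list[str]) -> str:
    """Return the single node responsible for `url`.

    Each (node, url) pair gets a score; the node with the highest score
    owns the URL. SHA-256 is an assumption, not the CARP hash.
    """
    if not nodes:
        raise ValueError("cache array is empty")

    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}|{url}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(nodes, key=score)

# Hypothetical array of three cache servers.
ARRAY = ["cache-a", "cache-b", "cache-c"]
owner = route("http://www.domain.com/index.html", ARRAY)
```

Because every node computes the same scores, any member of the array can decide where a request belongs without consulting its peers, and removing a node that does not own a URL leaves that URL's owner unchanged.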
  • URL Web object's address
  • Cookie The most common meaning of "Cookie” on the Internet refers to a piece of information sent by a Web Server to a Web Browser that the Browser software is expected to save and to send back to the Server whenever the browser makes additional requests from the Server.
  • the Browser may accept or not accept the Cookie, and may save the Cookie for either a short time or a long time.
  • Cookies might contain information such as login or registration information, online "shopping cart” information, user preferences, etc..
  • When a Server receives a request from a Browser that includes a Cookie, the Server is able to use the information stored in the Cookie. For example, the Server might customize what is sent back to the user, or keep a log of a particular user's requests. Cookies are usually set to expire after a predetermined amount of time and are usually saved in memory until the Browser software is closed down, at which time they may be saved to disk if their "expire time" has not been reached.
  • Dynamic Content Live content which is updated on a regular basis.
  • Examples of dynamic content might include a "current temperature” display on a weather web site, search results, or a "Current Top Headlines” item on a news web site.
  • HTTP Server A server that implements the HTTP protocol, enabling it to serve Web pages to client agents (browsers).
  • HTTP Servers support interfaces so that Web pages can call external programs. They also support encryption mechanisms for securely exchanging information and authentication and access control mechanisms to control access to content.
  • ICP Internet Cache Protocol A protocol for synchronizing multiple cache servers. Each time a cache server experiences a miss, it broadcasts messages to all peer nodes asking whether each has the content. The requesting server then must issue a request for the content and forward it on to the user.
  • Proxy Server acts as an intermediary between a user and the Internet so an enterprise can ensure security and administrative control and also provide a caching service.
  • a proxy server is usually associated with or part of a gateway server that separates the enterprise network from the outside network or a firewall that protects the enterprise network from outside intrusion.
  • Routers are the devices that build a fully interconnected network out of a collection of point-to-point links. Routers on the Internet exchange information pertaining to their local section of the network, particularly how close they are topologically to local systems. They collectively build a map of how to get from any point in the Internet to any other. Packets are routed based on the exchanged mapping information, until the last router connects directly to the target system.
  • Static Content "Fixed" or long-term unchanging components of web pages stored as . files that are either never changed or are changed only on an infrequent basis.
  • Switches High-speed network devices that typically sit on the periphery of the Internet. Switches differ from routers in providing higher performance at a lower price but with limited functionality. Typical switches can route traffic locally but aren't concerned with complexities of routing found in the high-speed Internet backbone. Switches play an important role in caching because they are often used to divert the cacheable traffic to the caching system.
  • HTML Hypertext Markup Language A specification based on Standard Generalized Markup Language (SGML) for tagging text so that it may be displayed in a user agent (browser) in a standard way.
  • HTTP Hypertext Transfer Protocol An application-level protocol that runs on top of TCP/IP, which is the foundation for the World Wide Web.
  • IP Internet Protocol The network layer for the TCP/IP protocol suite. It is a connectionless, best-effort, packet-switching protocol.
  • IP Address A 32-bit address defined by the Internet Protocol that is usually represented in decimal notation. It uniquely identifies each computer on the Internet.
  • Protocol An agreed-upon set of technical rules by which computers exchange information.
  • URL Uniform Resource Locator The method by which Internet sites are addressed. It includes an access protocol and either an IP address or DNS name. An example is http://www.domain.com.
  • NNTP Network News Transfer Protocol
  • Web Server See HTTP Server.
  • An object of the invention is to provide a technique for reducing bandwidth usage of WWW servers and improving the QoS of WWW sites.
  • a technique for caching objects having dynamic content on the Internet generally comprises disposing a cache in the Internet for storing and updating copies of dynamic content.
  • the cache may be disposed at a location selected from the group consisting of major Internet switching locations, dial-in aggregation points, and corporate gateways.
  • the cache may also store and update copies of static content.
  • update characteristics of the objects are determined, and a time to live (TTL) parameter for the objects is adjusted based upon the update characteristics.
  • TTL time to live
  • the object is updated if its TTL is less than its age.
  • the TTL for an object may be adjusted to:
  • a method of responding to a user request for an object having dynamic content comprises storing a copy of the object in a cache; establishing a time to live (TTL) for the object; receiving the user request at the cache; fulfilling the user request with the stored copy of the object if its TTL is greater than its age; and fetching an updated copy of the object and responding to the user request with the updated copy if the TTL of the stored copy is less than its age.
  • the TTL for the object is first set to a reasonable lower limit (Tmin) and is then adjusted based on the frequency at which the object actually changes.
  • Tmin a reasonable lower limit
  • each time the cache fetches the object from the server the cache performs the following procedures: a. if another fetch for the same object is ongoing, waiting for the previous fetch to complete; b. fetching the object from the server; c. replacing the cached copy, if present, by the fetched object, after having compared them to determine whether the object had changed since it was last fetched; d. initializing or updating the object's change statistics accordingly; e.
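Steps a through d of the fetch procedure can be sketched as follows. This is an illustrative outline only, not the patent's implementation: the `Fetcher` class, the `fetch_from_server` callable and the shape of the statistics record are all hypothetical, and the step that follows "e." in the source is not covered.

```python
import threading

class Fetcher:
    def __init__(self, fetch_from_server):
        self._fetch = fetch_from_server  # hypothetical callable: url -> bytes
        self._lock = threading.Lock()
        self._in_flight = {}             # url -> Event set when the fetch ends
        self.copies = {}                 # url -> cached bytes
        self.stats = {}                  # url -> {"fetches": int, "changes": int}

    def refresh(self, url):
        with self._lock:
            pending = self._in_flight.get(url)
            if pending is None:
                done = threading.Event()
                self._in_flight[url] = done
        if pending is not None:
            # Step a: another fetch is ongoing, so wait for it instead of
            # issuing a duplicate request, then use whatever it stored.
            pending.wait()
            return self.copies.get(url)
        try:
            body = self._fetch(url)      # Step b: fetch from the server.
            with self._lock:
                old = self.copies.get(url)
                changed = old is not None and old != body
                self.copies[url] = body  # Step c: compare, then replace.
                record = self.stats.setdefault(url, {"fetches": 0, "changes": 0})
                record["fetches"] += 1   # Step d: update change statistics.
                record["changes"] += int(changed)
            return body
        finally:
            with self._lock:
                del self._in_flight[url]
            done.set()                   # Wake any waiting requests.
```

The change counter is what a TTL adjustment would later read; how the TTL is derived from it is a separate policy decision.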
  • the caches are dedicated machines and are placed so that Web browsing passes through the cache instead of going all the way to the original sites, in many different locations, ideally within the network of ISPs providing the Internet connectivity to the highest number of users in those locations. In this manner:
  • the cache reloads a page (fetches an object from the server), whenever its corresponding cached copy has not been refreshed for a given time (time to live, "TTL").
  • TTL can be thought of as the "shelf life" of the page.
  • the cache first sets the TTL for a dynamic object to a reasonable lower limit Tmin. Then, over time, as the cache reloads the page several times, it can track the frequency at which the page actually changes content, and adjust the TTL for that page accordingly.
  • This "TTL" technique mimics a common caching method for static pages, if the original server of the page specifies a TTL. But since servers for dynamic pages do not specify a TTL for them, the cache has to establish a reasonable TTL of its own accord.
  • the system can adapt, in real time, according to the number of requests to each page and the actual update frequency of the page.
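The TTL behaviour described above (start at a lower limit, then adapt to the observed change frequency) might be sketched like this. The adjustment rule is an assumption made for illustration: the TTL is set to a fixed fraction of the observed mean time between changes, doubled while no change has been seen, and clamped to a range. The class names, the `fraction` parameter and the default limits are hypothetical and do not come from the source.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    body: bytes
    fetched_at: float  # time of the last fetch
    first_seen: float  # time of the first fetch
    ttl: float
    changes: int = 0   # fetches that returned different content

class AdaptiveTtlCache:
    def __init__(self, fetch, t_min=5.0, t_max=3600.0, fraction=0.5):
        self.fetch = fetch  # hypothetical callable: url -> bytes
        self.t_min, self.t_max, self.fraction = t_min, t_max, fraction
        self.entries = {}

    def get(self, url, now):
        entry = self.entries.get(url)
        # Fresh: TTL is equal to or greater than the copy's age.
        if entry is not None and entry.ttl >= now - entry.fetched_at:
            return entry.body
        body = self.fetch(url)
        if entry is None:
            # First sight of the object: start at the lower limit.
            self.entries[url] = Entry(body, now, now, self.t_min)
            return body
        if body != entry.body:
            entry.changes += 1
        entry.body, entry.fetched_at = body, now
        if entry.changes:
            # Assumed rule: a fraction of the mean time between changes.
            mean_interval = (now - entry.first_seen) / entry.changes
            target = self.fraction * mean_interval
        else:
            # Assumed rule: no change observed yet, so lengthen the TTL.
            target = entry.ttl * 2
        entry.ttl = min(self.t_max, max(self.t_min, target))
        return body
```

Passing `now` explicitly keeps the sketch deterministic; a real cache would read a clock.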
  • rectangular boxes generally represent a sequential step being performed
  • an empty circle is not a step or a test, but is merely a graphical junction point at which two or more paths in the flowchart converge.
  • Figure 1 is a greatly simplified schematic illustration of the Internet, illustrating an embodiment of the caching system of the present invention
  • Figure 2 is a flowchart illustrating how user requests for static and/or dynamic content are handled by the cache, according to the invention
  • Figure 3 is a graph illustrating an average error probability, according to an analysis of the invention.
  • Figure 4 is a graph illustrating the evolution of error probability, according to an analysis of the invention.
  • FIG. 1 is a greatly simplified schematic illustration of the Internet environment, illustrating an embodiment of the caching system of the present invention.
  • a user 102 makes a "request” for an "object” (e.g., a Web page) which is made available on the Internet by a server (e.g., ISP) 104.
  • the object typically originates at a content provider, not shown.
  • a switch 106 interfaces a cache (cache server) 108 to the Internet.
  • the cache may contain a copy of the Web page.
  • There are two possible "responses" to the user request: either the server "serves" (or "services") the request, or it is "fulfilled" in the cache. In the latter case, the content must first have been "transferred" to the cache, which may periodically "fetch" (or "reload") updated Web page content from the server.
  • the switch 106 may be a high-performance processor that can look at network traffic and make routing decisions based on protocols above the IP level. As a result, the switch can direct HTTP (and other) traffic to caches (108), and send the rest of the traffic directly to the Internet.
  • Caches can be deployed in either a transparent or a nontransparent form.
  • a nontransparent cache is explicitly visible, and browsers or other caches that use the cache are overtly configured to direct traffic to the cache. In this case, the cache acts as a proxy agent for the browser, fulfilling requests when possible and forwarding requests to the origin server when necessary.
  • Nontransparent caches are often a component of a larger proxy server acting as part of a gateway or firewall and addressing many different applications.
  • a transparent cache sits in the network flow and functions invisibly to a browser.
  • a transparent configuration is preferred because it minimizes the total administrative and support burden of supporting users in configuring their browsers to find the cache.
  • Caches should be implemented transparently to maximize the benefits of caching.
  • a nontransparent implementation requires having browsers manually configured to direct their requests for content to the cache server.
  • cache benefits are delivered to clients without having to reconfigure the browser. Users automatically gain the benefits of caching. (Goulde, page 17)
  • the cache can be configured as if it were a router so that all Internet-based traffic is aimed at it. This is a transparent configuration that requires no configuration of the browser; the browser or downstream cache is unaware of the cache's existence but still benefits from it.
  • the downside is that the system on which the cache resides has to devote some of its resources to routing, and the cache becomes a mission-critical part of the network. Sophisticated router configuration with policy-based routing can minimize some of these issues by only directing HTTP (TCP Port 80) traffic to the cache, bypassing the cache in the event of failure and sending traffic directly to the Internet. (Goulde, page 17)
  • An increasingly popular option is to use a Layer 4 switch to interface the cache to the Internet (see Illustration 4).
  • switches are high-performance processors that can look at network traffic and make routing decisions based on protocols above the IP level.
  • the switch can direct HTTP (and other) traffic to the caches and send the rest of the traffic directly to the Internet ...
  • the switch can parse the HTTP request and send the request to a specific node in a cache farm based on the URL requested.
  • Using an intelligent switch keeps unnecessary network traffic off the cache, simplifies designing for availability, and distributes loading on the cache farm based on specific URLs. (Goulde, page 18) (An architecture similar to this one is described hereinabove with respect to Figure 1)
  • WCCP Web Cache Control Protocol
  • WPAD Web Proxy Autodiscovery Protocol
  • Caching systems can be used to optimize the performance of a Web server site as well as to speed Internet access for Web browser users.
  • the caching system sits in front of one or more Web servers, intercepting traffic to those servers and standing in, or proxying, for one or more of the servers.
  • Cache servers can be deployed throughout a network, creating a distributed network for hosted content.
  • the proxy cache server will request dynamic and other short-lived content from the origin servers. This enables content from the site to be served from a local cache instead of from the origin server.
  • the proxy server can be optimized for high performance, efficient operation, conserving resources, and off-loading the origin server from serving static content.
  • Reverse proxy caching provides benefits to the access provider as well as to the user. Those benefits include the ability to enable load balancing, provide peak-demand insurance to assure availability, and provide dynamic mirroring of content for high availability. (Goulde, page 20)
  • the cache of the present invention can be located anywhere that there is (or could be) a cache serving static content, or it can be incorporated into an existing cache which fulfills requests for static content, with the additional functionality enabled according to the techniques set forth below. Or, it can be provided as a separate, dedicated machine (computer).
  • Figure 2 is a flowchart illustrating how user requests for static and/or dynamic content are handled by the cache.
  • In a first step 202, for a user request for an object, the cache determines whether the requested object is in the cache. If not (N), the user request is passed on to the server for servicing the request and meanwhile, in a step 204, the cache fetches the object from the server in anticipation of the next request for the object from the same or another user.
  • a cache server which is transparent to the user. It intercepts information requests and decides whether it will provide a response from a cached local copy or from a remote information source. After fetching information from a remote source, the cache server decides whether to store it locally, and if so, for how long.
  • a request for information which can be provided from a local copy is known as a "cache hit”.
  • a request for information which is not stored locally is known as a "cache miss".
  • If the storage determination algorithm is well designed, the probability of a cache hit is greatly improved, and apparent response time to user requests (related to QoS) is reduced. Further, every information request satisfied by locally cached content (cache hit) reduces traffic on the external network, permitting shorter response times over the external network.
  • If the requested object is in the cache, it is next determined in a step 206 whether the requested object is marked as static. If so (Y), it is then determined in a step 208 whether to update the cached copy or to use it to fulfill the user request using any suitable standard algorithm for caching static objects, such as comparing the object's "age" (the time elapsed since it was last refreshed) to the TTL (if the original server of the page specifies a
  • If the requested object is in the cache, and it is dynamic (N, step 206), it is determined in a step
  • If the cached copy's TTL is less than (a lower number than) its age (Y, step 210), it is considered to be "stale", and in a step 212 the cache:
  • fetches the object from the original server. If the cached copy's TTL is equal to or greater than its age (N, step 210), it is considered to be "fresh", and in a step 214 the cache:
  • fulfills the request using the cached copy; and updates the object's access statistics.
  • If the time difference between the cached copy's age and its TTL is less than a given time, and the number of recent user requests is more than a given rate, the cached copy is considered to be "aged" and "popular", and the cache fetches the object from the server in what is termed an "anticipated refresh".
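The stale, fresh and anticipated-refresh cases can be summarised in one small decision function. This is a sketch under stated assumptions: the function name, the parameter names and the string labels are hypothetical, and `margin` and `popular_rate` stand for the "given time" and "given rate" that the text leaves unspecified.

```python
def classify(age, ttl, recent_request_rate, margin, popular_rate):
    """Decide how the cache should treat a dynamic object's cached copy.

    age, ttl and margin are in the same time unit; the two rates are in
    requests per time unit.
    """
    if ttl < age:
        # Stale: fetch from the original server before answering.
        return "stale"
    if ttl - age < margin and recent_request_rate > popular_rate:
        # Aged and popular: serve the copy, but refresh it ahead of expiry.
        return "anticipated-refresh"
    # Fresh: serve the cached copy and update access statistics.
    return "fresh"
```

For example, with a TTL of 10 seconds and a margin of 2 seconds, a copy aged 9 seconds that is requested more often than the popularity threshold would be refreshed early, so that later requests do not have to wait for a fetch.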
  • When the cache fetches an object from the server in a step 216: a. if another fetch for the same object is ongoing (e.g., due to a previous user request), the cache waits for the previous fetch to complete, rather than duplicating the request; b.
  • T is the Time To Live for the dynamic object
  • W is a given time since the original object has changed (i.e., how long it is outdated); τ is an average time between changes, which is determined from the object's change statistics; n is number of user requests per time unit (e.g., frequency); p0 is maximum error probability, which is the average ratio of the number of requests fulfilled using a cached copy whose corresponding original object has changed for more than the given time W, over the total number of requests
  • D0 is maximum delay, which is the average time between an object change and when the cached copy is refreshed.
  • the changes do have an average frequency of 1/τ.
  • the error probability is defined as being the percentage of cases where the user receives the cached copy whereas the object has changed on the original server, and the cached copy is outdated by more than a given time W.
  • the probability is again zero, and a re-fetch occurs at 45 seconds, and the probability rises until 60 seconds, as illustrated by the sawtooth pattern 304.
  • a similar result is shown by the sawtooth pattern 306 between 75 and 90 seconds.
  • the average time E within the interval during which the content is outdated is the sum over all intervals [u; u+du] of the probability that there was no change between 0 and u, but that there was a change between u and u+du, multiplied by the length of time during which the content remains outdated by more than W. That is: T-w
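The verbal definition of E can be written out as an integral under one added assumption that the surviving text does not state explicitly: changes to the original object occur as a Poisson process whose mean time between changes is written here as \tau, and the cached copy is refreshed every T. The closed forms below follow from that assumption and should be read as a reconstruction, not as the patent's own formula.

```latex
% Probability of "no change in [0,u], then a change in [u,u+du]"
% under the Poisson assumption: (1/\tau) e^{-u/\tau} du.
% Time the copy then stays outdated by more than W: T - W - u.
E = \int_{0}^{T-W} \frac{1}{\tau}\, e^{-u/\tau}\,(T - W - u)\, du
  = (T - W) - \tau\left(1 - e^{-(T-W)/\tau}\right), \qquad T > W

% Error probability: fraction of the refresh interval spent in error.
p(T) = \frac{E}{T}
     = \frac{(T - W) - \tau\left(1 - e^{-(T-W)/\tau}\right)}{T}, \qquad
p(T) = 0 \ \text{for}\ T \le W
```

This form is zero for T up to W and tends to 1 as T grows without bound, which agrees with the qualitative behaviour described for the error-probability curves.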
  • TTL should be chosen, such as:
  • the probability of the retrieved information being "stale" (older than W) is essentially zero for values of T(s) (Time to live) less than or equal to W and increases with increasing T(s) according to a decaying exponential, approaching 100% probability of error at infinity.
  • This observation that the error probability is zero for values of T(s) less than or equal to W is essentially a "trivial" result, since it is clear that no information can be older than W if it is updated more frequently than W.
  • the average update interval τ has a significant effect on how steeply the error probability climbs for values of T(s) greater than W. The greater the average update interval τ with respect to W, the less sharply the error probability rises with increasing values of T(s).
  • a graphical illustration of a technique for choosing a TTL to maintain error probability below a threshold value p0. The TTL is obtained by identifying the value of p0 on the vertical axis of the graph 400 and following an imaginary line horizontally across the graph 400 to where it intersects the curve (402, 404, 406, or 408) for the appropriate value of τ.
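The graphical lookup can be mimicked numerically. The sketch below assumes, as an illustration only, that object changes follow a Poisson process with mean interval `tau`, which gives an error probability of ((T - W) - tau * (1 - exp(-(T - W) / tau))) / T for T greater than W and zero otherwise. The function names, the bisection search and the sample values are hypothetical and are not taken from the patent.

```python
import math

def error_probability(T, W, tau):
    """Fraction of time a copy refreshed every T is outdated by more than W.

    Assumes Poisson-distributed changes with mean interval tau.
    """
    if T <= W:
        return 0.0
    span = T - W
    return (span - tau * (1.0 - math.exp(-span / tau))) / T

def max_ttl(p0, W, tau, t_max=86400.0, tol=1e-6):
    """Largest TTL (up to t_max) whose error probability stays at or below p0.

    error_probability is non-decreasing in T, so bisection applies.
    """
    if error_probability(t_max, W, tau) <= p0:
        return t_max
    lo, hi = W, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if error_probability(mid, W, tau) <= p0:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical values: tolerate 15 s of staleness, object changes about
# once a minute, and at most 5% of requests may see an outdated copy.
ttl = max_ttl(p0=0.05, W=15.0, tau=60.0)
```

Bisection plays the role of following the horizontal line across the graph until it meets the curve for the chosen average update interval.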
  • the web page may optionally plant a cookie in the user's browser. Thereafter, every time the end user accesses the web page, the browser sends the cookie along with the access request.
  • the cookie tells the server (104) what the server wants (or needs) to know about the end user, in addition to simply the fact that the user wants to retrieve the web page. For example, if the web page is a personalized web page of a single end user, the server knows the end user's preferences.
  • System for caching objects having dynamic content on the Internet comprising: a cache connected in the Internet for storing and updating copies of dynamic content.
  • the cache is disposed at a location selected from the group consisting of major Internet switching locations, dial-in aggregation points, and corporate gateways.
  • System further comprising: means for monitoring one or more of the objects to determine update characteristics thereof; and means for adjusting a time to live (TTL) parameter for the objects based upon the update characteristics.
  • System further comprising: means for determining an age of an object; and means for updating the object if the TTL for the object is less than its age.
  • System further comprising: means for determining a probability of error for each of the objects; and means for adjusting the TTL for each of the objects to maintain its probability of error below a predetermined error probability threshold.
  • T is the Time To Live, for the dynamic object
  • W is a given time since the original object has changed (i.e., how long it is outdated); τ is an average time between changes, which is determined from the object's change statistics; and p0 is maximum error probability, which is the average ratio of the number of requests fulfilled using a cached copy whose corresponding original object has changed for more than the given time W, over the total number of requests.
  • T is the Time To Live for the dynamic object:
  • W is a given time since the original object has changed (i.e., how long it is outdated); τ is an average time between changes, which is determined from the object's change statistics; n is number of user requests per time unit (e.g., frequency); and n0 is maximum error rate, which is the average number per time unit of requests fulfilled using a cached copy whose corresponding original object has changed for more than the given time W.
  • System further comprising: means for determining a delay time for each of the objects; and means for adjusting TTL for each of the objects to maintain its delay time below a predetermined delay threshold.
  • T is the Time To Live for the dynamic object
  • is an average time between changes, which is determined from the object's change statistics
  • D0 is maximum delay, which is the average time between an object change and when the cached copy is refreshed.
  • System further comprising: means for determining at least one object characteristic selected from the group consisting of error probability, error rate and delay time for each of the objects; and means for adjusting TTL to maintain the selected object characteristics below a respective threshold value.
  • means for limiting adjustment of TTL for each of the objects to a range bounded by predetermined minimum (Tmin) and maximum (Tmax) values for TTL.
  • Method of responding to a user request for information having dynamic content comprising: storing a copy of the dynamic content in a cache; establishing a time to live (TTL) for the dynamic content; receiving the user request at the cache; responding to the user request with the stored copy of the dynamic content if its TTL is greater than its age; and retrieving an updated copy of the dynamic content and responding to the user request with the updated copy if the TTL of the stored copy is less than its age.
  • TTL time to live
  • Method, according to claim 15, further comprising: determining an average update frequency for the dynamic content; and determining the TTL for the dynamic content as a function of its average update frequency.
  • Method, according to claim 15, further comprising: determining an average update frequency for the dynamic content; and determining the TTL for the dynamic content as a function of its average update frequency and a predetermined error probability threshold.
  • Method, according to claim 18, further comprising: adjusting the TTL for the dynamic content according to a frequency of user requests for the dynamic content.
  • W is a given time since the original object has changed (i.e., how long it is outdated); ⁇ is an average time between changes, which is determined from the object's change statistics; and p0 is maximum error probability, which is the average ratio of the number of requests fulfilled using a cached copy whose corresponding original object has changed for more than the given time W, over the total number of requests.
  • T is the Time To Live for the dynamic object
  • W is a given time since the original object has changed (i.e., how long it is outdated);
  • is an average time between changes, which is determined from the object's change statistics;
  • n is number of user requests per time unit (e.g., frequency);
  • n0 is maximum error rate, which is the average number per time unit of requests fulfilled using a cached copy whose corresponding original object has changed for more than the given time W.
  • T is the Time To Live for the dynamic object
  • is an average time between changes, which is determined from the object's
  • Method, according to claim 23, further comprising: if the requested web page is a personalized web page for a single end user, then the web page is not cached.
  • Method, according to claim 15, further comprising: if the information is supposed to be modified each time it is accessed, setting TTL = 0.
  • Method of responding to a user request for an object having dynamic content comprising: storing a copy of the object in a cache; establishing a time to live (TTL) for the object; receiving the user request at the cache; fulfilling the user request with the stored copy of the object if its TTL is greater than its age; and fetching an updated copy of the object and responding to the user request with the updated copy if the TTL of the stored copy is less than its age.
  • TTL time to live
  • Method, according to claim 28, further comprising, in the cache: first setting the TTL for the object to a reasonable lower limit (Tmin); and adjusting the TTL for the object based on the frequency at which the object actually changes.
  • Tmin a reasonable lower limit
  • Method, according to claim 28, further comprising: each time the cache fetches the object from the server, performing the following procedures: a. if another fetch for the same object is ongoing, waiting for the previous fetch to complete; b. fetching the object from the server; c. replacing the cached copy, if present, by the fetched object, after having compared them to determine whether the object had changed since it was last fetched; d. initializing or updating the object's change statistics accordingly; e. marking the object as static or dynamic content depending on the server's reply; and f.
  • Method, according to claim 31 further comprising: if the requested web page is a personalized web page for a single end user, then the web page is not cached.
  • Method, according to claim 28, further comprising: if the information is supposed to be modified each time it is accessed, setting TTL = 0.
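
The request-handling and fetch steps in the claims above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class and parameter names (`AdaptiveTtlCache`, `fraction`, `clock`) are hypothetical, and the TTL rule used here (a fixed fraction of the observed mean interval between changes, or doubling while no change has been seen) is an assumption chosen only to show "start at Tmin, then adjust from change statistics, bounded by Tmin and Tmax". Waiting for an ongoing fetch and the static/dynamic marking are omitted.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    body: bytes
    fetched_at: float   # time of the last fetch
    first_seen: float   # time of the first fetch (start of the statistics)
    ttl: float          # current time to live, in seconds
    changes: int = 0    # refetches that returned different content


class AdaptiveTtlCache:
    """Illustrative adaptive-TTL cache; the adjustment rule is an assumption."""

    def __init__(self, fetch: Callable[[str], bytes], t_min: float = 1.0,
                 t_max: float = 3600.0, fraction: float = 0.5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._t_min = t_min
        self._t_max = t_max
        self._fraction = fraction
        self._clock = clock
        self.entries: Dict[str, Entry] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        entry = self.entries.get(url)
        # Serve the stored copy only while its TTL is greater than its age.
        if entry is not None and entry.ttl > now - entry.fetched_at:
            return entry.body
        return self._refresh(url, entry, now)

    def _refresh(self, url: str, old: Optional[Entry], now: float) -> bytes:
        body = self._fetch(url)
        if old is None:
            # First fetch: start from the lower limit Tmin.
            self.entries[url] = Entry(body, now, now, self._t_min)
            return body
        # Compare with the cached copy to update the change statistics.
        if body != old.body:
            old.changes += 1
        old.body, old.fetched_at = body, now
        if old.changes > 0:
            mean_interval = (now - old.first_seen) / old.changes
            candidate = self._fraction * mean_interval
        else:
            candidate = 2.0 * old.ttl  # no change seen yet: back off
        # Keep the TTL inside [Tmin, Tmax].
        old.ttl = min(self._t_max, max(self._t_min, candidate))
        return body
```

Because changes are only detected at fetch time, several changes between two fetches count as one, so this estimate of the mean interval between changes is an upper bound; a real deployment would need a more careful estimator.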

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Complex Calculations (AREA)

Abstract

In this invention, caches are disposed in the Internet to store and update copies of objects having dynamic content. Update characteristics of the objects are determined, and a time to live (TTL) parameter for the objects is adjusted based on those update characteristics. In general, an object is updated if its TTL is less than its age. The TTL for an object may be adjusted to (i) keep its probability of error below a predetermined error probability threshold; (ii) keep its error rate below a predetermined error probability threshold; or (iii) keep its delay time below a predetermined delay threshold. Preferably, the caches are dedicated machines and are placed so that Web browsing passes through the cache instead of going all the way to the original sites, in many different locations, ideally within the network of the Internet service providers (ISPs) that provide connectivity to the largest number of users in those locations. In this way, the users of those ISPs and, to a lesser extent, of neighboring ISPs will enjoy a large improvement in quality of service and speed, because most of the traffic will stay within or close to the ISPs' internal networks and will not need to cross a heavily loaded backbone. In addition, the original Web sites will no longer need as much bandwidth, since the caches will absorb most of the load. The system can adapt in real time according to the number of requests for each page and the actual update frequency of that page.
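
To make the first two TTL criteria in the abstract concrete, here is a hedged Python sketch. The formula below is not taken from the patent, whose own equations are not reproduced on this page. It assumes that object changes follow a Poisson process with mean interval `tau`, that requests arrive uniformly in time, and that the cache refreshes the copy every `ttl` seconds. Under those assumptions, the fraction of requests served from a copy that has been outdated for more than `w` is computed in closed form and the largest admissible TTL is found by bisection. All function names are hypothetical.

```python
import math


def error_probability(ttl: float, w: float, tau: float) -> float:
    """Fraction of requests served from a copy outdated for more than w.

    Assumes Poisson changes (mean interval tau) and request times uniform
    over the TTL window; both are modelling assumptions of this sketch.
    """
    if ttl <= w:
        return 0.0  # no copy can be outdated for longer than w
    x = ttl - w
    # Integral of (1 - exp(-(t - w) / tau)) for t in [w, ttl], divided by ttl.
    return (x + tau * math.expm1(-x / tau)) / ttl


def max_ttl(p0: float, w: float, tau: float,
            t_min: float, t_max: float, tol: float = 1e-6) -> float:
    """Largest TTL in [t_min, t_max] keeping the error probability <= p0."""
    if error_probability(t_max, w, tau) <= p0:
        return t_max
    if error_probability(t_min, w, tau) > p0:
        return t_min  # threshold unreachable; fall back to the lower bound
    lo, hi = t_min, t_max
    # error_probability is non-decreasing in ttl, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if error_probability(mid, w, tau) <= p0:
            lo = mid
        else:
            hi = mid
    return lo


def max_ttl_for_error_rate(n0: float, n: float, w: float, tau: float,
                           t_min: float, t_max: float) -> float:
    """Error-rate variant: n requests per time unit, at most n0 stale ones."""
    if n <= 0:
        return t_max  # no requests, hence no erroneous responses
    return max_ttl(n0 / n, w, tau, t_min, t_max)
```

For example, `max_ttl(0.05, 0.0, 60.0, 1.0, 3600.0)` returns the TTL at which about 5% of requests would see an outdated copy for an object that changes once a minute on average. The error-rate variant simply divides the allowed rate by the request rate, which is why a heavily requested page gets a shorter TTL under this model.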
EP01956739A 2000-07-17 2001-07-16 Bandwidth savings and QoS improvement for WWW sites by caching static and dynamic content on a distributed network of caches Withdrawn EP1311957A2 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US21855900P 2000-07-17 2000-07-17
US218559P 2000-07-17
PCT/IL2001/000651 WO2002007364A2 (fr) 2000-07-17 2001-07-16 Bandwidth savings and QoS improvement for WWW sites by caching static and dynamic content on a distributed network of caches

Publications (1)

Publication Number Publication Date
EP1311957A2 (fr) 2003-05-21

Family

ID=22815575

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01956739A Withdrawn EP1311957A2 (fr) 2000-07-17 2001-07-16 Bandwidth savings and QoS improvement for WWW sites by caching static and dynamic content on a distributed network of caches

Country Status (5)

Country Link
EP (1) EP1311957A2 (fr)
JP (1) JP2004504681A (fr)
AU (1) AU2001278654A1 (fr)
IL (1) IL153782A0 (fr)
WO (1) WO2002007364A2 (fr)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4744792B2 (ja) * 2003-06-26 2011-08-10 ソフトバンクモバイル株式会社 Caching system
EP1770954A1 (fr) 2005-10-03 2007-04-04 Amadeus S.A.S. System and method for maintaining cache coherency in a multi-tier software system intended to interface with a large database
US7461206B2 (en) * 2006-08-21 2008-12-02 Amazon Technologies, Inc. Probabilistic technique for consistency checking cache entries
JP5116319B2 (ja) * 2007-03-06 2013-01-09 キヤノン株式会社 Message relay apparatus and method
US8751925B1 (en) 2010-04-05 2014-06-10 Facebook, Inc. Phased generation and delivery of structured documents
EP2801236A4 (fr) 2012-01-05 2015-10-21 Seven Networks Inc Detection and management of user interactions with foreground applications on a mobile device in distributed caching
CN104471573B (zh) * 2012-08-14 2017-07-18 艾玛迪斯简易股份公司 Updating cached database query results
KR101529602B1 (ko) * 2013-01-07 2015-06-18 한국과학기술원 Cache server, content providing system, and content replacement method
KR101962301B1 (ko) * 2013-03-01 2019-03-26 페이스북, 인크. Caching of pagelets of structured documents
KR101540847B1 (ko) * 2013-07-09 2015-07-30 광운대학교 산학협력단 Apparatus and method for caching web browser information based on storage load
KR101645222B1 (ko) * 2015-05-06 2016-08-12 (주)넷피아 Advanced domain name system and operating method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6026413A (en) * 1997-08-01 2000-02-15 International Business Machines Corporation Determining how changes to underlying data affect cached objects
US6185608B1 (en) * 1998-06-12 2001-02-06 International Business Machines Corporation Caching dynamic web pages

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0207364A2 *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8352964B2 (en) 2004-09-30 2013-01-08 Citrix Systems, Inc. Method and apparatus for moving processes between isolation environments
US8302101B2 (en) 2004-09-30 2012-10-30 Citrix Systems, Inc. Methods and systems for accessing, by application programs, resources provided by an operating system
US7676813B2 (en) 2004-09-30 2010-03-09 Citrix Systems, Inc. Method and system for accessing resources
US7853947B2 (en) 2004-09-30 2010-12-14 Citrix Systems, Inc. System for virtualizing access to named system objects using rule action associated with request
US8132176B2 (en) 2004-09-30 2012-03-06 Citrix Systems, Inc. Method for accessing, by application programs, resources residing inside an application isolation scope
US8042120B2 (en) 2004-09-30 2011-10-18 Citrix Systems, Inc. Method and apparatus for moving processes between isolation environments
US8171479B2 (en) 2004-09-30 2012-05-01 Citrix Systems, Inc. Method and apparatus for providing an aggregate view of enumerated system resources from various isolation layers
US7680758B2 (en) 2004-09-30 2010-03-16 Citrix Systems, Inc. Method and apparatus for isolating execution of software applications
US8117559B2 (en) 2004-09-30 2012-02-14 Citrix Systems, Inc. Method and apparatus for virtualizing window information
US7752600B2 (en) 2004-09-30 2010-07-06 Citrix Systems, Inc. Method and apparatus for providing file-type associations to multiple applications
US8095940B2 (en) 2005-09-19 2012-01-10 Citrix Systems, Inc. Method and system for locating and accessing resources
US8131825B2 (en) 2005-10-07 2012-03-06 Citrix Systems, Inc. Method and a system for responding locally to requests for file metadata associated with files stored remotely
US7925694B2 (en) 2007-10-19 2011-04-12 Citrix Systems, Inc. Systems and methods for managing cookies via HTTP content layer
US8171483B2 (en) 2007-10-20 2012-05-01 Citrix Systems, Inc. Method and system for communicating between isolation environments
US9009721B2 (en) 2007-10-20 2015-04-14 Citrix Systems, Inc. Method and system for communicating between isolation environments
US9021494B2 (en) 2007-10-20 2015-04-28 Citrix Systems, Inc. Method and system for communicating between isolation environments
US9009720B2 (en) 2007-10-20 2015-04-14 Citrix Systems, Inc. Method and system for communicating between isolation environments
US9059966B2 (en) 2008-01-26 2015-06-16 Citrix Systems, Inc. Systems and methods for proxying cookies for SSL VPN clientless sessions
US8769660B2 (en) 2008-01-26 2014-07-01 Citrix Systems, Inc. Systems and methods for proxying cookies for SSL VPN clientless sessions
US8090877B2 (en) 2008-01-26 2012-01-03 Citrix Systems, Inc. Systems and methods for fine grain policy driven cookie proxying
US8326943B2 (en) 2009-05-02 2012-12-04 Citrix Systems, Inc. Methods and systems for launching applications into existing isolation environments
GB2510073B (en) * 2012-01-05 2014-12-31 Seven Networks Inc Mobile device caching
GB2510073A (en) * 2012-01-05 2014-07-23 Seven Networks Inc Mobile device caching

Also Published As

Publication number Publication date
WO2002007364A2 (fr) 2002-01-24
JP2004504681A (ja) 2004-02-12
WO2002007364A3 (fr) 2002-05-02
AU2001278654A1 (en) 2002-01-30
IL153782A0 (en) 2003-07-31

Similar Documents

Publication Publication Date Title
US20040128346A1 (en) Bandwidth savings and qos improvement for www sites by catching static and dynamic content on a distributed network of caches
EP1311957A2 (fr) Bandwidth savings and QoS improvement for WWW sites by caching static and dynamic content on a distributed network of caches
US10476984B2 (en) Content request routing and load balancing for content distribution networks
Davison A web caching primer
US8346956B2 (en) Dynamic image delivery system
US8447837B2 (en) Site acceleration with content prefetching enabled through customer-specific configurations
US8060581B2 (en) Dynamic image delivery system
US6959318B1 (en) Method of proxy-assisted predictive pre-fetching with transcoding
US8326956B2 (en) System and method for handling persistence information in a network
US7653706B2 (en) Dynamic image delivery system
US20030078964A1 (en) System and method for reducing the time to deliver information from a communications network to a user
US7725598B2 (en) Network cache-based content routing
US20060259690A1 (en) Methods and system for prepositioning frequently accessed web content
US7349902B1 (en) Content consistency in a data access network system
US20030225873A1 (en) Optimization of network performance through uni-directional encapsulation
Park et al. Deploying Large File Transfer on an HTTP Content Distribution Network.
EP2552082B1 (fr) Favorite web site acceleration method and system
WO2002006961A2 (fr) Method for determining metrics of a content delivery and global traffic management network
WO2007079192A2 (fr) Site acceleration with content prefetching enabled through customer-specific configurations
Iyengar et al. Web caching, consistency, and content distribution
JP2002259333A (ja) Content transfer method
Rangarajan et al. A technique for user specific request redirection in a content delivery network
Danalis Web Caching
Bhagwan et al. Cache is better than Check
Danalis et al. Web Caching: A Survey

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase
Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed
Effective date: 20030115

AK Designated contracting states
Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the European patent
Extension state: AL LT LV MK RO SI

STAA Information on the status of an EP patent application or granted EP patent
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn
Effective date: 20050201