CA2327198A1 - Cache page indexing for web server environments - Google Patents

Cache page indexing for web server environments

Info

Publication number
CA2327198A1
Authority
CA
Canada
Prior art keywords
cache
page
application server
web server
pages
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002327198A
Other languages
French (fr)
Inventor
Don A. Bourne
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
IBM Canada Ltd
Original Assignee
IBM Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by IBM Canada Ltd
Priority to CA002327198A
Publication of CA2327198A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30 Definitions, standards or architectural aspects of layered protocol stacks
    • H04L69/32 Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L69/322 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]

Abstract

A cache page indexing system for pages in a web server environment. An application server in the web server environment has an associated configuration file. The configuration file is defined to specify those cached pages for which a user-defined component is used to dynamically generate a name to access the page in the cache. For requests that qualify, the application server uses user-defined plug-in components to generate metadata used in forming names of cache pages during retrieval from the cache.

Description

CACHE PAGE INDEXING FOR WEB SERVER ENVIRONMENTS
FIELD OF THE INVENTION
The present invention is directed to an improvement in computing systems and in particular to indexing for cache pages accessed by a web server.
BACKGROUND OF THE INVENTION
Web servers provide HTML format pages to web browsers in response to requests from browsers.
It is known for web servers to maintain a cache of frequently accessed HTML pages. When a browser requests a page with a URL that is in the cache, the web server can provide the page to the browser more quickly than is the case when the HTML page has to be generated by the web server from other data.
In web server environments web servers call application servers to obtain data for inclusion in web pages to be provided by the web server in response to browser requests.
Application servers also include a caching function to maintain cache pages at the application server level. It is known to include rudimentary mechanisms to permit variations of web pages to be indexed in the application server cache.
Such existing cache page indexing for web server environments is limited, however, to web pages that have no variations (typically maintained in the web server cache) or to those web pages that are identified in general by the application server itself as being likely candidates for variation.
Where each application's data requirements for page indexing vary, existing cache indexing schemes either require fixed data definitions, or are simply unable to index the content, and hence cannot be used. Furthermore, in some applications, the indexing criteria can vary for different input based on the application logic.

It is therefore desirable to have a cache page indexing system for a web server environment that permits a user of the web server environment to dynamically define cache page indexing information.
SUMMARY OF THE INVENTION
According to an aspect of the present invention there is provided an improved computer system for providing web server cache page indexing.
According to another aspect of the present invention there is provided an indexing system for pages stored in a cache, the pages to be accessed in response to requests received by a web server, the indexing system including means for accessing user-defined components for dynamically generating cache page identification strings, means for storing pages to, and retrieving pages from, the cache based on page names including generated cache page identification strings, and means for associating user-defined components with requests received by the web server, whereby for a received request the indexing system accesses an associated user-defined component to generate a cache page identification string for use in retrieving a page from the cache based on the page name.
According to another aspect of the present invention there is provided the above indexing system in which the indexing system includes an application server and in which the user-defined components are plug-ins to the application server.
According to another aspect of the present invention there is provided the above indexing system in which the means for associating user-defined components with requests includes a configuration file accessible by the web server and the application server.
According to another aspect of the present invention there is provided the above indexing system further including pages stored in the cache which are accessible by the web server in which the configuration file includes data specifying, for a defined page request, whether a page associated with the request is accessible by the web server.
According to another aspect of the present invention there is provided the above indexing system in which the means for storing pages to, and retrieving pages from, the cache includes a cache daemon accessible to both the web server and the application server.
According to another aspect of the present invention there is provided the above indexing system in which the web server provides a base component for a page name and the cache page identification string is appended to the base component to form the page name for retrieval of pages from the cache.
According to another aspect of the present invention there is provided a cache page system to provide cache pages in response to received requests, the system including a web server, an application server, a configuration file, and a cache, the received requests being potentially associated with a defined page type, the defined page types including a web server cache page, an application server cache page, a user-defined cache page, or a non-cached page, the configuration file being accessible by the web server and by the application server, the configuration file including data for defined requests, the data specifying pages, and page types, associated with the defined requests, the web server accessing the configuration file in response to a received request and where the configuration file data specifies that the received request is associated with a web server cache page, the web server retrieving the said page from the cache, the application server accessing the configuration file in response to a received request to conditionally retrieve pages from the cache whereby the application server invokes a user-defined component to generate a cache page name string used by the application server to index into the cache, where the configuration file specifies that the received request is associated with a user-defined cache page, the application server directly accesses a cache page where the configuration file specifies that the received request is associated with an application server cache page, and the application server does not access the cache where the configuration file specifies that the received request is associated with a non-cached page.
According to another aspect of the present invention there is provided an application server cache page indexing system, the application server accepting input from a web server corresponding to a browser request, the application server having access to a cache for pages used to respond to the browser request, the system including a configuration file accessible to the application server for associating received requests with identifiers for user-defined components for generating cache page name strings, means for accessing the configuration file to retrieve identifiers for user-defined components for a received request, means for invoking the identified user-defined components to obtain cache page name strings, means for indexing into the cache based on values including the obtained cache page name strings.
According to another aspect of the present invention there is provided a method for cache page indexing in a system including a web server for receiving browser requests, a cache for pages used to respond to browser requests, an application server accepting input from a web server corresponding to browser requests, the application server having access to the cache, a set of user-defined components for generating cache page name strings, a configuration file accessible to the application server for associating received requests with identifiers for the user-defined components, the method including the application server carrying out the following steps a) accessing the configuration file to retrieve identifiers for user-defined components for a received request, b) invoking the identified user-defined components to obtain cache page name strings, c) indexing into the cache based on values including the obtained cache page name strings to retrieve cache pages, and d) returning pages to the web server based on the pages retrieved from the cache.
According to another aspect of the present invention there is provided the above method further including the step of the application server determining base page name strings corresponding to the received requests and in which the step of indexing into the cache includes the step of combining the base page name strings and the obtained cache page name strings.
According to another aspect of the present invention there is provided the above method in which the configuration file associates selected received requests with web server cache page identifiers and in which the method further includes the step of the web server accessing the cache to retrieve cache pages based on web server cache page identifiers retrieved by the web server from the configuration file.
According to another aspect of the present invention there is provided a computer program product for indexing cache pages, the computer program product including a computer usable medium having computer readable code means embodied in said medium, including computer readable program code means for carrying out the above methods.
It will be appreciated by those skilled in the art that the computer usable medium can include various data recording media or a carrier signal carrying the program code. The signal can be transmitted over a network such as the Internet or by wired or wireless means.
According to another aspect of the present invention there is a cache page indexing system for a web server environment, the environment including a web server for receiving browser requests, a cache for pages used to respond to browser requests, an application server accepting input from a web server corresponding to browser requests, the application server having access to the cache, the indexing system including means in the application server to invoke a set of user-defined components for generating cache page name strings, a configuration file accessible to the application server for associating received requests with identifiers for the user-defined components, whereby the application server accesses the configuration file to retrieve identifiers for user-defined components for a received request, and indexes into the cache based on values including the obtained cache page name strings to retrieve cache pages.
According to another aspect of the present invention there is provided the above indexing system further including means for the application server to determine base page name strings corresponding to the received requests and means for combining the base page name strings and the obtained cache page name strings.
Advantages of the present invention include cache page indexing based on user-defined components to provide for more sophisticated and more extensive caching of pages being provided to browsers by a web server.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a block diagram showing system components utilizing the cache page indexing of the preferred embodiment.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Figure 1 shows, in a block diagram format, the web server environment of the preferred embodiment.
Figure 1 shows browser 10, web server 12, application server 14 and database 16. Figure 1 also shows cache daemon 18 and associated cache storage 20. In the preferred embodiment, there is a single cache accessible by both web server 12 and application server 14. In other implementations, separate caches may be used by the two server components. It may also be the case that one or other of the servers may provide the write to cache functionality for the system.
The retrieval from the page cache is undertaken by web server 12 and application server 14, independently, although as shown in Figure 1, the same mechanism (cache daemon 18) may be used for accessing single cache storage 20.
A typical use of web server 12 by browser 10, without caching, includes browser 10 making a request for an HTML page from web server 12. In response to the request from browser 10, web server 12 may access application server 14. Application server 14 carries out the processing required to ensure that the correct response to browser 10 is provided via web server 12. Application server 14 may access database 16 to retrieve data to be included in an HTML page to be returned by web server 12 to browser 10.
The steps involved in application server 14 being accessed by web server 12 and potentially accessing database 16 are often time-consuming and can result in delays in returning information to browser 10. For this reason, prior art systems incorporate caching of HTML pages. When a browser or different browsers request pages having the same URL, the application server and database execution path may be avoided by web server 12 obtaining the page by calling cache daemon 18 to access cache storage 20. Alternatively, application server 14 may access cache storage 20 using cache daemon 18 to have a page provided at the application server level.
A simple form of caching of HTML pages involves the indexing of such pages by incorporating the URL of the page in the file name of the page stored in the cache. Thus when browser 10 requests a page with a given URL it is potentially obtained by web server 12 using cache daemon 18 to look up cache pages in cache storage 20 having the name of the URL.
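The URL-based indexing described above can be illustrated with a minimal Python sketch. The function name `url_cache_name`, the class `CacheStorage`, and the use of percent-encoding to turn a URL into a file-safe name are all assumptions made for illustration; the description does not specify how the URL is incorporated into the file name.

```python
from typing import Dict, Optional
from urllib.parse import quote


def url_cache_name(url: str) -> str:
    """Derive a cache page name from a URL.

    Assumption: percent-encoding every reserved character gives a name
    that is safe to use as a file name.
    """
    return quote(url, safe="")


class CacheStorage:
    """Hypothetical in-memory stand-in for cache storage 20."""

    def __init__(self) -> None:
        self._pages: Dict[str, str] = {}

    def store(self, url: str, html: str) -> None:
        self._pages[url_cache_name(url)] = html

    def lookup(self, url: str) -> Optional[str]:
        # Returns None on a cache miss.
        return self._pages.get(url_cache_name(url))
```

With this scheme two requests for the same URL always map to the same cache entry, which is exactly why it cannot distinguish variants of a page.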
In other cases, different versions of the same page may be returned to browsers based on information other than the URL requested. For example an application server can return a page with or without a "sold out" banner included depending on whether the application server 14 determines that the product is out of stock. It is known to use a more sophisticated version of cache page indexing for the web server system to handle such variations in pages supplied by the web server. In such an approach a base file name is provided as a result of the received request. The base file name is then modified to reflect a version of the page to be provided based on data available to the application server.
Using this indexing mechanism, application server 14 includes an extension or string in the cache page file name to reflect possible variations. For example, the string ".currCAD" may be included in a cache page file name to reflect that the page shows currency in Canadian dollars. Cache daemon 18 is therefore provided with a base file name and the extension string to index cache storage 20 (alternatively, application server 14 may concatenate the base file name and the extension).
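As a rough sketch of this prior-art variant naming, the following Python function appends extension strings to a base file name. The function name `variant_page_name` and the sample base name `product123` are hypothetical; only the `.currCAD` string comes from the description.

```python
def variant_page_name(base_name: str, *extensions: str) -> str:
    """Concatenate a base file name with zero or more variant extensions."""
    return base_name + "".join(extensions)


# The same base page cached under two different variant names.
default_name = variant_page_name("product123")
canadian_name = variant_page_name("product123", ".currCAD")
```

Because the extension is part of the name, the Canadian dollar version and the default version occupy separate cache entries.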
In the preferred embodiment, further functionality is provided to permit web server cache pages to be indexed in a more sophisticated way. As shown in Figure 1, the preferred embodiment system includes plug-ins 24. Application server 14 is an object-oriented component in the preferred embodiment. User-defined components for application server 14 are shown as plug-ins 24 in Figure 1. Configuration file 26 is also provided in the preferred embodiment system and is accessible by both web server 12 and application server 14.
According to the system of the preferred embodiment, users defining web sites using web server 12 and application server 14 are also able to define plug-ins 24. These components are callable by application server 14 and use data (apart from the URL requested by browser 10) to define an index into cache pages stored in cache storage 20, and managed by cache daemon 18.
Configuration file 26 stores data indicating whether a given URL corresponds to a page which is cacheable and if the page is cacheable, whether the page is accessible by web server 12 directly or by application server 14. Where the page is retrievable from the cache by application server 14, configuration file 26 stores data indicating whether a given URL requires the invocation of the plug-in 24 for index information.
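The kind of data held in configuration file 26 might be modelled as below. This is a sketch only: the four page types follow the categories named elsewhere in this document, while the names `PageType`, `ConfigEntry`, `CONFIG`, `lookup_entry`, the sample URLs, and the plug-in identifier `stock_status` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class PageType(Enum):
    WEB_SERVER_CACHE = "web-server"      # retrievable directly by the web server
    APP_SERVER_CACHE = "app-server"      # retrievable by the application server
    USER_DEFINED_CACHE = "user-defined"  # application server must invoke a plug-in
    NOT_CACHED = "none"                  # page is never served from the cache


@dataclass(frozen=True)
class ConfigEntry:
    page_type: PageType
    plugin_id: Optional[str] = None  # hypothetical identifier of a plug-in


# Hypothetical in-memory stand-in for configuration file 26.
CONFIG: Dict[str, ConfigEntry] = {
    "/index.html": ConfigEntry(PageType.WEB_SERVER_CACHE),
    "/catalog": ConfigEntry(PageType.APP_SERVER_CACHE),
    "/product": ConfigEntry(PageType.USER_DEFINED_CACHE, plugin_id="stock_status"),
}


def lookup_entry(url: str) -> ConfigEntry:
    # Assumption: URLs missing from the configuration are treated as non-cached.
    return CONFIG.get(url, ConfigEntry(PageType.NOT_CACHED))
```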
In operation, when browser 10 requests a page with a specified URL, web server 12 accesses configuration file 26. If configuration file 26 specifies that the URL is cacheable and retrievable from web server 12, then web server 12 attempts to retrieve the page from cache storage 20 by invoking cache daemon 18. If the page is located (i.e. it is in the web server cache) the HTML page is returned to browser 10. Where the page is not yet cached but is a candidate, web server 12 will request the page from application server 14 and once the page is obtained the page will be stored by cache daemon 18 for future access by web server 12 from cache storage 20.
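The web server path just described can be sketched in Python as follows. The function `web_server_handle` and its parameters are hypothetical: a set of URLs stands in for the relevant part of configuration file 26, a dictionary stands in for cache daemon 18 and cache storage 20, and a callable stands in for application server 14.

```python
from typing import Callable, Dict, Set


def web_server_handle(
    url: str,
    web_cacheable: Set[str],
    cache: Dict[str, str],
    app_server: Callable[[str], str],
) -> str:
    """Serve a page, using the web server cache when the configuration allows it."""
    if url not in web_cacheable:
        # Not a web server cache page: pass the request to the application server.
        return app_server(url)
    page = cache.get(url)
    if page is None:
        # Candidate page not yet cached: obtain it, then store it for future requests.
        page = app_server(url)
        cache[url] = page
    return page
```

A second request for the same cacheable URL is answered from the dictionary without calling the application server again.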
Where configuration file 26 specifies that a defined URL is cacheable in the application server cache (is retrievable by application server 14 from cache storage 20) web server 12 passes the request for that URL to application server 14. Application server 14 then accesses configuration file 26 to determine whether the page being accessed has been set up to have an associated user-defined command (shown in Figure 1 as one of plug-ins 24). The returned value from the user-defined plug-in (metadata) is used as part of the name to index into cache storage 20.
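A minimal Python sketch of the application server path follows. Everything named here is an assumption for illustration: the plug-in registry `PLUGINS`, the mapping `URL_TO_PLUGIN` (standing in for configuration file 26), the `member_tier` plug-in, and the choice to store a freshly rendered page on a cache miss, which the description does not spell out for the application server.

```python
from typing import Callable, Dict, Mapping

# A plug-in receives request data and returns a metadata string for the page name.
Plugin = Callable[[Mapping[str, str]], str]

# Hypothetical user-defined plug-in keyed by identifier.
PLUGINS: Dict[str, Plugin] = {
    "member_tier": lambda request: ".tier" + request.get("tier", "Basic"),
}

# Hypothetical association of URLs with plug-in identifiers.
URL_TO_PLUGIN: Dict[str, str] = {"/prices": "member_tier"}


def cache_page_name(url: str, request: Mapping[str, str], fixed_extension: str = "") -> str:
    """Build the cache page name: base name, fixed text, then plug-in metadata."""
    name = url + fixed_extension
    plugin_id = URL_TO_PLUGIN.get(url)
    if plugin_id is not None:
        name += PLUGINS[plugin_id](request)
    return name


def app_server_handle(
    url: str,
    request: Mapping[str, str],
    cache: Dict[str, str],
    render: Callable[[str, Mapping[str, str]], str],
    fixed_extension: str = "",
) -> str:
    """Return the cached variant if present, otherwise render and cache it."""
    name = cache_page_name(url, request, fixed_extension)
    page = cache.get(name)
    if page is None:
        page = render(url, request)
        cache[name] = page  # assumption: the application server writes on a miss
    return page
```

Calling `cache_page_name("/prices", {"tier": "Gold"}, ".currCAD")` yields a name that combines fixed text from the application with metadata returned by the plug-in, so each tier and currency combination gets its own cache entry.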
This approach permits a user of the website system to define a set of web pages that may be displayed for browsers that are variants specific to the particular website being supported. The application server for the website has its cache effectively extended by the user-defined plug-ins that define pages that are cacheable for the specific website.
The above mechanism also permits a cache page file name to include a combination of "hard coded" text from the application and metadata returned by the user-defined component.
For example, a browser may request a defined URL that provides currency values. Information about the browser will indicate that a Canadian dollar version of the URL defined page is to be provided. As described above, it is known to have application server 14 append a string such as ".currCAD" to the file name and index into cache storage 20 using the file name having that string. Where a user-defined plug-in is also associated with the URL, the file name may include the ".currCAD" string, and also have a string defined by the user-defined component in plug-in 24.
In the preferred embodiment, configuration file 26 is implemented as an XML file and is accessible by web server 12 and application server 14. The concatenation of the different elements in a file name used to index into cache storage 20 is carried out by cache daemon 18 in the preferred embodiment. It will be appreciated by those skilled in the art that other formats for configuration file 26 and other mechanisms for assembling a file name for indexing into cache storage 20, may be used.
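Since the description states only that configuration file 26 is an XML file, the following Python sketch invents a layout purely for illustration. The element names `cache-config` and `page`, the attributes `url`, `type`, and `plugin`, and the function `load_config` are all assumptions, not the format used by the preferred embodiment.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical XML layout for configuration file 26.
SAMPLE_CONFIG = """\
<cache-config>
  <page url="/index.html" type="web-server"/>
  <page url="/catalog" type="app-server"/>
  <page url="/prices" type="user-defined" plugin="member_tier"/>
</cache-config>
"""


def load_config(xml_text: str) -> Dict[str, Dict[str, Optional[str]]]:
    """Parse the XML text into a mapping from URL to page type and plug-in id."""
    root = ET.fromstring(xml_text)
    entries: Dict[str, Dict[str, Optional[str]]] = {}
    for page in root.findall("page"):
        entries[page.attrib["url"]] = {
            "type": page.attrib["type"],
            "plugin": page.attrib.get("plugin"),  # None when no plug-in is set
        }
    return entries
```

Both servers could load the same file this way, with the web server acting only on `web-server` entries and the application server acting on the rest.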
Although a preferred embodiment of the present invention has been described here in detail, it will be appreciated by those skilled in the art that variations may be made thereto without departing from the spirit of the invention or the scope of the appended claims.

Claims (17)

THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR
PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. An indexing system for pages stored in a cache, the pages to be accessed in response to requests received by a web server, the indexing system comprising means for accessing user-defined components for dynamically generating cache page identification strings, means for storing pages to, and retrieving pages from, the cache based on page names comprising generated cache page identification strings, and means for associating user-defined components with requests received by the web server, whereby for a received request the indexing system accesses an associated user-defined component to generate a cache page identification string for use in retrieving a page from the cache based on the page name.
2. The indexing system of claim 1 in which the indexing system comprises an application server and in which the user-defined components are plug-ins to the application server.
3. The indexing system of claim 2 in which the means for associating user-defined components with requests comprises a configuration file accessible by the web server and the application server.
4. The indexing system of claim 3 further comprising pages stored in the cache which are accessible by the web server in which the configuration file comprises data specifying, for a defined page request, whether a page associated with the request is accessible by the web server.
5. The indexing system of claim 2 in which the means for storing pages to, and retrieving pages from, the cache comprises a cache daemon accessible to both the web server and the application server.
6. The indexing system of claim 1 in which the web server provides a base component for a page name and the cache page identification string is appended to the base component to form the page name for retrieval of pages from the cache.
7. A cache page system to provide cache pages in response to received requests, the system comprising a web server, an application server, a configuration file, and a cache, the received requests being potentially associated with a defined page type, the defined page types comprising a web server cache page, an application server cache page, a user-defined cache page, or a non-cached page, the configuration file being accessible by the web server and by the application server, the configuration file comprising data for defined requests, the data specifying pages, and page types, associated with the defined requests, the web server accessing the configuration file in response to a received request and where the configuration file data specifies that the received request is associated with a web server cache page, the web server retrieving the said page from the cache, the application server accessing the configuration file in response to a received request to conditionally retrieve pages from the cache whereby the application server invokes a user-defined component to generate a cache page name string used by the application server to index into the cache, where the configuration file specifies that the received request is associated with a user-defined cache page, the application server directly accesses a cache page where the configuration file specifies that the received request is associated with an application server cache page, and the application server does not access the cache where the configuration file specifies that the received request is associated with a non-cached page.
8. An application server cache page indexing system, the application server accepting input from a web server corresponding to a browser request, the application server having access to a cache for pages used to respond to the browser request, the system comprising, a configuration file accessible to the application server for associating received requests with identifiers for user-defined components for generating cache page name strings, means for accessing the configuration file to retrieve identifiers for user-defined components for a received request, means for invoking the identified user-defined components to obtain cache page name strings, means for indexing into the cache based on values comprising the obtained cache page name strings.
9. A method for cache page indexing in a system comprising a web server for receiving browser requests, a cache for pages used to respond to browser requests, an application server accepting input from a web server corresponding to browser requests, the application server having access to the cache, a set of user-defined components for generating cache page name strings, a configuration file accessible to the application server for associating received requests with identifiers for the user-defined components, the method comprising the application server carrying out the following steps a) accessing the configuration file to retrieve identifiers for user-defined components for a received request, b) invoking the identified user-defined components to obtain cache page name strings, c) indexing into the cache based on values comprising the obtained cache page name strings to retrieve cache pages, and d) returning pages to the web server based on the pages retrieved from the cache.
10. The method of claim 9 further comprising the step of the application server determining base page name strings corresponding to the received requests and in which the step of indexing into the cache comprises the step of combining the base page name strings and the obtained cache page name strings.
11. The method of claim 9 in which the configuration file associates selected received requests with web server cache page identifiers and in which the method further comprises the step of the web server accessing the cache to retrieve cache pages based on web server cache page identifiers retrieved by the web server from the configuration file.
12. A computer program product for indexing cache pages, the computer program product comprising a computer usable medium having computer readable code means embodied in said medium, comprising computer readable program code means for carrying out the method of claims 9, 10 or 11.
13. A cache page indexing system for a web server environment, the environment comprising a web server for receiving browser requests, a cache for pages used to respond to browser requests, an application server accepting input from a web server corresponding to browser requests, the application server having access to the cache, the indexing system comprising means in the application server to invoke a set of user-defined components for generating cache page name strings, a configuration file accessible to the application server for associating received requests with identifiers for the user-defined components, whereby the application server accesses the configuration file to retrieve identifiers for user-defined components for a received request, and indexes into the cache based on values comprising the obtained cache page name strings to retrieve cache pages.
14. The indexing system of claim 13 further comprising means for the application server to determine base page name strings corresponding to the received requests and means for combining the base page name strings and the obtained cache page name strings.
15. The computer program product of claim 12 wherein said computer readable code means comprises a signal and said medium comprises a recordable data storage medium.
16. The computer program product of claim 12 wherein said medium comprises a modulated carrier signal.
17. The computer program product of claim 16 wherein said signal comprises a transmission over a network.
CA002327198A 2000-11-30 2000-11-30 Cache page indexing for web server environments Abandoned CA2327198A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA002327198A CA2327198A1 (en) 2000-11-30 2000-11-30 Cache page indexing for web server environments

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CA002327198A CA2327198A1 (en) 2000-11-30 2000-11-30 Cache page indexing for web server environments

Publications (1)

Publication Number Publication Date
CA2327198A1 true CA2327198A1 (en) 2002-05-30

Family

ID=4167787

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002327198A Abandoned CA2327198A1 (en) 2000-11-30 2000-11-30 Cache page indexing for web server environments

Country Status (1)

Country Link
CA (1) CA2327198A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1565842A2 (en) * 2002-11-28 2005-08-24 International Business Machines Corporation Method and system for hyperlinking files
US8041753B2 (en) 2002-11-28 2011-10-18 International Business Machines Corporation Method and systems for hyperlinking files
US8060485B2 (en) 2003-02-10 2011-11-15 International Business Machines Corporation Method, system, and program product for accessing required software to process a file

Similar Documents

Publication Publication Date Title
US7660844B2 (en) Network service system and program using data processing
US7334087B2 (en) Context-sensitive caching
US6662342B1 (en) Method, system, and program for providing access to objects in a document
US6564218B1 (en) Method of checking the validity of a set of digital information, and a method and an apparatus for retrieving digital information from an information source
US6584548B1 (en) Method and apparatus for invalidating data in a cache
JP5069285B2 (en) Propagating useful information between related web pages, such as web pages on a website
KR100398711B1 (en) Content publication system for supporting real-time integration and processing of multimedia contents including dynamic data and method thereof
US6223178B1 (en) Subscription and internet advertising via searched and updated bookmark sets
US7246170B2 (en) Scheme for systematically registering meta-data with respect to various types of data
CA2200138C (en) A url rewriting pseudo proxy server
US8914519B2 (en) Request tracking for analysis of website navigation
US8583643B2 (en) Caching electronic document resources in a client device having an electronic resource database
US6970873B2 (en) Configurable mechanism and abstract API model for directory operations
US20040133580A1 (en) Persistent data storage for metadata related to web service entities
AU2003267650A1 (en) Method, system, and program for maintaining data in distributed caches
US7376650B1 (en) Method and system for redirecting a request using redirection patterns
US6725265B1 (en) Method and system for caching customized information
US7895337B2 (en) Systems and methods of generating a content aware interface
JP2009544102A (en) Semantic processing of XML documents
US20060242105A1 (en) Pack URI scheme to identify and reference parts of a package
KR100371696B1 (en) Method of allocating article codes and searching the article by the code on internet
US20050027549A1 (en) Multi-layer architecture for property management
Krottmaier et al. Transclusions in the 21st Century.
CA2327198A1 (en) Cache page indexing for web server environments
KR20060115488A (en) Personalized search method using bookmark list of web browser and system for enabling the method

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued